A mention in the BeSpecific blog tipped me off to an interesting project called CourtListener.com. From the about page:
The goal of the site is to create a free and competitive real time alert tool for the U.S. judicial system.
At present, the site has daily information regarding all precedential opinions issued by the 13 federal circuit courts and the Supreme Court of the United States. Each day, we also have the non-precedential opinions from all of the Circuit courts except the D.C. Circuit. This means that by 5:10pm PST, the database will be updated with the opinions of the day, with custom alerts going out shortly thereafter.
The site was created by Michael Lissner as a Master's thesis project at the UC Berkeley School of Information.
A quick perusal of the site and its associated documents shows that Michael uses a scraping technique to visit court websites and look for recently released opinions. Once found, the opinions are retrieved, converted from PDF to text, indexed, and stored. Atom feeds are then generated to provide current alerts.
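To make the "look for recently released opinions" step concrete, here is a minimal sketch of how a scraper might pull PDF links from a court's listing page and keep only the ones it has not seen before. This is not CourtListener's actual code: the names `PdfLinkParser` and `find_new_opinions`, the `court.example` URL, and the assumption that opinions appear as plain links ending in `.pdf` are all hypothetical, and it uses only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a> links whose href ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".pdf"):
            # Resolve relative links against the listing page's URL.
            self.links.append(urljoin(self.base_url, href))


def find_new_opinions(html, base_url, seen):
    """Return PDF URLs found in `html` that are not already in `seen`.

    `seen` is a set of URLs from earlier runs; it is updated in place.
    """
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    new = []
    for url in parser.links:
        if url not in seen:
            seen.add(url)
            new.append(url)
    return new


# Hypothetical listing page, for illustration only.
page = (
    '<a href="/opinions/10-1234.pdf">Opinion</a> '
    '<a href="/about.html">About</a>'
)
seen_urls = set()
print(find_new_opinions(page, "http://court.example/list/", seen_urls))
```

A real scraper would also have to download each PDF, convert it to text, and cope with each court's own page layout, which is likely where most of the work lies.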
The site is powered by Python using the Django web framework and is open source, so you can download the code. The backend database is MySQL and search is handled by Sphinx. The PDF conversion appears to produce plain text only. If you register on the site, you can create custom alerts based on saved searches.
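The idea of alerts driven by saved searches can be sketched in a few lines. The example below is a deliberately naive stand-in, not the site's implementation: the real site hands search to Sphinx, whereas this sketch simply checks that every word of a saved query appears in an opinion's plain text. The function names and the sample users and opinions are made up for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matching_alerts(saved_searches, opinions):
    """Map each user to the titles of new opinions matching their query.

    saved_searches: {user: query string}
    opinions: {title: plain text of the opinion}
    A query matches when all of its words occur in the opinion text.
    """
    words_by_title = {title: tokenize(text) for title, text in opinions.items()}
    alerts = {}
    for user, query in saved_searches.items():
        terms = tokenize(query)
        hits = [
            title
            for title, words in words_by_title.items()
            if terms and terms <= words
        ]
        if hits:
            alerts[user] = hits
    return alerts


# Made-up data, for illustration only.
searches = {"alice": "fourth amendment", "bob": "patent"}
todays_opinions = {
    "US v. Doe": "The Fourth Amendment bars this search.",
    "Roe v. Acme": "A contract dispute.",
}
print(matching_alerts(searches, todays_opinions))
```

Running each day's new opinions through the saved searches, rather than re-running every search against the whole database, keeps the alert step proportional to the day's volume.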
All in all, CourtListener.com provides another good source for current federal appellate court opinions. Be sure to check the coverage page to see how far back the site goes for each court. Perhaps the future will bring an expansion to more courts and jurisdictions.