ProPublica

Journalism in the Public Interest

The ProPublica Nerd Blog

TimelineSetter: Easy Timelines From Spreadsheets, Now Open to All

Last week we announced TimelineSetter, our new tool for creating beautiful interactive HTML timelines. Today, after a short private beta with some of our fellow news application developers, we’re opening the code to everyone.

TimelineSetter: A New Way to Display Timelines on the Web

Timelines are a very useful way to visualize sequences of events, and they’re especially useful for orienting readers within the complex investigative stories we do at ProPublica. But they’re not very easy to make. As far as we know, there are no good open source frameworks that web developers can use to generate timelines quickly without losing design flexibility. So we made our own, which is debuting today.

By way of background, our most recent timeline was part of our story on disability discharges last month. We found some interesting parallels between what the Education Department was doing to reform the program and one borrower’s attempt to navigate the bureaucracy over the same five-year period. To visualize these parallel paths, we designed a timeline that showed both series of events on one bar, differentiated by color and space.
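To make the idea of two series sharing one bar concrete, here is a minimal sketch in Python. It is not TimelineSetter's code or its input format: the column names (`date`, `series`, `description`) and the sample rows are placeholders we made up for illustration. It sorts all events onto one shared time axis and then groups them by series, so each series can be given its own color.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical sample rows. The column names and dates are assumptions
# for illustration, not data from the story.
SAMPLE_CSV = """date,series,description
2005-03-01,Education Department,Agency event (placeholder)
2005-06-15,Borrower,Borrower event (placeholder)
2007-01-10,Education Department,Another agency event (placeholder)
"""


def load_events(text):
    """Parse CSV text into a list of event dicts sorted by date."""
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        events.append({
            "date": date.fromisoformat(row["date"]),
            "series": row["series"],
            "description": row["description"],
        })
    # One shared axis: every event is ordered by date, whatever its series.
    return sorted(events, key=lambda event: event["date"])


def group_by_series(events):
    """Map each series name to its events, preserving date order."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["series"]].append(event)
    return dict(grouped)


events = load_events(SAMPLE_CSV)
by_series = group_by_series(events)
```

A renderer could then walk `events` to place marks along the bar and look up each event's series in `by_series` to choose its color.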

FOIA b(3) Exemptions

Information about watermelon handlers, avocado importers and caves is among the categories of information that have been withheld from federal Freedom of Information Act requesters using sections of laws that are otherwise unrelated to disclosure. There are hundreds of such laws, according to data compiled by the Sunshine in Government Initiative. They fall under the third of FOIA's nine exemptions, known as b(3). Use our database to see how extensively agencies use b(3) exemptions.

Autopsies in the U.S.A.

ProPublica, in partnership with PBS “Frontline” and NPR, surveyed almost 70 of the largest coroner and medical examiner systems in the U.S.

FCIC Document Dive

When the Financial Crisis Inquiry Commission released its final report on the causes of the financial crisis, it released an extensive document archive. We’ve tried to make searching through it a bit easier. Use the form below to search for people, places, or organizations mentioned in the documents.

Scraping for Journalism: A Guide for Collecting Data

A series of technical and programming tutorials on how we scraped, parsed, and organized data for “Dollars for Docs.”

Our Dollars for Docs news application lets readers search pharmaceutical company payments to doctors. We’ve written a series of how-to guides explaining how we collected the data.

The Coder’s Cause in “Dollars for Docs”

Our investigation of the financial ties between drug companies and doctors, Dollars for Docs, was sparked by a computational challenge. Several drug companies had been ordered to disclose who they paid to speak and consult on their behalf. But they made the records hard to analyze, seemingly making the data “impossible to download.”

We wanted to change that.

Chapter 1: Using Google Refine to Clean Messy Data

Chapter 2: Reading Data from Flash Sites

Chapter 3: Turning PDFs to Text

Chapter 4: Scraping Data from HTML

Chapter 5: Getting Text Out of an Image-Only PDF

In the previous guide, we described several methods for turning PDFs into data usable in spreadsheets. However, those methods only handle PDFs that have actual text embedded within them. When a PDF contains just images of text, as scanned documents do, the problem isn't just how to convert it into neat tabular data, but how to extract any text, period.
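One practical consequence of that distinction is that you need to know which pages have embedded text before choosing a method. The sketch below assumes you have already run some text extractor over the PDF and collected one string per page. That extraction step is not shown, and the function name and sample values are our own placeholders. It simply flags the pages that came back empty, which are the ones that would need optical character recognition instead.

```python
def pages_needing_ocr(page_texts, min_chars=1):
    """Return 0-based indexes of pages whose extracted text is effectively empty.

    page_texts: one entry per page, as produced by any text extractor.
    An entry may be None if the extractor returned nothing for that page.
    min_chars: pages with fewer non-whitespace-trimmed characters than this
    are treated as image-only (an assumed threshold, tune as needed).
    """
    flagged = []
    for index, text in enumerate(page_texts):
        if len((text or "").strip()) < min_chars:
            flagged.append(index)
    return flagged


# Hypothetical extraction results: page 0 has text, the rest are image-only.
sample_pages = ["Payment report, page 1", "", "   \n", None]
needs_ocr = pages_needing_ocr(sample_pages)
```

A higher `min_chars` can help when a scanned page carries a few stray embedded characters, such as a page number, but no real body text.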

Find Homes With Tainted Drywall

When the Consumer Product Safety Commission provided data in October, the agency said it had received fewer than 3,500 reports of tainted drywall. ProPublica and the Sarasota Herald-Tribune compiled a list of addresses from county property appraiser data and records in consolidated lawsuits filed in New Orleans federal court and found nearly twice that number: around 6,900 homes.

Interactive: Which Banks Got Emergency Loans from the Fed During the Financial Meltdown?

Wednesday the Federal Reserve released data on more than 21,000 loans and other deals it made through a dozen emergency programs created during the financial crisis. We’ve combined the Fed’s three programs that loaned directly to banks and other financial firms with the goal of getting them to start lending again.

Open Source Project: Thinner

Today we're releasing a new open source project called "Thinner." It's for websites, like ours, that use the open source caching engine Varnish.

Use Our Dollars for Docs Widget on Your Site

As part of ProPublica’s “Dollars for Docs” series and interactive news application, we've created a small widget that you can embed on your web site. It will let your readers look up whether their health care providers are taking money from the drug companies in our database. The widget shows the amount of money paid to each practitioner in our database, which company made the payment, and in some cases, what the companies said they were paying for: speaking fees, consulting, etc. The widget also lists what drugs each company sells so readers can check their own prescriptions.

A Tale of Two Documents

On Oct. 8, we published an investigation examining how a judicial opinion in a pivotal lawsuit brought by a Guantanamo detainee vanished, only to be replaced weeks later by an entirely different opinion. At the center of our reporting are two documents representing separate versions of that same opinion: the original opinion written by Judge Henry H. Kennedy, and a second opinion quietly put in the original's place more than a month later.

Why are there two opinions? As reporter Dafna Linzer explains, redactions that were supposed to be made in the original opinion never were. Once government security officials, who are responsible for reviewing and redacting classified information from sensitive cases, discovered the error, the decision was quickly removed from the court file. In Judge Kennedy’s courtroom four days later, the Justice Department refused to have the opinion redacted and re-released. With the detainee, Uthman Abdul Rahim Mohammed Uthman, slated for indefinite detention, the stakes were high. Officials did not want to risk that those who had seen the original opinion would know exactly what the government had meant to keep classified.

The Rainbow Connection: How We Made Our CDO Connections Graphic

On Wednesday, we launched an interactive news application to help readers understand the cross-owned nature of Collateralized Debt Obligations (CDOs) in 2006-2007. This cross-ownership helped inflate the bubble, and ultimately made the financial crisis worse.

We received a list of cross-owned CDOs as a result of a study ProPublica commissioned from Thetica, a consulting company in New York. It consisted of a list of CDOs, the banks that sponsored them, the CDO managers who managed them, and, for each CDO, an enumerated list of other CDOs in which it had both sold and bought a stake. Reporters Jake Bernstein and Jesse Eisinger had already used the data in their story, Banks’ Self-Dealing Super-Charged Financial Crisis.
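As a rough illustration of how data shaped like this can be handled, here is a small Python sketch. It is not the code behind the graphic, and the field names, the `holdings` interpretation (the other CDOs a given CDO holds a stake in) and the sample records are all assumptions made up for the example. It stores one record per CDO and finds the pairs in which each CDO holds a stake in the other.

```python
from dataclasses import dataclass, field


@dataclass
class CDO:
    name: str
    bank: str       # sponsoring bank
    manager: str    # CDO manager
    # Assumed meaning: names of other CDOs this CDO holds a stake in.
    holdings: set = field(default_factory=set)


def mutual_pairs(cdos):
    """Return sorted (name, name) pairs where each CDO holds a stake in the other."""
    by_name = {cdo.name: cdo for cdo in cdos}
    pairs = set()
    for cdo in cdos:
        for other_name in cdo.holdings:
            if other_name == cdo.name:
                continue  # ignore a CDO listed as holding itself
            partner = by_name.get(other_name)
            if partner is not None and cdo.name in partner.holdings:
                # Sort the two names so (A, B) and (B, A) count as one pair.
                pairs.add(tuple(sorted((cdo.name, other_name))))
    return sorted(pairs)


# Hypothetical records, not real CDOs, banks or managers.
sample = [
    CDO("Alpha", "Bank X", "Manager M", {"Beta"}),
    CDO("Beta", "Bank Y", "Manager N", {"Alpha", "Gamma"}),
    CDO("Gamma", "Bank X", "Manager M"),
]
cross_owned = mutual_pairs(sample)
```

Each pair in `cross_owned` is one link that a graphic could draw between two CDOs, and the `bank` and `manager` fields are available for grouping or coloring those links.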

The News Apps Team

Download Our Data

Use ProPublica's data — cleaned, categorized and often created from multiple sources — in your reporting and research.

Use Our Code

Explore Our Work