ProPublica

Journalism in the Public Interest

The ProPublica Nerd Blog

Dollars for Docs

Has Your Doctor Received Drug Company Money?

ALEC-Related Contributions

Use this database to find campaign contributions from some ALEC-affiliated groups to some ALEC-member state legislators.

Facebook for News Apps: How We Harnessed the Social Network for ‘The Opportunity Gap’

Last week we published The Opportunity Gap, a news application that lets readers find out how equally their state provides poor and wealthier schools access to advanced classes that researchers say will help students later in life.

It features a host of technologies we’re using for the first time, and which we anticipate will be part of many of our news apps in the future. These include a new JavaScript-based “workspace” approach to placing, sorting and removing multiple entities on a single page. We also built our own map server, which we’ll have lots to say about later this summer.

We designed the app so it was oriented behaviorally, and not just hierarchically. That is, rather than simply showing users a collection of items (as so many interactive databases do), we wanted to encourage people to take the conversations they were already having about their schools and communities, and extend that behavior onto our app. We owe a debt to Nick Disabato, whose SXSW panel got us thinking about this concept. We were also inspired by The New York Times’ Oscars app, which uses Facebook to let readers create, share and compare their ballots with friends, and to let them reproduce real-world behavior — competing on Oscar predictions — within a news app.

This emphasis on encouraging behavior — coupled with our preference for keeping our apps light on database writes — spurred us to integrate Facebook in a deeper way than we’ve done before.

The Opportunity Gap

ProPublica analyzed new data from the U.S. Department of Education's Office for Civil Rights, along with other federal education data, to examine whether states provide students equal access.

How the Heart Rhythm Society Sells Access

The Heart Rhythm Society’s annual conference is a marketing bonanza for drug companies and medical device makers.

How Much Money Do Groups Receive From Industry?

In response to a request from Sen. Charles Grassley, R-Iowa, 33 professional associations and health advocacy groups listed the payments they received from the pharmaceutical, medical device and insurance industries. They also detailed the relationships that the groups’ executives and board members had with the same companies.

TimelineSetter: Easy Timelines From Spreadsheets, Now Open to All

Last week we announced TimelineSetter, our new tool for creating beautiful interactive HTML timelines. Today, after a short private beta with some of our fellow news application developers, we’re opening the code to everyone.

TimelineSetter: A New Way to Display Timelines on the Web

Timelines are a very useful way to visualize sequences of events, and they’re especially useful for orienting readers within the complex investigative stories we do at ProPublica. But they’re not very easy to make. As far as we know, there are no good open source frameworks that web developers can use to generate timelines quickly without losing design flexibility. So we made our own, which debuts today.

By way of background, our most recent timeline was part of our story on disability discharges last month. We found some interesting parallels between what the Education Department was doing to reform the program and one borrower’s attempt to navigate the bureaucracy over the same five-year period. To visualize these parallel paths, we designed a timeline that showed both series of events on one bar, differentiated by color and space.
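To make the idea of two series sharing one bar concrete, here is a minimal Python sketch. It is not TimelineSetter's code or its spreadsheet format: the column names (`date`, `series`, `description`), the sample rows, the color values and the function names `load_events` and `layout` are all assumptions made up for illustration. It reads spreadsheet-style rows, sorts them by date, and gives each event a position along a single shared bar plus a color chosen by its series.

```python
import csv
import io
from datetime import date

# Hypothetical spreadsheet export: one row per event, tagged with its series.
# The rows are invented placeholders, not data from the story.
SAMPLE_CSV = """date,series,description
2006-03-01,Education Department,Hypothetical policy review begins
2005-06-15,Borrower,Hypothetical discharge application filed
2008-11-20,Borrower,Hypothetical appeal submitted
2010-01-05,Education Department,Hypothetical rule change announced
"""


def load_events(csv_text):
    """Parse CSV rows into dicts with real date objects, sorted chronologically."""
    rows = csv.DictReader(io.StringIO(csv_text))
    events = [
        {
            "date": date.fromisoformat(row["date"]),
            "series": row["series"],
            "description": row["description"],
        }
        for row in rows
    ]
    return sorted(events, key=lambda event: event["date"])


def layout(events, colors):
    """Place every event on one shared bar.

    position is a fraction from 0.0 (earliest event) to 1.0 (latest event);
    color comes from the event's series, so both series stay distinguishable.
    """
    start, end = events[0]["date"], events[-1]["date"]
    span_days = (end - start).days or 1  # avoid dividing by zero for same-day events
    return [
        {
            **event,
            "position": round((event["date"] - start).days / span_days, 3),
            "color": colors[event["series"]],
        }
        for event in events
    ]


if __name__ == "__main__":
    palette = {"Education Department": "#1f77b4", "Borrower": "#d62728"}
    for item in layout(load_events(SAMPLE_CSV), palette):
        print(item["position"], item["color"], item["series"], item["description"])
```

Keeping the position calculation separate from the color lookup means a third series only needs one more palette entry, and the rendering layer (HTML, SVG or anything else) can decide how to use the two values.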

FOIA b(3) Exemptions

Information about watermelon handlers, avocado importers and caves is among the categories that have been withheld from federal Freedom of Information Act requesters using sections of laws that are otherwise unrelated to disclosure. There are hundreds of such laws, according to data compiled by the Sunshine in Government Initiative. They fall under the third of the nine FOIA exemptions, known as b(3). Use our database to see how extensively agencies use b(3) exemptions.

Autopsies in the U.S.A.

ProPublica, in partnership with PBS “Frontline” and NPR, surveyed almost 70 of the largest coroner and medical examiner systems in the U.S.

FCIC Document Dive

When the Financial Crisis Inquiry Commission released its final report on the causes of the financial crisis, it released an extensive document archive. We’ve tried to make searching through it a bit easier. Use the form below to search for people, places, or organizations mentioned in the documents.

Scraping for Journalism: A Guide for Collecting Data

A series of technical and programming tutorials on how we scraped, parsed, and organized data for “Dollars for Docs.”

Our Dollars for Docs news application lets readers search pharmaceutical company payments to doctors. We’ve written a series of how-to guides explaining how we collected the data.

The Coder’s Cause in “Dollars for Docs”

Our investigation of the financial ties between drug companies and doctors, Dollars for Docs, was sparked by a computational challenge. Several drug companies had been ordered to disclose who they paid to speak and consult on their behalf. But they made the records hard to analyze, seemingly making the data “impossible to download.”

We wanted to change that.

Chapter 3: Turning PDFs to Text

Chapter 2: Reading Data from Flash Sites

Chapter 1: Using Google Refine to Clean Messy Data

Chapter 4: Scraping Data from HTML

Chapter 5: Getting Text Out of an Image-Only PDF

In the previous guide, we described several methods for turning PDFs into data usable in spreadsheets. However, those only handle PDFs that have actual text embedded within them. When a PDF contains just images of text, as scanned documents do, the problem isn't just how to convert it into neat tabular data, but how to extract any text, period.

The News Apps Team

Hack With Us

ProPublica hosts newsroom developers -- or developers who want to see what it's like to work in news -- for 3-5 day job shadowing residencies called the ProPublica Pair Programming Project, or P5.

Download Our Data

Use ProPublica's data -- cleaned, categorized and often created from multiple sources -- in your reporting and research.

Use Our Code

Explore Our Work