Everything We Know (So Far) About Obama’s Big Data Tactics
A new look at what the Obama campaign did with its much-heralded data operation.
This post is being kept up-to-date. It was first published on Nov. 13.
For the past nine months, we’ve been following how political campaigns use data about voters to target them in different ways. During the election, the Obama campaign, which had assembled a cutting-edge team of data scientists, developers, and digital advertising experts, refused to say anything about how it was targeting voters.
Now, members of the campaign are starting to open up about what their team actually did with all that data. Based on our own interviews with campaign officials, as well as reporting elsewhere, here’s an overview of some of the campaign’s key strategies, from connecting Facebook data to the national list of registered voters, to measuring how “persuadable” individual swing state voters might be.
Here’s what the nerds did.
What data did they use—and were they tracking you across the web?
It’s still not clear.
Chief Innovation and Integration Officer Michael Slaby and other campaign officials said again that they relied less on consumer data, and more on public voting records and the responses that voters provided directly to campaign canvassers. They would not comment on exactly what data the campaign collected about individual people.
Officials within the campaign said media reports had overemphasized the importance of online web cookie data to track voters across the web.
Slaby said the campaign primarily used web cookies to serve ads to people who had visited the campaign site — a tactic called “retargeting,” which Mitt Romney’s campaign also used.
The campaign also told us web cookies were useful for helping them analyze whether volunteers were actually reading the online training materials the campaign had prepared for them. (They didn’t track this on a volunteer-by-volunteer basis.)
The backbone of the campaign’s effort was a database on registered voters compiled by the Democratic National Committee. It was updated weekly with new public records from state level election officials, one official said.
The campaign then added its own voter data to the mix. Officials wouldn’t say exactly what information they added.
What will happen to the data about millions of voters collected during the campaign?
It's still not clear.
As the Washington Post reported earlier this month, other Democratic candidates are eager to use Obama's voter data for their own campaigns.
Some of the president's data will certainly go to the Democratic National Committee, where it can be used to help other Democrats.
But both the Post and the Wall Street Journal reported that it's unclear if the DNC has the resources — technological and financial — to manage all of the voter data and analysis the campaign produced.
The Wall Street Journal cited an anonymous Democratic Party official, who said that a new organization might be created to manage and update Obama's campaign data. This kind of organization "would have the potential to give the president leverage in the selection of the next Democratic presidential nominee," the Journal reported.
After his 2008 win, Obama created "Organizing for America," a group that worked within the DNC to advance the president's agenda.
How did the Obama campaign know which TV shows voters were watching?
Did the Obama campaign really "get a big list of the names of people who watched certain things on TV," as Gawker asked earlier this month?
No. But they were able to get very detailed information about the television habits of certain groups of voters.
In order to decide where to buy their TV ads, the Obama campaign matched lists of voters to the names and addresses of cable subscribers, as the Washington Post, The Atlantic, and others have reported.
This allowed them to analyze which channels the voters they wanted to reach were watching.
The campaign focused on swing state voters it had scored as "persuadable," and on supporters who needed to be encouraged to turn out at the polls, Carol Davidsen, who ran the campaign's television ad "Optimizer" project, told ProPublica. The campaign also looked at Latino and African-American voters.
Working with Rentrak, a data company, the campaign tracked the television viewership of these groups across 60 channels, looking at how viewership changed for every quarter hour of the day, Davidsen said.
The campaign was able to identify the audience size of each group for a particular channel at a particular time — and then analyze where and when the campaign could advertise to key voters at the lowest cost.
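The campaign has not published how its "Optimizer" worked, so the following is only a minimal sketch of the general idea: given an estimated target audience and a price for each channel and quarter hour, rank the slots by cost per targeted viewer. The `Slot` fields, channel names, audience figures, and prices are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    channel: str
    quarter_hour: str    # start of the 15-minute block, e.g. "14:15"
    target_viewers: int  # estimated viewers from the target voter group
    cost: float          # price of one ad spot, in dollars


def cost_per_target_viewer(slot):
    """Dollars spent per targeted viewer reached in this slot."""
    return slot.cost / slot.target_viewers


def rank_slots(slots):
    """Return slots ordered from cheapest to most expensive per targeted viewer.

    Slots with no viewers from the target group are dropped, since
    advertising there reaches nobody the campaign is looking for.
    """
    reachable = [s for s in slots if s.target_viewers > 0]
    return sorted(reachable, key=cost_per_target_viewer)


# Hypothetical figures, for illustration only.
slots = [
    Slot("Channel A", "20:00", 5000, 2000.0),  # $0.40 per targeted viewer
    Slot("Channel B", "14:15", 1200, 300.0),   # $0.25 per targeted viewer
    Slot("Channel C", "23:30", 0, 50.0),       # no target audience
]
ranked = rank_slots(slots)
```

In this made-up example the daytime slot on Channel B ranks first even though its total audience is smaller, because each targeted viewer costs less to reach.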
For instance, the campaign was able to identify a subset of persuadable voters who lived in households that watched less than two hours of television a day. This allowed the campaign to schedule ads during the times these voters would actually be watching TV.
"Even if it's more expensive, it's worth it, because we can't catch them later," Davidsen told ProPublica.
So, how was the campaign able to get all this viewership data?
In Ohio, the campaign worked with FourthWall Media, a data and targeting company, to get television viewership data for individual homes, which had "anonymous but consistent" household ID numbers, Davidsen said. This allowed the campaign to track household viewing behavior over time, without knowing which exact voters they were analyzing. (FourthWall Media has yet to respond to a request for comment on how their process works.)
Working with Rentrak, the campaign could follow the viewing habits of larger groups of voters in different TV markets across the country.
Rentrak used a third-party data company to match lists of voters to TV operator data about subscribers — and then match that information to the anonymous ID numbers that Rentrak uses to track the usage patterns of television set-top boxes.
To protect users' privacy, none of the companies involved have all the information they would need to know what shows a specific voter watched on TV, Rentrak's Chris Wilson told ProPublica. For instance, Rentrak knows viewers' ID numbers and viewing habits, but doesn't know which ID numbers correspond with which name or address. In fact, Rentrak never knows the name or address of the household, Wilson said.
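The separation of knowledge Wilson describes can be sketched roughly as follows. This is an illustration under stated assumptions, not Rentrak's actual process: the names, addresses, household IDs, and function names are all invented, and the real matching involves more parties and safeguards than shown here.

```python
from collections import defaultdict

# Known only to the third-party matcher (hypothetical data):
# the campaign's voter list with group labels, and the operator's subscriber list.
voter_groups = {
    ("Ann Lee", "1 Elm St"): "persuadable",
    ("Bo Diaz", "2 Oak St"): "turnout",
}
subscriber_ids = {
    ("Ann Lee", "1 Elm St"): "hh-001",
    ("Bo Diaz", "2 Oak St"): "hh-002",
    ("Cy Roe", "3 Ash St"): "hh-003",
}


def match_groups_to_ids(voter_groups, subscriber_ids):
    """Matcher step: map anonymous household ID -> voter group.

    Names and addresses are used for the join and then discarded,
    so the output carries no personal identifiers.
    """
    return {
        subscriber_ids[person]: group
        for person, group in voter_groups.items()
        if person in subscriber_ids
    }


# Known only to the measurement company: anonymous ID -> (channel, minutes watched).
viewing = {
    "hh-001": [("News", 30), ("Sports", 15)],
    "hh-002": [("News", 10)],
    "hh-003": [("Movies", 90)],
}


def minutes_by_group(id_to_group, viewing):
    """Measurement step: total minutes watched per channel for each voter group."""
    totals = defaultdict(lambda: defaultdict(int))
    for household_id, records in viewing.items():
        group = id_to_group.get(household_id)
        if group is None:
            continue  # household is not on the voter list
        for channel, minutes in records:
            totals[group][channel] += minutes
    return {group: dict(channels) for group, channels in totals.items()}


id_to_group = match_groups_to_ids(voter_groups, subscriber_ids)
report = minutes_by_group(id_to_group, viewing)
```

The campaign would see only something like `report`, which describes groups of voters rather than any named household.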
While commercial advertisers are beginning to do this kind of data matching and analysis, "What [the Obama campaign] did was probably on the more sophisticated side compared to a lot of folks," Wilson told ProPublica. He said the campaign had done "pioneering work" in television targeting.
Rentrak also works with large consumer data companies, including Experian and Epsilon, to match television viewer data with consumer data. According to the Federal Communications Commission privacy rules, cable operators are not allowed to disclose subscribers' "personally identifiable information" without their consent, but they can collect and share "aggregate" data.
How important was the data the campaign could access through its Facebook app about volunteers and their friends?
Observers noted that in the last days of the campaign, Obama supporters who used the campaign’s Facebook app received e-mails with the names and profile photos of friends in swing states. The e-mails urged supporters to contact their friends and encourage them to vote.
It wasn’t clear how well the effort went or what the response was. Some people had been encouraged to ask their Republican friends to vote. A Romney official who had signed up for the campaign’s e-mail list was told to contact his Facebook friend Eric Cantor, the Republican House Majority Leader.
What we now know is that the campaign did in fact try to match Facebook profiles to people’s voting records. So if you got a message encouraging you to contact a friend in Ohio, the campaign may have targeted that friend based on their public voting record and other information the campaign had.
But the matching process was far from perfect, in part because the information the campaign could access about volunteers’ friends was limited.
Were privacy concerns about the campaign’s data collection justified?
We’ve reported on some of the concerns about the amount of data the campaign has amassed on individual voters.
Were those concerns at all justified? It’s hard to say right now, since we still don’t know where the campaign drew the line about what data they would and would not use.
Obama officials did dismiss the idea that the campaign cared about voters’ porn habits.
It all came down to four numbers.
The Obama campaign had a list of every registered voter in the battleground states. The job of the campaign’s much-heralded data scientists was to use the information they had amassed to determine which voters the campaign should target— and what each voter needed to hear.
They needed to go a little deeper than targeting “waitress moms.”
“White suburban women? They’re not all the same. The Latino community is very diverse with very different interests,” Dan Wagner, the campaign’s chief analytics officer, told The Los Angeles Times. “What the data permits you to do is figure out that diversity.”
What Obama’s data scientists produced was a set of four individual estimates for each swing state voter’s behavior. These four numbers were included in the campaign’s voter database, and each score, typically on a scale of 1 to 100, predicted a different element of how that voter was likely to behave.
Two of the numbers calculated voters’ likelihood of supporting Obama, and of actually showing up to the polls. These estimates had been used in 2008. But the analysts also used data about individual voters to make new, more complicated predictions.
If a voter supported Obama, but didn’t vote regularly, how likely was he or she to respond to the campaign’s reminders to get to the polls?
The final estimate was the one that had proved most elusive to earlier campaigns—and that may be most influential in the future. If voters were not strong Obama supporters, how likely was it that a conversation about a particular issue — like Obamacare or the Buffett rule—could persuade them to change their minds?
Slaby said that there was early evidence that the campaign’s estimate of how “persuadable” voters would be on certain issues had actually worked.
“This is very much a competitive advantage for us,” he said.
Wagner began working on persuasion targeting during the 2010 campaign, Slaby said, giving the campaign a long time to perfect its estimates.
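To make the four scores described above concrete, here is a minimal sketch of how such a record might be stored and used to pick a type of outreach. The field names, the cutoffs of 70 and 40, and the decision rules are all hypothetical; the campaign has not said how it turned scores into decisions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoterScores:
    """Four per-voter estimates, each assumed here to be on a 1 to 100 scale."""
    support: int         # likelihood of supporting Obama
    turnout: int         # likelihood of showing up to the polls
    gotv_response: int   # likelihood a supporter responds to turnout reminders
    persuadability: int  # likelihood a conversation changes the voter's mind


def suggest_outreach(scores, high=70, low=40):
    """Pick an outreach type from the scores. Thresholds are illustrative only."""
    if (scores.support >= high and scores.turnout < low
            and scores.gotv_response >= high):
        return "turnout reminder"
    if scores.support < high and scores.persuadability >= high:
        return "persuasion call"
    return "no contact"


sporadic_supporter = VoterScores(support=85, turnout=30, gotv_response=80, persuadability=20)
movable_voter = VoterScores(support=50, turnout=75, gotv_response=40, persuadability=78)
firm_opponent = VoterScores(support=10, turnout=90, gotv_response=15, persuadability=12)
```

Under these made-up rules, the first voter would get a reminder to vote, the second a persuasion call, and the third would not be contacted.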
Did everyone on the campaign have access to these scores?
Campaign volunteers were not given access to these individual scores, one official said. “Oh, my neighbor Lucy is a 67 and her husband is a 72—we would probably consider that a distraction.”
Do other political campaigns also assign me a set of numerical scores?
The use of microtargeting scores — a tactic from the commercial world — is a standard part of campaign data efforts, and one that has been well documented before.
In his book exploring the rise of experimentally focused campaigns, Sasha Issenberg compares microtargeting scores to credit scores for the political world.
In 2008, the Obama campaign ranked voters’ likely support of the senator from Illinois using a 1 to 100 scale. This year, The Guardian reported, Americans for Prosperity, a conservative group backed by the Koch brothers, ranked some swing state voters on a scale from 0 to 1, with 0 being “so leftwing and pro-government” that they are “not worth bothering with,” and 1 being “already so in favour of tax and spending cuts that to talk to them would be preaching to the converted.”
What’s different about what Obama’s data scientists did?
The Obama campaign’s persuadability score tried to capture not just a voter’s current opinion, but how that individual opinion was likely to change after interactions with the campaign. Most importantly, Obama’s analysts did not assume that voters who said they were “undecided” were necessarily persuadable, a mistake campaigns have made in the past, according to targeting experts.
“Undecided is just a big lump,” said Daniel Kreiss, who wrote a book on the evolution of Democratic “networked” politics from Howard Dean through the 2008 Obama campaign.
Voters who call themselves “undecided” might actually be strong partisans who are unwilling to share their views—or simply people who are disengaged.
“Someone who is undecided and potentially not very persuadable, you might spend all the time in the world talking to that person, and their mind doesn’t change. They stay undecided,” Slaby said.
To pinpoint voters who might actually change their minds, the Obama campaign conducted randomized experiments, Slaby said. Voters received phone calls in which they were asked to rate their support for the president, and then engaged in a conversation about different policy issues. At the end of the conversation, they were asked to rate their support for the president again. Using the results of these experiments, combined with detailed demographic information about individual voters, the campaign was able to pinpoint both what kinds of voters had been persuaded to support the president, and which issues had persuaded them.
Avi Feller, a graduate student in statistics at Harvard who has worked on this kind of modeling, compared it to medical research.
“The statistics of drug trials are very similar to the statistics of experiments in campaigns,” he said. “I have some cancer drug, and I know it works well on some people—for whom is the cancer drug more or less effective?”
“Campaigns have always been about trying to persuade people. What’s new here is we’ve spent the time and energy to go through this randomization process,” Slaby said.
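The kind of analysis described above, finding which voters moved and on which issues, can be sketched in a few lines. This is a simplified illustration, not the campaign's model: the voter segments, the issue label, the 1 to 10 support rating, and the presence of a control group that got a call without an issue conversation are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

# Each record: (voter segment, experiment arm, support before, support after).
# Hypothetical data; support is rated on an assumed 1 to 10 scale.
calls = [
    ("suburban_women", "health_care", 4, 6),
    ("suburban_women", "health_care", 5, 6),
    ("suburban_women", "control", 4, 4),
    ("suburban_women", "control", 5, 6),
    ("young_men", "health_care", 5, 5),
    ("young_men", "health_care", 6, 6),
    ("young_men", "control", 5, 5),
    ("young_men", "control", 6, 7),
]


def mean_shift(calls):
    """Average change in rated support for each (segment, arm) pair."""
    shifts = defaultdict(list)
    for segment, arm, before, after in calls:
        shifts[(segment, arm)].append(after - before)
    return {key: mean(values) for key, values in shifts.items()}


def persuasion_effect(calls, control="control"):
    """Shift in each issue arm minus the shift in the same segment's control arm.

    Because voters are assigned to arms at random, the difference estimates
    how much the issue conversation itself moved that segment.
    """
    shifts = mean_shift(calls)
    return {
        (segment, arm): shift - shifts[(segment, control)]
        for (segment, arm), shift in shifts.items()
        if arm != control and (segment, control) in shifts
    }


effects = persuasion_effect(calls)
```

In this invented data, the health care conversation moves one segment by a full point relative to control and does nothing useful for the other, which is the "for whom is it more or less effective" question Feller describes.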
Issenberg reported that Democratic strategists have been experimenting with persuasion targeting since 2010, and that the Analyst Institute, an organization devoted to improving Democratic campaign tactics through experimentation, had played a key role in its development.
Slaby said the Obama campaign’s persuasion strategy built on these efforts, but at greater scale.
Aaron Strauss, the targeting director at the Democratic Congressional Campaign Committee, said in a statement that the DCCC was also running a persuasion targeting program this year using randomized experiments as part of its work on congressional races.
What were the persuasion scores good for—and how well did they work?
The persuasion scores allowed the campaign to focus its outreach efforts, including its volunteer phone calls, on voters who might actually change their minds as a result. The scores also guided the campaign in deciding which policy messages individual voters should hear.
Slaby said the campaign had some initial data suggesting that the persuasion score had been effective—but that the campaign was still working on an in-depth analysis of which of its individual tactics actually worked.
But a successful “persuasion” phone call may not change a voter’s mind forever—just like a single drug dose will not be effective forever.
One official with knowledge of the campaign’s data operation said that the campaign’s experiments also tested how long the “persuasion” effect lasted after the initial phone conversation—and found that it was only about three weeks.
“There is no generic conclusion to draw from this experimentation that persuasion via phone only lasts a certain amount of time,” Slaby said. “Any durability effects we saw were specific to this electorate and this race.”