Everything We Know (So Far) About Obama’s Big Data Tactics

A new look at what the Obama campaign did with its much-heralded data operation.

November 29, 2012, 10:45 am

This post is being kept up-to-date. It was first published on Nov. 13.

For the past nine months, we’ve been following how political campaigns use data
about voters to target them in different ways. During the election, the Obama
campaign, which had assembled a cutting-edge team of data scientists, developers, and digital
advertising experts, refused to say anything about how it was targeting voters.

Now, members of the campaign are starting
to open up about what their team actually did with all that data. Based on our
own interviews with campaign officials, as well as reporting elsewhere, here’s
an overview of some of the campaign’s key strategies, from connecting Facebook
data to the national list of registered voters, to measuring how “persuadable”
individual swing state voters might be.

Here’s what the nerds did.

What data did they use—and were they tracking you across the web?

It’s still not clear.

Chief Innovation and Integration Officer
Michael Slaby and other campaign officials said again that they relied less on
consumer data, and more on public voting records and the responses that voters
provided directly to campaign canvassers. They would not comment on exactly
what data the campaign collected about individual people.

Officials within the campaign said media
reports had overemphasized the importance of web
cookie data for tracking voters across the web.

Slaby said the campaign primarily used web
cookies to serve ads to people who had visited the campaign site — a
tactic called “retargeting,” which Mitt Romney’s campaign also used.

The campaign also told us web cookies were
useful for helping them analyze whether volunteers were actually reading the
online training materials the campaign had prepared for them. (They didn’t
track this on a volunteer-by-volunteer basis.)

The backbone of the campaign’s effort was a
database on registered voters compiled by the Democratic National Committee. It
was updated weekly with new public records from state level election officials,
one official said.

The campaign then added its own voter data
to the mix. Officials wouldn’t say exactly what information they added.

What will happen to the data about millions of voters collected during the campaign?

It’s still not clear.

As the Washington Post reported earlier this month, other Democratic candidates are eager to use Obama’s voter data for their own campaigns.

Some of the president’s data will certainly go to the Democratic National Committee, where it can be used to help other Democrats.

But both the Post and the Wall Street Journal reported that it’s unclear if the DNC has the resources — technological and financial — to manage all of the voter data and analysis the campaign produced.

The Wall Street Journal cited an anonymous Democratic Party official, who said that a new organization might be created to manage and update Obama’s campaign data. This kind of organization “would have the potential to give the president leverage in the selection of the next Democratic presidential nominee,” the Journal reported.

After his 2008 win, Obama created “Organizing for America,” a group that worked within the DNC to advance the president’s agenda.

How did the Obama campaign know which TV shows voters were watching?

Did the Obama campaign really “get a big list of the names of people who watched certain things on TV,” as Gawker asked earlier this month?

No. But they were able to get very detailed information about the television habits of certain groups of voters.

In order to decide where to buy their TV ads, the Obama campaign matched lists of voters to the names and addresses of cable subscribers, as the Washington Post, The Atlantic, and others have reported.

This allowed them to analyze which channels the voters they wanted to reach were watching.

The campaign focused on swing state voters it had scored as “persuadable,” and on supporters who needed to be encouraged to turn out at the polls, Carol Davidsen, who ran the campaign’s television ad “Optimizer” project, told ProPublica. It also looked at Latino and African-American voters.

Working with Rentrak, a data company, the campaign tracked the television viewership of these groups across 60 channels, looking at how viewership changed for every quarter hour of the day, Davidsen said.

The campaign was able to identify the audience size of each group for a particular channel at a particular time — and then analyze where and when the campaign could advertise to key voters at the lowest cost.

For instance, the campaign was able to identify a subset of persuadable voters who lived in households that watched less than two hours of television a day. This allowed the campaign to schedule ads during the times these voters would actually be watching TV.

“Even if it’s more expensive, it’s worth it, because we can’t catch them later,” Davidsen told ProPublica.
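To make the trade-off concrete, here is a minimal sketch of how ad slots might be ranked by cost per targeted viewer. It is an illustration only, not the campaign's actual "Optimizer": the `Slot` fields, channel names, audience figures and prices are all invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """One ad slot: a channel at a given quarter hour (hypothetical fields)."""
    channel: str
    quarter_hour: str      # start time of the 15-minute block, e.g. "14:15"
    target_viewers: int    # estimated viewers from the group the campaign wants
    cost: float            # price of one ad spot, in dollars


def cost_per_target_viewer(slot: Slot) -> float:
    """Dollars spent per targeted viewer reached; infinite if nobody is reached."""
    if slot.target_viewers == 0:
        return float("inf")
    return slot.cost / slot.target_viewers


def rank_slots(slots):
    """Cheapest way to reach the target group first."""
    return sorted(slots, key=cost_per_target_viewer)


# Invented sample data for three slots.
slots = [
    Slot("News A", "18:00", 4000, 2000.0),
    Slot("Reruns B", "14:15", 1500, 300.0),
    Slot("Sports C", "20:30", 0, 500.0),
]
ranked = rank_slots(slots)
```

In this toy data the cheap daytime rerun slot ranks first even though the evening news reaches more targeted viewers in total. As Davidsen's comment suggests, a real plan would sometimes override a ranking like this to reach voters who watch very little TV.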

So, how was the campaign able to get all this viewership data?

In Ohio, the campaign worked with FourthWall Media, a data and targeting company, to get television viewership data for individual homes, which had “anonymous but consistent” household ID numbers, Davidsen said. This allowed the campaign to track household viewing behavior over time, without knowing which exact voters they were analyzing. (FourthWall Media has yet to respond to a request for comment on how their process works.)

Working with Rentrak, the campaign could follow the viewing habits of larger groups of voters in different TV markets across the country.

Rentrak used a third-party data company to match lists of voters to TV operator data about subscribers — and then match that information to the anonymous ID numbers that Rentrak uses to track the usage patterns of television set-top boxes.

To protect users’ privacy, none of the companies involved have all the information they would need to know what shows a specific voter watched on TV, Rentrak’s Chris Wilson told ProPublica. For instance, Rentrak knows viewers’ ID numbers and viewing habits, but doesn’t know which ID numbers correspond with which name or address. In fact, Rentrak never knows the name or address of the household, Wilson said.
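The separation Wilson describes can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a description of Rentrak's or any vendor's real system: the crosswalk table, the `hh-` ID format, the names and the viewing log are all made up. The point is only that the party holding names never sees viewing data, and the party holding viewing data never sees names.

```python
from collections import Counter

# Held only by the (hypothetical) matching vendor:
# (name, address) -> anonymous household ID.
vendor_crosswalk = {
    ("Ann Lee", "1 Elm St"): "hh-001",
    ("Bo Diaz", "2 Oak Ave"): "hh-002",
    ("Cy Roe", "3 Pine Rd"): "hh-003",
}

# Supplied by the campaign: voters and the group each one belongs to.
campaign_list = [
    ("Ann Lee", "1 Elm St", "persuadable"),
    ("Bo Diaz", "2 Oak Ave", "turnout"),
    ("Cy Roe", "3 Pine Rd", "persuadable"),
    ("Di Fox", "4 Ash Ct", "turnout"),  # not a subscriber, so no match
]


def strip_identities(voters, crosswalk):
    """Vendor step: return {household_id: group}, dropping names and addresses."""
    groups = {}
    for name, address, group in voters:
        household_id = crosswalk.get((name, address))
        if household_id is not None:
            groups[household_id] = group
    return groups


# Held only by the viewership company: anonymous ID and channel watched.
viewing_log = [
    ("hh-001", "News A"),
    ("hh-002", "News A"),
    ("hh-003", "Reruns B"),
    ("hh-001", "Reruns B"),
    ("hh-999", "News A"),  # household that is not on the campaign's list
]


def audience_counts(log, groups_by_id):
    """Viewership step: count views per (group, channel) using IDs only."""
    counts = Counter()
    for household_id, channel in log:
        group = groups_by_id.get(household_id)
        if group is not None:
            counts[(group, channel)] += 1
    return counts


groups_by_id = strip_identities(campaign_list, vendor_crosswalk)
report = audience_counts(viewing_log, groups_by_id)
```

The campaign would receive only `report`, which says how many views each group logged on each channel, not which household watched what.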

While commercial advertisers are beginning to do this kind of data matching and analysis, “What [the Obama campaign] did was probably on the more sophisticated side compared to a lot of folks,” Wilson told ProPublica. He said the campaign had done “pioneering work” in television targeting.

Rentrak also works with large consumer data companies, including Experian and Epsilon, to match television viewer data with consumer data. According to the Federal Communications Commission privacy rules, cable operators are not allowed to disclose subscribers’ “personally identifiable information” without their consent, but they can collect and share “aggregate” data.

How important was the data the campaign could access through its Facebook app about volunteers
and their friends?

Observers noted that in the last days of
the campaign, Obama supporters who used the campaign’s Facebook app received
e-mails with the names and profile photos of friends in swing states.
The e-mails urged supporters to contact their friends and encourage them to
vote.

It wasn’t clear how well the effort went or
what the response was. Some people had been encouraged to ask their Republican
friends to vote. A Romney official who had signed up for the campaign’s e-mail
list was told to contact his Facebook friend Eric Cantor,
the Republican House Majority Leader.

What we now know is that the campaign did
in fact try to match Facebook profiles to people’s voting records. So if you got
a message encouraging you to contact a friend in Ohio, the campaign may have
targeted that friend based on their public voting record and other information
the campaign had.

But the matching process was far from
perfect, in part because the information the campaign could access about
volunteers’ friends was limited.

Were privacy concerns about the campaign’s data collection justified?

We’ve reported on some of the concerns about the
amount of data the campaign has amassed on individual voters.

Were those concerns at all justified? It’s
hard to say right now, since we still don’t know where the campaign drew the
line about what data they would and would not use.

Obama officials did dismiss the idea that the
campaign cared about voters’ porn habits.

The analytics team estimated how “persuadable” voters are. What does that
mean?

It all came down to four numbers.

The Obama campaign had a list of every
registered voter in the battleground states. The job of the campaign’s
much-heralded data scientists was to use the information they had amassed to
determine which voters the campaign should target, and what each voter
needed to hear.

They needed to go a little deeper than
targeting “waitress moms.”

“White suburban women? They’re not all the
same. The Latino community is very diverse with very different interests,” Dan
Wagner, the campaign’s chief analytics officer, told The Los Angeles Times. “What the data
permits you to do is figure out that diversity.”

What Obama’s data scientists produced was a
set of four individual estimates for each swing state voter’s behavior. These
four numbers were included in the campaign’s voter database, and each score,
typically on a scale of 1 to 100, predicted a different element of how that
voter was likely to behave.

Two of the numbers estimated voters’
likelihood of supporting Obama, and of actually showing up to the polls. These
estimates had been used in 2008. But the
analysts also used data about individual voters to make new, more complicated
predictions.

If a voter supported Obama, but didn’t vote
regularly, how likely was he or she to respond to the campaign’s reminders to
get to the polls?

The final estimate was the one that had
proved most elusive to earlier campaigns—and that may be most influential
in the future. If voters were not strong Obama supporters, how likely was it
that a conversation about a particular issue — like Obamacare or the
Buffett rule—could persuade them to change their minds?
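One way to picture the four numbers is as a small record attached to each voter. The sketch below is purely illustrative: the field names, the cutoff of 60 and the routing rules are assumptions made for the example, since campaign officials did not describe how the scores were combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoterScores:
    """Four hypothetical 1-to-100 scores for one voter."""
    support: int            # likelihood of supporting the candidate
    turnout: int            # likelihood of actually voting
    turnout_response: int   # likelihood a reminder gets a supporter to vote
    persuadability: int     # likelihood a conversation changes their mind


def suggest_outreach(scores: VoterScores, cutoff: int = 60) -> str:
    """Toy routing rule; the cutoff and the rules themselves are assumptions."""
    is_supporter = scores.support >= cutoff
    if is_supporter and scores.turnout < cutoff and scores.turnout_response >= cutoff:
        return "turnout reminder"
    if not is_supporter and scores.persuadability >= cutoff:
        return "persuasion conversation"
    return "no contact"


sporadic_supporter = VoterScores(support=85, turnout=30, turnout_response=70, persuadability=20)
open_minded = VoterScores(support=45, turnout=80, turnout_response=10, persuadability=75)
firm_opponent = VoterScores(support=10, turnout=90, turnout_response=5, persuadability=8)
```

Under these made-up rules, the sporadic supporter would get a turnout reminder, the open-minded voter a persuasion conversation, and the firm opponent no contact at all.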

Slaby said that there was early evidence
that the campaign’s estimate of how “persuadable” voters would be on certain
issues had actually worked.

“This is very much a competitive advantage
for us,” he said.

Wagner began working on persuasion
targeting during the 2010 campaign, Slaby said — giving the campaign a
long time to perfect its estimates.

Did everyone on the campaign have access to these scores?

No.

Campaign volunteers were not given access
to these individual scores, one official said. “Oh, my neighbor Lucy is a 67
and her husband is a 72—we would probably consider that a distraction.”

Do other political campaigns also assign
me a set of numerical scores?

Yes.

The use of microtargeting scores — a
tactic from the commercial world — is a standard part of campaign data
efforts, and one that has been well documented before.

In his book exploring the rise of experimentally
focused campaigns, Sasha Issenberg compares microtargeting scores to
credit scores for the political world.

In 2008, the Obama campaign ranked voters’
likely support of the senator from Illinois using a 1 to 100 scale.
This year, The Guardian reported, Americans for Prosperity, a conservative
group backed by the Koch brothers, ranked some swing state voters on a scale from 0 to 1,
with 0 being “so leftwing and pro-government” that they are “not worth
bothering with,” and 1 being “already so in favour of tax and spending cuts
that to talk to them would be preaching to the converted.”

What’s different about what Obama’s data scientists did?

The Obama campaign’s persuadability score tried
to capture not just a voter’s current opinion, but how that individual opinion
was likely to change after interactions with the campaign. Most importantly, Obama’s
analysts did not assume that voters who said they were “undecided” were
necessarily persuadable—a mistake campaigns have made in the past,
according to targeting experts.

“Undecided is just a big lump,” said Daniel Kreiss, who wrote a book on the
evolution of Democratic “networked” politics from Howard
Dean through the 2008 Obama campaign.

Voters who call themselves “undecided”
might actually be strong partisans who are unwilling to share their
views—or simply people who are disengaged.

“Someone who is
undecided and potentially not very persuadable, you might spend all the time in
the world talking to that person, and their mind doesn’t change. They stay
undecided,” Slaby said.

To pinpoint voters who might actually
change their minds, the Obama campaign conducted randomized experiments, Slaby
said. Voters received phone calls in which they were asked to rate their
support for the president, and then engaged in a conversation about different
policy issues. At the end of the conversation, they were asked to rate their
support for the president again. Using the results of these experiments,
combined with detailed demographic information about individual voters, the
campaign was able to pinpoint both what kinds of voters had been persuaded to
support the president, and which issues had persuaded them.
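The basic arithmetic behind an experiment like this can be sketched as follows. This is a simplified illustration, not the campaign's model: the segment labels, issue names and ratings are invented, and a real analysis would also compare against voters who got no persuasion call and would rely on far more data and statistical modeling.

```python
from collections import defaultdict
from statistics import mean

# Each record: (voter segment, issue discussed, rating before, rating after).
# All values are invented for illustration.
calls = [
    ("segment A", "health care", 5, 7),
    ("segment A", "health care", 6, 7),
    ("segment A", "taxes", 5, 5),
    ("segment B", "taxes", 4, 6),
    ("segment B", "health care", 4, 4),
]


def average_shift(records):
    """Mean change in stated support for each (segment, issue) pair."""
    shifts = defaultdict(list)
    for segment, issue, before, after in records:
        shifts[(segment, issue)].append(after - before)
    return {key: mean(values) for key, values in shifts.items()}


def best_issue_by_segment(avg_shift):
    """For each segment, the issue with the largest average shift."""
    best = {}
    for (segment, issue), shift in avg_shift.items():
        if segment not in best or shift > best[segment][1]:
            best[segment] = (issue, shift)
    return best


avg = average_shift(calls)
best = best_issue_by_segment(avg)
```

In this toy data, health care moves segment A and taxes move segment B, which is the kind of pattern that would tell a campaign which message to use with which voters.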

Avi Feller, a graduate student in statistics at Harvard who has worked on this kind of modeling, compared it to medical research.

“The statistics of drug trials are very similar to the statistics of experiments in campaigns,” he said. “I have some cancer drug, and I know it works well on some people—for whom is the cancer drug more or less effective?”

“Campaigns have always been
about trying to persuade people. What’s new here is we’ve spent the time and
energy to go through this randomization process,” Slaby said.

Issenberg
reported that Democratic strategists have been experimenting with persuasion
targeting since 2010, and that the Analyst
Institute, an organization devoted to improving Democratic campaign
tactics through experimentation, had played a key role in its development.

Slaby said the Obama campaign’s persuasion
strategy built on these efforts, but at greater scale.

Aaron Strauss, the targeting director at
the Democratic Congressional Campaign Committee, said in a statement that the
DCCC was also running a persuasion targeting program this year using randomized
experiments as part of its work on congressional races.

What were the persuasion scores good
for—and how well did they work?

The persuasion scores allowed the campaign
to focus its outreach efforts, including volunteer phone calls, on
voters who might actually change their minds as a result. They also guided the
campaign in what policy messages individual voters should hear.

Slaby said the campaign had some initial
data suggesting that the persuasion score had been effective—but that the
campaign was still working on an in-depth analysis of which of its individual
tactics actually worked.

But a successful
“persuasion” phone call may not change a voter’s mind forever—just like a
single drug dose will not be effective forever.

One official
with knowledge of the campaign’s data operation said that the campaign’s
experiments also tested how long the “persuasion” effect lasted after the
initial phone conversation—and found that it was only about three weeks.

“There is no
generic conclusion to draw from this experimentation that persuasion via phone
only lasts a certain amount of time,” Slaby said. “Any durability effects we
saw were specific to this electorate and this race.”