This story was prepared for the March 2014 conference, "Big Data Future," at Ohio State's Moritz College of Law, and will be published in I/S: A Journal of Law and Policy for the Information Society, 10:2 (2015). For more information, see http://bigdatafuture.org.
It’s easy to think of data journalism as a modern invention. With all the hype, a casual reader might assume that it was invented sometime during the 2012 presidential campaign. Better-informed observers can push the start date back a few decades, noting with self-satisfaction that Philip Meyer did his pioneering work during the Detroit riots in the late 1960s. Some go back even further, archly telling the tale of Election Night 1952, when a UNIVAC computer used its thousands of vacuum tubes to predict the presidential election within four electoral votes.
But all of these estimates are wrong – in fact, they’re off by centuries. The real history of data journalism pre-dates newspapers, and traces the history of news itself. The earliest regularly published periodicals of the 17th century, little more than letters home from correspondents hired by international merchants to report on the business details and the court gossip of faraway cities, were data-rich reports.
Early 18th-century newspapers were also rich with data. As if to confirm that the unavoidable facts of human existence are death and taxes, early newspapers published tables of property tax liens and of mortality and its causes. Commodity prices and the contents of arriving ships (cargo and visiting dignitaries alike) were a regular and prominent feature of newspapers throughout the 18th and 19th centuries.
Beyond business figures and population statistics, data was used in a wide variety of contexts. The very first issue of the Manchester Guardian, on May 5, 1821, contains on the last of its four pages a large table showing that the real number of students in church schools far exceeded the estimates of the student population made by proponents of education reform.
Data was also used, as it is today, as both the input to and the output of investigative exposés. This is the story of one such exposé, and of its author, New York Tribune editor Horace Greeley. It’s a remarkable tale, and one with important lessons for “big data” journalism today.
Though he’s no longer a household name, Horace Greeley was one of the most important public figures of the 19th century. His Tribune had a circulation larger than any paper in the city except for cross-town rival James Gordon Bennett’s New York Herald. More than 286,000 copies of the Tribune’s daily, weekly and semi-weekly editions were sold in the city and across the country by 1860, which by its own reckoning made it the largest-circulation newspaper in the U.S. Ralph Waldo Emerson observed, “Greeley does the thinking for the whole West at $2 per year for his paper.”
Greeley himself was a popular public speaker and a hugely influential national figure. He was a fascinating, frustrating, contradictory man. He was a leading abolitionist whose support for the Civil War was limited at best, yet his abolitionist writing in the Tribune made the paper the target of an angry mob during the Draft Riots in 1863. He was a vegetarian and a utopian socialist who published Karl Marx in the Tribune, but believed fervently in manifest destiny and America’s western expansion. He was a New York icon who thought the city was a terrible influence on working people and encouraged them to “Go West” to escape it. Though he was one of the founders of the Republican Party, his relationship with Abraham Lincoln was strained, and he ran for president in 1872 on what amounted to the Democratic ticket, losing big and dying broken-hearted before the Electoral College could meet to certify Grant’s election.
Long before his presidential campaign, and for decades, Greeley and his paper held sway with hundreds of thousands of everyday Americans. But if he was a celebrity with the people, he was far less successful convincing political elites to sponsor his entry into political office. His moralism and mercurial nature seem to have been a steady annoyance to powerful figures like New York’s William Seward and Whig (and later Republican) party boss Thurlow Weed.
Historian Richard Kluger noted of the relationship,
“[Greeley] was more useful to [Seward and Weed] than they ever proved to him. As the eloquent editor of a rising newspaper that reached, through its weekly edition, throughout the Empire State, Greeley was a lively fish on the hook, to be fed enough line to thrash about picturesquely until reeled in tightly during campaign season.”
It was perhaps out of a desire to shut Greeley up — and yet also a recognition of the care necessary when dealing with a man, as The Nation put it, “with a newspaper at his back” — that the Whigs nominated Greeley to fill a temporary vacancy in the House for the second session of the 30th Congress in 1848. The session would last only three months, and Greeley’s Congressional career would end when the term did. But what Greeley did with his time was remarkable.
By the middle of the 1800s, Congressmen’s compensation for travel to and from their districts had been an unsuccessful but simmering reform target for years. The law provided for a reimbursement of 40 cents per mile, with the distance computed “by the usually travelled route.” After taking his seat, Greeley got a look at the schedule listing every congressman’s mileage and was shocked by the sums. To Greeley, the disbursements were a wasteful relic of an earlier time, when travel to and from the far-flung reaches of the United States would have been a costly, bruising affair. The 40-cent mileage had been calculated decades earlier to match a pre-1816 congressman’s pay rate of $8 a day, assuming he could travel a mere 20 miles per day. However, thanks to steamships and the increasing prevalence of trains, travelers could go far faster than that.
Greeley saw it as an outrageous waste of the taxpayer’s money, and deployed his newspaper to correct that wrong. “If the route usually travelled from California to Washington is around Cape Horn — or the Members from that embryo State shall choose to think it is — they will each be entitled to charge some $12,000 Mileage per session accordingly.”
Rather than simply opining against it, he conceived and published a data-journalism project that, in form if not in execution, would be very much at home in a newsroom today. He asked one of his reporters, Douglas Howard, a former postal clerk, to use a U.S. Post Office book of mail routes to calculate the shortest path from each congressman’s district to the Capitol, and compared those distances with each congressman’s mileage reimbursements. On Dec. 22, 1848, with Greeley now simultaneously its editor and a brand new congressman from New York, the Tribune published a story and a table in two columns of agate type. The table listed each congressman by name with the mileage he received, the mileage the postal route would have granted him and the difference in cost between them. “Let no man jump at the conclusion that this excess has been charged and received contrary to law,” wrote Greeley in the accompanying text. “The fact is otherwise. The members are all honorable men — if any irrelevant infidel should doubt it, we can silence him by referring to the prefix to their names in the newspapers.”
It wasn’t his colleagues Greeley inveighed against, but rather, he claimed, the system. “We assume that each has charged precisely what the law allows him and thereupon we press home the question — ‘Ought not THAT LAW to be amended?’”
Among the accused stood Abraham Lincoln, in his only term as congressman. Lincoln’s travel from faraway Springfield, Illinois, made him the recipient of some $677 in excess mileage (more than $18,700 today), among the House’s worst. Besides Lincoln, Greeley’s findings included a list of historical legends, among them both of Lincoln’s vice presidents: Hannibal Hamlin, who took only an extra $64.80 to go between Washington and Maine, and Andrew Johnson, who got $122.40 extra to get to the Capitol and back from Tennessee. Daniel Webster received $72 extra for travel to and from the Senate from Massachusetts. John C. Calhoun and Jefferson Davis were recipients of an extra $313.60 and $736.80, respectively, for round-trip travel from South Carolina and Mississippi. The excesses tracked roughly according to distance from Washington. Isaac Morse, a Democrat from Louisiana whose journey comprised some 1,200 miles by postal route, received 2,600 miles in mileage from the House. A helpful if imprecise note, I assume written by Greeley, offered: “Only 409 miles less than to London.”
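The arithmetic behind Greeley's table is simple enough to sketch in a few lines of code. What follows is a minimal illustration, not a reconstruction of the Tribune's actual method: the `MileageClaim` class and its field names are hypothetical, the only inputs are the 40-cent rate and the rounded figures for Isaac Morse given above, and because the text does not say whether those mileages are one-way or round-trip totals, the dollar figure it produces should not be read as the amount Greeley published.

```python
from dataclasses import dataclass

# Statutory rate cited in the text: 40 cents per mile.
RATE_PER_MILE = 0.40


@dataclass
class MileageClaim:
    """One row of a Greeley-style table (hypothetical structure)."""
    member: str
    paid_miles: float    # miles the member was actually reimbursed for
    postal_miles: float  # shortest distance by the Post Office route book

    def excess_miles(self) -> float:
        # Miles paid beyond what the postal route would have granted.
        return self.paid_miles - self.postal_miles

    def excess_dollars(self, rate: float = RATE_PER_MILE) -> float:
        # Dollar value of the excess at the given per-mile rate.
        return round(self.excess_miles() * rate, 2)


# Rounded figures from the text ("some 1,200 miles", "2,600 miles").
morse = MileageClaim("Isaac Morse", paid_miles=2600, postal_miles=1200)
print(morse.excess_miles(), morse.excess_dollars())  # prints: 1400 560.0

# The same rate reproduces the old per-diem logic: 20 miles a day at
# 40 cents a mile matches a congressman's pay of $8 a day.
print(round(20 * RATE_PER_MILE, 2))  # prints: 8.0
```

Run over every member of the House and Senate, a calculation like this yields the three columns the Tribune printed: mileage received, mileage by postal route, and the difference in cost.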
Congressional Mileage Map
This map shows the "excess" mileage paid to each member of the 30th Congress, according to Greeley's story. Although the law that stood in 1848 specified only that the mileage would be paid by "the usually travelled route," Greeley argued that the postal route from each member's district to the Capitol ought to be the standard by which mileage was paid.