For this news app, ProPublica, Investigative Post and the Columbia University Graduate School of Journalism combined data from 12 of New York State’s major subsidy programs. Since most of the data is scattered in agency reports or only available through public records requests, we wanted to provide a more comprehensive overview of the companies that are receiving taxpayer money for projects in the state, and the investment and jobs they promise — and create — in return. The result was a database with nearly $5.3 billion of state and local subsidies we documented from 2011 to 2014.

The database allows users for the first time to match the recipient lists of different subsidy programs that the state has set up to aid economic development and to identify companies that benefit from more than one. See our story on one upstate company, Corning, which the data shows benefits from multiple programs.

We organized the database so that, for the most part, each row represents one subsidy contract. For some programs, such as subsidies issued by local Industrial Development Agencies and the state’s Brownfield Cleanup Program, governments reported annual updates on the status of a long-term tax break, and these updates are each represented by individual rows. For other programs, such as the state’s Economic Development Fund and JOBS Now, agencies reported total amounts of one-time grants. These distinctions are reported in the “time period” column. A separate column provides the year the data was reported. Regardless of whether the contract in question was a one-time grant or a recurring annual tax benefit, the subsidy values for any company, county or program can be added up.
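
To illustrate how such totals could be computed from the row-per-contract layout described above, here is a minimal Python sketch. The field names (`company`, `county`, `program`, `time_period`, `year`, `subsidy_value`), the company names and the dollar figures are all hypothetical and are not taken from the published data.

```python
from collections import defaultdict

# Hypothetical rows: one per one-time grant or per annual tax-break update.
rows = [
    {"company": "Acme Glass", "county": "Steuben", "program": "Industrial Development Agencies",
     "time_period": "annual", "year": 2012, "subsidy_value": 100_000},
    {"company": "Acme Glass", "county": "Steuben", "program": "Industrial Development Agencies",
     "time_period": "annual", "year": 2013, "subsidy_value": 120_000},
    {"company": "Acme Glass", "county": "Steuben", "program": "Economic Development Fund",
     "time_period": "one-time", "year": 2013, "subsidy_value": 500_000},
    {"company": "Beta Films", "county": "Kings", "program": "Film Tax Credits",
     "time_period": "one-time", "year": 2014, "subsidy_value": 250_000},
]

def total_subsidies(rows, key):
    """Sum subsidy_value over all rows that share the same value of `key`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["subsidy_value"]
    return dict(totals)

by_company = total_subsidies(rows, "company")
by_program = total_subsidies(rows, "program")
```

Because annual updates and one-time grants each occupy their own row, the same function works for either kind of contract.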

Still, differences in the availability of data from different programs and the challenges of cleaning some of the data make the database incomplete. While providing a meaningful sample of the subsidies given by New York State and local governments, it is not a complete accounting of every dollar of these economic incentives. Some of the state’s biggest deals bypass established subsidy programs and are appropriated directly from the budget. As a result, the contracts contained in the app only provide an “at least” estimate of the state’s economic development spending.

And while broadly comparable, job creation data — positions promised and created — can also vary across programs and cannot be summed. For example, “jobs promised” reported by Industrial Development Agencies represents the total number of jobs promised throughout the lifespan of a project’s contract, and the “jobs reported” field represents the cumulative total of jobs created through a particular reporting year. As a result, only the most recent year’s job count shows how the project is doing compared to what was promised; summing several years’ worth of cumulative job reporting totals would overcount job creation.
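
The difference between reading the latest cumulative figure and summing across years can be shown with a short Python sketch. The field names (`project_id`, `year`, `jobs_promised`, `jobs_reported`) and the numbers are hypothetical, chosen only to make the overcount visible.

```python
def latest_job_reports(rows):
    """Keep, for each project, only the row from its most recent reporting year."""
    latest = {}
    for row in rows:
        current = latest.get(row["project_id"])
        if current is None or row["year"] > current["year"]:
            latest[row["project_id"]] = row
    return latest

# Hypothetical annual updates for one project; jobs_reported is cumulative.
rows = [
    {"project_id": "ida-001", "year": 2012, "jobs_promised": 50, "jobs_reported": 10},
    {"project_id": "ida-001", "year": 2013, "jobs_promised": 50, "jobs_reported": 25},
    {"project_id": "ida-001", "year": 2014, "jobs_promised": 50, "jobs_reported": 40},
]

latest = latest_job_reports(rows)
jobs_created = latest["ida-001"]["jobs_reported"]    # most recent cumulative count
naive_sum = sum(r["jobs_reported"] for r in rows)    # overcounts cumulative totals
```

In this made-up case the latest report shows 40 of 50 promised jobs, while summing the three annual rows would suggest 75.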

Below is more information on how the project came together.

How We Put the Database Together

New York offers subsidies to companies through numerous state and local agencies. The local agencies include Industrial Development Agencies and Local Development Corporations, which are typically run by county governments and give out tax breaks and other incentives to local businesses. They may also provide assistance to local governments and nonprofits, which is also captured in the data.

The state agencies include Empire State Development, the state’s economic development arm, which provided data for seven of the programs we tracked: Commercial Tax Credits, Film Tax Credits, the Economic Development Fund, JOBS Now, Regional Economic Development Councils, StartUp New York and Excelsior; the New York Power Authority, which provided data on two additional subsidy programs; and the state’s Taxation and Finance Department, which provided data on the Brownfield Cleanup Program. With the exception of the Brownfield Cleanup Program, which provides incentives for remediating polluted sites, all of the programs tracked are tied to job creation goals.

The availability of public data on subsidy recipients varied by program. The data we included comes from the state’s Open Data portal, agency reports, good government groups and numerous public records requests under the state’s Freedom of Information Law. Where data was available from multiple sources, we combined it so as to give as comprehensive a view as possible for each subsidy program. This involved joining program databases together on common elements and eliminating duplicate entries.
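
As a rough illustration of joining two sources on common elements and dropping duplicates, consider the Python sketch below. It is not the procedure actually used: the matching key (`recipient` plus `project`), the name normalization and the sample records are all assumptions made for the example.

```python
def normalize(name):
    """Lowercase and collapse whitespace so the same name matches across sources."""
    return " ".join(name.lower().split())

def combine_sources(primary, secondary, key_fields):
    """Merge two lists of records on key_fields, filling gaps in the first copy."""
    combined = {}
    for record in primary + secondary:
        key = tuple(normalize(str(record[f])) for f in key_fields)
        if key not in combined:
            combined[key] = dict(record)
        else:
            # Duplicate entry: keep the first copy and fill only its empty fields.
            for field, value in record.items():
                if combined[key].get(field) in (None, ""):
                    combined[key][field] = value
    return list(combined.values())

# Hypothetical records from an open-data download and a records request.
open_data = [{"recipient": "Acme Glass Inc", "project": "Plant expansion", "address": ""}]
records_request = [
    {"recipient": "ACME  Glass Inc", "project": "Plant expansion", "address": "1 Main St"},
    {"recipient": "Beta Films", "project": "Studio", "address": "2 Dock Rd"},
]

merged = combine_sources(open_data, records_request, ["recipient", "project"])
```

Real recipient names vary in ways a simple normalization will not catch, so matches of this kind would still need manual review.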

Our published data came from the following sources:

Brownfields: We used data from the state’s Open Data portal and verified addresses using data provided by the Citizens Budget Commission.

Commercial Tax Credit Program, Economic Development Fund, JOBS Now, REDCs, Western New York Power Proceeds Allocation Board, Film Tax Credit Program: We received data through public records requests. For the Film Tax Credit Program, disclosure laws changed in March 2013, so we could not use data for projects submitted for tax credits before that time. Several projects were missing from the data when compared with data published in the program’s quarterly reports, so we added those projects manually. County information was pulled from ESD’s “Soundstages” directory when available. For JOBS Now, we supplemented the data we received with the years when projects were approved, taken from the Open Data portal.

Excelsior Jobs Program: We used data from the Open Data site and requested additional information on the projects’ addresses and dates when they were admitted. Jobs reported figures come from the Sept. 30, 2016, quarterly report.

Industrial Development Agencies and Local Development Corporations: We used data published on the state’s Open Data portal. We downloaded IDA data on July 21, 2016, and LDC data on Feb. 21, 2017.

NYPA: We received the underlying data published in this 2014 report through a public records request. We eliminated from the data companies that were approved for NYPA allocations but never actually used them.

Start-Up: We received the underlying data published in this report through a public records request.

We cleaned and organized each program’s data separately and focused on having comprehensive information on job growth, project location and subsidy values. Each subsidy was then assigned a unique ID that was used to track these values and ultimately merge common data elements, such as job growth data, project location and subsidy value, into a master table of contracts across all 12 programs.
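
One way to picture the ID assignment and merge is the Python sketch below. The ID format, the list of common fields and the sample rows are hypothetical. The source says only that each subsidy received a unique ID and that common elements were merged into a master table.

```python
# Hypothetical set of columns shared across programs.
COMMON_FIELDS = ["recipient", "county", "subsidy_value", "jobs_promised", "jobs_reported"]

def build_master_table(programs):
    """Give every subsidy a unique ID and keep only the common fields."""
    master = []
    for program_name, rows in programs.items():
        for index, row in enumerate(rows, start=1):
            record = {"subsidy_id": f"{program_name}-{index:05d}", "program": program_name}
            for field in COMMON_FIELDS:
                # None marks a field that a program does not report.
                record[field] = row.get(field)
            master.append(record)
    return master

# Hypothetical cleaned per-program tables.
programs = {
    "excelsior": [{"recipient": "Acme Glass", "county": "Steuben",
                   "subsidy_value": 300_000, "jobs_promised": 20, "jobs_reported": 12}],
    "brownfield": [{"recipient": "Gamma Realty", "county": "Erie",
                    "subsidy_value": 900_000}],
}

master = build_master_table(programs)
```

Leaving missing fields as `None` rather than zero keeps a program without job goals, such as the Brownfield Cleanup Program, from looking as if it reported zero jobs.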

We limited our database to the years 2011 through 2014 based on the availability of data across programs. To identify corporate relationships between participants in multiple subsidy programs, we researched businesses to establish connections and find their parent companies. We only included connections that we were able to verify; even if a company is not listed as having a subsidiary, it may have one.

Despite these efforts, some inconsistencies are unavoidable when dealing with many programs that have different reporting requirements. Some of these are outlined below:

  • For the Film Tax Credits, Commercial Tax Credits, Excelsior and Brownfields programs, companies receive tax credits after reporting job or project growth. For the Industrial Development Agencies, companies receive a tax incentive that allows them to pay reduced property taxes while reporting on their job creation progress annually.
  • For the Industrial Development Agencies data, some projects listed more than one tax break recipient. In these cases, it was unclear how the tax break was divided between companies. We split up the recipients and note in our database when the company received the subsidy with other businesses. The agencies’ data also listed individuals as subsidy recipients. We could not fully verify the businesses that they work for, so we left their names in the database as is.
  • For the Film Tax Credit Program, the studio is the ultimate recipient of the tax credit. For this reason, we listed the studio as the parent company of the production company. Generally we sought to identify the parent companies as the recipients of the subsidies provided.
  • Estimating the dollar value of NYPA subsidies, which are discounted power allocations rather than tax breaks or cash grants, represented a particular challenge. The value of the allocations fluctuates depending on the market rate of electricity: The higher prices are, the more the discount is worth. We calculated the approximate value of each allocation using average savings that NYPA officials estimated from 2014 electricity prices.
  • The Commercial Tax Credit program was unique in that it rolled over tax credits from one year to another, meaning that applicants didn’t apply for the credit that year but received a prorated credit from previous years. We counted these credits as of the year they were received.
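
The splitting of multi-recipient Industrial Development Agency projects mentioned in the list above could look something like the following Python sketch. The semicolon delimiter, the field names and the `shared_with_others` flag are assumptions for illustration. The source does not say how the dollar value was attributed among recipients, so the sketch simply copies the project's other fields onto each row unchanged.

```python
def split_recipients(row, delimiter=";"):
    """Return one row per recipient, flagging rows that share a subsidy."""
    names = [n.strip() for n in row["recipients"].split(delimiter) if n.strip()]
    shared = len(names) > 1
    result = []
    for name in names:
        new_row = {k: v for k, v in row.items() if k != "recipients"}
        new_row["recipient"] = name
        new_row["shared_with_others"] = shared
        result.append(new_row)
    return result

# Hypothetical project rows.
joint = split_recipients(
    {"project": "Mill redevelopment", "recipients": "Acme Glass; Beta Holdings",
     "subsidy_value": 1_000}
)
single = split_recipients(
    {"project": "Warehouse", "recipients": "Solo LLC", "subsidy_value": 500}
)
```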

Some of the most compelling information in the data we received was in the descriptions of the projects. However, these descriptions were not consistent or easy to clean, and some programs had no descriptions at all. So we excluded these descriptions, though they can still be found in the raw data available for download.