The next pandemic is just a forest clearing away. We’re not doing enough to prevent viruses from spilling over from wildlife to humans.

As a devastating outbreak of Ebola spread to Tommy Garnett’s homeland of Sierra Leone in 2014, the conservationist had a hunch.

Garnett had long lamented the deforestation from farming, mining and logging in the region and wondered if tree loss had anything to do with the outbreak that had swept into Sierra Leone from a forested area of Guinea. With activities in his country at a standstill due to the outbreak, Garnett asked the ERM Foundation, the nonprofit arm of a sustainability consulting firm in London, to help him analyze patterns of deforestation.

Their findings suggested Garnett’s hypothesis was valid: A particular pattern of deforestation seemed to explain a number of Ebola outbreaks they studied, including the one that began in Meliandou, Guinea.

The majority of emerging infectious diseases originate from wildlife, but how, why and when a pathogen will jump from one species to another, including to humans, is still being studied by academics and scientists worldwide. This phenomenon is called spillover.

One study analyzing historical outbreaks found that land-use change — such as clearing forests for agriculture — was the biggest driver of spillover, exceeding factors like climate change and the consumption of meat from wild animals.

We wondered: Is it possible to calculate the risk of a spillover event happening because of deforestation? So we set out to examine how clearing trees can increase the likelihood of such an event, using Ebola as an example pathogen.

We found that the risk of another spillover due to forest loss has increased within the past two decades in the locations of five previous Ebola outbreaks — including the site in Guinea where the largest Ebola outbreak in history began.

As part of the reporting process, ProPublica journalists consulted with biologists, ecologists and infectious disease experts to model how the risk of spillover events has changed over time. Our analysis was based on two peer-reviewed scientific models, generating completely new results. One of the researchers we interviewed said that the analysis ProPublica performed is exactly what they would have liked to do, had they more time and resources.

Here’s how we did it.


Our inquiry began with an academic article that was a direct result of Garnett and the ERM team’s study from 2015. The ERM researchers had pitched their work to academics, hoping it could be validated and expanded in a rigorous, peer-reviewed study. Their findings caught the attention of a nonprofit scientific research institute specializing in forest science and its academic collaborators, biologists at the University of Málaga, Spain. Led by Jesús Olivero, a biologist specializing in geographic distributions of animals, the group continued exploring the link between spillover and forest loss.

Olivero and the team focused on five main categories of factors: forest loss, forest fragmentation, human population, geographic location and a measure of the possibility that Ebola was circulating in wildlife based on the environmental features of a particular area. They tested more than 100 variables related to those five factors. They did not examine other factors that may have played a role, such as how often residents came into contact with wildlife, hygiene practices or accessibility of health care.

In a 2017 journal article, the team found that a handful of variables about forest loss in the two years leading up to an outbreak were best able to explain the pattern of where and when recent spillover-induced Ebola outbreaks have occurred. They used the variables to create a model, which identified seven Ebola outbreaks that were significantly related to forest loss.

We were curious about the outbreak locations that had been singled out by Olivero’s deforestation model. We wanted to know: Has deforestation gotten worse in those places? And if so, did the loss of forest increase the risk of another spillover event occurring?

To answer the first question, we used satellite image data to quantify the degree of deforestation over time. For each of the seven outbreak locations, we defined a circular area with a radius of 20 kilometers, or about 12.5 miles, and calculated the amount of forest loss in each year from 2001 to 2021, the range of time for which data is available.

Click to show technical details: Ebola Outbreaks Significantly Related to Forest Loss

Seven spillover-induced Ebola outbreaks linked to forest loss events in Olivero et al., 2017. Coordinates provided by Jesús Olivero and checked by Irena Hwang and Al Shaw.

Forest cover data is from the University of Maryland’s Global Forest Change database. When we accessed the database, it had data from 2000 through 2021, and it had last been updated on April 28, 2022. We focused on circular areas of interest, or AOI, with a 20 kilometer radius centered around each outbreak location, an area experts said was reasonable for a person there to cover on foot or by bicycle. Pixels with at least 30% tree cover in 2000 were classified as forest, and pixels that were water or had no available data were classified as no-data pixels. For each subsequent year of data, we checked whether pixels were no-data, were still existing forest (i.e., at least 30% trees in 2000) or had lost forest. For each AOI and each year of data available, we divided the total area of forest pixels by the total area of all pixels not classified as no-data, and we defined the fraction of area converted (Φ) to be 1 minus that quantity.
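The area-converted calculation can be illustrated with a minimal sketch. It assumes a toy grid of equal-area pixels and uses hypothetical pixel codes (`NODATA`, `NONFOREST`, `FOREST`) that are not the Global Forest Change encoding; a real analysis would weight each pixel by its actual area.

```python
from typing import List

# Hypothetical pixel codes for this sketch only (not the source database's encoding).
NODATA, NONFOREST, FOREST = 0, 1, 2


def fraction_converted(grid: List[List[int]]) -> float:
    """Return phi = 1 - (forest pixels / valid pixels), assuming equal-area pixels."""
    valid = sum(1 for row in grid for px in row if px != NODATA)
    if valid == 0:
        raise ValueError("AOI contains only no-data pixels")
    forest = sum(1 for row in grid for px in row if px == FOREST)
    return 1.0 - forest / valid


# Toy 3x3 area of interest for a single year: 6 forest pixels out of 8 valid ones.
aoi_one_year = [
    [FOREST, FOREST, NONFOREST],
    [FOREST, NODATA, NONFOREST],
    [FOREST, FOREST, FOREST],
]
phi = fraction_converted(aoi_one_year)
```

Repeating this for each year's grid would give one Φ value per year for an AOI.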

In all seven locations, deforestation had increased since the previous outbreaks occurred. But to understand how these trends in deforestation might affect spillover risk, we needed another model.

The epidemiological model: an incorporation of changes in forest loss into spillover risk over time

Around the same time Olivero’s team developed the deforestation model, a different group of researchers, led by Christina Faust at the University of Glasgow, Scotland, created an epidemiological model that calculates an area’s spillover risk by using information about its deforestation over time. Unlike the deforestation model, this model considers not only changes to forests in aggregate but also how the patterns of tree loss might affect risk.

It is an adaptation of a classic epidemiological model that tracks how populations of susceptible, infected and recovered individuals change over time as a virus spreads. Crucially, it incorporates information about the degree and type of deforestation that’s occurring in an area over time.

When we think of deforestation, we might picture large swaths of forest clear-cut for acres of industrial agriculture. But deforestation often occurs on a smaller scale. Activities like clearing trees for subsistence farming or gathering wood for charcoal can result in many smaller patches of tree loss, rather than huge clearings. When deforestation occurs in small patches, the total area around the “edge” — the border area around clearings where humans and potentially disease-carrying animals can interact — will often exceed the total area of cleared forest.

The researchers found that the highest risk of spillover occurs at intermediate levels of forest loss. That’s because there’s just enough disturbed forest left for adaptable species like bats to survive. At the same time, the total amount of edge around those deforested patches — the places where people are most likely to come in contact with wildlife — is at its peak. When the scale tips beyond that intermediate level of habitat loss, there isn’t as much forest to support the wildlife, resulting in less total edge where humans and animals can collide.

Using the same satellite image data that we relied on to quantify forest cover over time, we calculated the edge area for each location each year between 2001 and 2021. Then, we calculated trend lines linking total edge area to degree of deforestation for each location. We refer to these lines as “deforestation trends.”

Click to show technical details:

The core of Faust’s epidemiological model is a modified compartmental model that incorporates edge as a function of habitat conversion, ε[Φ], into changes in the populations of susceptible, infected and recovered core and matrix species (Sc, Ic, Rc, Sm, Im, Rm). Core species are assumed to be reservoirs for the virus of interest. We consider humans to be the matrix species. For the ordinary differential equations describing the model and descriptions of the other variables in them, please see Faust et al., 2018.

We implemented the definition of edge described in Faust’s 2018 paper. For each AOI around an outbreak location, we calculated the total area covered by all pixels representing forest and all pixels where forest loss was detected during a year between 2001 and 2021. For all pixels where forest loss was detected, we established a 200-meter-wide, or 656-foot-wide, buffer area around contiguous loss pixels, summed the areas of all of those buffer areas and divided that value by the total area of pixels in the AOI not classified as “no data.” We refer to this fraction as edge, ε.

We included all loss areas in edge calculations, even those smaller than 200 meters across. This sometimes results in doughnut-shaped buffers around small areas of loss, but we chose to keep these as they show locations where humans have been inhabiting the forest. Because some buffer areas may overlap, edge can exceed 1, as described by Faust.
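A rough sketch of the edge calculation on a toy pixel grid follows. Several choices here are assumptions made for illustration and are not taken from the original analysis: contiguity is 4-connected, the buffer is a square neighborhood of `buffer_px` pixels rather than a true 200-meter distance, the buffer only counts pixels outside the patch, and all pixels have equal area. The function names are hypothetical.

```python
from collections import deque
from typing import List, Set, Tuple

Pixel = Tuple[int, int]


def loss_patches(loss: List[List[bool]]) -> List[Set[Pixel]]:
    """Group loss pixels into contiguous patches (4-connectivity, an assumption)."""
    rows, cols = len(loss), len(loss[0])
    seen: Set[Pixel] = set()
    patches: List[Set[Pixel]] = []
    for r in range(rows):
        for c in range(cols):
            if not loss[r][c] or (r, c) in seen:
                continue
            patch, queue = {(r, c)}, deque([(r, c)])
            seen.add((r, c))
            while queue:  # breadth-first flood fill of one patch
                pr, pc = queue.popleft()
                for nr, nc in ((pr + 1, pc), (pr - 1, pc), (pr, pc + 1), (pr, pc - 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and loss[nr][nc] and (nr, nc) not in seen:
                        seen.add((nr, nc))
                        patch.add((nr, nc))
                        queue.append((nr, nc))
            patches.append(patch)
    return patches


def edge_fraction(loss: List[List[bool]], valid: List[List[bool]], buffer_px: int) -> float:
    """Sum per-patch buffer areas and divide by the valid area.

    Buffers are summed patch by patch, so overlapping buffers are counted
    more than once and the result can exceed 1.
    """
    rows, cols = len(loss), len(loss[0])
    valid_count = sum(1 for r in range(rows) for c in range(cols) if valid[r][c])
    if valid_count == 0:
        raise ValueError("AOI contains only no-data pixels")
    total_buffer = 0
    for patch in loss_patches(loss):
        ring: Set[Pixel] = set()
        for pr, pc in patch:
            for dr in range(-buffer_px, buffer_px + 1):
                for dc in range(-buffer_px, buffer_px + 1):
                    nr, nc = pr + dr, pc + dc
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and valid[nr][nc] and (nr, nc) not in patch:
                        ring.add((nr, nc))
        total_buffer += len(ring)
    return total_buffer / valid_count


# Toy 5x5 AOI with two single-pixel clearings whose buffers overlap at (2, 2).
loss_grid = [[False] * 5 for _ in range(5)]
loss_grid[1][1] = True
loss_grid[3][3] = True
valid_grid = [[True] * 5 for _ in range(5)]
edge = edge_fraction(loss_grid, valid_grid, buffer_px=1)
```

In this toy case each clearing contributes a ring of 8 pixels, and the shared pixel is counted twice, which is how summed buffers can exceed the area they cover.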

The experts we consulted said that a 200-meter buffer was reasonable for our simulations, though the buffer may be adjusted for a different virus or host species or for different human behaviors. For example, a malaria-carrying mosquito may travel a much greater distance than 200 meters, or there may be places where humans do not tend to go quite so far into forests for resources.

We then paired each edge quantity calculated for year t, εt, with the corresponding fraction of area converted for that year, Φt, generating seven sets of 21 data points. Deforestation trends linking edge and area converted, ε[Φ], were obtained by fitting polynomial regressions to each set of data, testing second-, third- and fourth-order polynomials. We selected the best fits using a minimum mean squared error criterion, except in cases where overfitting appeared to be excessive.
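The trend-fitting step can be sketched as an ordinary least-squares polynomial fit scored by mean squared error. This is an illustration under stated assumptions, not the code used in the analysis: the (Φ, ε) pairs below are made up, the helper names are hypothetical, and the fit solves the normal equations directly, which is adequate for low orders on a small data set.

```python
from typing import Dict, List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def polyfit(xs: Sequence[float], ys: Sequence[float], order: int) -> List[float]:
    """Least-squares coefficients c[0] + c[1]*x + ... + c[order]*x**order."""
    size = order + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear(a, b)


def mse(xs: Sequence[float], ys: Sequence[float], coeffs: Sequence[float]) -> float:
    """Mean squared error of the fitted polynomial on the given points."""
    preds = [sum(c * x ** i for i, c in enumerate(coeffs)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)


# Made-up data: 21 yearly (phi, edge) pairs lying exactly on a quadratic.
phi_values = [i * 0.015 for i in range(21)]
edge_values = [0.1 + 8.0 * p - 20.0 * p * p for p in phi_values]

fits: Dict[int, List[float]] = {k: polyfit(phi_values, edge_values, k) for k in (2, 3, 4)}
errors: Dict[int, float] = {k: mse(phi_values, edge_values, fits[k]) for k in (2, 3, 4)}
```

Because in-sample error cannot get worse as terms are added to a least-squares polynomial, a pure minimum-error rule tends to favor the highest order, which is one reason a separate check for overfitting is useful.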

The epidemiological model assumes a direct relationship between deforestation and the susceptibility of humans and wild animals to viral infection. As forest is destroyed, the transmissibility of a virus among wild animals is assumed to decrease, simply because there is less habitat, and thus fewer animals that can sustain the virus. Conversely, as animal habitats are destroyed, the model assumes that the number of humans increases proportionally, since the increased ability to grow food can support a larger population.

Click to show technical details:

Faust’s model assumes that each species’ carrying capacity is directly related to the transmissibility of a virus, represented by a quantity known as R0. The model also assumes that carrying capacity, and thus each species’ R0, changes linearly with percent forest converted. The virus reservoir species’ R0 decreases linearly to zero as forest is converted, while the R0 for humans increases linearly from 0.
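The linear assumption can be written out in a few lines. This is a sketch, not Faust's code: it assumes the reservoir's R0 takes its full value when no forest is converted (Φ = 0) and reaches zero at full conversion (Φ = 1), that the human R0 does the reverse, and the function and parameter names are hypothetical.

```python
def check_fraction(phi: float) -> None:
    """Fraction of area converted must lie between 0 and 1."""
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must be between 0 and 1")


def reservoir_r0(phi: float, r0_intact: float) -> float:
    """Reservoir R0: r0_intact at phi = 0, falling linearly to 0 at phi = 1."""
    check_fraction(phi)
    return r0_intact * (1.0 - phi)


def human_r0(phi: float, r0_converted: float) -> float:
    """Human R0: 0 at phi = 0, rising linearly to r0_converted at phi = 1."""
    check_fraction(phi)
    return r0_converted * phi


# Example with made-up inputs: a quarter of the area converted.
example_reservoir = reservoir_r0(0.25, r0_intact=1.2)
example_human = human_r0(0.25, r0_converted=2.0)
```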

In Faust’s paper and code, which we adapted, they use the method presented in Diekmann et al., 1990 to calculate a community R0. Although the community R0 describes secondary infection capabilities within a mixed population consisting of a presumed virus reservoir species and humans, Faust and co-authors confirmed that the quantity can be interpreted as a proxy for spillover risk from reservoir species to humans.
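In the Diekmann approach, R0 is the dominant eigenvalue of a next-generation matrix whose entries give the expected number of new infections in one group caused by an infected individual in another. The sketch below computes that eigenvalue for a two-group case (reservoir and humans). The matrix entries are placeholders chosen for illustration; how Faust's model derives them from Φ, ε[Φ] and the other parameters is not reproduced here.

```python
import math


def dominant_eigenvalue_2x2(k_cc: float, k_cm: float, k_mc: float, k_mm: float) -> float:
    """Largest eigenvalue of the non-negative matrix [[k_cc, k_cm], [k_mc, k_mm]].

    Subscripts: c = core (reservoir) species, m = matrix species (humans).
    """
    if min(k_cc, k_cm, k_mc, k_mm) < 0:
        raise ValueError("next-generation matrix entries must be non-negative")
    trace = k_cc + k_mm
    det = k_cc * k_mm - k_cm * k_mc
    # The discriminant equals (k_cc - k_mm)**2 + 4*k_cm*k_mc, so it is never
    # negative for non-negative entries; max() only guards against rounding.
    disc = max(trace * trace - 4.0 * det, 0.0)
    return (trace + math.sqrt(disc)) / 2.0


# With no cross-species transmission, the result is the larger within-species value.
no_mixing = dominant_eigenvalue_2x2(1.1, 0.0, 0.0, 2.0)
# Adding cross-species terms (placeholder values) raises the dominant eigenvalue.
with_mixing = dominant_eigenvalue_2x2(1.1, 0.3, 0.3, 2.0)
```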

In sum, the model takes in deforestation trends and characteristics about human and wildlife populations, and it translates these inputs into risk of spillover over time.

We took the deforestation trends we had calculated for the seven locations identified by the deforestation model and combined them with the epidemiological model. We also customized the epidemiological model code with parameter values specific to the particular Ebola strains that each location encountered. The parameters included a range of transmissibility of Ebola among humans, estimated from known Ebola outbreaks, and an estimate of transmissibility of Ebola among bats, the presumed host species for the virus.

Click to show technical details:

Bats are believed to be Ebola reservoirs, since antibodies to the virus and even fragments of viral genetic material have been found in them. However, live Ebola virus has not been isolated from bats, and certain properties of the virus, including R0 in bats, are not known. One of the experts we consulted suggested using a value of 1.1 for R0 for the reservoir species, since by definition, a virus must be able to persist within a reservoir species (i.e., R0 > 1).

Four of the outbreaks we considered were of the Ebola Zaire strain, and three were Ebola Sudan outbreaks. Though much more is known about Ebola Zaire, ranges of R0 in human populations for both strains have been estimated from previous outbreaks. For our calculations, we tested 100 values of R0 sampled evenly across each strain’s range: from 1.36 to 4.71 for Ebola Zaire outbreaks and 1.34 to 2.7 for Ebola Sudan outbreaks.

The Faust epidemiological model also depends heavily on a parameter called Ψ, a measure of the efficiency of viral transmission between different species. This quantity has not yet been estimated for Ebola, though experts we consulted agreed that it was likely less than 0.5 and greater than zero. For our calculations, we tested 100 values of Ψ sampled evenly between 0.005 and 0.5.

Altogether, for each location we tested 10,000 different combinations of Ψ and R0 in humans.
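The parameter sweep can be sketched as a simple grid. This assumes "sampled evenly" means evenly spaced values that include both endpoints; the names `linspace`, `R0_RANGES` and `parameter_grid` are hypothetical and not from the original code.

```python
from itertools import product
from typing import Dict, List, Tuple


def linspace(start: float, stop: float, count: int) -> List[float]:
    """Return `count` evenly spaced values from start to stop, inclusive."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


# Human R0 ranges by strain, as quoted in the text above.
R0_RANGES: Dict[str, Tuple[float, float]] = {
    "zaire": (1.36, 4.71),
    "sudan": (1.34, 2.7),
}


def parameter_grid(strain: str, count: int = 100) -> List[Tuple[float, float]]:
    """All (psi, human R0) pairs to test for one location."""
    r0_low, r0_high = R0_RANGES[strain]
    r0_values = linspace(r0_low, r0_high, count)
    psi_values = linspace(0.005, 0.5, count)
    return list(product(psi_values, r0_values))


grid = parameter_grid("zaire")
```

Each pair in `grid` would then be run through the epidemiological model for every year of a location's deforestation trend.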

In six out of the seven locations, deforestation over the past 20 years was significant, reaching a maximum degree of forest loss between approximately 10% and 30%. We excluded one location from our analysis, a village called Inkanamongo-Boende in the Democratic Republic of Congo, where an Ebola outbreak occurred in 2014 yet deforestation has remained minimal, below 4%.

Deforestation trends varied between the six remaining locations. In some locations, increasing deforestation has been accompanied by a steady increase in total edge area. This is consistent with forest being cleared in numerous small patches. In other locations, deforestation has progressed to a point where the remaining patches of forest are so spread out and isolated that overlap between the patches leads to less edge area than at lower levels of deforestation.

In all six locations, the maximum total edge area resulting from deforestation was at least twice the area of intact forest, and in some locations, it was more than three times as much. In other words, the area where humans and wild animals were likely to interact was two to more than three times the size of the area that animals have left to live in.

Integrating the deforestation trends into our customized version of the epidemiological model showed that in five of the six locations, spillover risk in 2021 — the most recent year for which data was available — was higher than during the years the original outbreaks occurred.

We observed qualitative differences in deforestation trends between locations that had experienced outbreaks of the Ebola Sudan strain versus the Ebola Zaire strain. Despite these differences, our analysis shows that local land-use change has consistently led to an increased risk of Ebola spilling over from wild animals to humans.

It’s worth keeping in mind that these findings are based on a theoretical model, and that all models, including this one, have limitations.

We chose this model because it directly translates deforestation trends into spillover risk. However, the model does not consider other factors, like how humans are consuming or interacting with wildlife, whether multiple types of wildlife may be present or how humans are using the forest. As mentioned above, the model assumes a direct relationship between the amount of forest available and the sizes of human and wildlife populations that can be sustained.

For that reason, we cannot interpret the model’s results as a measure of absolute risk. The experts we consulted said it was best used to compare risk over time for the same location, rather than among different locations. This is why we did not use the model’s results to compare risk levels between different countries or between different locations within the same country. Instead, we reported on relative increases in risk.

Finally, the model does not tell us why, how or when a spillover event might occur.

Despite these caveats, we felt it was important to conduct this analysis because it helps to crystallize trends in spillover risk due to deforestation in these key locations. Hamish McCallum, professor of infectious disease ecology at Griffith University in Australia and co-author of the epidemiological model, noted that results like ours are important because they help to “make explicit what’s essentially intuition.”

The science may suggest that deforestation should stop, but that doesn’t take into account the realities of the people living in these areas. Residents in Meliandou are subsistence farmers. Besides growing rice, they also venture into the forest to gather fruit from oil palms and burn trees to make charcoal to sell. Fertilizer, different crop rotations and help from agricultural specialists could improve their rice yields, but our reporting found that residents don’t have access to those things. And when there are poor harvests, like residents said they had in 2021, they are forced to continue cutting down trees to sustain their families.

As governments and global agencies debate how to best prevent the next pandemic, some experts are calling for more funding to prevent spillover from happening, not just to improve our preparation for and response to an outbreak after it begins. Analyses like ours can highlight locations that may be prime for ecological interventions by helping us better understand the role land-use change plays in driving spillover events.


We would like to thank the following people for the time and expertise they shared in reviewing our work. Their review does not constitute an endorsement of our methods or our discussion, and any errors are our own.

Christina Faust, research fellow at the University of Glasgow

Jesús Olivero, associate professor in the department of animal biology at the University of Málaga, Spain

Heather Lynch, professor of ecology and evolution at Stony Brook University and ProPublica data science adviser

Caroline Chen contributed reporting.