Printed and electronic copies of Modeling and Simulation in Python are available from No Starch Press and Bookshop.org and Amazon.
Projecting Population Growth#
In the previous chapter we developed a quadratic model of world population growth from 1950 to 2016. It is a simple model, but it fits the data well and the mechanisms it’s based on are plausible.
In this chapter we’ll use the quadratic model to generate projections of future growth, and compare our results to projections from actual demographers.
Generating Projections#
Let’s run the quadratic model, extending the results until 2100, and see how our projections compare to the professionals’.
Here’s the quadratic growth function again.
def growth_func_quad(t, pop, system):
return system.alpha * pop + system.beta * pop**2
And here are the system parameters.
t_0 = census.index[0]
p_0 = census[t_0]
system = System(t_0 = t_0,
p_0 = p_0,
alpha = 25 / 1000,
beta = -1.8 / 1000,
t_end = 2100)
With t_end=2100
, we can generate the projection by calling run_simulation
the usual way.
results = run_simulation(system, growth_func_quad)
Here are the last few values in the results.
show(results.tail())
Quantity | |
---|---|
Time | |
2096 | 12.462519 |
2097 | 12.494516 |
2098 | 12.525875 |
2099 | 12.556607 |
2100 | 12.586719 |
Here’s what the results look like.
results.plot(color='gray', label='model')
decorate(xlabel='Year',
ylabel='World population (billions)',
title='Quadratic model projection')
According to the model, population growth will slow gradually after 2020, approaching 12.6 billion by 2100.
I am using the word projection deliberately, rather than prediction, with the following distinction: “prediction” implies something like “this is what we expect to happen, at least approximately”; “projection” implies something like “if this model is a good description of the system, and if nothing in the future causes the system parameters to change, this is what would happen.”
Using “projection” leaves open the possibility that there are important things in the real world that are not captured in the model. It also suggests that, even if the model is good, the parameters we estimate based on the past might be different in the future.
The quadratic model we’ve been working with is based on the assumption that population growth is limited by the availability of resources; in that scenario, as the population approaches carrying capacity, birth rates fall and death rates rise because resources become scarce.
If that assumption is valid, we might be able to use actual population growth to estimate carrying capacity, provided we observe the transition into the population range where the growth rate starts to fall.
But in the case of world population growth, those conditions don’t apply. Over the last 50 years, the net growth rate has leveled off, but not yet started to fall, so we don’t have enough data to make a credible estimate of carrying capacity. And resource limitations are probably not the primary reason growth has slowed. As evidence, consider:
First, the death rate is not increasing; rather, it has declined from 1.9% in 1950 to 0.8% now (see http://modsimpy.com/mortality). So the decrease in net growth is due entirely to declining birth rates.
Second, the relationship between resources and birth rate is the opposite of what the model assumes; as nations develop and people become more wealthy, birth rates tend to fall.
We should not take too seriously the idea that this model can estimate carrying capacity. But the predictions of a model can be credible even if the assumptions of the model are not strictly true. For example, population growth might behave as if it is resource limited, even if the actual mechanism is something else.
In fact, demographers who study population growth often use models similar to ours. In the next section, we’ll compare our projections to theirs.
Comparing Projections#
From the same Wikipedia page where we got the past population estimates, we’ll read table3
, which contains projections for population growth over the next 50-100 years, generated by the U.S. Census, U.N. DESA, and the Population Reference Bureau.
table3 = tables[3]
The column names are long strings; for convenience, I’ll replace them with abbreviations.
table3.columns = ['census', 'prb', 'un']
Here are the first few rows:
table3.head()
census | prb | un | |
---|---|---|---|
Year | |||
2016 | 7.334772e+09 | NaN | 7.432663e+09 |
2017 | 7.412779e+09 | NaN | NaN |
2018 | 7.490428e+09 | NaN | NaN |
2019 | 7.567403e+09 | NaN | NaN |
2020 | 7.643402e+09 | NaN | 7.758157e+09 |
Some values are NaN
, which indicates missing data, because some organizations did not publish projections for some years.
The following function plots projections from the U.N. DESA and U.S. Census. It uses dropna
to remove the NaN
values from each series before plotting it.
def plot_projections(table):
"""Plot world population projections.
table: DataFrame with columns 'un' and 'census'
"""
census_proj = table.census.dropna() / 1e9
un_proj = table.un.dropna() / 1e9
census_proj.plot(style=':', label='US Census')
un_proj.plot(style='--', label='UN DESA')
decorate(xlabel='Year',
ylabel='World population (billions)')
Here are the professional projections compared to the results of the quadratic model.
results.plot(color='gray', label='model')
plot_projections(table3)
decorate(title='Quadratic model projection, with UN and Census')
The U.N. DESA expects the world population to reach 11 billion around 2100, and then level off. Projections by U.S. Census are a little lower, and they only go until 2050.
Summary#
In this chapter we use the quadratic growth model to project world population growth between now and 2100.
Real demographers expect world population to grow more slowly than our model, probably because their models are broken down by region and country, where conditions are different, and they take into account expected economic development.
Nevertheless, their projections are qualitatively similar to ours, and theirs differ from each other almost as much as they differ from ours. So the results from the model, simple as it is, are not entirely unreasonable.
If you are interested in some of the factors that go into the professional projections, you might like this video by Hans Rosling about the demographic changes we expect this century: https://www.youtube.com/watch?v=ezVk1ahRF78.
Exercises#
This chapter is available as a Jupyter notebook where you can read the text, run the code, and work on the exercises. You can access the notebooks at https://allendowney.github.io/ModSimPy/.
Exercise 1#
The net growth rate of world population has been declining for several decades. That observation suggests one more way to generate more realistic projections, by extrapolating observed changes in growth rate.
To compute past growth rates, we’ll use a function called diff
, which computes the difference between successive elements in a Series
. For example, here are the changes from one year to the next in census
:
diff = census.diff()
show(diff.head())
census | |
---|---|
Year | |
1950 | NaN |
1951 | 0.037311 |
1952 | 0.041832 |
1953 | 0.045281 |
1954 | 0.048175 |
The first element is NaN
because we don’t have the data for 1949, so we can’t compute the first difference.
If we divide these differences by the populations, the result is an estimate of the growth rate during each year:
alpha = census.diff() / census
show(alpha.head())
census | |
---|---|
Year | |
1950 | NaN |
1951 | 0.014378 |
1952 | 0.015865 |
1953 | 0.016883 |
1954 | 0.017645 |
The following function computes and plots the growth rates for the census
and un
estimates:
def plot_alpha():
alpha_census = census.diff() / census
alpha_census.plot(style='.', label='US Census')
alpha_un = un.diff() / un
alpha_un.plot(style='.', label='UN DESA')
decorate(xlabel='Year', ylabel='Net growth rate')
It uses style='.'
to plot each data point with a small circle.
And here’s what it looks like.
plot_alpha()
Other than a bump around 1990, the net growth rate has been declining roughly linearly since 1970.
We can model the decline by fitting a line to this data and extrapolating into the future. Here’s a function that takes a time stamp and computes a line that roughly fits the growth rates since 1970.
def alpha_func(t):
intercept = 0.02
slope = -0.00021
return intercept + slope * (t - 1970)
To see what it looks like, I’ll create an array of time stamps from 1960 to 2020 and use alpha_func
to compute the corresponding growth rates.
t_array = linspace(1960, 2020, 5)
alpha_array = alpha_func(t_array)
Here’s what it looks like, compared to the data.
from matplotlib.pyplot import plot
plot_alpha()
plot(t_array, alpha_array, label='model', color='gray')
decorate(ylabel='Net growth rate',
title='Linear model of net growth rate')
If you don’t like the slope
and intercept
I chose, feel free to adjust them.
Now, as an exercise, you can use this function to project world population until 2100.
Create a
System
object that includesalpha_func
as a system parameter.Define a growth function that uses
alpha_func
to compute the net growth rate at the given timet
.Run a simulation from 1960 to 2100 with your growth function, and plot the results.
Compare your projections with those from the US Census and UN.
Show code cell content
# Solution
t_0 = 1960
t_end = 2100
p_0 = census[t_0]
Show code cell content
# Solution
system = System(t_0=t_0,
t_end=t_end,
p_0=p_0,
alpha_func=alpha_func)
Show code cell content
# Solution
def growth_func_alpha(t, pop, system):
return system.alpha_func(t) * pop
Show code cell content
# Solution
growth_func_alpha(t_0, p_0, system)
0.06725033332680001
Show code cell content
# Solution
results2 = run_simulation(system, growth_func_alpha);
Show code cell content
# Solution
results2.plot(color='gray', label='model')
plot_projections(table3)
decorate(title='Proportional model, linearly decreasing rate')
Show code cell content
# Solution
# By extrapolating the observed decline in growth rate,
# we generate projections that are similar to the
# professionals' until about 2050.
# After that, our model predicts that world population
# will peak around 2065 at about 9.8 billion, and then
# start to decline.
Show code cell content
# Solution
results2.idxmax(), results2.max()
(2066, 9.75910331482456)