Glucose and Insulin#
This chapter is available as a Jupyter notebook where you can read the text, run the code, and work on the exercises. Click here to access the notebooks: https://allendowney.github.io/ModSimPy/.
The previous chapter presents the minimal model of the glucose-insulin system and introduces a tool we need to implement it: interpolation.
In this chapter, we’ll implement the model two ways:
We’ll start by rewriting the differential equations as difference equations; then we’ll solve the difference equations using a version of
run_simulationsimilar to what we have used in previous chapters.
Then we’ll use a new SciPy function, called
solve_ivp, to solve the differential equation using a better algorithm.
We’ll see that
solve_ivp is faster and more accurate than
As a result, we will use it for the models in the rest of the book.
Implementing the Model#
To get started, let’s assume that the parameters of the model are known.
We’ll implement the model and use it to generate time series for
Then we’ll see how we can choose parameters that make the simulation fit the data.
Here are the parameters.
G0 = 270 k1 = 0.02 k2 = 0.02 k3 = 1.5e-05
I’ll put these values in a sequence which we’ll pass to
params = G0, k1, k2, k3
Here’s a version of
make_system that takes
data as parameters.
def make_system(params, data): G0, k1, k2, k3 = params t_0 = data.index t_end = data.index[-1] Gb = data.glucose[t_0] Ib = data.insulin[t_0] I = interpolate(data.insulin) init = State(G=G0, X=0) return System(init=init, params=params, Gb=Gb, Ib=Ib, I=I, t_0=t_0, t_end=t_end, dt=2)
t_end from the data.
It uses the measurements at
t=0 as the basal levels,
And it uses the parameter
G0 as the initial value for
G. Then it
packs everything into a
system = make_system(params, data)
The Update Function#
The minimal model is expressed in terms of differential equations:
To simulate this system, we will rewrite them as difference equations. If we multiply both sides by \(dt\), we have:
If we think of \(dt\) as a small step in time, these equations tell us how to compute the corresponding changes in \(G\) and \(X\). Here’s an update function that computes these changes:
def update_func(t, state, system): G, X = state G0, k1, k2, k3 = system.params I, Ib, Gb = system.I, system.Ib, system.Gb dt = system.dt dGdt = -k1 * (G - Gb) - X*G dXdt = k3 * (I(t) - Ib) - k2 * X G += dGdt * dt X += dXdt * dt return State(G=G, X=X)
As usual, the update function takes a timestamp, a
State object, and a
System object as parameters. The first line uses multiple assignment to extract the current values of
The following lines unpack the parameters we need from the
To compute the derivatives
dXdt we translate the equations from math notation to Python.
Then, to perform the update, we multiply each derivative by the time step
dt, which is 2 min in this example.
The return value is a
State object with the new values of
Before running the simulation, it is a good idea to run the update function with the initial conditions:
update_func(system.t_0, system.init, system)
G 262.88 X 0.00 Name: state, dtype: float64
If it runs without errors and there is nothing obviously wrong with the results, we are ready to run the simulation.
Running the Simulation#
We’ll use the following version of
def run_simulation(system, update_func): t_array = linrange(system.t_0, system.t_end, system.dt) n = len(t_array) frame = TimeFrame(index=t_array, columns=system.init.index) frame.iloc = system.init for i in range(n-1): t = t_array[i] state = frame.iloc[i] frame.iloc[i+1] = update_func(t, state, system) return frame
This version is similar to the one we used for the coffee cooling problem.
The biggest difference is that it makes and returns a
TimeFrame, which contains one column for each state variable, rather than a
TimeSeries, which can only store one state variable.
When we make the
TimeFrame, we use
index to indicate that the index is the array of time stamps,
columns to indicate that the column names are the state variables we get from
We can run it like this:
results = run_simulation(system, update_func)
The result is a
TimeFrame with a row for each time step and a column for each of the state variables,
Here are the first few time steps.
The following plot shows the simulated glucose levels from the model along with the measured data.
data.glucose.plot(style='o', alpha=0.5, label='glucose data') results.G.plot(style='-', color='C0', label='simulation') decorate(xlabel='Time (min)', ylabel='Concentration (mg/dL)')
With the parameters I chose, the model fits the data well except during the first few minutes after the injection. But we don’t expect the model to do well in this part of the time series.
The problem is that the model is non-spatial; that is, it does not take into account different concentrations in different parts of the body. Instead, it assumes that the concentrations of glucose and insulin in blood, and insulin in tissue fluid, are the same throughout the body. This way of representing the body is known among experts as the “bag of blood” model.
Immediately after injection, it takes time for the injected glucose to
circulate. During that time, we don’t expect a non-spatial model to be
accurate. For this reason, we should not take the estimated value of
G0 too seriously; it is useful for fitting the model, but not meant to correspond to a physical, measurable quantity.
The following plot shows simulated insulin levels in the hypothetical “remote compartment”, which is in unspecified units.
results.X.plot(color='C1', label='remote insulin') decorate(xlabel='Time (min)', ylabel='Concentration (arbitrary units)')
X represents the concentration of insulin in the “remote compartment”, which is believed to be tissue fluid, so we can’t compare it to the measured concentration of insulin in the blood.
X rises quickly after the initial injection and then declines as the concentration of glucose declines. Qualitatively, this behavior is as expected, but because
X is not an observable quantity, we can’t validate this part of the model quantitatively.
Solving Differential Equations#
To implement the minimal model, we rewrote the differential equations as difference equations with a finite time step,
When \(dt\) is very small, or more precisely infinitesimal, the difference equations are the same as the differential equations.
But in our simulations, \(dt\) is 2 min, which is not very small, and definitely not infinitesimal.
In effect, the simulations assume that the derivatives \(dG/dt\) and \(dX/dt\) are constant during each 2 min time step. This method, evaluating derivatives at discrete time steps and assuming that they are constant in between, is called Euler’s method (see http://modsimpy.com/euler).
Euler’s method is good enough for many problems, but sometimes it is not very accurate.
In that case, we can usually make it more accurate by decreasing the size of
But then it is not very efficient.
There are other methods that are more accurate and more efficient than Euler’s method.
SciPy provides several of them wrapped in a function called
ivp stands for initial value problem, which is the term for problems like the ones we’ve been solving, where we are given the initial conditions and try to predict what will happen.
The ModSim library provides a function called
run_solve_ivp that makes
solve_ivp a little easier to use.
To use it, we have to provide a slope function, which is similar to an update function; in fact, it takes the same parameters: a time stamp, a
State object, and a
Here’s a slope function that evaluates the differential equations of the minimal model.
def slope_func(t, state, system): G, X = state G0, k1, k2, k3 = system.params I, Ib, Gb = system.I, system.Ib, system.Gb dGdt = -k1 * (G - Gb) - X*G dXdt = k3 * (I(t) - Ib) - k2 * X return dGdt, dXdt
slope_func is a little simpler than
update_func because it computes only the derivatives, that is, the slopes. It doesn’t do the updates; the solver does them for us.
Now we can call
run_solve_ivp like this:
results2, details = run_solve_ivp(system, slope_func, t_eval=results.index)
run_solve_ivp is similar to
run_simulation: it takes a
object and a slope function as parameters.
The third argument,
t_eval, is optional; it specifies where the solution should be evaluated.
It returns two values: a
TimeFrame, which we assign to
results2, and an
OdeResult object, which we assign to
OdeResult object contains information about how the solver ran, including a success code and a diagnostic message.
'The solver successfully reached the end of the integration interval.'
It’s important to check these messages after running the solver, in case anything went wrong.
TimeFrame has one row for each time step and one column for each state variable. In this example, the rows are time from 0 to 182 minutes; the columns are the state variables,
Here are the first few time steps:
Because we used
t_eval=results.index, the time stamps in
results2 are the same as in
results, which makes them easier to compare.
The following figure shows the results from
run_solve_ivp along with the results from
results.G.plot(style='--', label='simulation') results2.G.plot(style='-', label='solve ivp') decorate(xlabel='Time (min)', ylabel='Concentration (mg/dL)')
The differences are barely visible. We can compute the relative differences like this:
diff = results.G - results2.G percent_diff = diff / results2.G * 100
And we can use
describe to compute summary statistics:
count 92.000000 mean 0.649121 std 0.392903 min 0.000000 25% 0.274854 50% 0.684262 75% 1.009868 max 1.278168 Name: G, dtype: float64
The mean relative difference is about 0.65% and the maximum is a little more than 1%.
Here are the results for
results.X.plot(style='--', label='simulation') results2.X.plot(style='-', label='solve ivp') decorate(xlabel='Time (min)', ylabel='Concentration (arbitrary units)')
These differences are a little bigger, especially at the beginning.
If we use
run_simulation with smaller time steps, the results are more accurate, but they take longer to compute.
For some problems, we can find a value of
dt that produces accurate results in a reasonable time. However, if
dt is too small, the results can be inaccurate again. So it can be tricky to get it right.
The advantage of
run_solve_ivp is that it chooses the step size automatically in order to balance accuracy and efficiency.
You can use keyword arguments to adjust this balance, but most of the time the results are accurate enough, and the computation is fast enough, without any intervention.
In this chapter, we implemented the glucose minimal model two ways, using
run_solve_ivp, and compared the results.
We found that in this example,
run_simulation, which uses Euler’s method, is probably good enough.
But soon we will see examples where it is not.
So far, we have assumed that the parameters of the system are known, but in practice that’s not true. As one of the case studies in the next chapter, you’ll have a chance to see where those parameters came from.
This chapter is available as a Jupyter notebook where you can read the text, run the code, and work on the exercises. You can access the notebooks at https://allendowney.github.io/ModSimPy/.
Our solution to the differential equations is only approximate because we used a finite step size,
If we make the step size smaller, we expect the solution to be more accurate. Run the simulation with
dt=1 and compare the results. What is the largest relative error between the two solutions?
Show code cell content Hide code cell content
# Solution system3 = system.set(dt = 1) results3 = run_simulation(system3, update_func)
Show code cell content Hide code cell content
# Solution results.G.plot(style='C2--', label='simulation (dt=2)') results3.G.plot(style='C3:', label='simulation (dt=1)') decorate(xlabel='Time (min)', ylabel='Concentration (mg/dL)')
Show code cell content Hide code cell content
# Solution diff = results.G - results3.G percent_diff = diff / results3.G * 100 percent_diff.abs().describe()
count 92.000000 mean 0.527015 std 0.291531 min 0.000000 25% 0.292193 50% 0.505702 75% 0.801698 max 0.970760 Name: G, dtype: float64