Combining differential equations and regression

Last week I gave my first lecture for the semester to the SEB113 students. While they tend not to have a particularly strong mathematics background, I got some very positive feedback on how much they enjoyed learning about mathematical modelling. We revised differentiation and what derivatives are, and then jumped into formulating differential equations from words that represent the assumptions the model makes.

The bulk of that week’s lecture shows where the non-linear regression models we used the previous week (first order compartment, asymptotic, biexponential) come from. To do this we have a chat about exponential growth and decay models, as some of the easiest differential equation models to deal with. I show them how we solve the exponential model exactly, and then mention that I don’t expect them to solve these equations in this subject. We show the solutions to the DE systems and make it very clear that the non-linear regression models are the solutions to differential equations that represent different assumptions.
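For reference, the exponential model and its exact solution, which is the derivation we walk through in class, are:

```latex
\frac{\mathrm{d}y}{\mathrm{d}t} = k y, \qquad y(0) = y_0
\quad\Longrightarrow\quad
y(t) = y_0 e^{kt}
```

with \(k > 0\) giving growth and \(k < 0\) giving decay.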

We finish the lecture off with a section on how we can’t always get a “pen and paper” solution to differential equations, and so sometimes we either simplify the system to one we can solve (alluding to perturbation methods) or give it to a numerical solver (alluding to computational mathematics). Because it’s how I learned about numerical solutions to DEs, I showed the students the Lotka-Volterra model and discussed why we can’t solve for X(t) and Y(t) analytically and so have to use numerical methods. For different parameter values we get variations on the same cyclic behaviour: prey population growth, followed by predator population growth, followed by overconsumption of prey, leading to fewer predators being born to replace those dying. Many students seemed to enjoy investigating this model in the workshops, as it’s quite different to everything we’ve learned so far. The solution there is via the deSolve package in R, but we introduce the students to Euler’s method and discuss numerical instability and the accumulation of numerical error.
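A minimal sketch of Euler’s method applied to the Lotka-Volterra system, in base R (the parameter values and initial populations here are made up purely for illustration, not the ones used in the workshops):

```r
# Euler's method for the Lotka-Volterra predator-prey system:
#   dX/dt = a*X - b*X*Y   (prey)
#   dY/dt = c*X*Y - d*Y   (predators)
# Parameter values below are illustrative only.
a <- 1.0; b <- 0.1; c <- 0.075; d <- 1.5
dt <- 0.01
times <- seq(0, 20, by = dt)
X <- Y <- numeric(length(times))
X[1] <- 10; Y[1] <- 5

for (i in 2:length(times)) {
  # Each step uses only the previous state, so the local truncation
  # error is O(dt^2) and the error accumulates over the trajectory
  X[i] <- X[i-1] + dt * (a * X[i-1] - b * X[i-1] * Y[i-1])
  Y[i] <- Y[i-1] + dt * (c * X[i-1] * Y[i-1] - d * Y[i-1])
}

plot(times, X, type = "l", ylim = range(c(X, Y)),
     xlab = "Time", ylab = "Population")
lines(times, Y, lty = 2)
```

Shrinking dt improves the approximation but never eliminates the accumulated error, which is a nice hook for the discussion of numerical instability.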

I close the lecture with a chat about how regression tends to make assumptions about the form of the mean relationship between variables so that we can do parameter estimation, and how differential equations give us a system we can solve to obtain that mean relationship. I state that while we can solve the DE numerically while simultaneously estimating its parameters, this is well outside the scope of the course.

I had a bit of time this morning to spend on next week’s lecture material (linear algebra), so I decided to have a go at numerical estimation for the logistic growth model and some data based on the Orange tree circumference data set in R, with JAGS/rjags. It’s the first time I’ve had a go at combining regression and numerical solutions to DEs in the same piece of code, so I’ve only used Euler’s method. That said, I was very happy with the result, and the code is provided below the cut.
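The logistic growth model, and the Euler step it discretises into, are:

```latex
\frac{\mathrm{d}y}{\mathrm{d}t} = r\, y \left(1 - \frac{y}{K}\right)
\quad\Longrightarrow\quad
y_{i} \approx y_{i-1} + \Delta t \, r\, y_{i-1}\left(1 - \frac{y_{i-1}}{K}\right)
```

where \(K\) is the carrying capacity. In the BUGS model the growth rate is parameterised as \(r = e^{\mathrm{lr}}\) so that it stays positive.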

# euler.bugs
model{

  # Likelihood: the mean at each time point is one Euler step of the
  # logistic growth ODE, with Gaussian observation error
  y[1] ~ dnorm(mu[1], tau.y)
  mu[1] <- y0 + dt * exp(lr) * y0 * (1 - y0/K)

  for (i in 2:n){
    y[i] ~ dnorm(mu[i], tau.y)
    mu[i] <- y[i-1] + dt * exp(lr) * y[i-1] * (1 - y[i-1]/K)
  }

  # Posterior predictive draws at every point on the time grid
  for (i in 1:n){
    y.p[i] ~ dnorm(mu[i], tau.y)
  }

  # Priors: lr is the log growth rate (so r = exp(lr) is positive),
  # K the carrying capacity, y0 the initial circumference
  lr ~ dnorm(0, 1e-6)
  K ~ dnorm(0, 1e-6)
  y0 ~ dunif(0.001, 1000)
  tau.y ~ dgamma(0.001, 0.001)
}

The model can then be fitted from R with

library(rjags)
library(ggplot2)

# Observations based on the Orange data set: circumference at ten ages
my.orange <- data.frame(age = seq(100, 1900, by = 200),
                        circumference = c(32, 47, 73, 101, 134, 162, 182, 194, 205, 214))

# Euler step size (days)
dt <- 10

# Fine time grid; circumference is NA except where we have observations,
# so JAGS treats the unobserved values as missing data to be imputed
orange.dat <- data.frame(age = seq(0, 3000, by = dt), circumference = NA)

orange.dat[match(x = my.orange$age, table = orange.dat$age), "circumference"] <-
  my.orange[, "circumference"]

orange.m <- jags.model(file = "euler.bugs",
                       data = list(y = orange.dat$circumference,
                                   n = nrow(orange.dat), dt = dt),
                       inits = list(y0 = 50, K = 500, lr = -5))

# Run the chain for a while before monitoring the predictions
orange.b <- jags.samples(model = orange.m, n.iter = 10000,
                         variable.names = c("y.p", "K", "lr", "y0"))

orange.pred <- coda.samples(model = orange.m, n.iter = 1000,
                            variable.names = c("y.p"))

orange.sum <- summary(orange.pred, q = c(0.025, 0.5, 0.975))

orange.gg <- data.frame(orange.sum$quantiles)
orange.gg$age <- orange.dat$age

# windows() is Windows-only; use dev.new() on other platforms
windows(height = 3.5)
ggplot(data = orange.gg, aes(x = age, y = X50.)) + geom_line() +
  geom_line(aes(y = X2.5.), lty = 2) +
  geom_line(aes(y = X97.5.), lty = 2) +
  geom_point(data = my.orange, aes(y = circumference), alpha = 0.5) +
  theme_bw() + xlab("Time (days)") + ylab("Tree circumference (mm)")

The resulting picture can be seen below.

Prediction of tree circumference from the logistic growth differential equation
