Estimating reaction kinetics with Stan and R

Introduction

In chemical kinetics, the rate of a solid-state reaction is generally modeled by a reaction rate law (or rate equation) of the form:

\[ \frac{d\alpha}{dt} \ = \ k \cdot f(\alpha)\ = \ Ae^{-E_a/RT} \cdot f(\alpha) \]

Here, \(\alpha(t) \in [0,1]\) is the reaction’s conversion fraction in time and \(f(\alpha)\) is the reaction model. The reaction rate constant \(k = Ae^{E_a / RT}\) is represented by an Arrhenius equation based on the absolute temperature \(T\), model parameters \(A\) (pre-exponential factor) and \(E_a\) (activation energy), and ideal gas constant \(R\). The reaction model \(f(\alpha)\) can take on many different forms based on mechanistic considerations or empirical observation. The following table provides a list of common reaction models for \(f(\alpha)\) taken from (Khawam and Flanagan 2006) describing various types of reaction kinetics:

Reaction model \(f(\alpha)\)
Power-law (P2) \(2 \alpha^{1/2}\)
Power-law (P3) \(3 \alpha^{2/3}\)
Power-law (P4) \(4 \alpha^{3/4}\)
Avrami-Erofeyev (A2) \(2(1 - \alpha)[-\ln(1 - \alpha)]^{1/2}\)
Avrami-Erofeyev (A3) \(3(1-\alpha)[-\ln(1-\alpha)]^{2/3}\)
Avrami-Erofeyev (A4) \(4(1-\alpha)[-\ln(1-\alpha)]^{3/4}\)
Contracting area (R2) \(2(1-\alpha)^{1/2}\)
Contracting volume (R3) \(3(1-\alpha)^{2/3}\)
1-D Difussion (D1) \(1/(2\alpha)\)
2-D Diffusion (D2) \(-[1/\ln(1-\alpha)]\)
3-D Diffusion-Jander (D3) \([3(1-\alpha)^{2/3}]/[2(1-(1-\alpha)^{1/3})]\)
Ginstling-Brounshtein (D4) \(3/[2((1-\alpha)^{-1/3}-1)]\)
Zero-order (P1/R1/F0) \(1\)
First-order (F1) \((1-\alpha)\)
Second-order (F2) \((1-\alpha)^2\)
Third-order (F3) \((1-\alpha)^3\)
Exponential (E1) \(\alpha\)

Using deSolve::ode(), we can integrate the kinetic trends for all listed reaction models \(f(\alpha)\) with a fixed rate constant \(k\):

library(deSolve)

alpha <- ode(
  y = c(P2 = 0, P3 = 0, P4 = 0, A2 = 0, A3 = 0, A4 = 0, R2 = 0, R3 = 0, D1 = 0, 
        D2 = 0, D3 = 0, D4 = 0, F0 = 0, F1 = 0, F2 = 0, F3 = 0, E1 = 1) + 1E-6,
  times = (1:100)/100,
  func = function(t, y, parms) {
    list(
      dadt = parms["k"] * c(
        P2 = 2 * sqrt(y[1]),
        P3 = 3 * y[2]^(2/3),
        P4 = 4 * y[3]^(3/4),
        A2 = 2 * (1 - y[4]) * sqrt(-log(1 - y[4])),
        A3 = 3 * (1 - y[5]) * (-log(1 - y[5]))^(2/3),
        A4 = 4 * (1 - y[6]) * (-log(1 - y[6]))^(3/4),
        R2 = 2 * sqrt(1 - y[7]),
        R3 = 3 * (1 - y[8])^(2/3),
        D1 = 1 / (2 * y[9]),
        D2 = -1 / log(1 - y[10]),
        D3 = 3 * (1 - y[11])^(2/3) / (2 * (1 - (1 - y[11])^(1/3))),
        D4 = 3 / (2 * (1 - y[12])^(-1/3) - 1),
        F0 = 1,
        F1 = 1 - y[14],
        F2 = (1 - y[15])^2,
        F3 = (1 - y[16])^3,
        E1 = y[17]
      )
    )
  },
  parms = c(k = 0.5)
)

head(alpha)
#>      time       P2         P3           P4           A2           A3
#> [1,] 0.01 0.000001 1.0000e-06 1.000000e-06 1.000000e-06 1.000000e-06
#> [2,] 0.02 0.000036 3.3750e-06 1.798892e-06 3.599936e-05 3.374995e-06
#> [3,] 0.03 0.000121 8.0000e-06 3.001402e-06 1.209927e-04 7.999970e-06
#> [4,] 0.04 0.000256 1.5625e-05 4.724899e-06 2.559672e-04 1.562488e-05
#> [5,] 0.05 0.000441 2.7000e-05 7.101751e-06 4.409028e-04 2.699964e-05
#> [6,] 0.06 0.000676 4.2875e-05 1.027933e-05 6.757716e-04 4.287409e-05
#>                A4          R2         R3         D1         D2        D3
#> [1,] 1.000000e-06 0.000001000 0.00000100 0.00000100 0.00000100 0.0000010
#> [2,] 1.798891e-06 0.009975995 0.01492612 0.07071117 0.09831976 0.1974870
#> [3,] 3.001399e-06 0.019900990 0.02970198 0.10000072 0.13804818 0.2710020
#> [4,] 4.724890e-06 0.029775985 0.04432935 0.12247531 0.16813049 0.3242628
#> [5,] 7.101728e-06 0.039600980 0.05880896 0.14142228 0.19321710 0.3670950
#> [6,] 1.027928e-05 0.049375975 0.07314158 0.15811481 0.21510937 0.4032969
#>              D4       F0          F1          F2          F3       E1
#> [1,] 0.00000100 0.000001 0.000001000 0.000001000 0.000001000 1.000001
#> [2,] 0.01492624 0.005001 0.004988516 0.004976114 0.004963795 1.005014
#> [3,] 0.02970296 0.010001 0.009951156 0.009901970 0.009853428 1.010051
#> [4,] 0.04433262 0.015001 0.014889046 0.014779296 0.014671678 1.015114
#> [5,] 0.05881665 0.020001 0.019802307 0.019608804 0.019420267 1.020202
#> [6,] 0.07315644 0.025001 0.024691063 0.024391196 0.024100856 1.025316

We plot all integrated kinetic trends in a single display by rescaling the trends \(\alpha(t)\) to the unit interval. This provides a quick overview of the different types of reaction kinetics represented by the reaction models \(f(\alpha)\).

To get a more detailed view of the available types of kinetic behavior, we integrate the same kinetic trends by also varying the rate constant \(k\). The reaction models are grouped by the same categories as described in (Khawam and Flanagan 2006, Table 1).

Note that most of the reaction models \(f(\alpha)\) shown here are different representations of the following general form:

\[ f(\alpha) \ \propto \ \alpha^m(1 - \alpha)^n(-\ln(1 - \alpha))^p \] By varying the constants \(m\), \(n\), and \(p\), many different types of reaction kinetics behavior can be generated and modeled.

Statistical modeling

The aim of this post is to demonstrate the use of Stan in the context of estimation and prediction of reaction kinetics based on observed data. In practice, the go-to approach to fit any of the above reaction models to observed data is likely least squares (or maximum likelihood) optimization, as this is the most convenient to implement in most mathematical programming languages. The least squares model fits allow us to quickly assess which reaction models provide reasonable fits to the data, but statistical inference (e.g. construction of confidence regions) for the estimated kinetic trends is not always straightforward due to the nonlinear nature (or missing closed-form solutions) of the integrated reaction models.

In the following demonstrating examples, we benefit from Stan’s ODE solvers to assess the complete posterior distributions of the reaction model parameters and integrated reaction trends. Bayesian inference directly provides us with valuable information on the uncertainty in the estimated kinetic trends. In addition, because of Stan’s flexible and efficient modeling interface, it is possible to build more complex statistical models involving numerically integrated reaction models and/or (highly) correlated model parameters.

Note: the Stan programs in this post are compiled with cmdstanr based on CmdStan >= 2.24. Stan 2.24 introduces new ode_*** functions that avoid packing and unpacking schemes required by the now deprecated integrate_ode_*** functions in previous Stan versions1.

Example 1: Error propagation

Model scenario

Consider a fictional scenario where \(\alpha(t)\) is known to follow a first-order reaction model:

\[ \frac{d\alpha}{dt} \ = \ k \cdot f(\alpha) \ = \ Ae^{-E_a/RT} \cdot (1 - \alpha) \] Instead of observing \(\alpha(t)\) directly, we observe a set of normally distributed rate constants \(k_1, \ldots, k_S\) centered around \(k = A e^{-E_a/RT}\). That is,

\[ k_s | A, E_a, \sigma \overset{\text{iid}}{\sim} N(A e^{-E_a/RT}, \sigma^2) \quad \text{for} \ \ s = 1, \ldots, S \]

The aim of this exercise is to predict the integrated trend \(\alpha(t)\) based on the Arrhenius model fitted to the observations \(k_1, \ldots, k_S\).

Note: the integrated first-order reaction model has a simple closed-form solution,

\[ \alpha(t) \ = \ 1 - e^{-kt} \] For the sake of generality, we solve for \(\alpha(t)\) based on its differential form, as not all reaction models listed above have such straightforward closed-form solutions. Moreover, in terms of model sampling it should make no real difference for Stan2 whether \(\alpha(t)\) is represented by its integrated form or integrated from its differential form using one of Stan’s ODE solvers.

Simulated data

A set of observed i.i.d. rate constants \(k_1,\ldots, k_S\) is simulated with 8 observations at each of three absolute temperatures \(T = \{273.15, 293.15, 313.15\}\). The model parameters \(A\) and \(E_a\) in the Arrhenius equation are specified as \(A = 1 \times 10^{15}\ \text{min}^{-1}\) and \(E_a = 80\ \text{kJ/mol}\) identical to (Khawam and Flanagan 2006, Figure 2) and the noise variance is set to \(\sigma = 2\).

## Arrhenius model parameters
set.seed(1)
pars <- list(log_A = log(1E15), Ea = 80, sigma = 2)
temp_K <- rep(c(283.15, 293.15, 313.15), each = 8)

## simulated observations
k_obs <- with(
  pars,
  data.frame(
    temp_K = temp_K,
    k = rnorm(n = length(temp_K), mean = exp(log_A - Ea / (0.00831446 * temp_K)), sd = sigma)
  )
)

Maximum likelihood

Firstly, given the observations \(k_1,\ldots,k_S\), we can compute maximum likelihood estimates \(\hat{A}\) and \(\hat{E}_a\) by fitting the Arrhenius model \(k_s \sim N(A e^{-E_a/RT}, \sigma^2)\) as a generalized linear model with Gaussian error distribution and a log-link function:

## 1) Arrhenius model fit
glmfit <- glm(k ~ 1 + I(-1 / (0.0083144 * temp_K)), family = gaussian(link = "log"), data = k_obs)
summary(glmfit)
#> 
#> Call:
#> glm(formula = k ~ 1 + I(-1/(0.0083144 * temp_K)), family = gaussian(link = "log"), 
#>     data = k_obs)
#> 
#> Deviance Residuals: 
#>     Min       1Q   Median       3Q      Max  
#> -4.6222  -0.9366   0.5815   1.1630   3.1075  
#> 
#> Coefficients:
#>                            Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)                  34.221      1.605   21.32 3.50e-16 ***
#> I(-1/(0.0083144 * temp_K))   79.143      4.174   18.96 4.06e-15 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for gaussian family taken to be 3.881107)
#> 
#>     Null deviance: 9536.172  on 23  degrees of freedom
#> Residual deviance:   85.384  on 22  degrees of freedom
#> AIC: 104.57
#> 
#> Number of Fisher Scoring iterations: 5

Next, by evaluating the fitted GLM at a specific temperature (e.g. \(T = 298.15 K\)), we obtain a point estimate \(\hat{k}(T)\), which can be plugged into the reaction model to predict the reaction kinetics at the specified temperature:

## 2) evaluate k-hat at room temperature
khat <- predict(glmfit, newdata = list(temp_K = 298.15), type = "response")
khat
#>        1 
#> 9.927213

## 3) integrate alpha(t) by plugging in k-hat
alpha <- ode(
  y = c(alpha = 0),
  times = (0:100)/100,
  func = function(t, y, parms) list(khat * (1 - y))
)

head(alpha)
#>      time      alpha
#> [1,] 0.00 0.00000000
#> [2,] 0.01 0.09450281
#> [3,] 0.02 0.18007554
#> [4,] 0.03 0.25756128
#> [5,] 0.04 0.32772478
#> [6,] 0.05 0.39125752

The following plot displays three such individual trajectories evaluated at three different temperatures:

Here the integrated trajectories are evaluated as deterministic curves conditional on the point estimates \(\hat{k}(T)\). Propagating the estimation error in \(\hat{k}(T)\) to the solution of the ODE is a not a trivial task and probably requires the use of computationally intensive Monte Carlo simulation or other (re)sampling techniques, such as bootstrapping.

MCMC sampling

A sensible solution to the previous issue is to simulate the posterior distribution of the integrated trajectories by means of Bayesian MCMC sampling. In Stan, we can combine both the initial generalized linear model fit and the integration of the reaction model making use of Stan’s ODE solvers. Since the numerical integration of the ODE is not part of the model itself, it is performed in the generated quantities block of the Stan program. Using the new ode_rk45 function available in Stan >= 2.24, the Stan model can be encoded as follows:

// ex1.stan (Stan >= 2.24)
functions {
  // derivative function
  vector deriv(real time, vector y, real k) {
    vector[1] dydt = k * (1 - y);
    return dydt;
  }
}
data {
  // rate estimation
  int<lower = 1> n;
  vector[n] k_obs;  // observed rate constants (1/min)
  vector[n] temp_K; // observed temperatures (K)
  // ode integration
  int<lower = 1> n1;
  vector[1] y0;
  real times[n1];   // evaluated time points (min)
  real temp_K_new;  // evaluated temperature (K) 
}
parameters {
  real lnA;                 // pre-exponential factor
  real<lower = 0> Ea;       // activation energy
  real<lower = 1E-6> sigma; // error standard deviation                  
}
model {
  // (naive) priors
  lnA ~ normal(25, 10);
  Ea ~ normal(50, 10);
  sigma ~ normal(0, 1);
  // likelihood
  k_obs ~ normal(exp(lnA - Ea ./ (0.0083144 * temp_K)), sigma); 
}
generated quantities {
  vector[1] y_new[n1];
  // predicted rate constant
  real k_hat = exp(lnA - Ea / (0.0083144 * temp_K_new));
  // integrated reaction model
  y_new = ode_rk45(deriv, y0, 0.0, times, k_hat); 
}

The GLM likelihood is specified in the model block and we assign (weakly informative) normal priors for the model parameters: \(\ln(A)\) (natural logarithm of the pre-exponential factor), \(E_a\) (activation energy) and \(\sigma\) (error standard deviation). The generated quantities block contains the instructions to integrate a new reaction trend at each sampling iteration based on the the posterior predicted rate constant \(\hat{k}(T)\) and the derivative function defined in the functions block.

The Stan model compilation and sampling is executed with CmdStan (>=2.24) through the cmdstanr-package. First, we compile the model with cmdstan_model(), second we call the sample() method of the generated R6-object (ex1_model) to sample a total of 4000 posterior predicted reaction trends:

library(cmdstanr)

## 1) compile model
ex1_model <- cmdstan_model("ex1.stan")
## 2) sample reaction trends
ex1_fit <- ex1_model$sample(
  seed = 1,
  data = list(
    n = nrow(k_obs),       
    k_obs = k_obs$k,       ## observed rate constants
    temp_K = k_obs$temp_K, ## observed temperatures (K)
    n1 = 100L,
    y0 = array(0),             
    times = (1:100)/100,   ## evaluated times (min)
    temp_K_new = 298.15    ## evaluated temperature (K)
  ),
  chains = 4,
  iter_warmup = 1000,
  iter_sampling = 1000
)
## model fit summary
ex1_fit$cmdstan_summary()
#> Inference for Stan model: ex1_model
#> 4 chains: each with iter=(1000,1000,1000,1000); warmup=(0,0,0,0); thin=(1,1,1,1); 4000 iterations saved.
#> 
#> Warmup took (1.1, 1.0, 0.71, 0.70) seconds, 3.6 seconds total
#> Sampling took (1.3, 1.7, 0.93, 0.91) seconds, 4.8 seconds total
#> 
#>                    Mean     MCSE   StdDev       5%      50%      95%    N_Eff  N_Eff/s    R_hat
#> 
#> lp__                -30  4.1e-02  1.3e+00      -32      -29      -28      931      192      1.0
#> accept_stat__   9.5e-01  1.4e-03  8.2e-02  7.8e-01  9.8e-01  1.0e+00  3.3e+03  6.7e+02  1.0e+00
#> stepsize__      7.1e-03  3.3e-04  4.7e-04  6.4e-03  7.3e-03  7.7e-03  2.0e+00  4.1e-01  2.1e+13
#> treedepth__     6.9e+00  4.2e-02  2.4e+00  2.0e+00  8.0e+00  9.0e+00  3.3e+03  6.7e+02  1.0e+00
#> n_leapfrog__    3.0e+02  3.9e+00  2.2e+02  3.0e+00  2.6e+02  5.5e+02  3.3e+03  6.7e+02  1.0e+00
#> divergent__     0.0e+00      nan  0.0e+00  0.0e+00  0.0e+00  0.0e+00      nan      nan      nan
#> energy__        3.1e+01  5.9e-02  1.8e+00  2.9e+01  3.1e+01  3.4e+01  8.9e+02  1.8e+02  1.0e+00
#> 
#> lnA                  33  4.2e-02  1.3e+00       31       33       35      977      202      1.0
#> Ea                   75  1.1e-01  3.4e+00       70       75       81      977      201      1.0
#> sigma               1.9  8.0e-03  2.7e-01      1.5      1.9      2.4     1120      231      1.0
#> y_new[1,1]         0.10  2.0e-04  6.3e-03    0.091     0.10     0.11     1013      209      1.0
#> y_new[2,1]         0.19  3.6e-04  1.1e-02     0.17     0.19     0.21     1012      209      1.0
#> y_new[3,1]         0.27  4.8e-04  1.5e-02     0.25     0.27     0.30     1012      209      1.0
#> y_new[4,1]         0.35  5.8e-04  1.8e-02     0.32     0.35     0.38     1012      209      1.0
....

Looking at CmdStan’s sampling diagnostics and summary results, no obvious sampling problems are observed3. Also, the posterior distributions of \(\log(A)\), \(E_a\) and \(\sigma\) all cover the target parameters reasonably well, using a total of \(n = 24\) observations to fit the Arrhenius regression model. Error propagation from the estimated rate constants to the reaction model integration is no longer an issue, as the reaction model ODE is solved for each sampled value of \(\hat{k}(T)\).

Given the sampling results, we can directly assess the prediction uncertainty present in the integrated reaction trends based on fitting the Arrhenius model to the observed rate constants. To illustrate, the following plot displays a (random) subset of 500 individual integrated reaction trends, as well as the posterior medians and 2.5% and 97.5% posterior quantiles pointwise in time across all reaction trends (resp. the black solid and dashed lines):

Example 2: Reaction model regression

Model scenario

In the previous example, the reaction model was not part of the regression model itself, i.e. we only evaluated the reaction model conditional on an estimated rate constant \(\hat{k}(T)\). In this second example, we consider fitting an integrated reaction model directly to observations of multiple replicated reaction trends.

More precisely, let \(\alpha_1(t), \ldots, \alpha_S(t)\) be a collection of replicated reaction processes, subject to the same experimental conditions, where each process is assumed to follow a reaction order model of the form:

\[ \frac{d\alpha_s}{dt} \ = \ k_s \cdot f(\alpha_s) \ = \ k_s \cdot (1 - \alpha_s)^m \]

The reaction order \(m\) is unknown, but is assumed to be the same for all processes \(\alpha_1(t), \ldots, \alpha_S(t)\). The rate constants \(k_1, \ldots, k_S\) are also unknown and are assumed to randomly vary across experiments. In particular, we impose a Gaussian (random effects) distribution for the replicated rate constants:

\[ k_1,\ldots,k_S\, |\, k, \sigma_u \ \overset{\text{iid}}{\sim} \ N(k, \sigma_u^2) \]

Note that modeling the rate constants according to an Arrhenius equation is not necessary in this example, as the temperature covariate is assumed to be fixed and identical for each replicated reaction process.

In addition to the between-replicate variation introduced by the rate constants \(k_1, \ldots, k_S\), we assume that there is some within-replicate (noise) variation present in the observed reaction process \(y_s(t)\). This noise variation is modeled according to a Beta distribution to ensure that the observed reaction process \(y_s(t)\) is always contained within the unit interval:

\[ y_s(t)\, | \, \alpha_s(t), \phi \ \sim \ \mathcal{Be}\left(\alpha_s(t) \cdot \phi, (1 - \alpha_s(t)) \cdot \phi\right), \quad \text{for}\ s = 1,\ldots, S \] The two shape parameters are parameterized in such a way that:

\[ \begin{aligned} \mathbb{E}[y_s(t)\, |\, \alpha_s(t)] & \ = \ \alpha_s(t) \\ \text{Var}(y_s(t)\, |\, \alpha_s(t)) & \ = \ \frac{\alpha_s(t) \cdot(1 - \alpha_s(t))}{\phi + 1} \end{aligned} \]

This means that, conditional on \(\alpha_s(t)\), \(y_s(t)\) is centered around \(\alpha_s(t)\), and the noise variance tends to zero whenever \(\alpha_s(t)\) is close to zero or one, with \(\phi\) serving as a scale parameter for the noise variance.

Combining the above expressions, for each replicate \(s = 1,\ldots, S\), the generative model for the observed reaction process \(y_s(t)\) with model parameters \(k\), \(m\), \(\sigma_u\) and \(\phi\) is given by:

\[ \begin{aligned} & y_s(t)\, |\, \alpha_s(t), \phi \ \sim \ \mathcal{Be}\left(\alpha_s(t) \cdot \phi, (1 - \alpha_s(t)) \cdot \phi\right), \\[3mm] & \begin{cases} k_s\, |\, k, \sigma_u \ \sim \ N(k, \sigma_u^2), \\[2mm] \frac{d \alpha_s}{dt} \, \Big|\, k_s,m \ =\ k_s \cdot (1 - \alpha_s(t))^m, \quad \text{with} \ \alpha_s(0) \ = \ 0 \\[3mm] \end{cases} \end{aligned} \]

Simulated data

Based on the generative model above, we generate \(S = 10\) replicated second-order reaction trends with model parameters \(k = 0.1\ \text{min}^{-1}\), \(m = 2\), \(\sigma_u = 0.05\ \text{min}^{-1}\) and \(\phi^{-1} = 0.0025\), where each replicate is evaluated at the same 10 time points.

## model parameters
set.seed(1)
S <- 10
times <- (0:10)*10
pars <- list(k = 0.1, m = 2, sigma_u = 0.05, phi_inv = 0.0025)
pars[["ks"]] <- with(pars, rnorm(S, mean = k, sd = sigma_u))

## helper function to integrate alpha_s(t)
alpha_s <- function(pars, times, s = 1) {
  as.data.frame(
    ode(
      y = c(alpha = 0),
      times = times,
      func = function(t, y, ...) list(pars[["ks"]][s] * (1 - y)^pars[["m"]])
    )
  )
}

## generate y_s(t)
y_obs <- lapply(seq_len(S), function(s) {
  y_s <- alpha_s(pars, times, s)
  y_s[["y"]] <- with(pars, rbeta(nrow(y_s), shape1 = y_s[["alpha"]] / phi_inv, shape2 = (1 - y_s[["alpha"]]) / phi_inv))
  y_s[["rep"]] <- sprintf("Replicate %02d", s)
  y_s
})

head(y_obs[[1]])
#>   time     alpha         y          rep
#> 1    0 0.0000000 0.0000000 Replicate 01
#> 2   10 0.4071514 0.4540743 Replicate 01
#> 3   20 0.5786884 0.5677154 Replicate 01
#> 4   30 0.6732356 0.6897440 Replicate 01
#> 5   40 0.7331251 0.7023899 Replicate 01
#> 6   50 0.7744618 0.7755208 Replicate 01

Nonlinear least squares

For a single observed reaction process \(y_s(t)\), we can fit the integrated reaction model \(\alpha_s(t)\) by means of least squares optimization using the deSolve and FME-packages. First, given reaction model parameters \(k_s\), \(m\), we define a function that returns the vector of residuals \(y_s(t_i) - \alpha_s(t_i)\) for \(i = 1,\ldots,n\), where the numerical integration of \(\alpha_s(t)\) relies on deSolve::ode().

library(FME)

## single replicate residuals y_s(t) - alpha_s(t)
resid_s <- function(p, times, y_s) {
  y_s[["y"]] - alpha_s(p, times)[["alpha"]]
}

resid_s(p = list(ks = pars[["ks"]][1], m = 2), times, y_obs[[1]])
#>  [1]  0.0000000000  0.0469228977 -0.0109729722  0.0165083908 -0.0307352512
#>  [6]  0.0010589923  0.0003623959 -0.0219780481 -0.0180789262 -0.0122449312
#> [11] -0.0191027469

Secondly, we pass the defined residual function to FME::modFit(), which provides a convenient wrapper for various least squares optimization routines, defaulting to the Levenberg-Marquardt algorithm available through minpack.lm::nls.lm(). Note that FME::modFit() is not restricted to only least squares optimization for ODE models, but historically is aimed at this context.

## single replicate nls optimization
alpha_nls <- modFit(
  f = resid_s,                ## residual function
  p = c(ks = 0, m = 1),       ## parameter initializations
  times = times,              ## additional arguments to resid_s
  y_s = y_obs[[1]]
)

summary(alpha_nls)
#> 
#> Parameters:
#>    Estimate Std. Error t value Pr(>|t|)    
#> ks 0.085421   0.006826   12.51 5.38e-07 ***
#> m  2.299285   0.097091   23.68 2.04e-09 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.01563 on 9 degrees of freedom
#> 
#> Parameter correlation:
#>        ks      m
#> ks 1.0000 0.9238
#> m  0.9238 1.0000

The following plot displays the least squares fits \(\hat{\alpha}_s(t)\) obtained with FME::modFit() for each individual replicate \(s = 1,\ldots,S\):

MCMC sampling

Based on the individual least squares fits we can quickly assess whether the reaction-order models provide reasonable fits to the observed data, but obtaining least squares or maximum likelihood estimates based on the complete Beta regression model (including random effects) is less straightforward4. In Stan, on the other hand, the regression model is implemented as a standard random effects model by specifying the relevant contributions to the joint (log-)probability function as usual. The fact that the conditional expectations \(\mathbb{E}[ y_s(t)\, |\, \alpha_s(t)]\) are based on numerical (ODE) integration is not fundamentally different for the MCMC sampler from any other nonlinear (or linear) function specification for \(\alpha_s(t)\).

The following Stan program encodes the regression model using a non-centered parameterization for the random effects \(k_1,\ldots,k_S\). The reaction trends \(\alpha_s(t)\) are integrated using the ode_rk45 function analogous to the previous example. Both the random effects parameters and the smooth reaction trends are specified in the transformed parameters block, but these could also have been specified in the model block. The specification of the Beta likelihood for \(y_s(t)\, |\, \alpha_s(t), \phi\) in the model block is similar to the model description above. Note that the somewhat contrived data format for y_s and alpha_s (an array of vectors) is used because this is the format returned by ode_rk45.

// ex2.stan (Stan >= 2.24)
functions {
  vector deriv(real time, vector y, vector ks, real m) {
    int S = rows(ks);
    vector[S] dydt = ks .* pow(1 - y, rep_vector(m, S));
    return dydt;
  }
}
data {
  int<lower = 1> n;                       // # observations
  int<lower = 1> S;                       // # replicates
  real times[n];                          // observed times
  vector<lower = 0, upper = 1>[S] y_s[n]; // observed trends
}
transformed data {
  vector[S] y0 = rep_vector(0.0, S);
}
parameters {
  real<lower = 0> k;                 // mean rate constant
  real<lower = 0, upper = 3> m;      // reaction order
  real<lower = 0> sigma_u;           // replicate sd
  real<lower = 1E-10> phi_inv;       // residual sd scale
  vector[S] u_s;                     // random effects          
}
transformed parameters{
  vector[S] ks;
  vector[S] alpha_s[n];
  ks = k + sigma_u * u_s;
  alpha_s = ode_rk45(deriv, y0, 0, times, ks, m); 
}
model {
  // (naive) normal priors
  k ~ normal(0, 1);
  m ~ normal(1, 1);
  sigma_u ~ normal(0, 1);
  phi_inv ~ normal(0, 1);
  // likelihood
  u_s ~ std_normal();
  for(s in 1:S) 
  y_s[][s] ~ beta(alpha_s[][s] / phi_inv, (1 - alpha_s[][s]) / phi_inv);
}

Stan model compilation and sampling is again executed with CmdStan (>=2.24) using the cmdstanr-package. Note that we remove the observations at time zero in the input data argument passed to Stan.

## 1) compile model
ex2_model <- cmdstan_model("ex2.stan")
## 2) sample posteriors
ex2_fit <- ex2_model$sample(
  seed = 1,
  data = list(
    n = length(times) - 1,
    S = S,
    times = times[-1],
    y_s = sapply(y_obs, function(y_s) y_s$y[-1])
  ),
  chains = 4,
  parallel_chains = 4,
  iter_warmup = 1000,
  iter_sampling = 1000
)

Assessing CmdStan’s summary diagnostics, all MCMC chains seem to have converged correctly. Also, the posterior distributions of the main model parameters are all well-behaved as seen from the pairs plot below.

## model fit summary
ex2_fit$summary()
#> # A tibble: 125 x 10
#>    variable     mean   median      sd     mad       q5      q95  rhat ess_bulk
#>    <chr>       <dbl>    <dbl>   <dbl>   <dbl>    <dbl>    <dbl> <dbl>    <dbl>
#>  1 lp__      2.48e+2  2.49e+2 3.59e+0 3.43e+0  2.42e+2  2.53e+2  1.00     692.
#>  2 k         1.14e-1  1.14e-1 1.68e-2 1.58e-2  8.66e-2  1.42e-1  1.00     754.
#>  3 m         2.08e+0  2.08e+0 3.35e-2 3.34e-2  2.02e+0  2.13e+0  1.00    1798.
#>  4 sigma_u   4.88e-2  4.63e-2 1.41e-2 1.20e-2  3.15e-2  7.43e-2  1.00     701.
#>  5 phi_inv   2.22e-3  2.18e-3 3.58e-4 3.40e-4  1.71e-3  2.87e-3  1.00    2222.
#>  6 u_s[1]   -9.36e-1 -9.22e-1 4.08e-1 4.00e-1 -1.63e+0 -2.73e-1  1.00     730.
#>  7 u_s[2]    3.75e-2  3.96e-2 3.35e-1 3.27e-1 -5.39e-1  5.80e-1  1.00     791.
....
library(bayesplot)

## pairs plot of posteriors
color_scheme_set("red")
mcmc_pairs(ex2_fit$draws(c("k", "m", "sigma_u", "phi_inv")), off_diag_fun = "hex")

As expected, the posterior reaction trend fits exhibit a smaller overall mean squared error (MSE) compared to the individual least squares fits, as seen in the plot below by comparing the target reaction trends \(\alpha_s(t)\) to the posterior predicted median (pointwise in time) trends5. This is attributed to the following facts: (i) the Stan model corresponds exactly to the generative model of the observed processes \(y_s(t)\) for \(s = 1,\ldots,S\), and (ii) the random effects model allows to share information across replicates, which is not possible in the individual least squares fits. Moreover, statistical (posterior) inference for any quantity of interest should be straightforward using the sampled posterior parameters and posterior predicted reaction trends. To illustrate, for each replicate \(s\), the plot also includes 0.5% and 99.5% posterior quantiles (pointwise in time) for the replicate-specific posterior predicted reaction trends.

Session Info

sessionInfo()
#> R version 4.0.2 (2020-06-22)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 18.04.5 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] bayesplot_1.8.0   FME_1.3.6.1       coda_0.19-4       rootSolve_1.8.2.1
#>  [5] cmdstanr_0.3.0    deSolve_1.28      data.table_1.13.6 ggplot2_3.3.3    
#>  [9] kableExtra_1.3.1  knitr_1.31       
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.6        lattice_0.20-41   ps_1.5.0          utf8_1.1.4       
#>  [5] assertthat_0.2.1  digest_0.6.27     R6_2.5.0          plyr_1.8.6       
#>  [9] ggridges_0.5.3    backports_1.2.1   evaluate_0.14     httr_1.4.2       
#> [13] highr_0.8         blogdown_1.2      pillar_1.4.7      rlang_0.4.10     
#> [17] rstudioapi_0.13   minqa_1.2.4       hexbin_1.28.2     jquerylib_0.1.3  
#> [21] checkmate_2.0.0   rmarkdown_2.6.6   labeling_0.4.2    webshot_0.5.2    
#> [25] stringr_1.4.0     munsell_0.5.0     compiler_4.0.2    xfun_0.22        
#> [29] pkgconfig_2.0.3   htmltools_0.5.1.1 tidyselect_1.1.0  gridExtra_2.3    
#> [33] tibble_3.0.6      bookdown_0.21     fansi_0.4.2       viridisLite_0.3.0
#> [37] crayon_1.4.1      dplyr_1.0.4       withr_2.4.1       MASS_7.3-52      
#> [41] grid_4.0.2        jsonlite_1.7.2    gtable_0.3.0      lifecycle_0.2.0  
#> [45] DBI_1.1.1         magrittr_2.0.1    posterior_0.1.3   scales_1.1.1     
#> [49] cli_2.3.0         stringi_1.5.3     reshape2_1.4.4    farver_2.0.3     
#> [53] xml2_1.3.2        bslib_0.2.4       ellipsis_0.3.1    generics_0.1.0   
#> [57] vctrs_0.3.6       tools_4.0.2       glue_1.4.2        purrr_0.3.4      
#> [61] processx_3.4.5    abind_1.4-5       yaml_2.2.1        colorspace_2.0-0 
#> [65] rvest_0.3.6       minpack.lm_1.2-1  sass_0.3.1

References

Khawam, A., and D. R. Flanagan. 2006. “Solid-State Kinetic Models: Basics and Mathematical Fundamentals.” Journal of Physical Chemistry B 110(45): 17315–28. https://doi.org/10.1021/jp062746a.


  1. For additional details, see https://mc-stan.org/users/documentation/case-studies/convert_odes.html↩︎

  2. Except possibly in terms of computational efficiency. For efficiently solving a system of linear ODE’s, see e.g. https://mc-stan.org/docs/2_26/stan-users-guide/solving-a-system-of-linear-odes-using-a-matrix-exponential.html↩︎

  3. The Arrhenius parameters \(A\) and \(E_a\) are highly correlated in their posteriors, but this does not cause any major sampling issues for Stan.↩︎

  4. The FME::modMCMC() function allows for MCMC sampling using a number of custom Metropolis algorithms, but is aimed primarily at fitting (ODE) signal plus i.i.d. noise models, see e.g. https://cran.r-project.org/web/packages/FME/vignettes/FMEmcmc.pdf↩︎

  5. For simplicity, in ex2.stan, the predicted reaction trends are only evaluated at the observed time points. To evaluate trend predictions at different time points, an additional generated quantities block can be included in the Stan program analogous to ex1.stan.↩︎

Joris Chau
Joris Chau
Statistician/Data Scientist
Next
Previous

Related