This tutorial addresses modeling of mortality across multiple populations using the classical full-rank kernel for factor covariates. We present a case study written in RMarkdown.
Loading all packages into the current working directory:
DiceKriging
: kriging methods for single-output GP (SOGP).rgenoud
: genetic optimization solver used for Maximum Likelihood Estimation of GP hyperparameters.kergp
: kriging methods for user-defined kernel, needed for both Full-rank and ICM Multi-output GP.nloptr
: solve nonlinear optimization using R interface to NLopt.data.table
: extension of data.frame
to enhance data manipulation.Loading all user-defined functions:
Two important assumptions are made in the above equation:
Lastly, we create an R function to assign integer labels for all populations in a MOGP model. It is equivalent to create dummy variables for a categorical variable.
We first prepare an aggregated data that combines Danish, French, Swiss, and UK mortality for Male observations, on Ages 70–84 and Years 1990–2016. Then, we assign integer labels for these populations such that: \(\text{Denmark}\equiv 1\), \(\text{France}\equiv 2\), \(\text{Sweden}\equiv 3\), and \(\text{UK}\equiv 4\).
mortData = createMortData(year_start=1990, year_end=2016,
age_start=70, age_end=84,
sex="m", sex_cat="no")
# contain mortality data for all 16 Male populations
ctry = c("Sweden","Denmark","France","UK")
mortData = mortData[country %in% ctry] # select data for 4 populations above
mortData = intCtry(ctry)
The output vector (\(y\)) and input vector (\(X\)) must be provided. We change names of the columns in \(X\) to match the input names defined in \(\color{blue}{\texttt{covFull.R}}\) script.
y = mortData[,y] # output values
X = mortData[,.(age,year,popN)] # input values
X = as.matrix(X,ncol=3)
X = as.data.frame(X)
names(X) = c("x1","x2","x3") # age: x1,
Prior to fitting the model, we also need to specify the number of populations. In this example, we have 4 populations. The function \(\color{blue}{\texttt{gp()}}\) within kergp
package can be called to fit the model.
num_pop = 4
gpFit <- gp(formula = y ~ X$x1 + as.factor(X$x3),
data = data.frame(y, X),
inputs = names(X),
cov = covFull(num_pop),
compGrad = FALSE,
estim = TRUE,
noise = TRUE,
varNoiseIni = 1.0e-3,
varNoiseLower = 1.0e-5,
varNoiseUpper = 1.0e-2,
parCovIni = c(15,15,
rep(0.10,ncol(combn(1:num_pop,2))),
0.02),
optimMethod = "NLOPT_LN_COBYLA",
opts=list("xtol_rel" = 1.0e-5,
"check_derivatives_tol" = 1.0e-5,
"population" = 150,
# "print_level" = 2,
"maxeval" = 150))
There are several important arguments in \(\color{blue}{\texttt{gp()}}\) function:
doParallel
).The summary output of the fitted model shows the parameters in the mean and covariance function:
Call:
gp(formula = y ~ X$x1 + as.factor(X$x3), data = data.frame(y,
X), inputs = names(X), cov = covFull(num_pop), estim = TRUE,
compGrad = FALSE, noise = TRUE, varNoiseIni = 0.001, varNoiseLower = 1e-05,
varNoiseUpper = 0.01, parCovIni = c(15, 15, rep(0.1, ncol(combn(1:num_pop,
2))), 0.02), optimMethod = "NLOPT_LN_COBYLA", opts = list(xtol_rel = 1e-05,
check_derivatives_tol = 1e-05, population = 150, maxeval = 150))
Number of observations: 1620
Trend coef.:
Value
(Intercept) -10.48524
X$x1 0.10002
as.factor(X$x3)2 -0.11554
as.factor(X$x3)3 -0.15023
as.factor(X$x3)4 -0.02777
Covariance whith class "covMan"
'User' covariance kernel
o Description: myGauss
o Dimension 'd' (nb of inputs): 3
o Parameters: "theta.ag", "theta.yr", "pop12", "pop13", "pop14", "pop23", "pop24", "pop34", "eta2"
o Number of parameters: 9
o Accept matrix inputs.
o Param. values:
Value Lower Upper
theta.ag 11.19665 5e+00 30.00
theta.yr 8.13307 5e+00 30.00
pop12 0.18673 1e-06 0.25
pop13 0.15395 1e-06 0.25
pop14 0.14343 1e-06 0.25
pop23 0.20068 1e-06 0.25
pop24 0.19093 1e-06 0.25
pop34 0.16848 1e-06 0.25
eta2 0.03441 1e-06 0.25
Noise variance: 0.001
The mean function can be extracted from the model output:
Intercept | -10.4852 |
Age | 0.1000 |
FRA (vs. DEN) | -0.1155 |
SWE (vs. DEN) | -0.1502 |
GBR (vs. DEN) | -0.0278 |
Denmark is chosen to be the baseline in the mean function. Across all four populations, the common linear trend in Age is approximately 0.1, meaning that mortality rate increases by 10% for each additional year of Age.
The covariance \(\Gamma_{l_1,l_2}\) is driven by the hyperparameters \(\theta_{l_1,l_2}\): large value of \(\theta_{l_1,l_2}\) implies low correlation \(r_{l_1,l_2}:=\exp{(-\theta_{l_1,l_2})}\) between the two populations. We first extract the hyperparameters \(\theta_{l_1,l_2}\)s’ and then compute the cross-correlation:
DEN & FRA | 0.8297 |
DEN & SWE | 0.8573 |
DEN & GBR | 0.8664 |
FRA & SWE | 0.8182 |
FRA & GBR | 0.8262 |
SWE & GBR | 0.8449 |
In this section, we will illustrate how to derive the distribution of the backward-looking annual improvements for Ages 70–84 in 2016: \[MI_{back}^{obs}(x_{ag};2016)=1-\dfrac{\exp{\big(y(x_{ag};2016)\big)}}{\exp{\big(y(x_{ag};2015)\big)}}=1-\exp{\big(y(x_{ag};2016)-y(x_{ag};2015)\big)}\] with \(y(x_{ag},2016)\) and \(y(x_{ag},2015)\) the raw log-mortality rate for \((x_{ag},2016)\) and \((x_{ag},2015)\). Furthermore, we will compare the smoothed improvement factors and their 95% credible bands derived from the fitted full-rank MOGP model in the previous section in contrast to the single-population model.
Improvement rates via 4-population full-rank MOGP
Let’s first create a test set that contains forecasted Ages in two calendar years 2015 and 2016 for 4 countries: Denmark, France, Sweden, and UK. Then, we apply the function \(\color{blue}{\texttt{gp.predict()}}\) in \(\color{blue}{\texttt{gp_predictionFull.R}}\) to obtain \(\mathbf{f_*}\) (the predicted means) along with the cross-covariance at new input \(\mathbf{x_*}\).
agesForecast = 70:84
year = 2016
t1 = year; t0 = year-1
nAg = length(agesForecast)
xPred = data.frame(x1 = rep(agesForecast,2*num_pop),
x2 = rep(c(rep(t0,nAg),rep(t1,nAg)),num_pop),
x3 = rep(1:num_pop,each=nAg*2))
We then extract \(\sigma^2_l\) from single-population models. In this case, we want \(\sigma^2_{DEN}\), \(\sigma^2_{FRA}\), \(\sigma^2_{FRA}\), and \(\sigma^2_{GBR}\) resulted from the single-population models, fitted on Male populations in these 4 populations, Ages 70–84 and Years 1990–2016. The nugget_mortData.Rda
is the R dataset that contains the estimated \(\sigma^2_l\) in SOGP models for 16 Male populations in this study.
# nugget extracted from individual models:
load("nugget_mortData.Rda")
nugget = nugget[country %in% ctry & nug.year==2016]
nugget = nugget[order(country)]
(list.nug = nugget$nug)
[1] 0.0016012 0.0003384 0.0008631 0.0006554
Finally, we can apply the function \(\color{blue}{\texttt{gp.predict()}}\) in \(\color{blue}{\texttt{gp_prediction.R}}\) for prediction:
# prediction using 4-population rull-rank MOGP
out = gp.predict(newdata = xPred,
gpmodel = gpFit,
list.noise = list.nug,
meanTr = "linearAg", # linear in Age only
typePred = "f") # fhat
In this example, we will focus on the improvement rates in Sweden. The results for other countries can be easily replicated. Below, we extract \(\mathbf{f_*}\) and the covariance for new input \(\mathbf{x_*}\) in Sweden.
# extract the predicted mean and variance:
res = out$res
res_Swe = res[res$popN==3,] # results for Sweden
Examples of Swedish prediction, including the predicted means and variances:
age | year | popN | mean | vr | std | |
---|---|---|---|---|---|---|
61 | 70 | 2015 | 3 | -4.049 | 1e-04 | 0.0094 |
76 | 70 | 2016 | 3 | -4.057 | 1e-04 | 0.0117 |
62 | 71 | 2015 | 3 | -3.950 | 1e-04 | 0.0074 |
77 | 71 | 2016 | 3 | -3.960 | 1e-04 | 0.0097 |
Examples of the covariance matrix at a few input \(\mathbf{x_*}\) for Sweden:
(70, 2016) | (71, 2016) | (72, 2016) | (73, 2016) | (74, 2016) | |
---|---|---|---|---|---|
(70, 2015) | 1.02e-04 | 7.78e-05 | 5.64e-05 | 3.81e-05 | 2.32e-05 |
(71, 2015) | 7.78e-05 | 6.53e-05 | 5.30e-05 | 4.12e-05 | 3.04e-05 |
(72, 2015) | 5.64e-05 | 5.30e-05 | 4.81e-05 | 4.21e-05 | 3.53e-05 |
(73, 2015) | 3.81e-05 | 4.12e-05 | 4.21e-05 | 4.09e-05 | 3.78e-05 |
(74, 2015) | 2.32e-05 | 3.04e-05 | 3.53e-05 | 3.78e-05 | 3.81e-05 |
We take advantage of GP properties such that a difference of two GPs is a GP with new mean and covariance function (E.g: \(\mathbb{E}(\mu_1-\mu_2)=\mathbb{E}(\mu_1)-\mathbb{E}(\mu_2)\) and \(\mathbb{V}(\mu_1-\mu_2)=\mathbb{V}(\mu_1)+\mathbb{V}(\mu_2)-2Cov(\mu_1,\mu_2)\)). Then, compute the 95% quantiles of a normal distribution with a given mean and variance:
# mean and variance of the difference:
predictedlogMortDiff = data.frame(age = agesForecast, year = rep(year,nAg))
predictedlogMortDiff$m = res_Swe$mean[(nAg+1):(2*nAg)] - res_Swe$mean[1:nAg]
predictedlogMortDiff$sd2 = res_Swe$vr[1:nAg] + res_Swe$vr[(nAg+1):(2*nAg)] -
2*diag(cov_Swe[1:nAg,(nAg+1):(2*nAg)])
# 95% quantiles (lower and upper) given the mean and variance of the difference:
predictedlogMortDiff$lower <- qnorm(0.025, mean=predictedlogMortDiff$m,
sd=sqrt(predictedlogMortDiff$sd2))
predictedlogMortDiff$upper <- qnorm(0.975, mean=predictedlogMortDiff$m,
sd=sqrt(predictedlogMortDiff$sd2))
Given the distribution of the difference (follows a normal distribution), we can derive the distribution of a transformed variable, which is a one-on-one function of the difference.
If \(U\) has a normal distribution (\(U \sim \mathcal{N}(m,\sigma^2)\)), then the exponential function of \(U\), \(W=\exp{(U)}\), has a log-normal distribution with \(\mathbb{E}(W)=\exp{\Big(m+\dfrac{\sigma^2}{2}\Big)}\) and \(\mathbb{V}(W)=\exp{(\sigma^2-1)\exp{(2m+\sigma^2)}}\).
# mean and variance of a differnece in log mortality (U)
mu <- predictedlogMortDiff$m; sd2 <- predictedlogMortDiff$sd2
predictedMortDiffFull <- data.frame(age = agesForecast, year = rep(year,nAg))
# mean, variance, and 95% CB of transformed variable: W = 1-exp(W)
predictedMortDiffFull$m <- 1-exp(mu+sd2/2)
predictedMortDiffFull$sd2 <- exp(2*mu+sd2)*(exp(sd2)-1)
predictedMortDiffFull$upper <- 1-exp(predictedlogMortDiff$lower)
predictedMortDiffFull$lower <- 1-exp(predictedlogMortDiff$upper)
Multi-output GP is the generalized framework based on Single-population Gaussian Process study, written by Michael Ludkovski, Jimmy Risk, and Howard Zail. The paper can be found here with an R notebook publicly available at: https://github.com/jimmyrisk/GPmortalityNotebook.
We employ \(\color{blue}{\texttt{km()}}\) function in the package DiceKriging
to fit single-population model for Male observations, Ages 70–84 and Years 1990–2016 in Sweden.
mortData = createMortData(year_start=1990, year_end=2016,
age_start=70, age_end=84,
sex="m", sex_cat="no")
mortData = mortData[country=="Sweden"]
xMort = data.frame(age = mortData$age,
year = mortData$year)
yMort = mortData$y
# model fitting:
mortSingle_nug = km(formula = ~x.age,
design = data.frame(x = xMort), response = yMort,
nugget.estim=TRUE,
covtype="gauss",
optim.method="gen",
upper = c(45,45),
control=list(max.generations=100,pop.size=100,
wait.generations=10,
solution.tolerance=1e-5,
print.level = 0))
optimisation start
------------------
* estimation method : MLE
* optimisation method : gen
* analytical gradient : used
* trend model : ~x.age
* covariance model :
- type : gauss
- nugget : unknown homogenous nugget effect
- parameters lower bounds : 1e-10 1e-10
- parameters upper bounds : 45 45
- upper bound for alpha : 1
- best initial criterion value(s) : 830.0952
nug = mortSingle_nug@covariance@nugget
mortSingle = km(formula = ~x.age,
design = mortSingle_nug@X,
response = mortSingle_nug@y,
noise.var = rep(nug,mortSingle_nug@n),
coef.trend = mortSingle_nug@trend.coef,
coef.cov = mortSingle_nug@covariance@range.val,
coef.var = mortSingle_nug@covariance@sd2,
covtype = mortSingle_nug@covariance@name)
Summary of the fitted SOGP model:
Call:
km(formula = ~x.age, design = data.frame(x = xMort), response = yMort,
covtype = "gauss", nugget.estim = TRUE, optim.method = "gen",
upper = c(45, 45), control = list(max.generations = 100,
pop.size = 100, wait.generations = 10, solution.tolerance = 1e-05,
print.level = 0))
Trend coeff.:
Estimate
(Intercept) -10.9148
x.age 0.1040
Covar. type : gauss
Covar. coeff.:
Estimate
theta(x.age) 19.2481
theta(x.year) 26.7457
Variance estimate: 0.09330939
Nugget effect estimate: 0.0008630776
Our goal is to estimate the smoothed improvement rates in Sweden via SOGP model and compare the results with the full-rank MOGP model in the previous section. Similar to what we have done earlier, we first derive the distribution of the difference between two log-mortality variables:
agesForecast = 70:84
year = 2016
t1 = year; t0 = year-1
nAg = length(agesForecast)
xPred <- data.frame(age=rep(agesForecast, 2),year=c(rep(t0,nAg),rep(t1,nAg)))
pred <- predict(mortSingle, newdata = data.frame(x=xPred),cov.compute=TRUE, type="UK")
predictedlogMortDiff <- data.frame(age = agesForecast, year = rep(year,nAg))
predictedlogMortDiff$m <- pred$m[(nAg+1):(2*nAg)] - pred$m[1:nAg]
predictedlogMortDiff$sd2 <- pred$sd[1:nAg]^2 + pred$sd[(nAg+1):(2*nAg)]^2 -
2*diag(pred$cov[1:nAg,(nAg+1):(2*nAg)])
predictedlogMortDiff$lower <- qnorm(0.025, mean=predictedlogMortDiff$m,
sd=sqrt(predictedlogMortDiff$sd2))
predictedlogMortDiff$upper <- qnorm(0.975, mean=predictedlogMortDiff$m,
sd=sqrt(predictedlogMortDiff$sd2))
Using variable transfromation, we derive the mean and 95% credible bands for the improvement rate factors:
mu <- predictedlogMortDiff$m; sd2 <- predictedlogMortDiff$sd2
predictedMortDiff <- data.frame(age = agesForecast, year = rep(year,nAg))
predictedMortDiff$m <- 1-exp(mu+sd2/2)
predictedMortDiff$sd2 <- exp(2*mu+sd2)*(exp(sd2)-1)
predictedMortDiff$upper <- 1-exp(predictedlogMortDiff$lower)
predictedMortDiff$lower <- 1-exp(predictedlogMortDiff$upper)
The large lengthscale in Age (\(\theta_{ag}\)) in SOGP model lead to essentially linear improvement rate factors (blue curves). In the four-population model (orange curves), the Age lengthscale decreases, so the improvement rate factors become more Age-dependent.