The analysis below applies the SOGP framework to investigate Covid-19 excess deaths. To do so we utilize the recently published short-term mortality fluctuation (STMF) dataset published by the Human Mortality Database. The STMF was expressly created to accomodate the critical need of short-term mortality analyses in response to the Covid-19 pandamic. The STMF contains detailed country-specific mortality, segregated by week (52 weeks per year), sex, and 3 age groups in 32 countries:

Austria (AUT)
Belgium (BEL)
Bulgaria (BGR)
Switzerland (CHE)
Czech Republic (CZE)
Germany (DEUTNP)
Denmark (DNK)
Spain (ESP)
Estonia (EST)
Finland (FIN)
France (FRATNP)
England and Wales (GBRTENW)
Northern Ireland (GBR_NIR)
Scotland (GBR_SCO)
Greece (GRC)
Croatia (HRV)
Hungary (HUN)
Iceland (ISL)
Israel (ISR)
Italy (ITA)
Lithuania (LTU)
Luxembourg (LUX)
Latvia (LVA)
Netherlands (NLD)
Norway (NOR)
Poland (POL)
Portugal (PRT)
Russia (RUS)
Slovakia (SVK)
Slovenia (SVN)
Sweden (SWE)
United States (USA)

The dataset is frequently updated. The analysis below is based on datasets last updated on September 17, 2020:

deaths <- readr::read_csv("https://www.mortality.org/Public/STMF/Outputs/stmf.csv", skip=1)

For our analysis we work with mortality rates (provided in STMF as the RTotal field) transformed to the log scale. We furthermore apply a few adjustments:

deaths <- data.table(deaths)
deaths <- deaths[Sex=="b",.(Year,Week,RTotal,DTotal,CountryCode)] 
deaths <- deaths[,y:=log(RTotal)]

\(\color{#c64329}{\textbf{Gaussian Process Regression in estimating 2020 excess deaths}}\)

Our primary goal is to estimate excess mortality during 2020. This is defined as the difference between the observed and the expected number of deaths. Epidemiologists use excess mortality as a standard tool to understand the full impact of the pandemic. It does not only reflect deaths dirctly due to Covid-19 but also other causes associated with the pandemic, such as delayed or forgone care due to hospitals prioritizing Covid-19 emergencies or patients themselves staying away.

To define excess mortality requires agreeing on the baseline, i.e. the expected number of deaths if Covid-19 did not occur. A very simple method is to look at historical weekly mortality and then average. For example, one common metric is the median mortality over the previous 5 years, 2015-2019. However, this approach ignores trends in the data (specifically the mortality improvement factors whereby mortality gets better YoY), is sensitive to data outliers, and most importantly does not provide good uncertainty quantification. UQ is critical to answer the question of whether observed death counts are statistically significant or can be attributed to “random” fluctuations. Mortality data is intrinsically volatile and it is not straightforward to distinguish between frequent fluctuations that could happen versus an unambiguous Covid-19 spike.

Our analysis uses statistical learning, namely GP models to answer the above. To do so, we build a GP model on the historical data, taking for our training set Years 2010-2019, as well as Weeks 1-5 of 2020. We then use this training set to make probabilistic forecasts for Weeks 6-present of 2020 and compare this forecast to the observed mortality. The vertical blue line in the figure below delineates our training set from the forecast period.

The Figure below displays the raw weekly log-mortality rates (after adjustments listed above) for each country indexed by weeks of the year. The 2020 observed log-mortality series is highlighted in red.

To apply the Gaussian process modeling, we treat data as a 2-dimensional table, indexed by Weeks (1-52, coordinate 1) and Years (2010-2020, coordinate 2):

Our goal is to use a Gaussian Process model to:

  1. smooth log-mortality rates in Year and Week dimensions;

  2. perform out-of-sample predictions in 2020 starting from Week 5 to Week 37 (middle of September 2020).

Note that GP is effectively a spatial model and does not have any time-series concepts embedded; it treats retrospective and future forecasts identically.

\(\color{#c64329}{\textbf{Sample Gaussian Process Regression for Sweden all-Age Mortality}}\) Below is an illustration on the application of Gaussian Process to analyze and forecast time-series data in Sweden.

 

\(\color{#c64329}{\textbf{Excess mortality}}\)

We can compare the GP-based mean forecast for 2020 against the observed weekly log-mortality rates to identify excess mortality. In the Figure below, the shaded region is the excess mortality starting at Week 11 (early March). Some countries have experienced much more severe impact than others. Observe that 2020 data ends at different weeks in different nations. We also note that in most countries there is a strong seasonal effect whereby mortality is expected to decrease as we enter the Summer months.

 

\(\color{#c64329}{\textbf{Credible Bands}}\)

To highlight the uncertainty quantification available with a GP model, we plot below the credible intervals around our GP-based 2020 forecasts. Below, we plot the mean forecasts for each country along with the 80% and 95% credible bands. We also show the 5-year medians and the observed 2020 rate for comparison purpose.

 

 

 

We note that in several countries, there is no statistically significant deviation from expected mortality. In other cases, such as Spain and USA, the Covid-19 spikes are clear.

A few remarks can be made relative to the historical 5-year median:

\(\color{#c64329}{\textbf{Excess deaths}}\)

Country Total Excess Deaths in 2020 Last Week Reported
Austria 363 35
Belgium 8328 35
Bulgaria -944 36
Switzerland 674 33
Czech Republic -411 30
Germany 4011 33
Denmark 363 36
Spain 43614 35
Estonia 148 36
Finland 297 34
France 18415 33
England and Wales 46768 36
Nothern Ireland 724 32
Scotland 3622 37
Greece -506 26
Croatia -599 14
Hungary -2242 33
Israel 321 34
Italy 48789 26
Lithuania 188 31
Latvia -386 37
Netherlands 7430 35
Norway 305 34
Poland 184 26
Portugal 3641 35
Slovakia -746 30
Slovenia -106 13
Sweden 5741 35
USA 216381 33

\(\color{#c64329}{\textbf{GP Diagnostics}}\)

For completeness, we report below the fitted GP mean functions for each country, as well as the GP hyperparameters. We note that our basic model had difficulty finding appropriate lengthscales for E&W and was fitted manually for now.

Intercept Year trend A_1 A_2 A_3 A_4
AUT -6.010 0.0007 0.0785 0.0531 0.0036 0.0306
BEL -1.517 -0.0016 0.0917 0.0720 0.0095 0.0261
BGR -18.014 0.0069 0.0937 0.0651 0.0275 0.0154
CHE -1.193 -0.0018 0.0879 0.0456 0.0006 0.0293
CZE -12.113 0.0037 0.0643 0.0478 -0.0024 0.0274
DEUTNP -16.387 0.0059 0.0787 0.0646 -0.0067 0.0335
DNK 3.331 -0.0040 0.0677 0.0569 0.0047 0.0194
ESP -26.612 0.0108 0.1048 0.0805 0.0365 0.0419
EST -5.406 0.0005 0.0758 0.0575 -0.0046 0.0245
FIN -13.872 0.0046 0.0636 0.0436 -0.0009 0.0116
FRATNP -22.259 0.0087 0.0988 0.0526 0.0130 0.0233
GBRTENW -13.563 0.0044 0.1077 0.0152 0.0161 -0.0019
GBR_NIR 2.126 -0.0034 0.1048 0.0604 0.0116 0.0175
GBR_SCO -20.849 0.0081 0.0913 -0.0189 0.0243 -0.0157
GRC -36.836 0.0160 0.0769 0.0526 0.0595 0.0442
HRV -22.022 0.0087 0.0705 0.0583 0.0141 0.0301
HUN -12.419 0.0040 0.0845 0.0573 -0.0009 0.0284
ISR 2.278 -0.0037 0.1123 0.0675 0.0131 0.0383
ITA -2.048 -0.0013 0.1022 0.0638 0.0397 0.0481
LTU -9.550 0.0026 0.0669 0.0626 0.0037 0.0112
LVA -12.836 0.0043 0.0798 0.0636 -0.0029 0.0119
NLD -23.324 0.0092 0.0780 0.0543 0.0073 0.0159
NOR 20.725 -0.0127 0.0858 0.0435 0.0117 0.0193
POL -24.111 0.0097 0.0746 0.0481 -0.0020 0.0146
PRT -28.526 0.0119 0.1410 0.0910 0.0403 0.0504
SVK -8.113 0.0017 0.0607 0.0457 0.0013 0.0286
SVN -26.044 0.0106 0.0900 0.0629 -0.0102 0.0360
SWE 23.256 -0.0139 0.0887 0.0569 0.0004 0.0229
USA -26.001 0.0105 0.0722 0.0370 0.0147 0.0065

Hyper-parameters in the covariance function for each country:

Year Lengthscale Week Lengthscale Process variance Observ. Noise
AUT 0.0820 3.8761 0.0019 0.0015
BEL 0.1108 1.9774 0.0021 0.0012
BGR 0.0134 1.6602 0.0027 0.0011
CHE 0.0776 3.3045 0.0016 0.0011
CZE 0.0618 3.1972 0.0018 0.0013
DEUTNP 0.0936 1.8566 0.0023 0.0005
DNK 0.0552 2.8559 0.0012 0.0011
ESP 0.0244 1.9043 0.0023 0.0004
EST 0.0128 2.2728 0.0014 0.0041
FIN 0.3635 1.7041 0.0013 0.0011
FRATNP 0.1073 1.9666 0.0016 0.0004
GBRTENW 0.1755 3.2533 0.0031 0.0036
GBR_NIR 0.1755 0.7333 0.0060 0.0091
GBR_SCO 0.7294 19.4302 0.0073 0.0032
GRC 0.0762 1.3665 0.0030 0.0012
HRV 0.1203 3.1614 0.0024 0.0026
HUN 0.1065 3.0152 0.0023 0.0019
ISR 0.0509 2.5408 0.0010 0.0016
ITA 0.3921 2.3206 0.0029 0.0008
LTU 0.5755 1.8085 0.0023 0.0020
LVA 0.5023 1.9217 0.0018 0.0023
NLD 0.1267 3.5372 0.0015 0.0007
NOR 0.1333 4.5492 0.0008 0.0017
POL 0.0681 2.8480 0.0018 0.0009
PRT 0.0795 1.8138 0.0030 0.0014
SVK 0.0234 2.8201 0.0021 0.0019
SVN 0.3811 2.6331 0.0022 0.0031
SWE 0.0023 2.5665 0.0011 0.0009
USA 0.3682 4.9320 0.0008 0.0001