NNMoMo: An R Package for Mortality Modeling with Neural Networks

library(NNMoMo)

Before a model can be trained, it needs to be set up.

The neural network can be customized through the arguments of lcNN(), but the default configuration has been tested and found to be sufficient for most applications. In this example, two different models are configured to show how the outcomes differ.

First, a basic model is configured using the default parameters: a “linear” activation function, an “MSE” loss function and “FCN” connections.

Second, a model with “CNN” connections of the neurons and a “Poisson” loss function is defined. In addition, the activation function is switched to “tanh” and q_z1 is increased to 50; these two modifications are rather minor and are expected to have little effect on the overall results.

model_basic <- lcNN()

model_CNN_Poisson <- lcNN(loss_type = "Poisson",
                          activation = "tanh", 
                          model_type = "CNN", 
                          q_z1 = 50)
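To give some intuition for the two loss types named above, the following base-R sketch evaluates a mean squared error on log mortality rates and a Poisson negative log-likelihood on death counts for toy data. The function names mse_loss() and poisson_loss() and all numbers are illustrative assumptions; the exact loss definitions used inside NNMoMo may differ.

```r
# Toy data: observed deaths, exposures, and model-implied death rates.
deaths   <- c(12, 30, 75)
exposure <- c(10000, 9000, 8000)
m_obs    <- deaths / exposure
m_hat    <- c(0.0010, 0.0035, 0.0100)

# Mean squared error on the log mortality rates (illustrative definition).
mse_loss <- function(m_obs, m_hat) {
  mean((log(m_obs) - log(m_hat))^2)
}

# Poisson negative log-likelihood, assuming deaths ~ Poisson(exposure * rate).
# The term that does not depend on m_hat is dropped.
poisson_loss <- function(deaths, exposure, m_hat) {
  mu <- exposure * m_hat
  sum(mu - deaths * log(mu))
}

mse_loss(m_obs, m_hat)
poisson_loss(deaths, exposure, m_hat)
```

The Poisson variant weights each age and year by its exposure, whereas the squared error treats all log rates equally.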

The configuration of the model can be checked by printing it or using summary().

model_CNN_Poisson
#> NNMoMo object: a Poisson neural network model of type CNN with tanh activation

summary(model_CNN_Poisson)
#> Summary of NNMoMo object:
#>   Model type : CNN 
#>   Activation : tanh 
#>   Loss       : Poisson 
#>   q_e        : 10 
#>   q_z1       : 50

Next, the data needs to be set up for analysis. Country-specific datasets can, for example, be downloaded from the Human Mortality Database or from other reliable sources of mortality data. The datasets for the USA, Canada, Australia, Japan, and Great Britain have already been downloaded and are included with the package (note that they may be outdated). These datasets are used in this example.

nn_data <- NNMoMoData(NNMoMo_data_USA, 
                      NNMoMo_data_CAN,
                      NNMoMo_data_AUS,
                      NNMoMo_data_JPN,
                      NNMoMo_data_GBR)

Alternatively, data can be downloaded using demography::hmd.mx() or by calling NNMoMoData() without any arguments. In the latter case, a predefined list of 40 countries is downloaded automatically.
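A manual download with demography::hmd.mx() might look like the sketch below. It assumes a registered Human Mortality Database account. The environment variable names HMD_USER and HMD_PASSWORD are hypothetical, and the code simply leaves can_demog as NULL if the demography package or the credentials are missing. How the downloaded object is then passed to NNMoMoData() is not shown here and should be checked against the package documentation.

```r
# Read HMD credentials from environment variables (hypothetical names).
hmd_user <- Sys.getenv("HMD_USER")
hmd_pass <- Sys.getenv("HMD_PASSWORD")

can_demog <- NULL
if (requireNamespace("demography", quietly = TRUE) &&
    nzchar(hmd_user) && nzchar(hmd_pass)) {
  # hmd.mx() returns a 'demogdata' object with rates and exposures by age and year.
  can_demog <- tryCatch(
    demography::hmd.mx(country = "CAN", username = hmd_user,
                       password = hmd_pass, label = "Canada"),
    error = function(e) {
      message("Download failed: ", conditionMessage(e))
      NULL
    }
  )
}
```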

Note that at least 10 countries should normally be included to ensure the neural network has sufficient data.

Again, the resulting data object can be printed or summarized.

nn_data
#> NNMoMoData object:
#>   Countries : 5 
#>   Years     : 1921-2023 
#>   Rows      : 946

summary(nn_data)
#> Summary of NNMoMoData object:
#>   Number of countries : 5 
#>   Countries           : AUS, CAN, GBR, JPN, USA 
#>   Sexes               : female, male 
#>   Years               : 1921 - 2023 
#>   Ages                : 0 - 110 
#>   Total rows          : 946

Once the model has been configured and the data prepared, it can be fitted. At this stage, additional parameters can be specified. Most importantly, fitting.epochs determines the number of training epochs, while years.fit and ages.fit define the range of years and ages considered. In this example, the fit is restricted to the years 1950 to 1999 and the ages 0 to 99, and the number of epochs is set to 5 to reduce computation time in the vignette. This is far too few for a proper fit; around 2000 epochs are recommended.

fitted_basic <- fit(model_basic,
                    nn_data, 
                    years.fit = 1950:1999,
                    ages.fit = 0:99,
                    fitting.epochs = 5)

fitted_CNN_Poisson <- fit(model_CNN_Poisson, 
                          nn_data, 
                          years.fit = 1950:1999, 
                          ages.fit = 0:99,
                          fitting.epochs = 5)
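The effect of years.fit and ages.fit can be pictured as selecting a sub-block from an age-by-year table of rates. The base-R sketch below mimics this on a toy matrix of random numbers. It only illustrates the idea and is not the package's internal code; the object names rates and rates_fit are assumptions.

```r
# Toy matrix of death rates: rows are ages 0-110, columns are years 1921-2023.
ages  <- 0:110
years <- 1921:2023
set.seed(1)
rates <- matrix(runif(length(ages) * length(years), 0.001, 0.1),
                nrow = length(ages),
                dimnames = list(age = ages, year = years))

ages.fit  <- 0:99
years.fit <- 1950:1999

# Keep only the requested ages and years (matched by name, not by position).
rates_fit <- rates[as.character(ages.fit), as.character(years.fit)]

dim(rates_fit)
#> [1] 100  50
```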

After the fitting process, the calculated Lee–Carter models for each country and sex are stored in list-like objects. Information about the fitting process and the fitted models can be obtained by printing or summarizing.

fitted_CNN_Poisson
#> fitStMoMo_list object: a list of 'fitNNMoMo' / 'fitStMoMo' objects
#> Contains Poisson neural network models of type CNN with tanh activation

summary(fitted_CNN_Poisson)
#> Summary of fitStMoMo_list object:
#> 
#>   Countries         : AUS, CAN, GBR, JPN, USA 
#>   Sexes             : female, male 
#> 
#> Fitting parameters:
#>   Ages fitted:      0-99 
#>   Years fitted:     1950-1999 
#>   Epochs:           5 
#>   Batch size:       128 
#> 
#> Model configuration:
#>   Activation:       tanh 
#>   Model type:       CNN 
#>   Loss type:        Poisson 
#>   q_e:              10 
#>   q_z1:             50 
#> 
#> Results:
#>   Final training loss: 417646.9
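For readers unfamiliar with the Lee-Carter structure, each fitted model describes log mortality rates as log m(x,t) = a_x + b_x * k_t, with an age profile a_x, an age sensitivity b_x and a period index k_t. The base-R sketch below fits this structure to simulated data using the classical singular value decomposition approach. This is not how NNMoMo estimates the parameters (the package uses a neural network), and all object names and numbers are illustrative assumptions.

```r
set.seed(42)
ages  <- 0:9
years <- 1950:1969

# Simulate log death rates with a known Lee-Carter structure plus small noise.
a_true <- seq(-6, -2, length.out = length(ages))
b_true <- rep(1 / length(ages), length(ages))
k_true <- seq(5, -5, length.out = length(years))
noise  <- matrix(rnorm(length(ages) * length(years), sd = 0.01),
                 nrow = length(ages))
log_m  <- a_true + outer(b_true, k_true) + noise  # a_true is added row-wise

# Classical estimation: a_x is the row mean, b_x and k_t come from the
# leading singular vectors of the centered matrix.
ax <- rowMeans(log_m)
sv <- svd(log_m - ax)
scale_u <- sum(sv$u[, 1])
bx <- sv$u[, 1] / scale_u            # constraint: sum(bx) = 1
kt <- sv$d[1] * sv$v[, 1] * scale_u  # sum(kt) = 0 follows from centering

fitted_log_m <- ax + outer(bx, kt)
max(abs(fitted_log_m - log_m))       # small, of the order of the noise
```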

The models can then be further analyzed and processed using native StMoMo functions. Because the results from only 5 epochs are insufficient for proper visualization, pre-trained versions of both models, fitted with 2000 epochs and more countries, are used for plotting. For example, the fitted models for females from the USA can be plotted with:

plot(fitted_basic_2000$USA_female)

plot(fitted_CNN_Poisson_2000$USA_female)

The native StMoMo method for residuals() allows for scaling, which is not supported in the current NNMoMo implementation. A method that raises an error when scale = TRUE is requested has therefore been implemented.

try(residuals(fitted_basic_2000$USA_female, scale = TRUE))
#> Error in residuals.fitNNMoMo(fitted_basic_2000$USA_female, scale = TRUE) : 
#>   'scale = TRUE' is not allowed for objects from fit.NNMoMo(). Residuals cannot be scaled when using a neural network.

Moreover, computation of the log-likelihood via logLik() has not yet been implemented for fitted models in the package, so an error is raised.

try(logLik(fitted_basic_2000$USA_female))
#> Error in logLik.fitNNMoMo(fitted_basic_2000$USA_female) : 
#>   Log-likelihood computation has not yet been implemented in NNMoMo
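In scripts that loop over many fitted models, errors like the ones above can be caught with tryCatch() so that processing continues. The sketch below uses a stand-in function, not_implemented(), that only mimics the error message shown above. Both it and safe_loglik() are hypothetical helpers and not part of NNMoMo.

```r
# Stand-in for a method that is not implemented (hypothetical helper).
not_implemented <- function(object) {
  stop("Log-likelihood computation has not yet been implemented in NNMoMo")
}

# Return NA instead of stopping when the call fails.
safe_loglik <- function(object) {
  tryCatch(not_implemented(object),
           error = function(e) {
             message("Skipping log-likelihood: ", conditionMessage(e))
             NA_real_
           })
}

result <- safe_loglik(list())
is.na(result)
#> [1] TRUE
```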