Extending Bootstrap Aggregation of Neural Networks for Prediction with an Application to COVID-19 Forecasting

Document Type

Presentation Abstract

Presentation Date

4-23-2021

Abstract

The aim of the research discussed herein is to improve the forecasting accuracy of artificial neural networks. The focus is on forecasting for epidemiological purposes, and in particular on the problem of predicting case and death counts from seven to n days in the future for a spatially contiguous region such as a county. The task poses several challenges: the data are both spatially and temporally correlated, and the data sets are quite small for the intended purpose. To overcome these challenges, the methods attempt to exploit information induced by spatial and temporal dependencies. More importantly, we have developed a fusion of artificial neural networks and bootstrap methods.

Bootstrap aggregation (bagging) is an ensemble technique used for (1) reducing the variance of a prediction function, with a concurrent improvement in predictive accuracy, and (2) constructing prediction intervals. Random forests extend bagging by sampling predictor variables in addition to sampling observations, often with a dramatic improvement in accuracy compared to the base prediction function (binary recursive trees). The method developed herein resembles random forests, though there are important differences. To improve predictive accuracy and to construct prediction intervals, we apply the bagging mechanism to create a collection of fitted neural networks from a single data set. A forecast is the mean of the forecasts computed from each prediction function in the collection. We refer to this new approach as extended bagging.
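
As a rough illustration of the plain bagging mechanism described above (not the extended bagging method itself, whose details the abstract does not give), the following minimal Python sketch resamples a data set with replacement, fits one prediction function per resample, and averages the resulting forecasts. Everything named here is an assumption made for illustration: a least-squares line stands in for a fitted neural network, the interval is a simple percentile interval over the ensemble forecasts, and the function names, the number of models, and the toy data are hypothetical. The sketch also resamples observations independently, which ignores the spatial and temporal correlation the abstract highlights.

```python
import random


def fit_line(xs, ys):
    """Fit a least-squares line; a stand-in for fitting a neural network."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x equal): fall back to the mean response.
        return lambda x: my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def bagged_forecast(xs, ys, x_new, n_models=200, alpha=0.1, seed=0):
    """Return (mean forecast, lower bound, upper bound) from a bagged ensemble."""
    rng = random.Random(seed)
    n = len(xs)
    forecasts = []
    for _ in range(n_models):
        # Bootstrap sample: draw n observations with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        model = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        forecasts.append(model(x_new))
    forecasts.sort()
    # Percentile interval over the ensemble forecasts (an assumed construction).
    lower = forecasts[int((alpha / 2) * (n_models - 1))]
    upper = forecasts[int((1 - alpha / 2) * (n_models - 1))]
    # The bagged forecast is the mean of the individual forecasts.
    return sum(forecasts) / n_models, lower, upper


# Hypothetical toy data: a noisy linear trend over 20 time steps.
toy_rng = random.Random(1)
toy_x = [float(t) for t in range(20)]
toy_y = [2.0 * t + 1.0 + toy_rng.gauss(0.0, 1.0) for t in toy_x]
print(bagged_forecast(toy_x, toy_y, x_new=25.0))
```

Swapping `fit_line` for a routine that trains a neural network would leave the resampling and averaging steps unchanged, which is the sense in which bagging is independent of the base prediction function.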

COVID-19 is a highly contagious disease that has almost frozen the world and its economy. Accurate predictions of disease trajectory in the near term are critical for the efficient allocation of resources for combating the disease. Artificial neural networks are presently the single best class of predictive functions. Recurrent neural networks (RNNs) are a subclass that exploits temporal data structures; however, they are problematic in use and remain poorly understood by both researchers and practitioners. Hence, we propose a simple alternative, referred to as the Weighted Neural Network (WNN), and use this new neural network with extended bagging. To investigate and compare these innovations with standard neural network approaches, we apply the methods to COVID-19 data sets using counties as the spatial units.

The predictive functions forecast the number of deaths two weeks into the future for four of the most populous counties in the United States: Los Angeles County in California, Cook County in Illinois, Harris County in Texas, and New York County in New York State. The performance of the neural network-based models is quantified by the mean absolute error (MAE) between predicted and observed numbers of deaths. In the majority of cases, extended bagging of gated recurrent unit (GRU) and WNN models yielded highly informative predictions and outperformed the other prediction models. Our proposed technique, extended bagging, improved the results of both GRU and WNN models. The constructed prediction intervals are assessed by coverage probability (CP), which is the percentage of target values covered by the intervals. The extended bagging GRU models performed best for building prediction intervals, with a CP of 84.2%. Our results show that extended bagging enhanced prediction accuracy and that extended bagging of GRU models can be exploited for pandemic prediction to support better planning and management. These methods can be applied to a wide variety of other situations, from Ebola outbreak mitigation to intra- and inter-day stock price forecasting.
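
The two evaluation measures named above can be sketched in a few lines of Python. This is a minimal illustration under the definitions given in the abstract (MAE as the average absolute difference between predicted and observed values, CP as the percentage of observed values that fall inside their prediction intervals); the function names and the numbers in the usage example are hypothetical and are not results from the study.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def coverage_probability(observed, lower, upper):
    """Percentage of observed values lying inside [lower, upper] intervals."""
    if not observed or not (len(observed) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    covered = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return 100.0 * covered / len(observed)


# Hypothetical daily death counts and forecasts, for illustration only.
observed = [10, 12, 15, 20]
predicted = [11, 12, 13, 24]
lower = [8, 13, 14, 15]
upper = [12, 16, 16, 19]

print(mean_absolute_error(observed, predicted))      # 1.75
print(coverage_probability(observed, lower, upper))  # 50.0
```

A lower MAE indicates more accurate point forecasts, while a CP close to the nominal level of the intervals indicates that the intervals are well calibrated.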

Additional Details

Doctoral Defense. Link to the presenter's dissertation.

April 23, 2021 at 3:00 p.m. via Zoom

