Year of Award

2021

Document Type

Dissertation

Degree Type

Doctor of Philosophy (PhD)

Degree Name

Mathematics

Department or School/College

Department of Mathematical Sciences

Committee Chair

Brian Steele

Committee Members

Jonathan Graham, Johnathan Bardsley, Javier Perez Alvaro, Erin Landguth

Keywords

Bagging, Disease, GRU, Prediction Interval, Weighted Neural Network

Abstract

The aim of this study is to improve the forecasting accuracy of artificial neural networks (ANNs) and to construct prediction bands for ANN models. The focus is on forecasting for epidemiological purposes, and in particular, the problem of predicting new case and death counts from seven to h days into the future for spatially contiguous regions. The task poses several challenges: the datasets are quite small, and the observations are both spatially and temporally correlated. To overcome these challenges, the methods attempt to exploit the information induced by spatial and temporal dependencies. More importantly, we have developed a fusion of ANNs and bootstrap methods. Bootstrap aggregation (bagging) is an ensemble technique that reduces prediction variance, thereby improving predictive accuracy, and that can also be used to construct prediction bands. Random forests extend bagging by sampling predictors in addition to observations, often with a dramatic improvement in accuracy. The method developed herein resembles random forests and is used to improve predictive accuracy and to construct prediction bands. We refer to this new approach as extended bagging (E-Bagging).
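As a rough illustration of the idea of bagging with predictor sampling, the sketch below fits many base models, each on a bootstrap sample of the observations and a random subset of the predictors, then averages their forecasts and takes empirical quantiles across the ensemble as a band. This is not the dissertation's implementation: the base learner is swapped for ordinary least squares to keep the example short, and the function name `bagging_with_predictor_sampling` and its parameters `n_models`, `n_features`, and `alpha` are hypothetical. The band here reflects only variability across ensemble members; the construction used in the dissertation may differ.

```python
import numpy as np


def bagging_with_predictor_sampling(X, y, X_new, n_models=200, n_features=2,
                                    alpha=0.1, seed=0):
    """Hypothetical sketch: bootstrap rows, sample predictor columns,
    fit a least-squares base model, and summarize the ensemble."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    preds = np.empty((n_models, X_new.shape[0]))
    for b in range(n_models):
        # Bootstrap sample of the observations (with replacement).
        rows = rng.integers(0, n, size=n)
        # Random subset of the predictors (without replacement).
        cols = rng.choice(p, size=n_features, replace=False)
        # Base learner (an assumption): linear least squares with intercept.
        A = np.column_stack([np.ones(n), X[rows][:, cols]])
        coef, *_ = np.linalg.lstsq(A, y[rows], rcond=None)
        A_new = np.column_stack([np.ones(X_new.shape[0]), X_new[:, cols]])
        preds[b] = A_new @ coef
    point = preds.mean(axis=0)                        # aggregated forecast
    lower = np.quantile(preds, alpha / 2, axis=0)     # lower band
    upper = np.quantile(preds, 1 - alpha / 2, axis=0)  # upper band
    return point, lower, upper
```

A call such as `bagging_with_predictor_sampling(X, y, X_new)` returns three arrays, each with one entry per row of `X_new`.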

Covid-19 is a highly contagious disease that has disrupted life around the world. Accurate predictions of disease trajectory in the near term are critical. Recurrent neural networks based on gated recurrent units (GRU) are a subclass of ANNs that exploit temporal data structures; however, they are problematic to use and remain poorly understood by researchers. Hence, we propose a simple alternative, referred to as weighted neural networks, and use it with E-Bagging. To investigate and compare these innovations with standard ANN approaches, we apply the methods to Covid-19 datasets, using four of the most populous US counties as the spatial units. The predictive functions forecast the number of deaths 14 days ahead. Model performance is quantified by the mean absolute error. E-Bagging of GRU models yielded highly informative predictions and outperformed the other prediction models. The constructed prediction bands are assessed by coverage probability, and the GRU model with the E-Bagging technique performed best. These methods can be applied to a wide variety of other situations, from Ebola outbreak mitigation to intra- and inter-day stock price forecasting.
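The two evaluation measures named above can be sketched as follows, assuming their usual definitions: mean absolute error as the average absolute difference between observed and forecast values, and coverage probability as the fraction of observed values that fall inside the prediction band. The function names are illustrative and the numbers in the usage note are made up for the example; they are not results from the dissertation.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observations and forecasts."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def coverage_probability(y_true, lower, upper):
    """Fraction of observations lying within [lower, upper]."""
    y_true = np.asarray(y_true, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    inside = (y_true >= lower) & (y_true <= upper)
    return float(np.mean(inside))
```

For instance, with observed counts `[10, 12, 15, 20]` and forecasts `[11, 12, 13, 24]`, the mean absolute error is 1.75; with bands `[8, 11, 14, 21]` to `[12, 13, 16, 25]`, three of the four observations are covered, giving a coverage of 0.75.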


© Copyright 2021 Mohsen Tabibian