Published October 9, 2020 | public
Book Section - Chapter

Time Warping Clustering for the Forecast and Analysis of COVID-19

Jin, Qixuan

Abstract

This paper presents an effective algorithm for clustering confirmed COVID-19 cases at the county level in the United States. Dynamic time warping and Euclidean distance are examined as distance metrics for k-means clustering. Dynamic time warping can compare time series that vary in speed, which suits this setting because counties often experience similar outbreak trends without their timelines matching exactly. The effect of data preprocessing on clustering is systematically studied. Further analyses demonstrate the immediate value of our clusters both for retrospective interpretation of the pandemic and as informative inputs for case prediction models. We visualize the time progression of COVID-19 from April 5, 2020 to August 23, 2020. We propose a Monte-Carlo dropout feedforward neural network that can forecast four weeks into the future. Predictions evaluated from July 24, 2020 to August 20, 2020 show that the model performs better empirically when trained on the clusters than when trained on individual counties or on counties grouped by state.
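To illustrate why dynamic time warping tolerates shifted timelines where Euclidean distance does not, here is a minimal sketch of the two metrics in Python. It is not the paper's implementation: the function names, the use of absolute difference as the local cost, and the two toy case-count series are assumptions made for illustration, and the paper's preprocessing and k-means steps are not shown.

```python
import math


def euclidean_distance(a, b):
    """Point-by-point distance; requires equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Classic dynamic time warping distance via dynamic programming.

    Assumption: the local cost is the absolute difference between
    two values. cost[i][j] holds the cheapest alignment of the first
    i points of a with the first j points of b.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor moves.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical weekly case counts: the same outbreak shape,
# with county_b peaking one step later than county_a.
county_a = [0, 1, 3, 6, 3, 1, 0, 0]
county_b = [0, 0, 1, 3, 6, 3, 1, 0]

print("Euclidean:", euclidean_distance(county_a, county_b))
print("DTW:      ", dtw_distance(county_a, county_b))
```

On this toy pair the warping path can absorb the one-step delay, so the DTW distance is zero while the Euclidean distance is not. With real case counts the shapes rarely match exactly, so the DTW distance would be small rather than zero.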

Additional Information

© 2020 IEEE. The major sponsor of this research is the Clinard Innovation Fund. This work was part of the CS156 Model project at Caltech (http://cs156.caltech.edu). We thank Yaser Abu-Mostafa for the supervision and revision of this work. We thank Amanda Li and Tynesha Pham for their support and discussion during the CS156b competition. We thank Dominic Yurk for the improved Hampel filter.

Additional details

Created: August 19, 2023
Modified: October 23, 2023