Abstract

Advances in statistical learning models for prediction have led to broader application across a variety of disciplines, granting generalizations and adaptations that were previously intractable even with advanced computational techniques. Among these is support for correlated data with inherent panel structure, such as longitudinal or clustered data. Such adjustments have already begun to be applied to a variety of supervised and unsupervised machine learning methods that had previously focused on cross-sectional data. These modifications have seen rudimentary testing in a number of applied disciplines where correlated data are commonly observed, including medical and clinical research. One field in particular that has garnered interest is Alzheimer’s disease and related dementias. As this disorder is characterized by a prolonged and progressive disease course with an extensive variety of potential biomarkers, its feature-dense datasets with repeated patient measures are well suited to machine learning prediction with longitudinal modifications. While some novel adaptations of longitudinal machine learning methods have already been tested in the realm of Alzheimer’s disease, there has not yet been a comprehensive evaluation comparing these techniques against each other or against widely accepted standards, such as traditional inferential techniques like mixed-effects regression. Nor has there been rigorous investigation into how subject-specific effects can affect the error and bias of these predictions, or into the distinctions that may arise when developing entire temporal profiles as compared to forecasting future observations from previously observed data.
This dissertation addresses these deficiencies in the literature by directly comparing a variety of machine learning techniques with longitudinal adaptations against each other and against reference standards using a large, multi-study Alzheimer’s disease meta-database, and by assessing the role of subject-specific effects using synthetic data. The study is especially comprehensive, considering both continuous and categorical outcomes as well as the differences between generating whole profiles de novo and forecasting future observations based on prior sequences. With its emphasis on longitudinal data, this study considers not only predictive capacity for unobserved data using population-level characteristics, but also prediction of future observations using a variety of subject-specific effects.

Details

Title
Applications of Longitudinal Machine Learning Methods in Multi-Study Alzheimer's Disease Datasets
Author
Murchison, Charles F.
Publication year
2021
Publisher
ProQuest Dissertations & Theses
ISBN
9798762198219
Source type
Dissertation or Thesis
Language of publication
English
ProQuest document ID
2623879865