© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Genome sequencing from wastewater enables accurate and cost-effective identification of SARS-CoV-2 variants. However, existing computational pipelines have limitations in detecting emerging variants not yet characterized in humans. Here, we present an unsupervised learning approach that clusters co-varying and time-evolving mutation patterns to identify SARS-CoV-2 variants. To build our model, we sequence 3659 wastewater samples collected over two years from urban and rural locations in Southern Nevada. We then develop a multivariate independent component analysis (ICA)-based pipeline to transform mutation frequencies into independent sources. These data-driven time-evolving and co-varying sources are compared to 8810 SARS-CoV-2 clinical genomes from Nevadans. Our method accurately detects the Delta variant in late 2021, Omicron variants in 2022, and emerging recombinant XBB variants in 2023. Our approach also reveals the spatial and temporal dynamics of variants in both urban and rural regions; achieves earlier detection of most variants compared to other computational tools; and uncovers unique co-varying mutation patterns not associated with any known variant. The multivariate nature of our pipeline boosts statistical power and supports accurate early detection of SARS-CoV-2 variants. This feature offers a unique opportunity to detect emerging variants and pathogens, even in the absence of clinical testing.
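The core idea in the abstract, decomposing a matrix of mutation frequencies into independent, time-evolving sources, can be illustrated with a small sketch. This is not the authors' pipeline: the synthetic data, the matrix orientation (rows are sampling weeks, columns are mutations), the choice of two sources, and the helper names `make_synthetic_frequencies`, `decompose`, and `top_mutations` are all assumptions made for illustration, and scikit-learn's `FastICA` stands in for whatever ICA variant the study used.

```python
import numpy as np
from sklearn.decomposition import FastICA


def make_synthetic_frequencies(n_weeks=60, seed=0):
    """Build a toy (weeks x mutations) frequency matrix (hypothetical data).

    Two made-up "variants" each carry four mutations that rise and fall
    together, mimicking co-varying, time-evolving mutation patterns.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_weeks)
    # Variant A rises around week 20 and is displaced around week 40.
    rise = 1.0 / (1.0 + np.exp(-(t - 20) / 3.0))
    fall = 1.0 / (1.0 + np.exp((t - 40) / 3.0))
    variant_a = rise * fall
    # Variant B rises around week 40 and stays dominant.
    variant_b = 1.0 / (1.0 + np.exp(-(t - 40) / 3.0))
    prevalence = np.column_stack([variant_a, variant_b])  # (n_weeks, 2)

    # Mutations 0-3 belong to variant A, mutations 4-7 to variant B.
    membership = np.zeros((2, 8))
    membership[0, :4] = 1.0
    membership[1, 4:] = 1.0

    noise = rng.normal(0.0, 0.02, size=(n_weeks, 8))
    return np.clip(prevalence @ membership + noise, 0.0, 1.0)


def decompose(freqs, n_sources=2, seed=0):
    """Run ICA on the frequency matrix.

    Returns the source time courses (weeks x sources) and the loading of
    each mutation on each source (mutations x sources).
    """
    ica = FastICA(
        n_components=n_sources,
        whiten="unit-variance",
        random_state=seed,
        max_iter=1000,
    )
    sources = ica.fit_transform(freqs)
    loadings = ica.mixing_
    return sources, loadings


def top_mutations(loadings, k=4):
    """List the k mutation indices with the largest absolute loading per source."""
    return [
        sorted(np.argsort(-np.abs(loadings[:, j]))[:k].tolist())
        for j in range(loadings.shape[1])
    ]


freqs = make_synthetic_frequencies()
sources, loadings = decompose(freqs)
groups = top_mutations(loadings)
print(sources.shape, loadings.shape)
print(groups)
```

In a sketch like this, each source's time course plays the role of a candidate variant trajectory, and the mutations that load most strongly on it form the candidate mutation signature that could then be compared against known lineages. ICA recovers sources only up to sign, scale, and ordering, so any such comparison would need to account for that.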

Wastewater surveillance can support pandemic and outbreak response. Here, the authors report an unsupervised learning approach to detect emerging SARS-CoV-2 variants from rural and urban wastewater, showing that it achieves earlier detection than existing methods and detects new variants without clinical testing data.

Details

Title
Early detection of emerging SARS-CoV-2 Variants from wastewater through genome sequencing and machine learning
Author
Zhuang, Xiaowei 1 ; Vo, Van 2 ; Moshi, Michael A. 3 ; Dhede, Ketan 3 ; Ghani, Nabih 2 ; Akbar, Shahraiz 2 ; Chang, Ching-Lan 3 ; Young, Angelia K. 4 ; Buttery, Erin 4 ; Bendik, William 4 ; Zhang, Hong 4 ; Afzal, Salman 4 ; Moser, Duane 5 ; Cordes, Dietmar 6 ; Lockett, Cassius 4 ; Gerrity, Daniel 7 ; Kan, Horng-Yuan 4 ; Oh, Edwin C. 8

1 University of Nevada Las Vegas, Laboratory of Neurogenetics and Precision Medicine, College of Sciences, Las Vegas, USA (GRID:grid.272362.0) (ISNI:0000 0001 0806 6926); University of Nevada Las Vegas, Neuroscience Interdisciplinary Ph.D. program, Las Vegas, USA (GRID:grid.272362.0) (ISNI:0000 0001 0806 6926); Cleveland Clinic Lou Ruvo Center for Brain Health, Las Vegas, USA (GRID:grid.239578.2) (ISNI:0000 0001 0675 4725)
2 University of Nevada Las Vegas, Laboratory of Neurogenetics and Precision Medicine, College of Sciences, Las Vegas, USA (GRID:grid.272362.0) (ISNI:0000 0001 0806 6926)
3 University of Nevada Las Vegas, Laboratory of Neurogenetics and Precision Medicine, College of Sciences, Las Vegas, USA (GRID:grid.272362.0) (ISNI:0000 0001 0806 6926); University of Nevada Las Vegas, Neuroscience Interdisciplinary Ph.D. program, Las Vegas, USA (GRID:grid.272362.0) (ISNI:0000 0001 0806 6926)
4 Southern Nevada Health District, Las Vegas, USA (GRID:grid.422451.4) (ISNI:0000 0004 0383 2216)
5 Desert Research Institute, Division of Hydrologic Sciences, Las Vegas, USA (GRID:grid.474431.1) (ISNI:0000 0004 0525 4843)
6 Cleveland Clinic Lou Ruvo Center for Brain Health, Las Vegas, USA (GRID:grid.239578.2) (ISNI:0000 0001 0675 4725)
7 P.O. Box 99954, Southern Nevada Water Authority, Las Vegas, USA (GRID:grid.509521.a) (ISNI:0000 0000 9767 1388)
8 University of Nevada Las Vegas, Laboratory of Neurogenetics and Precision Medicine, College of Sciences, Las Vegas, USA (GRID:grid.272362.0) (ISNI:0000 0001 0806 6926); University of Nevada Las Vegas, Neuroscience Interdisciplinary Ph.D. program, Las Vegas, USA (GRID:grid.272362.0) (ISNI:0000 0001 0806 6926); University of Nevada Las Vegas, Department of Brain Health, Las Vegas, USA (GRID:grid.272362.0) (ISNI:0000 0001 0806 6926); University of Nevada Las Vegas, Department of Internal Medicine, Kirk Kerkorian School of Medicine at UNLV, Las Vegas, USA (GRID:grid.272362.0) (ISNI:0000 0001 0806 6926)
Pages
6272
Publication year
2025
Publication date
2025
Publisher
Nature Publishing Group
e-ISSN
2041-1723
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3227750527