Abstract

Because of the intricate relationships among small non-coding ribonucleic acid (miRNA) sequences, classifying miRNA by species, namely Human, Gorilla, Rat, and Mouse, is challenging. Previous methods are neither robust nor accurate. In this study, we present AtheroPoint’s GeneAI 3.0, a powerful, novel, and generalized method for extracting features from the fixed patterns of purines and pyrimidines in each miRNA sequence, within ensemble machine learning (EML) and convolutional neural network (CNN)-based ensemble deep learning (EDL) frameworks. GeneAI 3.0 used five conventional features (Entropy, Dissimilarity, Energy, Homogeneity, and Contrast) and three contemporary features (Shannon entropy, Hurst exponent, and Fractal dimension) to generate a composite feature set from the given miRNA sequences, which was then passed into our ML and DL classification framework. A set of 11 new classifiers was designed, consisting of 5 EML and 6 EDL models for binary/multiclass classification. It was benchmarked against 9 solo ML (SML), 6 solo DL (SDL), and 12 hybrid DL (HDL) models, for a total of 11 + 27 = 38 models. Four hypotheses were formulated and validated using explainable AI (XAI) as well as reliability/statistical tests. The order of the mean performance, measured by accuracy (ACC)/area-under-the-curve (AUC), of the 24 DL classifiers was: EDL > HDL > SDL. The mean performance of EDL models with CNN layers was superior to that of EDL models without CNN layers by 0.73%/0.92%. The mean performance of EML models was superior to that of SML models, with improvements in ACC/AUC of 6.24%/6.46%. EDL models performed significantly better than EML models, with a mean increase in ACC/AUC of 7.09%/6.96%. The GeneAI 3.0 tool produced the expected XAI feature plots, and the statistical tests showed significant p-values. Ensemble models with composite features are highly effective and generalized models for classifying miRNA sequences.
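
To make the feature-extraction idea concrete, the following is a minimal sketch, not the GeneAI 3.0 implementation. It assumes a binary encoding (purine = 1, pyrimidine = 0) and computes only one of the eight listed features, Shannon entropy, over that pattern. The function names `purine_pyrimidine_pattern` and `shannon_entropy`, the choice of encoding, and the toy sequence are illustrative assumptions; the abstract does not specify how the patterns are encoded.

```python
import math
from collections import Counter

# Assumption: A and G are purines; C, U (RNA) and T (DNA) are pyrimidines.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U", "T"}


def purine_pyrimidine_pattern(seq):
    """Map each nucleotide to 1 (purine) or 0 (pyrimidine).

    Hypothetical encoding chosen for illustration only.
    """
    pattern = []
    for base in seq.upper():
        if base in PURINES:
            pattern.append(1)
        elif base in PYRIMIDINES:
            pattern.append(0)
        else:
            raise ValueError(f"unexpected nucleotide: {base!r}")
    return pattern


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the empirical symbol distribution."""
    total = len(symbols)
    if total == 0:
        return 0.0
    counts = Counter(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    toy_sequence = "AGCU"  # toy input, not a real miRNA
    pattern = purine_pyrimidine_pattern(toy_sequence)
    print(pattern, shannon_entropy(pattern))
```

In a full pipeline, a value like this would be one entry in the composite feature vector, alongside the texture-style features and the Hurst exponent and fractal dimension named above.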

Details

Title
GeneAI 3.0: powerful, novel, generalized hybrid and ensemble deep learning frameworks for miRNA species classification of stationary patterns from nucleotides
Author
Singh, Jaskaran 1 ; Khanna, Narendra N. 2 ; Rout, Ranjeet K. 3 ; Singh, Narpinder 4 ; Laird, John R. 5 ; Singh, Inder M. 6 ; Kalra, Mannudeep K. 7 ; Mantella, Laura E. 8 ; Johri, Amer M. 8 ; Isenovic, Esma R. 9 ; Fouda, Mostafa M. 10 ; Saba, Luca 11 ; Fatemi, Mostafa 12 ; Suri, Jasjit S. 13 

1  Graphic Era Deemed to be University, Department of Computer Science, Dehradun, India (GRID:grid.449504.8) (ISNI:0000 0004 1766 2457) 
2  Indraprastha APOLLO Hospitals, Department of Cardiology, New Delhi, India (GRID:grid.414612.4) (ISNI:0000 0004 1804 700X) 
3  NIT Srinagar, Department of Computer Science and Engineering, Hazratbal, Srinagar, India (GRID:grid.414612.4) 
4  Graphic Era Deemed to be University, Department of Food Science, Dehradun, India (GRID:grid.449504.8) (ISNI:0000 0004 1766 2457) 
5  Adventist Health St. Helena, Heart and Vascular Institute, St Helena, USA (GRID:grid.239578.2) (ISNI:0000 0001 0675 4725) 
6  Advanced Cardiac and Vascular Institute, Sacramento, USA (GRID:grid.239578.2) 
7  Massachusetts General Hospital, Department of Radiology, Boston, USA (GRID:grid.32224.35) (ISNI:0000 0004 0386 9924) 
8  Queen’s University, Department of Biomedical and Molecular Sciences, Kingston, Canada (GRID:grid.410356.5) (ISNI:0000 0004 1936 8331) 
9  University of Belgrade, Laboratory for Molecular Genetics and Radiobiology, Belgrade, Serbia (GRID:grid.7149.b) (ISNI:0000 0001 2166 9385) 
10  Idaho State University, Department of Electrical and Computer Engineering, Pocatello, USA (GRID:grid.257296.d) (ISNI:0000 0004 1936 9027) 
11  University of Cagliari, Department of Neurology, Cagliari, Italy (GRID:grid.7763.5) (ISNI:0000 0004 1755 3242) 
12  Mayo Clinic, Department of Physiology and Biomedical Engineering, Rochester, USA (GRID:grid.66875.3a) (ISNI:0000 0004 0459 167X) 
13  AtheroPoint LLC, Stroke Monitoring and Diagnostic Division, Roseville, USA (GRID:grid.66875.3a) 
Pages
7154
Publication year
2024
Publication date
2024
Publisher
Nature Publishing Group
e-ISSN
20452322
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2986724187
Copyright
© The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.