Abstract

Proteins are vital cellular components that perform the physiological functions needed for the body to operate smoothly. Identifying a protein's function requires a detailed understanding of its structure. Stress proteins are essential mediators of several responses to cellular stress and are categorized by their structural characteristics. These proteins are conserved across many eukaryotic and prokaryotic lineages and perform varied, crucial functions inside a cell. The in vivo, ex vivo, and in vitro identification of stress proteins is time-consuming and costly. This study aims to identify stress protein sequences using mathematical modelling and machine learning methods to supplement these wet-lab methods. The model developed using Random Forest achieved 91.1% accuracy, while models based on a neural network and a support vector machine achieved 87.7% and 47.0% accuracy, respectively. Based on the evaluation results, it was concluded that the random forest-based classifier surpassed the other predictors and is suitable for practical use in identifying stress proteins. A live web server is available at http://biopred.org/stressprotiens, and the web server code is available at https://github.com/abdullah5naveed/SRP_WebServer.git

Details

Title
Identification of stress response proteins through fusion of machine learning models and statistical paradigms
Author
Alzahrani, Ebraheem 1; Alghamdi, Wajdi 2; Ullah, Malik Zaka 1; Khan, Yaser Daanial 3

1 King Abdulaziz University, Department of Mathematics, Faculty of Science, Jeddah, Saudi Arabia (GRID:grid.412125.1) (ISNI:0000 0001 0619 1117)
2 King Abdulaziz University, Department of Information Technology, Faculty of Computing and Information Technology, Jeddah, Saudi Arabia (GRID:grid.412125.1) (ISNI:0000 0001 0619 1117)
3 University of Management and Technology, Department of Computer Science, Lahore, Pakistan (GRID:grid.444940.9)
Publication year
2021
Publication date
2021
Publisher
Nature Publishing Group
e-ISSN
20452322
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2593744865
Copyright
© The Author(s) 2021. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.