© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

In the realm of predictive toxicology for small molecules, the applicability domain of QSAR models is often limited by the coverage of the chemical space in the training set. Consequently, classical models fail to provide reliable predictions for wide classes of molecules. However, the emergence of innovative data collection methods, such as intensive hackathons, holds promise for quickly expanding the chemical space available for model construction. Combined with algorithmic refinement methods, these tools can address the challenges of toxicity prediction, enhancing both the robustness and applicability of the corresponding models. This study aimed to investigate the roles of gradient boosting and strategic data aggregation in enhancing the predictive ability of models for the toxicity of small organic molecules. We focused on evaluating the impact of incorporating fragment features and expanding the chemical space, facilitated by a comprehensive dataset procured in an open hackathon. We used gradient boosting techniques, accounting for critical features such as the structural fragments or functional groups often associated with manifestations of toxicity.
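
To make the idea of gradient boosting over fragment features concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state the descriptors, library, or hyperparameters used. The fragment list, the substring matching on SMILES strings (a crude stand-in for real substructure search), the toy molecules, and their labels are all assumptions invented for illustration and are not measured toxicity data.

```python
import math

# Hypothetical fragment patterns, matched as plain SMILES substrings.
# A real pipeline would use proper substructure search instead.
FRAGMENTS = ["N(=O)=O", "Cl", "C#N"]


def fragment_features(smiles):
    """Return 0/1 flags marking which illustrative fragments occur."""
    return [1 if frag in smiles else 0 for frag in FRAGMENTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Pick the binary feature whose split best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        left = [r for x, r in zip(X, residuals) if x[j] == 0]
        right = [r for x, r in zip(X, residuals) if x[j] == 1]
        if not left or not right:
            continue  # feature is constant, so it cannot split
        ml = sum(left) / len(left)
        mr = sum(right) / len(right)
        sse = sum((r - ml) ** 2 for r in left)
        sse += sum((r - mr) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, j, ml, mr)
    return best[1:] if best else None


def fit_boosting(X, y, n_rounds=20, lr=0.5):
    """Simplified gradient boosting for log-loss with one-split stumps."""
    p0 = sum(y) / len(y)
    base = math.log(p0 / (1.0 - p0))  # initial log-odds
    scores = [base] * len(X)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of log-loss with respect to the raw score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        j, ml, mr = stump
        stumps.append(stump)
        scores = [s + lr * (mr if x[j] else ml) for s, x in zip(scores, X)]
    return base, lr, stumps


def predict_proba(model, x):
    """Probability of the positive class for one feature vector."""
    base, lr, stumps = model
    score = base + lr * sum(mr if x[j] else ml for j, ml, mr in stumps)
    return sigmoid(score)


# Toy molecules with made-up labels (1 = flagged, 0 = not flagged).
toy = [("c1ccccc1N(=O)=O", 1), ("CCO", 0), ("CCCl", 1),
       ("CCC", 0), ("CC#N", 1), ("CCOC", 0)]
X = [fragment_features(s) for s, _ in toy]
y = [label for _, label in toy]
model = fit_boosting(X, y)

for smiles in ["CCCCl", "CCCC"]:
    print(smiles, round(predict_proba(model, fragment_features(smiles)), 3))
```

Each boosting round fits a one-split stump to the residuals of the current model, so a fragment that separates the remaining errors well is picked up as a feature. On this toy set, a molecule containing one of the listed fragments ends up with a predicted probability above 0.5 and one without any of them ends up below 0.5.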

Details

Title
Expanding Predictive Capacities in Toxicology: Insights from Hackathon-Enhanced Data and Model Aggregation
Author
Shkil, Dmitrii O 1; Muhamedzhanova, Alina A 2; Petrov, Philipp I 3; Skorb, Ekaterina V 4; Aliev, Timur A 4; Steshin, Ilya S 2; Tumanov, Alexander V 2; Kislinskiy, Alexander S 2; Fedorov, Maxim V 5

1 Syntelly LLC, Moscow 121205, Russia; [email protected] (A.A.M.); [email protected] (I.S.S.); [email protected] (A.V.T.); [email protected] (A.S.K.); Moscow Institute of Physics and Technology, Moscow 141700, Russia
2 Syntelly LLC, Moscow 121205, Russia; [email protected] (A.A.M.); [email protected] (I.S.S.); [email protected] (A.V.T.); [email protected] (A.S.K.)
3 Medtech.Moscow, Moscow 119571, Russia; [email protected]
4 Infochemistry Scientific Center, ITMO University, Saint-Petersburg 191002, Russia; [email protected] (E.V.S.); [email protected] (T.A.A.)
5 Kharkevich Institute for Information Transmission Problems of Russian Academy of Sciences, Moscow 127994, Russia
First page
1826
Publication year
2024
Publication date
2024
Publisher
MDPI AG
e-ISSN
1420-3049
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3047000794