Abstract

Transformer models have been developed in molecular science with excellent performance in applications including quantitative structure-activity relationship (QSAR) modeling and virtual screening (VS). Compared with other model types, however, they are large and require voluminous training data, so powerful hardware is needed to keep both training and inference times manageable. In this work, cross-layer parameter sharing (CLPS) and knowledge distillation (KD) are used to reduce the size of transformers in molecular science. Models compressed by either method not only match the QSAR predictive performance of the original BERT model but are also more parameter-efficient. Furthermore, by integrating CLPS and KD into a two-state chemical network, we introduce a new deep lite chemical transformer model, DeLiCaTe. DeLiCaTe trains and infers roughly four times faster, owing to a 10-fold reduction in the number of parameters and a 3-fold reduction in the number of layers. At the same time, the integrated model achieves comparable performance in QSAR and VS because it captures both general-domain knowledge (basic chemical structure) and task-specific knowledge (specific property prediction). Moreover, we anticipate that this model compression strategy provides a pathway toward effective generative transformer models for organic drug and material design.
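For readers unfamiliar with the two compression techniques named in the abstract, the PyTorch sketch below illustrates them in a minimal form; it is not the authors' code. Cross-layer parameter sharing reuses a single set of transformer-layer weights at every depth, so the parameter count no longer grows with the number of layers, and knowledge distillation trains the compact student to match a pre-trained teacher's soft predictions alongside the hard labels. All names (SharedLayerEncoder, distillation_loss) and hyperparameters (depth, temperature T, mixing weight alpha) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SharedLayerEncoder(nn.Module):
    """Cross-layer parameter sharing (CLPS): one TransformerEncoderLayer is
    applied `depth` times, so parameters are independent of network depth."""

    def __init__(self, vocab_size=100, d_model=256, nhead=8, depth=4, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.depth = depth
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, tokens):
        x = self.embed(tokens)
        for _ in range(self.depth):      # same weights reused at every depth
            x = self.layer(x)
        return self.head(x.mean(dim=1))  # pooled representation -> prediction


def distillation_loss(student_logits, teacher_logits, targets, T=2.0, alpha=0.5):
    """Knowledge distillation (KD): blend a soft loss against the teacher's
    temperature-scaled distribution with the ordinary hard-label loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard


# Toy usage: 8 tokenized molecules of length 32 with binary activity labels;
# the teacher logits stand in for a large pre-trained chemical transformer.
student = SharedLayerEncoder()
tokens = torch.randint(0, 100, (8, 32))
labels = torch.randint(0, 2, (8,))
teacher_logits = torch.randn(8, 2)
loss = distillation_loss(student(tokens), teacher_logits, labels)
loss.backward()
```

In the two-step setting the abstract describes, a shared-layer student of this kind would first absorb general-domain structural knowledge from a pre-trained BERT-style teacher via the soft term and then be fine-tuned on the task-specific QSAR or VS labels via the hard term.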

Details

Title
Chemical transformer compression for accelerating both training and inference of molecular modeling
Author
Yu, Yi 1; Börjesson, Karl 1

1 Department of Chemistry and Molecular Biology, University of Gothenburg, Kemivägen 10, 412 96 Gothenburg, Sweden
First page
045009
Publication year
2022
Publication date
Dec 2022
Publisher
IOP Publishing
e-ISSN
2632-2153
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2730749412
Copyright
© 2022 The Author(s). Published by IOP Publishing Ltd. This work is published under the Creative Commons Attribution 4.0 licence (http://creativecommons.org/licenses/by/4.0).