Drug exposure, a key determinant of drug safety and efficacy, is governed by pharmacokinetic (PK) parameters such as steady-state volume of distribution (VDss), clearance (CL), half-life (t½), fraction unbound in plasma (fu), and mean residence time (MRT). In this study, we developed machine learning models to predict human PK parameters for 1,283 unique compounds using molecular structure, physicochemical properties, and predicted animal PK data. Our approach involved a two-stage modeling pipeline. First, we trained models to predict rat, dog, and monkey PK parameters (VDss, CL, fu) from chemical structure and properties for 371 compounds. These models were then used to predict animal PK values for the 1,283 unique compounds with human PK data. Second, these animal PK predictions were combined with molecular descriptors and fingerprints to build Random Forest models for human PK parameters. The models demonstrated consistent performance across nested cross-validation and external validation sets, with predictive accuracy for VDss comparable to proprietary models developed by AstraZeneca. Human VDss and CL predictions achieved external R² values of 0.39 and 0.46, respectively. To support broad accessibility and integration into early drug discovery workflows such as Design-Make-Test-Analyze (DMTA), we developed PKSmart (
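The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the data are random stand-ins, the feature count and endpoint names (`rat_VDss`, `rat_CL`, `rat_fu`) are hypothetical, and the use of scikit-learn's `RandomForestRegressor` for stage one is an assumption (the abstract names Random Forest only for the human models).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions): a small "animal PK" set and a "human PK" set,
# each described by the same 16 descriptor/fingerprint features.
n_animal, n_human, n_features = 60, 40, 16
X_animal = rng.normal(size=(n_animal, n_features))
X_human = rng.normal(size=(n_human, n_features))

# Hypothetical animal endpoints (one species shown for brevity).
animal_endpoints = ["rat_VDss", "rat_CL", "rat_fu"]
y_animal = {name: rng.normal(size=n_animal) for name in animal_endpoints}
y_human_vdss = rng.normal(size=n_human)


def fit_animal_models(X, targets):
    """Stage 1: fit one regressor per animal PK endpoint."""
    return {
        name: RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
        for name, y in targets.items()
    }


def add_predicted_animal_pk(X, models):
    """Append predicted animal PK values as extra feature columns."""
    preds = np.column_stack([model.predict(X) for model in models.values()])
    return np.hstack([X, preds])


# Stage 1: learn animal PK from structure-derived features.
animal_models = fit_animal_models(X_animal, y_animal)

# Stage 2: predict animal PK for the human-data compounds, then train the
# human model on the original features plus those predictions.
X_human_aug = add_predicted_animal_pk(X_human, animal_models)
human_model = RandomForestRegressor(n_estimators=50, random_state=0)
human_model.fit(X_human_aug, y_human_vdss)
```

The point of the design is that the human model never needs measured animal data at prediction time: a new compound only requires its structure-derived features, from which the stage-one models supply the animal PK columns.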
Scientific contribution
This study introduces the first publicly available pharmacokinetic (PK) models that match industry-standard predictions, using molecular structural fingerprints, physicochemical properties, and predicted animal PK data to model human pharmacokinetics. Our approach is validated through repeated nested cross-validation and an external test set, including a comparison of predictions against an industry-standard model. The models are released via a web-hosted application (
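Repeated nested cross-validation, as mentioned above, can be sketched as follows. This is an illustrative example under stated assumptions, not the study's protocol: the data are synthetic, and the fold counts, number of repeats, and the `max_depth` grid are hypothetical choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

rng = np.random.default_rng(0)

# Synthetic regression data (assumption): 80 compounds, 8 features.
X = rng.normal(size=(80, 8))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=80)

param_grid = {"max_depth": [3, None]}  # hypothetical hyperparameter grid
scores = []

for repeat in range(2):  # repeat with different shuffles
    inner_cv = KFold(n_splits=3, shuffle=True, random_state=repeat)
    outer_cv = KFold(n_splits=4, shuffle=True, random_state=100 + repeat)

    # Inner loop: hyperparameters are tuned only on each outer training split.
    search = GridSearchCV(
        RandomForestRegressor(n_estimators=30, random_state=0),
        param_grid,
        cv=inner_cv,
        scoring="r2",
    )

    # Outer loop: the tuned model is scored on folds it never saw during tuning.
    scores.extend(cross_val_score(search, X, y, cv=outer_cv, scoring="r2"))

scores = np.array(scores)
print(f"nested CV R2: mean={scores.mean():.2f}, sd={scores.std():.2f}")
```

Keeping hyperparameter selection inside the inner loop means the outer-fold scores are not inflated by tuning on the same data used for evaluation.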
Details
Accessibility;
Plasma;
Source code;
Datasets;
Physicochemical properties;
Applications programs;
Molecular structure;
Fingerprints;
Metabolism;
Machine learning;
Animal models;
Drug development;
Open source software;
Prediction models;
Drug dosages;
Bioavailability;
Proteins;
Pharmacokinetics;
Industry standards;
Molecular weight;
Pharmacovigilance;
Annotations;
Parameters
1 Broad Institute of MIT and Harvard, Imaging Platform, Cambridge, USA (GRID:grid.66859.34) (ISNI:0000 0004 0546 1623); University of Cambridge, Yusuf Hamied Department of Chemistry, Cambridge, UK (GRID:grid.5335.0) (ISNI:0000 0001 2188 5934)
2 AstraZeneca R&D, Imaging & Data Analytics, Clinical Pharmacology & Safety Sciences, Cambridge, UK (GRID:grid.417815.e) (ISNI:0000 0004 5929 4381)
3 Bombay College of Pharmacy Kalina Santacruz (E), Mumbai, India (GRID:grid.427618.e) (ISNI:0000 0004 1799 8620)
4 AstraZeneca R&D, Safety Innovation, Clinical Pharmacology and Safety Sciences, Mölndal, Sweden (GRID:grid.418151.8) (ISNI:0000 0001 1519 6403)
5 AstraZeneca R&D, Imaging & Data Analytics, Clinical Pharmacology & Safety Sciences, Waltham, USA (GRID:grid.418151.8)
6 Uppsala University, Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala, Sweden (GRID:grid.8993.b) (ISNI:0000 0004 1936 9457)
7 University of Cambridge, Yusuf Hamied Department of Chemistry, Cambridge, UK (GRID:grid.5335.0) (ISNI:0000 0001 2188 5934); Khalifa University of Science and Technology, College of Medicine and Health Sciences, Abu Dhabi, United Arab Emirates (GRID:grid.440568.b) (ISNI:0000 0004 1762 9729)