Abstract

The application of machine learning in materials science, known as Quantitative Structure-Activity Relationship (QSAR) modeling, has grown rapidly, driven by increasing demand for composite materials. However, the success of ML in mixture modeling depends on the quality of mixture descriptors, which are challenging to generate due to structural complexity. To address this, we proposed 13 mixture descriptors from existing and novel mixing rules, categorized as additive and non-additive. Two Python packages, MixtureMetrics and CombinatorxPy, were developed to automate descriptor calculations. MixtureMetrics computes 12 additive descriptors with linear complexity O(M). To handle the exponential complexity of CombinatorxPy, which is based on the Cartesian product and scales as O(M^N), a distributed approach was implemented using Dask. These descriptors were applied to predict the fouling release activity of 18 silicone oil-infused PDMS coating polymers for U. linza removal at 110 kPa. We employed a two-stage feature importance method for feature selection to identify the optimal descriptor. A decision tree model achieved the highest performance, with an R² of 0.987 for both the training and test sets, and a leave-one-out cross-validation Q² (Q²_LOO) of 0.791. Further validation on 40 PDMS-poly(SBMA)-based amphiphilic additives demonstrated the effectiveness of combinatorial descriptors in predicting C. lytica and N. incerta removal at 10 and 20 psi. Optimal descriptors for random forest and decision tree models were identified for each target using univariate model screening and randomized feature pair selection. For C. lytica removal, decision tree models at 10 and 20 psi achieved test-set R² values of 0.875 and 0.834, respectively. For N. incerta removal, random forest models reached test-set R² values of 0.716 (10 psi) and 0.819 (20 psi).
These findings highlight the importance of capturing complex and synergistic interactions in mixtures, demonstrating that combinatorial descriptors provide superior predictive power compared to additive approaches.
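
The contrast between additive and Cartesian-product descriptors can be illustrated with a minimal Python sketch. It is not the API of MixtureMetrics or CombinatorxPy; the function names, the fraction-weighted mean as the additive rule, and the product as the combining operation are all assumptions chosen for illustration. It also assumes that M is the number of descriptor values per component and N the number of components, which the abstract does not define.

```python
from itertools import product
from math import prod


def additive_descriptor(values, fractions):
    """Fraction-weighted mean of one descriptor across mixture components.

    Hypothetical additive mixing rule: a single pass over the inputs,
    so the cost grows linearly with their length.
    """
    if len(values) != len(fractions):
        raise ValueError("values and fractions must have the same length")
    return sum(v * x for v, x in zip(values, fractions))


def combinatorial_descriptors(component_descriptors):
    """Combine descriptors over the Cartesian product of the components.

    component_descriptors holds one list of descriptor values per component.
    With N components of M values each, itertools.product yields M**N
    tuples; each tuple is reduced here with a product (an assumed choice).
    """
    return [prod(combo) for combo in product(*component_descriptors)]


# Two components with descriptor values 2.0 and 4.0 at fractions 0.25 / 0.75.
mixture_value = additive_descriptor([2.0, 4.0], [0.25, 0.75])

# Three components (N = 3) with two descriptor values each (M = 2): 2**3 = 8 terms.
combo_values = combinatorial_descriptors([[1, 2], [3, 4], [5, 6]])
print(mixture_value, len(combo_values))
```

Because the number of combinations grows as M^N, the list comprehension above becomes impractical for larger mixtures, which is the motivation the abstract gives for distributing the computation with Dask.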

Details

Title
Advancing Mixture Descriptors and Machine Learning Analysis for Multi-Component Materials
Number of pages
158
Publication year
2025
Degree date
2025
School code
0157
Source
DAI-B 86/12(E), Dissertation Abstracts International
ISBN
9798280736054
Committee member
Anwar, Zahid; Huang, Ying
University/institution
North Dakota State University
Department
Computer Science
University location
United States -- North Dakota
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32040384
ProQuest document ID
3217435074
Document URL
https://www.proquest.com/dissertations-theses/advancing-mixture-descriptors-machine-learning/docview/3217435074/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic