Machine learning in materials science, in the form of Quantitative Structure-Activity Relationship (QSAR) modeling, has grown rapidly, driven by increasing demand for composite materials. However, the success of ML in mixture modeling depends on the quality of the mixture descriptors, which are difficult to generate because of the structural complexity of mixtures. To address this, we proposed 13 mixture descriptors derived from existing and novel mixing rules, categorized as additive or non-additive. Two Python packages, MixtureMetrics and CombinatorxPy, were developed to automate the descriptor calculations. MixtureMetrics computes 12 additive descriptors with linear complexity, $O(M)$. CombinatorxPy is based on the Cartesian product and scales exponentially, as $O(M^N)$; to handle this cost, a distributed approach was implemented using Dask. The descriptors were applied to predict the fouling-release activity of 18 silicone oil-infused PDMS coating polymers for U. linza removal at 110 kPa. A two-stage feature-importance method was used for feature selection to identify the optimal descriptor. A decision tree model achieved the highest performance, with an $R^2$ of 0.987 for both the training and test sets and a leave-one-out cross-validation $Q^2_{\mathrm{LOO}}$ of 0.791. Further validation on 40 PDMS-poly(SBMA)-based amphiphilic additives demonstrated the effectiveness of combinatorial descriptors in predicting C. lytica and N. incerta removal at 10 and 20 psi. Optimal descriptors for the random forest and decision tree models were identified for each target using univariate model screening and randomized feature-pair selection. For C. lytica removal, decision tree models achieved $R^2_{\mathrm{Test}}$ values of 0.875 (10 psi) and 0.834 (20 psi). For N. incerta removal, random forest models reached $R^2_{\mathrm{Test}}$ values of 0.716 (10 psi) and 0.819 (20 psi).
These findings highlight the importance of capturing complex and synergistic interactions in mixtures, demonstrating that combinatorial descriptors provide superior predictive power compared to additive approaches.
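To make the contrast between additive and combinatorial descriptors concrete, the following is a minimal sketch and not the MixtureMetrics or CombinatorxPy API. It assumes that M is the number of descriptors per component and N the number of components, which the text above does not define, and that each combinatorial term is the product of fraction-weighted descriptor values. The function names `additive_descriptor` and `combinatorial_descriptor` and the toy values are hypothetical.

```python
from itertools import product

import numpy as np


def additive_descriptor(X, fractions):
    """Fraction-weighted sum of each descriptor column (hypothetical mixing rule).

    X has shape (N components, M descriptors); the result has length M.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(fractions, dtype=float)
    return w @ X


def combinatorial_descriptor(X, fractions):
    """One term per element of the Cartesian product of the component vectors.

    Assumption: each term is the product of the fraction-weighted descriptor
    values picked from each component, so the result has length M**N.
    """
    X = np.asarray(X, dtype=float)
    w = np.asarray(fractions, dtype=float)
    weighted = X * w[:, None]  # scale each component row by its fraction
    return np.array([np.prod(combo) for combo in product(*weighted)])


# Toy mixture: N = 2 components, M = 3 descriptors each (made-up values).
X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
fractions = [0.25, 0.75]

add = additive_descriptor(X, fractions)
comb = combinatorial_descriptor(X, fractions)
print(add.shape, comb.shape)
```

Under these assumptions the additive vector keeps length M, while the combinatorial vector grows as M to the power N. Even modest values of M and N therefore produce very long vectors, which is consistent with the need for a distributed computation noted above.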