Evolutionary Optimization of the Reduced

Full text

Turn on search term navigation

Introduction

Model reduction is a common strategy for modeling complex systems that are computationally constrained. Atmospheric isoprene chemistry is one of such systems, where the full extent of the known isoprene chemistry is far larger than can be implemented in 3-dimensional chemical transport models of the atmosphere. Given this, researchers have developed highly accurate reduced isoprene mechanisms using both manual (Bates & Jacob, 2019) and algorithmic (Wiser et al., 2023) methods. These models are used to forecast air quality and climate modeling, to which isoprene is a major contributor.

In this article, we present the use of particle swarm optimization (PSO) to optimize stoichiometric coefficients and rate constants for the recently published AMORE-Isoprene mechanisms v1.1 and v1.2 (Skipper et al., 2024; Wiser et al., 2023; Yang et al., 2023), referred to as AMORE v1.1 and AMORE v1.2 respectively throughout the rest of the text. With this method, we are able to improve mechanism performance by optimizing the mechanism with respect to a select set of priority species (see Section 3.2) in order to more closely match the full mechanism that these reduced mechanisms were derived from. We define an objective function to minimize using an error metric (see Section 3.2) to quantify the ability of the reduced mechanism to match the output of the full mechanism. PSO was chosen because it is an efficient method that can optimize a large number of parameters with an objective function that is computationally costly to evaluate. We undertook this project to address the need for automated optimization of reduced chemical mechanisms. As automated mechanism reduction becomes more widely available, there will be a greater need for reduced model optimization to improve the resulting mechanisms. We demonstrate PSO for these reduced gas-phase isoprene mechanisms as an example of the utility of this method, with the potential for it to be applied to other systems.

The AMORE v1.1 and AMORE v1.2 mechanisms are reduced isoprene mechanisms developed from the Caltech isoprene mechanism (Wennberg et al., 2018) using a mechanism reduction algorithm. The AMORE-Isoprene mechanisms were created using a graph-theory-based algorithm that measures the sensitivity of the full mechanism to a wide range of input conditions and creates a set of reduced mechanistic pathways that have output similar to the full mechanism. This algorithm was motivated by the need to create highly reduced volatile organic compound (VOC) oxidation mechanisms for use in computationally expensive 3D chemical transport models, which are used to model atmospheric aerosol formation, and air quality. In complete chemical models, rate constants and stoichiometric parameters are constrained by experimental results and conservation laws. However, in reduced models, such as the AMORE v1 isoprene mechanisms, the mechanism is condensed to a small set of species with much fewer reaction generations. Thus, the yields of products actually represent a much more complex chemistry, and so stoichiometric coefficients are not well-constrained in all cases. Rate constants are averaged over several reactions, and sometimes represent multiple generations of chemistry, and therefore are not well constrained either. Furthermore, the current state of the art requires manual tuning of stoichiometric coefficients and rate constants, so there is much potential for improvement. The AMORE v1 mechanism was initially developed and tested in CMAQ (Wiser et al., 2023). Subsequent GEOSChem modeling lead improvements and the release of AMORE v1.1 (Yang et al., 2023). The mechanism was updated once more, by removing some less significant reactions, adding in new secondary organic aerosol producing species, and recalibrating stoichiometric coefficients. This update was tested in CMAQ and released as AMORE v1.2 (Skipper et al., 2024). Both AMORE mechanisms have undergone rigorous testing for use as reduced isoprene models in chemical transport models. While both mechanisms originate from the same original mechanism, they have sufficiently different reactions and products to warrant separate treatment for optimization. These mechanisms along with our optimized mechanisms are available in the supporting files.

Optimization of stoichiometric coefficients and rate constants for chemical reaction mechanisms is not trivial, and represents a substantial bottleneck in the generation of accurate reduced mechanisms. The candidate reduced model is tasked with accurately representing the full chemistry in terms of the consumption and production of several priority species over a wide range of atmospheric conditions. For our work, we have developed a box model testing protocol to compare a candidate reduced mechanism to the full mechanism. This protocol involves running the full mechanism and candidate reduced mechanism under multiple conditions (Table 1) and comparing the net production and consumption of multiple priority species. We have developed a quantitative error metric (see Section 3.2) which our optimization seeks to minimize. Multiple box model runs are required to measure the error metric of a candidate mechanism, there is a high computational cost to measure the objective function that is being optimized. Additionally, mechanism parameters are highly coupled, and changes in one parameter often impact the optimal value for many other parameters. This means that parameters must be optimized simultaneously and that there are many potential local minima in the objective function. Although reduced mechanisms are considerably smaller than the full mechanisms on which they are based, they still contain a large number of parameters. For example, the AMORE v1.2 mechanism contains 107 stoichiometric coefficients, and 22 rate constants. The high number of coupled parameters (stoichiometric coefficients and rate constants) to optimize, combined with the relatively slow objective function evaluation time, makes this a challenging optimization problem.

Table 1 Six Different Run Conditions Used to Evaluate Mechanisms

Run description	ISOP	OH	${\text{HO}}_{2}$	NO	${\mathrm{O}}_{3}$	${\text{NO}}_{3}$	${\text{RO}}_{2}$	Photolysis
High OH	5	0.0002	0.007	0.01	0	0	0.001	1
High OH and NO	5	0.0002	0.007	0.2	0	0	0.001	0
High ${\mathrm{O}}_{3}$	2	0.00001	0.007	0.01	100	0	0.001	1
High ${\text{NO}}_{3}$	1	0.00001	0.007	0.1	0	0.0002	0.001	1
High ${\text{NO}}_{3}$ no $h\nu$	1	0.00001	0.007	0.1	0	0.0002	0.001	0
High Isop	10	0.0002	0.007	0.02	0	0	0.001	1

The remainder of this paper is organized as follows: in Section 2, we discuss the problem of chemical reaction modeling, and present a brief overview of the methods used to optimize reduced chemical mechanisms. In Section 3, we outline the details of the particle swarm optimization algorithm, and how it has been adapted to our problem of optimizing stoichiometric parameters for reduced chemical reaction mechanisms. This is followed by presenting the results in Section 4 for the reaction mechanisms under study, atmospheric gas-phase isoprene oxidation, for both variants, namely AMORE v1.1 and AMORE v1.2. Finally, in Section 5 we conclude and summarize our work presented in this article.

Background

Reaction Mechanism Modeling

Atmospheric chemical modeling is used for predictions and source apportionment of pollutants and particulate matter (PM), as well as being critical for accurate climate modeling (Pye et al., 2022). Accurate atmospheric chemical modeling relies on compact and high-quality chemical mechanisms for a range of atmospheric species. VOCs have particularly complex chemistry, and many highly detailed mechanisms have been developed for such compounds, which are far too large to be incorporated into atmospheric chemical models (Jenkin et al., 2015; Wennberg et al., 2018). These more complex models were developed with extensive experimental data and are expected to be more accurate than existing reduced models, though we are limited in our ability to verify this by our lack of ability to test these models in atmospheric simulations. Reduced chemical mechanisms are used to model the VOC chemistry, although some accuracy is lost. Generating accurate reduced representations of complex mechanisms has the potential to simultaneously test the performance of the complex model via the reduced model and generate reduced models that are more in line with the more detailed models. Where both complex and reduced mechanisms exist for a given compound, there is the potential to optimize reduced VOC mechanisms against the complex mechanism baseline using representative simulations, which are less computationally costly. However, few tools have been developed to do these optimizations up to this point.

Isoprene was chosen because it is a major component of atmospheric VOC's (Guenther et al., 2006); it influences tropospheric oxidant levels (Butler et al., 2008); it contributes significantly to secondary organic aerosol (Farmer et al., 2010; Fu et al., 2009; Henze & Seinfeld, 2006; Kroll et al., 2006; Liu et al., 2016), ozone (Fiore et al., 2011; Guo et al., 2018), and formaldehyde (Wolfe, Kaiser, et al., 2016) which are key factors in air quality; and due to the existence of recently published complex reference (Wennberg et al., 2018) and reduced (Wiser et al., 2023) isoprene mechanisms.

Particle Swarm Optimization

The evolutionary optimization strategy employed in this article is particle swarm optimization (PSO) (Kennedy & Eberhart, 1995). It belongs to a class of nature-inspired computing techniques (Patnaik et al., 2017) for optimization, termed swarm intelligence (Fister Jr et al., 2013). PSO has been deployed in a wide range of applications because of the versatility of the approach for challenging optimization processes. These applications include chemical mechanism analysis (Ourique et al., 2002a; H. Wang et al., 2023), parameter estimation (Schwaab et al., 2008), dynamic optimization (Mann et al., 2021; Ourique et al., 2002b; Zhou et al., 2014), forecasting (Y. Wang et al., 2019), data clustering (Alam et al., 2014), training feedforward neural networks (J.-R. Zhang et al., 2007), robotics (Camci et al., 2018), smart grid design (El-Zonkoly, 2011), astronomy (Jin & Rahmat-Samii, 2008), manufacturing (Navalertporn & Afzulpurkar, 2011), and additional applications (Pluhacek et al., 2018). Within the field of atmospheric chemistry, PSO algorithms have been used for various problems, including parameter optimization for custom instruments (Tong et al., 2016), identifying atmospheric gas species sources (Ma et al., 2014, 2018; J. Wang et al., 2017), predicting concentrations of select species and particulate matter (Kouziokas, 2020; J. Zhang et al., 2016), and estimating particle size distributions (Yuan et al., 2010).

A major benefit of using PSO is that we can choose to impose first-principles-based constraints on the optimization, which include bounds and heuristics for the optimization variables. This is an avenue for the inclusion of domain knowledge in the modeling framework, resulting in a hybrid artificial intelligence (AI) approach (Chakraborty et al., 2022). PSO belongs to the class of evolutionary algorithms, which is inspired by the process of evolution as observed in nature. These have had success in domains such as model discovery (Chakraborty et al., 2020, 2021), structure-to-property prediction (Chakraborty et al., 2024), process systems engineering (Jul-Rasmussen et al., 2023, 2024), inverse design (Venkatasubramanian et al., 1995), materials design (Srinivasan et al., 2013), and many others (Fang et al., 2023).

Inspired by the movement of a flock of birds, PSO attempts to model the collective intelligence of particles (or agents) toward the optimization of a global objective while adhering to local rules. It relies on a combination of global and local search by weighting their respective deviations, such that it is able to sufficiently explore the search space of objective variables while honing in on well-performing spaces that result in the optimization of the objective function. Its strength, which enables its applicability to a myriad of domains, is due to having a limited set of tunable parameters, and relatively simple update rules as one proceeds from one iteration to the next. As an evolutionary algorithm, we must point out that one of the drawbacks is that the algorithm does not ensure that the optimal value obtained after the pre-specified iterations is the global optimum. Accordingly, we must proceed with additional runs and save the best-performing optimal value(s).

Methods

We employ, particle swarm optimization, a derivative-free optimization to optimize the stoichiometric coefficients and rate constants of the reduced chemical mechanisms (AMORE v1.1 and AMORE v1.2). A derivative-free optimization is defined as one which does not utilize the derivative of the objective function in order to determine the next step in the optimization. This is in contrast to gradient based approaches, which utilize the derivative of the objective function to determine the next step. The reason for a derivative-free evolutionary optimization approach is three-fold. First, by virtue of the problem formulation, there is no unique mathematical function that can accurately and reliably map the multiple coefficients (stoichiometric and/or rate constants) to a continuous function, for every discrete possibility of reactant(s) and product(s). Accordingly, it is not possible to evaluate a gradient of the same. Second, the use of an evolutionary optimization scheme allows exploration of the huge parameter space, which is often a shortcoming of gradient-based approaches. With gradient based approaches, the number of error measurements needed is proportional to the number of optimizable parameters. As mentioned in Section 1, there are more than 100 parameters for the reduced chemical mechanisms used in this work, making gradient based approaches infeasible. On the other hand, evolutionary optimization methods do not require more error measurements for more optimizable parameters, and thus are not as computationally constrained by larger systems. Third, a derivative-free approach is able to escape local minima unlike gradient-based approaches.

PSO is well-suited to the problem discussed in this article for several reasons. First, the search space is high-dimensional (equal to the number of free-roaming stoichiometric coefficients and rate constants), meaning that the search space is very large. Second, there are multiple local minima in the objective function, which can be more readily explored using stochastic methods. Finally, the computational cost to measure the objective function on an individual mechanism is high. Taken together, these features favor an approach, such as PSO, which efficiently explores the search space in a stochastic manner, is derivative-free, and requires a low number of objective function evaluations. We note that PSO is an exemplar of an evolutionary optimization algorithm that is simple and particularly well-suited to our scenario, but it is not the only evolutionary algorithm that could potentially be applied. PSO is designed for continuous rather than discrete variables. In our case, we are optimizing stoichiometric coefficients and rate constants, which are both continuous variables, making PSO an appropriate choice. There are many evolutionary algorithms that differ from PSO in terms of specific implementation, but all utilize populations of parameter sets that evolve over the course of generations. We chose PSO because it is the most well-studied within its class of algorithms. In the subsequent Sections 3.1 and 3.2, we discuss the PSO algorithm and the objective function used in this study, respectively.

Particle Swarm Optimization

Particle swarm is an optimization algorithm that seeks to minimize the value of a user-defined objective function by modifying the optimizable parameters which are inputs to that function. Consider an objective function $f(x):{\mathbb{R}}^{n}\to \mathbb{R}$ that we wish to optimize. For our current problem, we minimize the difference between the net production of a select set of priority species predicted by our reduced mechanism, and that predicted by the full mechanism. The objective of PSO is to minimize this difference, which it does by changing the values of the stoichiometric parameters and rate constants of the reduced mechanism such that optimal parameters are obtained. These stoichiometric parameters and rate constants are the optimization variables in this problem.

At the start of the algorithm, several sets of random optimization variables are generated. These variables can be thought of as particles in a space of $N$ dimensions, where each instance is the location of the particle. Thus, the goal is to find the optimum position of the particles that minimizes the value of the objective function $f(x)$ . Let $\overrightarrow{g}(t)$ denote the best position the algorithm has encountered in iteration $t$ , and $\overrightarrow{p}$ denote the best position the algorithm has encountered since the start of the algorithm. Let ${\overrightarrow{x}}_{i}(t)$ denote the position of the particle $i$ during iteration $t$ . This position is updated when the particle moves to a new position with some velocity ${v}_{i}(t)$ . These positions are updated based on update rules as follows: 1 ${v}_{i}(t+1)=\chi \ast {v}_{i}(t)+{\phi }_{1}\ast {\omega }_{1}\ast \left(\overrightarrow{p}-{\overrightarrow{x}}_{i}(t)\right)+{\phi }_{2}\ast {\omega }_{2}\ast \left(\overrightarrow{g}(t)-{\overrightarrow{x}}_{i}(t)\right)$ 2 ${\overrightarrow{x}}_{i}(t+1)={\overrightarrow{x}}_{i}(t)+{v}_{i}(t+1)$

Here, ${\omega }_{1}$ and ${\omega }_{2}$ are random numbers uniformly sampled between 0 and 1. These incorporate stochasticity into the calculation of velocity of the particle. ${\omega }_{1}$ and ${\omega }_{2}$ are constant parameters that weigh the emphasis given to deviation from the best globally and locally performing particles in the swarm respectively. $\chi$ is termed the inertia weight, which is a measure of the contribution of the previous velocity of a particle to its current velocity (Bansal et al., 2011). Based on an agent's new velocity, its position is updated. Together, these terms determine the balance between exploration (global search) and exploitation (local search) in PSO. This is repeated for the pre-specified number of iterations until we obtain the best-performing particles. The algorithm ensures that the best-performing particle is at least at par with the optimum in a previous iteration, but not worse, unlike gradient descent which can overshoot depending on the learning rate. This is unlike gradient-based approaches, where due to an incorrect choice of the learning rate, the search for the optimum value(s) across the loss landscape leads to overshooting and/or divergence. Due to PSO algorithm's inherent stochastic nature, it is recommended to run the algorithm for a few runs, as the optimum obtained after the pre-specified number of iterations can vary. This also mitigates the risk of getting stuck in a local minimum of the objective function.

The progression of the PSO on a sample reduced mechanism from one iteration to the next is depicted in Figure 1. Since we use MATLAB (Inc, 2023a) for the problem discussed in this article, we refer the reader to the implementation of PSO Global Optimization Toolbox (Inc, 2023b), which includes modifications from Mezura-Montes and Coello (2011), and Pedersen (2010).

[IMAGE OMITTED. SEE PDF]

Objective Function

In our problem, the objective function is an aggregate measure of the fidelity of the net production rates of priority species as obtained from the reduced mechanism, compared to those obtained from the full mechanism. This is measured under six different conditions shown in Table 1, pertaining to isoprene-relevant conditions that occur in the atmosphere. Thus, we choose to minimize this objective function, as a lower objective function value quantitatively corresponds to a more accurate condensed representation of the full mechanism.

The goal of the objective function is to guide the optimization toward an accurate reduced mechanism that behaves similarly to the original full mechanism. Ultimately, a highly accurate reduced mechanism can be incorporated into a three-dimensional transport model for accurate air quality simulations. However, these models are highly computationally expensive, taking on the order of hours to days to simulate on supercomputers. This means that the final use case is not applicable for optimization with multiple runs, where a rapid evaluation of the objective function is necessary. Therefore, we used a standard method for testing chemical mechanisms, a box model, which does not have a spatial component, greatly reducing computational costs. Box model simulations can be run under a set of invariant conditions (temperature, pressure, solar intensity, and concentrations of reactive background species), which are chosen based on frequently encountered atmospheric conditions. The box model used in this work is the F0a.m. v4.0.2 box model (Wolfe, Marvin, et al., 2016), which runs in MATLAB. A 24-hr simulation of a candidate mechanism under one set of conditions takes approximately 0.8 s to run (on the Dell 2,000 MHz Inspiron 15 8-core laptop with 16 GB RAM which was used in this work). We used a sample of representative conditions meant to capture the variety seen in the atmosphere. In general, greater or fewer input conditions can be selected, inducing a trade-off between computational cost and atmospheric representation. This trade-off is also influenced by the variety of situations in which the mechanism being tested is relevant. In the case of the isoprene mechanism, we chose six different input conditions meant to select the most relevant conditions for isoprene. Table 1 lists these conditions. The set of conditions is provided as an input to the objective function evaluation, and all mechanisms are evaluated on all conditions. Although not addressed here, optimally selecting the input conditions is an orthogonal problem to pursue in future work.

The isoprene mechanism influences several important atmospheric species, including OH, ${\text{HO}}_{2}$ , NO, ${\text{NO}}_{2}$ , ozone ( ${\mathrm{O}}_{3}$ ), formaldehyde (HCHO), and isoprene epoxy-diol (IEPOX), lumped isoprene nitrates (ISOPN), glyoxal (GLY), methylglyoxal (MGLY), methyl vinyl ketone (MVK), and methacrolein (MACR). We note that these priority species are user-defined and more or less of them can be included in the optimization process depending on model goals. The function includes individual performance metrics for each of the priority species involved in the mechanism, which are given an importance weighting based on the environmental context. In order to take into consideration the performance of the mechanism across multiple species and conditions, the objective function consists of a weighted average of individual species-run performance metrics. A species-run is defined as a simulation of an individual species under one set of input conditions. The following weights were used for our model runs: OH, 1; ${\text{HO}}_{2}$ , 1; NO, 1; ${\text{NO}}_{2}$ , 1; ${\mathrm{O}}_{3}$ , 1; formaldehyde, 1; isoprene, 0.5; IEPOX, 1; methylglyoxal, 0.5; glyoxal, 0.5; ${\text{CH}}_{3}{\text{CO}}_{3}$ , 0.8; ${\text{CH}}_{3}$ OO, 0.8; methacrolein, 0.5; methyl vinyl ketone, 0.5; isoprene nitrates, 0.8. These weights are user-defined and depend on the priorities of the reduced mechanism. All of our priority species have weights in a narrow range of 0.5–1, which prioritizes breadth of improvement, but we do assign heavier weighting to small inorganic species and the most important organics. Changing the weights will have no impacts on the optimization process aside from changing the final optimized result.

The ultimate performance goal of the reduced mechanism is to accurately match the concentration of the priority species in the full mechanism. The rate of production and consumption are the two forces that influence the overall concentration of the priority species. Our error metric is defined using the net production and consumption of a given species. We note that other metrics can be used involving concentration, time-dependent production rates, or any other user-defined quantity. The isoprene mechanism influences primarily the production rate of several priority organic species and also the production and consumption rate of some reactive background species. In order to quantify the performance of a mechanism, the error must be calculated for each species with simulation results from each input condition. The combination of results from a single species under a set of conditions is termed a species-run. A useful species-run metric is normalized, so that averages can be taken without being skewed by significantly higher or lower values. The production and consumption rate of the priority species in the isoprene mechanism varies over time as the mechanism simulation progresses. The species-run metric captures this time dependence by integrating the difference in production and consumption rates of the target species between the test and reference mechanism over the entire run time. It must be noted that the reference mechanism was run on the same box model. The sum of the reference and test values is used as the denominator so that the quantity is normalized to be less than or equal to one. The following equations give the metric used for the individual species run, which was averaged to create the objective function: 3 ${P}_{x,s}^{T}=\int \nolimits_{{t}_{0}}^{{t}_{f}}{p}_{x,s}^{T}(t)dt$ 4 ${P}_{x,s}^{R}=\int \nolimits_{{t}_{0}}^{{t}_{f}}{p}_{x,s}^{R}(t)dt$ 5 ${C}_{x,s}^{T}=\int \nolimits_{{t}_{0}}^{{t}_{f}}{c}_{x,s}^{T}(t)dt$ 6 ${C}_{x,s}^{R}=\int \nolimits_{{t}_{0}}^{{t}_{f}}{c}_{x,s}^{R}(t)dt$ 7 ${f}_{x,s}(T,R)=\frac{abs\left[\left({P}_{x,s}^{T}-\alpha {C}_{x,s}^{T}\right)-\left({P}_{x,s}^{R}-\alpha {C}_{x,s}^{R}\right)\right]}{abs\left[\left({P}_{x,s}^{T}-\alpha {C}_{x,s}^{T}\right)+\left({P}_{x,s}^{R}-\alpha {C}_{x,s}^{R}\right)\right]}$

Here, $x$ represents a set of input conditions, $s$ represents the priority species being measured, $T$ denotes that the test mechanism is being measured, $R$ denotes that the reference mechanism is being measured, ${p}_{x,s}^{T}(t)$ represents the rate of production of species $s$ with input conditions $x$ using mechanism $T$ , ${c}_{x,s}^{T}(t)$ represents the rate of consumption of the same, ${\alpha }_{s}$ is a binary variable which denotes whether or not consumption should be taken into account for species $s$ , ${C}_{x,s}^{T}$ and ${P}_{x,s}^{T}$ represent the total net consumption and production of species $s$ with input conditions $x$ for mechanism $T$ over the total run time from ${t}_{0}$ to ${t}_{f}$ respectively, and ${f}_{x,s}(T,R)$ represents the species-run error metric. The error metric ranges from 0 to 1, where 0 represents perfect alignment with the entire mechanism, and 1 represents an infinite deviation from the reference mechanism. Only test mechanisms that represent and match the net production and consumption rate of each species will have a error metric of 0. Equation 8 shows the overall objective function used for a test mechanism. 8 $F(T,R)=\sum\limits _{x=0}^{X}\sum\limits _{s=0}^{S}{\omega }_{s}{f}_{x,s}(T,R)$

Here, $F(T,R)$ is the objective function for a mechanism T compared to the reference mechanism R, $X$ represents the set of all test conditions, $S$ represents all the priority species being measured, ${\omega }_{s}$ represents the weighting assigned to a given species, and ${f}_{x,s}(T,R)$ is given in Equation 7. By virtue of the problem formulation, we can explore a few orders of magnitude of the acceptable rate constants, and similarly, for stoichiometric coefficients, we can search within a user-defined range. Here, the rate constants were allowed to perturb within 2 orders of magnitude of the previously user-defined default values, which served as a reasonable starting point for the algorithm. The stoichiometric coefficients of the products were restricted to be within 0.01–2. The reactant stoichiometric coefficients were held constant. These stoichiometric coefficients and rate constant values are the optimization variables used in PSO. We first optimize only the stoichiometric coefficients while keeping the rate constants the same, and obtain their results. Separately, we optimized the stoichiometric coefficients and rate constants simultaneously. This was done in order to investigate the effect of including rate constants on optimization results.

In the next section, we present the results of optimizing the AMORE v1.1 and AMORE v1.2 reduced mechanisms, using both: only stoichiometric parameter optimization, and stoichiometric and rate constant optimization. The results of the same are compared to the concentration plots obtained from the AMORE v1.1, AMORE v1.2, and the Caltech Isoprene mechanism designed by human experts (Wennberg et al., 2018; Wiser et al., 2023). We note that the optimized reduced mechanisms are expected to be no more or less computationally efficient than the reduced mechanisms they are derived from, since they contain the same number of species and reactions. Rather, the purpose of this optimization is to improve upon an existing reduced mechanism without impacting computational cost.

Results and Discussion

We conducted several runs of the PSO algorithm on the AMORE v1.2 mechanism. All PSO-optimized mechanisms scored better on the objective function than the AMORE v1.2 baseline mechanism. We ran the optimization using different particle populations and number of generations. Based on conventional evolutionary optimization terminology, population refers to the entire collection of optimization variables. Thus, a population of 50 individuals would have 50 instances of N-dimensional optimization variables, with each set of N-dimensional optimization variables being referred to as an individual. Generation refers to the iterations of the optimization algorithm.

We further use the term fitness value to quantify an individual's performance in the optimization routine, by measuring its fitness, or how well is the individual performing in comparison to the ground truth. It can be the output of a custom-loss function, that calculates the difference between the prediction and true value (as is the case in our study), or a conventional sum-of-squared errors, among other such metrics. Based on this value, individuals in an evolutionary optimization routine are ranked for the subsequent steps of genetic operations. In our case, the fitness value is determined by the objective function score defined in Section 3.2.

Figure 2 shows the best individual performance within the current population versus the number of parameter sets tested for several different particle populations. The x-axis scales with the run time, as testing each parameter set takes roughly the same amount of time. The starting objective function score for each run is different due to the stochastic nature of the initial particle selection. Although larger populations will have lower starting values on average, there is no guarantee for an individual run. The 5 particle population run had the highest initial objective function value of 0.4697, and the 25 particle population run had the lowest initial objective functional value of 0.3851.

[IMAGE OMITTED. SEE PDF]

Each run starts at a different number of parameter sets tested, since the starting point represents the best fitness after the first generation of parameter sets has been tested. In all cases, the objective function decreases rapidly at first and more slowly as the optimization goes on. From the data, we can see that, initially, small particle populations are able to descend more rapidly toward a better objective function score, but more quickly reach a plateau where the descent is much more gradual. Larger particle populations tend to show a much slower initial descent that is sustained for a longer period. This can be explained by the fact that for larger particle populations, each generation requires more parameter sets to be tested, leading to a much slower convergence toward the vicinity of the best particle.

The 5 particle population run achieved a final objective function score of 0.2693 after 1,255 parameter sets tested. Due to the small population, this run hit a clear plateau after 227 generations, showing no improvement in the subsequent 24 generations. The 25 particle population run achieved a final objective function score of 0.2076 after 6,275 parameter sets tested. The 100 particle population achieved a final score of 0.2125 after 5,700 parameter sets tested. The 50 particle population run achieved a objective function score of 0.2277 after 1,850 parameter sets tested, and was on pace to outperform the 25 particle population run. The results demonstrates diminishing returns as particle population increases, as well as the inherent variability in runs. However, the data set is not large enough to draw conclusions about the optimal generation size, and due to the stochastic nature of the algorithm, results will vary significantly between runs.

We investigated the amount of variation between identical runs using a $10\times 20$ optimization of the AMORE v1.2 stoiciometric coefficients with the original mechanism as a seed particle and ±30% of the original value as constraints on all coefficients. We ran this optimization 10 separate times. The initial best score ranged from 0.2785 to 0.2863 with a percent standard deviation of 1.0%. The final best score ranged from 0.2476 to 0.2567, with a percent standard deviation of 1.19%.

It took approximately 5 s to test a single parameter set on each of the six conditions (on the Dell 2,000 MHz Inspiron 15 8-core laptop with 16 GB RAM which was used in this work). Our longest run took approximately 8 hr, but we were able to achieve significant improvement on runs that were only 1 hr long. We used generations of 25–100 depending on the particle population and desired runtime. For a given mechanism, we found that there was a minimum objective function score that multiple runs converged toward. For example, none of the AMORE v1.2 mechanism were able to achieve a fitness value below 0.2, while several were able to achieve a fitness value below 0.23, as demonstrated by the final values described above. While we do not know what the global minimum is for a given mechanism, we do know that it is influenced by the structure of the mechanism itself, and therefore there is a limit to the amount of improvement that is possible.

The optimal population will depend on the use case, but particle population should roughly scale with desired runtime. As shown in Figure 2, there are diminishing returns in running the algorithm near the plateau value, which will be reached more quickly in a small particle population. Likewise, the initial descent will tend to be slower in larger particle populations, leading to marginal improvement if the number of generations is too small. From our testing, particle populations of 50 with 100 generations are an ideal balance between runtime and model improvement. Increasing constraints and using seed particles will generally reduce the particle population size required and the number of generations needed. In general, choosing smaller population sizes leaves the search of optimal solutions to chance while larger population sizes result in much more compute while promoting only the best obtained individuals. Thus, it is more likely that locally optimal solutions get carried forward through the generations without exploring the parameter space sufficiently (Coello et al., 2023; Goldberg et al., 1991).

We chose a selection of our best PSO mechanisms for a more detailed analysis. These mechanisms include two optimized variants of both, the AMORE v1.1 isoprene mechanism, and the AMORE v1.2 isoprene mechanism. For both, we first optimize only the stoichiometric coefficients, followed by optimization of stoichiometric coefficients and rate constants. All optimizations were performed for 100 generations with a particle population of 50. This was the largest run that we were able to do on our system within an 8 hr runtime. Table 2 shows the fitness values for each optimized mechanism, and the mechanism it is optimizing. The optimization of the PSO without rate coefficients was able to improve the AMORE v1.1 mechanism by 24.0%, and improve the AMORE v1.2 mechanism by 26.2%. With the rate constants included in optimization, the AMORE v1.1 mechanism was improved by 28.7% and the AMORE v1.2 mechanism was improved by 28.8%. The PSO optimization had strong breadth of improvement in all six testing conditions. The AMORE v1.1 PSO with rates, AMORE v1.2 PSO, and AMORE v1.2 PSO with rates performed equal to or better than the original mechanism for every condition tested. The AMORE v1.1 PSO was better than the original mechanism for five out of six conditions, and had a slightly worse performance under high ${\text{NO}}_{3}$ low light conditions.

Table 2 Table Showing Measured Fitness Values for Six Reduced Isoprene Mechanisms Under Six Different Testing Conditions

Mechanism	AMORE v1.1	AMORE v1.1 PSO	AMORE v1.1 PSO + rates	AMORE v1.2	AMORE v1.2 PSO	AMORE v1.2 PSO + rates
High OH	0.42	0.27	0.16	0.27	0.19	0.16
High OH + NO	0.35	0.31	0.34	0.28	0.23	0.24
High ${\mathrm{O}}_{3}$	0.29	0.12	0.18	0.30	0.21	0.16
High ${\text{NO}}_{3}$	0.34	0.32	0.24	0.27	0.19	0.21
High ${\text{NO}}_{3}$ no hv	0.35	0.36	0.35	0.33	0.29	0.29
High Isoprene	0.39	0.26	0.15	0.27	0.17	0.16
Average	0.36	0.27	0.24	0.29	0.21	0.20
% improvement	–	24.0	28.7	–	26.2	28.8

Figures 3 and 4 show the concentration of a select set of organic species (formaldehyde (HCHO), isoprene epoxy-diol (IEPOX), lumped isoprene nitrates, methylglyoxal (MGLY), methacrolein (MACR), and glyoxal (GLYX)) under high OH conditions for several isoprene mechanisms. Figure 3 shows the AMORE v1.1 mechanism in comparison to the AMORE v1.1 $50\times 100$ PSO-optimized (stoichiometric coefficients only) alongside the full Caltech reference isoprene mechanism. Figure 4 shows the AMORE v1.2 mechanism in comparison to the AMORE v1.2 PSO-optimized mechanism (particle population of 50, for 100 generations), without rate constants, alongside the full Caltech reference isoprene mechanism. For the AMORE v1.1 results shown in Figure 3, the performance in improved to near zero error (looking at end concentrations) for formaldehyde, IEPOX, methylglyoxal, and glyoxal, with a marginal increase in error for isoprene nitrates, and no change for methacrolein. For the AMORE v1.2 results shown in Figure 4, there is a reduction in error for methyglyoxal and methacrolein, little change for IEPOX and isoprene nitrates, and an increase in error for formaldehyde and glyoxal. We note that end concentrations are not the only metric for accuracy, but the optimization error metric prioritized end concentrations by focusing on net production and consumption. Thus, end concentrations are a good indicator of the performance of the optimization. Using a different error metric is fully compatible with the PSO method and would yield different results.

[IMAGE OMITTED. SEE PDF]

We conducted further investigation into the increased error in formaldehyde, and identified the isoprene + OH reaction formaldehyde product coefficient as the key difference between the unoptimized and optimized AMORE v1.2 mechanisms (see supporting files for mechanism details). The optimized mechanism had a lower stoichiometric coefficient for formaldehyde, leading to lower concentrations than the reference mechanism. The stochastic nature of the PSO algorithm means that particles will have parameters that are not perfectly optimized. This set of parameters was selected in spite of the bias incurred from the identified coefficient, because in aggregate, it had a higher performance. In general, there is a trade-off between increasing breadth of improvement by having several species in the objective score, and extent of improvement, by having fewer heavily weighted species in the objective score.

This error can be counteracted by identifying a species for further improvement and isolating its coefficients. We ran a 10 × 10 PSO optimization using the original optimization as a seed particle, and only changing stoichiometric coefficients involving formaldehyde. The formaldehyde error score (7) decreased from 0.13 to 0.040 with the additional optimization, much lower than the formaldehyde score of 0.058 in the original mechanism. The overall mechanism objective function value decreased from 0.21 to 0.20, showing that no trade-offs arose from this isolated optimization. PSO is additive by nature, meaning that optimizations can be done sequentially to the same mechanism, allowing for this process to continue until the user has achieved the desired level of accuracy.

Figures 5 and 6 compare the deviation from reference value of PSO-optimized mechanisms to the AMORE baseline mechanisms for six of the most important species under the six different testing conditions (specified in Table 1). There is variation in the deviations between species and conditions, but on average, there is a significant reduction in deviations from the AMORE mechanisms to the PSO mechanisms. We selected a few species to highlight for causal explanations. For the AMORE v1.1 based mechanisms, the stoichiometric and rates optimization did well for ${\text{HO}}_{2}$ .

[IMAGE OMITTED. SEE PDF]

We identified the reaction of ISOPOO with ${\text{HO}}_{2}$ as the cause of this performance change. In the original mechanism, this reaction produces 0.6 mol of ${\text{HO}}_{2}$ , whereas in the optimized mechanism it produced 1.72 mol of ${\text{HO}}_{2}$ . This substantially increased ${\text{HO}}_{2}$ production accounts for the higher net production of ${\text{HO}}_{2}$ in the full mechanism due to production from more oxidized compounds that are removed from the reduced AMORE mechanism. The stoichiometric optimization did well on OH under high ${\mathrm{O}}_{3}$ conditions. The full mechanism has a net production of near zero for OH under high ${\mathrm{O}}_{3}$ conditions as a result of OH production from the reaction of isoprene with OH compensating for the consumption of OH reacting with isoprene. The optimized mechanism increased the yield of OH in the reaction of isoprene with ${\mathrm{O}}_{3}$ leading to stronger agreement. For the AMORE v1.2 based mechanisms, both optimizations performed well for ${\text{HO}}_{2}$ , particularly under high OH and NO conditions. We identified the ISOPOO ${+}$ NO reaction as the key reason for improved performance. The yield of ${\text{HO}}_{2}$ went from 0.48 in the original mechanism to 1.67 in the stoichiometric optimization and 1.83 in the optimization with rates, leading to more ${\text{HO}}_{2}$ production in line with the full mechanism. For both sets of optimization, NO had higher bias than the original mechanism. In both versions of AMORE, NO is consumed rather than produced, and the key reaction involved in NO consumption is ISOPOO ${+}$ NO. We found that the yield of ISOPOO in the optimized mechanisms was reduced, and the product yields from ISOPOO were increased to compensate for the reduced yield. However, there was no way to compensate for reduce NO consumption, leading to higher bias. This impact could be fixed by imposing a limit constraint on the ISOPOO yield. We describe the results of a constrained run in the following paragraph.

The randomness associated with PSO may be undesirable in some use cases. The primary benefit of PSO's randomness is in its ability to optimize under strict time constraints where other methods, such as gradient descent, will not work. However, there are constraints that may be added to PSO to ensure a baseline level of improvement and reduce undesired outcomes. One method to ensure a baseline of performance is to include a seed particle in the particle swarm optimization. This seed particle represents an existing set of coefficients, such as the original coefficients or a previously optimized set of coefficients. This method ensures that the optimized result will be no worse than the seed particle, and will generally be much better than the seed particle. Additionally, tighter constraints can be set for the parameters. In the case of our original runs, we used no seed particles, and set relatively generous bounds of 0–2 for each of the optimized stoichiometric coefficients. On the one hand, this allows the PSO to explore a much larger space, but the drawback is that the optimization is starting from scratch, and has to make up some ground before it is at the same level of performance to the original mechanism. To demonstrate the versatility of the method, we ran a $25\times 25$ optimization on the AMORE 1v.2 mechanism with a seed of the original mechanism and constraints on each parameter such that they could vary by no more than 30% above or below their original value. Additionally we select isoprene-based species coefficients, such as ISOPOO, to preserve their yield. After 625 parameter sets tested, this run achieved an objective function score of 0.24, an improvement of 19.5% over the AMORE v1.2 baseline. Notably, the score for NO under high NO conditions was 0.004, compared to 0.28 and 0.30 for the two unconstrained AMORE v1.2 PSO mechanisms. This shows that constraints have the ability to minimize unwanted results in species specific cases.

Improvement in one set of conditions may not lead to improvement in another set of conditions. While our selection of conditions was not arbitrary, it does not necessarily reflect the full breadth of atmospheric condtions. In order to demonstrate the ability of the algorithm to work across a diverse set of conditions, we utilized species concentration and meteorological data set derived from a 200 km gridded global GEOS-Chem simulation for 24 hr on 1 July for the year 2016. From this data set, we randomly selected 1,000 data points in which the concentration of isoprene was above 0.1 ppb. In this data set, conditions range from 0.1 to 110.7 ppb for isoprene, $2.45\times 1{0}^{-7}$ to $9.28\times 1{0}^{-4}$ ppb for OH, $8.04\times 1{0}^{-8}$ to 0.036 ppb for ${\text{HO}}_{2}$ , $3.06\times 1{0}^{-13}$ to 5.71 ppb for NO, $1.15\times 1{0}^{-3}$ to 17.9 ppb for ${\text{NO}}_{2}$ , 4.9–84.7 ppb for ${\mathrm{O}}_{3}$ , $1.85\times 1{0}^{-8}$ to $1.71\times 1{0}^{-3}$ ppb for ${\text{NO}}_{3}$ , 0 to 0.39 solar intensity (as a fraction of full sun), and 227 to $315{}^{\circ}K$ . The full GEOS-Chem data set was far too large to use directly for PSO, so we represented it using the k-means clustering algorithm to create six clusters representative of the full sample. The full data set and k-means cluster data are available in the supporting files. We ran a $50\times 80$ optimization on the AMORE 1.1 mechanism and a $50\times 50$ optimization on the AMORE v1.2 mechanism with the respective original mechanisms as seed particles. Table 3 shows the results of these runs, along with the fitness values of the original AMORE mechanisms and the PSO mechanisms from the prior testing conditions. The mechanisms optimized under prior testing conditions did not perform well under the new GEOS-Chem derived conditions, however, the newly optimized mechanisms had improved performance over the AMORE baseline mechanisms. The standard deviations increased by a modest amount, suggesting that a subset of the conditions in the data set may have been poorly represented by the clustered data. Further investigation showed that most of the high error results were in conditions with OH concentrations less than $1{0}^{-5}$ ppb, suggesting that this range was not as well represented in the clusters, or in the original conditions used. We tested these new mechanisms on the original conditions in Table 1 and obtained errors of 0.32 for the AMORE v1.1 PSO and 0.26 for the AMORE v1.2 PSO, which is better than the original mechanism values shown in Table 2. These results demonstrate that PSO can be used effectively to optimize mechanisms in a diverse set of conditions, including in ones that span the atmospheric condition space.

Table 3 Table Showing Mean and Standard Deviation of Fitness Values for Six Reduced Isoprene Mechanisms Under 1,000 Different Testing Conditions Derived From a Global GEOS-Chem Simulation

Mechanism	AMv1.2	AMv1.2 PSO	AMv1.2 GeosPSO	AMv1.1	AMv1.1 PSO	AMv1.1 GeosPSO
Average Score	0.29	0.38	0.27	0.32	0.37	0.28
Standard Deviation	0.05	0.11	0.07	0.06	0.07	0.08

Conclusion

In this paper, we present an optimization approach for obtaining the optimal parameters of a reduced isoprene mechanism, such that the fidelity to the full mechanism is maximized. The approach relies on the popular and effective evolutionary optimization algorithm, particle swarm optimization (PSO). We have discussed the results for optimization of only stoichiometric coefficients, and that of stoichiometric coefficients and rate constants simultaneously. The latter results in a larger search space due to additional objective variables to be optimized, which PSO is able to handle reasonably well.

The benefits accrued from the optimization of parameters of a reduced mechanism are its increased accuracy when compared to the complete large-scale mechanism. Such an optimized reduced mechanism can be used independently for making predictions of the concentrations of important species in the atmosphere, for a fraction of the computational power in comparison to the full reference mechanism. While the parameters obtained are not globally optimum, the approach yields optimal parameter values for both the reduced mechanisms considered in this article, with an improvement by up to 28.8% in the objective function (for the conditions shown in Table 1) over the baseline state-of-the-art mechanism. Our results are specific to the reduced isoprene models tested and our objective function. However, the PSO method has potential to be applied to other reduced models with differing constraints and optimization goals, as the method is flexible to using any objective function and mechanism. Here we have demonstrated that PSO is a viable method for optimizing reduced isoprene mechanisms using box model simulations with useful levels of improvement in a reasonable time-frame using widely available software tools and modest computational resources.

Acknowledgments

This publication was developed under Assistance Agreement No. 84001301 awarded by the U.S. Environmental Protection Agency to McNeill. The views expressed in this article are those of the authors and do not necessarily represent the views or policies of the U.S. Environmental Protection Agency. EPA does not endorse any products or commercial services mentioned in this publication. VV is grateful for support in part by the NSF EFRI-DCheM 2132142 Grant and funding from the Center for the Management of Systemic Risk (CMSR) at Columbia University.

Data Availability Statement

The MATLAB and python scripts and mechanism result data for running the PSO algorithm and measuring mechanism performance in this study are available on Github () via with open access (Wiser & Chakraborty, 2024). This repository contains the scripts needed to run the PSO algorithm in Matlab, along with ancillary scripts for running and testing mechanisms. Data is also provided for the performance of the mechanisms demonstrated in this work, along with the mechanism files.

References

Alam, S., Dobbie, G., Koh, Y. S., Riddle, P., & Ur Rehman, S. (2014). Research on particle swarm optimization based clustering: A systematic review of literature and techniques. Swarm and Evolutionary Computation, 17, 1–13. https://doi.org/10.1016/j.swevo.2014.02.001

Bansal, J. C., Singh, P., Saraswat, M., Verma, A., Jadon, S. S., & Abraham, A. (2011). Inertia weight strategies in particle swarm optimization. In 2011 third world congress on nature and biologically inspired computing (pp. 633–640).

Bates, K. H., & Jacob, D. J. (2019). A new model mechanism for atmospheric oxidation of isoprene: Global effects on oxidants, nitrogen oxides, organic products, and secondary organic aerosol. Atmospheric Chemistry and Physics, 19(14), 9613–9640. https://doi.org/10.5194/acp‐19‐9613‐2019

Butler, D., Taraborrelli, T. M., Brühl, C., Fischer, H., Harder, H., Martinez, M., et al. (2008). Improved simulation of isoprene oxidation chemistry with the ECHAM5/messy chemistry‐climate model: Lessons from the Gabriel airborne field campaign. Atmospheric Chemistry and Physics, 8(16), 4529–4546. https://doi.org/10.5194/acp‐8‐4529‐2008

Camci, E., Kripalani, D. R., Ma, L., Kayacan, E., & Khanesar, M. A. (2018). An aerial robot for rice farm quality inspection with type‐2 fuzzy neural networks tuned by particle swarm optimization‐sliding mode control hybrid algorithm. Swarm and Evolutionary Computation, 41, 1–8. https://doi.org/10.1016/j.swevo.2017.10.003

Chakraborty, A., Gandhi, A., Hasan, M. F., & Venkatasubramanian, V. (2024). Discovering zeolite adsorption isotherms: A hybrid AI modeling approach. In Computer aided chemical engineering (Vol. 53, pp. 511–516). Elsevier. https://doi.org/10.1016/b978‐0‐443‐28824‐1.50086‐7

Chakraborty, A., Serneels, S., Claussen, H., & Venkatasubramanian, V. (2022). Hybrid AI models in chemical engineering–a purpose‐driven perspective. Computer Aided Chemical Engineering, 51, 1507–1512. https://doi.org/10.1016/b978‐0‐323‐95879‐0.50252‐6

Chakraborty, A., Sivaram, A., Samavedham, L., & Venkatasubramanian, V. (2020). Mechanism discovery and model identification using genetic feature extraction and statistical testing. Computers & Chemical Engineering, 140, 106900. https://doi.org/10.1016/j.compchemeng.2020.106900

Chakraborty, A., Sivaram, A., & Venkatasubramanian, V. (2021). Ai‐Darwin: A first principles‐based model discovery engine using machine learning. Computers & Chemical Engineering, 154, 107470. https://doi.org/10.1016/j.compchemeng.2021.107470

Coello, C., Goodman, E., Miettinen, K., Saxena, D., Schütze, O., & Thiele, L. (2023). Interview: Kalyanmoy deb talks about formation, development and challenges of the emo community, important positions in his career, and issues faced getting his works published. MDPI, 28, 34. https://doi.org/10.3390/mca28020034

El‐Zonkoly, A. (2011). Optimal placement of multi‐distributed generation units including different load models using particle swarm optimization. Swarm and Evolutionary Computation, 1(1), 50–59. https://doi.org/10.1016/j.swevo.2011.02.003

Fang, J., Liu, W., Chen, L., Lauria, S., Miron, A., & Liu, X. (2023). A survey of algorithms, applications and trends for particle swarm optimization. International Journal of Network Dynamics and Intelligence, 24–50. https://doi.org/10.53941/ijndi0201002

Farmer, D. K., Matsunaga, A., Docherty, K. S., Surratt, J. D., Seinfeld, J. H., Ziemann, P. J., & Jimenez, J. L. (2010). Response of an aerosol mass spectrometer to organonitrates and organosulfates and implications for atmospheric chemistry. In Proceedings of the National Academy of Sciences, (Vol. 107(15), pp. 6670–6675). https://doi.org/10.1073/pnas.0912340107

Fiore, A. M., Levy, H., II, & Jaffe, D. A. (2011). North American isoprene influence on intercontinental ozone pollution. Atmospheric Chemistry and Physics, 11(4), 1697–1710. https://doi.org/10.5194/acp‐11‐1697‐2011

Fister, I., Jr., Yang, X.‐S., Fister, I., Brest, J., & Fister, D. (2013). A brief review of nature‐inspired algorithms for optimization. Retrieved from https://arxiv.org/abs/1307.4186

Fu, T.‐M., Jacob, D. J., & Heald, C. L. (2009). Aqueous‐phase reactive uptake of dicarbonyls as a source of organic aerosol over Eastern North America. Atmospheric Environment, 43(10), 1814–1822. https://doi.org/10.1016/j.atmosenv.2008.12.029

Goldberg, D. E., Deb, K., & Clark, J. H. (1991). Genetic algorithms, noise, and the sizing of populations. Complex Systems, 6, 333–362.

Guenther, A., Karl, T., Harley, P., Wiedinmyer, C., Palmer, P. I., & Geron, C. (2006). Estimates of global terrestrial isoprene emissions using Megan (model of emissions of gases and aerosols from nature). Atmospheric Chemistry and Physics, 6(11), 3181–3210. https://doi.org/10.5194/acp‐6‐3181‐2006

Guo, J. J., Fiore, A. M., Murray, L. T., Jaffe, D. A., Schnell, J. L., Moore, C. T., & Milly, G. P. (2018). Average versus high surface ozone levels over the continental USA: Model bias, background influences, and interannual variability. Atmospheric Chemistry and Physics, 18(16), 12123–12140. https://doi.org/10.5194/acp‐18‐12123‐2018

Henze, D. K., & Seinfeld, J. H. (2006). Global secondary organic aerosol from isoprene oxidation. Geophysical Research Letters, 33(9). https://doi.org/10.1029/2006GL025976

Inc, T. M. (2023a). Matlab version: 23.2.0.2391609 (r2023b). Natick, Massachusetts, United States: Author. Retrieved from https://www.mathworks.com

Inc, T. M. (2023b). Optimization toolbox version: 23.2 (r2023b). Natick, Massachusetts, United States: Author. Retrieved from https://www.mathworks.com

Jenkin, M. E., Young, J. C., & Rickard, A. R. (2015). The MCM v3.3.1 degradation scheme for isoprene. Atmospheric Chemistry and Physics, 15(20), 11433–11459. https://doi.org/10.5194/acp‐15‐11433‐2015

Jin, N., & Rahmat‐Samii, Y. (2008). Analysis and particle swarm optimization of correlator antenna arrays for radio astronomy applications. IEEE Transactions on Antennas and Propagation, 56(5), 1269–1279. https://doi.org/10.1109/TAP.2008.922622

Jul‐Rasmussen, P., Chakraborty, A., Venkatasubramanian, V., Liang, X., & Huusom, J. K. (2023). Identifying first‐principles models for bubble column aeration using machine learning. In Computer aided chemical engineering (Vol. 52, pp. 1089–1094). Elsevier. https://doi.org/10.1016/b978‐0‐443‐15274‐0.50174‐8

Jul‐Rasmussen, P., Chakraborty, A., Venkatasubramanian, V., Liang, X., & Huusom, J. K. (2024). Hybrid AI modeling techniques for pilot scale bubble column aeration: A comparative study. Computers & Chemical Engineering, 185, 108655. https://doi.org/10.1016/j.compchemeng.2024.108655

Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. In Proceedings of ICNN'95‐international conference on neural networks (Vol. 4, pp. 1942–1948).

Kouziokas, G. N. (2020). SVM kernel based on particle swarm optimized vector and Bayesian optimized SVM in atmospheric particulate matter forecasting. Applied Soft Computing, 93, 106410. https://doi.org/10.1016/j.asoc.2020.106410

Kroll, J. H., Ng, N. L., Murphy, S. M., Flagan, R. C., & Seinfeld, J. H. (2006). Secondary organic aerosol formation from isoprene photooxidation. Environmental Science & Technology, 40(6), 1869–1877. https://doi.org/10.1021/es0524301

Liu, J., D'Ambro, E. L., Lee, B. H., Lopez‐Hilfiker, F. D., Zaveri, R. A., Rivera‐Rios, J. C., et al. (2016). Efficient isoprene secondary organic aerosol formation from a non‐Iepox pathway. Environmental Science & Technology, 50(18), 9872–9880. https://doi.org/10.1021/acs.est.6b01872

Ma, D., Tan, W., Wang, Q., Zhang, Z., Gao, J., Zeng, Q., et al. (2018). Application and improvement of swarm intelligence optimization algorithm in gas emission source identification in atmosphere. Journal of Loss Prevention in the Process Industries, 56, 262–271. https://doi.org/10.1016/j.jlp.2018.09.008

Ma, D., Wang, S., & Zhang, Z. (2014). Hybrid algorithm of minimum relative entropy‐particle swarm optimization with adjustment parameters for gas source term identification in atmosphere. Atmospheric Environment, 94, 637–646. https://doi.org/10.1016/j.atmosenv.2014.05.034

Mann, V., Sivaram, A., Das, L., & Venkatasubramanian, V. (2021). Robust and efficient swarm communication topologies for hostile environments. Swarm and Evolutionary Computation, 62, 100848. https://doi.org/10.1016/j.swevo.2021.100848

Mezura‐Montes, E., & Coello, C. A. C. (2011). Constraint‐handling in nature‐inspired numerical optimization: Past, present and future. Swarm and Evolutionary Computation, 1(4), 173–194. https://doi.org/10.1016/j.swevo.2011.10.001

Navalertporn, T., & Afzulpurkar, N. V. (2011). Optimization of tile manufacturing process using particle swarm optimization. Swarm and Evolutionary Computation, 1(2), 97–109. https://doi.org/10.1016/j.swevo.2011.05.003

Ourique, C. O., Biscaia, E. C., & Pinto, J. C. (2002). The use of particle swarm optimization for dynamical analysis in chemical processes. Computers & Chemical Engineering, 26(12), 1783–1793. https://doi.org/10.1016/S0098‐1354(02)00153‐9

Patnaik, S., Yang, X.‐S., & Nakamatsu, K. (2017). Nature‐inspired computing and optimization (Vol. 10). Springer.

Pedersen, M. E. H. (2010). Good parameters for particle swarm optimization. Hvass Lab., Copenhagen, Denmark. Tech. Rep. HL1001, 1551–3203.

Pluhacek, M., Senkerik, R., Viktorin, A., Kadavy, T., & Zelinka, I. (2018). A review of real‐world applications of particle swarm optimization algorithm. Lecture Notes in Electrical Engineering, 115–122. https://doi.org/10.1007/978‐3‐319‐69814‐4_11

Pye, H. O. T., Place, B. K., Murphy, B. N., Seltzer, K. M., D'Ambro, E. L., Allen, C., et al. (2022). Linking gas, particulate, and toxic endpoints to air emissions in the Community Regional Atmospheric Chemistry Multiphase Mechanism (CRACMM) version 1.0. Atmospheric Chemistry and Physics Discussions, 2022, 1–88. https://doi.org/10.5194/acp‐2022‐695

Schwaab, M., Biscaia, E. C., Jr., Monteiro, J. L., & Pinto, J. C. (2008). Nonlinear parameter estimation through particle swarm optimization. Chemical Engineering Science, 63(6), 1542–1552. https://doi.org/10.1016/j.ces.2007.11.024

Skipper, T. N., D'Ambro, E. L., Wiser, F. C., McNeill, V. F., Schwantes, R. H., Henderson, B. H., et al. (2024). Role of chemical production and depositional losses on formaldehyde in the Community Regional Atmospheric Chemistry Multiphase Mechanism (CRACMM). Atmospheric Chemistry and Physics, 24(22), 12903–12924. https://doi.org/10.5194/acp‐24‐12903‐2024

Srinivasan, B., Vo, T., Zhang, Y., Gang, O., Kumar, S., & Venkatasubramanian, V. (2013). Designing DNA‐grafted particles that self‐assemble into desired crystalline structures using the genetic algorithm. Proceedings of the National Academy of Sciences, 110(46), 18431–18435. https://doi.org/10.1073/pnas.1316533110

Tong, H., Arangio, A. M., Lakey, P. S. J., Berkemeier, T., Liu, F., Kampf, C. J., et al. (2016). Hydroxyl radicals from secondary organic aerosol decomposition in water. Atmospheric Chemistry and Physics, 16(3), 1761–1771. https://doi.org/10.5194/acp‐16‐1761‐2016

Venkatasubramanian, V., Chan, K., & Caruthers, J. M. (1995). Evolutionary design of molecules with desired properties using the genetic algorithm. Journal of Chemical Information and Computer Sciences, 35(2), 188–195. https://doi.org/10.1021/ci00024a003

Wang, H., Sun, C., Haidn, O., Aliya, A., Manfletti, C., & Slavinskaya, N. (2023). A joint hydrogen and syngas chemical kinetic model optimized by particle swarm optimization. Fuel, 332, 125945. https://doi.org/10.1016/j.fuel.2022.125945

Wang, J., Zhang, R., Yan, Y., Dong, X., & Li, J. M. (2017). Locating hazardous gas leaks in the atmosphere via modified genetic, MCMC and particle swarm optimization algorithms. Atmospheric Environment, 157, 27–37. https://doi.org/10.1016/j.atmosenv.2017.03.009

Wang, Y., Ni, Y., Li, N., Lu, S., Zhang, S., Feng, Z., & Wang, J. (2019). A method based on improved ant lion optimization and support vector regression for remaining useful life estimation of lithium‐ion batteries. Energy Science & Engineering, 7(6), 2797–2813. https://doi.org/10.1002/ese3.460

Wennberg, P. O., Bates, K. H., Crounse, J. D., Dodson, L. G., McVay, R. C., Mertens, L. A., et al. (2018). Gas‐phase reactions of isoprene and its major oxidation products. Chemical Reviews, 118(7), 3337–3390. https://doi.org/10.1021/acs.chemrev.7b00439

Wiser, F., & Chakraborty, A. (2024). fcw2110/ga‐pso‐amore: Amore_pso [Software]. Zenodo. https://doi.org/10.5281/zenodo.11663457

Wiser, F., Place, B. K., Sen, S., Pye, H. O., Yang, B., Westervelt, D. M., et al. (2023). Amore‐isoprene v1. 0: A new reduced mechanism for gas‐phase isoprene oxidation. Geoscientific Model Development, 16(6), 1801–1821. https://doi.org/10.5194/gmd‐16‐1801‐2023

Wolfe, G. M., Kaiser, J., Hanisco, T. F., Keutsch, F. N., de Gouw, J. A., Gilman, J. B., et al. (2016a). Formaldehyde production from isoprene oxidation across nox regimes. Atmospheric Chemistry and Physics, 16(4), 2597–2610. https://doi.org/10.5194/acp‐16‐2597‐2016

Wolfe, G. M., Marvin, M. R., Roberts, S. J., Travis, K. R., & Liao, J. (2016b). The Framework for 0‐d Atmospheric Modeling (f0am) v3.1. Geoscientific Model Development, 9(9), 3309–3319. https://doi.org/10.5194/gmd‐9‐3309‐2016

Yang, B., Wiser, F. C., McNeill, V. F., Fiore, A. M., Tao, M., Henze, D. K., et al. (2023). Implementation and evaluation of the Automated Model Reduction (AMORE) version 1.1 isoprene oxidation mechanism in geos‐chem. Environmental Sciences: Atmos, 3(12), 1820–1833. https://doi.org/10.1039/D3EA00121K

Yuan, Y., Yi, H.‐L., Shuai, Y., Wang, F.‐Q., & Tan, H.‐P. (2010). Inverse problem for particle size distributions of atmospheric aerosols using stochastic particle swarm optimization. Journal of Quantitative Spectroscopy and Radiative Transfer, 111(14), 2106–2114. https://doi.org/10.1016/j.jqsrt.2010.03.019

Zhang, J., Tittel, F., Gong, L., Lewicki, R., Griffin, R., Jiang, W., et al. (2016). Support vector machine modeling using particle swarm optimization approach for the retrieval of atmospheric ammonia concentrations. Environmental Modeling & Assessment, 21(4), 531–546. https://doi.org/10.1007/s10666‐015‐9495‐x

Zhang, J.‐R., Zhang, J., Lok, T.‐M., & Lyu, M. R. (2007). A hybrid particle swarm optimization–back‐propagation algorithm for feedforward neural network training. Applied Mathematics and Computation, 185(2), 1026–1037. https://doi.org/10.1016/j.amc.2006.07.025

Zhou, Y., Zhao, C., & Liu, X. (2014). An iteratively adaptive particle swarm optimization approach for solving chemical dynamic optimization problems. CIE Journal, 65(4), 1296–1302.

Word count: 9736

Show less

© 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Atmospheric chemistry is highly complex, and significant reductions in the size of the chemical mechanism are required to simulate the atmosphere. One of the bottlenecks in creating reduced models is identifying optimal numerical parameters. This process has been difficult to automate, and often relies on manual testing. In this work, we present the application of particle swarm optimization (PSO) toward optimizing the stoichiometric coefficients and rate constants of a reduced isoprene atmospheric oxidation mechanism. Using PSO, we are able to achieve up to 28.8% improvement in our error metric when compared to a manually tuned reduced mechanism, leading to a significantly optimized final mechanism. This work demonstrates PSO as a promising and thus far underutilized tool for atmospheric chemical mechanism development.

Details

Title

Evolutionary Optimization of the Reduced Gas‐Phase Isoprene Oxidation Mechanism

Author

Chakraborty, Arijit¹; Wiser, Forwood Cloud¹

; Sen, Siddhartha²; Westervelt, Daniel M.³

; Carter, Reese⁴; McNeill, V. Faye⁵

; Venkatasubramanian, Venkat¹

¹ Department of Chemical Engineering, Columbia University, New York, NY, USA
² Microsoft Research, New York, NY, USA
³ Lamont‐Doherty Earth Observatory of Columbia University, Palisades, NY, USA, NASA Goddard Institute for Space Studies, New York, NY, USA
⁴ The Dalton School, New York, NY, USA
⁵ Department of Chemical Engineering, Columbia University, New York, NY, USA, Department of Earth and Environmental Sciences, Columbia University, New York, NY, USA

Section

Research Article

Publication year

2025

Publication date

May 1, 2025

Publisher

John Wiley & Sons, Inc.

e-ISSN

19422466

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1029/2024MS004511

ProQuest document ID

3211941790

Evolutionary Optimization of the Reduced Gas‐Phase Isoprene Oxidation Mechanism

Jump to:

Full text

Abstract

Details

Suggested sources