SupraFit – An Open Source Qt Based Fitting

Introduction

After the work of Pedersen, Lehn and Cram in the second half of the 20th century (Nobel Prize in 1987 “for their development and use of molecules with structure-specific interactions of high selectivity”), supramolecular chemistry has become a popular field of research. The experimental determination of association constants by means of supramolecular titration experiments plays a major role in the analytical zoo of this research area. Several software packages have been written in the last three decades, each with its own strengths and weaknesses. In times of open science, open data and open source software, some of these older software solutions might no longer be considered state of the art. The most recent tool for supramolecular titration experiments has been developed by the group of Thordarson and is available as an online service via www.supramolecular.org (last checked 17. 01. 2022). As an alternative to both the older offline applications and the online tools, a newly written software package for supramolecular titration experiments called SupraFit is reported here. While online services have the advantage of making data easily Findable, Accessible, Interoperable and Re-usable (FAIR principles), offline applications on local hardware give users control over how computational resources are spent, for example in terms of computation time. As SupraFit is able to perform computationally demanding post-processing, an offline application is better suited for the analysis. However, the FAIR principles will receive more attention in the further development of SupraFit.

SupraFit is written in C++ utilising the Qt Software Development Toolkit^[1] and the Eigen Library.^[2] SupraFit is mainly developed for NMR titration and ITC experiments, providing methods to globally and locally analyse 1 : 1, 2 : 1/1 : 1 and 1 : 1/1 : 2 complexes out of the box. A full statistical analysis based on Monte Carlo simulations and F-Test approaches, which scales well on multicore systems, is implemented, as well as an intuitive user interface to handle several models on a single data set. As SupraFit is open source, custom models can be implemented in the source code, with all functionality, e.g. the statistical analysis, being available for the new models.

Software

Several packages already exist for the analysis of NMR titrations or ITC data, but some of them have not received updates or improvements recently. Additionally, these programs may provide statistical analyses that are not always comparable to each other, as they are based on different theories. A third point is the advantage of software that runs on different operating systems (OS) or is even independent of an OS, although Windows systems dominate the PC market.

In the last decade, the idea of open source software as well as open data has evolved, and more scientific software is not only freely usable, but its source code is also published under the terms of an open source licence, such as GPL^[3] or MIT.^[4] In contrast to SupraFit, the available open source programs are mainly focused on computational chemistry and chemoinformatics.^[5–8]

Some common tools used to analyse supramolecular titration experiments will be listed in the next section, however without any claim to completeness.

NMR Titration

WinEQNMR, initially a DOS program called EQNMR, has been written by M. J. Hynes^[9] and is available for Windows systems. WinEQNMR provides methods for protonation equilibria, hydrolysis of metal ions or stability of metal complexes. An archive containing the binaries was freely downloadable at http://www.nuigalway.ie/chem/Mike/wineqnmr.htm. The website is no longer available, but it can still be accessed via the Wayback Machine (https://web.archive.org/web/20210518005317/http://www.nuigalway.ie/chem/Mike/wineqnmr.htm). However, the password-protected archive containing the program is not available via the Wayback Machine.

HypNMR^[10] is part of the Hyperquad software package developed by Sabatini, Vacca and coworkers, providing tools for different methods such as NMR titration, ITC and spectrophotometry. HypNMR runs on Windows systems and information on how to obtain the software is available upon request.¹ The most recent version according to their website (http://www.hyperquad.co.uk/hypnmr.htm, last checked 17. 01. 2022) is HypNMR2018; the differences to the older versions are not pointed out there.

M. Maeder and P. King founded Jplus Consulting in 2009 and provide a software package called ReactLab to analyse and simulate, for example, equilibrium titrations and kinetics. The software is based on a combination of MatLab and Excel and is available for purchase. More information can be found on their official website: https://jplusconsulting.com (last checked 17. 01. 2022).

Open Data Fit^[11] is a collection of online services provided by P. Thordarson, where titration data can be analysed. The service can be accessed at http://opendatafit.org (last checked 17. 01. 2022). For now, supramolecular experiments^[12] and a demo version for cell viability^[13] are available. The kinetics service is under construction.^[14] BindFit, the part focusing on supramolecular titration, was initially provided as free MatLab scripts included in the tutorial review by Thordarson in 2011.^[15] The latest version supports the analysis of NMR and UV/VIS titrations of typical 1 : 1, 2 : 1 and 1 : 2 systems, with the Python source code being available at https://github.com/echus/supramolecular-apps (last checked 17. 01. 2022). New features such as Monte Carlo simulation based statistics are announced for future versions.

ITC

NanoAnalyze is available from TA Instruments, a company that assembles and sells instruments for several kinds of analysis (thermal, microcalorimetric and rheological). NanoAnalyze is freely available for Windows systems and provides several binding models, analysis of thermograms and statistics based on Monte Carlo simulations. It can be obtained from their website https://www.tainstruments.com/itcrun-dscrun-nanoanalyze-software (last checked 17. 01. 2022).

Harms et al.^[16] released pytc (python itc) as open source software, built on top of python3 to analyse ITC data, with the most important binding models already implemented. The project is hosted on GitHub: https://github.com/harmslab/pytc (last checked 17. 01. 2022). Since it is written in python3, other models can easily be added. Statistical methods like the F-Test or Information Criterion^[17–20] methods are implemented and can be used to assess the performance of the models. A graphical user interface using PyQt5 can be downloaded separately at https://github.com/harmslab/pytc-gui (last checked 17. 01. 2022).

SEDFIT and SEDPHAT form a program package to globally analyse ITC data (gITC), with powerful statistical analysis based on Monte Carlo simulations or the F-Test approach.^[21] It is freely available at https://sedfitsedphat.nibib.nih.gov/software/default.aspx (last checked 17. 01. 2022); however, systems other than Windows are not supported. Thermogram analysis can be performed with NITPIC (http://biophysics.swmed.edu/MBR/software.html, last checked 17. 01. 2022) from Keller et al.^[22] and the results can then be imported into SEDFIT.

Supramolecular Titrations

The theory of complexation and supramolecular titration has already been reviewed in articles by Thordarson,^[15,23] as well as in textbooks like Analytical Methods in Supramolecular Chemistry,^[24] but the main aspects are summarised here:

General Approach

Starting from the general mass balance equations (eq. 1 and 2) for a two-component system, the relationship between the concentration of two components [A] and [B] can be described through the cumulative stability constants (eq. 3). For example individual stability constants for a system with two complex species $A_{a} B_{b}$ defined with $a = b = 1$ and $a = 2, b = 1$ read as in equation 4.

${[A]}_{0} = \sum_{\substack{a = 0 \\ b = 0}}^{l, m} a β_{a b} {[A]}^{a} {[B]}^{b}$ (1)

${[B]}_{0} = \sum_{\substack{a = 0 \\ b = 0}}^{l, m} b β_{a b} {[A]}^{a} {[B]}^{b}$ (2)

$β_{a b} = \prod_{\substack{a = 0 \\ b = 0}}^{l, m} K_{a b}$ (3)

$K_{11} = \frac{[A B]}{[A] [B]} \qquad K_{21} = \frac{[A_{2} B]}{[A] [A B]}$ (4)

Depending on the values for l and m, i.e. the stoichiometry of molecules of A and B that are involved in forming the complex, different systems can be described. SupraFit reports all stability constants as individual logarithmic constants lgK ( $\log_{10} K$ ), in contrast to other software that may report them as plain stability constants K in M⁻¹ or as cumulative constants β.
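The relation between the three reporting conventions can be illustrated with a few lines of Python. This is only a sketch: the function names and the numerical values are chosen for illustration and are not part of SupraFit.

```python
import math


def lg_to_k(lg_k):
    """Convert a logarithmic constant lgK into the plain constant K."""
    return 10.0 ** lg_k


def k_to_lg(k):
    """Convert a plain constant K into lgK (decadic logarithm)."""
    return math.log10(k)


def cumulative_lg_beta(lg_k_steps):
    """lg(beta) of a complex is the sum of the stepwise lgK values leading to it."""
    return sum(lg_k_steps)


# Hypothetical 2:1/1:1 system with lgK11 = 3 and lgK21 = 2
lg_k11, lg_k21 = 3.0, 2.0
k11 = lg_to_k(lg_k11)                              # stepwise constant in M^-1
lg_beta21 = cumulative_lg_beta([lg_k11, lg_k21])   # lg(K11 * K21)
beta21 = lg_to_k(lg_beta21)                        # cumulative constant in M^-2
```

Because the stepwise constants multiply (eq. 3), their logarithms simply add, which is one practical reason for reporting lgK values.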

Determining Stability Constants

The determination of association constants with titration experiments is based on the idea that each component influences the response signal: Assuming a linear relationship between the amount of species and the response signal, equation 5 can be formed, where each component X_i contributes to the overall signal y by a factor Y_i.

$y = \sum_{i} Y_{i} {[X]}_{i}$ (5)

NMR Titration

Upon performing ¹H-NMR titration, the chemical shift of specific protons bound to X (e.g. the receptor) changes during complexation due to non-covalent interactions with another component. Depending on the kinetics of the complex formation, fast or slow exchange can be observed. SupraFit, like most of the other applications, can only handle fast exchange, where the observed signal is the weighted average of all signals of the specific proton in the components, e.g. the shift of a proton assigned to the isolated receptor and one assigned to the complex.

Since the relative change of the chemical shifts is of interest, it is defined as the ratio of each component to the reference component, in the following case the first component. For NMR titration, equation 5 reads as follows:

$Δ δ = \sum_{i} δ_{i} \frac{{[X]}_{i}}{{[X]}_{0}}$ (6)

On the other hand, for slow exchange, a separate signal of the specific proton can be observed for each component, where the intensity is related to the amount of the species.^[25]

UV/VIS Titration

In UV/VIS titration, the overall absorbance is the sum of the individual extinction coefficients ε_i multiplied by the concentration of the respective component. The equation holds true for low concentrations that fulfil the Lambert-Beer law.

$A_{abs} = \sum_{i} ϵ_{i} {[X]}_{i}$ (7)

ITC

General Aspects

The basic part of isothermal titration calorimetry is the observation of the change of heat due to complex formation in a reaction cell while keeping the temperature constant. The guest component B is sequentially added to a solution of the host component A. Details on the method can be found in the literature by Freire,^[26,27] in Analytical Methods in Supramolecular Chemistry^[28] by Schmidtchen, as well as in reviews by Thordarson.^[15,29]

The basic ITC equation 8 describes a sum over all formed complex species multiplied with the corresponding heat of formation. It is written here in the notation of Tables 1–4, with j running over the complex species X_j and i over the injections. In contrast to NMR and UV/VIS titrations, the pure host signal does not contribute to the observed heat. At the current state, SupraFit only makes use of models that are of fixed stoichiometry and equal to the well-known NMR titration models, which are summarised in section 3.3. Furthermore, SupraFit handles titration experiments with both a fixed-volume setup and a setup with variable volume.

$Q_{i} = V \sum_{j} Δ H_{j} ({[X_{j}]}_{i} - {[X_{j}]}_{i - 1} \cdot (1 - \frac{v}{V}))$ (8)

Handling Dilution Effects

Since the concentration of B itself changes upon each injection of B, part of the signal can be attributed to a heat of dilution (Q_d), which cannot be neglected. Assuming a linear relationship between the concentration of B and the response heat signal, one can use equation 9 to add blank effects to the experiment (eq. 10), as done for example in pytc.^[16] In equation 10, j runs over the complex species X_j and i over the injections, as in Tables 1–4.

$Q_{d, i} = m_{δ} [B_{i}] + n_{δ}$ (9)

$Q_{i} = V \sum_{j} Δ H_{j} ({[X_{j}]}_{i} - {[X_{j}]}_{i - 1} \cdot (1 - \frac{v}{V})) + Q_{d, i}$ (10)

As a consequence, different approaches to deal with the dilution can be realised using SupraFit:

Using equation 10, two parameters are introduced ( $m_{δ}$ and $n_{δ}$ ) and fitted alongside the stability constants and the heat of formation to the experimental titration curve. An additional blank experiment does not have to be performed.
The two blank parameters ( $m_{δ}$ and $n_{δ}$ ) are obtained from an independent blank experiment and are added as fixed terms to equation 10.
The result of the independent blank experiment is subtracted from the titration experiment, and the difference is then used to fit the parameters in equation 8.
The blank parameters are fitted to a blank experiment and the titration simultaneously, while the stability constants and the heat of formation are fitted to the titration experiment only (eq. 10).
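How the linear dilution term enters the calculated heats can be sketched for the simplest case, a 1 : 1 model, using the ITC expression of Table 1 and the quadratic equation 12. The sketch below is not taken from the SupraFit source. It assumes that [B_i] in equation 9 is the total guest concentration in the cell after injection i, that the total concentrations per injection are supplied by the caller, and that V and v are the cell and injection volumes; all function and parameter names are hypothetical.

```python
import math


def ab_1to1(a0, b0, k11):
    """[AB] as the physically meaningful (smaller) root of equation 12."""
    p = k11 * (a0 + b0) + 1.0
    return (p - math.sqrt(p * p - 4.0 * k11 * k11 * a0 * b0)) / (2.0 * k11)


def heats_with_dilution(a0_list, b0_list, k11, dh, cell_v, inj_v, m_dil, n_dil):
    """Heat per injection for a 1:1 model plus the linear dilution term (eq. 9)."""
    heats = []
    ab_prev = 0.0  # no complex before the first injection
    for a0, b0 in zip(a0_list, b0_list):
        ab = ab_1to1(a0, b0, k11)
        # Table 1, ITC row: heat released by newly formed complex
        q = cell_v * (ab - ab_prev * (1.0 - inj_v / cell_v)) * dh
        # Equation 9: dilution heat, assumed linear in the guest concentration
        q += m_dil * b0 + n_dil
        heats.append(q)
        ab_prev = ab
    return heats


# Illustrative numbers only: concentrations in mol/L, volumes in L, dh in J/mol
a0_list = [1.0e-3, 1.0e-3, 1.0e-3]
b0_list = [1.0e-4, 2.0e-4, 3.0e-4]
heats = heats_with_dilution(a0_list, b0_list, 1.0e4, -2.0e4, 1.4e-3, 1.0e-5, 0.0, 0.0)
```

In the first approach listed above, m_dil and n_dil would be free fit parameters; in the second they would be fixed to values from a blank experiment; setting both to zero corresponds to fitting blank-subtracted data.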

Thermogram Handling

SupraFit provides ready-to-use thermogram integration functions with elementary baseline corrections for *.itc and plain thermogram files consisting of columns with time and heat per time, respectively. The baseline is calculated separately for each peak as a linear function, and the integration range can be adjusted manually. In case of highly irregular baselines, other software packages may be more suitable, such as NITPIC or software provided by the hardware supplier. After integration using third-party software, the plain data can be processed with SupraFit.

1 : 1 Model

The simplest form of complexes with two components are the 1 : 1 complexes ( $a = 1$ , $b = 1$ ), which are formed according to equation 11. K₁₁ denotes the step-wise complex formation constant. The approach is sketched in Appendix B, resulting in equation 12.

$A + B \rightleftharpoons AB \qquad K_{11} = \frac{[A B]}{[A] [B]}$ (11)

$0 = K_{11} {[A B]}^{2} - [A B] (K_{11} {[A]}_{0} + K_{11} {[B]}_{0} + 1) + K_{11} {[A]}_{0} {[B]}_{0}$ (12)

Using the solution for $[A B]$ from the quadratic equation 12, all remaining concentrations can be calculated according to the mass-balance equations. The resulting equations for the 1 : 1 models used in SupraFit are summarised in Table 1, where only the signals of the host and the complex are taken into account. Signals of component B are ignored. For UV/VIS this holds true if the component is not UV/VIS active at the selected wavelength.

Table 1 Equations used in 1 : 1 models.

Method	Equation
NMR	$δ_{c a l c} = δ_{A} \frac{[A]}{[A]_{0}} + δ_{A B} \frac{[A B]}{[A]_{0}}$
UV/VIS	$A_{A b s, c a l c} = ϵ_{A} [A] + ϵ_{A B} [A B]$
ITC	$Q_{i} = V ([A B]_{i} - [A B]_{i - 1} \cdot (1 - \frac{v}{V})) \cdot Δ H_{A B}$
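As a worked illustration of equation 12 and the NMR row of Table 1, the following sketch computes the equilibrium concentrations and the calculated shift for a 1 : 1 system. It is not the SupraFit implementation; the function names and the numerical values are assumptions made for this example.

```python
import math


def solve_1to1(a0, b0, k11):
    """Return ([A], [B], [AB]) from the quadratic equation 12 and the mass balance."""
    p = k11 * (a0 + b0) + 1.0
    # The smaller root is the physical one, since [AB] cannot exceed a0 or b0
    ab = (p - math.sqrt(p * p - 4.0 * k11 * k11 * a0 * b0)) / (2.0 * k11)
    return a0 - ab, b0 - ab, ab


def nmr_shift_1to1(a0, b0, k11, delta_a, delta_ab):
    """Calculated shift as the population-weighted average (Table 1, NMR row)."""
    a, _, ab = solve_1to1(a0, b0, k11)
    return delta_a * a / a0 + delta_ab * ab / a0


# Illustrative values: 1 mM host, 2 mM guest, K11 = 1000 M^-1
a, b, ab = solve_1to1(1.0e-3, 2.0e-3, 1.0e3)
shift = nmr_shift_1to1(1.0e-3, 2.0e-3, 1.0e3, delta_a=7.0, delta_ab=8.0)
```

Without guest the calculated shift equals that of the free host, and it approaches the shift of the complex as the host becomes saturated.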

2 : 1/1 : 1 Model

A model of 2 : 1/1 : 1 stoichiometry is defined through the following relationship:

$A + B \rightleftharpoons AB \qquad K_{11} = \frac{[A B]}{[A] [B]}$ (13)

$A + AB \rightleftharpoons A_{2} B \qquad K_{21} = \frac{[A_{2} B]}{[A] [A B]}$ (14)

The stepwise stability constants K₁₁ and K₂₁ combine to the cumulative association constant as follows:

$K_{11} K_{21} = \frac{[A B]}{[A] [B]} \frac{[A_{2} B]}{[A] [A B]} = \frac{[A_{2} B]}{{[A]}^{2} [B]} = β_{21}$ (15)

The solution for the concentration of A is given in equation 47 in the appendix.^[15] The corresponding equations to describe a 2 : 1/1 : 1 model used within SupraFit are summarised in Table 2, with the guest molecule being silent. In the case of ITC experiments, 2 : 1/1 : 1 models are not used regularly, but have already been reported.^[30,31]

Table 2 Equations used in 2 : 1/1 : 1 models.

Method	Equation
NMR	$δ_{c a l c} = δ_{A} \frac{[A]}{{[A]}_{0}} + δ_{A B} \frac{[A B]}{{[A]}_{0}} + 2 δ_{A_{2} B} \frac{[A_{2} B]}{{[A]}_{0}}$
UV/VIS	$A_{a b s, c a l c} = ϵ_{A} [A] + ϵ_{A B} [A B] + 2 ϵ_{A_{2} B} [A_{2} B]$
ITC	$Q_{i} = V (({[A B]}_{i} - {[A B]}_{i - 1} \cdot (1 - \frac{v}{V})) \cdot Δ H_{A B} + ({[A_{2} B]}_{i} - {[A_{2} B]}_{i - 1} \cdot (1 - \frac{v}{V})) \cdot Δ H_{A_{2} B})$

1 : 1/1 : 2 Model

The 1 : 1/1 : 2 system is defined through the following law of mass action:

$A + B \rightleftharpoons AB \qquad K_{11} = \frac{[A B]}{[A] [B]}$ (16)

$AB + B \rightleftharpoons {AB}_{2} \qquad K_{12} = \frac{[A B_{2}]}{[A B] [B]}$ (17)

The concentration of unbound guest can be calculated analogously to the 2 : 1/1 : 1 systems using equation 50,^[15] where the free host concentration can be determined using the mass-balance equations for 1 : 1/1 : 2 system.

Having the free and complexed host concentrations, the signals are calculated in SupraFit using the equations in Table 3, with the guest molecule being silent.

Table 3 Equations used in 1 : 1/1 : 2 models.

Method	Equation
NMR	$δ_{c a l c} = δ_{A} \frac{[A]}{{[A]}_{0}} + δ_{A B} \frac{[A B]}{{[A]}_{0}} + δ_{A B_{2}} \frac{[A B_{2}]}{{[A]}_{0}}$
UV/VIS	$A_{a b s, c a l c} = ϵ_{A} [A] + ϵ_{A B} [A B] + ϵ_{A B_{2}} [A B_{2}]$
ITC	$Q_{i} = V (({[A B]}_{i} - {[A B]}_{i - 1} \cdot (1 - \frac{v}{V})) \cdot Δ H_{A B} + ({[A B_{2}]}_{i} - {[A B_{2}]}_{i - 1} \cdot (1 - \frac{v}{V})) \cdot Δ H_{A B_{2}})$

2 : 1/1 : 1/1 : 2 Model

The last titration model implemented in SupraFit is the mixed model with 2 : 1, 1 : 1 and 1 : 2 species.

$A + B \rightleftharpoons AB \qquad K_{11} = \frac{[A B]}{[A] [B]}$ (18)

$AB + B \rightleftharpoons {AB}_{2} \qquad K_{12} = \frac{[A B_{2}]}{[A B] [B]}$ (19)

$AB + A \rightleftharpoons A_{2} B \qquad K_{21} = \frac{[A_{2} B]}{[A B] [A]}$ (20)

The solution of this system is defined by the mass-balance equations

${[A]}_{0} = [A] + β_{11} [A] [B] + β_{12} [A] {[B]}^{2} + 2 β_{21} {[A]}^{2} [B]$ (21)

${[B]}_{0} = [B] + β_{11} [A] [B] + 2 β_{12} [A] {[B]}^{2} + β_{21} {[A]}^{2} [B]$ (22)

The mass-balance equations can be rearranged into two functions whose roots yield [A] for a given [B] (eq. 23) and [B] for a given [A] (eq. 24); here $β_{11} = K_{11}$ :

$[A] ([B]) = (2 β_{21} [B]) \cdot {[A]}^{2} + (β_{12} {[B]}^{2} + K_{11} [B] + 1) \cdot [A] - {[A]}_{0}$ (23)

$[B] ([A]) = (2 β_{12} [A]) \cdot {[B]}^{2} + (β_{21} {[A]}^{2} + K_{11} [A] + 1) \cdot [B] - {[B]}_{0}$ (24)

The solution to this equilibrium system is obtained using an iterative procedure: The initial concentrations are guessed as

$[A] = \min ({[A]}_{0}, {[B]}_{0}) / 10, \qquad [B] = B ([A])$ (according to eq. 24),

followed by the calculation of [A] and [B] with equations 23 and 24. The calculations are repeated until the change in the equilibrium concentrations falls below a threshold. As an alternative to this algorithm, methods to solve any system of equilibria based on a Gauss-Newton optimisation have been published.^[32] A Levenberg-Marquardt optimisation has been tested in SupraFit, but was disabled.²
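The iterative procedure can be sketched as follows. This is an illustration under stated assumptions rather than the SupraFit code: equations 23 and 24 are each treated as a quadratic whose positive root is taken, the convergence criterion and its default value are chosen for this example, and all names are hypothetical.

```python
import math


def positive_root(qa, qb, qc):
    """Positive root of qa*x^2 + qb*x + qc = 0 for qa >= 0, qb > 0, qc <= 0.

    The rationalised form avoids cancellation and also covers qa == 0.
    """
    return -2.0 * qc / (qb + math.sqrt(qb * qb - 4.0 * qa * qc))


def solve_mixed(a0, b0, k11, k21, k12, tol=1e-12, max_iter=10000):
    """Alternate between eq. 23 and eq. 24 until [A] and [B] stop changing."""
    beta21 = k11 * k21
    beta12 = k11 * k12

    def a_of_b(b):  # root of equation 23 for fixed [B]
        return positive_root(2.0 * beta21 * b, beta12 * b * b + k11 * b + 1.0, -a0)

    def b_of_a(a):  # root of equation 24 for fixed [A]
        return positive_root(2.0 * beta12 * a, beta21 * a * a + k11 * a + 1.0, -b0)

    a = min(a0, b0) / 10.0  # initial guess as given in the text
    b = b_of_a(a)
    for _ in range(max_iter):
        a_new = a_of_b(b)
        b_new = b_of_a(a_new)
        change = abs(a_new - a) + abs(b_new - b)
        a, b = a_new, b_new
        if change < tol * (a0 + b0):
            break
    return {"A": a, "B": b, "AB": k11 * a * b,
            "A2B": beta21 * a * a * b, "AB2": beta12 * a * b * b}


# Illustrative values only
species = solve_mixed(1.0e-3, 1.0e-3, k11=1.0e3, k21=1.0e2, k12=1.0e2)
```

For the values used here the alternating scheme converges within a few dozen iterations; no general convergence guarantee is implied by this sketch.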

Having the concentrations of the free and complex species, the signals are calculated in SupraFit using the equations listed in Table 4, with the guest molecule being silent.

Table 4 Equations used in 2 : 1/1 : 1/1 : 2 models.

Method	Equation
NMR	$δ_{c a l c} = δ_{A} \frac{[A]}{{[A]}_{0}} + δ_{A B} \frac{[A B]}{{[A]}_{0}} + 2 δ_{A_{2} B} \frac{[A_{2} B]}{{[A]}_{0}} + δ_{A B_{2}} \frac{[A B_{2}]}{{[A]}_{0}}$
UV/VIS	$A_{a b s, c a l c} = ϵ_{A} [A] + ϵ_{A B} [A B] + 2 ϵ_{A_{2} B} [A_{2} B] + ϵ_{A B_{2}} [A B_{2}]$
ITC	$Q_{i} = V (({[A B]}_{i} - {[A B]}_{i - 1} \cdot (1 - \frac{v}{V})) \cdot Δ H_{A B} + ({[A_{2} B]}_{i} - {[A_{2} B]}_{i - 1} \cdot (1 - \frac{v}{V})) \cdot Δ H_{A_{2} B} + ({[A B_{2}]}_{i} - {[A B_{2}]}_{i - 1} \cdot (1 - \frac{v}{V})) \cdot Δ H_{A B_{2}})$

Cooperativity

Cooperative effects describe increasing or decreasing step-wise binding constants in multi-step systems and have been discussed in the literature.^[29,33,34] Following the notation of Thordarson,^[11,15,29] four different types can be distinguished: full, noncooperative, additive and statistical. These models can be applied to the 2 : 1 and 1 : 2 complex species in the mixed models in SupraFit. The different kinds of relationship that can be set up in the model options are summarised in Table 5.

Table 5 Different cooperative binding models define the relationship of the estimated model parameters. The relationships are taken from Hibbert and Thordarson, 2016.^[11] K₂ refers to either K₁₂ or K₂₁, depending on the stoichiometry of the complex. Similarly, $δ_{Δ 2}$ refers to the signal of either the 2 : 1 or 1 : 2 species, whereas $δ_{Δ 1}$ denotes the 1 : 1 species.

model	K	δ
full	K₁ $\neq$ 4K₂	$δ_{Δ 2} \neq δ_{Δ 1}$
noncooperative	K₁=4K₂	$δ_{Δ 2} \neq δ_{Δ 1}$
additive	K₁ $\neq$ 4K₂	$δ_{Δ 2} = δ_{Δ 1}$
statistical	K₁=4K₂	$δ_{Δ 2} = δ_{Δ 1}$

Michaelis-Menten Theory

Michaelis-Menten theory is usually used to describe how the rate r of an enzymatic reaction, that transforms a substrate S to a product P (eq. 25), depends on the amount of substrate S₀.^[35]

$E + S \underset{k_{- 1}}{\overset{k_{1}}{\rightleftharpoons}} ES \qquad ES \underset{k_{- 2}}{\overset{k_{2}}{\rightleftharpoons}} E + P$ (25)

The rate is defined as

$r = \frac{v_{max} \cdot S}{K_{M} + S}$ (26)

At high concentrations of S, the rate r tends towards v_max. A linearised form of the Michaelis-Menten equation, the Lineweaver-Burk form (eq. 27), is usually used to determine K_M and v_max.

$\frac{1}{r} = \frac{K_{M}}{v_{max}} \frac{1}{S} + \frac{1}{v_{max}}$ (27)

SupraFit provides a model to determine K_M and v_max using nonlinear regression. The starting guess is calculated using eq. 27.
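The two-stage idea, a linearised starting guess followed by nonlinear regression on the untransformed data, can be sketched as follows. Note that this example uses a plain Gauss-Newton iteration for brevity, not the Levenberg-Marquardt algorithm employed by SupraFit, and that all names and data are hypothetical.

```python
def lineweaver_burk_guess(s, r):
    """Least-squares line through (1/S, 1/r): slope = K_M/v_max, intercept = 1/v_max."""
    x = [1.0 / si for si in s]
    y = [1.0 / ri for ri in r]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    v_max = 1.0 / intercept
    return v_max, slope * v_max


def refine(s, r, v_max, k_m, steps=50):
    """Gauss-Newton refinement of (v_max, K_M) against equation 26."""
    for _ in range(steps):
        j_v = [si / (k_m + si) for si in s]                  # d r / d v_max
        j_k = [-v_max * si / (k_m + si) ** 2 for si in s]    # d r / d K_M
        res = [ri - v_max * si / (k_m + si) for si, ri in zip(s, r)]
        # Normal equations (J^T J) * step = J^T * residuals, solved for 2 parameters
        a11 = sum(u * u for u in j_v)
        a12 = sum(u * w for u, w in zip(j_v, j_k))
        a22 = sum(w * w for w in j_k)
        g1 = sum(u * e for u, e in zip(j_v, res))
        g2 = sum(w * e for w, e in zip(j_k, res))
        det = a11 * a22 - a12 * a12
        v_max += (a22 * g1 - a12 * g2) / det
        k_m += (a11 * g2 - a12 * g1) / det
    return v_max, k_m


# Synthetic, noise-free data generated with v_max = 2 and K_M = 0.5
substrate = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
rates = [2.0 * si / (0.5 + si) for si in substrate]
guess = lineweaver_burk_guess(substrate, rates)
fitted = refine(substrate, rates, *guess)
```

With noisy data the Lineweaver-Burk estimate is biased because the reciprocal transformation distorts the error structure, which is why it serves only as a starting point for the nonlinear fit.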

Nonlinear Least-squares Regression

The set of unknown parameters $\underline{θ}$ that describe the relation between the independent data x and the experimental data y_exp (eq. 28) has to be adjusted to minimise the sum of squared errors (SSE, eq. 29). In the case of NMR titrations, $\underline{θ}$ corresponds to the stability constants and chemical shifts of each component, x to the concentrations and y to the observed chemical shifts. For ITC experiments, $\underline{θ}$ again comprises the stability constants as well as the heats of formation and, optionally, the dilution parameters. The integrated peaks of a thermogram form y, and the concentrations remain the independent data x.

For the nonlinear problem, the Levenberg-Marquardt algorithm,^[36,37] as implemented in Eigen, is used.

$y_{exp, i} = f (θ, x_{i}) + e_{i}$ (28)

$SSE = \sum_{i} {(y_{exp, i} - y_{calc, i})}^{2} = \sum_{i} e_{i}^{2} \to 0$ (29)

y_exp,i denotes the experimentally observed value at i, y_calc,i the estimate of the observed value according to the model parameters and e_i the residual at each data point. The parameters θ are henceforth referred to as $\hat{θ}$ in case they are the best-fit parameters after least-squares optimisation. The fit can be characterised using the standard deviation of the residuals σ_fit (eq. 30), SE_y (eq. 31) and χ² (eq. 32):^[15]

$σ_{fit} = \sqrt{\frac{\sum_{i} e_{i}^{2}}{N - 1}}$ (30)

$SE_{y} = \sqrt{\frac{\sum_{i} e_{i}^{2}}{N - k}}$ (31)

$χ^{2} = \frac{\sum_{i} e_{i}^{2}}{N - k - 1}$ (32)

SE_y is the corrected standard deviation with respect to the number of parameters (k) in the applied model.
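The quantities characterising a fit can be computed directly from the residuals. The following sketch assumes that σ_fit and SE_y, being standard deviations, are the square roots of the summed squared residuals divided by N − 1 and N − k, respectively; the function name and the data are made up for illustration.

```python
import math


def fit_statistics(y_exp, y_calc, k):
    """Return SSE, sigma_fit, SE_y and chi^2 for a model with k parameters."""
    residuals = [ye - yc for ye, yc in zip(y_exp, y_calc)]
    n = len(residuals)
    sse = sum(e * e for e in residuals)
    return {
        "SSE": sse,
        "sigma_fit": math.sqrt(sse / (n - 1)),  # assumed square-root form
        "SE_y": math.sqrt(sse / (n - k)),       # corrected for k parameters
        "chi2": sse / (n - k - 1),
    }


# Five hypothetical data points and a two-parameter model
stats = fit_statistics([1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.1, 3.9, 5.1], k=2)
```

Since N − k is smaller than N − 1 for any model with more than one parameter, SE_y is always at least as large as σ_fit for the same residuals.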

Features

General

An introduction to SupraFit is not part of this article; it can be found in the SupraFit Quickstart.^[38] Nevertheless, the main aspects are summarised here: The SupraFit package contains two binaries, suprafit.exe providing the graphical user interface (GUI) and suprafit_cli.exe providing a command line interface. The GUI comes with all basic functionalities for loading and saving data sets as well as thermogram integration in the case of ITC experiments. Most of the results obtained with SupraFit are provided as adjustable charts and text information, and the diagrams can be exported to *.png files. Many charts presented in this article were exported directly from SupraFit; the remaining charts, mainly the boxplots, are LaTeX and TikZ based. A screenshot of the main window can be found in Figure 1.

FIGURE:

Screenshot of the main window with the dialog box to import thermograms open.

SupraFit reads simple table files as well as *.itc files. For the latter, the thermogram import is straightforward. Additionally, data simulation and basic experimental planning are available with the current functions. More details on the usage of SupraFit are available in the quickstart, which can be downloaded from the GitHub webpage at https://github.com/conradhuebler/SupraFit.

Technical Aspects and Implementation

SupraFit is written in C++ relying on the C++14 standard and should compile on every platform that is supported by Qt and Eigen. The model implementation makes use of object-oriented programming, so that new models can be implemented easily. A detailed description of the implementation is beyond the scope of this article, but a short summary is given:

The source code is separated into four parts: (1) the core components containing the models, source code for optimisation and collected mathematical tools. Statistical analysis is implemented in the second part (2). Both parts, (1) and (2), are independent of any user interface and provide the functionalities for the pure command line application suprafit_cli.exe (3) and the graphical user interface suprafit.exe (4). Due to the separation of the user interface (4) from the models (1) and the statistical functions (2), interfaces with other programs or environments can be realised. Basic work to establish an interface to python has already been done, but is not available in the recent stable version of SupraFit.

The core part holds the functionality to store the experimental data (DataClass), which is realised using a shared data pointer. Model preparation is done in the abstract class AbstractModel, which is based on that DataClass. Therefore, each implemented model inherits from AbstractModel and DataClass, respectively (Figure 2). In the specific model implementation, the equations of the model and the number of input parameters have to be defined, as well as the names of each parameter. More details can be found in the source code documentation for the AbstractModel, AbstractTitrationModel and Michaelis-Menten-Model.^[38] A short summary on how to implement new models is given in the GitHub repository at https://github.com/conradhuebler/SupraFit/tree/master/docs. Furthermore, basic work to bring scripted models to SupraFit has been done in the development version. However, scripted models cannot be applied to titration experiments yet.

FIGURE:

Inheritance relationship in SupraFit's model implementation. To implement a new model, a C++ class has to be derived from the AbstractModel class and the most important virtual functions have to be implemented.

Parallelisation is mostly done with threads, utilising QThreadPool and QRunnable, but individual parts use OpenMP. Data are stored in the JSON format (*.json) or as Zip-compressed JSON (*.suprafit).

Statistical Tools and Further Analysis

Confidence Intervals

Parameter (θ) estimation is the main question in regression, as it allows the rational analysis and comparison of data sets and experiments. Yet the knowledge of θ is often not sufficient for rational analysis,^[39] as the best fit values may differ for several performed experiments. The confidence interval of a parameter θ_i estimates the range $[θ_{i, -}, θ_{i, +}]$ , within which the true parameter $\tilde{θ_{i}}$ can be expected. However, the standard approach used in (multiple) linear regression cannot be applied for non-linear problems. SupraFit provides two basic routes to approximate the confidence interval, both being described in the literature before. Explicit references will be given in each section. One of the goals of this article and SupraFit regarding the statistical tools is not to have one correct way to calculate confidence intervals, but rather present the already known techniques, provide an easy way to access those and show some examples on how these methods can be applied to parameter estimation problems.

Confidence Intervals by Monte Carlo Simulations and Percentile Method

A powerful tool that is used in many fields of science is the Monte Carlo simulation.^[40] Apart from confidence calculations, it has already been applied to both ITC and NMR titration.^[41–43] Its application to calculate confidence intervals has been reported for titration experiments by Thordarson^[15] and in general by Motulsky and Christopoulos.^[44] The confidence intervals from Monte Carlo simulations are obtained using the percentile method, which has been discussed alongside resampling methods by Efron.^[39] Efron noted that the section dealing with confidence intervals “is highly speculative in content.”^[39]

The basic idea of the Monte Carlo approach is to theoretically repeat the performed experiment several times (T). A single theoretical step is realised by adding a random error ε_i to y_calc,i, which yields a new set of data mimicking the original experimental data including realistic errors. These data are then used to estimate a new set $\underline{θ}$ . Performing these steps T times is denoted as Monte Carlo (MC) simulation within this context.

Two main approaches to define the errors ε are implemented in SupraFit: (a) they are drawn from a normal distribution with zero mean, $ϵ \in N (μ = 0, σ^{2})$ , or (b) they are randomly chosen from the absolute errors obtained after the successful fit ( $ϵ \in \underline{e}$ ). The latter is called bootstrapping (BS) in SupraFit and may be interpreted as a mixture of a typical Monte Carlo simulation and a resampling technique. Bootstrapping is one of the resampling plans discussed by Efron.^[39,45] More recent discussions of and problems with the bootstrapping method can be found in Canty et al.^[46] and in Efron and Hastie.^[47]

The applied standard deviation σ_MC in approach (a) can be taken from SE_y, σ_fit or a manually defined value, where SE_y is the default choice as proposed by Motulsky and Christopoulos, since it is the corrected standard deviation (eq. 31). The $1 - 2 α$ confidence interval for each model parameter is then calculated using the percentile method:

$θ_{i, -} = \widehat{CDF}^{- 1} (α) \qquad θ_{i, +} = \widehat{CDF}^{- 1} (1 - α)$ (33)

which results in the 95 % confidence interval if $α = 0.025$ . In SupraFit, this is realised by collecting all model parameters for each Monte Carlo step and then taking the $α \cdot T$ -th and $(1 - α) \cdot T$ -th entry of the ordered list of the corresponding parameter. More advanced percentile methods, which are available in Octave or R, are not implemented, so for a smaller number of steps T the results differ from those obtained with the standard approach using the quantile function in Octave or R.³ More robust methods will be implemented in future releases. Efron proposed 2000 steps as the minimum for bootstrapping methods,^[47] which is taken as the default for all Monte Carlo simulations in conjunction with the percentile method. Since Monte Carlo simulations are parallelised,⁴ they benefit from the multicore architecture of modern desktop computers. Monte Carlo results are reported as histogram-like charts as shown in Figure 3. The box represents the 95 % confidence interval, the dash-dotted line the estimated parameter. The individual bins are not plotted as typical bars but rather as a line plot.

FIGURE:

Standard representation of a histogram-like chart obtained after performing a Monte Carlo simulation.
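The Monte Carlo procedure with the percentile method can be sketched in a few lines. To keep the example self-contained, a straight line with a closed-form least-squares solution stands in for a titration model; this substitution, the function names and the data are assumptions of the sketch and do not reflect the SupraFit implementation.

```python
import random


def fit_line(x, y):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return slope, my - slope * mx


def monte_carlo_ci(x, y_exp, steps=2000, alpha=0.025, bootstrap=False, seed=1):
    """Percentile confidence interval of the slope from simulated data sets."""
    rng = random.Random(seed)
    slope, intercept = fit_line(x, y_exp)
    y_calc = [slope * xi + intercept for xi in x]
    residuals = [ye - yc for ye, yc in zip(y_exp, y_calc)]
    se_y = (sum(e * e for e in residuals) / (len(x) - 2)) ** 0.5  # k = 2 parameters
    slopes = []
    for _ in range(steps):
        if bootstrap:   # approach (b): draw errors from the fit residuals
            noise = [rng.choice(residuals) for _ in x]
        else:           # approach (a): normally distributed errors with sigma = SE_y
            noise = [rng.gauss(0.0, se_y) for _ in x]
        y_sim = [yc + e for yc, e in zip(y_calc, noise)]
        slopes.append(fit_line(x, y_sim)[0])
    slopes.sort()
    lower = slopes[round(alpha * steps)]
    upper = slopes[min(round((1.0 - alpha) * steps), steps - 1)]
    return slope, lower, upper


# Hypothetical data: y = 2x + 1 with small deviations
x = [float(i) for i in range(10)]
dev = [0.1, -0.2, 0.15, -0.05, 0.0, 0.2, -0.1, -0.15, 0.05, 0.1]
y_exp = [2.0 * xi + 1.0 + d for xi, d in zip(x, dev)]
mc_result = monte_carlo_ci(x, y_exp)
bs_result = monte_carlo_ci(x, y_exp, bootstrap=True)
```

Taking plain entries of the sorted list, as done here, corresponds to the simple percentile method described above; interpolating quantile functions may give slightly different limits for small numbers of steps.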

As an alternative to the variation of y_calc, Thordarson proposed the variation of the input data, which are the initial concentrations of host and guest molecules in the case of NMR titration.^[15] This variation can be performed alongside standard Monte Carlo simulations. To the best of the authors' knowledge, confidence interval calculations have not been reported for this variation; however, percentiles can be calculated in the same way.

Confidence Intervals using the F-Test Approach

The F-Test approach to confidence intervals was first proposed by Box^[48] and Beale,^[49] and further outlined by Beechem^[50] as well as Bates and Watts.^[51] Taking the least-squares estimated set of parameters $\hat{θ}$ , the confidence interval includes all values θ that cannot be statistically distinguished from the best-fit estimate $\hat{θ}$ . This can be formulated as the hypothesis $H_{0} : θ = \hat{θ}$ and the alternative $H_{A} : θ \neq \hat{θ}$ . The decision is based on the F-Test (eq. 34), where the ratio of $SSE (θ)$ and $SSE (\hat{θ})$ has to be smaller than the value that defines the $(1 - α) \cdot$ 100 % confidence interval.^[52]

$\frac{SSE (θ) - SSE (\hat{θ})}{SSE (\hat{θ})} \leq \frac{K}{N - K} F_{N, N - K}^{α}$ (34)

$SSE_{max} = SSE (θ) \leq SSE (\hat{θ}) \cdot (1 + F_{N, N - K}^{α} \frac{K}{N - K})$ (35)

In equation 34, K refers to the number of parameters, N to the number of data points and $F_{N, N - K}^{α}$ to the critical value of the F-distribution for the given degrees of freedom and the desired confidence level. A graphical interpretation is given in Figure 4. The sum of squares has its minimum $SSE(\hat{θ})$ at the best-fit value; from there, the parameter θ_i can be decreased to θ_i,− or increased to θ_i,+ as long as the error stays smaller than SSE_max.

FIGURE:

Graphical interpretation of the F-Test approach. The confidence interval $[θ_{i, -}, θ_{i, +}]$ is not necessarily symmetric.
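Equation 35 translates directly into a threshold test. The following Python sketch shows this under the assumption that the critical F value is supplied by the caller (for example from a statistics table); the function names and all numbers are hypothetical and do not come from SupraFit.

```python
def sse_max(sse_best, n_params, n_points, f_critical):
    """Largest accepted sum of squares according to equation 35."""
    return sse_best * (1.0 + f_critical * n_params / (n_points - n_params))


def is_accepted(sse_theta, sse_best, n_params, n_points, f_critical):
    """True if a parameter set with SSE(theta) lies inside the confidence region."""
    return sse_theta <= sse_max(sse_best, n_params, n_points, f_critical)


# Placeholder numbers: f_critical is NOT a tabulated value, only an illustration.
limit = sse_max(sse_best=0.036, n_params=2, n_points=40, f_critical=3.2)
print(f"SSE_max = {limit:.6f}")
```

All F-Test based methods discussed below only differ in how they generate the candidate parameter sets that are compared against this threshold.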

At least two different approaches to the F-Test are mentioned in the literature: (a) the Weakened Grid Search^[15,50] (WGS) and (b) Model Comparison (MOC).^[11,44] Keller et al.^[53] published an Excel guide to apply the F-Test to Michaelis-Menten kinetics using the Weakened Grid Search. SupraFit provides both approaches to the F-Test, which are introduced in the following:

Weakened Grid Search

With K parameters to be analysed, the first parameter θ_i is changed by a small step $δ_{θ}$ ⁵ and then fixed, while the remaining $θ_{j \neq i}$ are optimised. The parameter θ_i is changed again by $δ_{θ}$ and the $θ_{j \neq i}$ parameters are estimated anew. This is repeated as long as $SSE(θ)$ is smaller than SSE_max and H₀ is therefore not rejected. The procedure is performed for all parameters in the same manner, and all θ that satisfy equation 35 define the confidence region.^[15] In SupraFit, some additional parameters are introduced to control the procedure, such as the maximum number of steps, the step size and the convergence threshold for the sum of squares. A comprehensive list is given in the SupraFit manual, and a short description of each parameter is shown as a tooltip in the program. The results are presented graphically as shown in Figure 5, where one parameter was analysed. The dash-dotted line indicates the estimated value ${\hat{θ}}_{i}$ and the solid line indicates the sum of squares obtained for each variation of θ_i while the $θ_{j \neq i}$ are optimised. Only values where the error is smaller than the threshold are plotted. The Grid Search is parallelised, so that for each parameter θ_i two processes independently evaluate either $θ_{i, -}$ or $θ_{i, +}$.

FIGURE:

Sample representation of the Weakened Grid Search result.
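The stepping scheme of the Weakened Grid Search can be sketched for a toy model. The example below assumes a straight line $y = θ_{1} x + θ_{2}$, for which the re-optimisation of θ₂ at fixed θ₁ has a closed form; the data, the step size and the SSE limit (twice the best SSE, standing in for equation 35) are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sse(slope, intercept, xs, ys):
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def best_intercept(slope, xs, ys):
    """Re-optimise the remaining parameter while the slope is kept fixed."""
    return sum(y - slope * x for x, y in zip(xs, ys)) / len(xs)


def weakened_grid_search(slope_hat, xs, ys, sse_limit, delta=1e-3, max_steps=100000):
    """Step the slope down and up until the re-optimised SSE exceeds the limit."""
    limits = []
    for direction in (-1.0, 1.0):
        slope = slope_hat
        for _ in range(max_steps):
            trial = slope + direction * delta
            if sse(trial, best_intercept(trial, xs, ys), xs, ys) > sse_limit:
                break
            slope = trial  # still inside the confidence region
        limits.append(slope)
    return tuple(limits)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.1, 2.9, 4.0]
slope_hat, intercept_hat = fit_line(xs, ys)
sse_limit = 2.0 * sse(slope_hat, intercept_hat, xs, ys)  # placeholder for eq. 35
low, high = weakened_grid_search(slope_hat, xs, ys, sse_limit)
print(f"slope = {slope_hat:.3f}, interval = [{low:.3f}, {high:.3f}]")
```

The two directions are independent of each other, which is what makes it possible to evaluate the lower and upper limit in separate processes.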

Model Comparison

An alternative way to apply the F-Test is denoted as Model Comparison. During MOC calculations, θ_i is varied by an amount of $δ_{θ}$ while the remaining $θ_{j \neq i}$ are not optimised, but systematically varied to fulfil equation 35. The parameter θ_i is then changed again by $δ_{θ}$ and the remaining $θ_{j \neq i}$ are varied to meet the condition in equation 35. This is repeated until the changed θ_i violates equation 35. After performing this approach for all K parameters, the limits of the confidence region can be extracted from all obtained values of θ_i as $θ_{i, -} = \min (θ_{i})$ and $θ_{i, +} = \max (θ_{i})$.^[44] Assuming that there is only one parameter to be optimised and applying WGS and MOC as described by Motulsky and Christopoulos,^[44] both methods perform similarly: θ_k is varied by $\pm δ_{θ}$ until $SSE(θ_{k})$ reaches the maximum allowed SSE, and the tuple $(θ_{k, +}, θ_{k, -})$ corresponds to the confidence interval. This approach of continuously varying one parameter is implemented in SupraFit as Simplified Model Comparison (SMOC). Like WGS, the Simplified Model Comparison benefits from multiple processes, since each parameter is evaluated in a single thread.

Instead of systematic variation, SupraFit provides the Model Comparison as a Monte Carlo experiment, analogous to the Monte Carlo calculation of an arbitrary area: uniform random numbers are generated within defined boundaries for every θ_i, and these random parameters are stored if $SSE(θ)$ meets equation 35. The confidence interval is then defined by the minimum and maximum values for all θ. The implementation works as follows: Simplified Model Comparison is applied to each parameter and the confidence interval $[θ_{i, -}^{SMOC}, θ_{i, +}^{SMOC}]$ is obtained. The intervals are scaled by adjustable factors, which define a rectangular box in the case of two parameters, a cuboid for three parameters etc. (dash-dotted box in Figure 6). Uniform random numbers are generated within the box and checked against equation 35. If they obey it, the parameters are kept; otherwise they are discarded. An ideal confidence region is represented in Figure 6 as a red ellipsoid, where the extreme values of θ₁ and θ₂ form the limits of the confidence interval. Like the previous methods, Model Comparison is parallelised, with the number of Monte Carlo steps divided equally across the threads.

FIGURE:

Calculation of the confidence interval using the Model Comparison and the Monte Carlo approach. Random values of θ₁ and θ₂ are generated within the dash-dotted boundaries. If $SSE(θ_{1}, θ_{2})$ meets equation 35, the parameters are kept.
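The Monte Carlo variant of the Model Comparison amounts to rejection sampling inside the scaled box. The following Python sketch uses a hypothetical quadratic SSE surface with correlated parameters; the function names, the box and the SSE limit are assumptions for illustration and are not taken from SupraFit.

```python
import random


def monte_carlo_model_comparison(sse_func, box, sse_limit, steps=20000, seed=0):
    """Sample uniformly inside the box and keep parameter sets below the SSE limit."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(steps):
        theta = [rng.uniform(lower, upper) for lower, upper in box]
        if sse_func(theta) <= sse_limit:
            accepted.append(theta)
    if not accepted:
        return None
    # The confidence limits are the extreme accepted values of each parameter.
    return [
        (min(t[i] for t in accepted), max(t[i] for t in accepted))
        for i in range(len(box))
    ]


def toy_sse(theta):
    """Hypothetical SSE surface with its minimum of 1.0 at (2, 3)."""
    u, v = theta[0] - 2.0, theta[1] - 3.0
    return 1.0 + u * u + u * v + v * v


# With the second parameter fixed, the limit of 2.0 is reached at theta_1 = 1 and 3
# (the SMOC interval); scaling this interval by 1.5 gives the sampling box.
box = [(0.5, 3.5), (1.5, 4.5)]
intervals = monte_carlo_model_comparison(toy_sse, box, sse_limit=2.0)
print(intervals)
```

For this surface the sampled limits are wider than the single-parameter interval [1, 3], since the second parameter can compensate part of the change in the first one.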

Resampling Methods

Cross Validation (CV) is a powerful tool, applied for example in QSAR in conjunction with principal component selection.^[54] In SupraFit, CV will be applied to assess whether the model used is sufficient. Another method, which has not yet been described for or applied to supramolecular titration experiments, is called "Reduction Analysis". Both methods will be introduced in a subsequent article, which focuses on a statistical approach to analyse binding stoichiometry.

Linear Regression Tool

SupraFit provides a linear regression tool that can be used to fit several linear functions to experimental data. The data points are divided into consecutive segments: in the case of three functions, the first function is fitted using the first $n_{1} \dots n_{i}$ data points, the next function uses the next $n_{i + 1} \dots n_{j}$ data points and the last function uses the remaining $n_{j + 1} \dots n_{N}$ data points. The maximum number of functions is $N / 2$, where each function is described by two points. The currently implemented method tests all available combinations and returns an ordered list. One application, the creation of Mole Ratio plots, will be shown for NMR titration. Another application will be shown within the ITC examples.
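For two functions, the exhaustive search over all segmentations can be sketched in a few lines of Python. The sketch assumes distinct x values, ordering by the summed SSE of both segments and invented data; the names `fit_line` and `two_segment_fits` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line through the points; returns slope, intercept and SSE."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    error = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, error


def two_segment_fits(xs, ys):
    """Fit two lines for every split position and order the results by total SSE."""
    results = []
    # Each segment needs at least two points to define a line.
    for split in range(2, len(xs) - 1):
        left = fit_line(xs[:split], ys[:split])
        right = fit_line(xs[split:], ys[split:])
        results.append((left[2] + right[2], split, left[:2], right[:2]))
    return sorted(results, key=lambda entry: entry[0])


# Invented data: slope 2 for the first four points, slope 0.5 afterwards.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.0, 2.0, 4.0, 6.0, 7.0, 7.5, 8.0, 8.5]
total_sse, split, first_line, second_line = two_segment_fits(xs, ys)[0]
print(split, first_line, second_line)
```

With more than two functions the number of combinations grows quickly, so an exhaustive search of this kind is only practical for the small data sets typical of titration experiments.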

Global Fitting

Programs like pytc or SEDFIT provide methods to perform a global fit,^[16,21] that is, to fit a single set of parameters to more than one experiment. In that sense, analysing several signals in NMR titrations is already a global fit,^[15] since one formation constant is connected to two or more signals. While a global fit for NMR titration is straightforward, combining several ITC experiments is performed with MetaModels in SupraFit. MetaModels are empty container models that can hold and manage real models. Model parameters can be handled individually or in any combination thereof. However, the first approach is identical to a local fit. Statistical analysis or global search can be performed on MetaModels in the same way as on simple models. An example of MetaModels will be discussed in the ITC section.

Examples

Model Function with Uncorrelated and Correlated Parameters

Uncorrelated Parameters

An example using a function with two uncorrelated parameters θ₁ and θ₂ is used to illustrate the preceding aspects of the statistical analysis. The function in equation 36 acts on the element m of the vector having M elements: 36

Thus, θ₁ acts on the first half of the interval while θ₂ acts on the second half. Depending on the values for θ, the function is discontinuous at $m = M / 2$. In the range of $\{0.05, 0.05 + 0.05, \dots, 1.95 - 0.05, 1.95\}$ with $θ_{1} = 1.8801$ and $θ_{2} = 7.4043$, after adding a random error ($ϵ \in N (0, 0.25)$), the function (eq. 36) is drawn in Figure 7.

FIGURE:

Representation of (a) a sample function with two uncorrelated parameters and (b) the added normal distributed error as well as the variation of θ and the corresponding SSE during the (c) Simplified Model Comparison and (d) Weakened Grid Search.

The 95 % confidence intervals using F-Test based methods applying the (Simplified) Model Comparison and Weakened Grid Search approach are given in Table 6. Both have been applied either to each parameter individually (MOC^a, WGS^a) or to both together (MOC^b, WGS^b). The F-Test confidence intervals are effectively the same, independent of the approach, with some numerical differences due to the step size during the evaluation. Using Monte Carlo simulation ($ϵ = SE_{y}$) with the percentile method, the confidence interval is much narrower than those obtained with the F-Test approach. These differences were already pointed out by Motulsky and Christopoulos.^[44]

Table 6 95 % Confidence Intervals obtained after Simplified Model Comparison (SMOC), Weakened Grid Search (WGS), Model Comparison (MOC) and Monte Carlo simulation (MC). ^aBoth parameters are analysed individually. ^bBoth parameters are analysed at the same time.

	$[θ_{1, -}, θ_{1, +}]$	$[θ_{2, -}, θ_{2, +}]$
SMOC	1.3674–1.9157	7.3429–7.4381
MOC^a	1.3680–1.9151	7.3429–7.4381
MOC^b	1.3680–1.9151	7.3430–7.4380
WGS^a	1.3676–1.9156	7.3435–7.4375
WGS^b	1.3675–1.9157	7.3429–7.4381
MC	1.4217–1.8433	7.3543–7.4230

The variation of the individual parameters θ_i by $\pm δ_{θ, i}$ and the corresponding SSE for SMOC and WGS are shown in Figure 7c and 7d. In both charts, the series show a parabolic trend, indicating that SSE_max can be reached during variation.

The correlation coefficient for θ₁ and θ₂ and the scatter plots (Figure 8) after MOC, WGS and MC clearly indicate that there is no dependency between both parameters, which is in agreement with the given function. The obtained correlation coefficient for θ₁ and θ₂ is 3.6 ⋅ 10⁻⁵ after Model Comparison. Using WGS the accepted values for θ₁ and θ₂ show a correlation coefficient of zero. The lines display two sets of accepted values for θ₁ and θ₂, where one parameter is not affected by changing the other. The model parameters after Monte Carlo simulation indicate no correlation ( $R^{2} = 1.9 \cdot 10^{- 4}$ ) as well, but the pairs of θ₁ and θ₂ do not form a complete ellipse as obtained after Model Comparison. However, in case of functions or models with uncorrelated parameters the implemented F-test based approaches lead to practically identical results, which differ from the Monte Carlo simulation based results.

FIGURE:

Scatter plots after confidence calculation using (a) Model Comparison, (b) Weakened Grid Search and (c) Monte Carlo simulation for the model with uncorrelated parameters.

Correlated Parameters

A function where θ₁ and θ₂ are not independent is given in equation 37. The same input data are used as in the previous example, where $θ_{1} = 4.8321$ and $θ_{2} = 8.5912$. Random error ($ϵ \in N (0, 0.25)$) is added to simulate experimental noise.

$f (θ_{1}, θ_{2}) = \mathrm{ld} (θ_{1}) \cdot \mathrm{ld} (θ_{2}) \cdot x^{2} + \frac{\mathrm{ld} (θ_{1})}{x} + \mathrm{ld} (θ_{2}) \cdot x$ (37)

The corresponding diagrams are plotted in Figure 9, including the graphical interpretation of the SMOC and WGS approaches.

FIGURE:

Representation of (a) a sample function with two correlated parameters and (b) the added normal distributed error as well as the variation of θ and the corresponding SSE during the (c) Simplified Model Comparison and (d) Weakened Grid Search.

The confidence intervals, which are calculated as in the previous example, are summarised in Table 7: SMOC and MOC^a result in the same confidence interval, and MOC^b as well as WGS^a and WGS^b result in the same confidence intervals, which however differ from the first one. This is expected, since SMOC and MOC^a take only one parameter into account and fix the remaining one, while MOC^b and WGS take both parameters into account. The confidence intervals after Monte Carlo simulation are narrower than the WGS/MOC^b confidence intervals, but broader than the SMOC and MOC^a intervals.

Table 7 95 % Confidence intervals obtained after Simplified Model Comparison (SMOC), Weakened Grid Search (WGS), Model Comparison (MOC) and Monte Carlo simulation (MC). ^aBoth parameters are analysed individually. ^bBoth parameters are analysed at the same time.

	$[θ_{1, -}, θ_{1, +}]$	$[θ_{2, -}, θ_{2, +}]$
SMOC	4.7583–4.8601	8.5829–8.8463
MOC^a	4.7583–4.8601	8.5830–8.8463
MOC^b	4.7164–4.9037	8.4773–8.9618
WGS^a	4.7160–4.9038	8.4760–8.9620
WGS^b	4.7160–4.9030	8.4766–8.9616
MC	4.7381–4.8816	8.5294–8.9031

The graphical interpretations of SMOC and WGS are shown in Figure 9c and 9d. While all series again show a parabolic trend, the series for θ₁ or θ₂ differ slightly between the two methods. The correlation between θ₁ and θ₂ can be analysed using the correlation coefficient and the scatter plots as shown in Figure 10. Apart from the different confidence intervals, the ellipsoid after MOC is rotated with respect to the ellipsoid in Figure 8a and a correlation can be observed ($R^{2} = 0.70$). The scatter plot after WGS again shows two lines, where each line is assigned to the variation of one parameter. The correlation coefficient indicates a strong correlation ($R^{2} = 0.98$), which however is an artefact, since only the best-fit values are included but not all possible values that obey equation 35. Monte Carlo simulation, on the other hand, leads to a similar scattering of the parameters and a very similar correlation coefficient ($R^{2} = 0.70$).

FIGURE:

Scatter plots after confidence calculation using (a) Model Comparison, (b) Weakened Grid Search and (c) Monte Carlo simulation for the model with correlated parameters.

With two parameters (θ_k and θ_l) and WGS performed for only one parameter θ_k, the F-Test confidence interval for this parameter is obtained, since θ_l is always adjusted. However, when MOC is performed and limited to one parameter θ_k, the confidence interval will always be smaller than or equal to the correct F-Test confidence interval, since at $SSE(θ_{k}^{MOC}) = SSE_{max}$ the other parameter θ_l could still be adjusted. If there is no correlation between θ_k and θ_l, both parameters can be varied independently of each other and the F-Test confidence intervals of the Simplified Model Comparison and WGS are equal.

Linear Regression

In the case of linear models, the (1 − α) confidence intervals can be calculated with standard software like Excel or similar spreadsheet programs as well as statistical software like R. In Table 8, we report the confidence intervals for a linear model using the t-distribution approach, calculated with Gnumeric,^[55] as well as the approaches for non-linear models implemented in SupraFit. The data used were obtained by adding $ϵ \in N (μ = 0, 0.01)$ to a linear model $y = θ_{1} x + θ_{2}$ with $θ_{1} = - 820.000$ and $θ_{2} = - 0.333$ (Figure 11). The least-squares estimated parameters are $θ_{1} = - 787.551$ and $θ_{2} = - 0.334$.

Table 8 95 % Confidence Intervals obtained after Linear Regression, Weakened Grid Search and Monte Carlo simulation (T=50000).

	$[θ_{1, -}, θ_{1, +}]$	$[θ_{2, -}, θ_{2, +}]$
linear	[−846.7845, −728.3161]	[−0.3408, −0.3280]
WGS	[−861.9370, −713.1640]	[−0.3424, −0.3264]
	Monte Carlo simulations
SE_y	[−845.0710, −729.5970]	[−0.3406, −0.3282]
σ	[−844.3855, −730.3190]	[−0.3406, −0.3282]
BS	[−843.3700, −732.0910]	[−0.3404, −0.3283]

FIGURE:

Representation of (a) a linear function and (b) the added normal distributed error.

The non-linear F-Test based confidence interval differs considerably from the narrower linear t-distribution based interval. Monte Carlo simulations with $T = 50000$ steps were performed as bootstrapping and using SE_y and σ_fit as input standard deviation. The BS confidence interval is the narrowest and the interval using SE_y is the widest, since $SE_{y} > σ_{fit}$. However, the confidence intervals obtained after Monte Carlo simulations are very close to the one calculated with the linear approach, being only slightly narrower. Using SE_y as ε for the Monte Carlo simulation recovers the linear approach best.

NMR Titration

To demonstrate the application of SupraFit to NMR titration, example calculations on an artificial NMR titration with a 1 : 1/1 : 2 binding stoichiometry were performed. The stability constants used to set up the experimental data were chosen to be $l g K_{11} = 3.81$ and $l g K_{12} = 2.14$. The chemical shifts can be found in the supporting information. The individual shifts are not meant to represent a realistic example. A random error obtained from a normal distribution with $ϵ \in N (μ = 0, 0.001)$ was added afterwards, where every signal has the same σ; therefore, e. g. signal 6 ($Δ δ = 2.3038$ ppm) and signal 7 ($Δ δ = 0.2441$ ppm) both have the same random error. The "experimental" titration curve can be found in Figure 12a. The four possible models (1 : 1, 2 : 1/1 : 1, 1 : 1/1 : 2 and 2 : 1/1 : 1/1 : 2) were tested without cooperative relationships.

FIGURE:

(a) Simulated titration curves with $l g K_{11} = 3.81$ and $l g K_{12} = 2.14$ and seven observed signals. (b) The Mole Ratio plots show an intersection of two linear functions at a molar ratio between 1.0 and 1.5.

Mole Ratio Plot

Using SupraFit's linear regression method with two functions, Mole Ratio plots can easily be generated.^[56] In addition to the chemical shift (or any other suitable response signal) on the y axis and the molar ratio on the x axis, a Mole Ratio plot shows two linear functions that are fitted to the data. The plot can be found in Figure 12b. For each series, all possible intersections of adjacent linear functions are calculated. The result for the best fit, i. e. the fit minimising the sum over all SSE, is listed in the supporting information. The intersections of the two functions per signal range between 1.13 and 1.27, indicating a system that exhibits 1 : 2 species. This is in accordance with the stoichiometry of the original model.
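The intersection of two fitted lines follows from equating both linear functions. A minimal Python sketch, with hypothetical slopes and intercepts that are not taken from the data set above:

```python
def intersection(line_a, line_b):
    """Intersection point of two lines given as (slope, intercept) pairs."""
    slope_a, intercept_a = line_a
    slope_b, intercept_b = line_b
    if slope_a == slope_b:
        raise ValueError("parallel lines do not intersect")
    x = (intercept_b - intercept_a) / (slope_a - slope_b)
    return x, slope_a * x + intercept_a


# Hypothetical fit results for one signal: steep initial part, flatter second part.
molar_ratio, shift = intersection((2.0, 0.0), (0.5, 1.8))
print(f"intersection at molar ratio {molar_ratio:.2f}")
```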

Fitted Parameter

The resulting stability constants (lg K) after optimisation are listed in Table 9; statistical measures using SSE and SE_y can be found in Table 10. The titration curve as well as the remaining absolute errors can be found in Figure 13. The complex formation constants for the correct model differ only slightly from the initial ones. The simpler 1 : 1 model estimates a $l g K_{11}$ that is too small, as does the 2 : 1/1 : 1 model. The most complex model reproduces $l g K_{11}$ and $l g K_{12}$, although the value obtained for the superfluous model parameter $l g K_{21}$ looks realistic. Some of the chemical shifts in the 2 : 1/1 : 1 model are smaller than zero ($δ_{A_{2} B, 1} = - 6.6334$ ppm), indicating a change in the chemical shift of up to 13 ppm ($δ_{A B, 1} = 6.5565$ ppm). A full list of all parameters can be found in the example file in the SupraFit repository on GitHub.

Table 9 Estimated lg K values for the applied 1 : 1, 2 : 1/1 : 1, 1 : 1/1 : 2 and 2 : 1/1 : 1/1 : 2 models.

model	$l g K_{21}$	$l g K_{11}$	$l g K_{12}$
true model		3.8100	2.1400
1 : 1		3.0991
2 : 1/1 : 1	1.7448	2.6694
1 : 1/1 : 2		3.8092	2.1090
2 : 1/1 : 1/1 : 2	1.9893	3.8063	2.0429

Table 10 The sum of squared errors (SSE) as well as σ and SE_y after testing four models on the simulated data set. ^aNot calculated, since this model is not fitted to the data.

model	fitted parameters	SSE	SE_y	σ
1 : 1	15	0.036459	0.017078	0.016196
2 : 1/1 : 1	23	0.001761	0.003878	0.003560
1 : 1/1 : 2	23	0.000132	0.001062	0.000975
2 : 1/1 : 1/1 : 2	31	0.000127	0.001077	0.000954
fitted 1 : 1/1 : 2	23	0.000132	0.000983	0.000975
correct 1 : 1/1 : 2	–	0.000165	–^a	0.001088

FIGURE:

(a) Chemical shifts and fitted curves using a 1 : 1/1 : 2 model (lgK₁₁=3.81 and lgK₁₂ =2.11) and (b) the resulting absolute errors. (c) The absolute errors for all four models are plotted in one chart, showing that the 1 : 1 model and the 2 : 1/1 : 1 model perform worse than the 1 : 1/1 : 2 and the 2 : 1/1 : 1/1 : 2 model. (d) Both models, 1 : 1/1 : 2 and 2 : 1/1 : 1/1 : 2, show similar residuals.

The "visual inspection" as described by Hynes^[9] can be performed using the charts in Figure 13c and 13d, where all absolute errors are plotted in Figure 13c and only the errors from the 1 : 1/1 : 2 model and the 2 : 1/1 : 1/1 : 2 model are plotted in Figure 13d. Clearly, the 1 : 1 model performs worst, followed by the 2 : 1/1 : 1 model, with both having heteroscedastic errors. The remaining two models are visually indistinguishable, with both errors being homoscedastic.⁶ Considering the resulting SSE, the decision towards the correct model can already be made, since $S S E_{1 : 1 / 1 : 2} \approx S S E_{2 : 1 / 1 : 1 / 1 : 2}$ and $3 \cdot S S E_{1 : 1 / 1 : 2} < S S E_{1 : 1}$.^[15] Comparing the SSE of the fitted 1 : 1/1 : 2 model with that of the correct model shows a slightly smaller error for the optimised model.

Monte Carlo Confidence Intervals

Following the strategy of the Monte Carlo simulation, the introduced error can be drawn from a normal distribution with (a) a defined variance or (b) obtained via bootstrapping. To test the influence of the different approaches on the confidence interval, a set of simulations was performed on the given dataset with the optimised 1 : 1/1 : 2 model. The normally distributed errors were generated with $σ_{MC} = SE_{y}$, $σ_{MC} = σ_{fit}$, $σ_{MC} = 1 \cdot 10^{-3}$, $σ_{MC} = 2 \cdot 10^{-3}$, $σ_{MC} = 3 \cdot 10^{-3}$ and $σ_{MC} = 5 \cdot 10^{-3}$. Monte Carlo simulations with T=100, 200, 300, 500, 700, 1000, 1500, 2000, 2500, 3000 and 5000 steps were performed, where each simulation was repeated 300 times. The 95 % confidence interval was then characterised by the median and standard deviation of the 0.95 inter-percentile ranges (IPR) of these 300 Monte Carlo simulations.
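The repeated-simulation protocol can be sketched in Python with a deliberately simple stand-in model. Here the "fit" in each Monte Carlo step is just the mean of the perturbed data, which replaces the non-linear least-squares estimation used for the titration model; the function names, the data and all numbers are hypothetical.

```python
import random
import statistics


def ipr_of_run(y_calc, sigma, steps, rng, alpha=0.025):
    """0.95 inter-percentile range of one Monte Carlo simulation."""
    estimates = []
    for _ in range(steps):
        noisy = [y + rng.gauss(0.0, sigma) for y in y_calc]
        estimates.append(sum(noisy) / len(noisy))  # toy re-estimation step
    estimates.sort()
    lower = estimates[round(alpha * steps)]
    upper = estimates[min(round((1.0 - alpha) * steps), steps - 1)]
    return upper - lower


def ipr_statistics(y_calc, sigma, steps, repeats, seed=0):
    """Median and standard deviation of the IPR over repeated simulations."""
    rng = random.Random(seed)
    iprs = [ipr_of_run(y_calc, sigma, steps, rng) for _ in range(repeats)]
    return statistics.median(iprs), statistics.stdev(iprs)


y_calc = [1.0] * 10  # ten hypothetical calculated data points
for steps in (100, 1000):
    median_ipr, spread = ipr_statistics(y_calc, 0.1, steps, repeats=30, seed=1)
    print(f"T={steps}: median IPR = {median_ipr:.4f}, std = {spread:.4f}")
```

The spread of the IPR over the repeats is the quantity of interest here: it measures how reproducible a single confidence interval is for a given number of steps T.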

The boxplots and the standard deviation of the 0.95 IPR values for the stability constants $l g K_{11}$ and $l g K_{12}$ after the Monte Carlo simulation are reported in Figure 14 and show the expected behaviour: with an increasing number of steps T, the observed standard deviation of the IPR decreases. The same trend is visible for the other Monte Carlo simulations including BS (see Figure S7–S13). With increasing step count, the IPR converges to the ideal IPR that would be obtained after an infinite number of steps. As Efron stated,^[47] at least 2000 steps are required for the bootstrap method to obtain reliable results. However, since every Monte Carlo step requires the least-squares estimation of θ, this approach is computationally demanding. As shown in Figure 15, Monte Carlo simulation scales well with the number of threads used and benefits from Hyperthreading technology.⁷ Therefore, accurate Monte Carlo simulations with 2000 steps can easily be realised within minutes, even on a desktop computer with fewer cores.

FIGURE:

Variation and standard deviation of the IPR for lgK₁₁ and lgK₁₂ after several Monte Carlo simulations with σ_MC=SE_y =1.062 ⋅ 10⁻³.

FIGURE:

Wall time in seconds and speed-up as a function of the number of threads used in Monte Carlo simulation. The wall time is averaged over 50 runs. The benchmark was performed on an Intel i9-7920X CPU @ 4.00 GHz (12 physical cores, overclocked) with and without Hyperthreading (HT).

As shown in Figure 16, the confidence intervals obtained from 300 Monte Carlo simulations, each performed with 5000 steps using BS or random errors with different σ_MC, differ. As σ_MC increases, the confidence interval gets broader and the standard deviation of the IPR increases. The differences between bootstrapping and random errors with $σ_{MC} = σ_{fit}$ are very small. Nevertheless, since the Kruskal-Wallis test results in p = 0.002 < 0.05 for $l g K_{11}$ and p = 0.023 < 0.05 for $l g K_{12}$, the differences are significant for the given example. The corresponding plots for $l g K_{12}$ are presented in Figure S14.

FIGURE:

IPR of lgK₁₁ for several Monte Carlo simulations (300 runs, each run with T=5000 steps) with different approaches to define the value of ϵ.

Correlation of lgK₁₁ and lgK₁₂

Since the current NMR titration model has more than two parameters, the correlation of $l g K_{11}$ and $l g K_{12}$ will be analysed either neglecting the remaining parameters, e. g. the chemical shifts, or taking them into account. Therefore, a Monte Carlo simulation with $σ_{MC} = SE_{y}$ and T=10000, two runs of Weakened Grid Search (the first only for $l g K_{11}$ and $l g K_{12}$, the second for all parameters) and a Model Comparison for $l g K_{11}$ and $l g K_{12}$ were performed. The scatter plots for $l g K_{11}$ vs $l g K_{12}$ are shown in Figure 17 and the confidence intervals are given in Table 11.

FIGURE:

The resulting scatter plots for lgK₁₁ and lgK₁₂ differ for various statistical approaches. (a) Weakened Grid Search with only lgK₁₁ and lgK₁₂ included. (b) Weakened Grid Search with all parameters included. (c) Monte Carlo simulation. (d) Model Comparison with only lgK₁₁ and lgK₁₂ included.

Table 11 95 % Confidence Intervals obtained after Weakened Grid Search (WGS), Model Comparison (MOC) and Monte Carlo simulation (MC). ^aOnly $l g K_{11}$ and $l g K_{12}$ were analysed and ^ball parameters were analysed.

	$[l g K_{11, -}, l g K_{11, +}]$	$[l g K_{12, -}, l g K_{12, +}]$
WGS^a	3.698–3.926	1.935–2.242
WGS^b	3.697–3.927	1.934–2.243
MC	3.773–3.846	2.059–2.155
MOC	3.801–3.818	2.104–2.114

The first two charts show the scattering of the complex formation constants after applying the Weakened Grid Search, where Figure 17a contains only two series, since only two parameters were tested. The chart in Figure 17b, however, shows more than two series, as all parameters were taken into account. When more parameters are incorporated, the correlation coefficient drops from 0.80 to 0.74, since more points from the original series are available. However, the high correlation is an artefact, as already pointed out in the example of the function with correlated parameters in the previous section. The scatter plot after Monte Carlo simulation in Figure 17c shows an ellipsoid, with the parameters having a correlation coefficient of 0.37. On the other hand, using Model Comparison with only two parameters taken into account, one obtains a complete ellipsoid, which however is rotated with respect to the Monte Carlo ellipsoid and to the series obtained after Weakened Grid Search (Figure 17d). Therefore, naive Model Comparison leads to wrong results regarding confidence intervals and the ellipsoid if correlated parameters are ignored.

Isothermal Titration Calorimetry

The ITC data used in the following section are taken from the pytc-demo. The complex formation of calcium with EDTA (see https://github.com/harmslab/pytc-demos, last checked 17. 01. 2022) was reported by Harms et al.^[16] to demonstrate the pytc tool. The heat is given in cal and cal/mol. In the first part, the initial guess of the parameters for a 1 : 1 model is described, since a good starting point for the non-linear regression is essential.

The fx value, the inflection point of the titration curve,^[57] is guessed by fitting three non-overlapping linear functions to the isotherm. The guessed fx value is then obtained as the mean of the intersection of the first with the second function and the intersection of the second with the third function (Figure 18). The heat of formation is calculated using the heat of the third injection Q_2,3 divided by the change in concentration of the added guest component $[B]$. It is assumed that at the start of the titration the concentration of the formed complex is nearly the same as the added guest concentration, since $[B] ≪ [A]$. The stability constant is then calculated using the bisection method within the limits of $1 \leq l g K_{11} \leq 10$. The initial guessed parameters of the 1 : 1 model are applied to the models of mixed stoichiometries as well. See Table S2 for the comparison of the initial guessed and fitted parameters for the hepes data.

FIGURE:

The initial value for fx is guessed using three linear functions.
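A bisection search for the stability constant within $1 \leq l g K_{11} \leq 10$ can be sketched as follows. The text does not state which quantity is matched during the bisection, so this sketch assumes, purely for illustration, that a target complex concentration of a 1 : 1 equilibrium is to be reproduced; the function names and concentrations are hypothetical.

```python
import math


def complex_concentration(lg_k, a0, b0):
    """Equilibrium concentration [AB] of a 1 : 1 complex for total A0 and B0."""
    k = 10.0 ** lg_k
    s = a0 + b0 + 1.0 / k
    return 0.5 * (s - math.sqrt(s * s - 4.0 * a0 * b0))


def bisect_lg_k(target_ab, a0, b0, low=1.0, high=10.0, tol=1e-8):
    """Find lg K in [low, high] whose 1 : 1 complex concentration matches the target."""
    # [AB] grows monotonically with K, so a plain bisection is sufficient.
    while high - low > tol:
        mid = 0.5 * (low + high)
        if complex_concentration(mid, a0, b0) < target_ab:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


# Hypothetical total concentrations in mol/L and a target generated with lg K = 4.
a0, b0 = 1.0e-3, 5.0e-4
target = complex_concentration(4.0, a0, b0)
print(f"recovered lg K = {bisect_lg_k(target, a0, b0):.4f}")
```

Bisection only needs a bracketing interval and a monotonic target, which makes it a robust choice for generating a starting value before the actual non-linear regression.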

Global Fit

MetaModels were used to globally fit $l g K_{11}$ and $Δ H_{A B}$ to the data of hepes-01, hepes-02 and hepes-03 from the pytc-demo, followed by Monte Carlo simulation to estimate the confidence intervals. These results were then compared to the confidence intervals obtained from Monte Carlo simulations for the individual experiments. The obtained parameters and the confidence intervals using Monte Carlo simulation ($σ_{MC} = SE_{y}$, 5000 steps) are listed in Table 12. While the globally estimated $l g K_{11}$ is nearly the mean of the individual models (7.595), the IPR for $l g K_{11}$ cannot be approximated by the mean of the individual IPR (0.065). The same holds true for the enthalpy of complexation, where the average parameter is $-4.621$ kcal/mol and the average IPR is 0.057. The estimated parameters from pytc and SupraFit are the same.

Table 12 Estimated parameters with pytc and SupraFit for hepes-01, hepes-02 and hepes-03 and the global models with the 95 % confidence intervals. In SupraFit, MC derived confidence intervals were obtained using $σ_{M C} = S E_{y}$ and 5000 steps. The IPR is given in round brackets. MM: Global fit using a MetaModel, 01–03: Local fits.

	$l g K_{11}$	$[l g K_{11, -}, l g K_{11, +}]$	$Δ H_{A B}$	$[Δ H_{A B, -}, Δ H_{A B, +}]$ $\frac{k c a l}{m o l}$
pytc	7.594	[7.580, 7.607]	−4.621	[−4.633, −4.610]
MM	7.594	[7.573, 7.614] (0.041)	−4.621	[−4.640, −4.603] (0.037)
01	7.567	[7.546, 7.587] (0.040)	−4.613	[−4.630, −4.595] (0.035)
02	7.604	[7.562, 7.646] (0.084)	−4.668	[−4.706, −4.630] (0.076)
03	7.614	[7.579, 7.651] (0.072)	−4.582	[−4.612, −4.553] (0.059)

Dilution

The same example data set from pytc was used to analyse the effect of the blank experiments on the parameter estimation. The four approaches described in section 3.2.3 were applied. In the first approach (1), the titration was analysed with dilution correction included according to equation 10, but without referring to any external blank titration. Including dilution using another experiment was realised as follows: (2) an external blank titration was used to estimate the two dilution parameters $m_{δ}$ and $n_{δ}$ in equation 10, which were included and kept constant while $l g K_{11}$, ΔH and fx were obtained. The third parameter estimation (3) was performed using equation 8 after the blank experiment was subtracted from the complexation experiment. In the last approach (4), the blank and the complexation experiment were combined as a MetaModel. Thus, $m_{δ}$ and $n_{δ}$ were estimated globally using the blank and the titration experiment, while $l g K_{11}$, ΔH and fx were estimated locally, using only the data from the titration experiment. The corresponding isotherm and blank experiment are shown in Figure 19; the estimated parameters for hepes-01 are listed in Table 13. The heat observed in the blank experiment is very small compared to the heat from the binding experiment. Figure S15 contains the three isotherms and blank experiments for the hepes-01, imid-01 and tris-01 data sets. See Tables S3–S5 for all best-fit values as well as the confidence intervals of the parameters $l g K_{11}$ and $Δ H_{A B}$.

FIGURE:

Isotherms for the complexation and blank experiments. Data are taken from hepes-01 of the pytc-demo.^[16]

Table 13 Estimated parameters $l g K_{11}$ and ΔH with the IPR and standard deviation of the confidence intervals calculated via Monte Carlo simulation using hepes-01 data set and different dilution strategies (1)–(4).

Dilution		lgK₁₁			ΔH [kcal/mol]
strategy		IPR	σ		IPR	$σ \cdot 10^{- 3}$
none	7.565	0.054	0.014	−4.608	0.016	3.939
(1)	7.567	0.039	0.010	−4.613	0.035	9.037
(2)	7.625	0.110	0.028	−4.529	0.029	7.593
(3)	7.599	0.180	0.046	−4.532	0.051	12.822
(4)	7.619	0.108	0.027	−4.534	0.039	10.110

Monte Carlo simulations with 20000 steps and $σ_{M C} = S E_{y}$ were performed, the corresponding boxplots for $l g K_{11}$ and $Δ H_{A B}$ in case of hepes-01 are shown in Figure 20. Boxplots including all parameters and the data sets imid-01 and tris-01 can be found in the supplementary information in Figure S16–S20.

FIGURE:

Boxplots of (a) lgK₁₁ and (b) ΔH values obtained from Monte Carlo simulations performed on the hepes-01 data set with the different dilution strategies tested.

In the hepes-01 data set, the estimated values for $\lg K_{11}$ and $ΔH_{AB}$ differ only marginally between neglecting the dilution and strategy (1). However, the Monte Carlo simulations reveal that the confidence intervals are affected. For both the imid and the tris data, the differences in the estimated parameters ( $\lg K_{11}$ and ΔH) and in the corresponding confidence intervals between neglected dilution and strategy (1) are much more pronounced (see Figure S16 and S17). Explicitly including the blank experiment in the parameter estimation via the three remaining approaches yields, in all three cases, best-fit parameters that differ from those obtained without dilution correction and with strategy (1). However, the Monte Carlo simulations indicate that subtracting the hepes blank experiment, strategy (3), broadens the confidence intervals of $\lg K_{11}$ and ΔH compared to strategies (2) and (4). In the imid and tris data sets, a similar broadening of the confidence intervals, as indicated by IPR and σ, was not observed (see Table S16 and Figure S17). This can be explained by the coefficient of determination R² of the linear fit of the blank experiment (Figure 19b and Figure S15), which is worst for the hepes ( $R^{2} = 2.88 \cdot 10^{-4}$ ) and better for the imid ( $R^{2} = 1.58 \cdot 10^{-3}$ ) and tris ( $R^{2} = 6.52 \cdot 10^{-3}$ ) dilution data.
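
How an R² value for a linear blank fit is obtained can be sketched in a few lines of Python. The function name `linear_fit_r2`, the use of the injection index as independent variable and the two toy data sets are assumptions for illustration and are not taken from the pytc data or from SupraFit. An R² close to zero means that the straight line describes the blank heats hardly better than their mean value does.

```python
import numpy as np


def linear_fit_r2(x, y):
    """Least-squares line y = m*x + n and its coefficient of determination."""
    m, n = np.polyfit(x, y, 1)
    ss_res = np.sum((y - (m * x + n)) ** 2)  # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)   # total sum of squares
    return m, n, 1.0 - ss_res / ss_tot


injection = np.arange(10.0)

# A blank with a clear linear trend: the line explains the data completely.
_, _, r2_trend = linear_fit_r2(injection, 3.0 * injection + 1.0)

# A blank that only scatters around zero: the line explains almost nothing.
scatter = np.array([1.0, -1.0] * 5)
_, _, r2_scatter = linear_fit_r2(injection, scatter)

print(f"R2 (trend)   = {r2_trend:.4f}")
print(f"R2 (scatter) = {r2_scatter:.4f}")
```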

It was demonstrated how the influence of various approaches to including blank experiments can be analysed using Monte Carlo simulation. In the present example, the obtained parameters change only on a very small scale, e.g. the heat of complexation varies by less than 0.5 kcal/mol, owing to the small heat of dilution. However, this may not be true in general, and statistical post-processing can help to understand the obtained results more deeply.

Conclusion

A new graphical program for non-linear regression, focused on the calculation of stability constants from NMR titration and ITC experiments, has been presented. The software is written in C++ using the Qt Toolkit and the Eigen library. It is fully open source and therefore transparent regarding the underlying mathematics and algorithms. In addition to the pure estimation of the various physical parameters that describe the complexation process, statistical analysis can be performed to obtain confidence intervals for each individual parameter and to gain deeper insight into the performed experiments. Several techniques that are already described in the literature (Monte Carlo simulation and F-Test approaches) have been adopted; however, the routine use of these approaches has not been reported yet. We hope that SupraFit provides a good basis for analysing titration experiments with respect to their statistical assessment and for further improving the insight into supramolecular systems. We additionally aim to keep the user interface of SupraFit as easy as necessary and as powerful as possible, so that all the tools provided by SupraFit are straightforwardly accessible. Contributions such as new models or statistical post-processing methods are welcome. Future development of SupraFit will include more built-in stoichiometric models (for example 3 : 1 and 1 : 3 systems) and dimerisation constants, but also an interface to implement custom models using a script engine.

The source code and binaries of SupraFit can be obtained free of charge from the GitHub repository at https://github.com/conradhuebler/SupraFit.

Supporting Information Summary

Selected images with higher resolution (Figure S1–S4). Input data for simulated 1 : 1/1 : 2 model. Selected images with higher resolution (Figure S5). Calculated intersection in Mole Ratio plot (Table S1). Selected images with higher resolution (Figure S6). Representation of boxplots from Monte Carlo simulation for NMR titration (Figure S7–S14). Comparison of initially guessed and least-squares estimated parameters in ITC experiments (Table S2). Isotherms and blank experiments (Figure S15). Estimated parameters and confidence intervals with different dilution strategies (Table S3–S5, Figure S16–S20).

Appendix A: Abbreviations and Symbols

$[A]$
concentration of component A

${[A]}_{0}$
initial concentration of component A

$[B]$
concentration of component B

${[B]}_{0}$
initial concentration of component B

$[X]$
concentration of any component

K₁₁
step-wise stability constant for a 1 : 1 complex

K₂₁
step-wise stability constant for a 2 : 1 complex

K₁₂
step-wise stability constant for a 1 : 2 complex

β₂₁
stability constant for a 2 : 1 system

β₁₂
stability constant for a 1 : 2 system

lg K
log₁₀ of stability constant K

y
observed signal or physical property, dependent data

Y
proportionality factor linking concentration with y

δ
observed chemical shift

A_abs
observed absorbance

ε_i
extinction coefficient

V
cell volume

v
injection volume

Q
observed heat

ΔH
heat of formation

$m_{δ}$ , $n_{δ}$
linear coefficients in blank experiments

E
enzyme

S
substrate

P
product

K_M
Michaelis-Menten constant

v_max
maximum reaction rate

r
reaction rate

θ
parameter in general

$\hat{θ}$
estimated parameter/best-fit parameter

$\tilde{θ}$
true value

$[θ_{-}, θ_{+}]$
confidence interval, the range within which $\tilde{θ}$ is expected to lie

IPR
inter-percentile range

x
independent data

y_exp
experimental data

y_calc
(re)calculated experimental data using $\hat{θ}$

SSE
sum of squared errors

e
residual, error: ( $y_{e x p} - y_{c a l c}$ )

ε
random error

σ_fit
standard deviation of the residuals

SE_y
standard error

χ²
chi-squared error

T
number of Monte Carlo steps

σ_MC
standard deviation used to set up Monte Carlo simulations

$N (μ, σ^{2})$
normal distribution with mean μ and standard deviation σ

μ
mean of normal distribution

σ
standard deviation of normal distribution

α
probability

K
number of parameters

N
number of data points

F_N,N–K
critical value in the F-distribution

$δ_{θ}$
increment to change θ during WGS and MOC

WGS
Weakened Grid Search

MOC
Model Comparison

MC
Monte Carlo simulation

BS
Bootstrapping

Appendix B: Equilibrium Equations

Systems of 1 : 1 Stoichiometry

(38) $A + B \rightleftharpoons AB$

(39) $K_{11} = β_{11} = \frac{[AB]}{[A] [B]}$

(40) ${[A]}_{0} = [A] + β_{11} [A] [B] = [A] + [AB]$

(41) ${[B]}_{0} = [B] + β_{11} [A] [B] = [B] + [AB]$

(42) $K_{11} = \frac{[AB]}{({[A]}_{0} - [AB]) \cdot ({[B]}_{0} - [AB])}$

(43) $0 = K_{11} ({[A]}_{0} - [AB]) \cdot ({[B]}_{0} - [AB]) - [AB]$

(44) $0 = K_{11} {[AB]}^{2} - [AB] (K_{11} {[A]}_{0} + K_{11} {[B]}_{0} + 1) + K_{11} {[A]}_{0} {[B]}_{0}$
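
As a worked example, the quadratic equation 44 can be solved for [AB] in closed form. The following Python sketch is an illustration only; the function name `ab_concentration` and the numerical values are assumptions, and the root is written in a cancellation-free form rather than in any form taken from the SupraFit source. Of the two roots, only the smaller one satisfies $[AB] \le \min({[A]}_{0}, {[B]}_{0})$ and is therefore physically meaningful.

```python
import math


def ab_concentration(K11, A0, B0):
    """Physically meaningful root of K11*x^2 - b*x + K11*A0*B0 = 0 with x = [AB]."""
    b = K11 * A0 + K11 * B0 + 1.0
    disc = b * b - 4.0 * K11 * K11 * A0 * B0  # always positive for K11, A0, B0 > 0
    # Smaller root (b - sqrt(disc)) / (2*K11), rewritten to avoid cancellation.
    return 2.0 * K11 * A0 * B0 / (b + math.sqrt(disc))


# Hypothetical values: K11 in 1/M, concentrations in M.
K11, A0, B0 = 1.0e3, 1.0e-3, 2.0e-3
AB = ab_concentration(K11, A0, B0)
A_free, B_free = A0 - AB, B0 - AB
print(f"[AB] = {AB:.6e} M, recovered K11 = {AB / (A_free * B_free):.6f}")
```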

Systems of 2 : 1/1 : 1 Stoichiometry

With the mass balance equations

(45) ${[A]}_{0} = [A] + [AB] + 2 [A_{2}B] = [A] + β_{11} [A] [B] + 2 β_{21} {[A]}^{2} [B]$

(46) ${[B]}_{0} = [B] + [AB] + [A_{2}B] = [B] + β_{11} [A] [B] + β_{21} {[A]}^{2} [B]$

the concentration of unbound host follows as^[15]

(47) $0 = {[A]}^{3} A + {[A]}^{2} B + [A] C - {[A]}_{0}$

where A, B and C denote the polynomial coefficients (not the components A and B):

$A = K_{11} K_{21}$

$B = K_{11} (2 K_{21} {[B]}_{0} - K_{21} {[A]}_{0} + 1)$

$C = K_{11} ({[B]}_{0} - {[A]}_{0}) + 1$
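
The cubic equation 47 can be solved numerically for the free concentration [A]. The Python sketch below is a minimal illustration under stated assumptions: it uses `numpy.roots` as a generic polynomial solver, the function name `free_a_2to1` and the numerical values are hypothetical, and it does not reproduce the solver used in SupraFit. The physically meaningful root is the real root in the interval $(0, {[A]}_{0}]$; the result is checked against mass balance 45.

```python
import numpy as np


def free_a_2to1(K11, K21, A0, B0):
    """Free [A] for the 2 : 1/1 : 1 system from the cubic in equation 47."""
    a = K11 * K21
    b = K11 * (2.0 * K21 * B0 - K21 * A0 + 1.0)
    c = K11 * (B0 - A0) + 1.0
    roots = np.roots([a, b, c, -A0])
    # Keep (numerically) real roots that lie in the physical range (0, A0].
    real = roots.real[np.abs(roots.imag) <= 1e-9 * np.abs(roots)]
    valid = real[(real > 0.0) & (real <= A0 * (1.0 + 1e-9))]
    if valid.size == 0:
        raise ValueError("no physically meaningful root found")
    return float(valid[0])


# Hypothetical values: stability constants in 1/M, concentrations in M.
K11, K21, A0, B0 = 1.0e3, 1.0e2, 2.0e-3, 1.0e-3
A = free_a_2to1(K11, K21, A0, B0)
B = B0 / (1.0 + K11 * A + K11 * K21 * A * A)      # from mass balance 46
A0_check = A + K11 * A * B + 2.0 * K11 * K21 * A * A * B  # mass balance 45
print(f"[A] = {A:.6e} M, [B] = {B:.6e} M, recovered [A]0 = {A0_check:.6e} M")
```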

Systems of 1 : 1/1 : 2 Stoichiometry

The mass balance equations are formed similarly to the other systems, with $β_{12} = K_{11} K_{12}$ :

(48) ${[A]}_{0} = [A] + [AB] + [AB_{2}] = [A] + K_{11} [A] [B] + K_{11} K_{12} [A] {[B]}^{2} = [A] + β_{11} [A] [B] + β_{12} [A] {[B]}^{2}$

(49) ${[B]}_{0} = [B] + [AB] + 2 [AB_{2}] = [B] + K_{11} [A] [B] + 2 K_{11} K_{12} [A] {[B]}^{2} = [B] + β_{11} [A] [B] + 2 β_{12} [A] {[B]}^{2}$

(50) $0 = {[B]}^{3} A + {[B]}^{2} B + [B] C - {[B]}_{0}$

with the polynomial coefficients

$A = K_{11} K_{12}$

$B = K_{11} (2 K_{12} {[A]}_{0} - K_{12} {[B]}_{0} + 1)$

$C = K_{11} ({[A]}_{0} - {[B]}_{0}) + 1$

Systems of 2 : 1/1 : 1/1 : 2 Stoichiometry

The solution of that system is defined by the mass-balance equations

(51) ${[A]}_{0} = [A] + [AB] + [AB_{2}] + 2 [A_{2}B] = [A] + K_{11} [A] [B] + K_{11} K_{12} [A] {[B]}^{2} + 2 K_{21} K_{11} {[A]}^{2} [B] = [A] + β_{11} [A] [B] + β_{12} [A] {[B]}^{2} + 2 β_{21} {[A]}^{2} [B]$

(52) ${[B]}_{0} = [B] + [AB] + 2 [AB_{2}] + [A_{2}B] = [B] + K_{11} [A] [B] + 2 K_{11} K_{12} [A] {[B]}^{2} + K_{21} K_{11} {[A]}^{2} [B] = [B] + β_{11} [A] [B] + 2 β_{12} [A] {[B]}^{2} + β_{21} {[A]}^{2} [B]$
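
Since equations 51 and 52 are coupled, they are usually solved numerically. One possible approach, shown here as a hedged Python sketch and not as the algorithm implemented in SupraFit, is a nested scheme: for a trial value of [A], equation 52 is a quadratic in [B] with exactly one positive root, and the remaining residual of equation 51 is then driven to zero by bisection on the interval $[0, {[A]}_{0}]$. The function names `free_b` and `solve_free_concentrations` and all numerical values are assumptions for illustration.

```python
import math


def free_b(A, B0, b11, b12, b21):
    """Positive root of equation 52, read as a quadratic in [B] for fixed [A]."""
    p = 1.0 + b11 * A + b21 * A * A   # linear coefficient
    q = 2.0 * b12 * A                 # quadratic coefficient
    # Root of q*B^2 + p*B - B0 = 0 in a form that stays valid for q = 0.
    return 2.0 * B0 / (p + math.sqrt(p * p + 4.0 * q * B0))


def solve_free_concentrations(K11, K12, K21, A0, B0, iterations=200):
    """Free [A] and [B] for the 2 : 1/1 : 1/1 : 2 system via bisection on [A]."""
    b11, b12, b21 = K11, K11 * K12, K11 * K21

    def residual(A):
        B = free_b(A, B0, b11, b12, b21)
        # Right-hand side of equation 51 minus [A]0.
        return A + b11 * A * B + b12 * A * B * B + 2.0 * b21 * A * A * B - A0

    lo, hi = 0.0, A0  # residual(0) = -A0 < 0 and residual(A0) >= 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    A = 0.5 * (lo + hi)
    return A, free_b(A, B0, b11, b12, b21)


# Hypothetical values: stability constants in 1/M, concentrations in M.
K11, K12, K21, A0, B0 = 1.0e3, 50.0, 100.0, 2.0e-3, 1.0e-3
A, B = solve_free_concentrations(K11, K12, K21, A0, B0)
print(f"[A] = {A:.6e} M, [B] = {B:.6e} M")
```

Bisection is slower than a Newton-type iteration, but it cannot leave the physically allowed interval, which makes it a convenient reference solution.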

Acknowledgements

The author thanks Prof. M. Mazik, TU Bergakademie Freiberg, for her support, Dr. Sebastian Förster and Dr. Stefan Kaiser for finding bugs and for constructive feedback on SupraFit, Dr. Jürgen Seidel for his helpful feedback on this manuscript and Mara Büßemeyer for proofreading. C.H. gratefully acknowledges the Centre of Advanced Study and Research – Freiberg (GraFA) and the Saxonian Ministry of Science, Culture and Tourism (SMWK) (project number 100333374) for funding. The reviewers are thanked for their constructive feedback.

Conflict of interest

The authors declare no conflict of interest.

Data Availability Statement

The data that support the findings of this study are available in the supplementary material of this article.

ⁿ¹See http://www.hyperquad.co.uk/index.htm for more information on how to obtain parts of the Hyperquad software products.

ⁿ²During Monte Carlo simulations the Levenberg-Marquardt optimisation was not as efficient as the approach described above. However, a detailed benchmark was not prepared.

ⁿ³See https://octave.org/doc/v4.0.1/Descriptive-Statistics.html. Last visited 17. 01. 2022.

ⁿ⁴Monte Carlo simulations are distributed across the threads, so that each thread performs roughly $T / N_{Threads}$ optimisations.

ⁿ⁵ $δ_{θ}$ is both positive and negative, so that $θ_{i}$ is tested for values smaller and greater than $\hat{θ}_{i}$ .

ⁿ⁶This is expected as they resample the original normally distributed random numbers.

ⁿ⁷The benchmark was obtained on an i9-7920X CPU with 12 cores overclocked to 4.00 GHz, using openSUSE Leap 15.0. SupraFit was compiled using gcc 7.4.1.

© 2022. This work is published under http://creativecommons.org/licenses/by-nc/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

A novel application to determine stability constants from supramolecular titration experiments is presented. The focus lies on NMR titration and ITC experiments for pure 1 : 1 systems, as well as mixed 2 : 1/1 : 1, 1 : 1/1 : 2 and 2 : 1/1 : 1/1 : 2 systems. SupraFit provides global and local fitting and a global search tool. Statistical methods are implemented and can be applied to analyse the results of nonlinear regression. Monte Carlo simulations, combined with the percentile methods and F-Test approaches to calculate confidence intervals are supported. The implemented statistical approaches are illustrated and discussed on model functions. All methods are accessible through an intuitive user interface, providing charts for all (kind of) data produced. SupraFit is written in C++, using the Qt Toolkit for the Graphical User Interface (GUI) and the Eigen library for nonlinear regression and is released under the GNU Public License (GPL).

Details

Title

SupraFit – An Open Source Qt Based Fitting Application to Determine Stability Constants from Titration Experiments**

Author

Hübler, Conrad¹

¹ Institut für Organische Chemie, Technische Universität Bergakademie Freiberg, Freiberg

Section

Research Articles

Publication year

2022

Publication date

Jul 2022

Publisher

John Wiley & Sons, Inc.

e-ISSN

26289725

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1002/cmtd.202200006

ProQuest document ID

2822734100
