1. Introduction
Liquid chromatography combined with high-resolution mass spectrometry (LC-HRMS) is a frequently applied technique for suspect screening (SS) and non-target screening (NTS) in metabolomics, human biomonitoring, and environmental toxicology [1]. However, correctly identifying compounds based on SS or NTS approaches remains challenging [2,3,4,5]. In most studies, only a small percentage of compounds are identified. The molecular formula of unknown compounds detected in LC-HRMS data can be derived from MS1 data by using the accurate mass and isotope pattern and applying the seven golden rules [6]. For further compound identification, MS2 spectra are preferred. A reference standard is needed for compound identification, and both the spectrum and retention time must match the reference standard. A lack of reference standards was recently identified [7] as a major challenge in environmental and human risk assessment sciences, calling—among other measures—for better availability of reference standards. Putative identification of compounds is often performed because of the limited availability of reference standards. For putative identification, the MS2 spectrum must match a spectrum in a mass spectral library or diagnostic evidence, such as characteristic fragments, as Schymanski et al. described [8].
Mass spectral libraries containing experimental MS2 spectra for compound identification are continuously growing. mzCloud [9] and MassBank of North America (MoNA) are two examples. mzCloud contains over 9 million HRMS spectra obtained with Orbitrap instruments, covering over 20,000 unique compounds, and is commercially maintained. MoNA includes almost 2 million spectra, a combination of user-generated HRMS spectra obtained with (Q)-TOF or Orbitrap instruments and in silico predicted spectra, covering over 650,000 unique compounds. Even so, mass spectral libraries cover only a small fraction of known chemistry: the CAS registry contains over 127 million unique compounds. Library growth is limited by the costs of purchasing or synthesizing millions of compounds. This problem is most noticeable for certain compound groups, such as metabolites, for which reference standards are often not commercially available. Due to the lack of reference standards, these compounds are often not included in mass spectral libraries unless scientists actively compile high-quality libraries [10]. Consequently, mass spectral libraries cover only a limited part of chemical space, which restricts their use for compound identification. Additionally, identification results obtained with spectral libraries are strongly influenced by instrument type (analyzer and collision cell), by instrument settings, such as precursor isolation width, and by spectral curation steps, such as noise removal [11,12].
Multiple software tools have been developed to aid in the identification of compounds not included in mass spectral libraries [4,13]. These software tools use in silico predicted MS2 spectra to identify compounds instead of matching the MS2 spectra against a mass spectral library. In silico mass spectrum prediction is performed using molecular fingerprinting, rule-based fragmentation prediction, artificial intelligence, or a combination of these techniques. The experimentally obtained MS2 spectra that the user imports in these software tools are matched to the in silico MS2 spectra. Examples of these tools are CFM-ID [14], Chemdistiller [15], and MSfinder [16]. CFM-ID utilizes hybrid machine learning and rule-based fragmentation prediction. Chemdistiller is based on structural fingerprints and a machine learning algorithm. MSfinder uses rule-based in silico fragmentation prediction using hydrogen rearrangement rules.
Identification software tools are commonly evaluated using experimental MS2 spectra of chemical reference standards in solvent, often taken from curated mass spectral libraries. These spectra are less complex than MS2 spectra of the same compounds in matrix extracts. Moreover, differences can be observed in MS2 spectra obtained by different acquisition modes. The most commonly used acquisition mode for HRMS2 spectra is a data-dependent top-n strategy (DDA), in which the n highest-intensity m/z values in a full-scan spectrum are selected for MS2 acquisition. This strategy yields MS2 spectra containing few interferences, which makes them relatively easy to interpret. However, the downside of this approach is that MS2 spectra are not acquired for compounds that fall outside the top n, making compound identification for low-abundance signals impossible. To avoid this problem, data-independent acquisition (DIA) can be used. In DIA, all co-eluting ions in a selected m/z range are fragmented together, and their fragments are recorded in one fragment spectrum. DIA fragmentation spectra therefore contain a composite of several co-eluting compounds and are more challenging to match to spectral libraries or in silico predicted spectra.
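To make the top-n selection of DDA concrete, it can be sketched as below. This is a simplified illustration under stated assumptions, not the acquisition logic of any instrument: the function name `select_top_n`, the (m/z, intensity) peak tuples, and the rounding-based dynamic exclusion are hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def select_top_n(
    ms1_peaks: List[Peak],
    n: int,
    scan_time_s: float,
    excluded_until: Dict[float, float],
    exclusion_s: float = 10.0,
    mz_decimals: int = 2,
) -> List[float]:
    """Pick the n most intense precursors that are not under dynamic exclusion.

    `excluded_until` maps a rounded m/z to the time (s) at which its
    exclusion ends; it is updated in place (assumed bookkeeping).
    """
    selected: List[float] = []
    # Walk through the MS1 peaks from most to least intense.
    for mz, _intensity in sorted(ms1_peaks, key=lambda p: p[1], reverse=True):
        key = round(mz, mz_decimals)
        # Skip precursors that were fragmented recently.
        if excluded_until.get(key, float("-inf")) > scan_time_s:
            continue
        selected.append(mz)
        excluded_until[key] = scan_time_s + exclusion_s
        if len(selected) == n:
            break
    return selected
```

In this sketch, with n = 2, a weak precursor is only fragmented once the more intense ions are under exclusion. When many intense ions co-elute it may never be selected, which is the limitation that DIA avoids.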
Little is known about how different software tools compare in identifying compounds in solvent standards and spiked into complex matrix extracts, especially for DIA spectra. In this study, we evaluated the capabilities of the software tools CFM-ID, Chemdistiller, and MSfinder and the spectral library mzCloud for compound identification. The identification capability of each tool was assessed using HRMS2 spectra of 32 compounds in solvent standards and of the same compounds spiked into complex feed extracts, acquired in DDA and DIA mode, and the obtained success rates were compared.
2. Materials and Methods
2.1. Chemicals
ULC grade methanol, acetonitrile, and water were obtained from Actu-All Chemicals (Oss, The Netherlands). Formic acid, acetic acid, and sodium acetate were obtained from VWR International (Darmstadt, Germany). Magnesium sulfate was obtained from Sigma-Aldrich (St. Louis, MO, USA). Reference standards were purchased from HPC Standards (Borsdorf, Germany), Sigma-Aldrich, Witega (Berlin, Germany), and Santa Cruz (Dallas, TX, USA). To challenge the software tools, the selection included compounds that share a molecular formula and have very similar MS2 spectra.
2.2. Solvent Standard and Spiked Feed Extract Preparation
The solvent standards were analyzed as three mix solutions (A, B, and C; Table 1). The mix solutions were prepared in-house from individual stock standard solutions and diluted in methanol to the desired concentration. The compounds were divided between the three mix solutions to avoid the inclusion of co-eluting compounds of the same molecular formula in one mix. Together, mix A, B, and C contained 32 veterinary drugs and pesticides in varying concentrations (40–2000 µg/L). The concentration of each compound was chosen to reflect the maximum residue limit and to ensure good detectability. The compound selection, including concentrations for each compound, is listed in Table 1. Only the high-concentration level data was evaluated for solvent standards.
Animal feed, a compound feed for poultry, was extracted using a QuEChERS-based method. A total of 5 mL of water and 10 mL 1% acetic acid in acetonitrile was added to 5 g of animal feed. After extraction by shaking end-over-end at 50 rpm for 30 min, 1 g sodium acetate and 4 g magnesium sulfate were added to induce phase separation. The extracts were then centrifuged for 5 min at 3500 rpm. For the high concentration level, three 0.5 mL aliquots of the organic layer were transferred to clean tubes, and 0.5 mL of mix standard solutions A, B, and C were added to the 3 separate tubes. After evaporation to dryness, the extracts were reconstituted in 0.5 mL 50% methanol and transferred to an LC vial. For the low concentration level, three 0.5 mL aliquots were transferred to clean tubes, and 0.25 mL of mix standard solutions A, B, and C were added to the three tubes. After evaporation to dryness, the extracts were reconstituted in 500 µL 50% methanol and transferred to an LC vial.
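The spiking scheme above fixes the nominal analyte concentration in the final extract. As a rough check, the arithmetic can be written as follows. The helper name `final_concentration_ug_per_l` is ours, and the calculation assumes no losses during evaporation and negligible native residues in the feed, neither of which is stated in the text.

```python
def final_concentration_ug_per_l(
    mix_conc_ug_per_l: float,
    mix_volume_ml: float,
    reconstitution_volume_ml: float,
) -> float:
    """Nominal concentration after evaporation to dryness and reconstitution."""
    # Mass of analyte added (µg) = concentration (µg/L) * spiked volume (L).
    mass_ug = mix_conc_ug_per_l * (mix_volume_ml / 1000.0)
    # All of that mass is assumed to end up in the reconstitution volume.
    return mass_ug / (reconstitution_volume_ml / 1000.0)


# A compound present at 2000 µg/L in its mix solution:
high = final_concentration_ug_per_l(2000.0, 0.5, 0.5)   # high level
low = final_concentration_ug_per_l(2000.0, 0.25, 0.5)   # low level
```

Under these assumptions the high level equals the mix concentration and the low level is half of it, consistent with the High and Low columns of Table 1.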
2.3. Instrumental Analysis
Liquid chromatography–high-resolution mass spectrometry (LC-HRMS) analysis was performed using an Ultimate 3000 UHPLC system coupled to a Q-Exactive Orbitrap™ system with a HESI-II electrospray source (Thermo Scientific, San Jose, CA, USA). The system was controlled using the software packages Xcalibur 4.3, Chromeleon 7.2.9, and Q-Exactive Tune 2.11. The instrument was calibrated before every analysis sequence with a maximum mass deviation of 1 ppm using Pierce LTQ ESI positive ion calibration solution (Thermo Scientific). Chromatographic and overall system performance was checked by analyzing a standard solution of 5 compounds before analysis and comparing mass accuracy, retention time, and intensity with previous performance test data.
For DIA and DDA, the same chromatography was used. The eluents for the LC separation were (A) water and (B) methanol:water 95:5 (v/v), both containing 2 mM ammonium formate and 20 µL formic acid per liter. The LC flow rate was 300 µL min−1. The following gradient was used: 0% B for the first 0.1 min, linear to 45% B in 1.9 min, followed by a rise to 100% B in 6 min. After 6 min of 100% B, a switch back to 0% B was performed in 0.5 min before equilibration at 0% B for 4.5 min. The injection volume was 5 µL. The chromatographic separation was performed on an Atlantis T3 analytical column (100 mm × 3 mm, 3 µm particles, Waters, Milford, MA, USA) at a column temperature of 40 °C.
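The gradient program can be expressed as a piecewise-linear function of time. The sketch below reads each "in X min" in the description as the duration of a consecutive segment, which gives a total run time of 19 min; that reading, and the names `GRADIENT` and `percent_b`, are assumptions for illustration.

```python
from typing import List, Tuple

# (time in min, %B) breakpoints derived from the segment durations in the text.
GRADIENT: List[Tuple[float, float]] = [
    (0.0, 0.0),     # start at 0% B
    (0.1, 0.0),     # hold 0% B for 0.1 min
    (2.0, 45.0),    # linear to 45% B in 1.9 min
    (8.0, 100.0),   # rise to 100% B in 6 min
    (14.0, 100.0),  # hold 100% B for 6 min
    (14.5, 0.0),    # back to 0% B in 0.5 min
    (19.0, 0.0),    # equilibrate at 0% B for 4.5 min
]


def percent_b(t_min: float) -> float:
    """Return the programmed %B at time t_min by linear interpolation."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the programmed run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole run")
```

For example, `percent_b(5.0)` lies halfway along the 45% to 100% B ramp.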
Solvent standards and spiked feed extracts were analyzed in DIA and DDA mode. The DIA method was previously described by Zomer et al. [17]. In the DIA method, a total of 6 scan events were combined: 1 full scan event with a resolving power of 70,000 (defined at m/z 200, FWHM) for m/z range 135–1000 and 5 DIA fragmentation events with a resolving power of 35,000. The fragment scan events selected the precursor ion ranges m/z 95–205, 195–305, 295–405, 395–505, and 495–1005 for simultaneous fragmentation. DDA was performed by combining 2 scan events: 1 full scan event (mass range m/z 135–1000) with a resolving power of 70,000 (defined at m/z 200, FWHM) and a top 5-MS2 event with a resolving power of 17,500 (defined at m/z 200, FWHM) using a dynamic exclusion time of 10 s. In both DIA and DDA, the fragmentation was performed using higher-energy collisional dissociation (HCD) with a stepped collision energy of 30, 80 NCE. Figure 1 shows a visual representation of the DDA and DIA methods.
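The five DIA precursor ranges overlap by 10 m/z units at their boundaries, so a precursor close to a boundary is fragmented in two events. A small lookup illustrates this; the names `DIA_WINDOWS` and `windows_for_precursor` are hypothetical, and treating the window limits as inclusive is an assumption.

```python
from typing import List, Tuple

# Precursor ranges (m/z) of the five DIA fragmentation events, as listed above.
DIA_WINDOWS: List[Tuple[float, float]] = [
    (95.0, 205.0),
    (195.0, 305.0),
    (295.0, 405.0),
    (395.0, 505.0),
    (495.0, 1005.0),
]


def windows_for_precursor(mz: float) -> List[int]:
    """Return the 1-based indices of the DIA events that co-fragment this m/z."""
    return [
        index
        for index, (low, high) in enumerate(DIA_WINDOWS, start=1)
        if low <= mz <= high
    ]
```

A precursor at m/z 450 falls in the fourth window only, whereas one at m/z 200 is fragmented in both the first and the second window.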
2.4. Data Processing
Spectra were manually exported using Xcalibur Qualbrowser version 4.5 (Thermo Scientific). For each MS2 spectrum, the top 20 highest intensity peaks were converted to the required file format for each tested software tool. DIA spectra were used without deconvolution. The experimental parent m/z and a 5-ppm mass tolerance were used as MS1 input. In all tools, all the available databases were used. Other settings were kept at default. The identification software tools were Chemdistiller version 0.1, MSfinder version 3.44, CFM-ID 3.0, and mzCloud. Both CFM-ID and mzCloud were accessed online between November 2020 and February 2021. The spectral library mzCloud contains multiple spectra of each compound generated with Orbitrap analyzers. For most compounds, both HCD spectra and collision-induced dissociation (CID) spectra at multiple collision energies are available in mzCloud. In matching the experimental spectra to the library, no restriction on the type of collision cell or the collision energy matching tolerance was set. The other tools (Chemdistiller, MSfinder, CFM-ID) use in silico fragmentation instead of library spectra. Therefore, the type of analyzer, collision cell, and collision energy have limited effect on the identification results. We chose these tools to focus on single-step approaches that are easy to use and require limited expert knowledge.
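The two numerical preprocessing steps, keeping the 20 most intense peaks and applying a 5-ppm precursor tolerance, can be sketched as follows. The spectra in this study were exported manually, so this is only an illustration of the arithmetic; the function names and the choice to return peaks in ascending m/z order are assumptions.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def top_n_peaks(peaks: List[Peak], n: int = 20) -> List[Peak]:
    """Keep the n most intense peaks and return them sorted by m/z."""
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:n]
    return sorted(strongest)


def ppm_window(mz: float, tol_ppm: float = 5.0) -> Tuple[float, float]:
    """Return the (low, high) m/z limits for a relative tolerance in ppm."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta
```

At m/z 400, a 5-ppm tolerance corresponds to a window of ±0.002 m/z units.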
The top three predicted identities and corresponding molecular formula for each measured spectrum were determined with all four software tools. If the correct compound was identified in the top 3 results, we considered the compound as identified correctly. It should be noted that some software tools repeat the same compound multiple times in the results for one spectrum based on multiple database sources. We strictly used the top 3 results, even if the top 3 contained the same compound identification multiple times.
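The scoring rule can be written down in a few lines. The sketch assumes that candidates are compared by name after trimming and lower-casing, which is a simplification for illustration; `identified_in_top_k` is a hypothetical helper, not part of any of the evaluated tools.

```python
from typing import Sequence


def identified_in_top_k(
    ranked_candidates: Sequence[str],
    correct_compound: str,
    k: int = 3,
) -> bool:
    """True if the correct compound occurs among the first k ranked results.

    Duplicate entries are NOT collapsed: a compound repeated from several
    database sources still occupies several of the k positions.
    """
    top = [name.strip().lower() for name in ranked_candidates[:k]]
    return correct_compound.strip().lower() in top


# The same candidate reported twice pushes the fourth entry out of the top 3.
results = ["Sulfadoxine", "Sulfadoxine", "Sulfadimethoxine", "Sulfalene"]
```

With this result list, sulfadimethoxine counts as identified, but sulfalene does not, because the repeated first hit uses up two of the three positions.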
3. Results
In Table 2, the percentages of correctly identified compounds are shown for DDA and DIA spectra for each software tool in the solvent standard and spiked feed extract. Three compounds in our test set were not included in the databases used by Chemdistiller. Therefore, two scores are given for this software tool: the score with the complete test set and the corrected score where the missing compounds are removed from the test set.
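The two Chemdistiller scores differ only in the denominator. As a minimal sketch (the helper `success_rate` is ours), the calculation below uses 22 correct identifications, which is the number of T entries for Chemdistiller in the DDA solvent standard column of Table 3; the three missing compounds are all scored F there, so removing them leaves the numerator unchanged.

```python
def success_rate(n_correct: int, n_total: int, n_missing: int = 0) -> int:
    """Percentage of correct identifications, rounded to a whole percent.

    n_missing compounds (absent from the tool's databases) are removed
    from the denominator to obtain the corrected score.
    """
    return round(100 * n_correct / (n_total - n_missing))


full_score = success_rate(22, 32)            # complete test set
corrected_score = success_rate(22, 32, 3)    # three compounds removed
```

This reproduces the 69% and 76% reported for Chemdistiller with DDA spectra of solvent standards.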
For the DDA spectra of compounds in solvent standards, roughly 80% of the spectra were identified correctly. mzCloud has the highest identification score (84%). The other investigated software tools yielded results with slightly lower scores.
The percentage of correctly identified compounds was slightly lower for the solvent standard DIA spectra than for the DDA spectra, with an average of 69% compared to 79% (averaged over the four tools, using the corrected Chemdistiller score). This was expected since DIA spectra are more complex than DDA spectra, although in solvent standards this difference is less pronounced than in spiked feed extracts, as solvent standards contain fewer interfering compounds. MSfinder and CFM-ID performed best with DIA spectra of solvent standards, with 72% correct identifications. Chemdistiller (corrected score) and mzCloud achieved 66% correct identifications in DIA spectra of solvent standards.
Chemdistiller and mzCloud also performed worse in identifying compounds from DIA spectra in spiked feed extracts. Especially at the lower evaluated concentration, identification success rates dropped to 38% (corrected score) and 31%, respectively. The performance of MSfinder and CFM-ID in identifying compounds from DIA spectra in spiked feed extracts was higher than that of Chemdistiller and mzCloud but slightly lower than their performance in DDA spectra, with 72% and 63% correct identifications, respectively, in spiked feed extract at the lower concentration. Lower scores in spiked feed extracts compared to solvent standards are expected due to the high complexity of the animal feed matrix. However, the results from MSfinder and CFM-ID seem less affected by these interferences than those of the other tools.
Lower performance in the spiked feed extracts (especially in DIA spectra) is mainly caused by matrix interferences, which are abundant in the DIA spectra and cause false identifications. Figure 2 shows an example of foramsulfuron. In the DDA spectra (A, C), foramsulfuron fragments C3H3ON2+, C7H8N3O3+, and C10H11N2O4S+ are the most abundant fragments. In the DIA spectra, these fragments are less abundant in solvent standard (B) and not visible in animal feed extract (D), where matrix interferences dominate the spectrum. Foramsulfuron was identified correctly in all software tools using DDA spectra but not in three of the tested software tools using DIA spectra. Only MSfinder correctly identified the DIA spectrum of foramsulfuron.
In Table 3, the results per compound are listed. The only compound correctly identified in all spectra (DDA and DIA in solvent and spiked feed extract) was sulfaguanidine, while ketoprofen and sulfadoxine were correctly identified in all spectra except one. However, some compounds were difficult to identify correctly, for example, sulfalene/sulfamonomethoxine and levofloxacin/ofloxacin. These two pairs are already difficult to distinguish in targeted acquisition and processing methods, as the compounds in each pair have the same m/z and retention time, and their most intense fragments have the same m/z. Several other compound sets (e.g., sulfamoxole/sulfisoxazole, doxycycline/epi-doxycycline, tetracycline/epi-tetracycline) are only distinguished by slightly different retention times (<0.5 min), not by m/z or by their most intense fragments. They can only be distinguished by low abundant fragments. These compounds are most frequently identified incorrectly by the evaluated software tools.
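Compounds that share a molecular formula cannot be separated by MS1-based formula assignment, so such sets are worth flagging before interpreting the scores. The sketch below groups a subset of the Table 1 entries by formula; the dictionary `FORMULAS` and the function `isomer_sets` are illustrative names, and only eight of the 32 compounds are included to keep the example short.

```python
from collections import defaultdict
from typing import Dict, List

# Subset of Table 1 (name -> molecular formula).
FORMULAS: Dict[str, str] = {
    "Sulfalene": "C11H12N4O3S",
    "Sulfamonomethoxine": "C11H12N4O3S",
    "Sulfamoxole": "C11H13N3O3S",
    "Sulfisoxazole": "C11H13N3O3S",
    "Levofloxacin": "C18H20FN3O4",
    "Ofloxacin": "C18H20FN3O4",
    "Sulfaguanidine": "C7H10N4O2S",
    "Cyproconazole": "C15H18ClN3O",
}


def isomer_sets(formulas: Dict[str, str]) -> Dict[str, List[str]]:
    """Return formula -> sorted compound names for formulas shared by 2+ compounds."""
    by_formula: Dict[str, List[str]] = defaultdict(list)
    for name, formula in formulas.items():
        by_formula[formula].append(name)
    return {f: sorted(names) for f, names in by_formula.items() if len(names) > 1}


shared = isomer_sets(FORMULAS)
```

For this subset, three formulas are shared by two compounds each, while sulfaguanidine and cyproconazole have unique formulas.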
Some other compounds also proved to be hard to identify, for example, cyproconazole. Cyproconazole was only identified correctly by CFM-ID, which provided correct identification in 5 out of 6 tested spectra, and by mzCloud in the DDA spectrum in solvent standard. The spectra of cyproconazole show only three fragments with a relative intensity >10%: the azole fragment C2H4N3+, the fragment C7H6Cl+, and its 37Cl isotope peak. These fragments are common in azole-type pesticides. Therefore, the assignment of the correct compound within this compound class is complex.
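The expected m/z values of these three fragments follow directly from their elemental compositions. The sketch below uses standard monoisotopic masses rounded to eight decimals; the names `MONOISOTOPIC`, `ELECTRON`, and `cation_mz` are hypothetical, and the compositions are taken as the fragment ions given in the text.

```python
from typing import Dict

# Monoisotopic masses (u) of the isotopes needed here, and the electron mass.
MONOISOTOPIC: Dict[str, float] = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "35Cl": 34.96885268,
    "37Cl": 36.96590259,
}
ELECTRON = 0.00054858


def cation_mz(composition: Dict[str, int], charge: int = 1) -> float:
    """m/z of a cation whose composition already includes all its atoms."""
    neutral = sum(MONOISOTOPIC[el] * count for el, count in composition.items())
    # Remove one electron mass per positive charge.
    return (neutral - charge * ELECTRON) / charge


azole = cation_mz({"C": 2, "H": 4, "N": 3})              # C2H4N3+
chloro_35 = cation_mz({"C": 7, "H": 6, "35Cl": 1})       # C7H6Cl+ with 35Cl
chloro_37 = cation_mz({"C": 7, "H": 6, "37Cl": 1})       # same ion with 37Cl
```

The two chlorine-containing peaks are separated by about 1.997 m/z units, and the 37Cl peak is expected at roughly one third of the intensity of the 35Cl peak, reflecting the natural abundance of the chlorine isotopes.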
On the other hand, all sulphonamides in the test set share multiple of the most abundant fragments. The sulphonamides included in the test set consist of three groups of two compounds each with the same molecular formula (sulfalene/sulfamonomethoxine, sulfamoxole/sulfisoxazole, sulfadimethoxine/sulfadoxine), which makes identification even more challenging. Still, as mentioned previously, most sulphonamides, except for sulfalene/sulfamonomethoxine, were correctly identified in both DDA and DIA spectra. The sulphonamides were present in a relatively high concentration in the solvent standard and spiked feed extracts. This increases the detection of lower abundance fragments needed for correct identification.
4. Discussion and Conclusions
Lately, DIA data acquisition approaches are increasingly used in untargeted analysis [18,19,20], as DIA can now be combined with DDA in one acquisition method. This approach is made possible by the latest-generation HRMS instruments, which have a much higher scan speed than earlier ones. However, research on small molecule compound identification with software tools using DIA data is unavailable. Therefore, this study evaluated the identification success rate of software tools CFM-ID, Chemdistiller, MSfinder, and spectral library mzCloud using DDA and DIA spectra of solvent standards and spiked feed extracts.
The four evaluated software identification tools had a similar success rate in DDA spectra of solvent standards. Spectral library mzCloud provided the highest percentage (84%) of correct identifications in DDA spectra of solvent standards, followed by CFM-ID (81%). Our results applying in silico tools to DDA spectra are in line with results in the literature. In the MSfinder release article [16], 82.1% of the spectra were correctly identified in the top 3 results. In the CFM-ID 3.0 release article [14], the software correctly identified 93.3% of the spectra in the top 3 results. In the software comparison of Blaženović et al. [21], in which the CASMI 2016 spectra of environmental xenobiotics and drugs were used, CFM-ID and MSfinder performed comparably, with 91.7% and 91.0% correct identifications in the top 5, respectively. Chemdistiller and mzCloud were not included in this comparison. Chemdistiller identified 86% of the test compounds correctly in the top 5 results in its release article [15]. To our knowledge, the identification success rate of mass spectral library mzCloud has not previously been studied.
The results of this study show that identification using DIA spectra is more challenging for most software tools, especially in spiked feed extracts. Feed extract is a highly complex matrix, and very limited extract clean-up was performed. More clean-up of the extracts is not desirable in NTS as it can lead to the removal of compounds of interest. Despite the presence of matrix interferences, DIA spectra can be used for compound identification, with the highest success rates using MSfinder and CFM-ID. These two packages had similar identification scores for DIA and DDA spectra. Both software packages use in silico approaches for identification. It might be possible to improve the identification results in DIA spectra by applying spectral deconvolution prior to the use of the identification tools. In this study, we focused on easy to use, single-step approaches; therefore, we did not evaluate this option.
The conclusion of the first critical assessment of small molecule identification (CASMI) contest in 2013 was that spectral libraries provide better results than in silico tools [22]. However, in cases where the reference spectrum was not included in the library, in silico approaches and expert knowledge were required to obtain the correct identification. Our results show that mass spectral libraries and in silico approaches have a comparable success rate in DDA spectra. In contrast, the in silico approaches are more successful in compound identification for DIA spectra.
In this study, only a limited number of compounds were included in the software evaluation. The compound selection mainly consisted of pesticides and veterinary drugs. These compound groups and their fragmentation patterns are well known and commonly used for scripting rules for in silico fragmentation prediction or as training data for machine learning approaches. Therefore, these compounds are easy to identify in automated approaches. However, our compound selection also included various challenges, for example, metabolites and multiple isomeric compounds. Additionally, we did not evaluate the identification success rate for combined approaches using multiple software tools. A combined approach may yield higher success rates, as described by Blaženović et al. [21]. We focused on single-step approaches that are easy to use and require limited expert knowledge.
To summarize, this study demonstrates that in silico annotation tools and spectral library matching offer similar success rates in DDA spectra. DIA spectra can be used for compound annotation in a simple workflow without prior spectral deconvolution, and the rule-based approaches used by MSfinder and CFM-ID offer the best identification results out of the evaluated tools for DIA spectra, especially in complex matrix extracts.
Author Contributions: Conceptualization, R.N., M.H.B., R.S.W., E.d.L., S.P.J.v.L., B.J.A.B. and M.G.M.v.d.S.; Formal analysis, R.N. and R.S.W.; Funding acquisition, M.G.M.v.d.S.; Investigation, R.N. and R.S.W.; Methodology, R.N., M.H.B., R.S.W., S.P.J.v.L., B.J.A.B. and M.G.M.v.d.S.; Project administration, M.G.M.v.d.S.; Software, R.N., R.S.W. and E.d.L.; Writing (original draft), R.N. and M.H.B.; Writing (review and editing), R.N., M.H.B., R.S.W., E.d.L., S.P.J.v.L., B.J.A.B. and M.G.M.v.d.S. All authors have read and agreed to the published version of the manuscript.
No potential conflict of interest was reported by the authors.
Figure 1. DDA and DIA acquisition methods visualized. Left: the scans in each acquisition method. Right: a representation of the MS1 spectrum, in which the coloring reflects the selections for the fragmentation scans. The DDA method (top) consists of a full scan measurement with a range of 135–1000 m/z, followed by acquisition of 5 consecutive data-dependent MS2 spectra of the 5 signals with the highest intensity in the full scan MS1. The DIA method (bottom) consists of a full scan measurement with a range of 135–1000 m/z, followed by the acquisition of 5 consecutive data-independent fragmentation spectra of precursor mass ranges 95–205 m/z, 195–305 m/z, 295–405 m/z, 395–505 m/z, and 495–1005 m/z.
Figure 2. (A) DDA spectrum of foramsulfuron in solvent standard, (B) DIA spectrum of foramsulfuron in solvent standard, (C) DDA spectrum of foramsulfuron in animal feed extract, (D) DIA spectrum of foramsulfuron in animal feed extract. The annotated fragments are highly abundant in the DDA spectra but less abundant in the DIA spectrum in solvent and not visible in the DIA spectrum in animal feed matrix, where matrix interferences are abundant.
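Table 1. Compound selection, with the high and low concentration levels and the mix solution (A, B, or C) of each compound.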
Name | Formula | High (µg/L) | Low (µg/L) | Mix |
---|---|---|---|---|
Albendazole-sulfone | C12H15N3O4S | 2000 | 1000 | A |
Albendazole-sulfoxide | C12H15N3O3S | 2000 | 1000 | C |
Clomazone | C12H14ClNO2 | 100 | 50 | A |
Cyproconazole | C15H18ClN3O | 100 | 50 | A |
Doxycycline | C22H24N2O8 | 2000 | 1000 | C |
Epi-Doxycycline | C22H24N2O8 | 2000 | 1000 | B |
Fenbendazole-sulfoxide | C15H13N3O3S | 200 | 100 | A |
Fenbufen | C16H14O3 | 500 | 250 | A |
Foramsulfuron | C17H20N6O7S | 100 | 50 | A |
Indoprofen | C17H15NO3 | 500 | 250 | C |
Ketoprofen | C16H14O3 | 500 | 250 | C |
Levamisole | C11H12N2S | 40 | 20 | A |
Levofloxacin | C18H20FN3O4 | 1000 | 500 | B |
Mebendazole-hydroxy | C16H15N3O3 | 40 | 20 | C |
Minocycline | C23H27N3O7 | 2000 | 1000 | B |
Naproxen | C14H14O3 | 200 | 100 | A |
Niflumic acid | C13H9F3N2O2 | 500 | 250 | A |
Ofloxacin | C18H20FN3O4 | 1000 | 500 | C |
Oxytetracycline | C22H24N2O9 | 2000 | 1000 | C |
Propyphenazone | C14H18N2O | 500 | 250 | C |
Spinosyn-A | C41H65NO10 | 100 | 50 | A |
Sulfacetamide | C8H10N2O3S | 2000 | 1000 | A |
Sulfadimethoxine | C12H14N4O4S | 2000 | 1000 | B |
Sulfadoxine | C12H14N4O4S | 2000 | 1000 | C |
Sulfaguanidine | C7H10N4O2S | 2000 | 1000 | C |
Sulfalene | C11H12N4O3S | 2000 | 1000 | B |
Sulfamonomethoxine | C11H12N4O3S | 2000 | 1000 | A |
Sulfamoxole | C11H13N3O3S | 2000 | 1000 | A |
Sulfisoxazole | C11H13N3O3S | 2000 | 1000 | A |
Tetracycline | C22H24N2O8 | 2000 | 1000 | B |
Epi-tetracycline | C22H24N2O8 | 2000 | 1000 | B |
Tetramisole | C11H12N2S | 40 | 20 | C |
Table 2. Percentage of correctly identified compounds in the top 3 for DDA and DIA spectra using the tested software tools for 32 compounds in solvent standards and spiked into animal feed extracts.
Acquisition | Software tool | Solvent Standard (high concentration) | Spiked Feed Extract (high concentration) | Spiked Feed Extract (low concentration) |
---|---|---|---|---|
DDA | MSfinder | 75% | 78% | 81% |
DDA | CFM-ID | 81% | 81% | 72% |
DDA | Chemdistiller | 69% (76% *) | 69% (76% *) | 66% (72% *) |
DDA | mzCloud | 84% | 88% | 84% |
DIA | MSfinder | 72% | 75% | 72% |
DIA | CFM-ID | 72% | 72% | 63% |
DIA | Chemdistiller | 59% (66% *) | 47% (52% *) | 34% (38% *) |
DIA | mzCloud | 66% | 44% | 31% |
* score corrected for three compounds missing in Chemdistiller databases.
Table 3. Identification results per compound. T: correct identification in top 3 results, F: no correct identification in top 3 results. Column labels give the acquisition mode (DDA or DIA), the sample (SS: solvent standard, SF: spiked feed extract), the concentration level (H: high, L: low), and the tool: (1) MSfinder, (2) CFM-ID, (3) Chemdistiller, (4) mzCloud.
Compound | DDA-SS-H-1 | DDA-SS-H-2 | DDA-SS-H-3 | DDA-SS-H-4 | DDA-SF-H-1 | DDA-SF-H-2 | DDA-SF-H-3 | DDA-SF-H-4 | DDA-SF-L-1 | DDA-SF-L-2 | DDA-SF-L-3 | DDA-SF-L-4 | DIA-SS-H-1 | DIA-SS-H-2 | DIA-SS-H-3 | DIA-SS-H-4 | DIA-SF-H-1 | DIA-SF-H-2 | DIA-SF-H-3 | DIA-SF-H-4 | DIA-SF-L-1 | DIA-SF-L-2 | DIA-SF-L-3 | DIA-SF-L-4 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Albendazole sulfone | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | F | T | T | T | F |
Albendazole sulfoxide | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | F | F |
Clomazone | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | F | T | T | T | F | T | T | T | F |
Cyproconazole | F | T | F | T | F | T | F | F | F | F | F | F | F | T | F | F | F | T | F | F | F | T | F | F |
Doxycycline | T | T | T | T | T | T | T | T | T | T | F | T | T | F | F | F | T | T | F | F | T | T | F | F |
epi-doxycycline * | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F | F |
epi-tetracycline * | F | F | F | F | F | F | F | T | F | F | F | T | F | T | F | T | F | F | F | F | F | F | F | F |
Fenbendazole sulfoxide | F | T | T | T | T | T | T | T | T | T | T | T | T | F | T | T | T | T | T | F | F | F | T | F |
Fenbufen | F | F | T | T | F | F | T | T | F | F | T | T | F | T | T | T | F | F | T | F | F | F | T | F |
Foramsulfuron | T | T | T | T | T | T | T | T | T | T | T | T | F | F | T | F | T | F | F | F | T | F | F | F |
Indoprofen | T | F | T | T | T | F | T | T | T | F | F | T | T | F | T | T | F | F | T | T | T | F | T | F |
Ketoprofen | T | T | T | T | T | T | T | T | T | T | T | T | T | F | T | T | T | T | T | T | T | T | T | T |
Levamisole | T | T | T | T | T | T | T | T | T | T | T | T | T | T | F | T | F | T | F | F | F | T | F | F |
Levofloxacin | T | T | F | T | T | T | F | T | T | T | F | T | T | T | F | T | T | T | F | T | T | T | F | T |
Mebendazole-hydroxy * | T | F | F | T | F | F | F | T | T | F | F | T | F | T | F | F | T | F | F | F | T | F | F | F |
Minocycline | T | T | T | T | T | T | F | T | T | T | T | T | T | T | T | T | T | T | F | F | T | T | F | F |
Naproxen | T | T | T | T | T | T | T | T | T | T | T | F | T | T | T | T | T | F | F | F | F | F | F | F |
Niflumic acid | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | F | F | T | F | F | F |
Ofloxacin | F | T | T | T | T | T | T | T | T | T | T | T | F | T | T | T | T | T | T | T | T | T | F | T |
Oxytetracycline | T | T | T | T | T | T | T | T | T | T | T | T | T | F | T | T | T | F | F | F | T | T | F | F |
Propyphenazone | F | T | F | T | T | T | T | T | T | T | T | T | F | F | F | T | T | T | T | T | T | T | T | T |
Spinosyn A | T | T | F | F | T | T | F | T | T | F | F | T | T | T | F | F | T | T | F | F | T | F | F | F |
Sulfacetamide | T | T | T | F | T | T | T | T | T | T | T | T | T | T | T | F | T | T | T | T | T | T | T | F |
Sulfadimethoxine | T | T | T | T | T | T | F | T | T | T | F | T | T | T | F | T | T | T | T | T | T | T | F | T |
Sulfadoxine | T | T | F | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T |
Sulfaguanidine | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T |
Sulfalene | T | T | F | T | T | T | T | F | T | F | T | F | T | T | F | F | T | T | F | F | T | F | F | F |
Sulfamonomethoxine | F | T | T | T | F | T | T | T | F | T | T | T | F | T | T | T | F | T | F | T | F | T | F | T |
Sulfamoxole | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | T | F | T |
Sulfisoxazole | T | T | F | T | T | T | F | T | T | T | F | T | T | T | F | F | T | T | F | T | T | T | F | T |
Tetracycline | T | T | T | T | T | T | F | T | T | T | F | T | T | T | F | T | T | T | F | T | T | T | F | F |
Tetramisole | T | F | T | F | F | F | T | F | F | F | T | F | T | F | T | F | F | F | T | F | F | F | T | F |
* not included in Chemdistiller databases.
References
1. Pourchet, M.; Debrauwer, L.; Klanova, J.; Price, E.J.; Covaci, A.; Caballero-Casero, N.; Oberacher, H.; Lamoree, M.; Damont, A.; Fenaille, F. et al. Suspect and non-targeted screening of chemicals of emerging concern for human biomonitoring, environmental health studies and support to risk assessment: From promises to challenges and harmonisation issues. Environ. Int.; 2020; 139, 105545. [DOI: https://dx.doi.org/10.1016/j.envint.2020.105545] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/32361063]
2. Da Silva, R.R.; Dorrestein, P.C.; Quinn, R.A. Illuminating the dark matter in metabolomics. Proc. Natl. Acad. Sci. USA; 2015; 112, pp. 12549-12550. [DOI: https://dx.doi.org/10.1073/pnas.1516878112]
3. Theodoridis, G.; Gika, H.; Raftery, D.; Goodacre, R.; Plumb, R.S.; Wilson, I.D. Ensuring Fact-Based Metabolite Identification in Liquid Chromatography–Mass Spectrometry-Based Metabolomics. Anal. Chem.; 2023; 95, pp. 3909-3916. [DOI: https://dx.doi.org/10.1021/acs.analchem.2c05192] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/36791228]
4. Blaženović, I.; Kind, T.; Ji, J.; Fiehn, O. Software tools and approaches for compound identification of LC-MS/MS data in metabolomics. Metabolites; 2018; 8, 31. [DOI: https://dx.doi.org/10.3390/metabo8020031] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/29748461]
5. Cai, Y.; Zhou, Z.; Zhu, Z.-J. Advanced analytical and informatic strategies for metabolite annotation in untargeted metabolomics. TrAC Trends Anal. Chem.; 2023; 158, 116903. [DOI: https://dx.doi.org/10.1016/j.trac.2022.116903]
6. Kind, T.; Fiehn, O. Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry. BMC Bioinform.; 2007; 8, 105. [DOI: https://dx.doi.org/10.1186/1471-2105-8-105]
7. Trier, X.; van Leeuwen, S.P.J.; Brambilla, G.; Weber, R.; Webster, T.F. Lack of chemical reference standards hinders (generation of) scientific evidence of chemical risks and their control. Environ. Health Perspect.; 2023; under review.
8. Schymanski, E.L.; Jeon, J.; Gulde, R.; Fenner, K.; Ruff, M.; Singer, H.P.; Hollender, J. Identifying Small Molecules via High Resolution Mass Spectrometry: Communicating Confidence. Environ. Sci. Technol.; 2014; 48, pp. 2097-2098. [DOI: https://dx.doi.org/10.1021/es5002105]
9. Thermo Scientific, mzCloud Advanced Mass Spectral Database. Available online: https://www.mzCloud.org (accessed on 1 November 2020).
10. Bittremieux, W.; Wang, M.; Dorrestein, P.C. The critical role that spectral libraries play in capturing the metabolomics community knowledge. Metabolomics; 2022; 18, pp. 1-16. [DOI: https://dx.doi.org/10.1007/s11306-022-01947-y]
11. Kind, T.; Tsugawa, H.; Cajka, T.; Ma, Y.; Lai, Z.; Mehta, S.S.; Wohlgemuth, G.; Barupal, D.K.; Showalter, M.R.; Arita, M. et al. Identification of small molecules using accurate mass MS/MS search. Mass Spectrom. Rev.; 2018; 37, pp. 513-532. [DOI: https://dx.doi.org/10.1002/mas.21535]
12. De Vijlder, T.; Valkenborg, D.; Lemière, F.; Romijn, E.P.; Laukens, K.; Cuyckens, F. A tutorial in small molecule identification via electrospray ionization-mass spectrometry: The practical art of structural elucidation. Mass Spectrom. Rev.; 2018; 37, pp. 607-629. [DOI: https://dx.doi.org/10.1002/mas.21551]
13. Misra, B.B. New software tools, databases, and resources in metabolomics: Updates from 2020. Metabolomics; 2021; 17, pp. 1-24. [DOI: https://dx.doi.org/10.1007/s11306-021-01796-1]
14. Djoumbou-Feunang, Y.; Pon, A.; Karu, N.; Zheng, J.; Li, C.; Arndt, D.; Gautam, M.; Allen, F.; Wishart, D.S. CFM-ID 3.0: Significantly Improved ESI-MS/MS Prediction and Compound Identification. Metabolites; 2019; 9, 72. [DOI: https://dx.doi.org/10.3390/metabo9040072]
15. Laponogov, I.; Sadawi, N.; Galea, D.; Mirnezami, R.; Veselkov, K. ChemDistiller: An engine for metabolite annotation in mass spectrometry. Bioinformatics; 2018; 34, pp. 2096-2102. [DOI: https://dx.doi.org/10.1093/bioinformatics/bty080]
16. Tsugawa, H.; Kind, T.; Nakabayashi, R.; Yukihira, D.; Tanaka, W.; Cajka, T.; Saito, K.; Fiehn, O.; Arita, M. Hydrogen Rearrangement Rules: Computational MS/MS Fragmentation and Structure Elucidation Using MS-FINDER Software. Anal. Chem.; 2016; 88, pp. 7946-7958. [DOI: https://dx.doi.org/10.1021/acs.analchem.6b00770]
17. Zomer, P.; Mol, H.G. Simultaneous quantitative determination, identification and qualitative screening of pesticides in fruits and vegetables using LC-Q-Orbitrap™-MS. Food Addit. Contam. Part A; 2015; 32, pp. 1628-1636. [DOI: https://dx.doi.org/10.1080/19440049.2015.1085652]
18. Guo, J.; Shen, S.; Xing, S.; Huan, T. DaDIA: Hybridizing Data-Dependent and Data-Independent Acquisition Modes for Generating High-Quality Metabolomic Data. Anal. Chem.; 2021; 93, pp. 2669-2677. [DOI: https://dx.doi.org/10.1021/acs.analchem.0c05022]
19. Hilaire, P.B.S.; Rousseau, K.; Seyer, A.; Dechaumet, S.; Damont, A.; Junot, C.; Fenaille, F. Comparative Evaluation of Data Dependent and Data Independent Acquisition Workflows Implemented on an Orbitrap Fusion for Untargeted Metabolomics. Metabolites; 2020; 10, 158. [DOI: https://dx.doi.org/10.3390/metabo10040158]
20. Santos, M.D.; Camillo-Andrade, A.C.; Kurt, L.U.; Clasen, M.A.; Lyra, E.; Gozzo, F.C.; Batista, M.; Valente, R.H.; Brunoro, G.V.; Barbosa, V.C. et al. Mixed-Data Acquisition: Next-Generation Quantitative Proteomics Data Acquisition. J. Proteom.; 2020; 222, 103803. [DOI: https://dx.doi.org/10.1016/j.jprot.2020.103803]
21. Blaženović, I.; Kind, T.; Torbašinović, H.; Obrenović, S.; Mehta, S.S.; Tsugawa, H.; Wermuth, T.; Schauer, N.; Jahn, M.; Biedendieck, R. et al. Comprehensive comparison of in silico MS/MS fragmentation tools of the CASMI contest: Database boosting is needed to achieve 93% accuracy. J. Chemin.; 2017; 9, pp. 1-12. [DOI: https://dx.doi.org/10.1186/s13321-017-0219-x]
22. Schymanski, E.L.; Neumann, S. CASMI: And the winner is…. Metabolites; 2013; 3, pp. 412-439. [DOI: https://dx.doi.org/10.3390/metabo3020412] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/24957999]
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abstract
Liquid chromatography combined with high-resolution mass spectrometry (LC-HRMS) is frequently applied for suspect screening (SS) and non-target screening (NTS) in metabolomics and environmental toxicology. However, correctly identifying compounds with SS or NTS approaches remains challenging, especially when using data-independent acquisition (DIA). This study assessed the performance of four HRMS-spectra identification tools in annotating in-house generated data-dependent acquisition (DDA) and DIA HRMS spectra of 32 pesticides, veterinary drugs, and their metabolites. The identification tools were challenged with a diverse set of compounds, including isomeric compounds. The identification power was evaluated in solvent standards and spiked feed extract. In DDA spectra, the mass spectral library mzCloud provided the highest success rate, with 84% and 88% of the compounds correctly identified in the top three in solvent standard and spiked feed extract, respectively. The in silico tools MSfinder, CFM-ID, and Chemdistiller also performed well on DDA data, with identification success rates above 75% for both solvent standard and spiked feed extract. With DIA spectra, MSfinder provided the highest identification success rates (72% and 75% in solvent standard and spiked feed extract, respectively); CFM-ID performed comparably in solvent standard (72%) but scored lower in spiked feed extract (63%). The identification success rates for Chemdistiller (66% and 38%) and mzCloud (66% and 31%) were lower, especially in spiked feed extract. The difference in success rates between DDA and DIA is most likely caused by the higher complexity of the DIA spectra, which makes direct spectral matching more difficult. Nevertheless, this study demonstrates that DIA spectra can be used for compound annotation with certain software tools, although the success rate is lower than for DDA spectra.
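The success rates quoted above are percentages of compounds whose correct structure appears among the top three candidates returned by a tool. A minimal sketch of that calculation is given below. The function name `top_k_success_rate`, the rank encoding (1-based rank, or `None` when the correct compound is not returned), and the example ranks are all assumptions made for illustration; they are not taken from the study's data or from any of the evaluated tools.

```python
from typing import Optional, Sequence


def top_k_success_rate(ranks: Sequence[Optional[int]], k: int = 3) -> float:
    """Return the percentage of compounds ranked within the top k.

    `ranks` holds one entry per compound: the 1-based rank of the correct
    candidate, or None when the tool did not return it at all.
    """
    if not ranks:
        raise ValueError("ranks must contain at least one compound")
    hits = sum(1 for rank in ranks if rank is not None and rank <= k)
    return 100.0 * hits / len(ranks)


# Hypothetical ranks for 32 compounds (invented, not data from the study):
# 27 compounds ranked 1-3, 3 ranked lower, 2 not returned.
example_ranks = [1] * 20 + [2] * 4 + [3] * 3 + [5, 8, 12] + [None, None]

print(f"{top_k_success_rate(example_ranks):.1f}%")  # 84.4%
```

With 32 compounds, each additional correct identification changes the success rate by about 3 percentage points, which is worth keeping in mind when comparing tools whose rates differ by only a few percent.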