Abstract
Automated invertebrate classification using computer vision has shown significant potential to improve specimen processing efficiency. However, challenges such as invertebrate diversity and morphological similarity among taxa can make it difficult to infer fine-scale taxonomic classifications using computer vision. As a result, many invertebrate computer vision models are forced to make classifications at coarser levels, such as family or order. Here we propose a novel modular method to combine computer vision and bulk DNA metabarcoding specimen processing pipelines to improve the accuracy and taxonomic granularity of individual specimen classifications. To improve specimen classification accuracy, our methods use multimodal fusion models that combine image data with DNA-based assemblage data. To refine the taxonomic granularity of the models' classifications, our methods cross-reference the classifications with DNA metabarcoding detections from bulk samples. We demonstrated these methods using a continental-scale, invertebrate bycatch dataset collected by the National Ecological Observatory Network. We also introduce the CV.eDNA R package, which aims to assist practitioners looking to implement our methods. Using our methods, we reached a classification accuracy of 79.6% across the 17 taxa using real DNA assemblage data, and 83.6% when the assemblage data was error-free, increases of 2.2 and 6.2 percentage points, respectively, over a model trained using only images. After cross-referencing with the DNA metabarcoding detections, we improved taxonomic granularity in up to 72.2% of classifications, with up to 5.7% reaching species level. By providing computer vision models with coincident DNA assemblage data, and refining individual classifications using DNA metabarcoding detections, our methods have the potential to greatly expand the capabilities of biological computer vision classifiers.
Our methods allow computer vision classifiers to infer taxonomically fine-grained classifications when it would otherwise be difficult or impossible due to challenges of morphologic similarity or data scarcity. These methods are not limited to terrestrial invertebrates and could be applied in any instance where image and DNA metabarcoding data are concurrently collected.
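The cross-referencing step described above can be illustrated with a minimal Python sketch. It is not the CV.eDNA package's interface or the authors' exact rule. It assumes a hypothetical representation in which each DNA metabarcoding detection from a bulk sample is a dictionary mapping rank names to taxon names, and it assumes one plausible refinement rule: a coarse image-based classification is pushed to a finer rank only while every detection under the predicted taxon agrees at that rank. The function name `refine_classification` and the `RANKS` list are illustrative.

```python
# Ranks ordered from coarse to fine (an assumption made for this sketch).
RANKS = ["order", "family", "genus", "species"]


def refine_classification(predicted, detections):
    """Refine a coarse classification using DNA detections from the same bulk sample.

    predicted:  (rank, name) tuple produced by the image classifier.
    detections: list of dicts mapping rank -> taxon name, one per DNA detection.
                A missing or None entry means the detection was not resolved
                to that rank.
    Returns the finest (rank, name) on which all matching detections agree,
    or `predicted` unchanged if no detection falls under the predicted taxon.
    """
    rank, name = predicted
    # Keep only the detections that belong to the predicted taxon.
    matches = [d for d in detections if d.get(rank) == name]
    if not matches:
        return predicted

    refined = predicted
    for finer in RANKS[RANKS.index(rank) + 1:]:
        names = {d.get(finer) for d in matches}
        # Stop when the detections disagree or are unresolved at this rank.
        if len(names) != 1 or None in names:
            break
        refined = (finer, names.pop())
    return refined


# Hypothetical detections from one bulk sample.
sample_detections = [
    {"order": "Coleoptera", "family": "Carabidae",
     "genus": "Pterostichus", "species": "Pterostichus melanarius"},
    {"order": "Coleoptera", "family": "Staphylinidae", "genus": "Philonthus"},
    {"order": "Hymenoptera", "family": "Formicidae", "genus": "Formica"},
]

beetle = refine_classification(("family", "Carabidae"), sample_detections)
ant = refine_classification(("family", "Formicidae"), sample_detections)
ambiguous = refine_classification(("order", "Coleoptera"), sample_detections)
```

In this toy sample the Carabidae classification can be carried down to species because only one detected carabid is present, the Formicidae classification stops at genus because the detection is unresolved below that, and the Coleoptera classification stays at order because two beetle families were detected. A real pipeline would also have to handle detection errors and specimens whose DNA was not recovered.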
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
* This version of the manuscript has been revised to reflect changes to the associated GitHub repository, including the creation of the CV.eDNA R package.
* https://github.com/Jarrett-Blair/CV-DNA-Hybrid