Introduction

Infectious disease is a widely studied topic in wildlife biology and ecosystem science¹. Every year, countless scientific studies report new data on the prevalence of macroparasites (e.g., ticks and tapeworms) and microparasites (e.g., bacteria, viruses, and other classically defined “pathogens”), hereafter “parasites” for simplicity², in wild animals. These datasets are incredibly valuable, and – especially in aggregate – can be used to test ecological theory³; monitor the impacts of climate change⁴,⁵, land use change⁶,⁷, and biodiversity loss⁸; and even track emerging threats to human and ecosystem health⁹⁻¹¹.

Disease ecologists engaged in synthesis research are often faced with reconciling datasets that vary greatly in their scope and granularity. For example, many studies do not report information about sampling effort over space and time, and may not even report the location of sampling sites⁹,¹². Similarly, researchers often collect a wealth of host-level data that might help to understand infection processes (e.g., sex, age, life stage, or body size). However, many studies only provide summary statistics for parasite prevalence across different sites, species, or time points, which cannot be disaggregated back to the host level. For example, out of 110 studies we recently reviewed⁹ that have tested wild bats for coronaviruses, 96 only reported data in a summarized format (see Supplemental File 4). When studies did share individual-level data, they often did so only for positive results (11 of 14 studies), making it impossible to compare prevalence across populations, years, or species.
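The asymmetry described above can be illustrated with a minimal Python sketch. The records, the field names (`site`, `sex`, `positive`), and the helper `prevalence_by` are all hypothetical and invented for illustration; they are not drawn from any study or from the data standard itself. The sketch shows that host-level records can be summarized along any recorded variable, whereas a positives-only subset loses the denominator needed to estimate prevalence.

```python
from collections import defaultdict

# Made-up host-level records; field names are illustrative assumptions only.
records = [
    {"site": "A", "sex": "F", "positive": True},
    {"site": "A", "sex": "F", "positive": False},
    {"site": "A", "sex": "M", "positive": False},
    {"site": "A", "sex": "M", "positive": False},
    {"site": "B", "sex": "F", "positive": True},
    {"site": "B", "sex": "F", "positive": False},
    {"site": "B", "sex": "M", "positive": True},
    {"site": "B", "sex": "M", "positive": False},
]


def prevalence_by(rows, key):
    """Group rows by `key`; return {group: (n_positive, n_tested, prevalence)}."""
    counts = defaultdict(lambda: [0, 0])  # [n_positive, n_tested]
    for row in rows:
        counts[row[key]][0] += int(row["positive"])
        counts[row[key]][1] += 1
    return {group: (pos, n, pos / n) for group, (pos, n) in counts.items()}


# Host-level data can be aggregated along any recorded variable.
by_site = prevalence_by(records, "site")
by_sex = prevalence_by(records, "sex")

# A positives-only dataset has no negatives, so every group appears 100% infected.
positives_only = [row for row in records if row["positive"]]
by_site_positives_only = prevalence_by(positives_only, "site")

print(by_site)
print(by_sex)
print(by_site_positives_only)
```

Going the other way is not possible: given only the per-site summary, the per-sex breakdown cannot be recovered, because the link between each host's attributes and its test result has been discarded.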

To address these issues, wildlife disease ecology would benefit from best practices for dataset standardization and sharing, similar to those that have been developed for other types of foundational data in the biological sciences¹³⁻¹⁵. Data standards facilitate the sharing, (re)use, and aggregation of data by humans and machines through the use of a common structure, set of properties, and vocabulary. Here, we designed a simple and flexible minimum data standard that is intended to be accessible to a range of practitioners, while providing sufficient structure for large-scale data analysis and meeting expectations for Findable, Accessible, Interoperable, and Reusable (FAIR) research practices¹⁶. We describe the required properties and structure for wildlife disease data...

A minimum data standard for wildlife disease research and surveillance
