The digitization of herbarium specimens and their associated data has advanced our ability to understand complex and changing biological systems (Johnson, ; Pearse et al., ; Willis et al., ). Digitizing herbarium records (capturing the taxon name, date of collection, location, and/or digital image) has advanced our ability to track changes in the distributions of organisms (Lavoie, ), but herbarium specimens are rich with additional information regarding plant health, reproductive condition, and morphology that is generally not captured in current digitization workflows (Nelson et al., ). Because the use of specimens for research is accelerating, it is essential that we structure digital data collection in ways that best facilitate longevity and integration across data sources.
Of particular interest is the enormous potential of herbarium specimens as a resource for information on plant phenology (the timing of seasonal events such as flowering or fruiting). Plant phenology has complex, cascading effects on multiple levels of biological organization from individuals to ecosystems (Bertin, ). Temporal mismatches between plants and pollinators can quickly drive populations extinct, cause rapid evolutionary shifts, and result in billions of dollars of agricultural losses (Visser and Both, ; Both et al., ; Körner and Basler, ; Miller‐Rumpff et al., ; Ozgul et al., ; Struttmann et al., ). Phenology has also been used to study the impact of climate change in a range of organisms and vegetation types (Bowers, ; Houle, ; Anderson et al., ; Lavoie, ). Consequently, maximizing the use of herbarium specimens for phenological research is not only important for improving our understanding of evolutionary change, it is also a matter of great practical concern for addressing environmental problems.
Recent studies have demonstrated the potential of herbarium specimens to be used in evaluating temporal and spatial variation in plant phenology (see Willis et al., for a review of these studies) despite known biases of herbarium records (Meyer et al., ; Pearse et al., ; Daru et al., ). These studies have provided three valuable outcomes. First, for several species, we now have a quantitative historical understanding of their phenological change over time (Rivera and Borchert, ; Primack et al., ; Miller‐Rushing et al., ; Gallagher et al., ; Robbirt et al., ; Park ; Davis et al., ). Second, for some species, relationships between temporal or spatial variation in phenology and climate (e.g., local temperature and/or precipitation) have been detected; these relationships, in turn, provide a basis for forecasting the effects of ongoing climate change on the seasonal cycles of these taxa (Badeck et al., ; Franks et al., ; Matthews and Mazer, ; Prevéy et al., ). Third, we have an improved understanding of the specific advantages of herbarium specimens for phenological research, such as filling gaps in long‐term or observational data sets, either for a period of time (Panchen et al., ) or for underrepresented regions (Gallagher et al., ; Panchen and Gorelick, ).
Given the ecological importance of phenology, the demonstrated value of herbarium specimens for phenological research, and the potential for digitization efforts to maximize herbarium records as a resource, it is necessary to develop robust standards for how phenological data are captured during or after the digitization process. There are currently two principal limitations to accessing and using phenological data from herbarium specimens: (1) the paucity of high‐quality images accompanying digitized specimen records and (2) the lack of standardized methodology for capturing specimens’ reproductive traits and sharing the resulting data. If any phenological information is present on a label or visible on a specimen, it is parsed in numerous—and often arbitrary—ways during digitization. For example, phenological data embedded in a label might be “on a south facing slope in full flower,” but this information might be digitally captured in the ‘habitat,’ ‘notes,’ ‘plant description,’ or other field of a local database.
Even if a local database does contain a field explicitly for phenological characters, each institution independently decides how to record the states present on the sheet. For example, in the SEINet collaborative (
‘reproductiveCondition’ text samples | Specimen count |
Flowering¹ | 434,637 |
Flowering and fruiting | 285,865 |
flower¹ | 174,751 |
Fruiting | 160,853 |
fl¹ | 136,544 |
fr | 97,098 |
flowering = early¹ | 90,578 |
fl‐fr | 90,132 |
flowers¹ | 86,372 |
fruit | 85,260 |
flowering + fruiting = mid | 75,128 |
vegetative | 47,785 |
flr¹ | 44,529 |
fertile | 39,309 |
Flo¹ | 36,867 |
veg | 36,258 |
fl,fr | 35,032 |
Flr & Frt | 33,199 |
spores | 31,824 |
sterile | 30,478 |
Flower: Y Fruit: N Vegetation: Y Bud: N¹ | 27,508 |
Flower | Fruit | 27,254 |
flowers & fruit | 26,301 |
fru | 19,833 |
¹Records that refer only to open flowers or flowering; these can be scored simultaneously with the Attribute Mining Tool, resulting in over 1 million new phenological records from a single scoring effort.
It is clear that there is a huge potential for using phenological data from herbarium specimens (Willis et al., ). We propose a method here to (1) broaden the scope and longevity of digitization efforts through a standardized methodology for scoring reproductive characters from herbarium specimens and (2) provide a means of sharing the resulting data in a Darwin Core format. The protocol we describe here will unlock the potential of herbaria for phenological research by facilitating comparability among herbaria, research groups, and other methodologies used to collect phenological data (e.g., citizen science observations, satellite imagery, and stationary camera images).
To make progress toward developing standards that reflect community‐wide goals and feasible implementation, iDigBio sponsored a working group called Coding Phenological Data from Herbarium Sheets in March 2016 at the University of California, Berkeley (
This workshop brought together 37 participants from the United States, Scotland, England, Sweden, Canada, Germany, and Australia. The group represented a range of phenological interests, including phenological researchers (those who use the downstream data products obtained from specimens), herbarium collection personnel (those responsible for preparing and curating the specimens and their data), in situ phenological observers who record the phenological status of living plants, and data standards experts. A participant list can be found on the iDigBio wiki page (
Prior to the workshop, we developed a survey to assess needs of the phenological community and herbarium data users and to review the current ways phenological data were being captured. We received 76 responses to the survey, and the respondents identified themselves as being from collections, monitoring, or research areas (not mutually exclusive). With this survey and input from participants at the workshop, we reviewed the ways in which herbaria currently capture phenological traits. The two most‐scored traits from specimens are the presence of open flowers and the presence of fruit (14 million specimens represented in this survey). Most respondents also felt that of all possible traits, open flowers and fruits were the most important traits to score on a specimen. Participants of the workshop echoed this sentiment. We reviewed previous phenological research that was based on data derived from herbarium specimens in order to identify the types of raw data necessary and sufficient to achieve a variety of research goals. These findings are summarized in Willis et al. ().
When developing a scoring protocol, we considered the challenges and limitations to scoring specimens (de novo scoring and as part of regular digitization workflows) and the potential solutions to these limitations. We considered hard‐to‐see floral parts, trained vs. untrained scorers, the limited resources of most herbaria, and the likelihood of community‐wide adoption. We also considered the costs and benefits of recording qualitative data (e.g., “open flowers present/absent”) vs. quantitative data (e.g., counts or proportions of unopened flowers, open flowers, fruits).
One of our primary concerns is that any resulting data from attempts to score phenological traits should be shareable in Darwin Core–formatted files to help ensure the usefulness and longevity of these data. Representatives from the data standards community, Biodiversity Information Standards (TDWG), including Darwin Core and Apple Core, provided input for representing phenological stages using current biodiversity standards.
Finally, to ensure that phenological traits from specimens can be integrated with other sources, participants included members of the USA National Phenology Network, the California Phenology Network, the National Ecological Observatory Network, the Royal Botanic Garden Edinburgh, the Pan‐European Phenology Network, and the Plant Phenology Ontology.
We propose that reproductive traits for specimens of seed plants be scored according to the following hierarchical categories/questions (Tables and ). Our protocol uses terminology from the Plant Ontology to represent plant parts (e.g., flower, fruit) (
First‐order | Second‐order | Third‐order |
Are ‘reproductive structures’ present? (yes/no/not scorable) | Are ‘unopen flowers’ present? | Mostly ‘unopen flowers’? (or counts) |
| Are ‘open flowers’ present? | Mostly ‘open flowers’? (or counts) |
| | Mostly ‘post‐mature flowers’? (or counts) |
| Are ‘fruits’ present? | Mostly ‘immature fruits’? (or counts) |
| | Mostly ‘mature fruits’? (or counts) |
| | Mostly ‘post‐mature fruits’? (or counts) |
First‐order | Second‐order | Third‐order |
Are ‘reproductive structures’ present? (yes/no/not scorable) | Are ‘pollen cones’ present? | Mostly ‘immature pollen cones’? (or counts) |
| | Mostly ‘mature pollen cones’? (or counts) |
| | Mostly ‘post‐mature pollen cones’? (or counts) |
| Are ‘seed cones’ present? | Mostly ‘immature seed cones’? (or counts) |
| | Mostly ‘mature seed cones’? (or counts) |
| | Mostly ‘post‐mature seed cones’ present? |
The question “Are ‘reproductive structures’ present? (yes/no/not scorable),” while the broadest question, was still determined to have value for scoring specimen records. Having this information allows researchers to filter millions of records quickly to find those that contribute to phenological research. It is also relatively easy for users with different levels of botanical training (e.g., curators, volunteers, and citizen scientists) to score. A “yes” means that reproductive structures of some kind (e.g., flowers, fruits, or cones) are present. A “no” means that the specimen is sterile and strictly vegetative. It is important to note that this first‐order scoring can apply to all taxonomic groups, even beyond seed plants. Some taxonomic groups may exhibit specialized structures that make it more difficult for non‐experts to complete this process (e.g., vegetative propagules that look like fruits), but we anticipate that this challenge will be limited. Minimally, first‐order scoring will allow for records to be filtered and then subsequently scored in more detail.
For specimens that are scored as having reproductive structures present, it is valuable to characterize which reproductive structures are present. Most research thus far has used specimens with open flowers. For flowering plants, we propose the following second‐order, non‐mutually exclusive questions: “Are ‘unopened flowers’ present?,” “Are ‘open flowers’ present?,” and “Are ‘fruits’ present?” For gymnosperms the questions are: “Are ‘pollen cones’ present?” and “Are ‘seed cones’ present?” (Tables and ). The term “bud/s” can confuse floral buds with leaf buds; therefore, the PPO and this protocol refer to unopened flowers only.
The second‐order questions are not mutually exclusive. If unopened flowers, open flowers, and fruits are all present on a sheet, all questions can be answered in the affirmative. Having these data allows researchers to quickly identify the records that pertain to their individual research questions. The second‐order questions require greater training for personnel to accurately discriminate unopened flowers, open flowers, and fruits. For many taxa (e.g., grasses, sedges, rushes), floral structures are small and distinguishing between unopened flowers, open flowers, and fruits can be challenging. Additionally, it is important that scorers are trained to distinguish between leaf buds and flower buds (which contain unopened flowers). As training materials are developed for various plant groups they should be shared widely across the community.
Third‐order scoring further subdivides the categories of the second‐order scorings. While second‐order scorings will determine which specimens should be included in phenological research, it is often valuable to know the specimen's specific phenophase. Analyses can be more precise if we can distinguish between specimens in full flower from those specimens at the beginning or end of the flowering cycle. The third‐order scorings are intended to place individual specimens at a specific point in phenological development. As such, these subcategories and the units used to report them may vary depending on the institutional or research priorities that generate them. We do not specify exactly what the third‐order categories should be, as these will be determined by research priorities and staff time, but rather we explain how these questions are most commonly expressed or could be expressed within our proposed framework. Although we do not specify third‐order categories, we do strongly recommend that researchers clearly define their categories and make their definitions broadly accessible, along with pertinent metadata. For example, the Simple Knowledge Organization System (SKOS) provides a framework for representing controlled vocabularies that easily lends itself to being shared online. The New England Vascular Plant (NEVP) project has devised a vocabulary following these guidelines and has published their vocabulary using SKOS (
Scoring | NEVP definition | URI identifier |
First‐order | | |
Is the material on the sheet sterile? | No reproductive structures present (no unopened flowers, open flowers, or fruits) | |
Is there reproductive material present on the sheet? | At least one reproductive structure of any kind is present (unopened flowers, open flowers, or fruits) | |
Not scorable | Not possible to score reproductive condition using material present | |
Second‐order | | |
Unopened flowers present? | At least one unopened flower is present | |
Open flowers present? | At least one open flower is present | |
Fruits present? | At least one immature or mature fruit is present | |
Third‐order | | |
Mostly unopened flowers | Mostly unopened flowers (less than half open) but with at least one open flower; this category is mutually exclusive with mostly open and mostly old flowering stages and with mostly young, mostly mature, and past maturity fruiting stages | |
Mostly open flowers | Mostly open flowers (more than half open) with few unopened flowers or old flowers that have lost their petals; this category is mutually exclusive with mostly unopened and mostly old flowering stages and with mostly young, mostly mature, and past maturity fruiting stages | |
Mostly old flowers | Mostly old flowers (less than half open) that have lost their petals, but at least one flower still open; this category is mutually exclusive with mostly unopened and mostly open flowering stages and with mostly young, mostly mature, and past maturity fruiting stages | |
Mostly young fruit | Mostly immature fruits present (less than half mature) but at least one mature fruit present, mutually exclusive with mostly mature and past maturity fruiting stages and with mostly unopened, mostly open, and mostly old flowering stages | |
Mostly mature fruit | Mostly mature fruits present (more than half mature), mutually exclusive with mostly young and past maturity fruiting stages and with mostly unopened, mostly open, and mostly old flowering stages | |
Mostly past mature fruit | Fruits have fallen from stalks, withered, or dehisced and lacking seeds (less than half mature) but at least one mature fruit present, mutually exclusive with mostly young and mostly mature fruiting stages and with mostly unopened, mostly open, and mostly old flowering stages | |
Number of open flowers present | Number of open flowers present on specimen: 0 is an acceptable value | |
Number of fruits present | Number of mature fruits, 0 is an acceptable value | |
There are some taxonomic groups that have very small floral structures that either require an onerous amount of time to score or require expertise to determine what kinds of organs are actually present on the sheet. Members of Poaceae, Cyperaceae, and Juncaceae, as well as certain Asteraceae, are a few examples for which second‐order scoring may be more challenging. However, it should be relatively easy to apply first‐order scorings to these groups, thereby greatly increasing the utility of these specimens for phenological research. Our protocol does not address the presence/absence or abundance of male vs. female flowers, or distinguish between perfect and imperfect flowers in gynodioecious, gynomonoecious, or andromonoecious species, largely due to the fact that these categories have seldom been included in phenological research.
The timing of reproduction is not the only important phenological event of interest to be tracked in plants. Leaf bud break and leaf‐out are important phenomena for deciduous forests, as is autumn senescence. These vegetative characters are often tracked via satellite imagery and in situ monitoring efforts. Scoring phenological leaf traits on herbarium specimens is rare (but see Zohner and Renner, ; Gallinat et al., ), but it provides valuable insights into the effects of climate change (Chmielewski and Rotzer, ; Everill et al., ). A similar scoring protocol is recommended for foliar structures, although we do not specify a protocol here.
Darwin Core has emerged as a key standard for describing species occurrences. Online documentation, including definitions and examples, is provided for each term used in the Darwin Core (TDWG website:
We are proposing to share phenological data using the Darwin Core Extended MeasurementOrFact (eMoF) extension (Table ) (
coreid | measurementType | measurementTypeID | measurementValue | measurementValueID | measurementUnit | measurementDeterminedDate | measurementDeterminedBy | measurementRemarks |
652438 | Phenology (ver 1.2) | | Reproductive | | | 2017‐04‐15T01:05:07Z | egbot | |
652438 | Phenology: reproductive | | Open Flowers | | | 2017‐04‐15T01:05:08Z | egbot | |
652439 | Phenology (ver 1.2) | | Reproductive | | | 2017‐04‐15T01:10:54Z | egbot | |
652439 | Phenology: reproductive | | Open Flowers | | | 2017‐04‐15T01:10:55Z | egbot | |
652439 | Phenology: reproductive | | Fruiting | | | 2017‐04‐15T01:10:56Z | egbot | |
measurementType = the nature of the measurement, fact, characteristic, or assertion; measurementTypeID = an identifier for the measurementType; measurementValue = the value of the measurement, fact, characteristic, or assertion; measurementValueID = an identifier for facts stored in the column measurementValue; measurementUnit = the units associated with the measurementValue, not used in this example; measurementRemarks = not used in this example, but could contain comments or notes accompanying the MeasurementOrFact; measurementDeterminedDate and measurementDeterminedBy = the date and time at which the scoring was applied and the person who applied it.
3In the near future, these phenological mappings will be aggregated by iDigBio, GBIF, and other public repositories through Symbiota's Darwin Core Archive publishing services. The Darwin Core Archive publishing services are available within all Symbiota portals and are the central mechanized archive used by iDigBio to harvest and maintain updates of specimen data published from a Symbiota portal instance.
Table shows an example Darwin Core Archive file of the eMoF extension for two herbarium sheets that were scored using our protocol (measurementType called Phenology [ver.1.0]). The first record was scored with a measurementValue = ‘Reproductive.’ Additionally, that same record is scored as measurementValue = ‘Open flowers.’ The second record, whose rows are united by a shared catalog number, is scored ‘Reproductive,’ ‘Open flowers,’ and ‘Fruiting.’
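To make the row structure concrete, the following is a minimal Python sketch of how one eMoF row per scoring might be generated and serialized for a single specimen. The function names `emof_rows` and `to_csv` are hypothetical, the identifier columns are left empty, and one timestamp is reused for all rows of a record (a simplification; the example table records a separate timestamp per row). It is an illustration of the layout, not the Symbiota implementation.

```python
import csv
import io

# Column order follows the example eMoF table above.
EMOF_COLUMNS = [
    "coreid", "measurementType", "measurementTypeID", "measurementValue",
    "measurementValueID", "measurementUnit", "measurementDeterminedDate",
    "measurementDeterminedBy", "measurementRemarks",
]


def emof_rows(coreid, values, determined_date, determined_by,
              measurement_type="Phenology"):
    """Build one eMoF row per scored value for a single specimen record."""
    return [
        {
            "coreid": coreid,
            "measurementType": measurement_type,
            "measurementTypeID": "",    # identifier omitted in this sketch
            "measurementValue": value,
            "measurementValueID": "",   # identifier omitted in this sketch
            "measurementUnit": "",      # not used for presence scorings
            "measurementDeterminedDate": determined_date,
            "measurementDeterminedBy": determined_by,
            "measurementRemarks": "",
        }
        for value in values
    ]


def to_csv(rows):
    """Serialize rows as comma-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=EMOF_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = emof_rows("652439", ["Reproductive", "Open Flowers", "Fruiting"],
                 "2017-04-15T01:10:54Z", "egbot")
print(to_csv(rows))
```

Because every scoring is its own row keyed by `coreid`, a specimen can carry any number of non-mutually exclusive scorings without changes to the core occurrence record.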
Using the eMoF extension has a potential disadvantage, namely that it does not allow the measurement to be rigorously tied to a particular aspect of the core record. This means that any user can define a new and non‐standard ‘measurementType’ and ‘measurementValue’ (e.g., potentially called “Flowering Time” and “having flowers” or “Flws”), which could lead to difficulty compiling data. Unless various measurementTypes and measurementValues are rigorously defined, an excessive number of unique text strings could be generated. To address this, we are working toward defining these terms within Apple Core. Apple Core is a set of best practice guidelines for publishing botanical specimen information for herbaria. A goal of the guidelines is to mitigate the generality of Darwin Core by providing detailed guidelines for publishing botanical specimen information in Darwin Core. These guidelines will include recommended terms, specific definitions, multiple examples, common issues, and controlled vocabularies where appropriate that are specific to herbarium specimens. Apple Core is a community‐curated resource that is still being refined, and interaction with phenological researchers will help to strengthen this resource. Finally, use of this approach is complementary with broader sharing initiatives that utilize ontologies, such as the Plant Phenology Ontology.
In the near future, using the eMoF extension will allow for phenological scorings to be published in iDigBio, the Global Biodiversity Information Facility (GBIF), and other public repositories. Darwin Core Archive publishing services are available within all Symbiota portals (Gries et al., ;
The questions presented here provide important data for researchers while also requiring minimal effort from herbarium curators. Phenological questions are easily integrated into standard label digitization workflows or could be subsequently scored from images. Due to the nested nature of the questions, a third‐order question can be scored initially, with the appropriate second‐ and first‐order questions automatically populated. For example, a report of “fruits present” on a specimen would automatically score a “yes” for the first‐order question, indicating that reproductive structures are present. To answer first‐order questions, the person who is performing the initial data entry for a specimen need only look at the sheet and check a box indicating whether reproductive structures of any kind are present. For databases that do not have the infrastructure to accommodate this type of scoring, a few alternatives are presented below.
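The automatic population of higher-order questions can be sketched in a few lines of Python. The labels and the `PARENT` table below are hypothetical stand-ins for the flowering-plant hierarchy; in particular, placing “mostly post-mature flowers” under “open flowers present” is an assumption of this sketch, not a definition from the protocol.

```python
# Hypothetical parent links: each scoring points to the broader question it implies.
PARENT = {
    "mostly unopened flowers": "unopened flowers present",
    "mostly open flowers": "open flowers present",
    "mostly post-mature flowers": "open flowers present",  # assumed placement
    "mostly immature fruits": "fruits present",
    "mostly mature fruits": "fruits present",
    "mostly post-mature fruits": "fruits present",
    "unopened flowers present": "reproductive structures present",
    "open flowers present": "reproductive structures present",
    "fruits present": "reproductive structures present",
}


def propagate(scored):
    """Return the scored terms plus every broader term they imply."""
    result = set()
    for term in scored:
        # Walk up the hierarchy until the top or an already-recorded term.
        while term is not None and term not in result:
            result.add(term)
            term = PARENT.get(term)
    return result


print(sorted(propagate({"fruits present"})))
```

With this structure, a scorer records only the most specific observation, and a “yes” for the corresponding second‐ and first‐order questions follows without further data entry.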
Phenological scores can be recorded at a number of steps in a digitization workflow. In the case of an object‐to‐data workflow, scores could be made directly from the sheet as label data are being captured. With an image‐based workflow, the scoring of specimens can be achieved by visual inspection of their images. The latter approach provides the option of making the image available online where the public (e.g., citizen scientists using Notes from Nature, CrowdCurio, or other platforms) can record phenological observations. Machine learning approaches are likely to facilitate our ability to score images at scale in the near future. Database fields in local databases need to be modified to accommodate the proposed structure. Implementation of controlled vocabularies can be facilitated with drop‐down menus or pick‐lists (see Figs. and for an example from a Symbiota portal); however, providing such functionality might require changes to database management software. Fortunately, a number of tools (described below) have been developed for scoring the phenological status of specimens.
Example of Symbiota's Attribute Mining Tool. Here, a local database's text field ‘Reproductive Condition’ was searched for all text strings containing “fl” in the Fabaceae. Highlighted references are text strings referring to both flowers and fruits. These were selected, and second‐order scorings of “open flowers present” and “fruit present” were then applied to all specimens simultaneously.
Example of Symbiota's Image Scoring Tool. Images of Fabaceae specimens were searched. The user can apply the desired level of scoring to each image that appears.
For curators who do not have a database with Symbiota‐type functionality that provides phenological checkboxes corresponding to our proposed protocol, we suggest that users enter phenological information into an appropriate text field within their existing database with the expectation that new tools will enable users to search these text fields and score the specimens appropriately (see Tools that facilitate scoring: Symbiota's “Attribute Mining Tool” below). Ideally, every institution's home database will include a text field dedicated exclusively to information pertaining to phenology. However, including phenological information as text anywhere within a given specimen's label data is better than not capturing any phenological traits. To choose the best text field within a local database, it is important to know how the specimen data appear when shared using a Darwin Core Archive. If, for example, one's local database conforms to Darwin Core, reproductive traits should be included in the ‘reproductiveCondition’ field. The words entered into the text field should be unambiguous and should correspond to the protocol above (e.g., “unopened flowers,” “open flowers,” “fruit”). This is an action that all curators can immediately integrate into their current digitization workflows.
Those managing or implementing digitization workflows should consider incorporating the scoring of phenological data into their workflows. At the very least, first‐ or second‐order phenological data (as described above) should be considered for capture. Doing so will facilitate future scoring of the specimens. If time does not permit training herbarium personnel to record challenging second‐order scorings, then simply adding the word “reproductive” somewhere in a relevant database field will aid future work and research use.
Although it is not the primary focus of this paper, we think it is useful to readers to make brief mention of how our protocol could be implemented and what tools are available for doing so.
Part of the NEVP project was the development of a tool to score phenological traits using digitized label text (Fig. ). This tool allows a user to search for specific words in database fields and map these to the proposed vocabulary. For example, using this tool to search the field ‘reproductiveCondition’ within SEINet resulted in over 4000 unique text strings (Table ). The Attribute Mining Tool allows one to select all records containing text that refers solely to a single scoring category. For example, if a user were scoring “open flowers present” only, the user could select all the highlighted rows in Table and click “Open flowers present.” In the example from SEINet presented in Table , this single scoring event would result in the selection and scoring of 1,031,786 records. In a separate scoring event, the user could select all records that make reference to both open flowers and fruits and then select “open flowers present” and “fruits present.” Because a curator is responsible for mapping free text strings from the database to a controlled vocabulary, this method does not rely on computerized inference. The ability to apply phenological scoring to any specimen within a Symbiota portal is highly efficient, and these types of tools should be developed within other database platforms.
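The curator-driven mapping described above can be illustrated with a short Python sketch. It is not the Attribute Mining Tool itself: the function names are hypothetical, and the matching is a simple exact-string lookup. The counts are those of the SEINet table rows marked as referring only to open flowers, and their sum reproduces the 1,031,786 records quoted above.

```python
from collections import Counter

# Specimen counts for the SEINet text strings marked as referring only to open flowers.
FLOWER_ONLY_COUNTS = {
    "Flowering": 434_637,
    "flower": 174_751,
    "fl": 136_544,
    "flowering = early": 90_578,
    "flowers": 86_372,
    "flr": 44_529,
    "Flo": 36_867,
    "Flower: Y Fruit: N Vegetation: Y Bud: N": 27_508,
}


def apply_scoring(selected_strings, scorings):
    """Map every curator-selected text string to the same set of scorings."""
    return {text: set(scorings) for text in selected_strings}


def score_records(records, mapping):
    """Score (record_id, text) pairs by exact lookup; tally strings left unmapped."""
    scored, unmapped = {}, Counter()
    for record_id, text in records:
        key = text.strip()
        if key in mapping:
            scored[record_id] = set(mapping[key])
        else:
            unmapped[key] += 1
    return scored, unmapped


# One scoring event: all flower-only strings receive "open flowers present".
mapping = apply_scoring(FLOWER_ONLY_COUNTS, ["open flowers present"])
print(sum(FLOWER_ONLY_COUNTS.values()))  # 1031786 records covered by this event

scored, unmapped = score_records([(1, "fl"), (2, "Flowering "), (3, "veg")], mapping)
print(scored, dict(unmapped))
```

Strings that the curator has not yet mapped are returned separately rather than guessed at, which mirrors the point that this approach does not rely on computerized inference.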
Many platforms have been developed for remotely scoring images of specimens, and we review them below. It is vital that future scoring platforms conform as closely as possible to the proposed protocol to facilitate data integration. Furthermore, it is vital that specimen trait data, even when scored outside of the local database, remain associated with the original specimen record. This will allow trait data and occurrence data to travel together through the data aggregation process, preventing duplicated scoring efforts.
The new Image Scoring Tool, developed as part of the NEVP project, allows Symbiota network users to filter images and apply a phenological score to them (Fig. ). This approach has facilitated the scoring of over 240,000 images of New England specimens to date. Phenological scorings are being shared with end users through the Consortium of Northeastern Herbaria portal via the Darwin Core Extended MeasurementOrFact extension and Darwin Core Archives, as outlined above. This functionality will soon be available to all Symbiota‐based databases.
Notes from Nature is an online citizen science platform (Hill et al., ) originally developed to support the transcription of specimen labels, but it has expanded to include phenological classifications. Notes from Nature extends the Zooniverse (
(A) A typical specimen image presented as part of a phenology expedition on Notes from Nature. (B) Classification task requesting that volunteers record the number of fruits that are visible on the herbarium specimen. (C) Classification task requesting that volunteers record the number of open flowers visible on the herbarium specimen. Note that parts B and C display tools that help volunteers to complete the task (e.g., pan, zoom, rotate, tutorial, and help).
Notes from Nature phenology expeditions have so far solicited reports of flowering and fruiting as well as counts of reproductive structures for Quercus L., Coreopsis L., and Cakile Mill. Expeditions have ranged from simple annotations of whether open flowers or fruits are present to more complex tasks in which users count unopened flowers, open flowers, and fruits. Asking users to report first‐ and second‐order scorings generated large volumes of accurate phenological data, whereas expeditions asking for more complex scorings, such as counts, had lower participation from the community of citizen science annotators and took much longer.
CrowdCurio is a new online platform designed to give researchers the ability to design and implement crowdsourcing projects tailored to their specific interests and data sources (Willis et al., ;
In a preliminary study of the efficiency and quality of CrowdCurio data collection, Willis et al. () compared data collected by expert (herbarium curators) and non‐expert (anonymous Amazon Mechanical Turk [Mturk] workers) participants for two common New England species: greater celandine (Chelidonium majus L.) and lowbush blueberry (Vaccinium angustifolium Aiton). They found that non‐expert counts were similar to expert counts, but that non‐experts were able to record nearly twice as much data at less cost over the same amount of time.
Data collected via crowdsourcing, however, are not without limitations. Although Willis et al. () found no difference in average counts between experts and non‐experts, non‐expert counts tended to be more variable per specimen. This in part depended on the specimen being assessed—specimens with more objects to count had higher error rates. As with any crowdsourcing project, care should be taken when choosing which specimens and taxa to include (e.g., are the flowers easy to identify?). Additionally, CrowdCurio is in the process of implementing additional features to improve data quality, such as filtering users based on their ability to repeat the same task. The phenological data generated within CrowdCurio can be expressed according to the protocol outlined in this paper and shared via Darwin Core Archives.
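One simple way to examine per-specimen variability in crowdsourced counts is to summarize the replicate counts for each specimen and flag those with a large spread for expert review. The Python sketch below is a generic illustration under stated assumptions, not the method used by Willis et al. or by CrowdCurio: the function names, the example counts, and the review threshold are all hypothetical, and each specimen is assumed to have at least one count.

```python
from statistics import mean, stdev


def summarize_counts(counts_by_specimen):
    """Summarize replicate counts (e.g., open flowers) for each specimen."""
    summary = {}
    for specimen_id, counts in counts_by_specimen.items():
        # The sample standard deviation needs two or more counts; otherwise use 0.
        spread = stdev(counts) if len(counts) > 1 else 0.0
        summary[specimen_id] = {
            "n": len(counts),
            "mean": mean(counts),
            "stdev": spread,
        }
    return summary


def flag_for_review(summary, max_stdev):
    """Return the IDs of specimens whose counts vary more than the threshold."""
    return sorted(sid for sid, s in summary.items() if s["stdev"] > max_stdev)


# Hypothetical counts from several annotators for three specimens.
counts = {"A": [4, 5, 6], "B": [10, 30, 20, 40], "C": [7]}
summary = summarize_counts(counts)
print(flag_for_review(summary, max_stdev=5.0))
```

A summary of this kind keeps the mean count usable for analysis while making it easy to route the most variable specimens, often those with many structures to count, to a trained scorer.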
One ultimate goal is to combine herbarium specimen data with other sources of phenological data to make possible the detection of phenological changes across geographic, temporal, and taxonomic scales. The PPO provides an opportunity for herbarium data to be combined with disparate data sources, such as in situ phenological monitoring or satellite imagery. The PPO is a common vocabulary for describing plant phenological traits and was designed to provide a means to support global‐scale integration of phenological data. Ontologies provide highly structured, controlled vocabularies for data annotation and are particularly useful for standardization, because they not only establish a common terminology but also formalize logical relationships between terms such that they can be analyzed using machine reasoning. For example, logical term relationships in the PPO specify that any plant with “expanding leaves” must necessarily also have “non‐senescing leaves.” This logical structure means that data can be integrated at different levels of detail and software can be used to establish new facts about the data that were not expressed in the original data sets. This structure in turn enables large‐scale integration among a wide range of study types, including: (i) studies addressing similar phenophases but using different methodologies, (ii) studies involving different phenophases, and (iii) studies not specifically addressing phenology but producing other types of data (e.g., trait or climatic data). Thus, the PPO empowers researchers to aggregate larger data sets, at the global scale, and to address broader questions involving the interplay of phenology and other factors. Accordingly, the PPO is already being used to integrate data resources such as those from the USA National Phenology Network (Denny et al., ), the Pan‐European Phenology Network (
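The way logical relationships between ontology terms let software establish facts not stated in the original data can be sketched in a few lines. This is a toy illustration, not the PPO itself or any ontology reasoner: only the "expanding leaves" to "non‐senescing leaves" relationship comes from the text, while the "leaves present" term and the function name are hypothetical additions used to show a chain of inference.

```python
# Each term maps to the terms it logically implies.
IMPLIES = {
    "expanding leaves": {"non-senescing leaves"},  # relationship stated in the text
    "non-senescing leaves": {"leaves present"},    # hypothetical, for illustration only
}

def infer_traits(observed):
    """Return the observed terms plus every term they imply, directly or transitively."""
    inferred = set(observed)
    stack = list(observed)
    while stack:
        term = stack.pop()
        for implied in IMPLIES.get(term, ()):
            if implied not in inferred:
                inferred.add(implied)
                stack.append(implied)
    return inferred

# A record scored only as "expanding leaves" also answers
# queries posed at a coarser level of detail.
record = {"expanding leaves"}
all_traits = infer_traits(record)
print(sorted(all_traits))
```

Under these assumptions, a data set that records only detailed phenophases can still be matched against one that records only coarse ones, because the coarse terms are derived rather than entered by hand.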
Never before has an understanding of phenology been so important to humans. We are in a time of massive environmental change, and the organisms upon which we depend will have to adapt or migrate if they are to avoid local or global extinction. Herbarium specimens are critical to understanding those changes and mitigating their effects. We need phenological data from specimens now more than ever, and researchers are ready and eager to analyze high‐quality data sets, particularly those comprising high taxonomic diversity, temporal depth, and a broad geographic range. With minimal additional effort during or after digitization, specimens can be scored quickly and easily and contribute to our understanding of our changing planet and the flora that sustains it.
This work was supported by the National Science Foundation (grants DBI‐1547229 [P.S.S.], DBI‐0735191 and DBI‐1265383 [R.L.W.], DBI‐1458550 [R.P.G.], DBI‐1410087 [A.B.M.], DBI‐EF1208835 [C.C.D.], DEB‐1556768 [S.J.M.], DBI‐1458264 [J.R.C.], and DBI‐1209149 [P.W.S.]). Additional support was provided by the Andrew W. Mellon Foundation, the Sibbald Trust, and the Scottish Government. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
All authors contributed equally to the development of the protocol. J.M.Y. managed the writing of the paper with contributions from all authors. C.G.W., P.W.S., E.G., and M.W.D. provided figures and text from related projects.
© 2018. This work is published under http://creativecommons.org/licenses/by-nc/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Premise of the Study
Herbarium specimens provide a robust record of historical plant phenology (the timing of seasonal events such as flowering or fruiting). However, the difficulty of aggregating phenological data from specimens arises from a lack of standardized scoring methods and definitions for phenological states across the collections community.
Methods and Results
To address this problem, we report on a consensus reached by an
Conclusions
Our hope is that curators and others interested in collecting phenological trait data from specimens will use the recommendations presented here in current and future scoring efforts. New tools for scoring specimens are reviewed.
Details
1 Department of Biological Sciences, California Polytechnic State University, San Luis Obispo, California, USA
2 Division of Botany, Peabody Museum of Natural History, Yale University, New Haven, Connecticut, USA
3 Arizona State University, School of Life Sciences, Tempe, Arizona, USA
4 iDigBio, College of Communication and Information, Florida State University, Tallahassee, Florida, USA
5 Florida Museum of Natural History and Biodiversity Institute, University of Florida, Gainesville, Florida, USA
6 Boston University, Department of Biology, Boston, Massachusetts, USA
7 La Brea Tar Pits and Museum, Los Angeles, California, USA
8 Department of Ecology, Evolution and Marine Biology, University of California, Santa Barbara, California, USA
9 Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, Massachusetts, USA; University of Minnesota, Department of Biology Teaching and Learning, Minneapolis, Minnesota, USA
10 Biodiversity Information Standards (TDWG), San Francisco, California, USA
11 CyVerse, University of Arizona, Tucson, Arizona, USA
12 Royal Botanic Garden Edinburgh, Edinburgh, United Kingdom
13 Florida Museum of Natural History and Biodiversity Institute, University of Florida, Gainesville, Florida, USA; Department of Biology, Appalachian State University, Boone, North Carolina, USA
14 Systematic Botany and Mycology, Department of Biology, Munich University (LMU), Munich, Germany
15 Department of Biology, Middle Tennessee State University, Murfreesboro, Tennessee, USA
16 Biology Department, Valdosta State University, Valdosta, Georgia, USA
17 University and Jepson Herbaria, University of California Berkeley, Berkeley, California, USA
18 Swedish University of Agricultural Sciences, Unit for Field‐based Forest Research, Lammhult, Sweden
19 USA National Phenology Network, University of Arizona, Tucson, Arizona, USA
20 UC Davis Center for Plant Diversity, Davis, California, USA
21 Department of Biological Science, Florida State University, Tallahassee, Florida, USA
22 Department of Organismic and Evolutionary Biology, Harvard University Herbaria, Cambridge, Massachusetts, USA
23 University and Jepson Herbaria, University of California Berkeley, Berkeley, California, USA; Department of Integrative Biology, University of California, Berkeley, California, USA