The estimation of demographic parameters is fundamental to successful conservation and evolutionary ecology. Since their initial development, capture–mark–recapture (hereafter, CMR) models have been used to estimate demographic parameters such as apparent survival (Cormack, 1964; Jolly, 1965; Seber, 1965), true survival and site fidelity (Burnham, 1993), transitions among discrete strata (Brownie et al., 1993), temporary emigration or breeding probability (Kendall et al., 1995, 1997), recruitment (Pradel, 1996), and the spatial distribution of organisms (Royle et al., 2013; Royle & Young, 2008). Parameter estimates from CMR models are often used as vital components of population models (Caswell, 2000; Schaub & Kéry, 2021) and to develop a more complete understanding of individual fitness (Cam et al., 2002; Stearns, 1992). CMR models typically consist of two primary components: (1) a model of latent biological processes (i.e., survival, movement among populations, emigration, disease dynamics), and (2) a model of the observation of uniquely identifiable individuals. Models of both latent biological and observation processes typically take the form of categorical or Bernoulli distributions, and individuals are grouped into discrete groups or states (e.g., alive or dead, observed or not observed).
Heterogeneity among “uniquely identifiable” (hereafter, marked) organisms in both biological processes (e.g., Cam et al., 2002; Pledger & Schwarz, 2002) and observation probability (e.g., Pledger, 2005; Pollock, 1982) has long been recognized as a central challenge in CMR modeling (Otis et al., 1978). In a seminal paper, Pollock (1982) proposed that heterogeneity in detection might be accounted for by subdividing primary occasions into multiple secondary occasions. Similarly, Fletcher (1994) developed a method for modeling the probability of encounter of individuals as a function of the number of unique resights of that individual during the previous occasion. Shortly thereafter, Kendall and others (Kendall et al., 1995, 1997) expanded the method developed by Pollock (1982) to estimate availability for encounter (i.e., zero-inflation) by partitioning primary occasions into shorter secondary occasions, assuming closure among secondary occasions within a primary occasion, and estimating probabilities of temporary emigration from the study area. Since that time, methods have been developed to estimate individual detection probabilities using random effects (Clark et al., 2005; Royle & Dorazio, 2008) or mixtures (Pledger, 2000; Pledger et al., 2003). More recent efforts have simultaneously used information about marked organism location and the locations of sampling efforts to model spatial variation in reencounter probability (Royle et al., 2013; Royle & Young, 2008). However, the estimation of heterogeneity in the observation process remains a key challenge in CMR studies, and the continued development of alternative approaches is critical for improved parameter estimation.
Heterogeneity in the detection of marked organisms is often driven by two primary processes. The first is whether or not an individual is even present within the bounds of the study area (i.e., temporary emigration as a source of zero-inflation; Kendall et al., 1995; Schaub et al., 2004). The second is variation among the latent encounter probabilities of individuals that are present. This latent heterogeneity can be affected by factors such as variation in individual behavior, life stage, and location relative to sampling effort (Royle & Young, 2008). When primary occasions extend over multiple days, weeks, or months, this can lead to some individuals being encountered many times while others are rarely, if ever, detected. The key concept in this paper is that in the same way that counts contain more information about the abundance of a population than simple detection/non-detection data, the number of encounters of marked individuals may contain more information about the observation process than detection/non-detection data (e.g., McClintock et al., 2009, 2019; McClintock & White, 2009). Thus, rather than summarizing capture–reencounter data using ones (encountered) and zeroes (not encountered) during a primary occasion or multiple secondary occasions, capture–reencounter data can also be summarized as counts of the number of times each marked individual was encountered during a primary occasion (McClintock et al., 2019; McClintock & White, 2009). The number of encounters can then be modeled using a variety of discrete distributions, such as the Poisson or negative binomial distributions. If model assumptions are met, this approach provides a flexible and useful way to model the observation process and may improve upon existing tools to estimate heterogeneity in encounter probability among individuals. Notably, improved estimates of heterogeneity in the observation process lead to improved estimates of demographic parameters.
In this paper, we (1) demonstrate the use of this approach with simulated data, (2) describe potential benefits relative to more traditional approaches, (3) demonstrate several approaches for modeling individual heterogeneity in encounter probability, and (4) discuss possible future extensions and uses of this parameterization.
METHODS
We simulated 250 CMR datasets, each with 10 primary occasions. For each simulation, we released 25 marked individuals in each of the first through ninth primary occasions, for a total of 225 released individuals. We simulated the latent state of each individual (1: alive, 0: dead) from occasion to occasion as a Bernoulli trial, given a survival probability (ϕ) generated from a beta distribution. If an individual was alive in occasion t, we simulated its availability for encounter (1: available, 0: unavailable) given simulated Markovian (Kendall et al., 1997) probabilities of availability for encounter (γ).
These probabilities are directly analogous to parameters described by Kendall et al. (2013): the first is the probability of availability given availability in the previous occasion, and the second is the probability of availability given absence in the previous occasion. During each primary occasion, we sampled individuals that were available for detection for 21 consecutive days (i.e., 3 weeks) given simulated individual random variation in daily detection probability (Dorazio et al., 2013; Gomez et al., 2018). Thus, the simulated capture–recapture data form a three-dimensional array (Y) indexed by individual, primary occasion, and day. Daily detection probabilities were simulated from the mean daily detection probability of an average individual and the amount of among-individual heterogeneity in detection (σ_δ). We then summarized the daily CMR data for analysis with four different model types: (1) a Cormack–Jolly–Seber model where the secondary captures are ignored (CJS; Cormack, 1964; Jolly, 1965; Seber, 1965), (2) a robust design model (RD; Kendall et al., 1995, 1997), and two capture–recapture models with count-based observation likelihoods, (3) a zero-inflated Poisson (ZIP), and (4) a zero-inflated gamma-Poisson with heterogeneity in the number of encounters per individual (ZIGP). To summarize the CMR data (M) for a CJS model, we constructed a matrix with one row per individual and one column per primary occasion, and filled each cell according to whether or not the individual was observed on any day during that primary occasion (1: observed on at least one day, 0: otherwise).
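The simulation steps described above can be sketched as follows. This is a minimal illustration, not the authors' script (which is attached to the paper as Appendix S1). The function name, the beta parameters, the availability probabilities, the mean daily detection probability, the heterogeneity value, the logit-normal form of the individual effect, and the choice to treat individuals as available at release are all assumptions made for illustration.

```python
import math
import random


def simulate_cmr(n_occasions=10, releases_per_occasion=25, n_days=21,
                 phi_a=16.0, phi_b=4.0,        # hypothetical beta parameters
                 gamma_aa=0.7, gamma_ua=0.4,   # hypothetical availability probs
                 mean_p=0.1, sigma_delta=0.5,  # hypothetical detection values
                 seed=1):
    """Return (phi, histories); each history holds daily 0/1 detections y[t][d]."""
    rng = random.Random(seed)
    phi = rng.betavariate(phi_a, phi_b)        # one survival probability per dataset
    logit_mean = math.log(mean_p / (1.0 - mean_p))
    histories = []
    for release in range(n_occasions - 1):     # releases in occasions 1 through 9
        for _ in range(releases_per_occasion):
            # Individual daily detection probability, varying on the logit scale.
            delta = rng.gauss(0.0, sigma_delta)
            p_i = 1.0 / (1.0 + math.exp(-(logit_mean + delta)))
            alive, available = True, True      # assumed state at release
            y = [[0] * n_days for _ in range(n_occasions)]
            for t in range(release + 1, n_occasions):
                alive = alive and rng.random() < phi
                if not alive:
                    break                      # dead individuals are never seen again
                # Markovian availability: depends on availability in t - 1.
                prob_available = gamma_aa if available else gamma_ua
                available = rng.random() < prob_available
                if available:
                    y[t] = [int(rng.random() < p_i) for _ in range(n_days)]
            histories.append({"release": release, "y": y})
    return phi, histories


phi, histories = simulate_cmr(seed=1)
print(len(histories))  # 9 release occasions x 25 individuals
```

Each history is conditioned on release, so no detections are generated in or before the release occasion.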
To summarize the robust design encounter data (R) for the robust design capture–reencounter model, we subdivided each 21-day primary occasion into three one-week secondary occasions. If an individual was observed on any day of the week in a secondary occasion, then that secondary occasion equaled one; if an individual was not observed on any day during a secondary occasion, then it equaled zero. Finally, we summarized the counts of reencounters (C) by individual and primary occasion by summing the total number of encounters of each individual during each primary occasion.
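The three data summaries described above (any detection in the primary occasion, any detection in each week, and the total count) can be sketched for a single individual and primary occasion. The function name and the example detection history are hypothetical.

```python
def summarize_occasion(daily, days_per_week=7):
    """Collapse one individual's daily 0/1 detections within a primary occasion.

    Returns (m, r, c): the CJS indicator, the list of weekly robust design
    indicators, and the total number of encounters.
    """
    if len(daily) % days_per_week != 0:
        raise ValueError("number of days must be a multiple of days_per_week")
    weeks = [daily[k:k + days_per_week]
             for k in range(0, len(daily), days_per_week)]
    m = int(any(daily))                      # seen on any day of the occasion
    r = [int(any(week)) for week in weeks]   # seen in each secondary occasion
    c = sum(daily)                           # total number of encounters
    return m, r, c


# Hypothetical 21-day history: two detections in week 1, none in week 2,
# three in week 3.
daily = [0, 1, 0, 0, 0, 1, 0] + [0] * 7 + [1, 1, 1, 0, 0, 0, 0]
print(summarize_occasion(daily))  # (1, [1, 0, 1], 5)
```

The example shows what each summary keeps: the CJS indicator retains one bit, the robust design summary retains three, and the count retains the five encounters.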
In the same way that the data were generated, all four capture–recapture models share a common likelihood for the survival process. The latent state of each individual during each occasion was modeled as a function of the individual's latent state in the previous occasion and a survival probability (ϕ). A vague prior was used for survival. For the CJS model, we then simply modeled the primary occasion encounter data (M) as a function of the individual's latent state and a detection probability (p). We specified a vague prior for detection probability. For the remaining three models, we also estimated whether an individual was available for detection given its previous state and vague priors for Markovian probabilities of availability for encounter (γ; Kendall et al., 1997).
For the robust design model, we modeled whether or not each individual was detected during each secondary occasion as a function of its latent availability for detection during the primary occasion and a secondary occasion detection probability (p). We then derived primary occasion detection probability (p*) from the secondary occasion detection probabilities for comparison of parameter estimates among models.
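A standard way to derive a primary occasion detection probability is as the probability of at least one detection across the secondary occasions. Assuming a single detection probability p shared by the three one-week secondary occasions (an assumption, since the original equation is not reproduced here), this gives:

```latex
p^{*} = 1 - (1 - p)^{3}
```

For example, p = 0.5 would give p* = 1 − 0.5³ = 0.875.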
For the zero-inflated Poisson model, we modeled the total number of encounters of each individual during each primary occasion given availability for detection and an expected mean number of encounters per individual per primary occasion (ϵ).
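As a rough illustration of a zero-inflated Poisson observation model, the sketch below gives the probability of a count once the latent availability state has been summed out. The function and argument names are hypothetical. Treating availability as a single probability is a simplification of the Markovian availability process described in the text.

```python
import math


def zip_pmf(count, availability, expected_encounters):
    """P(C = count) under a zero-inflated Poisson.

    availability: probability that the individual is available for detection.
    expected_encounters: Poisson mean for available individuals (must be > 0).
    """
    if count < 0:
        return 0.0
    log_poisson = (count * math.log(expected_encounters)
                   - expected_encounters - math.lgamma(count + 1))
    prob = availability * math.exp(log_poisson)
    if count == 0:
        prob += 1.0 - availability   # unavailable individuals always yield zero
    return prob


# Zeros arise from unavailability and from available individuals missed by chance.
print(zip_pmf(0, availability=0.5, expected_encounters=2.0))
```

Zeros therefore come from two sources, which is what allows counts to inform availability separately from the encounter rate.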
For the zero-inflated Gamma-Poisson model with heterogeneity in the number of expected observations per individual, we modeled the number of encounters of each individual during each primary occasion given availability for detection, the mean expected number of encounters per individual (ϵ), and individual encounter heterogeneity estimated using an overdispersion parameter (θ).
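One common way to write a gamma-Poisson model with individual-level heterogeneity is sketched below. The symbols C (count), a (availability indicator), and η (individual multiplier) are assumed for illustration. The Gamma(θ, θ) shape-rate form is also an assumption; it is chosen so that the multiplier has mean one and variance 1/θ, and the omitted equation may differ in detail.

```latex
\begin{aligned}
C_{i,t} \mid a_{i,t}, \eta_i &\sim \mathrm{Poisson}\left(a_{i,t}\,\epsilon\,\eta_i\right), \\
\eta_i &\sim \mathrm{Gamma}(\theta, \theta), \\
\mathrm{E}\left[C_{i,t} \mid a_{i,t} = 1\right] &= \epsilon, \\
\mathrm{Var}\left[C_{i,t} \mid a_{i,t} = 1\right] &= \epsilon + \frac{\epsilon^{2}}{\theta}.
\end{aligned}
```

Under this form, small values of θ imply strong heterogeneity among individuals, and the model approaches a Poisson as θ grows large.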
This parameterization is similar to Gamma-Poisson formulations of the negative binomial distribution (Greene, 2008); however, here we assume heterogeneity among individuals, not observations. Fitting these models in a Bayesian framework allows users to easily customize existing described count distributions for use in these model types. We called JAGS (Plummer, 2003) from R (R Core Team, 2018) using the jagsUI package (Kellner, 2016). For each simulated dataset, we sampled three MCMC chains of 50,000 iterations with an adaptive phase of 1000 iterations. We discarded the first 10,000 iterations and retained every tenth saved iteration. We assessed convergence visually, and chains converged acceptably. We calculated mean signed difference (MSD) as the mean of the differences between the median of the posterior distribution and the true parameter value used to simulate the data, and we calculated coverage as the proportion of simulations in which the 95% symmetric credible intervals included the true parameter value used to simulate the data.
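The two performance summaries defined above can be computed as in the following sketch. The function name and the numbers in the example are hypothetical and are not results from the paper.

```python
def msd_and_coverage(truth, medians, lower, upper):
    """Mean signed difference and credible-interval coverage across simulations.

    truth:   true parameter value used in each simulation
    medians: posterior median from each simulation
    lower, upper: bounds of the 95% credible interval from each simulation
    """
    n = len(truth)
    msd = sum(m - t for m, t in zip(medians, truth)) / n
    covered = sum(1 for t, lo, hi in zip(truth, lower, upper) if lo <= t <= hi)
    return msd, covered / n


# Four hypothetical simulations with a true value of 0.8.
truth = [0.80, 0.80, 0.80, 0.80]
medians = [0.78, 0.81, 0.80, 0.75]
lower = [0.70, 0.72, 0.71, 0.60]
upper = [0.86, 0.90, 0.89, 0.79]
msd, coverage = msd_and_coverage(truth, medians, lower, upper)
print(round(msd, 3), coverage)  # -0.015 0.75
```

A negative MSD indicates that posterior medians fall below truth on average, and coverage well below 0.95 indicates intervals that are too narrow or poorly centered.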
RESULTS
Estimates of survival (ϕ) were low relative to truth for CJS models (MSD = −0.047; Coverage = 0.464), but constant (i.e., equivalent to truth) and calibrated (i.e., exhibited appropriate coverage near 0.95) for RD (MSD = −0.003; Coverage = 0.940), ZIP (MSD = −0.002; Coverage = 0.948), and ZIGP (MSD = 0.001; Coverage = 0.948) CMR models (Figure 1; Table 2). Estimates of availability for encounter given previous availability for encounter were slightly underestimated by RD (MSD = −0.020; Coverage = 0.892) and ZIP (MSD = −0.013; Coverage = 0.896) models, but near truth for the ZIGP (MSD = 0.006; Coverage = 0.936) CMR model (Figure 2; Table 2). Estimates of availability for encounter given previous unavailability for encounter were slightly overestimated by RD (MSD = 0.018; Coverage = 0.956), ZIP (MSD = 0.015; Coverage = 0.964), and ZIGP (MSD = 0.019; Coverage = 0.976) CMR models, but coverage was adequate. Estimates of detection probability (p) exhibited poor coverage (Figure 3; Table 2) for the RD (MSD = 0.009; Coverage = 0.832) CMR model. Estimates of the average number of reencounters per individual (ϵ) were overestimated with poor coverage with the ZIP (MSD = 0.078; Coverage = 0.764) CMR model, and near truth with the ZIGP (MSD = 0.002; Coverage = 0.928) CMR model. The simulated individual heterogeneity in encounter probability (σ_δ) in the data was positively correlated with dispersion in the count data (D; Figure 4). The overdispersion parameter (θ) in the ZIGP model accounted for some of this overdispersion (Figure 4), improving coverage and constancy for ZIGP models relative to other model types. ZIP and ZIGP models were computationally less expensive than RD models (Figure 4) to sample the same number of iterations.
FIGURE 1. Scatter and density plots of the medians of posterior distributions for apparent survival relative to truth (ϕ) from Cormack–Jolly–Seber (CJS; upper left), robust design (RD; upper right), zero-inflated Poisson (ZIP, lower left), and zero-inflated gamma-Poisson with individual heterogeneity (ZIGP; lower right), capture–mark–reencounter models used to analyze 250 simulated capture–mark–reencounter datasets.
FIGURE 2. Scatter and density plots of the medians of posterior distributions for availability for encounter relative to truth (γ) from robust design (RD; left), zero-inflated Poisson (ZIP, center), and zero-inflated gamma-Poisson with individual heterogeneity (ZIGP; right), capture–mark–reencounter models used to analyze 250 simulated capture–mark–reencounter datasets.
FIGURE 3. Scatter and density plots of the medians of posterior distributions for primary occasion detection probability (p) or the expected number of encounters per individual (ϵ) from Cormack–Jolly–Seber (CJS; upper left), robust design (RD; upper right), zero-inflated Poisson (ZIP, lower left), and zero-inflated gamma-Poisson with individual heterogeneity (ZIGP; lower right), capture–mark–reencounter models used to analyze 250 simulated capture–mark–reencounter datasets.
FIGURE 4. Violin plots of model run times across 250 simulations for Cormack–Jolly–Seber (CJS; Cormack, 1964; Jolly, 1965; Seber, 1965), robust design (RD; Kendall et al., 1995, 1997), zero-inflated Poisson (ZIP; this study) and zero-inflated gamma-Poisson (ZIGP; this study) capture–mark–recapture models (left), scatter plots of the index of dispersion (D; Var(C)/Mean(C)) for the capture–mark–reencounter count data relative to the simulated heterogeneity in detection probability among individuals (σ_δ), and scatterplots of the mean of posterior distributions of the overdispersion parameter (θ) regressed against the index of dispersion for each capture–mark–recapture dataset.
We demonstrate that CMR models parameterized with zero-inflated count distributions can function much like robust design CMR models. Estimates of survival probability from RD, ZIP, and ZIGP models were centered around truth, while estimates of survival from the CJS model were consistently low relative to truth. Further, the use of these model types may allow for improved estimation of heterogeneity in encounter probability among individuals and improve computational efficiency (Figure 4). We see substantial utility for these parameterizations in a variety of scenarios. For instance, non-breeding resights of individuals at wintering or stopover sites may provide an excellent system to model the total number of encounters rather than simple detection/non-detection data. Further, existing and emerging data types such as camera traps, PIT tags, and automated telemetry may provide large numbers of detections in discrete time blocks, providing excellent data for the models we describe in this paper.
As we demonstrate, this approach may be particularly useful when unobservable states exist, as counts of reencounters allow for the estimation of a zero-inflation parameter (i.e., availability for detection), which may be biologically analogous to breeding probability or presence at a stopover or wintering site. Count parameterizations might also be used to model secondary occasions within a robust design model; one or more secondary occasions may be estimated from some count distribution and others from a more typical Bernoulli distribution. The inherent flexibility of programs such as JAGS (Plummer, 2003), NIMBLE (de Valpine et al., 2017), and Stan (Carpenter et al., 2017), and ample literature on capture–reencounter parameterizations should lead to a wide array of extensions of these model types, and their incorporation into joint likelihood models, such as integrated population models (Schaub & Kéry, 2021).
Critically, the use of these model types also has advantages for estimating heterogeneity in detection probability among individuals that are observable, as some individuals may be seen more often than others. Estimating heterogeneity in probabilities from a small number of Bernoulli trials can be challenging (Fay et al., 2022). Summarizing mark–reencounter data as counts of encounters may provide additional information for estimating latent heterogeneity among individuals or estimating mixtures (e.g., Pledger et al., 2003). For example, rather than the heterogeneity parameterization explored in this paper, one might specify a mixture distribution for the number of encounters per individual. Individual covariates can be incorporated simply by modeling the expected number of encounters with a log-link function. We anticipate a variety of other parameterizations might be useful as well (Table 1) and that simulation work may reveal more effective parameterizations than those described herein. For instance, recent research has demonstrated that a count-based observation likelihood can be useful for helping to address “false-positives” in reencounter data (Rakhimberdiev et al., 2022). Thus, we suggest that continued extension of these methods may have broad utility moving forward for capture–reencounter modeling.
TABLE 1 Potential parameterizations for zero-inflated count distribution-based capture–reencounter models, where C is the number of encounters of an individual during a primary occasion
| Parameterization | Model and priors |
| 1. Poisson | |
| 2. Gamma-Poisson with individual heterogeneity | |
| 3. Poisson with two categorical mixtures | |
| 4. Alternative Gamma–Poisson with individual heterogeneity | |
| 5. Lognormal with individual covariates (X) and heterogeneity | |
Note: We explicitly test parameterizations 1 and 2 in this paper. Parameterization 3 allows for mixtures in encounter probability, with one parameter for the proportion of individuals in group one and a categorical variable defining the mixture of each individual. Parameterization 4 is similar to parameterization 2, but with a slightly different model for each individual's encounter probability with shape and rate hyperpriors. Finally, parameterization 5 allows for the inclusion of individual covariates (X), associated regression parameters, and individual heterogeneity. Please note that a much larger number of potential parameterizations exists, and see Pledger et al. (2003), Greene (2008), Lynch et al. (2014), Kéry and Royle (2015), and McClintock et al. (2009, 2019) for further reading.
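For orientation, the five parameterizations named in Table 1 might be written along the following lines. This is an illustrative sketch built from the verbal descriptions in the note, not a reproduction of the table's formulas, and priors are left out. All symbols are assumptions: C for counts, a for the availability indicator, ϵ for the expected number of encounters, η and λ for individual effects, g and π for mixture membership and proportion, α and β for shape and rate, and b and e for regression coefficients and residual heterogeneity.

```latex
\begin{aligned}
\text{1. Poisson:} \quad & C_{i,t} \sim \mathrm{Poisson}(a_{i,t}\,\epsilon) \\
\text{2. Gamma-Poisson:} \quad & C_{i,t} \sim \mathrm{Poisson}(a_{i,t}\,\epsilon\,\eta_i), \quad \eta_i \sim \mathrm{Gamma}(\theta, \theta) \\
\text{3. Two-group mixture:} \quad & C_{i,t} \sim \mathrm{Poisson}(a_{i,t}\,\epsilon_{g_i}), \quad g_i \sim \mathrm{Categorical}(\pi, 1 - \pi) \\
\text{4. Alternative gamma-Poisson:} \quad & C_{i,t} \sim \mathrm{Poisson}(a_{i,t}\,\lambda_i), \quad \lambda_i \sim \mathrm{Gamma}(\alpha, \beta) \\
\text{5. Lognormal with covariates:} \quad & C_{i,t} \sim \mathrm{Poisson}(a_{i,t}\,\lambda_i), \quad \log \lambda_i = X_i b + e_i, \quad e_i \sim \mathrm{Normal}(0, \sigma^{2})
\end{aligned}
```

In each case the availability indicator forces the expected count to zero for unavailable individuals, which is the source of the zero-inflation.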
TABLE 2 Mean difference between the medians of the posterior distributions and truth and parameter coverage (in parentheses) for estimates of apparent survival (ϕ), availability for encounter given previous unavailability, availability for encounter given previous availability, primary occasion detection probability (p or p*), and the expected number of encounters per individual (ϵ)
| Parameter | CJS | RD | ZIP | ZIGP |
| Apparent survival (ϕ) | −0.047 (0.464) | −0.003 (0.940) | −0.002 (0.948) | 0.001 (0.948) |
| Availability given previous unavailability | – | 0.018 (0.956) | 0.015 (0.964) | 0.019 (0.976) |
| Availability given previous availability | – | −0.020 (0.892) | −0.013 (0.896) | 0.006 (0.936) |
| p (CJS) or p* (RD) | −0.306 (0.004) | 0.010 (0.832) | – | – |
| Expected number of encounters (ϵ) | – | – | 0.078 (0.764) | 0.002 (0.928) |
As with the use of any model, violations of model assumptions will lead to inaccurate parameter estimates. We caution against the use of these models when encounters are conditional on previous encounters within a season (i.e., trap happiness). As a particularly problematic example, if the nest of a marked animal is discovered and the animal is then observed repeatedly while visiting the nest, this would serve as an additional type of zero-inflation (i.e., nesting in the study area is a Bernoulli trial, the discovery of the nest is a Bernoulli trial, and the subsequent visits are a product of study design and nest monitoring protocols, not a random encounter process). We expect that other types of heterogeneity are common in CMR data. For example, the number of encounters might be right truncated if observers cease recording reencounters of individuals that have already been encountered multiple times. Thus, we strongly encourage careful thought about how previous monitoring protocols might affect the distribution of encounters of each individual when applying these models to data and discourage using this approach without explicit information about monitoring protocols.
The use of the Poisson distribution requires the assumption that the mean and the variance are equal. When the encounter data are under- or overdispersed, this can lead to respective under- or overestimation of the expected number of encounters per individual. Similarly, the probability of availability for encounter will be over- or underestimated given under- or overdispersion of the encounter data (Figure 4). While overdispersion can be modeled simply using a gamma-Poisson mixture (demonstrated herein) or negative binomial distributions (Table 1), underdispersion requires the use of more complex distributions such as the Conway–Maxwell–Poisson (Conway & Maxwell, 1962; Lynch et al., 2014). We suggest that additional simulation work is required to fully understand the benefits and costs associated with using alternative distributions. Notably, while the authors have not yet developed goodness-of-fit tests for these model types, the use of these parameterizations might simplify goodness-of-fit testing for capture–reencounter models due to the use of counts rather than Bernoulli trials.
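The equal mean and variance assumption can be checked informally with the index of dispersion, D = Var(C)/Mean(C). The sketch below compares Poisson counts with gamma-Poisson counts among individuals that are all available. The function names, the sample size, the expected count of 3, and θ = 1 are illustrative assumptions. Under the assumed Gamma(θ, θ) multiplier, D should be near 1 + ϵ/θ.

```python
import math
import random


def rpois(rng, mean):
    """Draw one Poisson variate (Knuth's multiplication method; small means only)."""
    if mean <= 0.0:
        return 0
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def dispersion_index(counts):
    """D = Var(C) / Mean(C), using the sample variance."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return var / mean


def simulate_counts(n, expected, theta=None, seed=1):
    """Poisson counts if theta is None, otherwise gamma-Poisson counts."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        # gammavariate(shape, scale): shape = theta, scale = 1 / theta has mean 1.
        eta = 1.0 if theta is None else rng.gammavariate(theta, 1.0 / theta)
        counts.append(rpois(rng, expected * eta))
    return counts


poisson_counts = simulate_counts(5000, expected=3.0)
mixed_counts = simulate_counts(5000, expected=3.0, theta=1.0)
print(dispersion_index(poisson_counts))  # expected to be close to 1
print(dispersion_index(mixed_counts))    # expected to be well above 1
```

A value of D well above one in observed counts would suggest that a plain Poisson observation model is too restrictive.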
While we have demonstrated in this paper that count-based observation parameterizations can be useful for capture–mark–reencounter studies, much remains to be learned. For example, careful thought will be required for developing appropriate priors (e.g., Northrup & Gerber, 2018), and empirical research may reveal unforeseen problems. Future simulation work might assess the impacts of priors on inference, further examine the impacts of over- and under-dispersion, and explore various other capture–recapture parameterizations and count distributions.
AUTHOR CONTRIBUTIONS
Thomas V. Riecke: Conceptualization (lead); writing – original draft (lead); writing – review and editing (equal). Daniel Gibson: Conceptualization (equal); writing – review and editing (equal). James S. Sedinger: Conceptualization (equal); writing – review and editing (equal). Michael Schaub: Conceptualization (equal); writing – review and editing (equal).
ACKNOWLEDGMENTS
We thank David N. Koons and Madeleine G. Lohman for helpful discussion of model parameterizations.
CONFLICT OF INTEREST
The authors have no conflict of interest to declare.
DATA AVAILABILITY STATEMENT
All of the data used in this manuscript were simulated. The R script for simulating these data is attached as Appendix S1.
© 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
The estimation of demographic parameters is a key component of evolutionary demography and conservation biology. Capture–mark–recapture methods have served as a fundamental tool for estimating demographic parameters. The accurate estimation of demographic parameters in capture–mark–recapture studies depends on accurate modeling of the observation process. Classic capture–mark–recapture models typically model the observation process as a Bernoulli or categorical trial with detection probability conditional on a marked individual's availability for detection (e.g., alive, or alive and present in a study area). Alternatives to this approach are underused, but may have great utility in capture–recapture studies. In this paper, we explore a simple concept:
Riecke, Thomas V.; Gibson, Daniel 2; Sedinger, James S. 3; Schaub, Michael 1
1 Swiss Ornithological Institute, Sempach, Switzerland
2 Warner College of Natural Resources, Colorado State University, Fort Collins, Colorado, USA
3 Department of Natural Resources and Environmental Science, University of Nevada, Reno, Nevada, USA




