Estimating population density is one of the fundamental goals of population ecology and is necessary for the effective management and conservation of wildlife (Royle et al., 2014). With the need for robust population estimates has come the advent and application of innovative data collection methods, such as noninvasive genetic sampling and camera traps (Long et al., 2008), enabling data to be collected at increasingly larger spatial and temporal scales. Alongside this growth in data collection abilities, coupled with increasing computing power, has come rapid development and widespread application of more sophisticated quantitative methods to estimate population density (Lewis et al., 2018). As a result, statistical models for estimating animal density and abundance have become a pervasive tool in ecology and are an integral component of most contemporary wildlife management and conservation programs. However, with this ever-increasing ability to address more complex research questions with advanced models comes the need to continually assess the robustness of models' assumptions to realistic ecological conditions and sampling processes (Gerber & Parmenter, 2015).
While estimating population density is the objective of many wildlife monitoring programs (Burton et al., 2015), identifying factors that give rise to spatial variation in density can have broader relevance and applicability to conservation and management (Fuller et al., 2016). As humans continue to transform natural landscapes, understanding how current and past habitat conditions and human disturbances interact to shape carnivore abundance and distribution over large landscapes is critical for predicting ecological outcomes, informing conservation and management, and assessing the effectiveness of these actions (Lamb et al., 2018). Spatially explicit capture–recapture (SECR; Borchers & Efford, 2008; Efford, 2004; Royle et al., 2014) methods are well-suited for this purpose. SECR represents an extension of the classical capture–recapture framework (Otis et al., 1987), which ignores locations of capture or detection, by coupling a spatial point process model for the distribution of animals (density submodel) with an observation model that describes encounter probability as a function of the distance between a sample location and animals' activity center (detection submodel; Borchers & Efford, 2008; Royle et al., 2014). SECR takes advantage of the increasing ability to identify individuals from noninvasive methods (i.e., passively collected camera trap photos and genetic samples). In comparison with other density estimation methods that do not require individual identification, SECR can often provide greater efficiency in survey design and yields more precise estimates from similar or less survey efforts (Howe et al., 2022). Although relatively recent in the field of statistical ecology, SECR methods have undergone considerable development and have become widely adopted as the standard for quantifying spatial patterns in density where animals are individually identifiable (Dupont et al., 2020).
Reliable information on large carnivore (hereafter, carnivore) population density is important given the species' wide-ranging impacts on ecosystems and their vulnerability to human-induced environmental changes (Ripple et al., 2014). Further, carnivores often come into conflict with people, making them more prone to human-induced population declines (Nyhus, 2016; Treves & Karanth, 2003). Consequently, carnivores are increasingly subject to intensive conservation and management programs that necessitate accurate population estimates to ensure success.
In practice, however, obtaining unbiased, precise estimates of carnivore density is often challenging because the species tend to range widely over difficult-to-access areas and exhibit low and variable densities and capture probabilities (Dröge et al., 2020; MacKenzie et al., 2005). SECR requires multiple detections of some of the same individuals (recaptures), including at different detector locations (spatial recaptures), where the number and spatial distribution of recaptures inform the baseline detection probability (g0) and spatial scale parameter (σ) of the detection submodel (Borchers & Efford, 2008; Royle et al., 2014). Sparse data, in which numbers of recaptures and spatial recaptures are low, have been demonstrated to lead to imprecise and biased estimates (Clark, 2017; Sollmann et al., 2012; Sun et al., 2014). Data sparsity is one of the major impediments to successfully implementing SECR studies of carnivores (Nawaz et al., 2021). Moreover, this challenge is further amplified when there are multiple sources of variation in detectability or density, as data must then be sufficient to estimate covariate effects or strata-specific model parameters.
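To make the roles of g0 and σ concrete, the following is a minimal sketch of a half-normal detection function, the form used later in this study. The function name and the example distances are hypothetical; the parameter values are those listed for females in area B in Table 2.

```python
import math


def half_normal_detection(distance_m, g0, sigma_m):
    """Per-occasion detection probability at a detector located
    `distance_m` meters from an animal's activity center."""
    return g0 * math.exp(-distance_m ** 2 / (2.0 * sigma_m ** 2))


# Illustrative values: female black bears in area B (Table 2).
g0, sigma = 0.25, 3500.0
for d in (0, 2000, 4000, 8000):
    print(d, round(half_normal_detection(d, g0, sigma), 4))
```

Detection probability equals g0 at distance zero and declines with distance at a rate governed by σ, which is why the spatial spread of recaptures carries the information needed to estimate σ.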
Research to date has addressed the issue of data sparsity through improved survey design and statistical techniques. In recent years, multiple studies on the American black bear (Ursus americanus) have contributed to improvements in SECR sampling methods and yielded insight into some aspects of survey design (Clark, 2017; Sollmann et al., 2012; Sun et al., 2014; Wilton et al., 2014). Collectively, these studies demonstrate that SECR methods generally perform well provided that the spatial components of survey design (extent, spacing, and configuration of arrays of detectors) are appropriate to the spatial characteristics of different subsets of the sampled population during the sampling period (such as home range size and range of individual movements). However, in practice, it can be logistically and financially challenging to conduct field surveys with sufficient spatial coverage and trap spacing for wide-ranging species that exist at low and nonuniform densities (Wilton et al., 2014). In cases when surveys yield insufficient data to estimate model parameters, a common approach is to pool data from separate surveys, where models are simultaneously fit with parameters shared across time or space (MacKenzie et al., 2005). This approach has been employed in several SECR studies of bears (Ursus spp.; Azad et al., 2019; Howe et al., 2013; Schmidt et al., 2022). With sufficiently large datasets, covariates can be included to account for differences in density and detectability to better reflect the actual state of the population (Sollmann et al., 2012).
While aggregating data can enhance precision and power to detect and model some sources of variation in detectability or density (Azad et al., 2019; Howe et al., 2013; Schmidt et al., 2022), this approach may come at the expense of bias (MacKenzie et al., 2005). When carnivores are sampled across broad landscapes, the assumption that detection and density parameters remain constant across space is likely violated. Both density and detectability of black bears have been documented to vary by sex and across space and time (Howe et al., 2013; Humm et al., 2017; Humm & Clark, 2021). Thus, if a misspecified SECR model is fit in the presence of sex-based and spatial heterogeneity, there is potential to produce biased density estimates (Tobler & Powell, 2013). However, it is often challenging to account for all potential sources of heterogeneity in SECR models due to small sample sizes relative to the number of parameters and the computational burden of fitting highly parameterized SECR models. An alternative way to avoid bias introduced by model misspecification is to fit separate models to subsets of the data. Yet, this approach is not always practical, as it may reduce sample size to the point that parameters cannot be estimated or that estimates are too imprecise to be useful for management or conservation. Further, for sparse datasets, as the number of spatial and nonspatial recaptures decreases, estimates of σ are often negatively biased, causing overestimation of population size and density (Clark, 2017; Sun et al., 2014).
Collectively, the above issues highlight a long-standing problem in statistical ecology: selecting an appropriate model for a dataset (Brewer et al., 2016). Model selection, the purpose of which is to identify models that optimize the trade-off between bias and precision (Burnham & Anderson, 2002), is a critical yet challenging component of SECR analyses. In the vast majority of SECR studies on bears that use a frequentist framework, models are selected using the Akaike information criterion adjusted for small sample sizes (AICc; Hurvich & Tsai, 1989), and the top-ranked model(s) are used for inferences regarding parameter estimates and relationships between variables of interest (examples include Lamb et al., 2018; Morehouse & Boyce, 2016; Obbard et al., 2010).
Identifying models that allow for aggregation of survey data to improve precision while still accounting for important sources of heterogeneity to reduce bias is thus an essential consideration for SECR studies on elusive species where pooling data across space or time is common (Howe et al., 2013, 2022; Schmidt et al., 2022). Yet, few studies have systematically examined the performance of model selection and consequences of misspecification of both the density and detectability submodels on SECR model performance under realistic levels of sex-based and spatial variation in carnivores.
Here, we assess how unmodeled spatial and sex-based variability in density and detectability influences the performance of SECR models and AICc model selection when data are pooled across sampling areas. We do so through simulation, using parameters consistent with noninvasive genetic surveys of black bears. However, the findings are generalizable to any wide-ranging or elusive species, particularly carnivores. We had three primary research questions: (1) How well does AICc correctly identify the true data generating model? (2) How well does the true data generating model perform, in terms of bias, accuracy, and precision, across a suite of simulations with varying levels of complexity in spatial (i.e., across study areas) and sex-based variation in density and detectability? (3) When AICc does not rank the data generating model as the top model, what are the implications regarding inferences about population density?
METHODS
Hypothetical study area and sampling design
Simulations were designed to roughly resemble the ongoing black bear monitoring research project in Ontario, Canada (Howe et al., 2013, 2022; Marrotte et al., 2022; Obbard et al., 2010). We simulated a hypothetical heterogeneous landscape composed of three distinct study areas (A, B, C). These study areas broadly represent habitats in Ontario, Canada, in which black bear density and detectability vary due to differences in habitat quality and human disturbances across the landscape (Howe et al., 2013; Obbard et al., 2010). Although these scenarios are parameterized for black bears in Ontario, patchy or gradient variation in density exists for many carnivore species due to habitat conditions and management practices (Fuller et al., 2016; Humm et al., 2017). In each hypothetical study area, black bears were sampled using two linear arrays of 40 detectors (i.e., lines of baited barbed wire hair corral detectors spaced apart by some distance), where arrays were far enough apart that no individuals were detected at more than one array. Linear arrays were selected to replicate the curvilinear arrays used for ongoing black bear research and monitoring in Ontario (Howe et al., 2013, 2022; Marrotte et al., 2022; Obbard et al., 2010). In the SECR literature, researchers recommend that detector spacing be less than two times the minimum spatial scale of the detection probability function (σmin; Sollmann et al., 2012; Sun et al., 2014) and, when g0 is low, less than σmin (Efford & Boulanger, 2019). Therefore, using our σmin of 2500 m (Table 2; grouping code 3 for σ female area C), detectors were spaced 2000 m apart.
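The spacing rule described above can be sketched as follows. This is only an illustration: the helper names are hypothetical, and the array is laid out as a straight line along the x-axis, which is a simplification of the curvilinear arrays used in the field.

```python
def linear_array(n_detectors=40, spacing_m=2000.0):
    """Detector coordinates (x, y) in meters for a straight-line array."""
    return [(i * spacing_m, 0.0) for i in range(n_detectors)]


def spacing_ok(spacing_m, sigma_min_m, low_g0=False):
    """Check spacing against the rule of thumb: less than 2 * sigma_min,
    or less than sigma_min when baseline detection (g0) is low."""
    limit = sigma_min_m if low_g0 else 2.0 * sigma_min_m
    return spacing_m < limit


detectors = linear_array()
array_length_m = detectors[-1][0] - detectors[0][0]
print(len(detectors), array_length_m)
print(spacing_ok(2000.0, 2500.0), spacing_ok(2000.0, 2500.0, low_g0=True))
```

With σmin = 2500 m, a spacing of 2000 m satisfies both the general rule and the stricter rule for low g0.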
Simulation of populations and capture histories
SECR data were simulated for populations of female and male bears over five sampling occasions, where populations were distributed according to a homogeneous Poisson point process within each study area. Populations of both sexes were simulated using a 6601.76-km² state space defined as the linear array plus a 27-km buffer; this buffer corresponds to 4σmax, which is recommended to ensure that animals with activity centers outside the state space have negligible chances of being detected (Efford, 2021). While this buffer is excessive for most female black bears, using an area of integration that is too small can positively bias SECR density estimates, whereas using an area that is too large does not lead to bias, as density estimates reach an asymptote with larger buffer widths (Efford, 2021; Royle et al., 2014). Spacing between the grid points of the mask was set at 2.2 km to reduce computation time after verifying that increasing resolution had negligible effects on density estimates.
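A homogeneous Poisson point process of activity centers can be sketched with the standard library alone. The sketch below is not the simulation code used in this study: it assumes a rectangular state space around a straight 78-km array (40 detectors, 2000 m apart), so its area differs from the 6601.76 km² reported above, and all function names are hypothetical.

```python
import random


def poisson_count(mean, rng):
    """Draw a Poisson variate by counting unit-rate exponential
    arrivals that occur before time `mean`."""
    total, count = rng.expovariate(1.0), 0
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def simulate_activity_centers(density_per_km2, width_km, height_km, rng):
    """Homogeneous Poisson process: Poisson-distributed N, uniform locations."""
    n = poisson_count(density_per_km2 * width_km * height_km, rng)
    return [(rng.uniform(0.0, width_km), rng.uniform(0.0, height_km))
            for _ in range(n)]


SIGMA_MAX_M = 6750.0                      # largest sigma in Table 2 (males, area A)
buffer_km = 4 * SIGMA_MAX_M / 1000.0      # 4 * sigma_max
width_km = 78.0 + 2 * buffer_km           # assumed straight array plus buffer
height_km = 2 * buffer_km
rng = random.Random(1)
centers = simulate_activity_centers(0.06, width_km, height_km, rng)
print(buffer_km, len(centers))
```

The realized population size varies among simulated populations around its expectation (density multiplied by area), which is one source of variation among replicate datasets.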
A total of 29 scenarios were simulated (Table 1) by holding model parameters (D, σ, g0) constant or having them vary by either sex, area, or both sex and area. To reduce the number of scenarios and computational burden, we varied g0 by sex alone or by area and sex combined. We did not vary g0 by area alone because we did not expect area to be the main driver of variation in baseline detectability for large carnivores, which often display sex-specific differences in home range size. We selected parameter values for each sampling area (Table 2) that were consistent with prior simulation studies for black bears (Clark, 2017; Sollmann et al., 2012; Sun et al., 2014) and represented reasonable parameter combinations for black bears in Ontario (Howe et al., 2013, 2022; Obbard et al., 2010). Area A was simulated with relatively high g0 and σ and low D (representative of relatively unproductive boreal forests in Ontario; Rowe, 1972); area C was simulated with relatively low g0 and σ and high D (representative of productive Great Lakes–St Lawrence [GLSL] Forest in Eastern Ontario); area B was simulated with intermediate values. Within areas, males were simulated with higher values of σ and lower values of D and g0 due to differences in density and detectability between the sexes (Hooker et al., 2015; Humm et al., 2017). For each scenario, we generated 1000 populations and corresponding capture–recapture datasets (hereafter, simulations), resulting in 29,000 datasets.
TABLE 1 Structure of scenarios where each row details a scenario with varying density (
Scenario ID | D | g0 | σ | MG |
000 | . | . | . | D(.) g0(.) σ(.) |
001 | . | . | Area | D(.) g0(.) σ(area) |
023 | . | Sex | Sex and area | D(.) g0(sex) σ(sex + area) |
031 | . | Sex and area | Area | D(.) g0(sex + area) σ(area) |
032 | . | Sex and area | Sex | D(.) g0(sex + area) σ(sex) |
100 | Area | . | . | D(area) g0(.) σ(.) |
101 | Area | . | Area | D(area) g0(.) σ(area) |
103 | Area | . | Sex and area | D(area) g0(.) σ(sex + area) |
122 | Area | Sex | Sex | D(area) g0(sex) σ(sex) |
123 | Area | Sex | Sex and area | D(area) g0(sex) σ(sex + area) |
130 | Area | Sex and area | . | D(area) g0(sex + area) σ(.) |
131 | Area | Sex and area | Area | D(area) g0(sex + area) σ(area) |
132 | Area | Sex and area | Sex | D(area) g0(sex + area) σ(sex) |
133 | Area | Sex and area | Sex and area | D(area) g0(sex + area) σ(sex + area) |
203 | Sex | . | Sex and area | D(sex) g0(.) σ(sex + area) |
223 | Sex | Sex | Sex and area | D(sex) g0(sex) σ(sex + area) |
230 | Sex | Sex and area | . | D(sex) g0(sex + area) σ(.) |
231 | Sex | Sex and area | Area | D(sex) g0(sex + area) σ(area) |
232 | Sex | Sex and area | Sex | D(sex) g0(sex + area) σ(sex) |
233 | Sex | Sex and area | Sex and area | D(sex) g0(sex + area) σ(sex + area) |
301 | Sex and area | . | Area | D(sex + area) g0(.) σ(area) |
302 | Sex and area | . | Sex | D(sex + area) g0(.) σ(sex) |
320 | Sex and area | Sex | . | D(sex + area) g0(sex) σ(.) |
321 | Sex and area | Sex | Area | D(sex + area) g0(sex) σ(area) |
322 | Sex and area | Sex | Sex | D(sex + area) g0(sex) σ(sex) |
323 | Sex and area | Sex | Sex and area | D(sex + area) g0(sex) σ(sex + area) |
331 | Sex and area | Sex and area | Area | D(sex + area) g0(sex + area) σ(area) |
332 | Sex and area | Sex and area | Sex | D(sex + area) g0(sex + area) σ(sex) |
333 | Sex and area | Sex and area | Sex and area | D(sex + area) g0(sex + area) σ(sex + area) |
Note: Scenario ID is coded by variables corresponding to whether the parameters (D, g0, σ) were held constant (0) or varied by area (1), sex (2), or both sex and area (3). A period (.) in a column indicates that the parameter was held constant. The last column indicates the approximate true data generating model (MG) for each scenario.
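The coding scheme in Table 1 can be expressed as a small lookup. The sketch below, with hypothetical names, converts a three-digit scenario ID into the model notation shown in the last column of the table.

```python
# Digit codes from the note to Table 1.
LEVELS = {"0": ".", "1": "area", "2": "sex", "3": "sex + area"}


def decode_scenario(scenario_id):
    """Translate a scenario ID such as '023' into model notation."""
    d, g0, sigma = (LEVELS[digit] for digit in scenario_id)
    return f"D({d}) g0({g0}) σ({sigma})"


for sid in ("000", "023", "333"):
    print(sid, decode_scenario(sid))
```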
TABLE 2 Spatially explicit capture–recapture parameter values for density (
Grouping codea | Grouping description | Parameterb | Area A | Area B | Area C |
3 | Sex and area | D female | 0.03 | 0.06 | 0.12 |
3 | Sex and area | D male | 0.02 | 0.04 | 0.08 |
3 | Sex and area | g0 female | 0.35 | 0.25 | 0.15 |
3 | Sex and area | g0 male | 0.20 | 0.15 | 0.10 |
3 | Sex and area | σ female | 4500 | 3500 | 2500 |
3 | Sex and area | σ male | 6750 | 5250 | 3750 |
1 | Area | D | 0.025 | 0.05 | 0.10 |
1 | Area | g0 | 0.275 | 0.200 | 0.125 |
1 | Area | σ | 5625 | 4375 | 3125 |
2 | Sex | D female | 0.07 | 0.07 | 0.07 |
2 | Sex | D male | 0.05 | 0.05 | 0.05 |
2 | Sex | g0 female | 0.25 | 0.25 | 0.25 |
2 | Sex | g0 male | 0.15 | 0.15 | 0.15 |
2 | Sex | σ female | 3500 | 3500 | 3500 |
2 | Sex | σ male | 5250 | 5250 | 5250 |
0 | Baseline | D | 0.06 | 0.06 | 0.06 |
0 | Baseline | g0 | 0.20 | 0.20 | 0.20 |
0 | Baseline | σ | 4375 | 4375 | 4375 |
aGrouping code corresponds to scenario ID where variables represent whether parameters D, g0, and σ vary by either area (1), sex (2), sex and area (3), or are constant (0).
bσ is expressed as a distance (in meters) and density (D) as the number of bears per square kilometer.
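To show how the grouping codes in Table 2 combine with the scenario IDs in Table 1, the following sketch looks up the simulated parameter values for a given scenario, sex, and area. The data structure and function names are hypothetical; the values are transcribed from Table 2.

```python
def by_area(a, b, c):
    return {"A": a, "B": b, "C": c}


# Values from Table 2, keyed by parameter and then by grouping code.
TABLE2 = {
    "D": {0: 0.06,
          1: by_area(0.025, 0.05, 0.10),
          2: {"female": 0.07, "male": 0.05},
          3: {"female": by_area(0.03, 0.06, 0.12),
              "male": by_area(0.02, 0.04, 0.08)}},
    "g0": {0: 0.20,
           1: by_area(0.275, 0.200, 0.125),
           2: {"female": 0.25, "male": 0.15},
           3: {"female": by_area(0.35, 0.25, 0.15),
               "male": by_area(0.20, 0.15, 0.10)}},
    "sigma": {0: 4375,
              1: by_area(5625, 4375, 3125),
              2: {"female": 3500, "male": 5250},
              3: {"female": by_area(4500, 3500, 2500),
                  "male": by_area(6750, 5250, 3750)}},
}


def lookup(param, code, sex, area):
    """Return the Table 2 value for one parameter under one grouping code."""
    entry = TABLE2[param][code]
    if code == 0:
        return entry
    if code == 1:
        return entry[area]
    if code == 2:
        return entry[sex]
    return entry[sex][area]


def scenario_parameters(scenario_id, sex, area):
    """Parameter values for a scenario ID (digits ordered D, g0, sigma)."""
    codes = [int(digit) for digit in scenario_id]
    return {p: lookup(p, c, sex, area)
            for p, c in zip(("D", "g0", "sigma"), codes)}


print(scenario_parameters("023", "male", "C"))
```

For scenario 023, for example, density takes the baseline value, g0 takes the sex-specific value, and σ takes the sex- and area-specific value.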
Notably, these scenarios were not exhaustive of the entire parameter space. For SECR models, a minimum of 20 spatial recaptures is recommended (Efford et al., 2004, 2009), as the precision of density estimates is influenced by the number of spatial recaptures in a sample (Schmidt et al., 2022; Sun et al., 2014). As this work aimed to assess the influence of spatial and sex-based heterogeneity on density estimates, irrespective of the influence of sample size, parameter values were selected to maintain a relatively similar and sufficient number of recaptures across sampling areas while still being biologically realistic for black bears.
Data from all three study areas from each simulation were analyzed simultaneously; study areas were modeled as sessions in a multisession analysis that allowed for different degrees of data aggregation between sexes and among areas to estimate parameters. For each scenario, we fit 105 candidate model forms, representing almost all possible additive and interactive combinations of parameters (for a full list of candidate models, see Appendix S1: Table S1) by maximizing the full likelihood for proximity detectors and using the half-normal detection probability function. This resulted in a very large number of model fits (3,045,000 overall), which incurred high computational costs and made summarizing model outputs prohibitively slow. There were negligible differences in density estimates from models with interactive or additive effects, and therefore, we excluded candidate model forms where covariates on any one of the parameters (D, g0, σ) included an interaction effect (see Appendix S1: Section 4 for further clarification). This left 48 candidate models with only additive effects, which were included in the subsequent analysis and are presented in the following Results and Discussion.
Evaluation of model performance
For each scenario and model form, we compared density estimates to the expected values in terms of bias (mean percent relative bias [MPRB] and the 95% CI coverage), precision (mean coefficient of variation [MCV]), and accuracy (root-mean-square error [RMSE]). A |MPRB| ≤ 5% was considered an allowable bias (Dupont et al., 2020) and an MCV < 0.2 was considered acceptable for carnivore management (Proctor et al., 2010). A low RMSE represented a good trade-off between low bias and variance (Blanc et al., 2013).
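The performance measures above can be computed from replicate estimates as in the following sketch. It uses conventional definitions (percent relative bias as 100 × (estimate − truth) / truth, CV as standard error divided by estimate), which are assumptions here since the exact formulas are not restated in this section; the function name and toy numbers are hypothetical.

```python
import math


def performance(estimates, std_errors, lower, upper, truth):
    """Summarize bias, precision, accuracy, and CI coverage across replicates."""
    n = len(estimates)
    prb = [100.0 * (est - truth) / truth for est in estimates]
    cv = [se / est for se, est in zip(std_errors, estimates)]
    rmse = math.sqrt(sum((est - truth) ** 2 for est in estimates) / n)
    covered = sum(lo <= truth <= hi for lo, hi in zip(lower, upper))
    return {
        "MPRB": sum(prb) / n,           # mean percent relative bias
        "MCV": sum(cv) / n,             # mean coefficient of variation
        "RMSE": rmse,                   # root-mean-square error
        "coverage": 100.0 * covered / n,
    }


# Toy replicate estimates around a true value of 10 (arbitrary units).
summary = performance(
    estimates=[9.0, 11.0, 10.0],
    std_errors=[1.0, 1.0, 1.0],
    lower=[7.0, 9.0, 8.0],
    upper=[11.0, 13.0, 12.0],
    truth=10.0,
)
print(summary)
```

In this toy case, the estimates are unbiased on average (MPRB near zero) even though individual replicates deviate from the truth, which is why RMSE is reported alongside MPRB.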
Model selection
AICc is one of the most commonly used model selection criteria for SECR studies and can be used to rank competing candidate models. ΔAICc is the difference in AICc between the top-ranked model and other models, where the top-ranked model has ΔAICc = 0. Burnham and Anderson (2002) suggest that models with ΔAICc ≤ 2 have substantial support, models with 4 < ΔAICc ≤ 7 have considerably less support, and models with ΔAICc > 10 have negligible support. For each scenario, we identified the data generating model (MG) that best approximated the expected parameters. Then, using Burnham and Anderson's (2002) guidelines, for each scenario, we calculated the frequency, out of the 1000 simulations, at which MG was included in the following generalized classes: (1) the top-ranked model (ΔAICc = 0); (2) 0 ≤ ΔAICc ≤ 2; (3) 2 < ΔAICc ≤ 10; (4) ΔAICc > 10.
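The ranking and classification steps can be sketched as follows, using the standard small-sample correction AICc = −2 log L + 2K + 2K(K + 1)/(n − K − 1). The log-likelihoods, sample size, and function names below are hypothetical and serve only to illustrate the calculation.

```python
def aicc(log_lik, k, n):
    """AICc from a maximized log-likelihood, k parameters, sample size n."""
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def delta_aicc(aicc_by_model):
    """Difference between each model's AICc and the lowest AICc in the set."""
    best = min(aicc_by_model.values())
    return {model: value - best for model, value in aicc_by_model.items()}


def support_band(delta):
    """Assign a ΔAICc value to one of the three bands used in this study."""
    if delta <= 2:
        return "0-2"
    if delta <= 10:
        return "2-10"
    return ">10"


# Hypothetical fits: (log-likelihood, number of parameters K), with n = 150.
fits = {
    "D(.) g0(.) σ(.)": (-612.0, 3),
    "D(.) g0(sex) σ(.)": (-610.5, 4),
    "D(area) g0(sex) σ(sex)": (-600.0, 7),
}
scores = {model: aicc(ll, k, 150) for model, (ll, k) in fits.items()}
for model, delta in sorted(delta_aicc(scores).items(), key=lambda item: item[1]):
    print(model, round(delta, 2), support_band(delta), delta == 0)
```

Note that class (2) in the text includes the top-ranked model, so a model with ΔAICc = 0 is counted in both class (1) and class (2).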
All simulations were implemented through R 4.0.4 (R Core Team, 2021) using packages “secr” version 4.3.3 (Efford, 2020a) and “secrdesign” version 2.5.11 (Efford, 2020b). As computation time was prohibitively slow with a stand-alone personal computer, we used high-performance computing software provided by Compute Canada (
RESULTS
Our simulations yielded variable numbers of animals, detections, and spatial recaptures across scenarios; see Appendix S1: Figures S1 and S2 for a summary. All models converged.
Model selection
We first identified the data generating model (MG) that best approximated the expected parameters for each scenario (see Appendix S1: Section 6 for details on identifying MG). Across scenarios, we were able to reliably identify MG as having substantial support (0 ≤ ΔAICc ≤ 2) for 71.5%–100% of the simulations (Table 3). We identified MG as the top-ranked model (ΔAICc = 0) for fewer simulations on average (42.6%–99.9% of the simulations; Table 3). The likelihood that we selected MG generally increased as the number of parameters in MG increased (Table 3; also see AICc weights in Appendix S1: Table S3). In contrast, we rarely selected MG as having considerably less or no support (0%–27.2% of the simulations; Table 3).
TABLE 3 The percent, out of 1000 simulations, that the true data generating model (MG) in each scenario was identified as the top AICc ranked model (ΔAICc = 0), having substantial support (0 ≤ ΔAICc ≤ 2), considerably less support (2 < ΔAICc ≤ 10), or no support (ΔAICc > 10).
Scenario | K | ΔAICc = 0 | 0 ≤ ΔAICc ≤ 2 | 2 < ΔAICc ≤ 10 | ΔAICc > 10 |
000 | 3 | 42.6 | 71.5 | 27.2 | 1.3 |
001 | 5 | 48.7 | 76.2 | 23.1 | 0.7 |
100 | 5 | 51.2 | 79.8 | 19.8 | 0.4 |
023 | 7 | 61.8 | 84 | 15.4 | 0.6 |
032 | 7 | 61.8 | 87 | 12.3 | 0.7 |
101 | 7 | 58.8 | 84 | 15.7 | 0.3 |
122 | 7 | 55.1 | 75.2 | 12.5 | 12.3 |
203 | 7 | 66.6 | 87.3 | 12.3 | 0.4 |
230 | 7 | 64.5 | 86 | 13.7 | 0.3 |
302 | 7 | 67.6 | 86.7 | 13.2 | 0.1 |
320 | 7 | 63 | 83.2 | 16.6 | 0.2 |
031 | 8 | 58.5 | 83.2 | 16.5 | 0.3 |
103 | 8 | 67.1 | 88.9 | 10.9 | 0.2 |
130 | 8 | 60.2 | 84.2 | 15.3 | 0.5 |
223 | 8 | 73.8 | 90 | 9.7 | 0.3 |
232 | 8 | 75.9 | 92.2 | 7.7 | 0.1 |
301 | 8 | 69.3 | 88.3 | 11.3 | 0.4 |
322 | 8 | 75.5 | 90 | 9.8 | 0.2 |
123 | 9 | 74 | 90.6 | 9.2 | 0.2 |
132 | 9 | 72.2 | 90.7 | 9.2 | 0.1 |
231 | 9 | 72 | 90 | 9.8 | 0.2 |
321 | 9 | 71 | 90.8 | 8.8 | 0.4 |
131 | 10 | 70.3 | 89.5 | 10.4 | 0.1 |
233 | 10 | 86.5 | 93.9 | 5.9 | 0.2 |
323 | 10 | 87.4 | 95.3 | 4.6 | 0.1 |
332 | 10 | 86.6 | 94.8 | 5 | 0.2 |
133 | 11 | 83.6 | 94.8 | 4.9 | 0.3 |
331 | 11 | 84.2 | 95 | 4.8 | 0.2 |
333 | 12 | 99.9 | 100 | 0 | 0 |
Note: Scenarios are coded by variables corresponding to whether the model parameters (D, g0, σ) vary by area (1), sex (2), sex and area (3), or are constant (0). ΔAICc is the difference between the focal model and the top-ranked model, and K is the number of parameters in MG.
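The values of K in Table 3 are consistent with a simple counting rule for additive models: one intercept per parameter (D, g0, σ), plus two coefficients for a three-level area effect and one coefficient for a sex effect. The sketch below encodes that rule; it is an inference that reproduces the K column rather than a description of how K was computed in the analysis, and the names are hypothetical.

```python
# Additional coefficients contributed by each grouping code:
# constant (0), area with three levels (1), sex (2), sex + area (3).
EXTRA_COEFFICIENTS = {"0": 0, "1": 2, "2": 1, "3": 3}


def n_parameters(scenario_id):
    """Number of parameters K implied by an additive model for a scenario ID."""
    return sum(1 + EXTRA_COEFFICIENTS[digit] for digit in scenario_id)


for sid in ("000", "122", "131", "333"):
    print(sid, n_parameters(sid))
```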
Performance of the data generating model
Bias
For all scenarios, MG yielded estimates of density with less than 5% |MPRB| (Figure 1; Appendix S1: Figure S3). Although still within ±5% bias, for both sexes and across areas, there was a very slight positive MPRB for scenarios where MG had constant density (Figure 1; Appendix S1: Figure S3). In the remaining scenarios (density varied by sex, area, or sex and area), estimates for females were slightly negatively biased, with the magnitude greater in area A (82.6% of the total scenarios displayed negative MPRB in area A, 79.3% in area B, and 72.4% in area C). For males, by comparison, there was a slight positive MPRB for approximately half of the scenarios (51.7% of the total scenarios displayed positive MPRB for area A, 48.2% for area B, and 51.7% for area C; Appendix S1: Figure S3). The variance in percent relative bias (PRB; i.e., the spread of estimates across the 1000 simulations) differed by scenario; scenarios where MG had constant density had the least variable PRB, while scenarios where MG had density vary by either area or sex and area displayed more variable PRB (Figure 1). Scenarios where density varied by sex had intermediate variation in PRB (Figure 1). CI coverage of MG was near nominal across all scenarios (across areas and sexes, CI coverage ranged from 93.0% to 96.5%; mean 94.9%); see Appendix S1: Table S2.
FIGURE 1. Percent relative bias (PRB) of female (♀) and male (♂) black bear density estimates for the true data generating model (MG) across study areas (A, B, C), for 1000 simulations of each scenario. Scenarios are coded by variables corresponding to whether the parameters (D, g0, σ) were constant across areas and sexes (0), or varied by area (1), sex (2), or both sex and area (3); see Table 1 for more detailed summary of scenarios. White dots within the violins represent the mean percent relative bias and colors of violins represent the number of parameters in MG. Thick black horizontal lines represent PRB within 5% and red horizontal lines represent no bias. Background colors correspond to the four levels of variation in density: constant density (white), density varies by area (light gray), sex (medium gray), and both area and sex (dark gray).
For all scenarios, density estimates had CVs well below 0.2 for both sexes and all areas (CV range 0.033–0.106; mean 0.062; Figure 2). Estimates were most precise where MG was simple (number of parameters K ≤ 5) and density was constant, followed by scenarios where density varied by sex but not among study areas (Figure 2). However, for the latter scenarios, there were opposite patterns in MCV between sexes and the CV remained relatively consistent across areas for each sex (Figure 2). The remaining scenarios (density varied either by area or by sex and area) were generally characterized by the least precise estimates.
FIGURE 2. CV of female (♀) and male (♂) black bear density estimates for the true data generating model (MG) across study areas (A, B, C), for 1000 simulations of each scenario. Scenarios are coded by variables corresponding to whether the parameters (D, g0, σ) were constant across areas and sexes (0), or varied by area (1), sex (2), or both sex and area (3); see Table 1 for more detailed summary of scenarios. White dots within the violins represent the mean CV and colors of violins represent the number of parameters in MG. Background colors correspond to the four levels of variation in density: constant density (white), density varies by area (light gray), sex (medium gray), and both area and sex (dark gray).
In contrast to bias, the CV and its variation differed across scenarios and between sexes and areas. Across most scenarios, area B (intermediate density and detectability) exhibited on average the most precise estimates, followed by area C (high density and low detectability) and area A (low density and high detectability). Variability in CV generally increased with increasing magnitude of CV, with area A displaying the most elevated and variable CV values for scenarios where MG had density vary by either area or sex and area (Figure 2). Further, despite attempts to maintain a similar number of total spatial recaptures across scenarios, this was not always possible due to the nature of parameter combinations for some scenarios. Consequently, there was a pattern of generally higher precision for some scenarios with more recaptures (Appendix S1: Figure S1).
Accuracy
RMSE followed a similar pattern to precision, with RMSE slightly higher for most scenarios in which MCV was elevated (Figure 3); this pattern was most pronounced for area C. Specifically, area C exhibited the highest variability in RMSE for scenarios where MG had density vary by area or sex and area, followed by area B and then area A. Similar to precision, for scenarios where MG had density varying by sex, RMSE was relatively consistent across all areas for each sex and there were opposite patterns in RMSE across these scenarios between males and females (Figure 3).
FIGURE 3. Root-mean-square error of female (♀) and male (♂) black bear density estimates for the true data generating model (MG) across study areas (A, B, C) from 1000 simulations of each scenario. Scenarios are coded by variables corresponding to whether the parameters (D, g0, σ) were constant across areas and sexes (0), or varied by area (1), sex (2), or both sex and area (3); see Table 1 for more detailed summary of scenarios. Colors of the dots indicate the number of parameters in MG for each scenario. Background colors correspond to the four levels of variation in density: constant density (white), density varies by area (light gray), sex (medium gray), and both area and sex (dark gray).
Across scenarios, the number of candidate model forms with unbiased density estimates (|MPRB| ≤ 5%) generally decreased with increasing complexity of MG (Figure 4; also see Appendix S1: Table S3 for the MPRB of the most frequently selected misspecified models for each scenario). For the least complex scenario (K = 3), where MG had all constant parameters, incorrectly selecting any one of the 47 misspecified models had negligible impact on the density estimates (Figure 4; further see Appendix S1: Table S3). As a result, although AICc ranking made us less likely to select MG as the top-ranked model in this case, minimal bias was incurred by selecting a model more complex than MG.
FIGURE 4. Number of candidate model forms where the absolute mean percent relative bias (|MPRB|), out of 1000 simulations, of female (♀) and male (♂) black bear density estimates was ≤5%, across study areas (A, B, C). The color of the circles denotes the number of parameters in the true data generating model (MG) for each scenario. Scenarios are coded by variables corresponding to whether the parameters (D, g0, σ) are constant (0) or vary by area (1), sex (2), or sex and area (3); see Table 1 for more detailed summary of scenarios. For each scenario, 48 candidate model forms were fit. Background colors correspond to the four levels of variation in density: constant density (white), density varies by area (light gray), sex (medium gray), and both area and sex (dark gray).
In contrast, for scenarios with complex MG (K ≥ 10), selecting a model other than MG often yielded biased (>5% |MPRB|) density estimates (Figure 4; Appendix S1: Table S3). More specifically, for misspecified models with >5% |MPRB|, density was often overestimated for males and underestimated for females (Appendix S1: Table S3). However, as MG was most often correctly selected for more complex scenarios (K ≥ 10) when a good approximating model was included in the candidate model set (ranging from 70.3% to 99.9% of the simulations, Table 3; also see Appendix S1: Table S3), biased density estimates with >5% |MPRB| were unlikely.
For scenarios with moderate complexity (7 ≤ K ≤ 9), the number of model forms that produced density estimates with >5% |MPRB| varied (Appendix S1: Figure S3). However, despite these scenarios having a poor to moderate chance of AICc selecting MG as the top-ranked model (ranging from 55.1% to 75.9% of the simulations; Table 3), selecting a misspecified model generally had minimal consequences for bias, as in these cases the bias most often remained within ±5% (Appendix S1: Table S3).
Accuracy
The misspecified models that we selected using AICc had similar trends to those of $M_G$. Models with higher bias were usually characterized by higher MCV and RMSE for both sexes, and vice versa (Appendix S1: Table S3). However, this pattern was not always consistent; for some scenarios, there were minimal changes in MCV and RMSE across candidate model forms (Appendix S1: Table S3). Further, for misspecified models where bias was exceptionally large (>15% |MPRB|), variances were typically underestimated; this was most apparent for scenarios with more complex $M_G$ (Appendix S1: Table S3).
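The link between bias and RMSE noted above follows from the standard decomposition of mean squared error into variance plus squared bias. The sketch below assumes conventional definitions of MCV (mean of SE divided by estimate) and RMSE; the function names and the numbers are hypothetical.

```python
import math
from statistics import fmean, pvariance

def mean_cv(estimates, standard_errors):
    """MCV: mean over simulations of SE(D_hat) / D_hat (as a proportion)."""
    return fmean([se / d_hat for d_hat, se in zip(estimates, standard_errors)])

def rmse(estimates, true_density):
    """Root mean squared error of the estimates around the true density."""
    return math.sqrt(fmean([(d_hat - true_density) ** 2 for d_hat in estimates]))

# Hypothetical estimates that are all too high (true density = 10).
true_d = 10.0
estimates = [11.0, 12.0, 13.0]
standard_errors = [1.1, 1.2, 1.3]

mse = rmse(estimates, true_d) ** 2
variance = pvariance(estimates)               # spread of estimates around their mean
squared_bias = (fmean(estimates) - true_d) ** 2

# MSE = variance + bias^2, so a biased model has a large RMSE
# even when its estimates are tightly clustered.
print(math.isclose(mse, variance + squared_bias))  # True
print(mean_cv(estimates, standard_errors))
```

The same decomposition explains why MCV alone can be misleading: a misspecified model may report small standard errors while its RMSE remains large.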
DISCUSSION
Statistical methods for estimating population density and its variance accurately and precisely are fundamental for effective management and conservation of wildlife populations. Although the use of SECR models for estimating density has become increasingly widespread, their application may outpace rigorous testing of their performance using real-world datasets (Gerber & Parmenter, 2015). Here, we address this concern, specifically in relation to realistic levels of sex-based and spatial heterogeneity for large carnivore datasets. We demonstrated that when a good approximating model is included in the set of candidate models, SECR methods coupled with AICc model selection generally yield unbiased estimates with nominal CI coverage. However, we identified some situations in which these methods are prone to bias.
In our series of simulations, the critical factor contributing to SECR models' effectiveness is an inverse pattern between the performance of SECR models and AICc model selection. For scenarios with low to moderate levels of complexity in $M_G$ (3 ≤ K ≤ 9), we were less successful at identifying the approximate true data generating model using AICc (Table 3) and were prone to overfitting (Appendix S1: Table S3). However, selecting an unnecessarily complex model generally had minor effects on density estimates, as bias was generally low (Appendix S1: Table S3). This pattern could be explained by the estimated effects of uninformative parameters being small, such that the density estimates for these scenarios were largely unaffected (Arnold, 2010). Conversely, where the underlying approximate true data generating model was complex (K ≥ 10), top-ranked models that were misspecified produced density estimates with severe negative and positive bias for female and male black bears, respectively (Appendix S1: Table S3). However, for such scenarios, $M_G$ was almost always selected as the top-ranked model (Table 3; Appendix S1: Table S3), and therefore, obtaining unbiased and precise estimates is likely, provided that a good approximating model is included in the candidate set.
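To make the overfitting pattern concrete, the sketch below ranks three candidate models by AICc using the standard small-sample formula, AICc = −2 log L + 2K + 2K(K + 1)/(n − K − 1). The model names, log-likelihoods, and sample size are hypothetical, and treating n as the number of individual encounter histories is an assumption of this example rather than something stated in this section.

```python
def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for a model with k parameters and sample size n."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    return -2.0 * log_likelihood + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

def rank_by_aicc(models, n):
    """Return (name, AICc, delta AICc) tuples sorted from best to worst.

    `models` maps a model name to a (log-likelihood, parameter count) pair.
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in models.items()}
    best = min(scores.values())
    return sorted(((name, s, s - best) for name, s in scores.items()),
                  key=lambda row: row[1])

# Hypothetical candidate set: the extra parameters barely improve the fit.
candidates = {
    "constant": (-520.0, 3),
    "density_by_sex": (-519.5, 4),
    "density_by_sex_and_area": (-519.2, 6),
}

for name, score, delta in rank_by_aicc(candidates, n=60):
    print(f"{name:26s} AICc = {score:8.2f}  delta = {delta:5.2f}")
```

In this made-up example the model with one nearly uninformative parameter sits within 2 AICc units of the best model, so a slightly different dataset could easily rank it first; if the extra effect is estimated to be near zero, the resulting density estimate changes little.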
While SECR model performance was influenced by variation in all model parameters, in our series of simulations, variation in density more strongly influenced model performance than variation in detection parameters. When $M_G$ was not selected as the top-ranked model, misspecification of the density submodel generally caused more bias than misspecification of the detection submodel (Appendix S1: Table S3). The severe bias introduced from model misspecification for the more complex scenarios demonstrates that failure to account for heterogeneity in SECR models is of greater concern when pooling data from subpopulations that display high variation in density and, to a lesser extent, detection. This has important implications for monitoring because carnivore density is likely to vary spatially across gradients of habitat productivity and human influence that might be unknown to researchers. Thus, these findings demonstrate that if there is potential for variation in density between sexes or in space, this needs to be accounted for by including relevant covariates of density in candidate model sets. Otherwise, the estimates of density are likely to be biased.
Our more complex scenarios with misspecified models yielded biased sex-specific densities. This finding is a concern for managing harvested populations with high heterogeneity. As adult male black bears are more likely to be harvested than females (Gantchoff et al., 2020; Obbard et al., 2017), inflated estimates of male density may lead to suboptimal management decisions that place populations at risk. Further, because females are critical for long-term population stability (Humm & Clark, 2021), accurate estimates are critical for monitoring and predicting populations' responses to shifts in land use and environmental conditions. While our simulations focus on black bears, such consequences are pertinent to other large game populations, particularly species of conservation concern.
Moreover, even when $M_G$ was selected as the top-ranked model, scenarios where density varied by area or by area and sex were the most challenging in which to produce unbiased density estimates with high precision. Collectively, this work highlights that while SECR models coupled with AICc selection generally perform well, obtaining reliable estimates is more challenging for highly heterogeneous populations, particularly those with varying densities.
Our findings further expand on simulation studies of bias caused by heterogeneity in SECR detection parameters. As demonstrated by Efford and Mowat (2014), for sex-based heterogeneity in detection parameters, scenarios where detection parameters varied in the same direction (reinforcing heterogeneity) displayed large bias; in contrast, those where detection parameters varied in opposite directions (compensatory heterogeneity) displayed small or zero bias. Our simulations increased the level of complexity of the scenarios to represent wild carnivore populations by varying both density and detectability by sex and area, and we found a similar pattern: scenarios where both detection parameters varied by sex and area displayed relatively larger bias (Figure 1). As some of our scenarios had detection parameters vary in the same direction (Table 2), the larger bias for these scenarios could be partly explained by the simulation structure itself.
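One way to see why the direction of variation matters is to compare how much total detection each sex accrues around a single detector. The sketch below assumes a half-normal detection function, g(d) = g0 exp(−d²/(2σ²)), which is an assumption of this example (this section does not state the functional form); its integral over the plane is 2π g0 σ². All parameter values are hypothetical and are not those in Table 2.

```python
import math

def integrated_halfnormal(g0, sigma):
    """Integral over the plane of g0 * exp(-d^2 / (2 * sigma^2)).

    Closed form: 2 * pi * g0 * sigma^2. Larger values mean an animal is
    detectable over a larger effective area around a detector.
    """
    return 2.0 * math.pi * g0 * sigma ** 2

# Hypothetical female parameters (sigma in arbitrary distance units).
female = integrated_halfnormal(g0=0.10, sigma=2.0)

# Reinforcing: males have BOTH higher g0 and larger sigma than females.
male_reinforcing = integrated_halfnormal(g0=0.20, sigma=4.0)

# Compensatory: males have larger sigma but lower g0, offsetting each other.
male_compensatory = integrated_halfnormal(g0=0.025, sigma=4.0)

print(male_reinforcing / female)   # ratio of 8: sexes differ strongly
print(male_compensatory / female)  # ratio of 1: the differences cancel
```

Under these assumed values, pooling the sexes in the reinforcing case mixes animals with very different overall detectability, whereas in the compensatory case the integrated detection is the same for both sexes even though g0 and σ each differ.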
Limitations and future areas of research and development
Although our series of simulations provides insight into areas where SECR methods are robust and areas where caution is warranted, there is scope for refinement. While these simulations were more extensive than many prior SECR simulation studies, we did not simulate general or site-specific learned responses to detectors (b and bk, respectively). This form of variation in detectability is commonly included in SECR analyses of carnivores, specifically bears (examples include Azad et al., 2019; Howe et al., 2013; Humm & Clark, 2021; Lamb et al., 2018; Loosen et al., 2018). Further, heterogeneity in detection among individuals may be attributed to a combination of intrinsic and extrinsic factors other than the sex and area effects we focused on here, and these factors may be difficult to observe or unknown to the researcher. In cases where unmodeled heterogeneity remains, mixture models may be used to reduce bias (Borchers & Efford, 2008). However, such approaches are demanding of data and can also yield unreliable and imprecise estimates (Marrotte et al., 2022), which limits their practical application for many real-world datasets on elusive species. Thus, we anticipate that further sources of heterogeneity, if unaccounted for, would lead to more severe bias than we reported.
While we attempted to remove the issue of data sparsity, this was not entirely possible due to the trade-off between selecting parameter values that were biologically realistic for the study system and values that maintained a relatively consistent number of recaptures. As expected, for some scenarios, an increased number of spatial recaptures slightly improved precision and reduced bias of density estimates, similar to previous SECR studies (Sollmann et al., 2012; Sun et al., 2014).
Testing for violations of model assumptions remains an ongoing challenge for SECR studies. As recently recommended (Moqanaki et al., 2021), goodness-of-fit tests for SECR methods could help identify unmodeled sex and spatial variation in density or detectability and determine whether there is a need to account for such variations in the model. In our simulations, misspecified SECR models for less complex scenarios generally performed well; however, this may not be the case for other studies or species, and such a test could help identify cases of concern. These tests would be particularly useful for populations expected to display high levels of heterogeneity where model misspecification can lead to extremely biased estimates, as demonstrated here. While Bayesian p values have been suggested as a goodness-of-fit test for Bayesian SECR models, specific tests for SECR models fit using likelihood have yet to be fully developed and remain an important knowledge gap to be addressed (Moqanaki et al., 2021).
CONCLUSION AND RECOMMENDATIONS
Despite SECR being one of the most powerful tools for enumerating wildlife, variation in density and detectability is often inevitable and can cause bias when not properly accounted for. Pooling data across sexes, time, or space is a common approach to mitigate the challenges of data sparsity in SECR studies of rare and elusive species, but can introduce heterogeneity to datasets. Therefore, researchers may inadvertently risk biased density estimates if they do not include complex models that account for variation in detection, and most importantly density, in the candidate model set. In these situations, reliable density estimates depend on obtaining sufficient sample sizes to detect and model such variability using covariates. However, fitting highly parameterized models to sparse data is inappropriate because the number of model parameters becomes large relative to the sample size of individual encounter histories. In such cases, we encourage researchers to include covariates likely to be the most influential. Choice of covariates should be based on knowledge of the study system and the species' biology and ecology. Specifically, for many wide-ranging and elusive species such as carnivores, sex and spatial heterogeneity are key factors to consider.
Most animals, carnivores in particular, exhibit individual heterogeneity in detection probabilities beyond what can be explained by observable covariates such as sex and location (e.g., age, social status, and learned behavior). While mixture distributions (Pledger, 2000) can, in some situations, reduce bias and improve CI coverage in the presence of such heterogeneity, this approach is often demanding of data, which limits its practical application for many real-world datasets. In many cases, unfortunately, expensive and time-consuming data collection efforts yield sparse yet heterogeneous data to which complex models cannot be fitted. Density estimates from misspecified models with too few parameters are not only potentially biased but may overstate precision. Therefore, we encourage researchers and wildlife managers to pay particular attention to the impact of unmodeled heterogeneity attributable to both known and unknown sources when conducting analyses and interpreting results that inform conservation and management.
Our work reinforces the need for simulation studies to include realistic levels of variation in density and detection parameters, particularly when simulations are intended to inform survey design. Otherwise, surveys risk yielding data that are inadequate to model important sources of heterogeneity, and hence flawed inferences. Further, while pilot studies of small sampling areas may suggest that less intensive surveys are adequate for enumerating populations, the assumption that detection and density parameters remain constant must be considered, particularly as the spatial extent of the study area and, in turn, the variability of the sampled population increases. As our findings demonstrate, not accounting for possible violations of this assumption can result in pronounced bias, particularly when heterogeneity in density and detectability is high. Collectively, our analysis reinforces the importance of understanding the limitations and assumptions of statistical methods. Such an understanding will contribute to improving the robustness of field-based density estimates of carnivores and reduce the potential for basing conservation, management, and policy decisions on flawed information.
ACKNOWLEDGMENTS
This research was made possible by the support provided by The Digital Research Alliance of Canada (
The authors declare no conflicts of interest.
DATA AVAILABILITY STATEMENT
Empirical data and novel code were not used for this research.
© 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”).
Abstract
Reliable estimates of population density are fundamental for managing and conserving wildlife. Spatially explicit capture–recapture (SECR) models in combination with information-theoretic model selection criteria are frequently used to estimate population density. Variation in density and detectability is inevitable and, when unmodeled, can lead to erroneous estimates. Despite this knowledge, the performance of SECR models and information-theoretic criteria remain relatively untested for populations with realistic levels of variation in density and detectability. We addressed this issue using simulations of American black bear (
1 Environmental and Life Sciences Graduate Program, Trent University, Peterborough, Ontario, Canada
2 Wildlife Research and Monitoring Section, Ontario Ministry of Natural Resources and Forestry, Peterborough, Ontario, Canada
3 Environmental and Life Sciences Graduate Program, Trent University, Peterborough, Ontario, Canada; Wildlife Research and Monitoring Section, Ontario Ministry of Natural Resources and Forestry, Peterborough, Ontario, Canada