1. Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1], the beta-coronavirus that causes what is known as coronavirus disease 2019 (COVID-19), was first identified in an outbreak in Wuhan, Hubei province, China, in December 2019. The virus rapidly spread throughout the world, and on 11 March 2020, the World Health Organization officially declared the international COVID-19 situation a pandemic [2]. In the United States (U.S.), the first confirmed case was identified on 20 January 2020 [3,4] in Washington state, although evidence suggests COVID-19 may have arrived in the U.S. as early as December 2019 [5,6]. The number of infected individuals and subsequent deaths quickly grew, and as of 31 March 2021—slightly past the one-year anniversary of the pandemic—the U.S. had over 30.5 million cumulative confirmed COVID-19 cases and over 560,000 COVID-19 deaths according to the Centers for Disease Control and Prevention (CDC) [7], both figures the highest among every country in the world [8].
Clinical studies on COVID-19 patients conducted early in the pandemic found that men were dying at markedly higher rates than women [9,10,11,12,13]. At the population level, males comprise the majority of COVID-19 deaths in the overwhelming majority of countries that report sex-disaggregated COVID-19 mortality data [14]. Here, we focus on the U.S., where the majority of COVID-19 deaths also occur among males [15]. Standard analyses and commentaries contrasting the male and female population-level COVID-19 mortality burdens typically involve calculating the percentage of total deaths by sex (contrasting them with their respective percent population shares) and/or calculating male and female (age-adjusted) mortality rates [16,17,18]. For example, the GenderSci Lab COVID Project at Harvard University [18] tracks the number of male and female COVID-19 deaths by state, calculating the percentage of total deaths by sex as well as crude and age-adjusted male and female mortality rates for each state. However, because COVID-19 case fatality rates are considerably higher among individuals in older age groups, COVID-19 death counts and mortality rates for both males and females are predominantly determined by data from COVID-19 decedents in older age groups. Younger individuals are also susceptible to death from COVID-19, however, and their deaths in principle represent greater unrealized years of life, economic productivity, and broader contributions to society than the deaths of older decedents.
Years of potential life lost (YPLL) is a widely used epidemiological measure of mortality burden that emphasizes deaths that occur at younger ages by explicitly weighting such deaths more heavily [19]. YPLL for an individual decedent $i$ is defined as the difference between an upper reference age $A$ (typically close to a widely applicable life expectancy) and the age at death $a_i$ if the difference is positive, and zero otherwise:
$$\mathrm{YPLL}_i = \max(A - a_i,\ 0) \qquad (1)$$
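As a minimal illustration of Equation (1), the following Python sketch computes YPLL for a single decedent. The function name `ypll` and the default reference age of 75 years are assumptions chosen purely for illustration; any upper reference age can be supplied.

```python
def ypll(age_at_death: float, reference_age: float = 75.0) -> float:
    """Years of potential life lost for one decedent (Equation (1)).

    reference_age defaults to a hypothetical value used only for illustration.
    """
    return max(reference_age - age_at_death, 0.0)


print(ypll(40.0))  # a death at age 40 contributes 35 years under this reference age
print(ypll(82.5))  # a death above the reference age contributes 0 years
```

Deaths at or above the reference age contribute nothing, which is why the measure weights deaths at younger ages more heavily.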
Here, we quantified disparities in YPLL attributable to COVID-19 in the U.S. by sex at the state level to examine both their magnitudes and their state-to-state variation. Specifically, we characterized the disparities by estimating (a) percentages of total YPLL by sex and (b) age-adjusted male-to-female YPLL rate ratios (RRs), both nationally and for each of the 50 states and the District of Columbia (D.C.). For comparison, we also calculated the corresponding percentages of total deaths by sex and estimated the corresponding age-adjusted male-to-female mortality RRs to examine potential differences in the characterization of the disparities when measuring mortality burden in terms of YPLL compared to (age-irrespective) death counts. To perform estimation and uncertainty quantification of the estimands of interest, we used novel Monte Carlo (MC) simulation techniques to obtain interval estimates for them.
2. Materials and Methods
2.1. Data
We examined U.S. national COVID-19 mortality data from the National Center for Health Statistics (NCHS) summarized as cumulative death counts (ICD-10 code U07.1 [20] as an underlying or multiple cause of death) within age intervals stratified by state (as well as D.C. and Puerto Rico) and sex [21]. The sex categories are male and female, and the following set of mutually exclusive, collectively exhaustive, and chronologically ordered age groups are used: <1, 1–4, 5–14, 15–24, 25–34, 35–44, 45–54, 55–64, 65–74, 75–84 and 85+. Death counts between 1 and 9 are suppressed in the NCHS data due to patient privacy laws. However, the NCHS data additionally provides the total number of male and female deaths in each jurisdiction. Therefore, for each state and sex, we know the total number of deaths that are within the union of age groups with suppressed death counts, each of which contains between 1 and 9 deaths. The NCHS data also provide non-suppressed death counts within these same age groups stratified by sex for the U.S. overall. See File S1 in the Supplementary Materials for the NCHS data as of 31 March 2021 (reflecting all COVID-19 deaths reported to and processed by the NCHS as of 27 March 2021), which comprises 533,291 total deaths. A total of 111 male age groups and 113 female age groups have suppressed death counts across the 50 states and the District of Columbia (D.C.).
To standardize estimates by age, we used the 2019 U.S. Census Bureau estimates of the population age distribution in each state (and D.C.) stratified by sex [22], defined over integer ages from 0 to 84 and a catch-all 85+ age group for the remaining ages. See File S2 in the Supplementary Materials for the U.S. Census Bureau data.
2.2. Previous Work Quantifying Male-Female Disparities in COVID-19-Attributable YPLL in the United States
YPLL has been used in diverse contexts to quantify and contrast the impact of premature mortality by sex (e.g., [23,24,25,26,27,28,29,30,31,32]), and it has been used as a quantitative measure of mortality burden in the context of COVID-19 (e.g., [33,34,35,36,37,38,39,40]). In particular, Quast et al. (2021) [41] analyzed the NCHS data as of 3 February 2021 (reflecting all COVID-19 deaths reported to the NCHS and processed as of 31 January 2021), which served as a rough approximation of U.S. COVID-19 deaths during the first year of the pandemic. They estimated total YPLL and crude YPLL rates by sex in each U.S. state using the sex-specific remaining life expectancy method [42] to define YPLL. Under this method, YPLL for each decedent is defined as the expected number of remaining years of life, conditional on sex and survival to the observed age at death, with respect to the overall U.S. population according to actuarial life tables. While their paper serves as a useful initial analysis contrasting COVID-19-attributable YPLL by sex, there are a number of methodological shortcomings in their study that we aim to address in our analysis.
First, Quast et al. did not quantify the uncertainty of their state-level estimates of total YPLL and crude YPLL rates due to the administrative interval censoring of ages at death in the NCHS data. Quast et al. focused exclusively on point estimation, assuming for the purposes of calculation that ages at death among decedents in a given age group all occurred at a fixed age, usually the age group midpoint (e.g., assuming that decedents in age group 60–69 died at age 65). For the 85+ age group in particular, because the ages at death are right-censored, they assumed all of these deaths occurred at age 90, an arbitrary assumption made for analytical convenience to compute YPLL for these decedents. Second, Quast et al. handled the suppressed death counts in the NCHS data by excluding them from their analysis, analogous to a “complete-case analysis” in the missing data literature [43]. This approach, however, yields underestimates of total YPLL and YPLL rates and could potentially induce bias in their associated ratios. Third, Quast et al. did not account for differences in the male and female population age distributions within and between states by age-standardizing their YPLL rate estimates, which is especially important when a consistent comparison of the magnitudes of male-female disparities across states is desired. Moreover, the context of COVID-19 reveals an important methodological concern regarding the sex-specific remaining life expectancy method to define COVID-19-attributable YPLL. The health profiles of COVID-19 decedents are not representative of the overall U.S. population; for example, hospitalized COVID-19 patients have substantially higher rates of obesity [44] and other pre-existing health conditions [45] compared to the overall U.S. population.
As such, the sex-specific remaining life expectancy method would be expected to overestimate the counterfactual years of life remaining, an admission made by Quast et al.; this topic has also been explored further by Hanlon et al. (2021) [46]. Indeed, Quast et al. applied a 25% reduction to their estimates of male and female YPLL to reflect this reality, but it is unclear why 25% was specifically chosen as the discount factor and why it was the same for males and females. YPLL has been used to contrast the COVID-19 mortality burden by sex in the state of Ohio [47], but outside of Quast et al. and to the best of our knowledge at the time of writing, YPLL has not yet been formally used as an epidemiological measure of mortality burden in the peer-reviewed literature to comprehensively characterize state-level disparities in the COVID-19 mortality burden between males and females in the U.S.
Our analysis improves upon the methodology employed in Quast et al.’s study in three main ways. First, we used a definition of YPLL (Equation (1)) that circumvented YPLL calculation for decedents in the 85+ age group, is widely used, and provided an equitable comparison between males and females. Second, we developed a novel adaptation of the MC simulation procedure proposed by Xu et al. (2021a) [48] to account for and quantify the estimation uncertainty arising from the administrative interval censoring of ages at death and the suppression of low death counts within individual age groups. Third, we accounted for differences in the male and female population age distributions within and between states by standardizing our male and female YPLL rate estimates to 2019 Census Bureau estimates of the U.S. national population age distribution when estimating age-adjusted male-to-female YPLL RRs both nationally and in each state (and D.C.).
2.3. Estimation Procedure for YPLL-Based Estimands from Administratively Interval Censored Ages at Death
As previously described in Section 1, we characterized disparities in COVID-19-attributable YPLL by sex through the estimation of percentages of total YPLL by sex (contrasting them with their respective percent population shares) and age-adjusted male-to-female YPLL RRs. Additionally, to provide context on the magnitude of COVID-19-attributable YPLL experienced by males and females, we also estimated total YPLL and age-adjusted YPLL rates by sex. As explained in Section 2.2, estimation uncertainty pertaining to the above YPLL-based estimands of interest can be attributed to two sources: (a) administrative interval censoring of ages at death and (b) suppression of low death counts within individual age groups, which we elaborate upon here.
We first focus on the issue of administrative interval censoring of ages at death, assuming momentarily there are no suppressed death counts for purposes of illustration. As the exact ages at death for each individual are unknown, exact YPLL values for each individual are also unknown. In such settings, the standard approach to calculate aggregate YPLL is to operationally assume the age at death for each individual in a given age group is equal to the midpoint, also referred to as the “midpoint method”, which implicitly assumes that ages at death within each age group are uniformly distributed [49,50]. However, applied epidemiological studies using the midpoint method to estimate YPLL-based quantities typically do not quantify the uncertainty attributable to the administrative interval censoring of ages at death (e.g., [51,52,53,54,55,56]).
Xu et al. (2021a) [48] proposed an MC simulation procedure to quantify the uncertainty associated with YPLL-based estimates obtained from mortality data summarized as death counts within age intervals, which has been used in other applied research [40]. The full details of the procedure can be found in their paper, but to summarize it briefly, Xu et al.’s MC simulation procedure consists of stochastic simulation of ages at death for each individual in the data from continuous uniform distributions defined over their respective age intervals at each MC iteration. A point estimate of the YPLL-based estimand of interest is then calculated from the collection of simulated ages at death at each MC iteration, and the overall point estimate is taken to be the mean of the collection of MC point estimates, while the lower and upper endpoints of a $(1-\alpha) \times 100\%$ interval estimate (which can be conceptualized as a “range interval” per Bobashev and Morris (2010) [57]) are defined as the $\alpha/2$ and $1-\alpha/2$ quantiles of the collection of MC point estimates, respectively.
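The summary above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of Xu et al.: the death counts, the reference age of 75 years, the number of iterations, the seed, and all function names are hypothetical, and total YPLL is used as the estimand.

```python
import random


def quantile(sorted_values, q):
    """Quantile of an already sorted list, using linear interpolation."""
    position = q * (len(sorted_values) - 1)
    low = int(position)
    high = min(low + 1, len(sorted_values) - 1)
    return sorted_values[low] + (position - low) * (sorted_values[high] - sorted_values[low])


def simulate_total_ypll(death_counts, reference_age, rng):
    """One MC iteration: draw a uniform age at death per decedent and sum YPLL."""
    total = 0.0
    for (lower, upper), n_deaths in death_counts.items():
        for _ in range(n_deaths):
            age = rng.uniform(lower, upper)
            total += max(reference_age - age, 0.0)
    return total


def mc_interval(death_counts, reference_age=75.0, iterations=1000, alpha=0.05, seed=0):
    """Mean of the MC point estimates plus the alpha/2 and 1 - alpha/2 quantiles."""
    rng = random.Random(seed)
    draws = sorted(
        simulate_total_ypll(death_counts, reference_age, rng) for _ in range(iterations)
    )
    point = sum(draws) / len(draws)
    return point, quantile(draws, alpha / 2), quantile(draws, 1 - alpha / 2)


# Hypothetical death counts keyed by age interval [lower, upper).
example_counts = {(35, 45): 12, (45, 55): 30, (55, 65): 80}
point, lower, upper = mc_interval(example_counts)
print(point, lower, upper)
```

For these hypothetical counts, the midpoint method would give a single value of 2370 years (12 × 35 + 30 × 25 + 80 × 15), whereas the simulation additionally yields an interval around a value close to that figure.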
2.4. Procedure Modification to Account for Suppression of Low Death Counts
We now address the second source of estimation uncertainty—suppression of low death counts within individual age groups—through a modification of Xu et al.’s MC simulation procedure described in Section 2.3. A similar modification of the standard Xu et al. MC simulation procedure was also described by Xu et al. (2021b) [40] with respect to their comparative analysis of COVID-19-attributable YPLL by race/ethnicity; as such, we use similar language in our description here. For each state and sex, we know the total number of deaths contained in the union of age groups with suppressed death counts, each of which must be an integer between 1 and 9. Hence, for each sex, we can exhaustively enumerate all possible death count combinations across the age groups with suppressed death counts. For each sex, each death count combination corresponding to the age groups with suppressed death counts juxtaposed with the age groups containing non-suppressed death counts, constitutes one possible sex-specific “mortality dataset” of death counts within age groups.
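The enumeration described above can be sketched as follows. The stratum totals, the non-suppressed counts, and the function name are hypothetical, and the brute-force filter over `itertools.product` is only one possible approach; it would become slow when many age groups are suppressed.

```python
from itertools import product


def suppressed_count_combinations(n_groups, suppressed_total, low=1, high=9):
    """All assignments of counts in [low, high] to n_groups that sum to suppressed_total."""
    return [
        combo
        for combo in product(range(low, high + 1), repeat=n_groups)
        if sum(combo) == suppressed_total
    ]


# Hypothetical state-sex stratum with three suppressed age groups.
total_deaths = 250
non_suppressed_counts = [14, 31, 58, 77, 63]
suppressed_total = total_deaths - sum(non_suppressed_counts)

combos = suppressed_count_combinations(3, suppressed_total)
print(suppressed_total, len(combos))
```

Each tuple in `combos`, placed alongside the non-suppressed counts, corresponds to one candidate mortality dataset for that sex.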
We modified Xu et al.’s MC simulation procedure described in Section 2.3 by independently simulating ages at death for each individual for each possible male-female mortality dataset pair at each MC iteration. Then, at each MC iteration, a point estimate of the estimand of interest was calculated from the simulated ages at death for each male-female mortality dataset pair, from which we stored only the minimum and maximum point estimates. A conservative $(1-\alpha) \times 100\%$ interval estimate of the estimand of interest was then constructed from the $\alpha/2$ quantile of the collection of minimum MC point estimates and the $1-\alpha/2$ quantile of the collection of maximum MC point estimates. We describe the interval estimate as “conservative” due to our estimation strategy of enumerating all possible male-female mortality dataset pairs in the data and using the extrema of the subsequent MC point estimates to form an interval estimate. Indeed, if the suppressed death counts had actually been known, a $(1-\alpha) \times 100\%$ interval estimate of the estimand of interest obtained from the standard Xu et al. MC simulation procedure [48] would be completely contained in the corresponding conservative interval estimate.
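A minimal sketch of how such a conservative interval might be assembled is given below. It assumes the MC point estimates have already been computed and are arranged as one list per MC iteration, with one entry per male-female mortality dataset pair; the function names and the toy numbers are hypothetical.

```python
def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (position - low) * (ordered[high] - ordered[low])


def conservative_interval(estimates_by_iteration, alpha=0.05):
    """Lower end from the per-iteration minima, upper end from the per-iteration maxima."""
    minima = [min(row) for row in estimates_by_iteration]
    maxima = [max(row) for row in estimates_by_iteration]
    return quantile(minima, alpha / 2), quantile(maxima, 1 - alpha / 2)


# Toy input: 3 MC iterations, 2 dataset pairs per iteration.
toy_estimates = [[1.0, 3.0], [2.0, 4.0], [0.0, 5.0]]
print(conservative_interval(toy_estimates))
```

Because the lower endpoint uses only minima and the upper endpoint only maxima, the resulting interval is at least as wide as the interval that any single dataset pair would produce on its own.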
2.5. Computational Savings by Omitting Unnecessary Mortality Datasets
For each state and estimand of interest, the modified version of Xu et al.’s MC simulation procedure described in Section 2.4 comprises, in theory, simulating ages at death for each individual in $B \times K^{M}_{s} \times K^{F}_{s}$ male-female mortality dataset pairs, where $B$ denotes the number of specified MC iterations, $K^{M}_{s}$ denotes the number of male mortality datasets in state s, and $K^{F}_{s}$ denotes the number of female mortality datasets in state s. As such, the total number of male-female mortality dataset pairs to simulate from can be enormous for sufficiently large values of either $K^{M}_{s}$ or $K^{F}_{s}$ (or both), potentially making it a computationally prohibitive endeavor. However, substantial computational savings can be achieved by identifying male-female mortality dataset pairs that we do not need to simulate ages at death from because they would yield a maximum or minimum MC point estimate of the estimand of interest with probability 0.
We describe an example of identifying such superfluous male-female mortality dataset pairs we can omit from our analysis when the estimand of interest is the male percentage of total YPLL. This is achieved by separately considering which male-female mortality dataset pairs will yield a maximum or minimum point estimate. To obtain a maximum point estimate at each MC iteration, only one male-female mortality dataset pair needs to be considered. The only male mortality dataset that needs to be considered is the one that contains the maximum possible number of deaths in the youngest age groups corresponding to suppressed death counts, and the only female mortality dataset that needs to be considered is the one that contains the maximum possible number of deaths in the oldest age groups corresponding to suppressed death counts. Any other male-female mortality dataset pair would yield a maximum MC point estimate with probability 0. A similar result applies when obtaining a minimum point estimate at each MC iteration. For that task, the only male mortality dataset that needs to be considered is the one that contains the maximum possible number of deaths in the oldest age groups corresponding to suppressed death counts, and the only female mortality dataset that needs to be considered is the one that contains the maximum possible number of deaths in the youngest age groups corresponding to suppressed death counts. Any other male-female mortality dataset pair would yield a minimum MC point estimate with probability 0. Hence, when the estimand of interest is the male percentage of total YPLL, only 2 male-female mortality dataset pairs need to be simulated from at each MC iteration.
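The two extreme datasets described above can be constructed directly rather than found by enumeration. The sketch below is one reading of "the maximum possible number of deaths in the youngest (or oldest) age groups": it assumes the suppressed age groups are ordered from youngest to oldest, gives every group the minimum count of 1, and then packs the remaining deaths greedily toward one end subject to the cap of 9. The function name and example values are hypothetical.

```python
def extreme_dataset(n_groups, suppressed_total, youngest_first=True, low=1, high=9):
    """Suppressed counts (ordered youngest to oldest) with deaths packed toward one end."""
    if not n_groups * low <= suppressed_total <= n_groups * high:
        raise ValueError("suppressed_total is incompatible with the suppression bounds")
    counts = [low] * n_groups
    remaining = suppressed_total - n_groups * low
    order = range(n_groups) if youngest_first else range(n_groups - 1, -1, -1)
    for index in order:
        extra = min(high - low, remaining)
        counts[index] += extra
        remaining -= extra
    return counts


# Hypothetical stratum: 3 suppressed age groups containing 12 deaths in total.
print(extreme_dataset(3, 12, youngest_first=True))
print(extreme_dataset(3, 12, youngest_first=False))
```

Under this reading, the maximum male percentage of total YPLL would pair the youngest-packed male dataset with the oldest-packed female dataset, and the minimum would pair the opposite two.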
For estimation of the age-adjusted YPLL-based estimands of interest (i.e., age-adjusted male and female YPLL rate, age-adjusted male-to-female YPLL RR), identifying male-female mortality dataset pairs that can be omitted is less straightforward. However, inequality conditions can be established computationally to identify meaningful numbers of male-female mortality dataset pairs that can validly be omitted, thereby attaining computational savings.
2.6. Complete Monte Carlo Simulation Procedure
Here, we comprehensively summarize the modified version of Xu et al.’s MC simulation procedure that we used in our analysis to perform estimation and uncertainty quantification of the YPLL-based estimands of interest. In summaries of the results of our analysis, we characterize D.C. as a “state” for brevity. For each state s, the procedure can be comprehensively summarized as follows.
-
Calculate the difference between the total number of male deaths and the number of male deaths contained in age groups with non-suppressed death counts. This difference is the number of male deaths contained in the union of age groups with suppressed death counts. Construct all possible male mortality datasets, each of which corresponds to a possible male death count combination across the age groups with suppressed death counts. Do the same for female deaths to obtain all possible female mortality datasets.
Let $B$ denote the total number of MC iterations, and let $b = 1, \ldots, B$ index the MC iterations. Let $N^{M}_{s}$ denote the total number of male deaths in state s, and let $i = 1, \ldots, N^{M}_{s}$ index the individual male deaths. Similarly, let $N^{F}_{s}$ denote the total number of female deaths in state s, and let $k = 1, \ldots, N^{F}_{s}$ index the individual female deaths. Let $J$ denote the number of male-female mortality dataset pairs for state s that remain after omitting those male-female mortality dataset pairs that would yield a maximum or minimum point estimate with probability 0, and let $j = 1, \ldots, J$ index both their male and female mortality dataset constituents.
-
Specify a YPLL upper reference age $A$ less than or equal to 85 years. We view the <1 age group as equivalent to the singular age 0, and the remaining numeric NCHS age group endpoints represent integer age at last birthday so that there is a 1-year gap between the endpoints of two chronologically consecutive age groups (e.g., 35–44 and 45–54). We treat age as a continuous variable, and as a consequence, we mathematically interpret the <1 age group (age 0) as the right half-open interval $[0, 1)$, the 85+ age group as the half-bounded interval $[85, \infty)$, and the remaining NCHS age groups as right half-open intervals with lower limit equal to the lower endpoint of the corresponding NCHS age group and upper limit equal to the upper endpoint of the corresponding NCHS age group plus one (e.g., age group 15–24 is viewed as $[15, 25)$). Observe that $A$ is intentionally and necessarily chosen to be less than or equal to 85 years to obviate the simulation of ages at death for decedents in the 85+ age group because each decedent in that age group contributes zero YPLL.
-
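At each MC iteration b, independently simulate an age at death $a^{M}_{bji}$ for each male decedent $i$ in male mortality dataset j whose reported age at death lies in the age interval $[l_i, u_i)$ from a continuous uniform distribution over the same interval:
$$a^{M}_{bji} \sim \mathrm{Uniform}(l_i, u_i) \qquad (2)$$
Likewise, independently simulate an age at death $a^{F}_{bjk}$ for each female decedent $k$ in female mortality dataset j whose reported age at death lies in the age interval $[l_k, u_k)$ from a continuous uniform distribution over the same interval:
$$a^{F}_{bjk} \sim \mathrm{Uniform}(l_k, u_k) \qquad (3)$$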
-
At each MC iteration b, calculate a point estimate of the estimand of interest from the simulated ages of death corresponding to each of the J male-female mortality dataset pairs.
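Specifically, for the estimation of the male percentage of total YPLL, first calculate total YPLL for males and females from the simulated ages at death, which are $Y^{M}_{bj}$ and $Y^{F}_{bj}$, respectively. Then, the male percentage of total YPLL, which we denote $P^{M}_{bj}$, is given by:
$$P^{M}_{bj} = \frac{Y^{M}_{bj}}{Y^{M}_{bj} + Y^{F}_{bj}} \times 100\% \qquad (4)$$
Similarly, the female percentage of total YPLL, which we denote $P^{F}_{bj}$, is given by:
$$P^{F}_{bj} = \frac{Y^{F}_{bj}}{Y^{M}_{bj} + Y^{F}_{bj}} \times 100\% \qquad (5)$$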
For estimation of the age-adjusted male-to-female YPLL RR, first estimate the age-adjusted male and female YPLL rates using direct age adjustment [58], using the 2019 U.S. Census Bureau age distribution estimate of the overall U.S. population as the standard population. Since the simulated ages at death are continuous and the U.S. Census Bureau age distribution estimates are defined over integer ages from 0 to 84, we aggregate the corresponding simulated YPLL values with respect to the 1-year age intervals implied by these integer ages (i.e., age $x$ implies age interval $[x, x+1)$) to calculate the age-specific YPLL rates, which are subsequently applied to the standard population to obtain the age-adjusted YPLL rate. The male age-adjusted YPLL rate is given by:
$$R^{M}_{bj} = \frac{\sum_{x=0}^{84} \dfrac{Y^{M}_{bj}(x)}{n^{M}_{s}(x)}\, n(x)}{\sum_{x=0}^{84} n(x)} \qquad (6)$$
where $Y^{M}_{bj}(x)$ denotes aggregate male YPLL corresponding to age $x$; $n^{M}_{s}(x)$ denotes the 2019 U.S. Census Bureau male population estimate for age $x$ in state s; and $n(x)$ denotes the 2019 U.S. Census Bureau national population estimate for age $x$. Analogously, the female age-adjusted YPLL rate is given by:
$$R^{F}_{bj} = \frac{\sum_{x=0}^{84} \dfrac{Y^{F}_{bj}(x)}{n^{F}_{s}(x)}\, n(x)}{\sum_{x=0}^{84} n(x)} \qquad (7)$$
where $Y^{F}_{bj}(x)$ denotes aggregate female YPLL corresponding to age $x$; and $n^{F}_{s}(x)$ denotes the 2019 U.S. Census Bureau female population estimate for age $x$ in state s. The age-adjusted male-to-female YPLL RR is defined as the quotient of $R^{M}_{bj}$ and $R^{F}_{bj}$:
$$\mathrm{RR}_{bj} = \frac{R^{M}_{bj}}{R^{F}_{bj}} \qquad (8)$$
-
At each MC iteration b, store the maximum and minimum of the J MC point estimates of the estimand of interest obtained from the simulated male-female mortality dataset pairs.
-
A conservative $(1-\alpha) \times 100\%$ interval estimate of the estimand of interest is given by the $\alpha/2$ quantile of the minimum MC point estimates and the $1-\alpha/2$ quantile of the maximum MC point estimates. Moreover, a conservative two-sided $\alpha$-level test for whether the estimand of interest equals a specific hypothesized value can be performed by observing whether the associated conservative interval estimate contains that value. We characterize the described test as “conservative” for the same reasons that we characterize the interval estimate as “conservative.”
An overall point estimate for the estimand of interest is not straightforward to define as a result of our estimation strategy. To be explicit, the midpoint of the conservative interval estimate should not be interpreted as the point estimate. As such, we present the results of our YPLL analysis in terms of the collection of interval estimates we generate for the YPLL-based estimands of interest.
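Two of the steps in the procedure above, the interpretation of NCHS age groups as intervals and the direct age adjustment of YPLL rates, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function names, the reference age of 75 years, the toy populations, and the simulated ages are all hypothetical, and the standard population is assumed to be normalized over the same integer ages used for the age-specific rates.

```python
import math


def age_group_to_interval(label):
    """Map an NCHS age-group label to a right half-open interval [lower, upper)."""
    if label == "<1":
        return (0.0, 1.0)
    if label.endswith("+"):
        return (float(label[:-1]), math.inf)
    lower, upper = label.replace("–", "-").split("-")
    return (float(lower), float(upper) + 1.0)


def ypll_by_integer_age(simulated_ages, reference_age):
    """Aggregate YPLL over the 1-year intervals [x, x + 1) implied by integer ages."""
    totals = {}
    for age in simulated_ages:
        x = int(age)  # floor, valid because ages are non-negative
        totals[x] = totals.get(x, 0.0) + max(reference_age - age, 0.0)
    return totals


def age_adjusted_rate(ypll_by_age, stratum_population, standard_population):
    """Direct age adjustment: weight age-specific rates by the standard population."""
    weighted = sum(
        ypll_by_age.get(x, 0.0) / stratum_population[x] * standard_population[x]
        for x in standard_population
    )
    return weighted / sum(standard_population.values())


# Toy example restricted to two integer ages (all numbers hypothetical).
standard = {40: 1000.0, 41: 3000.0}
male_population = {40: 100.0, 41: 200.0}
female_population = {40: 100.0, 41: 400.0}

male_ypll = ypll_by_integer_age([40.5, 41.5], reference_age=75.0)
female_ypll = ypll_by_integer_age([41.25], reference_age=75.0)

male_rate = age_adjusted_rate(male_ypll, male_population, standard)
female_rate = age_adjusted_rate(female_ypll, female_population, standard)
rate_ratio = male_rate / female_rate
print(age_group_to_interval("15–24"), male_rate, female_rate, rate_ratio)
```

In a full run, these functions would be applied to every male-female mortality dataset pair at every MC iteration, with the per-iteration minimum and maximum of the resulting estimates retained for the conservative interval.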
2.7. Monte Carlo Simulation Procedure for Estimation of Age-Adjusted Mortality Rates and Rate Ratios
We also considered estimation of age-adjusted mortality rates by sex and age-adjusted male-to-female mortality RRs in the U.S. and in each state for the purpose of comparing them to our estimates of age-adjusted YPLL rates by sex and age-adjusted male-to-female YPLL RRs, respectively, to examine potential differences in the characterization of the disparities when measuring mortality burden in terms of YPLL compared to death counts. Adopting the same methodological motivation as Xu et al. (2021b) in their estimation of age-adjusted mortality rates by race/ethnicity and their associated RRs, we wanted our male and female mortality rates to be standardized to the 2019 U.S. Census Bureau age distribution estimate of the overall U.S. population without combining age intervals in the U.S. Census Bureau data to align them with the NCHS data age intervals. In this way, the estimated mortality and YPLL rates in our analysis were age-standardized to standard populations that were as nearly identical as possible in terms of age interval granularity. To this end, we performed an MC simulation procedure to obtain conservative interval estimates of male and female age-adjusted mortality rates and age-adjusted male-to-female mortality RRs that largely mirrors the MC simulation procedure for the YPLL-based estimands of interest described in Section 2.6. The key differences were that ages at death were simulated for all individuals in non-85+ age groups, and that in the direct age adjustment procedure, we summed the number of simulated ages at death falling within the 1-year intervals implied by integer ages 0 to 84 to calculate the age-specific mortality rates for ages 0 to 84 and also calculated an age 85+ mortality rate. These rates were subsequently applied to the standard population to obtain the male and female age-adjusted mortality rates, which we denote as $\mathit{MR}^{M}_{bj}$ and $\mathit{MR}^{F}_{bj}$, respectively.
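Mathematically, the male age-adjusted mortality rate is given by:
$$\mathit{MR}^{M}_{bj} = \frac{\sum_{x=0}^{84} \dfrac{d^{M}_{bj}(x)}{n^{M}_{s}(x)}\, n(x) + \dfrac{d^{M}_{j}(85+)}{n^{M}_{s}(85+)}\, n(85+)}{\sum_{x=0}^{84} n(x) + n(85+)} \qquad (9)$$
where $d^{M}_{bj}(x)$ is the number of male simulated ages at death corresponding to age $x$, $d^{M}_{j}(85+)$ denotes the number of male deaths in the 85+ age group, $n^{M}_{s}(x)$ and $n^{M}_{s}(85+)$ denote the 2019 U.S. Census Bureau male population estimates in state s for age $x$ and for age group 85+, $n(x)$ denotes the 2019 U.S. Census Bureau national population estimate for age $x$, and $n(85+)$ denotes the 2019 U.S. Census Bureau national population estimate for age group 85+. Similarly, the female age-adjusted mortality rate is given by:
$$\mathit{MR}^{F}_{bj} = \frac{\sum_{x=0}^{84} \dfrac{d^{F}_{bj}(x)}{n^{F}_{s}(x)}\, n(x) + \dfrac{d^{F}_{j}(85+)}{n^{F}_{s}(85+)}\, n(85+)}{\sum_{x=0}^{84} n(x) + n(85+)} \qquad (10)$$
where $d^{F}_{bj}(x)$ is the number of female simulated ages at death corresponding to age $x$, $d^{F}_{j}(85+)$ denotes the number of female deaths in the 85+ age group, and $n^{F}_{s}(x)$ and $n^{F}_{s}(85+)$ denote the corresponding female population estimates in state s. The age-adjusted male-to-female mortality RR is defined as the quotient of $\mathit{MR}^{M}_{bj}$ and $\mathit{MR}^{F}_{bj}$:
$$\mathrm{RR}^{\mathrm{mort}}_{bj} = \frac{\mathit{MR}^{M}_{bj}}{\mathit{MR}^{F}_{bj}} \qquad (11)$$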
2.8. Computation
We used the modified version of Xu et al.’s MC simulation procedure described in Section 2.6 for interval estimation of the YPLL-based estimands of interest in the U.S. and in each state. We performed the MC simulation procedure using $B$ iterations and a constant YPLL upper reference age $A$ for both males and females, an approach used by the CDC [59,60,61,62,63,64,65,66] and widely used in applied research studies [23,67,68,69,70,71,72,73,74,75,76,77,78,79]. Alternative definitions of YPLL sometimes used to contrast YPLL by sex include using sex-specific values of $A$ corresponding to at-birth male and female life expectancies (e.g., [80,81]) and the sex-specific remaining life expectancy method, as employed by Quast et al. and in other applied research studies (e.g., [82,83,84,85]), to reflect known underlying differences in the life expectancy between males and females. However, the gap between the male and female life expectancies changes over time (e.g., [86]), varies substantially across countries [87], and is attributable to a myriad of biological, behavioral, and social factors that, after decades of research, are not fully understood [88]. Hence, out of a sex equity ethos not to necessarily normalize lower male life expectancy (similar sentiments are shared in other applications [40,89,90,91]) but also to provide a consistent comparison of YPLL by sex [92], we decided to use a constant YPLL upper reference age $A$ to define YPLL for both males and females, a common practice, as previously discussed. We also obtained conservative 95% interval estimates of age-adjusted mortality rates by sex and age-adjusted male-to-female mortality RRs in the U.S. and in each state and D.C. using the analogous MC simulation procedure described in Section 2.7, also with $B$ iterations.
All MC simulations were performed using the
3. Results
3.1. Blueprint for Interpretation of Results
Tables S1 and S2 in the Supplementary Materials contain the complete results of our analysis. Table S1 presents the percent population shares by sex, total COVID-19 deaths by sex, percentages of total COVID-19 deaths by sex, conservative 95% interval estimates of total COVID-19-attributable YPLL by sex, and conservative 95% interval estimates of the percentage of total COVID-19-attributable YPLL by sex in the U.S. and in each state. When the percentage of total deaths for males is above their percent population share, males are overrepresented among COVID-19 deaths, and when the percentage of total deaths for males is below their percent population share, males are underrepresented among COVID-19 deaths. The interpretation of the interval estimates of the percentage of total YPLL relative to the percent population share is slightly more nuanced, however. When the interval estimate of the male percentage of total YPLL is completely above the male percent population share, males are either overrepresented among COVID-19 deaths or male decedent ages are systematically younger relative to those of females—to a degree that is statistically discernable—or both. Conversely, when the interval estimate of the male percentage of total YPLL is completely below the male percent population share, males are either underrepresented among COVID-19 deaths or male decedent ages are systematically older relative to those of females—to a degree that is statistically discernable—or both. This second scenario for the interval estimates of the percentage of total YPLL, however, does not occur in the results of our analysis.
The magnitudes of the sex disparities in the COVID-19 mortality burden can be amplified when mortality is measured in terms of YPLL compared to death counts. For example, if males are overrepresented among COVID-19 deaths, their interval estimate of the percentage of total YPLL can be completely above their percentage of total deaths as a result of male decedent ages being systematically and statistically discernably younger relative to those of females. Similarly, if males are underrepresented among COVID-19 deaths, their interval estimate of the percentage of total YPLL can be completely below their percentage of total deaths as a result of male decedent ages being systematically and statistically discernably older relative to those of females. This second scenario, however, does not occur in the results of our analysis.
Moreover, the direction of the disparity in the COVID-19 mortality burden can in fact reverse when mortality is measured in terms of YPLL compared to death counts. For example, if males are underrepresented among COVID-19 deaths, the interval estimate of the percentage of total YPLL can, in contrast, be completely above the male percent population share as a result of male decedent ages being systematically younger relative to females to a degree that is both statistically discernable and outweighs the disproportionately low number of male deaths. Similarly, if males are overrepresented among COVID-19 deaths, the interval estimate of the percentage of total YPLL can, in contrast, be completely below the male percent population share as a result of male decedent ages being systematically older relative to females to a degree that is both statistically discernable and outweighs the disproportionately high number of male deaths. This second scenario, however, does not occur in the results of our analysis.
Table S2 presents conservative 95% interval estimates of the male and female age-adjusted mortality and YPLL rates as well as the age-adjusted male-to-female mortality and YPLL RRs in the U.S. and in each state. When the interval estimate of the age-adjusted male-to-female mortality RR is completely above 1.0, it means that after accounting for differences in the male and female population age distributions, the male mortality rate is statistically discernably above that of females. Similarly, when the interval estimate of the age-adjusted male-to-female YPLL RR is completely above 1.0, it means that after accounting for differences in the male and female population age distributions, the male YPLL rate is statistically discernably above that of females. When the interval estimate of the age-adjusted male-to-female YPLL RR is completely above the interval estimate of the age-adjusted male-to-female mortality RR, the male-female disparity in the COVID-19 mortality burden is statistically discernably greater in magnitude when measuring mortality burden in terms of YPLL compared to death counts as a result of males dying at systematically and statistically discernably younger ages relative to females after accounting for differences in their population age distributions. Similarly, when the interval estimate of the age-adjusted male-to-female YPLL RR is completely below the interval estimate of the age-adjusted male-to-female mortality RR, the male-female disparity in the COVID-19 mortality burden is statistically discernably smaller in magnitude when measuring mortality burden in terms of YPLL compared to death counts as a result of males dying at systematically and statistically discernably older ages relative to females after accounting for differences in their population age distributions.
This approach to determining whether the age-adjusted male-to-female YPLL RR differs from the age-adjusted male-to-female mortality RR to a statistically discernable degree is overly conservative, beyond the conservatism attributable to our strategy of considering all possible male-female mortality dataset pairs. It nevertheless represents a sufficient condition, and it circumvents MC simulation of all possible male-female mortality dataset pairs corresponding to the difference between the age-adjusted male-to-female YPLL RR and the age-adjusted male-to-female mortality RR, which may be computationally prohibitive.
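The interval comparison described above amounts to a non-overlap check, which might be sketched as follows. The function names `compare_rr_intervals` and `signed_gap` are hypothetical, and the overlapping pair in the second call is made up purely for illustration; the first call uses the U.S. national interval estimates reported in Section 3.2.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower limit, upper limit)


def compare_rr_intervals(mortality_rr: Interval, ypll_rr: Interval) -> str:
    """Sufficient-condition check: is one RR interval entirely beyond the other?"""
    m_lo, m_hi = mortality_rr
    y_lo, y_hi = ypll_rr
    if y_lo > m_hi:
        return "ypll_rr_above"   # disparity discernably larger under YPLL
    if y_hi < m_lo:
        return "ypll_rr_below"   # disparity discernably smaller under YPLL
    return "inconclusive"        # overlap: the check cannot decide either way


def signed_gap(mortality_rr: Interval, ypll_rr: Interval) -> float:
    """Lower limit of the YPLL RR interval minus upper limit of the mortality RR interval."""
    return ypll_rr[0] - mortality_rr[1]


# U.S. national interval estimates reported in Section 3.2.
us_mortality_rr = (1.62, 1.62)
us_ypll_rr = (1.88, 1.89)
print(compare_rr_intervals(us_mortality_rr, us_ypll_rr))

# Made-up overlapping intervals, for illustration only.
print(compare_rr_intervals((1.40, 1.70), (1.60, 1.95)))
```

Because the check is only a sufficient condition, an "inconclusive" result does not show that the two RRs are equal. The signed gap computed by `signed_gap` is the quantity by which states are ordered in Figure 3; a positive value corresponds to the "ypll_rr_above" outcome.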
3.2. Presentation of Results
Figure 1 displays a graphical comparison between the conservative 95% interval estimates of the male percentage of total YPLL, the male percentage of total deaths, and the male percent population share in the U.S. and in each state. Nationally, males are overrepresented among COVID-19 deaths, comprising 54.8% of total deaths despite representing 49.2% of the U.S. population. This is also mirrored at the state level, where males are overrepresented among COVID-19 deaths in all but two states (Maine and Rhode Island being the exceptions), with percentages of total COVID-19 deaths exceeding the state percent population shares by between 0.03 percentage units in Connecticut and 12.5 percentage units in Nevada. Moreover, males die from COVID-19 in the U.S. overall at systematically younger ages relative to females to such a degree that the U.S. national conservative 95% interval estimate of the male percentage of total YPLL (64.0–64.1%) is completely above the U.S. national male percentage of total COVID-19 deaths. This national trend of males dying from COVID-19 at systematically younger ages relative to females is nearly universally observed at the state level, with the interval estimates of the male percentage of total YPLL completely above the male percentage of total deaths in all but two states (Hawaii and Alaska being the exceptions). For example, in California, males represent 49.7% of the population and 59.0% of total deaths, but the interval estimate of the percentage of total YPLL is even higher, at an astonishing (68.3–68.6%). The direction of the disparity reverses for the two states with males underrepresented among COVID-19 deaths. In Maine, males represent 49.0% of the population and only 48.0% of COVID-19 deaths, but the interval estimate of the percentage of total YPLL is (62.8–68.4%).
Similarly, in Rhode Island, males represent 48.7% of the population and only 48.1% of COVID-19 deaths, but the interval estimate of the percentage of total YPLL is (60.8–64.4%). Furthermore, in every state except Alaska, the interval estimates of the male percentages of total YPLL are completely above the male percent population shares. To complement Figure 1, Figure 2 displays a graphical comparison between the conservative 95% interval estimates of the female percentage of total YPLL, the female percentage of total deaths, and the female percent population share in the U.S. and in each state.
Figure 3 presents a graphical comparison of the conservative 95% interval estimates of the age-adjusted male-to-female YPLL and mortality RRs in the U.S. and in each state. The U.S. national conservative 95% interval estimate of the age-adjusted male-to-female mortality RR is (1.62–1.62), and the state-level interval estimates of the age-adjusted male-to-female mortality RR are completely above 1.0 in every state and completely above 2.0 in Hawaii. Furthermore, after accounting for differences in the male and female national population age distributions, males die from COVID-19 in the U.S. overall at systematically younger ages relative to females to such a degree that the U.S. national conservative 95% interval estimate of the age-adjusted male-to-female YPLL RR (1.88–1.89) is completely above the U.S. national interval estimate of the male-to-female mortality RR. This national trend is also widely observed at the state level, with the interval estimates of the age-adjusted male-to-female YPLL RR completely above the corresponding interval estimates of the age-adjusted male-to-female mortality RR in 33 states. Intriguingly, the reverse inequality is observed in four states (Alabama, Alaska, Mississippi, and South Dakota); hence, for these states, males actually die from COVID-19 at older ages relative to females to a degree that is statistically discernable after accounting for differences in the male and female state population age distributions. For three of these four states (Alaska being the exception), males actually die from COVID-19 at statistically discernably younger ages than females without accounting for differences in the male and female state population age distributions (i.e., the interval estimate of the male percentage of total YPLL is completely above the male percentage of total deaths; see Figure 1), thereby illustrating the importance of age standardization. 
Nevertheless, for these three states, the age-adjusted male YPLL rates still statistically discernably exceed the age-adjusted female YPLL rates due to the disproportionately high number of male COVID-19 deaths in these states. In fact, the state-level interval estimates of the age-adjusted male-to-female YPLL RR are completely above 1.0 in every state except Alaska and, remarkably, are completely above 2.0 in six states (California, Colorado, Nevada, New Hampshire, New Jersey, and New York).
4. Discussion
COVID-19 does not affect all segments of the U.S. population equally. As with older individuals [94] and certain racial/ethnic minorities (Blacks, Hispanics, and American Indian and Alaska Natives) [95], men are at disproportionately high risk of hospitalization and mortality following COVID-19 infection [15]. Multiple theories have been proposed attempting to explain the substantial disparities in COVID-19 mortality outcomes between males and females. For example, men in the U.S. have higher rates of certain medical conditions such as coronary artery disease [96,97], diabetes [98], and liver disease [98], which are risk factors for severe illness post-COVID-19 infection [99]. U.S. men also have higher rates of hypertension [100], which some studies have suggested is also a risk factor for COVID-19-related severe outcomes [99,101,102]. Some research has focused on the mechanisms by which innate biological differences between males and females contribute to the observed disparities in mortality outcomes by sex. Particular areas of focus include differences in the expressions of angiotensin-converting enzyme-2 (ACE2) [103,104,105], the expressions of transmembrane serine protease 2 (TMPRSS2) [106,107], and T-cell response [108,109,110] by sex. Behavioral differences between men and women have also been proposed as contributing to the disparities in the COVID-19 mortality burden by sex. For example, men are more likely to smoke cigarettes [111]—a risk factor for COVID-19-related severe outcomes [99]—and may seek medical care later in the course of a COVID-19 infection compared to women [112]. Men are also more likely than women to eschew wearing masks [113,114,115,116,117], and some research has suggested that masks can reduce the severity of a COVID-19 infection by reducing the viral inoculum [118,119,120,121]. Hand washing may also reduce the viral inoculum of a COVID-19 infection [122], and men are much less likely to practice proper hand hygiene [123,124,125,126]. 
More broadly, a 2016 meta-analysis of 85 peer-reviewed publications investigating a wide array of countries across six continents found that women were approximately 50% more likely than men to practice non-pharmaceutical health-protective behaviors such as proper hand washing, face mask wearing, and surface cleaning in the context of respiratory infectious disease epidemics and pandemics [127].
Our findings also reveal noticeable state-to-state variability in the magnitudes of the estimated disparities, as illustrated in Figure 1, Figure 2 and Figure 3. This variability seems to suggest that factors related to social determinants of health [128,129], whose degree of association with sex varies from state to state, play a role in driving male-female disparities in COVID-19 mortality, a perspective shared by other COVID-19 researchers [130,131]. For example, men are vastly overrepresented in certain essential industries that required continued employment during the U.S. COVID-19 epidemic such as food/agriculture, transportation/logistics, and manufacturing [132]. Indeed, a 2021 study in California (U.S.) indicated that these industries are among the occupational sectors with the highest associated excess mortality attributable to COVID-19 [133]. While the factors causing the disproportionately high degree of COVID-19 mortality experienced by males are complex and multifaceted [134,135] and warrant further research, a comprehensive investigation is beyond the scope of this paper.
Against this backdrop, it is notable and concerning that U.S. national COVID-19 public health strategies by both the Trump [136] and Biden [137] presidential administrations have failed to explicitly acknowledge or propose strategies to address the markedly disproportionate rates of morbidity and mortality associated with COVID-19 infections among men, a trend also observed internationally in governmental pandemic response policies [138]. We echo voices within the public health and medical communities calling on policy makers to place greater emphasis on addressing the disproportionate impacts of COVID-19 on men [134,138,139,140,141], especially for those subgroups of men who have the poorest health outcomes, such as certain racial/ethnic minority groups. Such efforts should not minimize the need to address serious impacts that COVID-19 has on women (e.g., [142,143,144,145,146,147,148,149,150,151,152,153]), which have been disproportionate in some domains other than morbidity and mortality, nor should they detract from efforts to improve women’s health more broadly. However, the absence of COVID-19 prevention and mitigation efforts that are focused on men constitutes a glaring missed opportunity to more effectively combat the U.S. COVID-19 epidemic among a substantial proportion of the population.
Currently, one of the most obvious areas needing attention is the gap in male and female COVID-19 vaccination rates, with notably lower rates among men [154,155,156]. Vaccine hesitancy among men has emerged as a substantial source of concern, and an NPR/PBS NewsHour/Marist poll conducted 3–8 March 2021 [157] revealed that men were substantially more likely than women to refuse to get a COVID-19 vaccine, with Republican-identifying men especially unwilling (49%). From a public health perspective, the absence of U.S. health policy initiatives specifically targeted to encourage COVID-19 vaccine uptake among men is disconcerting [155,158]. Even with the development and availability of COVID-19 vaccines in the U.S. [159], the extent to which SARS-CoV-2 and its associated genetic variants [160] will continue to pose a public health threat to the U.S. in the future is currently unknown. While vaccines are the most potent tool against morbidity and mortality associated with COVID-19, they are not an absolute panacea and do not address the root causes of observed population-level disparities. Given the significantly higher risk of COVID-19-related severe outcomes among patients with pre-existing health conditions, long-term investments and strategies to reduce health disparities and promote better overall health in the U.S. can reduce disparities in future COVID-19 outcomes and mitigate the impacts of potential future COVID-19 outbreaks. Possible areas to tackle that are of particular relevance to COVID-19 include obesity, cigarette smoking, excessive alcohol consumption, and unmanaged diabetes and hypertension. Given their higher rates of many chronic health conditions, including those that are risk factors for COVID-19-related severe outcomes, men stand to benefit substantially from such initiatives. 
For example, males have markedly higher rates of prediabetes than females [161], which represents a ripe opportunity for public health interventions to reduce disparities in eventually diagnosed diabetes. A tangible first step could be to expand the list of explicit health priorities for men in Healthy People 2030 [162], a U.S. Department of Health and Human Services and Office of Disease Prevention and Health Promotion initiative setting national objectives to improve health and well-being over the current decade, to include additional targeted objectives, such as some of those mentioned above.
Contrasting the male and female COVID-19 mortality burden using YPLL captures disparities in both the number of COVID-19 deaths and the ages at death of COVID-19 decedents in a single metric that complements conventional comparative COVID-19 mortality analyses by sex. Our results show that measuring mortality burden in terms of YPLL compared to death counts generally amplifies the magnitude of the disparities in the COVID-19 mortality burden between males and females. For instance, after accounting for the differences in the male and female national population age distributions, we estimated the COVID-19-attributable mortality rate in the U.S. to be approximately 62% higher for males than females but the U.S. national COVID-19-attributable YPLL rate to be 88–89% higher for males than females, owing to the fact that, nationally, males die from COVID-19 at systematically younger ages than females, even after accounting for differences in the national population age distributions between males and females. Remarkably, the age-adjusted male YPLL rates are estimated to be more than twice those of females in six states. Furthermore, while substantial disparities in the COVID-19 mortality burden exist between males and females, the overwhelming majority of individuals infected with COVID-19 do not die from the disease. The long-term health effects experienced by recovered COVID-19 patients are unknown at the time of writing [163]. When more detailed information on the long-term disability profiles of individuals formerly infected with COVID-19 becomes available, a comparative analysis of disability-adjusted life years (DALY) [164] by sex, for example, would vastly broaden our understanding of the disparate impacts of the U.S. COVID-19 epidemic between males and females beyond just immediate mortality.
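The percentages quoted above follow directly from the national rate ratios reported in Section 3.2, since a male-to-female rate ratio RR corresponds to a male rate that is (RR − 1) × 100 percent higher. A minimal sketch of that conversion, with the hypothetical helper name `percent_excess`:

```python
def percent_excess(rate_ratio: float) -> int:
    """Convert a male-to-female rate ratio into the percent by which the male rate is higher."""
    return round((rate_ratio - 1.0) * 100.0)


# U.S. national age-adjusted interval limits reported in Section 3.2.
print(percent_excess(1.62))                        # mortality RR -> 62
print(percent_excess(1.88), percent_excess(1.89))  # YPLL RR limits -> 88 89
```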
Two sources of uncertainty within the NCHS data substantially complicated our analysis: the administrative interval censoring of ages at death (precluding the exact calculation of YPLL) and the suppression of death counts between 1 and 9 within age groups denoting age at death. To perform estimation of the YPLL-based estimands of interest, accounting for the estimation uncertainty arising from the administrative interval censoring of ages at death and the suppression of low death counts, we developed a novel adaptation of the MC simulation procedure developed by Xu et al. (2021a) that targets estimation of the extrema of the theoretically attainable values of the YPLL-based estimands of interest, resulting in interval estimates for them. A consequence of this conservative estimation strategy, however, was wide interval estimates in states with a high ratio of suppressed to non-suppressed death counts, corresponding to states with a low total number of COVID-19 deaths (e.g., see results for Hawaii). As a result, disparities in COVID-19-attributable YPLL between males and females can be hard to detect when they exist and are small. Despite the challenges in estimation of the YPLL-based estimands of interest due to the two sources of uncertainty in the data and our conservative estimation approach, our analysis nevertheless revealed substantial and statistically discernable disparities in COVID-19-attributable YPLL before age 75 between males and females across U.S. states.
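To give a rough sense of how the two sources of uncertainty translate into an interval, the sketch below computes crude deterministic bounds on YPLL before age 75 for a single population. It is not the MC simulation procedure used in this study and bounds only a raw YPLL total, not the percentages or rate ratios analyzed here. It assumes ages in whole years, counts 75 minus age at death as the years lost, and treats age groups independently (ignoring any constraints from known totals). The age groups and death counts are made up for illustration, and the name `ypll_bounds` is hypothetical.

```python
from typing import List, Optional, Tuple

REFERENCE_AGE = 75  # YPLL is measured before age 75, as in this study

# (youngest age in group, oldest age in group, death count);
# a count of None marks a suppressed value, known only to lie between 1 and 9.
AgeGroup = Tuple[int, int, Optional[int]]


def ypll_bounds(groups: List[AgeGroup]) -> Tuple[int, int]:
    """Return crude (lower, upper) bounds on total YPLL before the reference age."""
    lower = 0
    upper = 0
    for youngest, oldest, count in groups:
        # Suppression: only the range 1..9 is known.
        min_count, max_count = (1, 9) if count is None else (count, count)
        # Interval censoring: every death could be at the oldest or the youngest age.
        fewest_years = max(0, REFERENCE_AGE - oldest)
        most_years = max(0, REFERENCE_AGE - youngest)
        lower += min_count * fewest_years
        upper += max_count * most_years
    return lower, upper


# Made-up data for one hypothetical population.
hypothetical_groups: List[AgeGroup] = [
    (35, 44, None),  # suppressed count
    (45, 54, 12),
    (55, 64, 40),
    (65, 74, 95),
    (75, 84, 150),   # contributes nothing to YPLL before age 75
]
print(ypll_bounds(hypothetical_groups))
```

Even in this toy setting, a single suppressed count in a young age group widens the bounds noticeably, which is consistent with the wide interval estimates observed for states with many suppressed counts.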
5. Conclusions
We quantified and contrasted COVID-19-attributable YPLL before the age of 75 between males and females in the U.S. and in each of the 50 U.S. states and D.C. from U.S. COVID-19 mortality data from the NCHS (as of 31 March 2021), estimating percentages of total YPLL by sex—contrasting them with their respective percent population shares—and age-adjusted male-to-female YPLL RRs. Our results reveal a virtually universal pattern across states of males experiencing disproportionately high COVID-19-attributable YPLL relative to females. To examine differences in the characterization of the disparities in the COVID-19 mortality burden between males and females when measuring mortality burden in terms of YPLL compared to death counts, we also calculated the corresponding percentages of total COVID-19 deaths by sex and estimated the corresponding age-adjusted male-to-female mortality RRs in the U.S. and in each of the 50 states and D.C. Comparing these two approaches to measuring mortality burden revealed that the estimated disparities are generally greater in magnitude when measuring mortality burden in terms of YPLL compared to death counts, reflecting a broad dual pattern of males dying from COVID-19 in the U.S. at higher rates and at systematically younger ages relative to females. As an epidemiological measure, YPLL offers a compelling illustration of the disproportionately high COVID-19 mortality burden experienced by males in the U.S. by explicitly incorporating age at death in quantifying mortality impact.
More broadly, the COVID-19 pandemic offers lessons regarding the importance of cultivating public health environments in the U.S. and across the world that appropriately recognize the sex-specific needs of individuals as well as different patterns in risk factors, health behaviors, and responses to interventions between men and women. In particular, there is an immediate and urgent need to address COVID-19 vaccine hesitancy among men in the U.S. We urge public officials to update vaccine rollout plans with focused efforts to increase COVID-19 vaccinations among men.
Supplementary Materials
The following are available online at
Author Contributions
Conceptualization, J.J.X.; data curation, J.J.X.; formal analysis, J.J.X.; methodology, J.J.X.; software, J.J.X.; visualization, J.J.X.; writing—original draft, J.J.X.; writing—review and editing, J.J.X., J.T.C., T.R.B., R.S.B., M.A.S. and C.M.R. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data presented in this study are available in the Supplementary Materials.
Conflicts of Interest
T.R.B. has received support from NIH/NCATS grant UL1 TR001881 and NIH/NIMH grant P30 MH058107 in addition to funding outside the scope of this work from the Patient Centered Outcomes Research Institute and the Movember Foundation. M.A.S. has received contracts from Janssen Research & Development, LLC; Private Health Management, Inc.; the United States Department of Veteran Affairs; and the United States Food & Drug Administration and research grants from the National Institutes of Health, all outside the scope of this work. C.M.R. has received a contract from Private Health Management, Inc. outside the scope of this work.
Abbreviations
The following abbreviations are used in this manuscript:
CDC | Centers for Disease Control and Prevention |
COVID-19 | Coronavirus Disease 2019 |
D.C. | District of Columbia |
MC | Monte Carlo |
NCHS | National Center for Health Statistics |
NPR | National Public Radio |
PBS | Public Broadcasting Service |
RR | Rate Ratio |
SARS-CoV-2 | Severe Acute Respiratory Syndrome Coronavirus 2 |
U.S. | United States |
YPLL | Years of Potential Life Lost |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Figures
Figure 1. Conservative 95% interval estimates of the percentage of total COVID-19-attributable YPLL before age 75, the percentage of total COVID-19 deaths, and the percent population shares for males in the U.S. and in each of the 50 states and D.C. Quantities calculated with respect to cumulative COVID-19 deaths according to data from the National Center for Health Statistics as of 31 March 2021.
Figure 2. Conservative 95% interval estimates of the percentage of total COVID-19-attributable YPLL before age 75, the percentage of total COVID-19 deaths, and the percent population shares for females in the U.S. and in each of the 50 states and D.C. Quantities calculated with respect to cumulative COVID-19 deaths according to data from the National Center for Health Statistics as of 31 March 2021.
Figure 3. Conservative 95% interval estimates of the age-adjusted male-to-female YPLL and mortality RRs in the U.S. and in each of the 50 states and D.C. Quantities calculated with respect to cumulative COVID-19 deaths according to data from the National Center for Health Statistics as of 31 March 2021. States are ordered from top to bottom in descending order of the signed difference between the lower limit of the YPLL RR interval and the upper limit of the mortality RR interval.
© 2021 by the authors.
Abstract
Males are at higher risk relative to females of severe outcomes following COVID-19 infection. Focusing on COVID-19-attributable mortality in the United States (U.S.), we quantified and contrasted years of potential life lost (YPLL) attributable to COVID-19 by sex based on data from the U.S. National Center for Health Statistics as of 31 March 2021, specifically by contrasting male and female percentages of total YPLL with their respective percent population shares and calculating age-adjusted male-to-female YPLL rate ratios, both nationally and for each of the 50 states and the District of Columbia. Using YPLL before age 75 to anchor comparisons between males and females and a novel Monte Carlo simulation procedure to perform estimation and uncertainty quantification, our results reveal a near-universal pattern across states of higher COVID-19-attributable YPLL among males compared to females. Furthermore, the disproportionately high COVID-19 mortality burden among males is generally more pronounced when measuring mortality burden in terms of YPLL compared to death counts, reflecting dual phenomena of males dying from COVID-19 at higher rates and at systematically younger ages relative to females. The U.S. COVID-19 epidemic also offers lessons underscoring the importance of cultivating a public health environment that recognizes sex-specific needs as well as different patterns in risk factors, health behaviors, and responses to interventions between men and women. Public health strategies incorporating focused efforts to increase COVID-19 vaccinations among men are particularly urged.
1 Department of Biostatistics, Jonathan and Karin Fielding School of Public Health, University of California, 650 Charles E. Young Drive South, 51-254 CHS, Los Angeles, CA 90095, USA;
2 Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Harvard University, Cambridge, MA 02115, USA;
3 Department of Biostatistics, Jonathan and Karin Fielding School of Public Health, University of California, 650 Charles E. Young Drive South, 51-254 CHS, Los Angeles, CA 90095, USA;
4 Department of Biostatistics, Jonathan and Karin Fielding School of Public Health, University of California, 650 Charles E. Young Drive South, 51-254 CHS, Los Angeles, CA 90095, USA;