Background
China is a country with an extremely high disease burden of hepatitis B. Spatiotemporal analysis of hepatitis B from a socioeconomic perspective is of great significance for reducing the disease burden, but there is still a relative lack of research.
Methods
The age-period-cohort model and spatial distribution maps were used to describe the three-dimensional distribution characteristics of hepatitis B. Spatial autocorrelation analysis and spatiotemporal scanning were used to analyze the spatiotemporal distribution characteristics. The random forest algorithm was used to screen the potential influencing factors. The geographic detector model was used to analyze the interaction patterns of variables. Finally, a geographically and temporally weighted regression model was established to analyze the effects of variables on the incidence rate of hepatitis B at different spatiotemporal scales.
Results
From 2004 to 2023, a total of 20,376,898 cases of hepatitis B were reported in China. The incidence rate of hepatitis B decreased by 3.31% per year; hepatitis B vaccination has driven this downward trend, which was accompanied by a significant birth cohort effect. The incidence was spatially aggregated, which highlights the inequality of its geographical distribution. The population at the end of each year (q = 0.1949, where the q value measures the explanatory power of an independent variable for the dependent variable) and the proportion of rural population (q = 0.1895) had the strongest explanatory power for the incidence of hepatitis B, and their interaction had even stronger explanatory power (q = 0.5366). The magnitude and direction of the effects of the influencing factors varied across regions, and the factors did not act on the incidence of hepatitis B independently of one another.
Conclusions
The later people are born, the lower the incidence of hepatitis B. The northwest and southwest regions are the main hotspots, but there is a tendency to spread to southern China. The number of beds in medical institutions should be increased in densely populated areas, and economic development should be accelerated in sparsely populated areas. Hepatitis B prevention and control should be prioritized in geographic hotspots, coupled with enhanced awareness campaigns in rural areas and catch-up vaccination programs targeting high-risk populations.
Introduction
Hepatitis B is an infectious disease caused by the hepatitis B virus, with liver damage as its main manifestation, and is transmitted mainly through blood, sexual contact and mother-to-child transmission [1]. According to the World Health Organization, approximately 296 million people worldwide are infected with the hepatitis B virus, resulting in approximately 820,000 deaths. With about 1.5 million new infections each year, it has become an important public health problem affecting the health of the global population [2, 3]. Since the implementation of hepatitis B vaccination for newborns and preschool children in 1992, the number of new hepatitis B infections in China has declined. However, China still has the highest burden of hepatitis B disease in the world, accounting for nearly 25% of the total number of people chronically infected with the hepatitis B virus globally [4]. According to a study by Liu et al. [5], the number of chronic hepatitis B virus infections in China declined by 33.9% from 1990 to 2020. Modeling suggests that, at the current prevalence, 55.73 million people will be infected with hepatitis B in 2030, so more interventions are needed to reach the 2030 hepatitis B elimination goal.
Since 2002, China has included the hepatitis B vaccine in its childhood immunization program and provided hepatitis B vaccination to children free of charge. In 2015, the coverage rate of hepatitis B vaccination among children reached 99.6%, exceeding the target (90%) set by the World Health Organization (WHO) [6]. China's hepatitis B incidence rate declined significantly for the first time in 2009 and for the second time in 2012, probably because of the catch-up vaccination program for unimmunized children under the age of 15 from 2009 to 2011, which reached a cumulative total of more than 68 million children and greatly increased hepatitis B vaccine coverage in China [7]. Previous modeling studies of hepatitis B have mainly applied traditional statistical or mathematical models [8]. For example, Nadia Gul et al. [9] used a transmission dynamics model to simulate the transmission state of hepatitis B. However, such a model is extremely sensitive to its parameters, and different parameter values may lead to large deviations in the results. In addition, the parameters of some interventions cannot be quantified directly in the real world, and the spatial and temporal specificity of the distribution of hepatitis B in different regions is not taken into account. Spatial and temporal analyses have shown that factors such as population density and gross domestic product (GDP) may influence the incidence of hepatitis B. However, these studies have focused on the Beijing-Tianjin-Hebei region [10] and Xinjiang [11, 12], and few have covered the whole country. The incidence rate of hepatitis B differs significantly across regions and populations.
Therefore, analyzing the influencing factors of hepatitis B from the perspective of spatial and temporal analysis can help to understand the actual transmission of hepatitis B in different regions and take corresponding measures to better control the transmission of hepatitis B according to the actual situation [13, 14].
In this study, we first mapped the spatial distribution of hepatitis B incidence in China and used the age-period-cohort model to characterize the three-dimensional distribution of hepatitis B in China. The spatial distribution of hepatitis B in each region of China was revealed by calculating the global Moran's I index and by cold spot and hot spot analysis. The spatial aggregation of hepatitis B in different regions at different times was detected using spatiotemporal scanning. From the perspectives of sociology and economics, potential factors influencing the incidence rate of hepatitis B were selected, and the random forest algorithm was used for variable selection so that the analysis could focus on the more important factors. The geographic detector model was employed to assess the relationship between these factors and the incidence rate of hepatitis B and to quantify the impact of their interactions on the incidence rate. Finally, a geographically and temporally weighted regression model was used to explore the direction and intensity of the effects of these factors under different spatiotemporal conditions, providing scientific evidence for hepatitis B prevention and control.
Materials and methods
Study area and data sources
The study area for this research is the 31 provinces (autonomous regions and municipalities directly under the central government) in China, excluding Hong Kong, Macao, and Taiwan, using the provincial administrative unit as the smallest geographic unit for spatio-temporal analysis. China is categorized into seven regional types based on geographic subdivisions (Table 1).
[IMAGE OMITTED: SEE PDF]
In this study, monthly data on the number of hepatitis B cases and the incidence rate in each province and age group in China from 2004 to 2020 were obtained from the Public Science Data Center. The monthly numbers of hepatitis B cases in 2021–2023 were obtained from the National and Provincial Disease Control Bureaus. The incidence rate of hepatitis B is presented as the number of cases per 100,000 people. Eight variables potentially influencing the incidence rate of hepatitis B were selected from three dimensions (sociodemographic, economic, and healthcare) to build the random forest model. The sociodemographic dimension includes the rural population ratio in China (%), the birth rate (per 1,000 people), and the population at the end of each year (in 10,000 people). The economic dimension includes per capita GDP (yuan, RMB), the proportion of output value from the tertiary industry (%), and annual national health and medical expenditure (in hundred million yuan). The healthcare dimension includes the number of health technicians (per 1,000 people) and the number of hospital beds (in 10,000). The data on influencing factors were sourced from the China Statistical Yearbook and the China Health and Family Planning Statistical Yearbook [15, 16].
Statistical analysis
Age-period-cohort model
The age-period-cohort model is based on the Poisson distribution, and describes the long-term trend of population diseases over time under the condition of adjusting for age, period (continuous time periods composed of calendar years), and cohort at the same time. The model decomposes the data from the three dimensions of age, period and cohort, and analyzes the influence of these three factors on the disease size separately [17]. The age-period-cohort model based on logarithmic transformation was used in this study with the model expression:
$$\text{log}\left({E}_{ij}\right)=\text{log}\left({P}_{ij}\right)+\mu +{\alpha }_{i}+{\beta }_{j}+{\gamma }_{k}$$
where \({E}_{ij}\) denotes the expected number of hepatitis B cases in the i-th age group (5-year age groups in this study) in the j-th period and is assumed to follow a Poisson distribution; \(\text{log}\left({P}_{ij}\right)\) denotes the logarithm of the population size of the i-th age group in the j-th period; \(\mu\) is the intercept term; \({\alpha }_{i}\) is the age effect, indicating the risk of incidence in the i-th age group; \({\beta }_{j}\) is the period effect, indicating the risk of incidence in the j-th period (in this study, the number of periods is 3, and the range is from 2008 to 2018); \({\gamma }_{k}\) is the cohort effect, indicating the risk of incidence in the k-th birth cohort (in this study, the number of birth cohorts is 19, and the range is from 1926 to 2016). By default, the median age group and period are used as reference points for the calculations. Longitudinal age-specific incidence refers to the incidence of different age groups in a given birth cohort after adjusting for period effects. Cross-sectional age-specific incidence is the incidence of different age groups in a given period after adjusting for cohort effects. The relative risk (RR) for a period (or cohort) is the RR of that period (or cohort) compared with the reference period (or cohort) after adjusting for age and nonlinear cohort (or period) effects. The National Cancer Institute (NCI) web tool was used to estimate the age-period-cohort effects on hepatitis B incidence, which in turn were used to describe the age and time distribution of hepatitis B incidence in China [18]. Because the data for the population aged 85 and above form an open interval, the analysis was conducted on the population aged 0–84, with 5-year age groups. The age-period-cohort model requires that the age group intervals match the period intervals.
Because the available age-specific data span 2004 to 2020, we used the most recent 15 years of data (2006 to 2020), which divide evenly into three 5-year periods. In this study, P < 0.05 was considered statistically significant.
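As a hedged illustration of the model expression above, the following minimal Python sketch evaluates the expected number of cases for one age-period cell. The function names, the 0-based indexing, and the rule that maps age and period indices to a cohort index are assumptions made for this example; the study itself used the NCI web tool, whose internal parameterization is not reproduced here.

```python
import math

def cohort_index(age_idx, period_idx, n_age):
    """Map 0-based age and period indices to a 0-based birth-cohort index.

    Assumes equal-width age and period groups, so that the oldest age group
    in the earliest period belongs to the earliest cohort (index 0).
    """
    return period_idx - age_idx + (n_age - 1)

def expected_cases(population, mu, alpha, beta, gamma, age_idx, period_idx):
    """Evaluate log(E) = log(P) + mu + alpha_i + beta_j + gamma_k and return E."""
    k = cohort_index(age_idx, period_idx, len(alpha))
    log_e = (math.log(population) + mu
             + alpha[age_idx] + beta[period_idx] + gamma[k])
    return math.exp(log_e)

# Seventeen 5-year age groups (0-84) and three 5-year periods, as in the text.
N_AGE, N_PERIOD = 17, 3
N_COHORT = N_AGE + N_PERIOD - 1  # overlapping birth cohorts
```

With 17 age groups and 3 periods, the number of overlapping birth cohorts in this sketch is 17 + 3 - 1 = 19, which agrees with the number of cohorts reported in the text.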
Spatial autocorrelation analysis
Spatial autocorrelation is used to test whether there is an association between attribute values in a spatial neighborhood, and the degree of aggregation of attributes of spatial units can be expressed through spatial autocorrelation analysis [19]. The commonly used statistic for global spatial autocorrelation is the global Moran's I index, which reflects the spatial clustering pattern of the entire study area and measures the overall degree of spatial aggregation in the data. Its calculation formula is as follows:
$$I=\frac{n\sum_{i=1}^{n}\sum_{j=1}^{n}{w}_{ij}({x}_{i}-\overline{x })({x}_{j}-\overline{x })}{\left(\sum_{i=1}^{n}\sum_{j=1}^{n}{w}_{ij}\right)\sum_{i=1}^{n}{({x}_{i}-\overline{x })}^{2}}$$
where n is the sample size; \({x}_{i}\) and \({x}_{j}\) denote the attribute values of the i-th and j-th spatial units, respectively; \({w}_{ij}\) is an element of the spatial weight matrix, which in this study equals 1 if units i and j are adjacent and 0 otherwise; \(\overline{x }\) is the mean of the observations. In this study, a global Moran's I index greater than 0 indicates a positive spatial correlation between the incidence rates of hepatitis B across regions; an index less than 0 indicates a negative spatial correlation; and an index equal to 0 indicates no spatial correlation.
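The computation of the global Moran's I index can be sketched in a few lines of Python. This is a minimal sketch, not the 'ape' implementation used in the study: it assumes a binary adjacency matrix and the standard normalization by the sum of all weights, and the function name and toy data are hypothetical.

```python
import numpy as np

def global_morans_i(x, w):
    """Global Moran's I for values x and an n-by-n spatial weight matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()          # deviations from the mean
    s0 = w.sum()              # sum of all spatial weights
    return (x.size / s0) * (z @ w @ z) / (z @ z)

# Toy example: four regions arranged in a line (0-1-2-3), binary adjacency.
w_line = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]], dtype=float)
clustered = global_morans_i([10.0, 12.0, 30.0, 33.0], w_line)    # similar neighbours
alternating = global_morans_i([10.0, 30.0, 10.0, 30.0], w_line)  # dissimilar neighbours
```

In this toy layout, neighbouring regions with similar rates give a positive index and an alternating pattern gives a negative one, matching the interpretation given above.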
The local Moran’s I index is calculated to determine the location of the aggregation as well as the pattern of aggregation [20]. The formula for its calculation is:
$${I}_{i}=\frac{n({x}_{i}-\overline{x })\sum_{j=1}^{n}{w}_{ij}({x}_{j}-\overline{x })}{\sum_{j=1}^{n}{({x}_{j}-\overline{x })}^{2}}$$
where \({I}_{i}\) represents the local Moran's I index of the i-th region. When the local Moran's I index is significant, four types of spatial association are possible: high-high aggregation areas, high-low aggregation areas, low-high aggregation areas, and low-low aggregation areas. For regions that pass the significance test, a Local Indicators of Spatial Association (LISA) map can be used to show clearly the areas and types of aggregation.
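The four aggregation types can be illustrated with a short Python sketch that computes the local index for each region and labels the quadrant from the signs of a region's deviation and of its neighbours' summed deviation. This is only a sketch under stated assumptions: the function name and toy data are hypothetical, the weights are binary adjacency, and the significance test (usually a permutation test) that decides which labels appear on a LISA map is not included.

```python
import numpy as np

def local_morans_i(x, w):
    """Return local Moran's I values and unfiltered quadrant labels."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                      # deviation of each region
    lag = w @ z                           # summed deviation of its neighbours
    local_i = x.size * z * lag / (z @ z)
    labels = []
    for zi, li in zip(z, lag):
        if zi > 0 and li > 0:
            labels.append("HH")           # high value, high neighbours
        elif zi < 0 and li < 0:
            labels.append("LL")           # low value, low neighbours
        elif zi > 0 and li < 0:
            labels.append("HL")           # high value, low neighbours
        elif zi < 0 and li > 0:
            labels.append("LH")           # low value, high neighbours
        else:
            labels.append("none")
    return local_i, labels

# Toy example: four regions in a line with binary adjacency.
w_line = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]], dtype=float)
local_i, labels = local_morans_i([10.0, 12.0, 30.0, 33.0], w_line)
```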
Cold spot and hot spot analysis is used to study the clustering patterns and locations of high-value or low-value elements in local space. The \({G}_{i}^{*}\) statistic is calculated for each element in the dataset to determine the clustering state of high-value or low-value elements in space [20, 21]. To be a statistically significant hot spot, an element should have a high value and be surrounded by other elements with similarly high values. When the standardized \({G}_{i}^{*}\) statistic (z-score) is significantly positive, a hot spot area is displayed on the map, whereas when it is significantly negative, a cold spot area is displayed. The calculation formula is as follows:
$${G}_{i}^{*}=\frac{{\sum }_{j}^{n}{w}_{ij}{x}_{j}}{{\sum }_{j}^{n}{x}_{j}}$$
where n is the total number of spatial units, \({w}_{ij}\) is an element of the spatial weight matrix, and \({x}_{j}\) is the attribute value of the j-th spatial unit. The symbols in this part are the same as those used in the calculation of the Moran's I index. In cold spot and hot spot analysis, cold spots and hot spots are usually identified at the 90%, 95%, and 99% confidence levels to show their distribution under different significance levels. In the spatial autocorrelation analysis, the global Moran's I index was calculated using R version 4.3.2 ('ape' package), while the remaining calculations and plotting were performed using ArcGIS 10.8.
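The following Python sketch computes the \({G}_{i}^{*}\) ratio given above together with the standardized z-score that is commonly mapped as hot or cold spots. It is a sketch under assumptions rather than the ArcGIS implementation: the function name and toy data are hypothetical, weights are binary adjacency with each unit counted as its own neighbour, and the z-score uses the widely documented standardization of the statistic.

```python
import numpy as np

def getis_ord_gi_star(x, w):
    """Return (ratio, z) for the Getis-Ord Gi* statistic of every unit."""
    x = np.asarray(x, dtype=float)
    w_star = np.asarray(w, dtype=float).copy()
    np.fill_diagonal(w_star, 1.0)            # the "star" version includes unit i itself
    n = x.size
    x_bar = x.mean()
    s = np.sqrt((x ** 2).mean() - x_bar ** 2)
    wx = w_star @ x                           # weighted sum around each unit
    w_sum = w_star.sum(axis=1)
    w2_sum = (w_star ** 2).sum(axis=1)
    ratio = wx / x.sum()                      # the unstandardized form in the text
    z = (wx - x_bar * w_sum) / (s * np.sqrt((n * w2_sum - w_sum ** 2) / (n - 1)))
    return ratio, z

# Toy example: five regions in a line; rates rise from left to right.
w_line5 = np.zeros((5, 5))
for a in range(4):
    w_line5[a, a + 1] = w_line5[a + 1, a] = 1.0
gi_ratio, gi_z = getis_ord_gi_star([10.0, 12.0, 20.0, 33.0, 35.0], w_line5)
```

In this toy example the low-rate end of the line receives a negative z-score and the high-rate end a positive one; whether either counts as a cold or hot spot depends on the chosen confidence level.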
Spatiotemporal scanning analysis
Spatiotemporal scanning is an extension of spatial scanning that considers both spatial clustering and temporal distribution patterns, providing a more comprehensive understanding of how events are distributed over time. The goal of spatiotemporal scanning analysis is to identify events that are significantly clustered in both time and space. A cylindrical scanning window slides across different regions and times, and the theoretical (expected) number of cases is calculated for each area and time period covered by the window. The log-likelihood ratio and RR are then calculated from the actual and theoretical numbers of cases to detect the spatiotemporal clustering of diseases [22]. This study used SaTScan 10.1 to perform a retrospective spatiotemporal scan analysis of the incidence of hepatitis B in China from January 2004 to December 2023 on a month-by-month basis, using a discrete Poisson model and a cylindrical scanning window. Statistical significance was assessed with 999 Monte Carlo simulations. The RR, a standard epidemiological measure of risk, was used to compare the risk inside each cluster with the risk outside it.
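For a single candidate window, the quantities described above can be sketched in Python as follows. This is a minimal sketch of the commonly published form of the discrete Poisson scan statistic, not SaTScan's code: the function names are hypothetical, the expected count is assumed to be scaled so that expected counts over the whole study sum to the total number of cases, and the Monte Carlo p-value uses the usual rank-based rule.

```python
import math

def poisson_scan_llr(observed, expected, total):
    """Log-likelihood ratio for one window under the discrete Poisson model."""
    inside = observed * math.log(observed / expected) if observed > 0 else 0.0
    out_obs = total - observed
    out_exp = total - expected
    outside = out_obs * math.log(out_obs / out_exp) if out_obs > 0 else 0.0
    return inside + outside

def window_relative_risk(observed, expected, total):
    """Risk inside the window divided by the risk outside it."""
    return (observed / expected) / ((total - observed) / (total - expected))

def monte_carlo_p_value(observed_llr, simulated_llrs):
    """Rank-based p-value, e.g. from 999 replicates simulated under the null."""
    exceed = sum(1 for s in simulated_llrs if s >= observed_llr)
    return (exceed + 1) / (len(simulated_llrs) + 1)
```

For example, a window with 150 observed and 100 expected cases out of 1,000 in total has a relative risk of 1.5 / (850 / 900), or about 1.59; and if none of 999 simulated maxima reaches the observed value, the p-value is 0.001.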
Random forest model and geographic detector model
The random forest model is a machine learning algorithm based on an ensemble of decision trees, known for its strong ability to handle complex and variable factors with high interpretability. It is relatively robust to multicollinearity and overfitting, making it suitable for screening the influencing factors of hepatitis B [23]. The geographic detector is a spatial analysis method used to detect spatial heterogeneity and explain the underlying driving forces. Its basic concept is that if an independent variable has a significant impact on the dependent variable, then the spatial distributions of the independent and dependent variables should be similar [24]. The study uses the factor detection and interaction detection modules of the geographic detector. Factor detection identifies statistically significant independent variables and their explanatory power for the dependent variable, which is quantified through the \(q\)-value [25]. The calculation formula is as follows:
$$q=1-\frac{{\sum }_{h=1}^{L}{N}_{h}{\sigma }_{h}^{2}}{N{\sigma }^{2}}$$
where \(q\) represents the explanatory power of the independent variable for the dependent variable, with a range of [0,1]; h = 1, ..., L indexes the layers (strata) of the independent or dependent variable; L is the number of strata of the detection factor; \({N}_{h}\) and \({\sigma }_{h}^{2}\) are the number of units and the variance within the h-th layer, respectively; N and \({\sigma }^{2}\) are the total number of units and the variance in the entire study area. Five discretization methods, namely equal-interval classification, natural breaks classification, quantile classification, geometric interval classification, and standard deviation classification, were compared to determine the optimal discretization method for each variable. Then, a geographic detector model was established. After factor detection, the interaction detector was used to explore the interactions between different independent variables. The modes of interaction include nonlinear weakening, single-factor nonlinear weakening, two-factor enhancement, independence, and nonlinear enhancement [26]. The random forest model was used to screen the factors that may influence hepatitis B. The importance of the variables was assessed based on the increase in mean squared error, and the significance of each variable was calculated using permutation testing. Relatively important variables were selected and incorporated into the geographic detector model for analysis. When screening variables using the random forest, the default of 500 trees in the R function was adopted. The number of candidate variables randomly sampled for each split was the integer part of one-third of the total number of independent variables. The minimum sample size of 1, which is also the default in R, was used for the terminal nodes. When testing the significance of variable importance, 500 trees were also used, and the permutation test was repeated 100 times to make the results more stable.
In this study, both the geographical detector model and the random forest model were constructed using R 4.3.2. The geographical detector model was implemented through the 'GD' package, and the random forest model was implemented through the 'randomForest', 'rfPermute' and 'rfUtilities' packages.
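To make the \(q\)-value and the interaction modes concrete, the following Python sketch computes \(q\) for a stratified variable and classifies an interaction from the single-factor and joint \(q\)-values. It is a sketch under assumptions, not the 'GD' package: the function names are hypothetical, population variances are used, and the classification follows the comparison rules usually stated for the geographic detector (joint \(q\) compared with the minimum, the maximum, and the sum of the two single-factor values).

```python
import math
import numpy as np

def geodetector_q(y, strata):
    """q = 1 - sum_h(N_h * var_h) / (N * var) for one stratified factor."""
    y = np.asarray(y, dtype=float)
    strata = np.asarray(strata)
    total = y.size * y.var()
    within = sum(y[strata == h].size * y[strata == h].var()
                 for h in np.unique(strata))
    return 1.0 - within / total

def joint_strata(strata_a, strata_b):
    """Overlay two stratifications; q of the result is the interaction q."""
    return np.array([f"{a}|{b}" for a, b in zip(strata_a, strata_b)])

def classify_interaction(q1, q2, q12):
    """Name the interaction mode from single-factor and joint q-values."""
    if math.isclose(q12, q1 + q2, abs_tol=1e-9):
        return "independence"
    if q12 > q1 + q2:
        return "nonlinear enhancement"
    if q12 > max(q1, q2):
        return "two-factor enhancement"
    if q12 > min(q1, q2):
        return "single-factor nonlinear weakening"
    return "nonlinear weakening"
```

Applied to the values reported in the abstract (0.1949 and 0.1895 for the single factors, 0.5366 for their interaction), the joint value exceeds the sum 0.3844, so this rule labels the pair as nonlinear enhancement.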
Geographically and temporally weighted regression
Geographically and temporally weighted regression is a local linear regression model that accounts for temporal and spatial non-stationarity simultaneously. Because its regression coefficients vary over both time and space, it provides a unified explanation of the spatiotemporal differentiation mechanisms of geographic phenomena [27]. Geographically and temporally weighted regression combines the original geographic coordinates with time, forming a three-dimensional spatiotemporal coordinate \(\left({u}_{i},{v}_{i},{t}_{i}\right)\) that consists of two spatial coordinates and time. The bandwidth of the model was determined adaptively according to the corrected Akaike Information Criterion (AICc). The mathematical expression of the model is as follows:
$${y}_{i}={\beta }_{0}\left({u}_{i},{v}_{i},{t}_{i}\right)+{\sum }_{k=1}^{p}{\beta }_{k}\left({u}_{i},{v}_{i},{t}_{i}\right){x}_{ik}+{\varepsilon }_{i}$$
where \({y}_{i}\) is the observed value, \(\left({u}_{i},{v}_{i},{t}_{i}\right)\) represents the spatiotemporal three-dimensional coordinates of the i-th sample point, and \({\beta }_{k}\left({u}_{i},{v}_{i},{t}_{i}\right)\) is the regression coefficient of the k-th independent variable for the i-th sample point, which is estimated using weighted least squares. \({\varepsilon }_{i}\) is the random error for the i-th sample point, with \({\varepsilon }_{i}\sim N\left(0,{\sigma }^{2}\right)\), and the random errors of different sample points are independent with a covariance of 0. The geographically and temporally weighted regression model is computed using GTWR v1.1 developed by Huang et al. [28]. Significant factors identified in the geographic detector model were subjected to geographically and temporally weighted regression analysis, where the hepatitis B incidence rate was used as the dependent variable and the remaining variables were used as independent variables to build the model. The geographically and temporally weighted regression model requires that there is no multicollinearity between the explanatory variables, so it is necessary to conduct a multicollinearity test on the variables. A variance inflation factor less than 10 is considered to indicate the absence of multicollinearity.
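The weighted least squares step for a single sample point can be sketched as follows. This is an illustrative Python sketch, not the GTWR v1.1 software: the function name is hypothetical, a fixed bandwidth replaces the adaptive AICc-selected bandwidth used in the study, and the spatiotemporal distance (scale factors `lam` and `mu` on the squared spatial and temporal differences) and the Gaussian kernel are assumed forms.

```python
import numpy as np

def gtwr_local_coefficients(X, y, coords, times, i, bandwidth, lam=1.0, mu=1.0):
    """Estimate the local coefficients [beta_0, beta_1, ...] at sample point i."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    coords = np.asarray(coords, dtype=float)
    times = np.asarray(times, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])          # add the intercept
    # Assumed squared spatiotemporal distance from point i to every point.
    d2 = (lam * ((coords - coords[i]) ** 2).sum(axis=1)
          + mu * (times - times[i]) ** 2)
    weights = np.exp(-d2 / bandwidth ** 2)                   # Gaussian kernel
    xtw = design.T * weights                                 # X^T W
    return np.linalg.solve(xtw @ design, xtw @ y)            # (X^T W X)^-1 X^T W y
```

Repeating this estimate for every sample point yields one coefficient vector per province and year, which is what allows the direction and magnitude of each factor's effect to differ across space and time.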
Results
Analysis of the three-dimensional distribution of hepatitis B
From 2004 to 2023, a total of 20,376,898 cases of hepatitis B were reported in China; the annual incidence rate and number of cases of hepatitis B are shown in Fig. 1. From 2004 to 2007, the incidence of hepatitis B in China showed an increasing trend. After 2009, it began to decline gradually. A second drop occurred in 2012, and the incidence remained at a lower level during the three years of the COVID-19 pandemic. In 2023, however, the incidence rate increased again. The age-period-cohort model was used to analyze the hepatitis B incidence rate in China (Fig. 2).
[IMAGE OMITTED: SEE PDF]
[IMAGE OMITTED: SEE PDF]
In Fig. 2A, the net drift represents the overall annual percentage change of age groups adjusted for time. The results show that the net drift of the hepatitis B incidence rate in China was −3.31% per year (95% CI: −3.94%, −2.67%), indicating that, overall, the incidence rate of hepatitis B in China decreased at an average rate of 3.31% per year. Local drift represents the annual percentage change for each age group. The hepatitis B incidence rate for age groups under 45 years shows a declining trend, with the largest decrease occurring in the 10–15 age group. For age groups over 45, the hepatitis B incidence rate shows an increasing trend. Figure 2B shows that the longitudinal age-specific incidence rate (the trend in incidence with increasing age) of hepatitis B in China first decreases and then rises gradually after the age of 15. It peaks at 25 years of age and then gradually declines and stabilizes. In Fig. 2C, the cross-sectional age-specific incidence rate (the trend in incidence across age groups in a given period) of hepatitis B in China increases with age. The period effect (Fig. 2D) shows that the risk of hepatitis B incidence in China gradually decreased over time. The cohort effect (Fig. 2E) shows that the risk of hepatitis B incidence in China gradually increased with birth year before 1971 and decreased with birth year after 1971.
The maps of the annual incidence rate of hepatitis B by province in China from 2004 to 2023 are shown in Fig. 3. Before 2012, the high incidence of hepatitis B in China was mainly concentrated in Northwest China and North China, with a tendency to spread to South China. After 2012, the incidence of hepatitis B in North China gradually decreased, while the incidence in South China and Central China gradually increased, so that these regions became high-incidence areas together with Northwest China. Qinghai Province has always had a high incidence of hepatitis B. In Gansu Province and Henan Province, the incidence rate fell sharply in 2012: from 119.81 to 46.02 per 100,000 people in Gansu Province, and from 140.14 to 80.01 per 100,000 people in Henan Province.
[IMAGE OMITTED: SEE PDF]
Analysis of spatial heterogeneity
Global spatial autocorrelation analysis. A spatial weight matrix was used to calculate the global Moran's I index of hepatitis B incidence in China from 2004 to 2023 (Table 2). The global Moran's I index is greater than 0 in all years (P < 0.05), indicating positive spatial autocorrelation and therefore spatial aggregation in the incidence of hepatitis B in China. The local Moran's I index was calculated for the incidence of hepatitis B in China, and LISA clustering maps were drawn (Fig. 4). The pattern of aggregation of hepatitis B incidence in China changed over time from 2004 to 2023. The Xinjiang Uygur Autonomous Region showed high-low aggregation before 2010 and no spatial aggregation thereafter. In Guangdong Province, a pattern of high-high aggregation began in 2015. North China showed high-low aggregation from 2015 to 2019 and changed to low-low aggregation from 2020 onwards. In the cold spot and hot spot analysis, cold spot and hot spot areas were identified at the 90%, 95%, and 99% confidence levels (Fig. 5). China's Northwest and Southwest regions are perennial hot spot regions, while North China, Northeast China, and East China are perennial cold spot regions.
[IMAGE OMITTED: SEE PDF]
[IMAGE OMITTED: SEE PDF]
[IMAGE OMITTED: SEE PDF]
Spatiotemporal scan analysis
The results of the spatiotemporal scan analysis are shown in Fig. 6. From February 2004 to August 2012, high-risk clusters were observed in Northwest China (Shaanxi, Gansu, Ningxia Hui Autonomous Region, Qinghai), Southwest China (Chongqing, Sichuan, Yunnan), Central China (Hubei, Henan), and North China (Shanxi, Hebei), with an aggregation radius of 768.91 km and an RR of 1.70 (P < 0.001). From February 2013 to January 2023, low-risk clustering was observed in East China (Shanghai, Jiangsu and Zhejiang), with an aggregation radius of 239.21 km and an RR of 0.28 (P < 0.001). From January 2014 to December 2023, high-risk clustering was observed in South China (Guangdong, Guangxi, Hainan), East China (Fujian, Jiangxi), and Central China (Hunan), with an aggregation radius of 692.52 km and an RR of 1.56 (P < 0.001).
[IMAGE OMITTED: SEE PDF]
Random forest and geographic detection analysis
Six statistically significant variables were retained after random forest screening: the population at the end of each year (in 10,000 people), the number of beds in health care facilities (in 10,000), the rural population proportion (%), the tertiary industry proportion (%), the number of health technicians per 1,000 population, and per capita GDP (yuan, RMB). The increase in mean squared error for each variable is shown in Fig. 7.
[IMAGE OMITTED: SEE PDF]
The factor detection results showed that the explanatory power of all variables was statistically significant (Table 3). The population at the end of each year had the strongest explanatory power (q = 0.1949), followed by the proportion of rural population (q = 0.1895), and the number of beds in healthcare facilities had the weakest (q = 0.1044). Interaction detection shows that the interactions of the population at the end of each year and of the number of beds in healthcare facilities with the other variables are nonlinear enhancements, while the interactions among the remaining variables are two-factor enhancements (Table 4). This indicates that the six factors do not act independently: the explanatory power for the incidence of hepatitis B was enhanced whenever two factors interacted, and the strongest interaction was that between the population at the end of each year and the proportion of rural population (q = 0.5366).
[IMAGE OMITTED: SEE PDF]
[IMAGE OMITTED: SEE PDF]
Geographically and temporally weighted regression analysis
The variance inflation factors were 5.64 for the population at the end of each year, 5.89 for the number of beds in health care institutions, 3.87 for the proportion of rural population, 2.81 for the proportion of the tertiary industry, 5.22 for the number of health care technicians per 1,000 population, and 4.35 for per capita GDP. Since every variable has a variance inflation factor of less than 10, it can be assumed that there is no multicollinearity.
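The multicollinearity check can be illustrated with a short Python sketch that computes each variance inflation factor as 1 / (1 − R²), where R² comes from regressing one explanatory variable on all the others. The function name and the synthetic data are hypothetical, and the sketch is not the software used in the study.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of every column of the n-by-p predictor matrix X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = []
    for k in range(p):
        target = X[:, k]
        # Regress column k on an intercept plus all remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, k, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        centered = target - target.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Synthetic illustration: the third column nearly duplicates the first.
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.01 * rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
```

In this synthetic example the two nearly identical columns receive very large factors and would fail the threshold of 10, while the unrelated column stays close to 1.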
A Gaussian kernel function was used to calculate the spatiotemporal weights, and the spatiotemporal distance ratio was 0.2688. The model's coefficient of determination was 0.8493, and the adjusted coefficient of determination was 0.8478, which indicated that the model fit well and was stable. Because of the large amount of data, only the regression coefficients for the even years in 2004–2023 are shown; the distribution of the regression coefficients of each explanatory variable is shown in Supplementary Figs. 1 to 6. The tests of the regression coefficients show that all coefficients differ significantly from zero.
Discussion
The incidence of hepatitis B is increasing in people over 45 years of age, which warrants attention and additional measures for this group. In terms of the cross-sectional age-specific incidence rate, the incidence is higher in people over 65 years of age, suggesting that older people account for a larger proportion of hepatitis B cases in China. These individuals may be at higher risk of developing hepatitis B because they were not vaccinated against the disease at birth. Similar studies have suggested that vaccination of the elderly is more important in areas where hepatitis B vaccination programs are available, because prevention among the elderly tends to be relaxed [29]. It has also been noted that hepatitis B carriage rates are high among older people who have been infected with hepatitis B, and that prevention among older people should be given more emphasis [30]. In terms of the longitudinal age-specific incidence rate, the incidence is higher in people under 5 years of age and in people aged 15–30 years. In children, hepatitis B is mainly contracted through vertical transmission. Mothers with hepatitis B should receive highly effective antiviral drugs during pregnancy to interrupt mother-to-child transmission, the risk of interruption failure should be assessed in a timely manner, and children should be immunized with hepatitis B vaccine and hepatitis B immune globulin after birth [31]. For people aged 15–30 years, a survey of hepatitis B in Uganda found that hepatitis B is mainly transmitted through blood and sex in this age group, and the same is probably true in China [32]. A survey in Shandong Province, China, showed that the prevalence of hepatitis B peaked in people aged 20–29 years, which is consistent with the results of this study, and that the mode of transmission in this group is mainly sexual [33].
For this group, hepatitis B antibody testing should be conducted regularly, and hepatitis B vaccine should be given promptly to those with declining antibody levels, along with non-pharmacological interventions. For example, publicizing hepatitis B prevention and distributing free condoms to this group would help reduce the incidence [34].
In terms of spatial distribution, the incidence of hepatitis B in China shows spatial aggregation, which indicates variability in the geographic distribution of hepatitis B and may be related to the hepatitis B vaccination rate. One study showed that the hepatitis B vaccination rate differs across landscapes [35]. Cold-spot and hot-spot analysis showed that the hot-spot areas of hepatitis B were mainly distributed in the northwest and southwest regions of China. LISA maps also showed that these areas were mainly high-low clustered, suggesting that the incidence rate in these areas was relatively high while that in the neighboring areas was low. Northeast China is the main cold-spot area, which may be related to the high vaccination rate in the region. One study reported that in 1999 the qualified vaccination rate for the full three-dose hepatitis B vaccine series under the immunization program in Jilin Province had reached 88.7% [36]. The LISA map of North China, adjacent to the northeast, shows a high-low aggregation pattern. This may be due to the high population density and high mobility in North China, for example in Beijing and Tianjin, which may be the main factor affecting the agglomeration pattern [10]. Since 2012, the hotspots have gradually shifted towards South China. Guangdong Province and Hainan Province, which are part of South China, also fall within the high-risk areas. The GDP of South China has been growing rapidly since 2011, indicating that this region has entered a period of rapid development. According to one study [37], in 2011 the net migration rate of population flow in Guangdong Province, which is in South China, ranked among the highest in China, and its growth rate was also relatively rapid.
According to the disease control report of Guangdong Province, in the survey conducted in 1992 the HBV infection rate among the population of Guangdong Province was 75.3% and the HBsAg carrier rate was 17.85%, ranking first in the country. The sudden, large increase in population flow and contact has had a significant impact on the development of hepatitis B in South China. Hainan Province, as a tourism-oriented provincial administrative region, also has a high level of population mobility. As a result, there is frequent contact among people, leading to a relatively high transmission risk. The Guangxi Zhuang Autonomous Region has always been an area with a high incidence of hepatitis B. In the early days, the incidence rate of hepatitis B in Guangxi ranked among the highest of the reported incidence rates of various infectious diseases [38]. After the state implemented the immunization program for hepatitis B vaccine, the vaccination coverage rate for newborns in each province increased significantly. However, the combined effects of large-scale population mobility, economic development, and a high hepatitis B infection rate within the population base have created favorable conditions for the gradual spatial shift of hepatitis B toward South China. Therefore, more efforts should be made to prevent and control hepatitis B in this region, and a strategy of providing free hepatitis B vaccination for adults should be implemented. In recent years, Jiangsu, Shanghai, and Zhejiang in East China have been low-prevalence areas, whereas Fujian Province has been a high-risk area, indicating that the prevalence of hepatitis B is affected not only by geographic location but also by other factors. This study found that provinces neighboring Guangdong Province were at high risk from January 2014 to December 2023, which may reflect the impact of the hepatitis B epidemic in Guangdong Province on these neighboring provinces.
Guangdong Province is an economically developed province whose residents are in close contact with people from other provinces, especially neighboring ones, and its ongoing economic development further increases contact between people. More measures should be taken to reduce the incidence of hepatitis B in Guangdong Province, which may in turn reduce the incidence in the neighboring provinces [38,39,40].
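The LISA cluster labels used in this discussion, such as "high-low", can be read as a comparison between a region's own deviation from the mean and the average deviation of its neighbors. The sketch below illustrates only that labeling step; the region names, rates, and neighbor lists are hypothetical, and it omits the permutation-based significance test that a full LISA analysis applies before mapping a cluster.

```python
def lisa_quadrants(rates, neighbors):
    """Label each region by the sign of its own deviation from the mean
    and the sign of its neighbors' average deviation (row-standardized lag).

    Every region in `neighbors` must have at least one neighbor.
    """
    mean = sum(rates.values()) / len(rates)
    z = {region: value - mean for region, value in rates.items()}
    labels = {}
    for region, nbrs in neighbors.items():
        lag = sum(z[n] for n in nbrs) / len(nbrs)  # mean neighbor deviation
        own = "high" if z[region] > 0 else "low"
        around = "high" if lag > 0 else "low"
        labels[region] = f"{own}-{around}"
    return labels


# Hypothetical incidence rates (per 100,000) and adjacency lists.
rates = {"A": 10.0, "B": 2.0, "C": 3.0, "D": 9.0}
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["B"], "D": ["A"]}
labels = lisa_quadrants(rates, neighbors)
```

In this toy example region A has an above-average rate but below-average neighbors, so it is labeled "high-low", the same pattern described for the northwest and southwest hot spots.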
Analyzing the factors influencing disease from a spatial perspective not only provides an understanding of how these factors act on disease as a whole, but also shows the spatial heterogeneity of their effects [41]. Where population density is high, there is more human contact and therefore a higher risk of infection. Compared with people in urban areas, people in rural areas may lack appropriate health care knowledge and awareness of hepatitis B transmission routes. In addition, according to a survey of hepatitis B vaccination rates among children born between July 1, 1994 and June 30, 1995 in 10 provinces in China, the vaccination rate was 82.9% in urban areas but only 33% in rural areas, a gap that also has a considerable impact on hepatitis B transmission [42, 43]. Higher levels of economic development may lead to more money being spent on health care and disease prevention, which may reduce the incidence of hepatitis B [44]. This study found that, among the six factors, population density was the most important factor influencing the incidence of hepatitis B, followed by the proportion of rural population, suggesting that these two factors play an important role in the transmission of hepatitis B. This differs from the results of the study by Xu et al. on Beijing-Tianjin-Hebei, probably because Beijing-Tianjin-Hebei is an economically developed region, whereas the present study covers China as a whole, which has greater geographic variation [10]. The interaction detection results show that the influencing factors of hepatitis B in China did not act independently and were mainly characterized by nonlinear enhancement and two-factor enhancement. Among them, the interaction between year-end population size and the proportion of rural population was the most influential. This emphasizes the importance of considering the demographic structure in hepatitis B prevention and control strategies.
Although these macro-level factors are difficult to intervene in directly, they can be addressed indirectly by optimizing the allocation of rural healthcare resources and by increasing awareness of hepatitis B prevention and control, as well as vaccination rates, among the rural population.
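To make the q values and interaction types discussed above concrete, the following sketch computes a geographical detector q statistic as one minus the ratio of within-stratum to total sum of squares, and classifies an interaction by comparing the joint q with the two single-factor q values. The function names and the toy data are hypothetical, the category wording follows the terms used in this article, and the sketch is not the software used for the analysis.

```python
from collections import defaultdict


def q_statistic(values, strata):
    """Factor detector q = 1 - (within-stratum sum of squares / total sum of squares)."""
    n = len(values)
    mean = sum(values) / n
    total_ss = sum((v - mean) ** 2 for v in values)
    groups = defaultdict(list)
    for value, stratum in zip(values, strata):
        groups[stratum].append(value)
    within_ss = 0.0
    for group in groups.values():
        group_mean = sum(group) / len(group)
        within_ss += sum((v - group_mean) ** 2 for v in group)
    return 1.0 - within_ss / total_ss


def interaction_type(q1, q2, q12, tol=1e-9):
    """Classify an interaction from single-factor q values and the joint q."""
    low, high, total = min(q1, q2), max(q1, q2), q1 + q2
    if abs(q12 - total) <= tol:
        return "independent"
    if q12 < low:
        return "nonlinear weakening"
    if q12 < high:
        return "single-factor nonlinear weakening"
    if q12 < total:
        return "two-factor enhancement"
    return "nonlinear enhancement"


# Toy data: strata that separate the values perfectly give q = 1,
# strata that explain nothing give q = 0.
q_perfect = q_statistic([1, 1, 5, 5], ["a", "a", "b", "b"])
q_none = q_statistic([1, 5, 1, 5], ["a", "a", "b", "b"])

# q values reported in the abstract for year-end population, proportion of
# rural population, and their interaction.
reported = interaction_type(0.1949, 0.1895, 0.5366)
```

With the reported values the joint q of 0.5366 exceeds the sum of the single-factor values (0.3844), so the function returns "nonlinear enhancement", consistent with the interaction pattern described in the text.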
Studies have shown a clear spatial and temporal trend in the factors influencing the incidence of hepatitis B [45]. In Northwest China, emphasis should be placed on improving the level of medical care and per capita income. Northeast China should increase income levels, raise population density, and promote urbanization to reduce the proportion of rural population. The results for East China show that the influencing factors work differently in Fujian Province than in the rest of the region, which may be related to Fujian's location near Guangdong Province, the center of economic development. Fujian Province should prioritize improving healthcare, while the rest of the region should focus on raising per capita income. Central China, which shows less spatial and temporal variation, should reduce the incidence of hepatitis B mainly by improving per capita income and healthcare. In South China, the influencing factors work differently in Guangdong Province than in the other two provinces, probably because Guangdong's pace of economic development differs from theirs [46]. Guangdong Province needs to increase the number of beds in its medical institutions, while the other provinces should enrich their medical resources and accelerate urbanization. Southwest China should focus on increasing per capita income, promoting urbanization, and building a better health-care system to ensure that all residents have access to better-quality health care.
This study has some limitations. First, it analyzed only the socio-economic and demographic factors influencing hepatitis B and did not consider the effect of hepatitis B vaccination. Second, although the surveillance case data used in this study are the most complete data available, some underreporting may remain and lead to estimation errors.
Conclusions
China has achieved significant results in controlling hepatitis B. However, the incidence of hepatitis B shows an age effect and a birth cohort effect, as well as significant spatial and temporal aggregation. The incidence is highest in people under 5 years of age, aged 15–30 years, and over 65 years, and more interventions should be implemented in these populations. The later people are born, the lower the incidence of hepatitis B. The spatial and temporal distribution of the incidence of hepatitis B in China is characterized by clustering, with population density and the proportion of rural population being important factors affecting the incidence. Moreover, the effects of the influencing factors shift across space and time. Hepatitis B should be prevented and controlled using targeted measures based on the characteristics of different regions. This study provides new evidence on hepatitis B prevention and control from a socioeconomic perspective, and the results may help policy makers develop targeted prevention and control strategies to reduce the disease burden of hepatitis B in China.
Data availability
No datasets were generated or analysed during the current study.
Lee HM, Banini BA. Updates on chronic HBV: current challenges and future goals. Curr Treat Options Gastro. 2019;17:271–91.
Pan Y, Xia H, He Y, Zeng S, Shen Z, Huang W. The progress of molecules and strategies for the treatment of HBV infection. Front Cell Infect Microbiol. 2023;13: 1128807.
Sheena BS, Hiebert L, Han H, Ippolito H, Abbasi-Kangevari M, Abbasi-Kangevari Z, et al. Global, regional, and national burden of hepatitis B, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet Gastroenterol Hepatol. 2022;7:796–829.
Li M, Zhao L, Zhou J, Sun Y, Wu X, Ou X, et al. Changing clinical care cascade of patients with chronic hepatitis B in Beijing, China. Lancet Regional Health Western Pacific. 2021;16: 100249.
Liu Z, Li M, Hutton DW, Wagner AL, Yao Y, Zhu W, et al. Impact of the national hepatitis B immunization program in China: a modeling study. Infect Dis Poverty. 2022;11:106.
Bai X, Chen L, Liu X, Tong Y, Wang L, Zhou M, et al. Adult hepatitis B virus vaccination coverage in China from 2011 to 2021: a systematic review. Vaccines. 2022;10: 900.
Liu J, Liu M. [Progress and challenges in achieving the WHO goal on “Elimination of Hepatitis B by 2030” in China]. Zhonghua Liu Xing Bing Xue Za Zhi. 2019;40:605–9.
Xie J, Wang X, Wang X, Li J, Jie Y, Hao Y, et al. Assessing the impact of comorbid type 2 diabetes mellitus on the disease burden of chronic hepatitis B virus infection and its complications in China from 2006 to 2030: a modeling study. Glob Health Res Policy. 2024;9:5.
Gul N, Bilal R, Algehyne EA, Alshehri MG, Khan MA, Chu Y-M, et al. The dynamics of fractional order Hepatitis B virus model with asymptomatic carriers. Alex Eng J. 2021;60:3945–55.
Xu CD, Xiao GX, Li JM, Cao HX. Spatiotemporal patterns and risk factors concerning hepatitis B virus infections in the Beijing–Tianjin–Hebei area of China. Epidemiol Infect. 2019;147: e110.
Wang Y, Xie N, Li F, Wang Z, Ding S, Hu X, et al. Spatial age-period-cohort analysis of hepatitis B risk in Xinjiang from 2006 to 2019. Front Public Health. 2023;11: 1171516.
Wang Y, Xie N, Wang Z, Ding S, Hu X, Wang K. Spatio-temporal distribution characteristics of the risk of viral hepatitis B incidence based on INLA in 14 prefectures of Xinjiang from 2004 to 2019. MBE. 2023;20:10678–93.
Fei W, Xiao T, Li X, Yang F. Knowledge about hepatitis B and influencing factors among the residents in Qingdao: a cross-sectional study. J Infect Dev Ctries. 2023;17:1030–6.
Stroffolini T, Mele A, Tosti ME, Gallo G, Balocchini E, Ragni P, et al. The impact of the hepatitis B mass immunisation campaign on the incidence and risk factors of acute hepatitis B in Italy. J Hepatol. 2000;33:980–5.
Statistical Yearbook of Health in China. http://www.nhc.gov.cn/mohwsbwstjxxzx/tjzxtjsj/tjsj_list.shtml. Accessed 7 Mar 2025.
China Statistical Yearbook - National Bureau of Statistics. https://www.stats.gov.cn/sj/ndsj/. Accessed 7 Mar 2025.
Ding Y, Chen X, Zhang Q, Liu Q. Historical trends in breast cancer among women in China from age-period-cohort modeling of the 1990–2015 breast cancer mortality data. BMC Public Health. 2020;20:1280.
Rosenberg PS, Check DP, Anderson WF. A web tool for age–period–cohort analysis of cancer incidence and mortality rates. Cancer Epidemiol Biomark Prev. 2014;23:2296–302.
Chen Y. An analytical process of spatial autocorrelation functions based on Moran’s index. PLoS ONE. 2021;16: e0249589.
Ren H, Shang Y, Zhang S. Measuring the spatiotemporal variations of vegetation net primary productivity in Inner Mongolia using spatial autocorrelation. Ecol Ind. 2020;112: 106108.
Wang S, Liu H, Pu H, Yang H. Spatial disparity and hierarchical cluster analysis of final energy consumption in China. Energy. 2020;197: 117195.
Kiani B, Raouf Rahmati A, Bergquist R, Hashtarkhani S, Firouraghi N, Bagheri N, et al. Spatio-temporal epidemiology of the tuberculosis incidence rate in Iran 2008 to 2018. BMC Public Health. 2021;21:1093.
Borup D, Christensen BJ, Mühlbach NS, Nielsen MS. Targeting predictors in random forest regression. Int J Forecast. 2023;39:841–68.
Song Y, Wang J, Ge Y, Xu C. An optimal parameters-based geographical detector model enhances geographic characteristics of explanatory variables for spatial heterogeneity analysis: cases with different types of spatial data. GIScience Remote Sensing. 2020;57:593–610.
Anselin L. Local Indicators of Spatial Association—LISA. Geogr Anal. 1995;27:93–115.
Zhao Y, Deng Q, Lin Q, Zeng C, Zhong C. Cadmium source identification in soils and high-risk regions predicted by geographical detector method. Environ Pollut. 2020;263: 114338.
Hu J, Zhang J, Li Y. Exploring the spatial and temporal driving mechanisms of landscape patterns on habitat quality in a city undergoing rapid urbanization based on GTWR and MGWR: the case of Nanjing, China. Ecol Ind. 2022;143: 109333.
Huang B, Wu B, Barry M. Geographically and temporally weighted regression for modeling spatio-temporal variation in house prices. Int J Geogr Inf Sci. 2010;24:383–401.
Loustaud-Ratti V, Jacques J, Debette-Gratien M, Carrier P. Hepatitis B and elders: An underestimated issue. Hepatol Res. 2016;46:22–8.
Kondo Y, Tsukada K, Takeuchi T, Mitsui T, Iwano K, Masuko K, et al. High carrier rate after hepatitis B virus infection in the elderly. Hepatology. 1993;18:768–74.
Cheung KW, Lao TT-H. Hepatitis B - vertical transmission and the prevention of mother-to-child transmission. Best Pract Res Clin Obstet Gynaecol. 2020;68:78–88.
Ebong D. The prevalence of hepatitis b viral infection among patient age 15–35 year attending health services at Ngetta HCIV Lira District, Uganda. 2017.
Liu J, Lv J, Yan B, Feng Y, Song L, Xu A, et al. Comparison between two population-based hepatitis B serosurveys with an 8-year interval in Shandong Province, China. Int J Infect Dis. 2017;61:13–9.
Nguyen MH, Wong G, Gane E, Kao JH, Dusheiko G. Hepatitis B virus: advances in prevention, diagnosis, and therapy. Clin Microbiol Rev. 2020;33. https://doi.org/10.1128/cmr.00046-19.
Zhu X, Zhang X, Chai F, Wang K. The hepatitis B (HB) vaccine coverage rate in 10 Chinese provinces and its influence factors. Chin J Vaccines Immun. 1998:31–5.
Lin L, Wang J, Liu G, Song T, Chen C, Zheng W, et al. A Survey on Inoculation Rates of Four EPI Vaccines and HB Vaccine among Children in 1999 in Jilin Province. Chin J Vaccines Immun. 2000;41–3.
Wang D, Zhu B. Research on the impact of population mobility on the development level of digital economy. J Huizhou University. 2024;44:1–11.
Shen LP, Yang JY, Mo ZJ, Tan Y, Li RC, Wu XL, et al. The surveillance of incidence status of hepatitis B in Guangxi Province, 1996–2005. Chin J Dis Control Prev. 2007;11:210–1.
Xiao J, Lin H, Liu T, Zeng W, Li X, Shao X, et al. Disease burden from hepatitis B virus infection in Guangdong Province, China. Int J Environ Res Public Health. 2015;12:14055–67.
Li W, Zhang X. Guangdong-Hong Kong-Macau Greater Bay Area’s Construction promotes the Economic Development of Guangdong in the New era. E3S Web Conf. 2021;235:02014.
Lin C-H, Wen T-H. How spatial epidemiology helps understand infectious human disease transmission. Trop Med Infect Dis. 2022;7:164.
Liu J, Wang X, Wang Q, Qiao Y, Jin X, Li Z, et al. Hepatitis B virus infection among 90 million pregnant women in 2853 Chinese counties, 2015-2020: a national observational study. Lancet Reg Health – West Pac. 2021;16:100267.
Xu X, Wu C, Jiang L, Peng C, Pan L, Zhang X, et al. Cost-effectiveness of hepatitis B mass screening and management in high-prevalent Rural China: a model study from 2020 to 2049. Int J Health Policy Manag. 2022;11:2115–23.
Zeng D-Y, Li J-M, Lin S, Dong X, You J, Xing Q-Q, et al. Global burden of acute viral hepatitis and its association with socioeconomic development status, 1990–2019. J Hepatol. 2021;75:547–56.
Li J, Xu Z, Zhu H. Spatial-temporal analysis and spatial drivers of hepatitis-related deaths in 183 countries, 2000–2019. Sci Rep. 2023;13:19845.
Pan W, Xie T, Wang Z, Ma L. Digital economy: An innovation driver for total factor productivity. J Bus Res. 2022;139:303–11.
© 2025. This work is licensed under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.