Health Care, Medical Insurance, and Economic Destitution: A Dataset of 1042 Stories

1. Summary

This paper presents a comprehensive dataset of inpatients’ financial conditions, their demographic information, opinions about treatment, and hospital fees. The survey, which was conducted from August 2014 to March 2016, strictly conformed to the ethical standards of the International Committee of Medical Journal Editors (ICMJE) Recommendations, the World Medical Association (WMA) Declaration of Helsinki, and Decision 460/QD-BYT by the Vietnamese Ministry of Health. The survey process was long due to the sensitive nature of the research. The survey team approached and gradually asked the patients and/or patients’ families about sensitive matters related to their financial situation and their attitudes and behaviors regarding the hospital and treatment process, such as bribery or length of stay. In some instances, the process took up to three to four weeks due to emotional instability on the part of the patient or their family. Eventually, 1042 records were collected. Smaller subsets have been derived from the dataset and analyzed to explore health insurance issues [1], health care payments, financial destitution [2,3,4], and satisfaction with healthcare services [5].

The submitted dataset provides the full 1042 observations and the entire set of coded variables. In addition, the article presents a demonstration analysis using a Bayesian statistics approach. The comprehensive information in the dataset and the new method are expected to give health economics researchers resources for investigating healthcare and health insurance services in transitional economies such as Vietnam.

In the Data Description section, we explain in detail the coded variables and propose some potential research questions that might be explored using the dataset. Then, the employed methods and examples of analysis are shown in the Methods section. Finally, the article concludes with the limitations and implications of the dataset.

2. Data Description

The dataset includes 1042 records of patients’ demographic information, financial status, opinions about treatment, and hospital fees. Previously, smaller datasets of 330 and 900 records extracted from this dataset were used to explore health insurance and healthcare services [1,2,5] in addition to the financial burden of patients [2,3,4] in Vietnam. The current dataset, which has not been published before, presents all of the records with all measured variables. There are 15 categorical (discrete) variables and 15 numerical (continuous) variables. Some of these variables were used to derive others. For instance, the numerical variable “Income” was used to construct the categorical variable “IncRank”. Details of the categorical variables can be found in Table 1.
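As an illustration of how a categorical variable can be derived from a numerical one, the following R sketch recodes annual income (million VND) into the three “IncRank” bands listed in Table 1. The function name `income_to_rank` and the handling of the boundary values 48 and 180 (both assigned to “Middle”) are assumptions made for illustration; the recoding procedure actually used for the dataset is not described in the text.

```r
# Hypothetical helper: map annual income (million VND) to the IncRank bands
# of Table 1: Low (<48), Middle (48-180), High (>180).
# Assumption: the boundary values 48 and 180 both fall in "Middle".
income_to_rank <- function(income) {
  rank <- ifelse(income > 180, "High",
                 ifelse(income >= 48, "Middle", "Low"))
  factor(rank, levels = c("Low", "Middle", "High"))
}

# Example incomes, including the minimum (0), mean (40.67) and maximum (550)
# reported for "Income" in Table 2
example_income <- c(0, 40.67, 48, 180, 550)
example_rank <- income_to_rank(example_income)
print(table(example_rank))
```

Keeping the result as a factor with ordered levels makes it straightforward to tabulate or to use as a predictor in the regression models discussed in the Methods section.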

Table 2 shows the explanation and simple statistical description for numerical variables.

In Figure 1 and Figure 2, visualizations of the variables “Burden” and “IfHigher” are shown. Figure 1 confirms the intuitive observation that lower-income patients tended to have a higher financial burden, while the total medical expenditures and daily costs rose according to the degree of the financial burden. This result indicated a finance–health dilemma for low-income patients in Vietnam.

Figure 2 shows that the income of male patients was relatively higher than that of female patients, while the total medical expenditures and average daily costs for both males and females were relatively similar. The implication is clear: female patients faced a greater financial risk than their male counterparts.

Figure 3 shows the distribution of patients’ ages on a histogram, which was created using the numerical variable ‘Age.’ Most patients ranged from late teens to early 60s with people in their 50s representing the highest percentage.

Since its economic reforms, Vietnam’s health care system has experienced major changes, which have greatly affected the delivery and financing of health services [6,7]. Several issues related to efficiency and equity have been raised. Doctor visits and drugs are relatively expensive for many households [8]. In addition, travel costs and the time required for treatment may add to the financial burden and interrupt income during the treatment period.

Low-income households usually spend a higher percentage of their monthly income on health services than wealthier households. As a result, the risk of being destitute seems to be higher among poor households [9]. This dataset can, therefore, provide evidence and trends regarding the financing methods of Vietnamese patients in health services.

Table 3 shows some potential research questions and hypotheses that can be examined by employing this dataset. Several research questions and hypotheses have already been explored using smaller datasets [1,2,3,4,5].

3. Methods

3.1. Data Collection

To collect the data, 1042 patients from a number of hospitals in the northern region of Vietnam were surveyed using questionnaires. The surveyed hospitals were major hospitals in the region, such as Viet Duc Hospital and Bach Mai Hospital in Hanoi, Viet Tiep Hospital and Kien An Hospital in Haiphong, and Uong Bi Hospital in Quang Ninh. Further details can be seen in the dataset. The survey strictly conformed to the ethical standards of the ICMJE Recommendations, the WMA Declaration of Helsinki, and Decision 460/QD-BYT by the Vietnamese Ministry of Health. A total of 330 records were collected during the first phase, from 10 August 2014 to February 2015. More records were obtained from February to May 2015, raising the total number of observations to 900. The third and final phase ended in March 2016, with the final set of 1042 patient records.

The survey took 20 months to finish due to the sensitive nature of the research. For instance, there were cases in which the survey team had to approach the patients or families four to five times over the course of four weeks in order to collect one questionnaire. Some patients or their family members became too emotional to finish the survey when they thought about the severity of the illness.

Raw data from the collected questionnaires were entered into an Excel file, 1042data.xlsx (see the dataset). The data were then edited and saved in CSV format for analysis in the R statistical software (v3.5.3). Both frequentist and Bayesian statistics approaches were explored in the data analysis.

3.2. Frequentist Analysis

The analysis used the baseline-category logits (BCL) model [10]. Because the current dataset was a combination of discrete and continuous variables, logistic regression was a suitable method for examining the independence or association among variables. Using the estimated coefficients, the logistic model could estimate the probability of each value of the response variable given the values of the explanatory variables.

The common equation of the logistic model is as follows:

$\log (\frac{π_{j} (x)}{π_{J} (x)}) = α_{j} + β_{j}^{T} x, j = 1, \dots, J - 1,$

where $π_{j} (x) = P (Y = j | x)$, with Y as the response variable, is the probability of response category j given the explanatory variables x.

The probability of each response category was calculated as follows:

$π_{j} (x) = \frac{\exp (α_{j} + β_{j}^{T} x)}{1 + \sum_{h = 1}^{J - 1} \exp (α_{h} + β_{h}^{T} x)} .$
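For completeness, the probability of the baseline category J follows from the requirement that the J probabilities sum to one. The step below is a standard consequence of the model above rather than something stated explicitly in the article:

```latex
\pi_{J}(x) = 1 - \sum_{j=1}^{J-1} \pi_{j}(x)
           = \frac{1}{1 + \sum_{h=1}^{J-1} \exp(\alpha_{h} + \beta_{h}^{T} x)} .
```

Equivalently, the baseline category can be viewed as having its intercept and coefficients fixed at zero, so that its linear predictor contributes the term 1 to the denominator.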

The current article employs the analysis used in [2], which estimated the probability of the type of Burden by using the 330-observation dataset. This time, the model was re-run using the full 1042-observation dataset. Table 4 reports the results obtained from the estimations.

The analysis was executed by using the following R commands:

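# Load packages: nnet provides multinom(), stargazer formats the results
library(nnet)
library(stargazer)
# Set "Yes" as the reference level of both predictors
data1$Res <- relevel(data1$Res, ref = "Yes")
data1$Insured <- relevel(data1$Insured, ref = "Yes")
# Fit the baseline-category logit model for Burden
logit_burden <- multinom(Burden ~ Res + Insured, data = data1)
# Print the coefficient table and save it to a file
stargazer(logit_burden, type = "text", out = "logit_burden.htm")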

Additional R commands can be found in CodeR.txt (see the dataset). The resulting coefficients were then used to construct Equations (1)–(3), corresponding to each logit model respectively, as follows:

(1) $\log (\frac{\hat{π}_{B}}{\hat{π}_{A}}) = - 1.291 + 1.784 \, \mathit{NonRes} + 1.601 \, \mathit{UnInsured},$

(2) $\log (\frac{\hat{π}_{C}}{\hat{π}_{A}}) = - 2.599 + 3.801 \, \mathit{NonRes} + 1.635 \, \mathit{UnInsured},$

(3) $\log (\frac{\hat{π}_{D}}{\hat{π}_{A}}) = - 6.561 + 4.163 \, \mathit{NonRes} + 2.401 \, \mathit{UnInsured} .$
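The way Equations (1)–(3) translate into probabilities can be sketched in base R. The coefficients below are copied from the equations; the helper name `burden_probs` is hypothetical, and the sketch assumes that NonRes and UnInsured are 0/1 indicators (1 = non-resident, 1 = uninsured), which is consistent with “Yes” being the reference level in the commands above.

```r
# Coefficients copied from Equations (1)-(3); baseline category is "A".
coefs <- rbind(
  B = c(intercept = -1.291, NonRes = 1.784, UnInsured = 1.601),
  C = c(intercept = -2.599, NonRes = 3.801, UnInsured = 1.635),
  D = c(intercept = -6.561, NonRes = 4.163, UnInsured = 2.401)
)

# Hypothetical helper: probabilities of outcomes A-D for given 0/1 indicators
# (assumption: 1 = non-resident / uninsured, 0 = resident / insured).
burden_probs <- function(non_res, un_insured) {
  logits <- coefs[, "intercept"] +
    coefs[, "NonRes"] * non_res +
    coefs[, "UnInsured"] * un_insured
  expo <- exp(logits)
  # Baseline A gets 1 / denominator; B, C, D get exp(logit) / denominator
  c(A = 1 / (1 + sum(expo)), expo / (1 + sum(expo)))
}

# All four combinations of residency and insurance status
grid <- expand.grid(NonRes = 0:1, UnInsured = 0:1)
probs <- t(mapply(burden_probs, grid$NonRes, grid$UnInsured))
print(round(cbind(grid, probs), 3))
```

Each row of the printed table sums to one, and comparing rows shows how the estimated probability mass shifts away from outcome A when a patient is a non-resident or uninsured.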

The probabilities corresponding to each burden outcome were also calculated for each combination of residency and insurance status. The results are shown in Figure 4.

The dataset showed a similar decreasing trend in the probabilities of destitution for both long-time and short-time hospitalization (see Figure 5). It also confirmed that a longer hospital stay increased the risk of falling into destitution [5].

3.3. Bayesian Analysis

In this section, we use a Bayesian statistics approach to examine the dataset, in the hope that it brings a fresh perspective. One strength of the Bayesian approach is its capacity to visualize the results and the distributions of the coefficients. Moreover, the Bayesian approach allows for a robustness check of the model through an analysis of prior sensitivity. If the model is not sensitive to adjustments of the prior, this provides robust evidence for its credibility [11,12,13,14].
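The idea behind a prior sensitivity check can be sketched with a deliberately simplified conjugate example in base R. This is not the procedure used in the article: it assumes a normal mean with a known standard deviation, for which the posterior mean has a closed form, so the effect of widening or narrowing the prior can be seen directly. The sample size matches the dataset (1042), but the sample mean, the standard deviation, and the prior widths are hypothetical values chosen for illustration.

```r
# Simplified illustration (assumption: normal likelihood with known sd and a
# normal(0, prior_sd) prior on the mean; not the model fitted in the article).
posterior_mean <- function(ybar, n, sd, prior_sd) {
  data_precision  <- n / sd^2        # information contributed by the data
  prior_precision <- 1 / prior_sd^2  # information contributed by the prior
  (data_precision * ybar) / (data_precision + prior_precision)
}

n_obs     <- 1042            # number of records in the dataset
ybar      <- 2               # hypothetical sample mean
sd_known  <- 0.65            # hypothetical known standard deviation
prior_sds <- c(1, 10, 100)   # hypothetical prior widths to compare

post_means <- sapply(prior_sds,
                     function(s) posterior_mean(ybar, n_obs, sd_known, s))
print(data.frame(prior_sd = prior_sds, posterior_mean = post_means))
```

When the posterior means barely move across prior widths, as is expected with this many observations, the estimate is driven by the data rather than by the prior.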

The R statistical software and the BayesVL package (v0.6) were used to construct a regression model of the patients’ and their families’ financial situation after paying for treatment (“burden”) on where the patients reside (“res”) and whether they were insured (“insured”) [13,14,15,16]. Similar applications of Bayesian statistics can be found in [11,12]. The BayesVL package is available in [17].

The mathematical formulation of the model is as follows:

burden[i] = α + β_res * res[i] + β_insured * insured[i].
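Read together with the STAN code given later in this section, the formula above describes the mean of a normal likelihood. A fuller statement of the model, reconstructed from that code (the subscripted notation is an assumption made for illustration), is:

```latex
\begin{aligned}
\mathit{burden}_i &\sim \mathrm{Normal}(\mu_i, \sigma), \\
\mu_i &= \alpha + \beta_{res}\,\mathit{res}_i + \beta_{insured}\,\mathit{insured}_i, \\
\alpha,\ \beta_{res},\ \beta_{insured} &\sim \mathrm{Normal}(0, 100).
\end{aligned}
```

Here the second argument of each normal distribution is a standard deviation, following the STAN convention, and the code assigns no explicit prior to σ.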

The BayesVL package (v0.6) was used to design the model, generate the STAN code for the model, and fit the model. Examples of the R code used to construct the model are as follows:

# Design the model
model <- bayesvl()
model <- bvl_addNode(model, "burden", "norm")
model <- bvl_addNode(model, "res", "norm")
model <- bvl_addNode(model, "insured", "norm")
model <- bvl_addArc(model, "res", "burden", "slope")
model <- bvl_addArc(model, "insured", "burden", "slope")

# Generate the stan code for model
model_string <- bvl_model2Stan(model)
cat(model_string)

# Fit the model
fit <- bvl_modelFit(model, data1, warmup = 2000, iter = 20000, chains = 4, cores = 1)

Moreover, the STAN code that was used for model sampling and parameter learning is as follows:

data {
  int<lower = 0> Nobs; //number of observations
  vector[Nobs] y; //dependent variable (burden)
  vector[Nobs] res; //independent variable 1
  vector[Nobs] insured; //independent variable 2
}
parameters {
  real alpha; //intercept
  real b_res; //coefficient for res
  real b_insured; //coefficient for insured
  real sigma; //residual standard deviation
}
model {
  alpha ~ normal(0,100); //priors for the intercept and coefficients
  b_res ~ normal(0,100);
  b_insured ~ normal(0,100);
  y ~ normal(alpha + b_res * res + b_insured * insured, sigma); //model
}
generated quantities {
  vector[Nobs] log_lik; //pointwise log-likelihood
  for(i in 1:Nobs) {
    log_lik[i] = normal_lpdf(y[i] | alpha + b_res * res[i] + b_insured * insured[i], sigma);
  }
}

The regression model using the Bayesian approach provided the following results:

4 chains, each with iter = 5000; warmup = 1000; thin = 10;
post-warmup draws per chain = 400, total post-warmup draws = 1600.

	mean	se_mean	sd	2.5%	25%	50%	75%	97.5%	n_eff	Rhat
alpha	4.08	0	0.09	3.90	4.02	4.08	4.14	4.24	1485	1
b_res	−1.03	0	0.04	−1.12	−1.06	−1.03	−1.01	−0.95	1502	1
b_insured	−0.33	0	0.05	−0.43	−0.37	−0.33	−0.30	−0.24	1610	1
sigma	0.65	0	0.01	0.62	0.64	0.65	0.66	0.67	1763	1
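The draw counts in the header of this output follow from the sampler settings. A minimal R check of the arithmetic, using only the values printed above (the variable names are illustrative):

```r
# Sampler settings as printed in the output header
chains <- 4
iter   <- 5000
warmup <- 1000
thin   <- 10

# Post-warmup draws kept per chain after thinning, and in total
draws_per_chain <- (iter - warmup) / thin
total_draws     <- chains * draws_per_chain
cat("per chain:", draws_per_chain, " total:", total_draws, "\n")
```

The effective sample sizes (n_eff) reported in the table are close to this total of retained draws.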

In the mathematical form:

burden ~ 4.08 − 1.03 * res − 0.33 * insured.

As shown above, both slope coefficients were negative. The coefficient for residency (−1.03) was larger in magnitude than the coefficient for insurance (−0.33), which suggested that where patients reside was more strongly associated with their financial burden after paying for treatment than whether they were insured. The posterior distribution of all coefficients is presented in Figure 6.

The Hamiltonian Markov chain Monte Carlo (MCMC) technical validations for the model using the STAN code are shown in Figure 7. The MCMC simulation in STAN contained 4 Markov chains with 5000 iterations.

In the model, the correlation coefficients’ posterior distributions are shown in Figure 8.

Finally, the simulated parameter pairs of “insured” and “res” are shown in Figure 9.

4. Conclusions

This data descriptor article presents a comprehensive dataset on the situations and opinions of inpatients regarding the cost of treatment at the hospital, and the application of both the frequentist and Bayesian statistics approaches in data analysis. Smaller subsets extracted from this dataset were the backbone of five different health economics publications, which contributed to the literature on healthcare, health insurance, patients’ satisfaction with the hospital, and their financial destitution. The public availability of the full dataset and the introduction of a Bayesian method will enable health economics researchers to explore more issues and draw further insights. Furthermore, the previous and upcoming findings based on this dataset have supported and will continue to inform healthcare policy-makers in making grounded policies that will help inpatients [18].

We acknowledge that the dataset only reflects the situation in the northern region of Vietnam and the mindset of people in this region. In different areas with different economic contexts, specific findings may not hold. However, the value of this dataset lies not only in its records but also in its design logic, its use of coded variables, and its potential for replication and expansion. Therefore, we hope scholars from Vietnam and worldwide will breathe new life into the dataset. We believe researchers from different backgrounds will be able to explore many aspects of this dataset, for example from comparative perspectives.

Author Contributions

Conceptualization: Q.-H.V., K.-C.P.N.; methodology: Q.-H.V., V.-P.L.; formal analysis: M.-T.H., M.-H.N., V.-P.L., T.T.; data curation: K.-C.P.N., M.-T.H., M.-H.N., V.-P.L., H.-K.T.N.; writing—original draft preparation: M.-T.H., M.-H.N., T.-T.V.; writing—review and editing: M.-T.H., T.-T.V., H.-K.T.N.; visualization: M.-H.N., V.-P.L., T.T.; supervision: Q.-H.V., K.-C.P.N.; project administration: Q.-H.V., M.-T.H.

Funding

This research received no external funding.

Acknowledgments

The authors would like to thank the staff of Vuong & Associates, especially Dam Thu Ha, Do Thu Hang, Vuong Ha My, and Ho Manh Tung, for their support in logistics and research. We would also like to show our appreciation to the personnel of the hospitals, the patients and their families.

Conflicts of Interest

The authors declare no conflict of interest.

Figures and Tables

Figure 1. The level of “Income”, “Spent”, and “Dcost” according to the types of “Burden” of the patient.

Figure 2. The level of “Income”, “Spent”, and “Dcost” according to the types of “IfHigher” of the patients.

Figure 3. A histogram for the distribution of patients’ age.

Figure 4. The probabilities were computed corresponding to the status of burden outcomes based on the conditions of residency and insurance. Recreated from the idea in [4]. Note: minimally affected (A), adversely affected (B), destitute (C), adversely destitute (D).

Figure 5. The probabilities of destitution corresponding to both long-time and short-time hospitalization based on the conditions of residency and insurance. Recreated from the idea in [4]. Note: destitution with long-time hospitalization (DestLong) and destitution with short-time hospitalization (DestShort).

Figure 6. The regression model’s posterior distribution of all coefficients. Note: HPDI: Highest Posterior Density Interval.

Figure 7. The Hamiltonian Markov chain Monte Carlo (MCMC) technical validations for the simulation model.

Figure 8. The correlation coefficients’ posterior distribution.

Figure 9. Simulated parameter pairs of “insured” and “res”.

Table 1

Categorical variables.

Coded Name	Explanation	Items	Total Freq	Total %	Male Freq	Male %	Female Freq	Female %
Res	Whether the patient lives in the same region as the hospital.	Yes	578	55.5	323	55.9	255	44.1
		No	464	44.5	289	62.3	175	37.7
Stay	How long the patient stays at the hospital: under 10 days (S) or more than 10 days (L).	Long	289	27.7	175	60.6	114	39.4
		Short	753	72.3	437	58.0	316	42.0
Insured	Whether the patient has valid insurance or not.	Yes	724	69.5	406	56.1	318	43.9
		No	318	30.5	206	64.8	112	35.2
Edu	The highest educational level of the patient: junior high school (JHS), high school (HS), university (Uni), or graduate school (Grad).	JHS	141	13.5	79	56.0	62	44.0
		HS	705	67.7	426	60.4	279	39.6
		Uni	194	18.6	105	54.1	89	45.9
		Grad	2	0.2	2	100.0	0	0.0
SES	The socioeconomic status of the patient. This variable was based on IncRank (the ranking of the patient’s income) or that of the patient’s guardian(s) if required.	Hi	38	3.6	20	52.6	18	47.4
		Med	908	87.1	535	58.9	373	41.1
		Low	96	9.2	57	59.4	39	40.6
Illness	The seriousness of the patient’s illness or injury. In the dataset, the variable “Ill2” combined two values “ill” and “light” into one value “light” for analysis.	Emergency	285	27.4	204	71.6	81	28.4
		Bad	520	49.9	293	56.3	227	43.7
		Ill	221	21.2	105	47.5	116	52.5
		Light	16	1.5	10	62.5	6	37.5
Jcond	The condition of the patient’s employment.	Stable	513	49.2	300	58.5	213	41.5
		Unstable	335	32.1	212	63.3	123	36.7
		Unemployed	99	9.5	52	52.5	47	47.5
IncRank	The ranking of the patient’s income. Unit: million VND (Vietnamese Dong).	High (>180)	8	0.8	4	50.0	4	50.0
		Middle (48–180)	241	23.1	139	57.7	102	42.3
		Low (<48)	793	76.1	469	59.1	324	40.9
AvgCost	The average cost that the patient spent daily during treatment. Unit: million VND (Vietnamese Dong).	High (>5.4)	159	15.3	110	69.2	49	30.8
		Medium (1.5 to 5.4)	432	41.5	255	59.0	177	41.0
		Low (≤1.5)	451	43.3	247	54.8	204	45.2
InsL	The categories of the amount that insurance covered. It is based on the numerical variable “Pins”, which is the portion of fees covered by insurance reimbursement.	A (>0.45)	546	52.4	318	58.2	228	41.8
		B (>0.25 and ≤0.45)	105	10.1	45	42.9	60	57.1
		C (≤0.25)	65	6.2	35	53.8	30	46.2
		N.E. (=0)	326	31.3	214	65.6	112	34.4
EnvL	The portion of “extra thank-you money” that the patient had to include in the medical fees.	High (>15%)	108	10.4	37	34.3	71	65.7
		Medium (7%–15%)	158	15.2	99	62.7	59	37.3
		Low (<7%)	464	44.5	294	63.4	170	36.6
		Nil (0)	312	29.9	182	58.3	130	41.7
Burden	The self-reported evaluation of the patient’s and family’s financial situation after paying treatment fees: minimally affected (A), adversely affected (B), destitute (C), adversely destitute (D).	A	442	42.4	232	52.5	210	47.5
		B	275	26.4	161	58.5	114	41.5
		C	312	29.9	213	68.3	99	31.7
		D	13	1.2	6	46.2	7	53.8
End	The outcome of treatment: recovered (A), need follow-up treatment (B), stopped in the middle (C), and quit early (D).	A	539	51.7	273	50.6	266	49.4
		B	394	37.8	259	65.7	135	34.3
		C	47	4.5	31	66.0	16	34.0
		D	62	6.0	49	79.0	13	21.0
SatIns	The patient’s satisfaction level regarding health insurance.	Satisfied	118	11.3	61	51.7	57	48.3
		Average	613	58.8	344	56.1	269	43.9
		Low	1	0.1	1	100.0	0	0.0
		No Comment	274	26.3	178	65.0	96	35.0
IfHigher	The self-reported evaluation of the patient’s and family’s financial situation if the patient continues treatment. The values of this variable are the same as “Burden”.	A	185	17.8	80	43.2	105	56.8
		B	641	61.5	391	61.0	250	39.0
		C	187	17.9	123	65.8	64	34.2
		D	29	2.8	18	62.1	11	37.9

Table 2

Numerical variables.

Coded Name	Explanation	Unit	Mean	Standard Deviation	Min	Max
Age	The patient’s age.	Age	45.43	17.96	1	92
Days	The number of days the patient stays in for treatment.	Day	8.97	5.99	1	60
MaxIns	The highest level of insurance coverage.	Percent	0.60	0.42	0	1.00
Saving	The portion of savings.	Percent	0.18	1.99	0	60.00
WkYrs	The number of years the patient has worked.	Year	20.6	15.85	0	60
Income	The annual income of the patient.	Million VND (Vietnamese Dong)	40.67	39.04	0	550.00
Dcost	The cost of staying at the hospital for a day.	Million VND (Vietnamese Dong)	3.07	3.76	0.03	50.33
Spent	The amount of money the patient actually spent.	Million VND (Vietnamese Dong)	27.85	42.40	0.10	665.00
Pins	The portion of fees financed by insurance reimbursement.	Percent	0.41	0.33	0	0.90
Pinc	The portion of fees financed by income.	Percent	0.50	0.33	0	1.00
Pchar	The portion of fees financed by a charity.	Percent	0.02	0.09	0	1.00
Ploan	The portion of fees financed by a loan.	Percent	0.07	0.17	0	1.00
Streat	The portion of funds used for treatment.	Percent	0.82	0.13	0.17	1.00
Srel	The portion of funds used for paying relatives who came to help.	Percent	0.12	0.10	0	0.83
Senv	The portion of funds used for “extra thank-you money” or for bribing doctor/staff.	Percent	0.06	0.07	0	0.60

Table 3

Research questions and hypotheses.

• What are the effects of socio-demographic factors on the probability of being destitute?

• To what extent are socio-demographic factors the determinants of the degree of illness?

• What is the impact of hospitalization length on patients’ financial burden?

• How do the treatment costs and illness explain the end outcome of treatment?

• How does the amount of out-of-pocket “extra thank-you money” determine the end outcome of treatment?

Table 4

Rechecking the probability of the type of “Burden”.

	Intercept	Resident (No)	Insured (No)
	$β_{0}$	$β_{1}$	$β_{2}$
Logit(B|A)	−1.291***	1.784***	1.601***
Logit(C|A)	−2.599***	3.801***	1.635***
Logit(D|A)	−6.561***	4.163***	2.401***
Residual Deviance = 1777.9, Log-likelihood = −888.96 on 9 df, baseline = “A”

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

The dataset contains 1042 records obtained from inpatients at hospitals in the northern region of Vietnam. The survey process lasted 20 months from August 2014 to March 2016, and yielded a comprehensive set of records of inpatients’ financial situations, healthcare, and health insurance information, as well as their perspectives on treatment service in the hospitals. Five articles were published based on the smaller subsets. This data article introduces the full dataset for the first time and suggests a new Bayesian statistics approach for data analysis. The full dataset is expected to contribute new data for health economic researchers and new grounded scientific results for policymakers.

Dataset: The dataset is submitted as a supplement to this manuscript.

Dataset License: CC-BY

Details

Title

Health Care, Medical Insurance, and Economic Destitution: A Dataset of 1042 Stories

Author

Ho, Manh-Toan¹; Viet-Phuong La¹; Minh-Hoang Nguyen²; Thu-Trang Vuong³; Nghiem, Kien-Cuong P⁴; Tran, Trung⁵; Nguyen, Hong-Kong T⁶; Quan-Hoang Vuong¹

¹ Center for Interdisciplinary Social Research, Phenikaa University, Ha Dong District, Hanoi 100803, Vietnam; Faculty of Economics and Finance, Phenikaa University, Ha Dong District, Hanoi 100803, Vietnam
² International Cooperation Policy, Graduate School of Asia Pacific Studies, Ritsumeikan Asia Pacific University, Beppu, Oita 874-8577, Japan
³ Sciences Po Paris, Campus de Dijon, 21000 Dijon, France
⁴ Vietnam-Germany Hospital, 16 Phu Doan Street, Hoan Kiem District, Hanoi 100000, Vietnam
⁵ Vietnam Academy for Ethnic Minorities, DreamTown COMA6, Road 70, Tay Mo, Nam Tu Liem, Hanoi 100000, Vietnam
⁶ Graduate School of Asia Pacific Studies, Ritsumeikan Asia Pacific University, Beppu, Oita 874-8577, Japan

Publication year

2019

Publication date

2019

Publisher

MDPI AG

e-ISSN

23065729

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.3390/data4020057

ProQuest document ID

2548365847
