A personal health record (PHR) system is not just a static repository for data; it combines knowledge and software tools with patients’ data, empowering patients to become active participants in their own healthcare management by providing their own medical history. In this research, we propose a cloud-dew architecture based layered provenance framework for PHR systems that goes beyond the conventional network/storage/service concept to a new micro-service level concept offering high scalability and availability in a vertically distributed computing hierarchy. It pushes computing applications, data, and low-level services away from centralized virtual nodes and towards the end users. In response to this new micro-service model, we first examined the cloud-dew architecture and derived a list of requirements for the collection of provenance. We studied the different layers of the cloud-dew architecture to identify these requirements, keeping in view characteristics of the architecture such as abstraction and modularity. A lightweight and cost-efficient provenance framework was then designed to meet them. In addition to addressing the identified requirements, the proposed framework provides services for the storage, query, and visualization of provenance. To measure the benefits of enabling provenance in Dew Computing, the pre-collected provenance and the services of the proposed framework are utilized.
Introduction
In the past few years, the electronic health record (EHR) has become a common tool among public and private organizations around the world (Dunlop 2006). The shift from a provider-centred, paternalistic approach to a consumer-oriented approach is currently in the limelight. It has been observed that end users who are well informed about their health issues tend to be more responsive to instructions and healthcare procedures. Empowering patients through a Personal Health Record system, in cooperation with healthcare providers, may be the best way forward once a patient’s medical history is available (Waegemann 2002; Win and Fulcher 2007). Such a system includes all relevant patient data, with possible decision support capabilities that can help patients manage their chronic conditions while providing their medical history instantly. The derivation history of an object is described by metadata called data provenance, which is a key component of data science.
The data must be complemented with its background, which provides accurate information about how it was captured, processed, validated, and analyzed, along with other important facts that enable its use and interpretation (Sohaib et al. 2018). Dealing with provenance is critical not only to provide evidence of the reliability and trustworthiness of the derived object but also to support audit trails of activities. So far, provenance has been widely explored in centralized data engineering architectures and in different forms of distributed systems such as Grid (especially in the U.K. e-Science program) and Cloud. To the best of our knowledge, it has not yet been addressed in the context of cloud-dew computing. We propose a cloud-dew provenance framework for personal health record systems, following a case study from Khyber Pakhtunkhwa (KPK). Despite its promising future, PHR has not yet been widely adopted. Government hospitals in KPK do not offer any such facility, nor are they equipped to do so. Private hospitals do offer electronic health recording, but these records are isolated and accessible only to the management of the specific hospital.
Problem statement and research objectives
There are several reasons behind the lack of adoption of PHR in KPK, ranging from cultural issues to privacy concerns. Moreover, on the technical side, dealing with the provenance of data is critical not only to provide evidence of the reliability and trustworthiness of the derived object but also to support audit trails of activities. In this context, we categorized our objectives into the following two areas:
Objective 1: To develop a cloud-dew provenance framework for Personal Health Record system in Khyber Pakhtunkhwa;
Objective 2: To investigate the attitudes of patients and the public towards sharing their medical information and how well prepared they are to adopt a PHR system.
Our methodology for achieving these objectives is outlined in Fig. 1.
Fig. 1 [Images not available. See PDF.]
Proposed Methodology
Information about barriers and attitudes towards sharing PHR data among the KPK public and patients will be collected through surveys, focus group discussions, and one-on-one discussions with community members, healthcare providers, and other stakeholders in the objectives component.
Using our proposed cloud-dew architecture based layered provenance framework, we introduce various components to address the scalable, independent, collaborative, and layered architecture of cloud-dew and to answer the provenance requirements posed by the underlying architecture. The requirements for a cloud-dew architecture based provenance framework are Modularity, Independence, Consistency, Overhead, and Usability.
Research contributions
The research contributions of this work are:
introducing the health industry in KPK to PHR as an efficient and advanced technological solution, and improving health literacy in KPK;
proposing a cloud-dew architecture based layered provenance framework suited to current information, data, and computational science, which is shifting towards cloud-dew architecture;
highlighting, through this framework, key requirements such as abstraction and modularity for the collection of provenance in Life Science applications in a modular, independent, and seamless fashion;
providing services such as storage, query, and visualization of provenance in addition to addressing the identified list of requirements;
offering an interoperable framework capable of operating even in the absence of the cloud if a cloud failure occurs at any stage;
identifying an effective data modeling language for our framework by comparing the most commonly selected languages available.
The rest of the paper is organized as follows: existing Literature has been discussed in Sect. 2, methodology is expressed in Sect. 3, Provenance in PHR Systems and Cloud Dew Framework has been discussed in Sects. 4 and 5. Section 6 concludes the work.
Literature review
The literature review addresses the two main objectives of our research: it covers the possible barriers and risks hindering the adoption of Personal Health Record systems in Khyber Pakhtunkhwa, Pakistan, together with the factors that facilitate adoption, and it reviews research on provenance, which is a key component of data science.
Personal health record system
The National Coordinator for Health Information Technology at the U.S. Department of Health and Human Services and the administrator of the Centers for Medicare and Medicaid Services (CMS) acknowledged the Personal Health Record system as a top-priority research area (Endsley et al. 2006). Formally, the PHR system can be defined as “A private, secure and confidential electronic application through which individuals can access, manage and share their health-related medical information, and for others whom they are authorized” (Winkelman and Leonard 2004). A PHR system has enormous potential to improve both the documentation of a patient’s health information and the patient’s care. Several types of PHR systems are in use globally. Waegemann (2002) classified these types as the offline PHR system, the web-based commercial PHR system (which can be used at an organizational level), the functional (purpose-based) PHR system, the partial PHR system, and the provider-based PHR system (Bilal et al. 2018). Commercial or organizational PHR systems offer their services for a fee (annual or for a defined period), enabling end users to store and access their health information wherever they are. A number of healthcare organizations and health insurance companies own and maintain such PHR systems, for example Kaiser Permanente. Moreover, patients suffering from specific chronic diseases such as AIDS, cardiovascular diseases, diabetes, and hypertension need more accurate monitoring and information; a purpose-based PHR system is designed for such groups of individuals.
Krohn has also suggested a few different models of PHR systems depending upon their origins, such as a desktop-based PHR system, an Internet-based PHR system, and a mobile PHR system (located entirely on a mobile phone or on another portable storage device). Such PHR systems may be stand-alone systems, Electronic Medical Record patient portals, health plan patient portals, or consumer-centric PHR systems (Gibson et al. 2009). These PHR systems can be more beneficial for patients communicating their health information to healthcare providers if the systems are regularly monitored and maintained.
Dew computing
Dew Computing is a new paradigm on the horizon for researchers. Because Dew Computing is closely related to cloud computing, it can be regarded as complementary to it. Different researchers have given different definitions; Wang (2015, 2016) defined dew computing as “When the on-premises computer provides the functionality that is independent of cloud services and is also collaborative with cloud services, it would be termed as the Dew Computing which is basically an on-premises computer hardware-software organization paradigm in the cloud computing environment”. The aim of dew computing is to fully realize the potential of on-premises computers and services. Ristov et al. (2016) discussed the features of dew computing, stating: “The main objective of dew computing is the use of resources as much as possible prior to the transferring of processes into the cloud server. Dew computing architecture is used for providing the micro-services along with the macro services or dew services along with the cloud services”.
As these two definitions are similar in nature, and following Wang’s definition, we can say that Dew Computing has two main features: independence and collaboration. Independence refers to the ability to provide functionality without cloud services or an internet connection; in other words, the application is not merely a cloud service or an online application. For example, a browser is not a dew computing application because it cannot provide all of its services without an internet connection. Independence means that on-premises resources are used before requests are sent to cloud services, so that the potential of on-premises computers is fully realized. Collaboration refers to the property that a dew computing application exchanges information with cloud services automatically, for example through synchronization, correlation, and other kinds of interoperation. A counter-example for this feature is the majority of desktop applications, such as Microsoft Office, which are independent of the cloud but, in their traditional form, do not exchange information with cloud services and are therefore not dew computing applications. Collaboration means that every dew computing application makes use of cloud services. This feature promotes the use of cloud services together with on-premises computers, realizing the potential of the cloud services.
The structure of Dew Computing is described in Fig. 2. The green leaf represents the on-premises computers, and the dew drops on it represent applications running on those computers. These applications provide two features: services to users and devices that are independent of cloud services, and collaboration with cloud services.
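The two features can be illustrated with a minimal sketch. It is not part of the proposed framework: the class `DewRecordStore`, its methods, and the plain dictionary standing in for a cloud service are all hypothetical names chosen for illustration. Reads and writes are served from the on-premises copy (independence), and pending changes are pushed only when a connection is available (collaboration).

```python
from dataclasses import dataclass, field


@dataclass
class DewRecordStore:
    """Hypothetical on-premises store that keeps working without a cloud."""

    local: dict = field(default_factory=dict)    # on-premises copy, always available
    pending: list = field(default_factory=list)  # keys changed since the last sync

    def put(self, key: str, value: str) -> None:
        # Independence: the write succeeds locally, with no cloud call.
        self.local[key] = value
        if key not in self.pending:
            self.pending.append(key)

    def get(self, key: str) -> str:
        # Independence: reads are served from the on-premises copy.
        return self.local[key]

    def sync(self, cloud: dict, online: bool) -> int:
        # Collaboration: when a connection exists, push pending changes.
        if not online:
            return 0
        pushed = 0
        for key in self.pending:
            cloud[key] = self.local[key]
            pushed += 1
        self.pending.clear()
        return pushed


cloud_copy = {}  # stand-in for a cloud service (assumption, not a real API)
store = DewRecordStore()
store.put("allergy", "penicillin")
offline_pushed = store.sync(cloud_copy, online=False)  # nothing is sent while offline
online_pushed = store.sync(cloud_copy, online=True)    # the pending change is pushed
```

A real dew application would also need conflict handling when both copies change; that is left out here to keep the sketch short.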
Fig. 2 [Images not available. See PDF.]
Structure of Dew computing
Provenance
Research on provenance attracted much attention in the late 1990s (Wang and Madnick 1990; Woodruff and Stonebraker 1997) and in the early 2000s (Buneman et al. 2001; Cui et al. 2000; Khan et al. 2008). We have based our background study on a taxonomy of provenance techniques organized by application domain (Fig. 3). Our taxonomy is built around two basic granularities of provenance: workflow (coarse-grained) provenance and data (fine-grained) provenance. In the scientific domain, a workflow is regarded as a program in which computation steps and human–machine steps are interconnected. Coarse-grained provenance records the overall history of the final output derived from a workflow; however, the amount of provenance information recorded may vary.
Fig. 3 [Images not available. See PDF.]
Taxonomy of research work on provenance and its applications
In contrast, data (fine-grained) provenance provides a detailed account of the transformation steps behind a given data derivation.
Overall, we have taken six components (the first five for data provenance and the last for workflow provenance): (1) engineering applications and provenance, (2) life science applications and provenance, (3) databases and provenance, (4) business (commercial) applications and provenance, (5) humanities and provenance, and (6) workflow applications and provenance.
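To make the two granularities concrete, the following is a minimal sketch of how provenance records might be represented and queried. The record fields, the item names, and the `lineage` helper are hypothetical and are not taken from the proposed framework; they only illustrate that fine-grained records capture each transformation step, while a coarse-grained record describes a workflow run as a whole.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical record; field names are illustrative only."""

    item_id: str                 # the derived object
    derived_from: Optional[str]  # the input it was derived from, if any
    step: str                    # the transformation that produced it
    granularity: str             # "workflow" (coarse) or "data" (fine)


def lineage(records: list, item_id: str) -> list:
    """Walk back from an item to its origin and return the steps in order."""
    by_id = {r.item_id: r for r in records}
    steps = []
    current = by_id.get(item_id)
    while current is not None:
        steps.append(current.step)
        current = by_id.get(current.derived_from) if current.derived_from else None
    return list(reversed(steps))


records = [
    ProvenanceRecord("raw_reading", None, "capture", "data"),
    ProvenanceRecord("validated_reading", "raw_reading", "validate", "data"),
    ProvenanceRecord("summary", "validated_reading", "aggregate", "data"),
    ProvenanceRecord("report_run", None, "run_workflow", "workflow"),
]
history = lineage(records, "summary")  # derivation history of one data item
```

The sketch assumes each item has at most one parent and that the records contain no cycles; a provenance graph with several inputs per step would need a graph traversal instead.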
Identifying barriers towards adoption of PHR
As technology is growing rapidly and there is strong emphasis on providing the right information at the right time to the right person in today’s globally interconnected world, the healthcare industry in Khyber Pakhtunkhwa is also moving towards the Electronic Health Record (EHR) system, and most private hospitals are already using it.
It has now been realized that the paper record is not sufficient to provide all the relevant patient information to caregivers on time and in the form they need. With the introduction of new technologies in the healthcare field, the risks and barriers to the adoption of Personal Health Record systems are also changing constantly. The main purposes of our study were:
To find the highest level of risks in using PHR (both for patients and health care providers).
To identify the core functionalities being used by the hospitals in KPK.
To determine the significance of the relationship between the risks and barriers and the nature of the hospital (i.e. either government or private).
To determine the role of provenance in PHR systems and to propose a framework for the management of provenance data in the PHR system.
This section presents the information and findings collected about the barriers and attitudes towards sharing PHR data among the KPK public and patients. We collected the information through a KAP survey and one-on-one discussions with community members, healthcare providers, and other related stakeholders.
Methodology
To gain a better understanding of the Personal Health Record system and the use of the Electronic Health Record system in Khyber Pakhtunkhwa, we conducted a survey. We designed two questionnaires for this purpose, one for general users (i.e. patients and their families) and one for healthcare providers. Different hospitals in KPK were visited during the survey, and data were collected from 601 individuals in total (general users and healthcare providers combined). Among these 601 individuals, 53% were male and the remaining 47% were female.
After data collection, we processed the data to obtain the results. The response rate was almost equal between male and female participants. However, at a few stages the response rate was low, which could be due to reasons such as restrictions related to ethnic traits, lack of awareness of PHR, and reluctance to answer some questions because of professional obligations.
Results and discussion
As we conducted our survey with two different questionnaires (one for general users and one specifically for healthcare providers), our results and discussion are divided into two sections: (1) general users, and (2) healthcare providers.
Analysis of collected data (general users)
We interviewed a total of 350 individuals in this survey, who came from different age groups and walks of life. These respondents were selected randomly while visiting different hospitals and institutes in Khyber Pakhtunkhwa.
Demographics of general users
According to the collected data, 188 of the 350 respondents were male and the remaining 162 were female, as shown in Table 1.
Table 1. Demographic of respondents of PHR (general users)
Characteristic | Number | Percent |
|---|---|---|
Gender | ||
Male | 188 | 53.7 |
Female | 162 | 46.3 |
Total | 350 | 100.0 |
Age | ||
< 20 | 58 | 16.6 |
21–30 | 114 | 32.6 |
31–40 | 111 | 31.7 |
41–60 | 59 | 16.9 |
> 60 | 8 | 2.3 |
Total | 350 | 100.0 |
Education | ||
Primary school | 47 | 13.4 |
Secondary school | 59 | 16.9 |
Intermediate | 53 | 15.1 |
Bachelors | 76 | 21.7 |
Postgraduate | 115 | 32.9 |
Total | 350 | 100.0 |
Profession | ||
Govt. job | 56 | 16.0 |
Private job | 119 | 34.0 |
Self employed | 53 | 15.1 |
Unemployed | 122 | 34.9 |
Total | 350 | 100.0 |
Monthly income | ||
No income | 67 | 19.1 |
< 20,000 PKR | 95 | 27.1 |
20,000 to 50,000 PKR | 62 | 17.7 |
50,000 to 80,000 PKR | 31 | 8.9 |
80,000 to 120,000 PKR | 18 | 5.1 |
> 120,000 PKR | 77 | 22.0 |
Total | 350 | 100.0 |
As we can see in Table 1 above, the largest share of our respondents (32.6%) were young people aged 21–30; 16.6% were younger than 20, 31.7% were between 31 and 40, 16.9% were between 41 and 60, and only 2.3% were above 60 years of age. Well-educated respondents form the largest share of these 350 individuals: 32.9% were postgraduates, 21.7% had completed a bachelor’s degree, 15.1% were at intermediate level, 16.9% had completed secondary school, and the remaining 13.4% had only primary-level education.
Looking at the profession of the 350 respondents, the largest group is unemployed individuals: 34.9% were unemployed, 34% held private jobs, 16% held government jobs, and the remaining 15.1% were self-employed or running their own businesses. We also included a question on monthly income to determine whether income affects attitudes towards the adoption of PHR. The largest group of respondents (27.1%) earned less than 20,000 PKR per month, while 22% earned more than 120,000 PKR per month; 19.1% had no source of income, 17.7% earned between 20,000 and 50,000 PKR per month, 8.9% earned between 50,000 and 80,000 PKR per month, and the remaining 5.1% earned between 80,000 and 120,000 PKR per month.
These demographics show that a large share of our respondents were well educated, while unemployed respondents formed the largest professional group and nearly one in five respondents had no source of income.
Analysis of general users results
To obtain the required results, we used inferential analysis, which allows generalizations about a population to be made from samples drawn from it.
Table 2 shows the mean ranks of our collected data for ‘Medical Information available on PHR’. Among the 350 respondents, the highest mean rank (3.14) was given to the individual’s personal medical history, which also includes allergy data, as the most important health data that should be available on PHR.
Table 2. Mean of medical information available on PHR
Variable | N | Minimum | Maximum | Mean | Std. error of mean | Std. deviation |
|---|---|---|---|---|---|---|
Rank_Personal_Info | 350 | 1 | 5 | 2.57 | 0.087 | 1.629 |
Rank_Personal_Media_History | 350 | 1 | 5 | 3.14 | 0.074 | 1.385 |
Rank_Medication | 350 | 1 | 5 | 2.92 | 0.082 | 1.534 |
Rank_Lab_Test | 350 | 1 | 5 | 2.82 | 0.078 | 1.459 |
Rank_Faimaly_History | 350 | 1 | 5 | 2.89 | 0.074 | 1.387 |
The second highest mean is for ‘Medications’ (2.92), which means that respondents consider it beneficial for this information to be available on PHR. Similarly, the third most important medical information according to the respondents is ‘Family Medical History’ (2.89), which indicates that many individuals favor putting their family’s medical history on PHR and regard it as important for managing one’s health condition. The items ranked least important are ‘Laboratory Tests’ (2.82) and ‘Personal Information’ (2.57), respectively.
We cross-tabulated the ‘age’ of the respondents with their ‘willingness to adopt PHR’ (Table 3), treating one variable as independent and the other as dependent.
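As a worked check on Table 2, the reported standard errors of the mean can be recomputed from the reported standard deviations and the sample size using the usual relation SE = SD / sqrt(N). The sketch below uses only the values printed in the table; the variable names are copied from it as they appear, and the helper name `standard_error` is our own.

```python
import math

N = 350  # respondents, as reported in Table 2

# Standard deviations and standard errors as printed in Table 2.
std_devs = {
    "Rank_Personal_Info": 1.629,
    "Rank_Personal_Media_History": 1.385,
    "Rank_Medication": 1.534,
    "Rank_Lab_Test": 1.459,
    "Rank_Faimaly_History": 1.387,
}
reported_se = {
    "Rank_Personal_Info": 0.087,
    "Rank_Personal_Media_History": 0.074,
    "Rank_Medication": 0.082,
    "Rank_Lab_Test": 0.078,
    "Rank_Faimaly_History": 0.074,
}


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean for a sample of size n."""
    return sd / math.sqrt(n)


computed_se = {name: standard_error(sd, N) for name, sd in std_devs.items()}
# Largest gap between recomputed and reported values (rounding only).
max_gap = max(abs(computed_se[k] - reported_se[k]) for k in std_devs)
```

The recomputed values agree with the printed ones to three decimal places, which indicates that the mean column of Table 2 consists of a statistic followed by its standard error.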
Table 3. Chi square test for age versus willingness to adopt PHR
Statistic | Value | Df | Asymp. sig. (2-sided) |
|---|---|---|---|
Pearson Chi square | 67.167a | 16 | 0.000 |
Likelihood ratio | 69.512 | 16 | 0.000 |
Linear-by-linear association | 11.235 | 1 | 0.001 |
N of valid cases | 350 | | |
a 7 cells (28.0%) have expected count < 5. The minimum expected count is 0.41
The cross-tabulation output reports the value of each test statistic together with a column of asymptotic significance.
We used Chi square tests to assess whether the association between the two categorical variables in the sample is likely to reflect a real association between them in the population. Table 3 lists the statistics produced by the test; of these, we only need the Pearson Chi square result. The crossed variables are significantly associated because the p value is reported as .000, which means that age is an important factor in willingness to adopt the PHR system.
Note: A p value reported as .000 should be interpreted as p < 0.001 and not as exactly 0. Further, if the p value of the Chi square test is < 0.05, the association is statistically significant and the variables are treated here as ‘strongly associated’.
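For readers unfamiliar with the test, the following sketch shows how a Pearson Chi square statistic, its degrees of freedom, and the expected counts mentioned in the table footnotes are obtained from a contingency table. The counts used are hypothetical and are not the survey data, which are not reproduced in this paper; the function names are our own.

```python
def pearson_chi_square(table):
    """Return (statistic, degrees of freedom, expected counts) for a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


def share_of_small_cells(expected, threshold=5):
    """Fraction of cells whose expected count is below the threshold."""
    cells = [e for row in expected for e in row]
    return sum(e < threshold for e in cells) / len(cells)


# Hypothetical counts (NOT the survey data):
# rows = two age groups, columns = willing / not willing to adopt.
observed = [[30, 10], [20, 40]]
statistic, df, expected = pearson_chi_square(observed)
small_share = share_of_small_cells(expected)
```

With five categories for each variable, the same formula gives (5 - 1) x (5 - 1) = 16 degrees of freedom and 25 cells, which matches the Df column and the "7 cells (28.0%)" footnote of Table 3.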
In Table 4, we crossed the variable ‘willingness to adopt PHR’ with ‘computer skills’. The test showed a significant association (p value reported as 0.000, i.e. p < 0.001), which means that computer skills are very important for willingness to adopt PHR. Higher computer skills result in greater willingness to adopt and use the PHR system, whereas weak computer skills reduce the chances that people will be willing to adopt it.
Table 4. Chi square test for computer skills versus willingness to adopt PHR
Statistic | Value | Df | Asymp. Sig. (2-sided) |
|---|---|---|---|
Pearson Chi square | 85.114a | 16 | 0.000 |
Likelihood ratio | 87.490 | 16 | 0.000 |
Linear-by-linear association | 57.836 | 1 | 0.001 |
N of valid cases | 350 | | |
a 7 cells (28.0%) have expected count < 5. The minimum expected count is 0.98
We therefore plotted the ICT skills of the 350 respondents. Fig. 4 shows that 45% of the individuals are average computer users, while 21% are beginners. 17% of respondents cannot use a computer, which is a point to ponder, because people who are not familiar with a technology tend to feel uncomfortable with it and avoid adopting it. 12% of respondents are advanced users who use a computer frequently; only 5% are professionals who either studied computing or work in related departments.
Fig. 4 [Images not available. See PDF.]
ICT Skills
Likewise, Table 5 below shows the cross-tabulation of ‘internet usage’ with the same variable, ‘Willingness to Adopt PHR’. The p value for this cross-tabulation is also reported as 0.000 (p < 0.001), which means that these variables are also significantly associated. People who use the internet frequently are more likely to be willing to adopt PHR, whereas a low frequency of internet use reduces the chances of willingness to adopt it.
Table 5. Chi square test for internet usage versus willingness to adopt PHR
Statistic | Value | Df | Asymp. Sig. (2-sided) |
|---|---|---|---|
Pearson Chi square | 1.358E2a | 16 | 0.000 |
Likelihood ratio | 145.438 | 16 | 0.000 |
Linear-by-linear association | 84.858 | 1 | 0.001 |
N of valid cases | 350 | | |
a 4 cells (16.0%) have expected count < 5. The minimum expected count is 1.59
In Fig. 5, we have selected the questions that show the willingness of respondents to adopt PHR, each with the options ‘Strongly Agree’, ‘Agree’, ‘To Some Extent’, ‘Disagree’, and ‘Strongly Disagree’. Across these five questions, most of the answers are neutral, while the ratio of disagreement is generally low. In the first question, 118 individuals agreed to some extent to share their health information on a website; however, 99 respondents strongly opposed this and are not willing to share their health data on any website. The responses to the second question indicate that the majority favor providing users with access that could help them maintain their health data and improve their health conditions. For the third question, the agreement level is again high, and very few people are against the idea of adopting PHR. In the fourth question, most respondents avoided committing to volunteer for PHR by frequently selecting the neutral option. Finally, the last question drew a very positive response from a greater number of respondents: the agreement level is very high, and very few responses are negative, that is, from respondents not interested in maintaining a PHR if it were provided in KPK.
Fig. 5 [Images not available. See PDF.]
Willingness of respondents to adopt PHR
In short, the majority of individuals are not resistant to the adoption and use of ICT in healthcare services. This is encouraging for strategies aimed at adopting more advanced and beneficial healthcare services, which are likely to be well received by healthcare providers too.
Analysis of collected data (healthcare providers)
During this survey, we interviewed a total of 251 individuals from the healthcare field, comprising doctors, technicians, pharmacists, laboratory assistants, nurses, and others. The purpose of our study was to investigate the barriers and risks in the adoption of PHR in KPK from the perspective of healthcare providers. These 251 respondents were selected randomly by visiting different hospitals, clinics, laboratories, pharmacies, and healthcare institutions.
Demographics of healthcare providers
After collecting data from the 251 respondents, we processed it to obtain the results for further discussion. Among these 251 respondents, 132 were male and 119 were female, which shows that our survey was reasonably balanced with respect to gender.
As shown in Table 6, we calculated the demographics of our healthcare providers for only two important variables: gender and age. Looking at age, only 0.8% of respondents were younger than 20, which suggests that they were students from healthcare institutions. 18.7% of individuals were between 21 and 30 years of age, and the largest group (49%) was aged 31–40. 28.7% of respondents were between 41 and 60 years of age, which suggests that they were experienced doctors and healthcare providers who had been serving in this field for many years.
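The overall sample size and gender split reported in the Methodology section can be cross-checked from the counts given for the two groups (188 male and 162 female general users; 132 male and 119 female healthcare providers). The short calculation below uses only these reported counts; the variable names are our own.

```python
# Gender counts as reported for the two respondent groups.
general_users = {"male": 188, "female": 162}
providers = {"male": 132, "female": 119}

total = sum(general_users.values()) + sum(providers.values())
male_share = 100 * (general_users["male"] + providers["male"]) / total
female_share = 100 - male_share
```

The totals add up to the 601 respondents stated earlier, and the shares round to the reported 53% male and 47% female.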
Table 6. Demographics of respondents of PHR (healthcare providers)
Characteristic | Number | Percent |
|---|---|---|
Gender | ||
Male | 132 | 52.6 |
Female | 119 | 47.4 |
Total | 251 | 100.0 |
Age | ||
< 20 | 2 | 0.8 |
21 to 30 | 47 | 18.7 |
31 to 40 | 123 | 49.0 |
41 to 60 | 72 | 28.7 |
> 60 | 7 | 2.8 |
Total | 251 | 100.0 |
Only 2.8% of these individuals were above the age of 60 years (as shown in Fig. 6).
Fig. 6 [Images not available. See PDF.]
Age Segregation
All of these age groups shared their viewpoints about the barriers and risks that can come across the adoption of PHR system in KPK which are further explained in the upcoming section.
Analysis of data (healthcare providers)
As mentioned in the previous section, we used inferential analysis to process the collected data and derive the required outputs. Table 7 presents the statistics for the questionnaire that we designed for the healthcare providers.
Table 7. Descriptive statistics
Scale from 1 to 5 | N | Mean |
|---|---|---|
Security of data | 251 | 3.75 |
Privacy of data access control | 251 | 3.65 |
Inaccurate patients information due to periodic updates | 251 | 3.57 |
Availability of standards of information | 251 | 3.56 |
Inability to find the appropriate software | 251 | 3.52 |
Patients may be frightened about critical diseases | 251 | 3.45 |
Return on investment | 251 | 3.35 |
Personal cost of hiring new technical staff | 251 | 3.31 |
Hardware cost | 251 | 3.16 |
Participation from nursing staff | 251 | 3.00 |
Participation from physician | 251 | 2.88 |
Patients may be frightened about common diseases | 251 | 2.78 |
This Table contains all of the questions related to the barriers and risks that may be obstructing the adoption of PHR. All of the 251 respondents recorded their answers with us on the provided questionnaires and after calculating their means we found out that the highest risk to the adoption of PHR would be the ‘Security of Data’. Most of the individuals were of the opinion that the security of data and privacy of access control may be the barriers because if the data is lost or any important data is leaked, their jobs would be on risk. Further, the next highest mean was calculated for inaccurate patients information due to periodic updates that have a value of 3.57 while the availability of standards of information has also been mentioned as the barrier which may be a risk to the adoption of PHR. Many of the respondents also mentioned that the inability to find the appropriate software is a barrier, as well as, there are many low standard softwares available in the market which may downgrade the value of PHR’s high-level framework that would no doubt be costly. Moreover, a large number of respondents shared their views on making the patients more independent. They added that there is 90% of chances that a patient finds out about a critical disease via their PHR and it may create a panic for them. They would get frightened and probably their nervous system may fail. Along with all these concerns, most of the respondents termed the return on investment as one of the major barriers in PHR’s adoption because whenever a new technology is introduced, it may require a huge amount to be implemented. The rest of the questions got average responses from the healthcare providers while the least perceived barrier to the adoption of PHR was patients, concerns on finding out about their common diseases.
The chi-square test in Table 8 shows that the privacy of data access control is perceived as a barrier regardless of whether the hospital is private or government-run. The significance value of 0.211 is greater than 0.05, which indicates that the two variables are not associated with each other.
Table 8. Chi-square tests for hospital type versus privacy of data access control
Statistic | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
Pearson chi-square | 5.846 (a) | 4 | 0.211 |
Likelihood ratio | 5.874 | 4 | 0.209 |
Linear-by-linear association | 3.407 | 1 | 0.065 |
N of valid cases | 251 | | |
(a) 2 cells (20.0%) have an expected count below 5. The minimum expected count is 0.97.
Similarly, Table 9 gives p = 0.034, which is below 0.05, so the association is statistically significant. This indicates that the perceived barrier of inaccurate patient information due to periodic updates depends on the type of hospital, i.e., whether it is private or public.
Table 9. Chi-square tests for hospital type versus inaccurate patient information due to periodic updates
Statistic | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
Pearson chi-square | 10.390 (a) | 4 | 0.034 |
Likelihood ratio | 11.662 | 4 | 0.020 |
Linear-by-linear association | 0.062 | 1 | 0.803 |
N of valid cases | 251 | | |
(a) 2 cells (20.0%) have an expected count below 5. The minimum expected count is 1.46.
Likewise, Table 10 shows the cross tabulation between the type of hospital and the security of data. Here p = 0.648, which is far greater than 0.05, so the association is not significant and the two variables are not associated with each other. In other words, whether a hospital is private or public, the concerns about the security of data are the same.
Table 10. Chi-square tests for hospital type versus security of data
Statistic | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
Pearson chi-square | 2.480 (a) | 4 | 0.648 |
Likelihood ratio | 2.486 | 4 | 0.647 |
Linear-by-linear association | 2.110 | 1 | 0.146 |
N of valid cases | 251 | | |
(a) 2 cells (20.0%) have an expected count below 5. The minimum expected count is 0.97.
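The significance values in Tables 8 to 10 can be re-derived from the reported Pearson statistics and degrees of freedom. The following is a minimal sketch in plain Python, not the statistical package used in the study. The helper name `chi2_sf` and the 0.05 threshold constant `ALPHA` are our own choices for illustration, and the closed-form tail is implemented only for df = 1 and for even df, which covers every row of the three tables.

```python
import math


def chi2_sf(stat: float, df: int) -> float:
    """Upper-tail probability P(X >= stat) of a chi-square variable.

    Only df == 1 and even df are handled, because the tail then has a
    simple closed form. This is an illustrative helper, not a library API.
    """
    if stat < 0 or df < 1:
        raise ValueError("stat must be >= 0 and df must be >= 1")
    half = stat / 2.0
    if df == 1:
        # For one degree of freedom the tail reduces to erfc(sqrt(x / 2)).
        return math.erfc(math.sqrt(half))
    if df % 2 == 0:
        # exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        terms = (half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * sum(terms)
    raise ValueError("odd df > 1 is not handled in this sketch")


# Pearson chi-square statistics as reported in Tables 8, 9 and 10 (df = 4).
reported = {
    "Table 8 (privacy of data access control)": 5.846,
    "Table 9 (inaccurate patient information)": 10.390,
    "Table 10 (security of data)": 2.480,
}

ALPHA = 0.05  # significance threshold used in the text
results = {}
for label, stat in reported.items():
    p = chi2_sf(stat, df=4)
    results[label] = p
    verdict = "associated" if p < ALPHA else "not associated"
    print(f"{label}: p = {p:.3f} -> {verdict} with hospital type")
```

Rounded to three decimals, the computed values agree with the tabulated significance values, and only the Table 9 statistic falls below the 0.05 threshold.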
Why provenance in PHR systems
A PHR system uses information and communication technology (ICT) to ease the relationship between patients and their doctors. Such a system collects different pieces of information about a patient through electronic devices and shares the data of individuals and their relatives. All such information is stored in a database which can be accessed by both the patient and their doctors for health benefits. However, such a system introduces a risk to the authenticity of the data. Data must be authentic and provable for the system to be adopted by the general public.
Provenance, which is the metadata about the original data, plays a key role in authenticating data produced and shared by electronic methods. Provenance not only makes the data trustworthy but also provable when there are suspicious activities in the database (Zhang et al. 2011). For instance, imagine a situation where the PHR system is hacked and malicious information is added to the database. In such a situation, provenance can detect the intrusion and also provide methods to recover the original data (Imran et al. 2017, 2018). Therefore, various researchers consider it a key part of the PHR system (Haas et al. 2011; Liu et al. 2011). Hence, a framework which captures, stores, manages and utilizes provenance data is necessary to make the system more reliable and trustworthy.
It must be noted that implementing a PHR system poses multiple challenges: (1) offline access to the data, (2) trustworthiness of the data, (3) maintaining a history of the patient’s records, and (4) privacy and security of patients’ records. The dew server offers a mechanism to access records offline, thus addressing challenge 1. Provenance, on the other hand, supports trustworthiness by keeping the history of records, thus addressing challenges 2 and 3. Cryptographic techniques are used in the literature to provide privacy and security of personal information, addressing challenge 4. Therefore, provenance is key to addressing several of the challenges in PHR systems.
Cloud-dew provenance framework
Cloud-based businesses face a new issue, namely ‘cloud outages’ (Endo et al. 2016). As the majority of businesses are turning into software businesses, cloud outages become business outages. As enterprises migrate more mission-critical workloads into production cloud environments, mere minutes of downtime from a provider can significantly impact profits, damage relations with customers, and put IT administrators under severe strain.
Such ‘cloud outages’ and their consequences have led to the development of a new cloud-complementary technology, coined Dew Computing (Wang 2016). Ristov et al. (2016) define dew computing (DC) as going beyond the concept of a network/storage/service to a sub-platform that is based on a micro-service concept in a vertically distributed computing hierarchy. DC pushes the frontiers of computing applications, data, and low-level services away from centralized virtual nodes to the end users (Ristov et al. 2016). The key motivations for this new technology are ‘independence’ and ‘collaboration’. Independence refers to the ability of an on-premises computer to provide functionality without cloud services and an internet connection; in other words, the application is not completely online or a pure cloud service. Collaboration means that the dew computing application has to exchange information automatically with cloud services during its operation. Such collaboration includes synchronization, correlation, and/or other kinds of interoperation.
The cloud-dew architecture presents two key servers, each responsible for providing functionality depending on the availability of the internet. On the one hand, the dew server is a web server that resides on the user’s local computer and offers two functions: (1) it provides the client with the same services as the cloud server, and (2) it synchronizes the dew server databases with the cloud server databases. On the other hand, the cloud server provides the ubiquitous resources required for the execution of services and the storage of data. The architecture of the cloud is further divided into infrastructure, platform, and software layers.
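The two dew server functions can be illustrated with a small in-memory sketch. It is not the implementation used in this work: the class names `CloudServer` and `DewServer`, the `pending` queue, and the sample record key are hypothetical, and conflict resolution between concurrent edits is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class CloudServer:
    """Stand-in for the cloud database (hypothetical, in-memory)."""
    records: dict = field(default_factory=dict)


@dataclass
class DewServer:
    """Local server that mirrors the cloud service and syncs when online."""
    cloud: CloudServer
    online: bool = False
    local: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)  # keys changed since last sync

    def put(self, key: str, value: str) -> None:
        # Function (1): serve the client locally, with or without internet.
        self.local[key] = value
        self.pending.append(key)
        if self.online:
            self.sync()

    def get(self, key: str) -> str:
        return self.local[key]

    def sync(self) -> int:
        # Function (2): push local changes, then pull records only the cloud has.
        if not self.online:
            return 0
        pushed = 0
        for key in self.pending:
            self.cloud.records[key] = self.local[key]
            pushed += 1
        self.pending.clear()
        for key, value in self.cloud.records.items():
            self.local.setdefault(key, value)
        return pushed


cloud = CloudServer()
dew = DewServer(cloud)
dew.put("patient-1/allergy", "penicillin")  # accepted while offline
offline_cloud_size = len(cloud.records)      # the cloud has not seen it yet
dew.online = True
pushed = dew.sync()                          # change reaches the cloud
```

The client always talks to the dew server, so reads and writes keep working during an outage, and the cloud copy catches up at the next synchronization.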
Keeping in view the cloud-dew computing architecture, the subsections below present the provenance framework for it. The framework is divided into three parts, namely provenance collection, provenance storage, and provenance query and visualization. The first part focuses on the collection of important provenance data from the cloud or dew server. The second part is concerned with storing the data for easy retrieval, whereas the last part provides an interface for utilizing the data.
Provenance collection
This component of the framework is responsible for two key operations. First, provenance data is intercepted at the various layers of communication in dew computing; second, the intercepted data is parsed according to the various layers of the DC. The subsections below describe these two operations in detail.
Provenance interception
Middleware (Rehman et al. 2017) is software that connects and integrates the different components of complex applications in distributed environments. The components communicate using a variety of protocols, for example remote procedure call (RPC), Java remote method invocation (RMI), Java message service (JMS), and web services. Middleware provides interception for various purposes such as quality of service (QoS), versioning, security, and privacy. Interception is a low-level mechanism for capturing communication messages without altering the architecture. Therefore, middleware can be extended with custom logic using interceptors. We propose to design interceptors for the collection of significant provenance data and to deploy them in the middleware, thus extending the cloud-dew architecture. These interceptors are placed in various flows for the collection of provenance data, as shown in Fig. 7.
Fig. 7. Provenance collection in the middleware with interceptor component
Provenance data is collected by a technique which extends the underlying middleware of the whole cloud-dew architecture as well as that of the client-side application workflow. That is, provenance is collected at the different service models. With this technique, the basic cloud-dew architecture remains the same and no changes are made to the architecture of the services. Since the data is collected without altering the architecture, we expect the computation overhead to be minimal. Once the data is collected from the communication messages at the various layers of the cloud-dew architecture, it is forwarded to the parser module for further processing.
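The interception idea can be sketched with a Python decorator that wraps a service call and records provenance around it while leaving the service body unchanged. This is only an illustration under our own assumptions: the actual middleware and its interceptor API are not specified here, and `provenance_interceptor`, `provenance_log`, and the sample service `get_blood_pressure` are hypothetical names.

```python
import functools
import time

provenance_log = []  # collected records, later handed to the parser module


def provenance_interceptor(layer: str):
    """Wrap a service call and record provenance without changing the service."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            started = time.time()
            record = {
                "layer": layer,
                "method": func.__name__,
                "inputs": {"args": args, "kwargs": kwargs},
            }
            try:
                result = func(*args, **kwargs)
                record["output"] = result
                record["status"] = "ok"
                return result
            except Exception as exc:
                record["status"] = f"error: {exc}"
                raise
            finally:
                # Runs on success and on failure, so every call is logged.
                record["invoked_at"] = started
                record["duration_s"] = time.time() - started
                provenance_log.append(record)
        return wrapper
    return decorator


@provenance_interceptor(layer="SaaS")
def get_blood_pressure(patient_id: str) -> str:
    # Hypothetical PHR service; its body is untouched by the interceptor.
    return f"120/80 for {patient_id}"


reading = get_blood_pressure("patient-1")
```

Because the interceptor only observes the call, removing the decorator restores the original service, which mirrors the claim that the architecture itself is not altered.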
Provenance parsing
The parser module receives communication messages from the collector module and parses the data according to the layer of communication, i.e., the IaaS, PaaS, SaaS, dew server, and client-side application layers. Because of this layered architecture, the parser is divided into sub-parsers, namely the infrastructure parser, platform parser, software parser, dew parser, and application parser, as shown in Fig. 8. It must be noted that the infrastructure, platform and software parsers work at the cloud server. The application parser works at the workflow level, whereas the dew parser works at the dew server. The details of the individual parsers are presented below:
Fig. 8. Provenance parsing through sub-parsers
Infrastructure parser: this parser collects information about the resources involved in the cloud-dew architecture, such as resource type, instance type, resource provider, IP addresses of VMs, and user data.
Platform parser: this parser collects information such as communication and linking protocols, developer name and group information, and changes made to services, e.g. different versions.
Software parser: this parser collects information about applications executed in clouds, such as web service names, method names, input and output parameters, and the time taken by services, among others.
Dew parser: this parser collects information on device and sensor connections and their communication, as well as their timestamps and types of requests. It also collects information on the communication between the dew server and the actual cloud.
Application parser: this parser resides on the client application side. It keeps track of the client’s activities in the application and collects all such activities and the communication between the client’s applications.
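The division into sub-parsers listed above can be sketched as a dispatch table keyed by layer. The field names in `LAYER_FIELDS` are hypothetical keys chosen to mirror the descriptions of each parser, and the message format (a dictionary with a `layer` entry) is an assumption made for the example rather than the format used by the framework.

```python
# Fields each sub-parser keeps; key names are hypothetical and follow the
# parser descriptions in the text.
LAYER_FIELDS = {
    "infrastructure": ("resource_type", "instance_type", "provider", "vm_ip"),
    "platform": ("protocol", "developer", "group", "service_version"),
    "software": ("service", "method", "inputs", "outputs", "duration_s"),
    "dew": ("device", "request_type", "timestamp", "cloud_sync"),
    "application": ("client", "activity", "timestamp"),
}


def make_sub_parser(layer: str):
    """Build the sub-parser for one layer of the cloud-dew architecture."""
    fields = LAYER_FIELDS[layer]

    def parse(message: dict) -> dict:
        # Keep only the provenance fields relevant to this layer.
        record = {name: message[name] for name in fields if name in message}
        record["layer"] = layer
        return record

    return parse


SUB_PARSERS = {layer: make_sub_parser(layer) for layer in LAYER_FIELDS}


def parse_message(message: dict) -> dict:
    """Route an intercepted message to the sub-parser of its layer."""
    layer = message.get("layer")
    if layer not in SUB_PARSERS:
        raise ValueError(f"no sub-parser for layer {layer!r}")
    return SUB_PARSERS[layer](message)


parsed = parse_message({
    "layer": "software",
    "service": "VitalsService",
    "method": "getBloodPressure",
    "duration_s": 0.12,
    "debug_trace": "not provenance",  # dropped by the software sub-parser
})
```

Adding a layer then amounts to adding one entry to the table, so the collector and the other sub-parsers are unaffected.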
Provenance storage
This part of the framework is responsible for two key operations regarding the storage of the provenance data. First, we focus on the selection of the data modeling language and give arguments for the technique selected for storing the provenance data. Second, we present some sample data at the individual layers of the DC.
Selection of data modeling language
We identified the following principles for evaluating the four selected modeling languages (UML, ORM, ER, XML) for conceptual data modeling: structural validity, simplicity, expressibility, non-redundancy, shareability, extensibility, integrity, and diagrammatic representation. Although ORM’s richer constraint notation makes it more expressive graphically, all the other methods extend expressibility through the use of textual languages. XML scores higher on structural validity as it supports Unicode, allowing almost any information in any written human language to be communicated. XML also allows validation using schema languages such as XSD and Schematron, which makes unit testing, firewalls, acceptance testing, contractual specification, and software construction easier. Compared to UML and ER, ORM is easier to validate, but only through verbalization and multiple instantiations. Moreover, XML is platform-independent and thus relatively immune to changes in technology.
Being attribute-free, ORM is more stable for both modeling and queries. All of the methods are amenable to similar abstraction mechanisms and have adequate formal foundations. UML class diagrams are often more compact and can be adorned with a vast array of implementation detail for engineering to and from object-oriented programming code. Moreover, UML includes mechanisms for modeling behavior, and its acceptance as an OMG standard is helping it gain wide support in industry, especially for the design of object-oriented software. As for graphical and diagrammatic representation, an ER diagram gives a clear picture of the attributes of entities and the relationships between them. This in turn helps in understanding the data structure and in minimizing redundancy and other problems. However, there is no industry-standard notation for developing an ER diagram, which is why UML and ORM are more widely accepted in industry than ER.
Thus, all four methods have their own advantages. For data modeling purposes, it seems worthwhile to provide tool support that allows users to gain the advantages of performing conceptual modeling in any selected language. Table 11 shows the different attributes of the four selected languages.
Table 11. Attributes of selected languages
Attribute | UML | ORM | ER | XML |
|---|---|---|---|---|
Structural validity | ✔ | ✔ | ✔ | ✔ |
Simplicity | ✔ | ✔ | ||
Expressibility | ✔ | ✔ | ✔ | ✔ |
Non-redundancy | ✔ | ✔ | ||
Shareability | ✔ | |||
Extensibility | ✔ | |||
Integrity | ✔ | |||
Diagrammatic representation | ✔ | ✔ | ✔ |
Why we choose XML
The purpose of an XML Schema is to define the legal building blocks of an XML document, i.e. the elements and attributes that can appear in a document, the number and order of child elements, the data types of elements and attributes, and the default and fixed values of elements and attributes. One of the greatest strengths of XML Schemas is the support for data types (Houlding 2001; Khan et al. 2018; Sohaib et al. 2018). Other advantages include the following:
It is easier to describe allowable document content.
It is easier to validate the correctness of data.
It is easier to define data facets (restrictions on data).
It is easier to define data patterns (data formats).
It is easier to convert data between different data types.
An XML schema is used for the collected provenance information because XML is a widely used model for data representation. Furthermore, well-formed XML provenance can be consumed by custom algorithms and third-party applications from various users. A standard schema also lets individuals use the provenance data according to their own preferences, for example when querying it.
Provenance data
The parser module parses the data according to the different service models of the cloud-dew architecture. It is therefore important to identify the significant provenance information for each layer or service model. The following provenance data is required and collected:
Process provenance data: the control flow, i.e. the sequential execution of various services and processes, is important provenance information; in particular, this includes the web service name, the method name, and the timestamps of invocation and completion. Similarly, the control flow of the various components inside a particular service is also important. Such provenance information is utilized for various purposes, e.g. finding the exact method and time at which a failure occurred.
Storage provenance data: the data which is stored from various consumers or applications in the database storage has valuable provenance information such as the type of data, size of data, creator of data and the storage location. Furthermore, the time information about the creation, update, and deletion of data items, and the time details about sharing the data items with other members of the architecture are significant provenance data.
System provenance data: system information or physical resource details like sensor name, the total number of sensors, compiler version, operating system, VM details, and the location of virtualized resources. For example, if a sensor or device or VM fails to work properly, its location can be found in provenance data.
Consumer provenance data: the details about various users such as their names, location, group information, and Access Control Policy (ACP). Such provenance is important for various tasks like security and privacy of data and processes.
Application provenance data: information about various processes of an application, e.g. process name, method name, and the time taken by a single or overall process. The platform and software layers also provide significant provenance data from the viewpoint of software developer and software consumer as shown in Fig. 9.
Fig. 9. Provenance framework for cloud-dew architecture
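To make the provenance categories concrete, the sketch below serializes one record to XML with Python’s standard `xml.etree.ElementTree` module. The element names (`provenanceRecord`, `process`, `invokedAt`, and so on) and all sample values are invented for illustration; the schema actually used by the framework is not reproduced here.

```python
import xml.etree.ElementTree as ET


def build_provenance_record(categories: dict) -> ET.Element:
    """Create one XML record with a child element per provenance category."""
    record = ET.Element("provenanceRecord")
    for category, fields in categories.items():
        section = ET.SubElement(record, category)
        for name, value in fields.items():
            ET.SubElement(section, name).text = str(value)
    return record


# Sample values are made up; the category names follow the list in the text.
record = build_provenance_record({
    "process": {
        "service": "VitalsService",
        "method": "getBloodPressure",
        "invokedAt": "2019-01-01T10:00:00Z",
        "completedAt": "2019-01-01T10:00:01Z",
    },
    "storage": {"dataType": "reading", "sizeBytes": 128, "creator": "patient-1"},
    "system": {"sensorName": "bp-cuff-01", "operatingSystem": "Linux"},
    "consumer": {"name": "dr_a", "group": "cardiology"},
    "application": {"processName": "vitals-sync", "durationSeconds": 1.0},
})

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping each category in its own element lets a consumer read, for example, only the `process` section without knowing the rest of the record.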
Provenance query
The ‘provenance query’ module presents the architecture for answering users’ queries efficiently, quickly and reliably using the provenance data. Efficient and reliable query execution depends on the storage mechanism. The query module of the framework utilizes the stored provenance to extract various kinds of information. Since provenance is managed independently of the original data, custom applications can be designed to query provenance data according to individual requirements.
The query module works in two stages. First, users select the required provenance information via a given form. After the selection, the information is collected from the XML file and displayed to the user. This function is extensible according to the requirements of different users.
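The two stages can be sketched as follows, assuming the provenance is stored as XML. The document layout, the element names, and the `query` helper are hypothetical, and the form is replaced by a plain dictionary of selected criteria so that the example stays self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical stored provenance; element names and values are made up.
PROVENANCE_XML = """
<provenance>
  <record id="1">
    <process>
      <method>getBloodPressure</method>
      <invokedAt>2019-01-01T10:00:00Z</invokedAt>
      <status>ok</status>
    </process>
    <consumer><name>dr_a</name></consumer>
  </record>
  <record id="2">
    <process>
      <method>updateAllergy</method>
      <invokedAt>2019-01-01T10:05:00Z</invokedAt>
      <status>failed</status>
    </process>
    <consumer><name>patient-1</name></consumer>
  </record>
</provenance>
"""


def query(root: ET.Element, selection: dict, fields: list) -> list:
    """Stage 1: match records against the user's selection.
    Stage 2: collect the requested fields for display."""
    rows = []
    for record in root.findall("record"):
        matches = all(
            record.findtext(path) == value for path, value in selection.items()
        )
        if matches:
            rows.append({path: record.findtext(path) for path in fields})
    return rows


root = ET.fromstring(PROVENANCE_XML.strip())
failed = query(
    root,
    selection={"process/status": "failed"},
    fields=["process/method", "process/invokedAt"],
)
for row in failed:
    print(row)
```

This mirrors the use case mentioned for process provenance, namely finding the method and time of a failure; other criteria only require a different selection dictionary.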
The overall structure of the framework is presented in Fig. 9. The figure shows different components of the framework such as provenance collection, storage, and query for the cloud and dew architecture. Similarly, the PHR application provenance is also collected. Requirements such as independence, offline access, and maintaining history among others are also presented.
Conclusion
In this research, we proposed a reliable and trustworthy personal health record system based on the cloud-dew architecture, built on a layered provenance framework for PHR. The proposed system goes beyond the traditional network/service concept and offers a new micro-service level concept. A case of a cloud-dew architecture based provenance framework was discussed, offering high scalability and availability in a vertically distributed computing hierarchy.
To this end, a lightweight and cost-efficient provenance framework was designed. In addition to addressing the identified list of requirements, the framework provides services such as storage, query, and visualization of provenance.
References
Bilal, M; Asif, M; Bashir, A. Assessment of secure OpenID-Based DAAA protocol for avoiding session hijacking in web applications. Secur Commun Netw; 2018;
Buneman, P; Khanna, S; Wang-Chiew, T. Why and where: a characterization of data provenance. International conference on database theory; 2001; New York, Springer: pp. 316-330.
Cui, Y; Widom, J; Wiener, JL. Tracing the lineage of view data in a warehousing environment. ACM Trans Database Syst (TODS); 2000; 25, pp. 179-227. [DOI: https://dx.doi.org/10.1145/357775.357777]
Dunlop, L. Electronic health records: Interoperability challenges Patients’ right to privacy. Shidler JL Com & Tech; 2006; 3, 1.
Endo, PT; Rodrigues, M; Gonçalves, GE; Kelner, J; Sadok, DH; Curescu, C. High availability in clouds: systematic review and research challenges. J Cloud Comput; 2016; 5, 16. [DOI: https://dx.doi.org/10.1186/s13677-016-0066-8]
Endsley, S; Kibbe, DC; Linares, A; Colorafi, K. An introduction to personal health records. Fam Pract Manag; 2006; 13, 57.
Gibson, A; Gamble, M; Wolstencroft, K; Oinn, T; Goble, C; Belhajjame, K; Missier, P. The data playground: an intuitive workflow specification environment. Fut Gener Comput Syst; 2009; 25, pp. 453-459. [DOI: https://dx.doi.org/10.1016/j.future.2008.09.009]
Haas, S; Wohlgemuth, S; Echizen, I; Sonehara, N; Müller, G. Aspects of privacy for electronic health records. Int J Med Inf; 2011; 80, pp. e26-e31. [DOI: https://dx.doi.org/10.1016/j.ijmedinf.2010.10.001]
Houlding, SW. XML—an opportunity for <meaningful> data standards in the geosciences. Comput Geosci; 2001; 27, pp. 839-849. [DOI: https://dx.doi.org/10.1016/S0098-3004(00)00145-X]
Imran, M; Hlavacs, H; Haq, IU; Jan, B; Khan, FA; Ahmad, A. Provenance based data integrity checking and verification in cloud environments. PloS One; 2017; 12, e0177576. [DOI: https://dx.doi.org/10.1371/journal.pone.0177576]
Imran, M; Hlavacs, H; Khan, FA; Jabeen, S; Khan, FG; Shah, S; Alharbi, M. Aggregated provenance and its implications in clouds. Fut Gener Comput Syst; 2018; 81, pp. 348-358. [DOI: https://dx.doi.org/10.1016/j.future.2017.10.027]
Khan, FA; Han, Y; Pllana, S; Brezany, P. Provenance support for grid-enabled scientific workflows. 2008 Fourth International Conference on Semantics, Knowledge and Grid; 2008; pp. 173-180. [DOI: http://doi.org/10.1109/SKG.2008.86]
Khan, FA; Rahman, A; Alharbi, M; Qawqzeh, YK. Awareness and willingness to use PHR: a roadmap towards cloud-dew architecture based PHR framework. Multimedia Tools Appl; 2018;
Liu, LS; Shih, PC; Hayes, GR. Barriers to the adoption and use of personal health record systems. Proceedings of the 2011 Conference; 2011; pp. 363-370. [DOI: http://doi.org/10.1145/1940761.1940811]
Rehman, HU; Asif, M; Ahmad, M. Future applications and research challenges of IOT. 2017 International conference on information and communication technologies (ICICT); 2017; Pakistan, IEEE: pp. 68-74. [DOI: https://dx.doi.org/10.1109/ICICT.2017.8320166]
Ristov, S; Cvetkov, K; Gusev, M. Implementation of a horizontal scalable balancer for dew computing services. Scalable Comput: Pract Exp; 2016; 17, pp. 79-90.
Sohaib, O; Solanki, H; Dhaliwa, N; Hussain, W; Asif, M. Integrating design thinking into extreme programming. J Ambient Intell Hum Comput; 2018; 25, pp. 1-8.
Waegemann, CP. Status report 2002: electronic health records. 2002; Boston, Medical Records Institute.
Wang, Y. The initial definition of dew computing. Dew Computing Research; 2015;
Wang, Y. Definition and categorization of dew computing. Open J Cloud Comput (OJCC); 2016; 3, pp. 1-7.
Wang, YR; Madnick, SE. A polygen model for heterogeneous database systems: the source tagging perspective. 1990;
Win, KT; Fulcher, JA. Consent mechanisms for electronic health record systems: a simple yet unresolved issue. J Med Syst; 2007; 31, pp. 91-96. [DOI: https://dx.doi.org/10.1007/s10916-006-9030-3]
Winkelman, WJ; Leonard, KJ. Overcoming structural constraints to patient utilization of electronic medical records: a critical review and proposal for an evaluation framework. J Am Med Inform Assoc; 2004; 11, pp. 151-161. [DOI: https://dx.doi.org/10.1197/jamia.M1274]
Woodruff, A; Stonebraker, M. Supporting fine-grained data lineage in a database visualization environment. Proceedings 13th International Conference on Data Engineering; 1997; pp. 91-102. [DOI: http://doi.org/10.1109/ICDE.1997.581742]
Zhang, OQ; Kirchberg, M; Ko, RK; Lee, BS. How to track your data: the case for cloud computing provenance. 2011 IEEE Third International Conference on Cloud Computing Technology and Science; 2011; pp. 446-453. [DOI: http://doi.org/10.1109/CloudCom.2011.66]
© Springer-Verlag GmbH Germany, part of Springer Nature 2019.