Big data analysis is the process of gathering, managing, and analyzing large volumes of data to discover patterns and other valuable information. Agricultural data is a significant application area for big data. Big data analysis of agricultural data can combine data from internal systems with outside sources such as weather data, soil data, and crop data. Although big data analysis has led to advances in many industries, it has not yet been used extensively in agriculture. Several machine learning techniques have been developed to cluster data for crop yield prediction; however, they suffer from low clustering accuracy and quality. To improve clustering accuracy with less complexity, a Proximity Likelihood Maximization Data Clustering (PLMDC) technique is developed for both sparsely and densely distributed agricultural big data to enhance the accuracy of crop yield prediction for farmers. In this process, unnecessary data is first cleansed from the sparse and dense agricultural data using a logical linear regression model. The presented clustering method is then executed based on similarity and a weight-based Manhattan distance. A genetic algorithm (GA) with a suitable fitness function is applied to select features from the clustered data. Finally, a decision support system is built with the A-FP growth algorithm to predict crop yields from the selected features, such as weather features and crop features. The proposed PLMDC technique achieves better clustering accuracy on both sparsely and densely distributed data with minimal time and space complexity. Based on the observed results, the PLMDC technique is more efficient than the existing methods.
Introduction
Agriculture is a major contributor to the Indian economy. Optimizing agricultural production and food supply chains is critical to producing and delivering food, fiber, and fuel more efficiently to meet growing demand, with the Indian population projected to exceed 1.5 billion by 2050. Better crop yields will therefore be needed in the future, and agricultural big data is a necessary part of the second green revolution required to meet this demand. Agriculture is an obvious target for big data. Farmers need timely guidance to forecast future crop yields, and analysis is also required to help them use their full crop production capacity. However, these aims are becoming harder to achieve because of climate change and urbanization. In agriculture, risk and uncertainty arise commonly from irregular weather [1, 2–3]. Environmental conditions, input levels, soil variability, combinations, and product prices have made it all the more relevant for farmers to use information and seek assistance when making critical farming decisions [4].
For agricultural planning, a crucial problem is the accurate estimation of yield for the various crops involved in the planning process [5, 6–7]. Data mining techniques have been used to obtain practical and effective solutions to this problem. For example, sensor technologies have been used to collect data from crop fields to estimate, predict, or monitor damage. Timely early warnings can then be used to prevent large losses. In addition, information and communication technology (ICT) has provided efficient tools to manage and mitigate risks. Agricultural big data can be used in the decision-making process [8, 9, 10–11]. Using historical agricultural datasets, predictive analysis that integrates crop, soil, and weather data can estimate outcomes such as crop yields and food uncertainty. Predictive analytics also strengthens decision-making systems for predicting crop yield under given soil and climate conditions [12].
At present, yield prediction is an important agricultural problem. Every farmer is interested in knowing how much yield to expect. In the past, a farmer would predict crop yield from previous experience with a specific crop [13]. In the data mining process, data transformation, feature extraction, loading, and the prediction of meaningful information from very large datasets are carried out to obtain patterns and an understandable structure for further use [14, 15, 16–17]. Agricultural big data raises the following issues:
A large number of varied data sources with differing levels of information
Effective description and optimized methods in terms of both storage space and data access
The high dimensionality problem
The problem of interoperability with standards in this domain.
Therefore, a new clustering algorithm and a predictive model are presented to overcome these issues, with the aim of providing a user-friendly interface for farmers that delivers crop yield prediction analysis based on the available datasets.
Related works
A machine learning approach for crop yield prediction has been presented in which superior performance was demonstrated in the 2018 Syngenta Crop Challenge using large datasets of corn hybrids. In this method, deep neural networks were used to predict yield from genotype and environment data. The deep neural networks learned nonlinear and complex relationships between genes, environmental conditions, and their interactions from historical data, and provided accurate yield predictions for new hybrids planted in new locations with known weather conditions. A spiking neural network (SNN) model has been presented for timely crop yield forecasting. In this work, historical crop yield data and spatially accumulated blocks of 250-m resolution MODIS-NDVI (Moderate Resolution Imaging Spectroradiometer normalized difference vegetation index) data were used for the analysis [18, 19–20]. SNNs are promising techniques in remote sensing for spatiotemporal data modeling, analysis, and crop prediction. Multi-sensor optical and microwave remote sensing data were combined for crop yield estimation and prediction using two methods. In the first method, the Enhanced Vegetation Index (EVI) derived from MODIS and the Vegetation Optical Depth (VOD) were combined, integrating the data from the two satellite sensors into a single descriptor or feature. In the second method, statistical summarization and machine learning were used to combine the full time series of EVI and VOD. A modified k-means clustering algorithm was proposed for crop prediction to improve cluster quality and accuracy count. Compared against the k-means and k-means++ algorithms, the modified k-means algorithm produced higher-quality clusters, more correct crop predictions, and a higher accuracy count [21, 22–23, 24].
A crop prediction framework was presented using neural network and fuzzy clustering techniques. In this approach, artificial neural networks were used as a powerful tool for prediction and modeling. A suitable crop can be predicted by considering parameters such as temperature, solar radiation, biomass, and rainfall. Soybean yield prediction for Lauderdale County, Alabama, USA was addressed using a 3D CNN (Convolutional Neural Network) model that leverages spatiotemporal features. Yield data for the years 2003 to 2016 were taken from the USDA (United States Department of Agriculture) NASS (National Agricultural Statistics Service) Quick Stats tool. NASA's MODIS satellite data, such as land products, surface reflectance, and land surface temperature, were obtained through Google Earth Engine [25].
Proposed methodology
In this paper, two new clustering algorithms, Proximity Manhattan Distance Estimation (PMDE) and Expected Likelihood Maximization (ELM) with log-likelihood probability, are presented for sparse and dense agricultural data to predict crop yield. Figure 1 shows the proposed architecture for crop yield prediction. Four phases are proposed in Fig. 1 for effective prediction of crop yield under various climate conditions, which can improve the farmer's ability to make crop selection decisions. The agricultural data is taken online from the Indian government agriculture portal. Initially, the data is cleansed by removing inconsistent data to improve data quality for further processing. In the second phase, sparse and dense data are clustered using the proposed PMDE and ELM methods. In the third phase, features are chosen from the clustered data according to the crop yield attributes using the Genetic Algorithm (GA). Finally, the Apriori and Frequent Pattern (FP) growth algorithms are combined into the A-FP growth algorithm to build a decision support system that predicts crop yield from the selected features. The proposed method contains the following phases:
Data cleansing using a logical linear regression model
Data clustering using proposed clustering methods
Feature Selection using GA with a good fitness function
Decision support system using A-FP growth algorithm
Fig. 1 Architecture diagram [See PDF for image]
Data cleansing
In this paper, sample historical agricultural data is collected from the Indian government agriculture portal for various climate conditions. First, the collected data is preprocessed to improve its quality: unstructured data is converted into structured data by eliminating inconsistent or unnecessary records from the dataset. Noisy agricultural data contains errors, that is, values that deviate from the expected values or are inconsistent. These errors are reduced by applying the logical linear regression model as a data smoothing step. The model describes the relationship between a dependent data attribute and one or more independent data attributes by fitting a straight line. The relationships are modeled using linear predictor functions whose unknown parameters are estimated from the agricultural data. The best-fit line is then found for the chosen data attributes (variables), so that one attribute can be used to predict the others. The data cleansing algorithm is given below:
Algorithm
Input: Agricultural data, data attributes.
Output: Smoothed data with low dimension.
Read the given agricultural data.
Initialize the dependent data variable as the response attribute, and one or two independent data variables as predictor attributes.
Model the relationship between the dependent data variable and the independent variable by fitting a straight line using the linear regression method, calculated using Eq. (1):
[See PDF for equation (1)]
where the terms of Eq. (1) are the error variable, the response variable, the predictor variable, and the regression coefficient.
Estimate the error between the data and the fitted line, that is, the separation of the data from the line estimated by least squares, given by Eq. (2):
[See PDF for equation (2)]
Reduce data dimensionality by applying the log-linear regression model using Eq. (3):
[See PDF for equation (3)]
Stop
Thus, the agricultural data is converted into structured data with reduced error and reduced dimensionality, and is then used for clustering, feature selection, and crop yield prediction under different climate conditions.
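The smoothing step can be illustrated with a minimal sketch. It assumes ordinary least squares with a single predictor and an intercept term; the function names `fit_line` and `smooth` and the sample values are hypothetical and are not taken from the paper's Eqs. (1) to (3).

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x (assumed model form)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def smooth(x, y):
    """Replace each response by its fitted value and report the squared error."""
    intercept, slope = fit_line(x, y)
    fitted = [intercept + slope * xi for xi in x]
    # Sum of squared residuals between the data and the fitted line
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return fitted, sse

# Hypothetical values: annual rainfall (mm) as predictor, production as response
rainfall = [600.0, 700.0, 800.0, 900.0, 1000.0]
production = [2.1, 2.4, 3.1, 3.3, 3.9]
fitted, sse = smooth(rainfall, production)
print(fitted, sse)
```

Records whose residual is large relative to the others could then be flagged as noisy, although the paper does not state the exact rule it uses.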
Proposed clustering algorithm for crop yield prediction
An efficient Proximity Likelihood Maximization Data Clustering (PLMDC) technique is introduced to improve clustering performance on both sparse and dense data for big data analytics. The PLMDC technique achieves this by applying proximity Manhattan distance estimation and expected likelihood maximization, and it handles both sparsely and densely distributed big data. The PMDE method is used to cluster sparse data such as crop-based data (beans, grains, oil, fruits, etc.), and the ELM method is used to cluster dense data such as current and historical weather, humidity, temperature, and moisture data (satellite remote sensing data).
PMDE method for sparse data clustering
After data cleansing, clustering is first applied to the sparsely distributed data using Proximity Manhattan Distance Estimation to find the relationships between data points. The proximity estimation model finds the relationships between sparsely distributed data in two-dimensional space, measured with a distance metric. In the proposed method, a proximity-based Manhattan distance is used for sparse data clustering. The similarity between agricultural data points is determined from the minimum (nearest) weighted Manhattan distance, and the weights are used to combine the similarities between sparse data. The standard Manhattan distance estimates only the closest points. The proposed modified Manhattan distance instead calculates weight-based nearest data points to find the similarities between data. Here, the Manhattan distance is modified with weights using the shared nearest neighbor (SNN) method, which represents the neighbor relationships between data points, to find the most similar data according to their weights. Proximity refers to both similarity and dissimilarity between agricultural data. In the proposed proximity-based Manhattan distance, the nearest distance of each data point is estimated according to its weight, so that similarities between data can be found and combined for clustering. The geometric mean of all the distances is then calculated to locate the cluster center. The sparse data points that are most similar to a cluster center are grouped into that cluster.
Algorithm
Assume two data points.
Estimate the similarity between the data points using Eq. (4):
[See PDF for equation (4)]
where w denotes the weight of the data points, r indexes the attributes of each data point, and the remaining terms of Eq. (4) are the similarity of the rth attribute and the pointer variable for the rth attribute.
Calculate the proximity Manhattan distance using Eqs. (5) and (6):
[See PDF for equation (5)]
[See PDF for equation (6)]
Cluster the data points as A1, A2, …, An in the sparsely distributed agricultural data based on the estimated similarity and proximity Manhattan distance.
Locate the cluster center of each cluster using the geometric mean of the distances, given by Eq. (7):
[See PDF for equation (7)]
Stop
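As a rough illustration of the quantities used in this algorithm, the sketch below computes a weighted Manhattan distance, a geometric mean of distances, and a nearest-center assignment. The per-attribute weighting and the assignment rule are assumptions made for illustration; the function names are hypothetical and the code does not reproduce Eqs. (4) to (7).

```python
import math

def weighted_manhattan(a, b, weights):
    """Sum of per-attribute absolute differences, each scaled by its weight."""
    return sum(w * abs(ai - bi) for w, ai, bi in zip(weights, a, b))

def geometric_mean(values):
    """Geometric mean computed in log space; values must be strictly positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def assign_to_nearest(points, centers, weights):
    """For each point, return the index of the center at the smallest weighted distance."""
    return [
        min(range(len(centers)),
            key=lambda c: weighted_manhattan(p, centers[c], weights))
        for p in points
    ]

# Hypothetical two-attribute data points and cluster centers
points = [(0.0, 0.0), (1.0, 2.0), (10.0, 10.0), (9.0, 11.0)]
centers = [(1.0, 1.0), (9.0, 9.0)]
weights = (1.0, 1.0)

labels = assign_to_nearest(points, centers, weights)
distances = [weighted_manhattan(p, centers[k], weights) for p, k in zip(points, labels)]
print(labels, geometric_mean(distances))
```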
ELM method for dense data clustering
The dense agricultural data is clustered by applying Expected Likelihood Maximization with log-likelihood probability. In this clustering process, historical data such as weather, temperature, humidity, and moisture data are clustered, and the relationships between densely distributed agricultural data points are found. First, the similarity between every pair of data points is estimated; the log-likelihood is then estimated by finding the sample mean and squared deviation. Finally, the densely distributed agricultural data is clustered according to the estimated similarity and log-likelihood function. The algorithm for dense data clustering is given below:
Algorithm
Estimate the similarity between data points using Eq. (8):
[See PDF for equation (8)]
Estimate the log-likelihood for each data point using Eqs. (9) to (11):
[See PDF for equation (9)]
[See PDF for equation (10)]
[See PDF for equation (11)]
Cluster the dense data based on the estimated similarity and log-likelihood of every dense data point.
Stop
Hence, both the sparsely and densely distributed data are clustered for crop yield prediction.
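The role of the sample mean and squared deviation in the log-likelihood can be sketched as follows. The sketch assumes a one-dimensional Gaussian model per cluster, which the text does not state explicitly; the function names and temperature values are hypothetical.

```python
import math

def gaussian_params(values):
    """Sample mean and maximum-likelihood variance (mean squared deviation)."""
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n
    return mu, var

def log_likelihood(x, mu, var):
    """Log-density of x under a Gaussian with mean mu and variance var."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

def assign(x, clusters):
    """Index of the cluster whose fitted Gaussian gives x the highest log-likelihood."""
    params = [gaussian_params(c) for c in clusters]
    return max(range(len(clusters)),
               key=lambda k: log_likelihood(x, *params[k]))

# Hypothetical temperature readings (degrees Celsius) already grouped into two clusters
clusters = [[20.0, 22.0, 21.0, 19.0], [30.0, 31.0, 29.0, 32.0]]
print(assign(21.5, clusters), assign(30.5, clusters))
```

A full expectation-maximization loop would alternate this assignment with re-estimating each cluster's parameters until the assignments stop changing.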
Feature selection using GA with optimal fitness function
The features are selected from the clustered data using a genetic algorithm. In this paper, seven features are selected for crop yield: CH4 emissions, CO2 emissions, temperature, area under irrigation, yield, annual rainfall, and soil type. Crop yield prediction depends on several factors, so feature selection plays a very important role in obtaining better prediction results. The genetic algorithm gave high cross-validation accuracy among the various feature selection algorithms. Consequently, features are selected with an optimal fitness function to construct a decision support system for the crop yield prediction model based on the A-FP growth algorithm.
Algorithm
Input: Feature vector and feature set
Output: Selected features
Read the feature vector and feature set.
Initialize the GA with the seven features mentioned above.
Estimate random feature = Random Selection(Feature vector, Feature set).
Compute the fitness function for each feature f from the random feature using Eq. (12):
[See PDF for equation (12)]
where A and B are user-defined parameters that set the convergence rate of the prediction as required.
Fitness (f): fitness value for a subset.
TP: True Positive (the actual feature is X and the predicted feature is also X).
TN: True Negative (the actual feature is not X and the predicted feature is also not X).
FP: False Positive (the actual feature is not X, but the predicted feature is X).
FN: False Negative (the actual feature is X, but the predicted feature is not X).
i = 1, 2, …, 7 indexes the features.
Find the optimal fit features (OFF) using Eq. (13):
[See PDF for equation (13)]
where the threshold term of Eq. (13) denotes the maximum threshold value for a feature subset.
If OFF > 7, continue the process; otherwise, add feature f to the selected features.
Stop
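The general shape of GA-based feature selection over the seven features can be sketched as below. This is only an illustrative sketch: the fitness used here is a toy stand-in and is not Eq. (12), and the `TARGET` mask, the function names, and all parameter values are hypothetical. Each individual is a bit mask in which 1 means the feature is kept.

```python
import random

FEATURES = ["CH4 emissions", "CO2 emissions", "Temperature",
            "Area under irrigation", "Yield", "Annual rainfall", "Soil type"]

# Hypothetical "ideal" mask used only to give the toy fitness something to reward
TARGET = (0, 0, 1, 1, 0, 1, 1)

def toy_fitness(mask):
    """Stand-in fitness: number of positions that agree with TARGET."""
    return sum(m == t for m, t in zip(mask, TARGET))

def run_ga(fitness, n_bits=7, pop_size=12, generations=20, p_mut=0.1, seed=0):
    rng = random.Random(seed)  # seeded so that runs are repeatable
    pop = [tuple(rng.randint(0, 1) for _ in range(n_bits)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]   # truncation selection
        children = [pop[0]]              # elitism: keep the best individual unchanged
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randint(1, n_bits - 1)            # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = tuple(bit ^ (rng.random() < p_mut)  # bit-flip mutation
                          for bit in child)
            children.append(child)
        pop = children
    return max(pop, key=fitness), history

best, history = run_ga(toy_fitness)
selected = [name for name, bit in zip(FEATURES, best) if bit]
print(selected, history[-1])
```

In practice the fitness would be computed from the TP, TN, FP, and FN counts of a model trained on the candidate subset, weighted by the user-defined parameters A and B.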
Decision support system using A-FP growth algorithm
In this paper, the Apriori and FP-growth algorithms are combined into the A-FP growth algorithm to build a decision support system that uses the selected features for efficient crop yield prediction. The Apriori algorithm generates the frequent itemsets, where a frequent itemset is an itemset whose transaction support is higher than the minimum support. The features are analyzed and compared with the selected features to find suitable information about a particular crop yield set under the minimum support constraint, and the FP-growth algorithm is then applied to store the recurring patterns in a different kind of data structure. Confident association rules are generated from the frequent itemsets. A confident association rule is a rule whose confidence is above the minimum confidence level, and it is used to predict crop yield under different climate conditions. The hybrid of the Apriori and FP-growth algorithms uses an FP-tree to maintain crop yield suggestions for various climate conditions. In the A-FP method, the maximal frequent itemsets are discovered by the Apriori algorithm, and the remaining frequent itemsets are then found using the FP-growth algorithm.
Algorithm
Input: Selected features (sf), minimum support min-sup
Output: Every frequent item, crop yield prediction
Scan the given agricultural dataset according to the selected features and generate a 2-dimensional array.
Arrange the transactions in descending order of transaction length k.
Initialize the transactions with the repetition count of data items for each candidate set C1 over the threshold value = 1, given by Eq. (14):
[See PDF for equation (14)]
Identify the maximal frequent k-itemsets, calculated using Eq. (15):
[See PDF for equation (15)]
where the terms of Eq. (15) include the frequent itemset, and X and Y are subsets of itemsets.
Execute the candidate subset generation.
The function candidate_gen(F) is used in the fourth step and consists of two phases, the join phase and the pruning phase. In the join phase, two frequent itemsets are combined to generate a possible candidate c. In the pruning phase, it is determined whether every subset of c is in the set of frequent itemsets.
Mine all frequent data items in descending order of frequency. Go to step 8 when no frequent itemset is left.
Take every non-empty subset of each frequent itemset.
Identify the frequent 1-itemsets, and prune the database using the maximal frequent itemset removal process until no frequent itemsets remain outside the maximal itemsets.
Build the FP-tree with the help of binary strings in decreasing order that contain similar content to the first item of C.
Find the remaining frequent itemsets.
Some of the confidence rules generated from the identified frequent data items are given in Table 1 to predict crop yield under various climate conditions.
Table 1. Generated rules by A-FP growth
S. no | Rules |
|---|---|
1 | areaLess, rainfallLight, proLess → yieldLess |
2 | areaMed, rainfallMed → yieldMed |
3 | areaMed, proMed → yieldMed |
4 | areaVeryGood, rainfallLight, proAvg → yieldMed |
5 | areaVeryGood, rainfallHeavy, proGood → yieldGood |
6 | areaMed, rainfallRatherHeavy, proAvg → yieldMed |
7 | areaMed, rainfallLight, proAvg → yieldMed |
Thus, crop yield can be predicted from the rules generated over the clustered dataset.
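The Apriori half of the procedure, with its join and pruning phases and the extraction of confident rules, can be sketched as follows. The FP-tree stage is omitted, so this is not the full A-FP method. The transactions are hypothetical and only reuse item labels of the kind shown in Table 1; the function names are illustrative.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_sup):
    """Level-wise Apriori: return {itemset: support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def count(cands):
        return {c: sum(c <= t for t in transactions) for c in cands}

    items = {frozenset([i]) for t in transactions for i in t}
    level = {c: n for c, n in count(items).items() if n >= min_sup}
    frequent = dict(level)
    k = 2
    while level:
        prev = list(level)
        # Join phase: combine frequent (k-1)-itemsets into k-item candidates
        cands = {a | b for a in prev for b in prev if len(a | b) == k}
        # Pruning phase: keep a candidate only if every (k-1)-subset is frequent
        cands = {c for c in cands
                 if all(frozenset(s) in level for s in combinations(c, k - 1))}
        level = {c: n for c, n in count(cands).items() if n >= min_sup}
        frequent.update(level)
        k += 1
    return frequent

def confident_rules(frequent, min_conf):
    """Return (antecedent, consequent, confidence) for rules meeting min_conf."""
    out = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                conf = sup / frequent[lhs]  # subsets of a frequent itemset are frequent
                if conf >= min_conf:
                    out.append((lhs, itemset - lhs, conf))
    return out

# Hypothetical transactions built from item labels like those in Table 1
transactions = [
    {"areaMed", "rainfallMed", "yieldMed"},
    {"areaMed", "rainfallMed", "yieldMed"},
    {"areaMed", "proMed", "yieldMed"},
    {"areaLess", "rainfallLight", "yieldLess"},
]
freq = frequent_itemsets(transactions, min_sup=2)
for lhs, rhs, conf in confident_rules(freq, min_conf=1.0):
    print(sorted(lhs), "->", sorted(rhs), conf)
```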
Results and discussion
For the experimental study, the agricultural data was collected online from the Indian government portal. The agricultural data attributes are described in Table 2. The new algorithm, Proximity Likelihood Maximization Data Clustering (PLMDC) with A-FP, can give an optimal prediction system for both sparse and dense agricultural big data analysis for crop yield prediction. The performance of the new algorithm is evaluated and compared with existing clustering algorithms, namely k-means, k-medoids, and fuzzy clustering, in terms of accuracy, precision, recall, F-measure, homogeneity, error rate, and time complexity (Tables 2, 3).
Table 2. Agricultural data attributes
Sl. no. | Attributes | Definition |
|---|---|---|
1 | Year | India’s Agricultural data from the year of 2000 to 2019 |
2 | State name | All state names in India |
3 | District name | All district names in every state of India |
4 | Crops name | 124 types of crops |
5 | Season name | Four kinds of Season such as spring, autumn, winter and summer |
6 | Area | Range of area cultivation |
7 | Soil Type | Different soils’ name for crop production |
8 | Production | Crop production range |
Table 3. Performance analysis
Metrics | Proposed PLMDC (%) | K-Means (%) | K-Medoids (%) | Fuzzy Clustering (%) |
|---|---|---|---|---|
Homogeneity | 91.03 | 86.46 | 87.88 | 85.923 |
Precision | 90.2 | 86.8 | 87.4 | 84.83 |
Recall | 89.9 | 83.9 | 85.6 | 86.47 |
F1-Measure | 90.04 | 85.32 | 86.4 | 85.7 |
The performance of the proposed clustering method and of the existing clustering methods k-means, k-medoids, and fuzzy clustering is shown in Fig. 2. The comparison chart in Fig. 2 shows that the proposed PLMDC method achieves a higher percentage of homogeneity, precision, recall, and F-measure than the existing clustering methods when clustering agricultural data for crop yield prediction.
Fig. 2 Performance analysis [See PDF for image]
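As a consistency check on Table 3, the F1-measure can be recomputed from the reported precision and recall using the standard harmonic-mean definition. The function name `f1_score` is illustrative; the input values are copied from Table 3.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision %, recall %) pairs as reported in Table 3
table3 = {
    "Proposed PLMDC": (90.2, 89.9),
    "K-Means": (86.8, 83.9),
    "K-Medoids": (87.4, 85.6),
    "Fuzzy Clustering": (84.83, 86.47),
}

for name, (p, r) in table3.items():
    print(f"{name}: F1 = {f1_score(p, r):.2f}%")
```

The recomputed values are close to the F1-Measure row of Table 3, with small differences that are consistent with rounding.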
Prediction accuracy
The prediction accuracy of the presented and existing methods is measured as the number of patterns correctly recognized out of the total number of patterns for crop yield prediction, depending on the area and climate conditions. Such a predictive method can use current agricultural data to forecast crop yield and assist the farmer. The efficiency of each method is evaluated using the accuracy of the prediction process. The crop yield prediction accuracy of the proposed and existing methods is shown in Fig. 3. Figure 3 shows that the proposed predictive method gives higher crop yield prediction accuracy than the existing k-means, k-medoids, and fuzzy clustering methods.
Fig. 3 Accuracy comparison [See PDF for image]
Conclusion
At present, crop yield prediction under various climatic conditions is receiving considerable attention because it can improve the farmer's ability to make crop selection decisions. Therefore, an improved Proximity Likelihood Maximization Data Clustering (PLMDC) technique has been presented together with an A-FP growth algorithm to predict crop yield. In the presented prediction method, the collected agricultural data was preprocessed with a logical linear regression model to improve data quality. The PMDE and ELM clustering methods were proposed to cluster the preprocessed sparsely and densely distributed agricultural data according to the improved Manhattan distance and the maximum likelihood function. Features were then selected from the clustered data by applying the genetic algorithm with an optimal fitness function, and the selected features were provided to the decision-making system to improve prediction accuracy. Finally, the decision-making system was built with a combined Apriori and FP-growth algorithm. The presented clustering and prediction system can be very useful to farmers for predicting crop yield under various climate conditions or seasons.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Khaki, S; Wang, L. Crop yield prediction using deep neural networks. Front. Plant Sci.; 2019; 10,
2. Bose, P; Kasabov, NK; Bruzzone, L; Hartono, RN. Spiking neural networks for crop yield estimation based on spatiotemporal analysis of image time series. IEEE Trans. Geosci. Remote Sens.; 2016; 54,
3. Mateo-Sanchis, A; Piles, M; Muñoz-Marí, J; Adsuara, JE; Camps-Valls, G. Synergistic integration of optical and microwave satellite data for crop yield estimation. Remote Sens. Environ.; 2019; 234, pp. 1-12. [DOI: https://dx.doi.org/10.1016/j.rse.2019.111460]
4. Narkhede, UP; Adhiya, KP. Evaluation of modified K-means clustering algorithm in crop prediction. Int. J. Adv. Comput. Res.; 2014; 4,
5. Parthasarathy, P; Vivekanandan, S. A typical IoT architecture-based regular monitoring of arthritis disease using time wrapping algorithm. Int. J. Comput. Appl.; 2020; 42,
6. Verma, A; Jatain, A; Bajaj, S. Crop yield prediction of wheat using fuzzy C means clustering and neural network. Int. J. Appl. Eng. Res.; 2018; 13,
7. Vijayarajeswari, R; Parthasarathy, P; Vivekanandan, S; Basha, AA. Classification of mammogram for early detection of breast cancer using SVM classifier and Hough transform. Measurement; 2019; 146, pp. 800-805. [DOI: https://dx.doi.org/10.1016/j.measurement.2019.05.083]
8. Terliksiz, A.S. and Altılar, D.T.: Use of deep neural networks for crop yield prediction: A case study of soybean yield in Lauderdale County, Alabama, USA. International Conference on Agro-Geoinformatics (Agro-Geoinformatics), pp. 1–4, (2019)
9. Bolton, DK; Friedl, MA. Forecasting crop yield using remotely sensed vegetation indices and crop phenology metrics. Agric. For. Meteorol.; 2013; 173, pp. 74-84. [DOI: https://dx.doi.org/10.1016/j.agrformet.2013.01.007]
10. Panchatcharam, P; Vivekanandan, S. Internet of things (IOT) in healthcare–smart health and surveillance, architectures, security analysis and data transfer: a review. Int. J. Softw. Innov. (IJSI); 2019; 7,
11. Janssen, SJ; Porter, CH; Moore, AD; Athanasiadis, IN; Foster, I; Jones, J. Towards a new generation of agricultural system data, models and knowledge products: information and communication technology. Agric. Syst.; 2017; 155, pp. 200-212. [DOI: https://dx.doi.org/10.1016/j.agsy.2016.09.017]
12. Pantazi, X. Wheat yield prediction using machine learning and advanced sensing techniques. Comput. Electron. Agric.; 2016; 121, pp. 57-65. [DOI: https://dx.doi.org/10.1016/j.compag.2015.11.018]
13. Schulze, C; Spilke, J; Lehnerb, W. Data modelling for precision dairy farming within the competitive field of operational and analytical tasks. Comput. Electron. Agric.; 2007; 59,
14. Parthasarathy, P., Vivekanandan, S.: Detection of suspicious human activity based on CNN-DBNN algorithm for video surveillance applications. In 2019 Innovations in Power and Advanced Computing Technologies (i-PACT) (Vol. 1, pp. 1–7). IEEE (2019)
15. Kamilaris, A; Assumpcio, A; Blasi, AB; Torrellas, M; Prenafeta-Boldú, FX. Estimating the Environmental Impact of Agriculture by Means of Geospatial and Big Data Analysis: The case of Catalonia From science to Society; 2017; Cham, Springer:
16. Fan, W., Chong, C., Xiaoling, G., Hua, Y., Juyun, W.: Prediction of crop yield using big data. In: 8th International Symposium on Computational Intelligence and Design, pp. 255–260.
17. Parthasarathy, P; Vivekanandan, S. Investigation on uric acid biosensor model for enzyme layer thickness for the application of arthritis disease diagnosis. Health Inf. Sci. Syst.; 2018; 6,
18. Nilakanta, S; Scheibe, K; Rai, A. Dimensional issues in agricultural data warehouse designs. Comput. Electron. Agric.; 2008; 60,
19. Parthasarathy, P; Vivekanandan, S. Biocompatible TiO2–CeO2 Nano-composite synthesis, characterization and analysis on electrochemical performance for uric acid determination. Ain Shams Eng. J.; 2020; 11,
20. He, Li; Coburn, CA; Wang, Z-J; Feng, W; Guo, T-C. Reduced prediction saturation and view effects for estimating the leaf area index of winter wheat. IEEE Trans. Geosci. Remote Sens.; 2019; 57,
21. Varadharajan, R; Priyan, MK; Panchatcharam, P; Vivekanandan, S; Gunasekaran, M. A new approach for prediction of lung carcinoma using back propogation neural network with decision tree classifiers. J. Ambient Intell. Hum. Comput.; 2018; 58, pp. 1-12.
22. Bang, S., Bishnoi, R., Chauhan, A.S., Dixit, A.K., Chawla, I.: Fuzzy logic based crop yield prediction using temperature and rainfall parameters predicted through ARMA, SARIMA, and ARMAX models. In: Twelfth International Conference on Contemporary Computing (IC3), pp. 1–6, (2019)
23. Mathan, K; Kumar, PM; Panchatcharam, P; Manogaran, G; Varadharajan, R. A novel Gini index decision tree data mining method with neural network classifiers for prediction of heart disease. Des. Autom. Embed. Syst.; 2018; 22,
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021.