1. Introduction
Modern vehicles include electronic modules that control the vehicle’s subsystems. These modules are called Electronic Control Units (ECUs). The number of ECUs in some vehicles can reach up to 70 [1]. Many vehicle networks were developed to allow vehicle ECUs to communicate with each other. Controller Area Network (CAN) was introduced as an automotive communication network protocol by Robert Bosch LLC in 1994 [2]. FlexRay is another vehicle network protocol that provides more bandwidth than CAN [3]. Ethernet is also used as a network protocol in the automotive industry [4]. The data transferred within the vehicle network are called in-vehicle data.
In recent years, new vehicle communication concepts were introduced, such as Vehicle to Vehicle (V2V) and Vehicle to Infrastructure (V2I). In these approaches, a vehicle can exchange data with other road elements, such as other vehicles, pedestrians and infrastructure cloud systems. The data transferred between the vehicle and other road elements are called connected vehicles data. More information about V2V and V2I communication can be found in [5,6].
Many new features and applications were introduced to utilize vehicle data (both in-vehicle and connected vehicles data) to improve vehicle and road safety. Ziebinski et al. [7] provided a review of the latest Advanced Driver Assistance Systems (ADAS) that use in-vehicle data to introduce safety features such as lane detection, road object detection and traffic sign recognition. These systems require dedicated sensors such as cameras, radars and ultrasonic sensors to collect road information. Park et al. [8] proposed a forward collision warning system using a mono camera. A frontal object detection system based on sensor fusion of radar and a mono vision camera was proposed by Hsu et al. [9]. A literature review of connected vehicles data and the Internet of Things (IoT) for implementing the smart cities approach can be found in [10].
Connected vehicles data systems require wireless devices to transfer the data. Moreover, the transferred data are large and require advanced data storage and data processing systems. The machine learning (ML) approaches required to deal with connected vehicles data are more complex than those for in-vehicle data. Therefore, in-vehicle data systems are usually less expensive and more readily available than connected vehicles data-based systems.
The main goal of this research is to enhance vehicle and road safety using a low-cost ML system that uses readily available in-vehicle data. Two design considerations were taken into account to reduce the ML system cost. The first consideration is that the ML system requires only basic in-vehicle CAN data. No special sensors, such as cameras and radars, are required by the ML system. Engine rpm, engine coolant temperature, manifold pressure, vehicle acceleration and fuel consumption are examples of the data used by the ML system. These data are already available on the CAN for the main vehicle functionalities, and the proposed ML system uses these existing data to predict road conditions. This significantly lowers the cost of the data required by the ML system.
The second consideration is to use traditional ML algorithms, such as decision trees, random forests and support vector machines (SVM). These algorithms can achieve acceptable accuracy scores and allow real-time implementation at low cost. Deep learning algorithms may provide more accurate predictions than the traditional ML algorithms, but they also require very expensive systems for real-time implementation.
The proposed ML system handles three categorization problems: road surface conditions, road traffic conditions and driving style. Road surface is characterized by three classes: full of holes, smooth or even. Road traffic is characterized as high, normal or low, and the driving style is characterized as aggressive or normal.
In this paper, Section 2 explores some work related to our research. Section 3 provides an overview of the proposed system architecture. Section 4 explains the dataset we used for training and testing the algorithms. Section 5 briefly explains the implementation of the ML algorithms. Section 6 defines the evaluation metrics. Section 7 presents the detection results. Section 8 discusses the system results, system limitations and future enhancements. Finally, conclusions are provided in Section 9.
2. Related Work
Since our proposed system uses in-vehicle data, this section explores related work on in-vehicle data and ML applications. Lattanzi et al. [11] used two ML approaches and in-vehicle sensor data to identify unsafe driving behavior. They used SVM and neural network algorithms for classification. The input features to the ML system were the vehicle speed, engine speed, engine load, throttle position, steering wheel angle and brake pedal pressure. Classification results of this study showed an average accuracy above 90% for both classifiers.
Alvarez-Coello et al. [12] proposed a model for dangerous driving events using in-vehicle data. Random forests and a Recurrent Neural Network were used to classify the data. The authors used features such as acceleration, brake pedal position, acceleration pedal position, engine RPM and torque. The danger level was classified as normal, moderate or aggressive. Wang et al. [13] proposed a k-means clustering-based support vector machine (kMC-SVM) method to classify drivers into two types: aggressive and moderate. Vehicle speed and throttle opening were treated as the feature parameters to reflect the driving styles.
Osman et al. [14] introduced a machine learning model for near-crash prediction from observed vehicle kinematics data. Vehicle kinematics data, such as speed, longitudinal acceleration, lateral acceleration, yaw rate and pedal position, were used as input features for multiple ML systems. The authors utilized several machine learning algorithms, such as K nearest neighbor (KNN), random forests, support vector machine (SVM) and adaptive boost (AdaBoost), to predict near-crash situations. The AdaBoost algorithm showed a better recall and F-score than other algorithms. A system which can identify the driver trip using historical trip-based data collected from in-vehicle data was proposed by Moreira-Matias et al. [15]. Decision trees obtained an accuracy between 75% and 100%.
Ghadge et al. [16] proposed a model to detect road potholes using vehicle accelerator information and GPS data. The k-means clustering algorithm was applied to the training data to build the model. A random forests classifier was used to evaluate this model on the test data for better prediction. Dhiman et al. [17] proposed a computer vision approach to detect potholes using a stereo vision camera and a deep learning algorithm. Kim et al. [18] provided a review of pothole detection techniques using machine learning. The paper summarised the different approaches for pothole detection using vibration sensors, accelerometers, 3D reconstruction and 2D images.
Bernas et al. [19] provided a survey for low-cost techniques to detect road traffic using in-vehicle sensors. The techniques include applications of infrared and visible light sensors, wireless transmission, accelerometers, magnetometers, ultrasonic and microwave radars as well as acoustic sensing.
There are many other applications that use in-vehicle data with ML. For example, a vehicle theft prevention and driver identification system was proposed by Martinelli et al. [20,21]. A system to predict the driver’s drowsiness based on the air quality in the car cabin was proposed by Goh et al. [22]. Bai et al. [23] proposed a system to address the problem of detecting traffic signals from a set of vehicle speed profiles.
The significance of our research is in proposing a low-cost prediction system for road conditions and driving style. In order to reduce the system cost, general CAN data were used as input features to the ML system. No additional cost for special sensors is required by the system. Furthermore, the chosen ML algorithms are inexpensive to implement and do not require a complex computing system.
3. System Overview
This section describes how ML and vehicle network data can be used together to implement a full prediction system. It also explains how the predictions can be used in safety applications. The safety application can be implemented in the vehicle and in the infrastructure system by transferring the prediction results to the infrastructure. The proposed system block diagram is summarized in Figure 1. The proposed system includes the following components:
Vehicle network: The in-vehicle data to be fed to the ML system are collected through a vehicle network.
Data logging system: The data logging system collects data from the vehicle network.
Machine learning system: The machine learning system receives the data from the logging system and then classifies them. A training dataset is required to train the ML algorithms. The training dataset should be labeled correctly to the required classes.
Vehicle to Infrastructure communication (V2I): This network is used to transfer the result of the ML predictions to the infrastructure system.
Vehicle application system: This system uses the ML results to provide in-vehicle safety functions for the driver. For example, if the road traffic is classified as high, then a warning is issued to drive carefully.
Infrastructure application system: This system uses the ML classification result to provide functions at the infrastructure level. For example, a road maintenance request is issued if the road surface is detected as being full of holes.
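As a rough illustration of how the two application systems might consume the classifier outputs, the sketch below maps predicted labels to actions. The function name `dispatch_actions`, the dictionary keys and the action strings are hypothetical and only mirror the two examples given in the component list; they are not part of the described system.

```python
# Hypothetical sketch: route ML predictions to in-vehicle and infrastructure
# actions. Label strings follow the classes used in this paper; the function,
# keys and action messages are assumptions made for illustration.

def dispatch_actions(prediction):
    """Return (vehicle_actions, infrastructure_actions) for one prediction dict."""
    vehicle_actions = []
    infrastructure_actions = []

    # In-vehicle safety function: warn the driver when traffic is high.
    if prediction.get("traffic") == "high":
        vehicle_actions.append("warn driver: high traffic, drive carefully")

    # Infrastructure function (result sent over V2I): request maintenance.
    if prediction.get("road_surface") == "full of holes":
        infrastructure_actions.append("issue road maintenance request")

    return vehicle_actions, infrastructure_actions


example = {"road_surface": "full of holes", "traffic": "high", "driving_style": "normal"}
vehicle, infrastructure = dispatch_actions(example)
print(vehicle)
print(infrastructure)
```

Keeping the two action lists separate reflects the split between the vehicle application system and the infrastructure application system described in this section.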
Three algorithms were implemented for in-vehicle data classification: decision trees, random forests and Support Vector Machine (SVM). A labeled dataset collected from the CAN network was used to train and test these algorithms. Results of the classification were analyzed with respect to algorithm accuracy, precision, recall and F-score.
4. The Dataset
The dataset used in this work was obtained from the Kaggle website under the title of Traffic, Driving Style and Road Surface Condition [24]. Two cars were used to collect the dataset, a Peugeot 207 1.4 HDi and an Opel Corsa 1.3 HDi. The dataset was collected from the vehicles’ On-Board Diagnostics (OBD) port by using an OBD device that can be paired with a smartphone. Ruta et al. [25] used this dataset to propose machine learning models for the Internet of Things (IoT).
The dataset includes 14 input features. They are summarized as follows:
Altitude change, calculated over 10 s.
Current (instantaneous) vehicle speed.
Average vehicle speed in the last 60 s.
Speed variance in the last 60 s.
Speed variation for every second of detection.
Longitudinal acceleration.
Engine load, expressed as a percentage.
Engine coolant temperature in degrees Celsius.
Manifold Absolute Pressure (MAP), a parameter used by the internal combustion engine to compute the optimal air/fuel ratio.
Revolutions Per Minute (RPM) of the engine.
Mass Air Flow (MAF) Rate measured in g/s. This reading is used by the engine to set fuel delivery and spark timing.
Intake Air Temperature (IAT) at the engine entrance.
Vertical acceleration.
Average fuel consumption, calculated as liters per 100 km.
The dataset was labeled for three sub-problem categories, i.e., road surface conditions, road traffic conditions and driving style. The road surface condition was labeled as smooth, even or full of holes. The road traffic condition was labeled as low, normal or high, and the driving style was labeled as normal or aggressive. The dataset includes 24,957 data points. Table 1 summarizes the input features, the categories and the labels for each category of the dataset.
The number of samples per label differs within each category. Smooth roads, normal traffic conditions and normal driving style represent the majority of the labels of each category. This is due to the nature of the roads used for data collection. Figure 2 shows the distribution of the labels for the classification categories we considered in this study.
5. ML Algorithms Implementation
The ML algorithms used in this work are common and widely used in classification problems. A decision tree is a flowchart-like tree structure, where each internal node denotes a test on an attribute, each branch represents an outcome of the test and each leaf node (terminal node) holds a class label. A decision tree typically starts with a single node, which branches into possible outcomes. Each of those outcomes leads to additional nodes, which branch off into other possibilities. This gives it a treelike shape. More information about decision trees can be found in [26].
Random forests is an ML algorithm constructed from many decision trees. The random forests algorithm establishes the outcome based on the predictions of the individual decision trees. For classification, the final prediction is obtained by aggregating the outputs of the trees, typically by majority vote or by averaging their predicted class probabilities. More information about random forests can be found in [27].
The support vector machine is an algorithm that tries to find a hyperplane to separate the data based on the classes. SVM finds boundaries that maximize the distance between the support vector data of each class. More information about SVM can be found in [28].
As mentioned in the previous sections, three ML algorithms were implemented to classify the CAN data. The implementation was done in Python using the scikit-learn, pandas, SciPy, NumPy and Matplotlib packages [29,30,31,32]. The packages were used for ML implementation, results analysis and visualization. The dataset was divided into 80% for training and 20% for testing. This yields 19,965 samples for training and 4992 for testing.
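The reported split sizes can be checked with a few lines of arithmetic. The sketch below assumes that the test set size is rounded up and the remainder goes to training; this rounding rule is an assumption (to our understanding it matches how scikit-learn's `train_test_split` handles a fractional `test_size`), not something stated in the text.

```python
import math

n_samples = 24957      # dataset size reported in Section 4
test_fraction = 0.20   # 20% held out for testing

# Assumed rounding rule: round the test set up, give the rest to training.
n_test = math.ceil(test_fraction * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)
```

Under this assumption the two counts agree with the 19,965 training samples and 4992 test samples stated above.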
In the decision trees implementation, the minimum number of samples required to split an internal node was set to 2, while the minimum number of samples per leaf was set to 1. In total, 200 trees were used in the random forests implementation.
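A minimal sketch of how these two classifiers could be configured in scikit-learn is given below. Only the three hyperparameter values come from the text; the synthetic data generated with `make_classification`, the sample count and the `random_state` values are assumptions used to keep the example self-contained, and the real dataset would be loaded in their place.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 14-feature, 3-class problem (assumption).
X, y = make_classification(
    n_samples=1000, n_features=14, n_informative=8,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hyperparameters as stated in the text; everything else is left at defaults.
tree = DecisionTreeClassifier(min_samples_split=2, min_samples_leaf=1, random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0)

scores = {}
for name, model in [("decision tree", tree), ("random forest", forest)]:
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

print(scores)
```

The scores printed here describe the synthetic data only and say nothing about the results reported in Section 7.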
SVM was implemented with a radial basis function (RBF) kernel because the dataset is highly nonlinear and the kernel is needed to create accurate decision boundaries. The scaling parameter ($\gamma$) and the cost parameter ($C$) were adjusted to achieve the best classification accuracy.
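One common way to adjust the two SVM parameters is a cross-validated grid search. The sketch below is an illustration under stated assumptions rather than the procedure used in this work: the candidate values for `gamma` and `C`, the 3-fold cross-validation, the feature standardization step and the synthetic data are all assumed. For reference, the RBF kernel has the form $K(x, x') = \exp(-\gamma \lVert x - x' \rVert^2)$.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 14-feature, 3-class problem (assumption).
X, y = make_classification(
    n_samples=600, n_features=14, n_informative=8,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Standardizing the features first is an assumption; RBF kernels are
# sensitive to feature scale.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Hypothetical candidate values for the scaling and cost parameters.
param_grid = {
    "svc__gamma": [0.01, 0.1, 1.0],
    "svc__C": [1, 10, 100],
}
search = GridSearchCV(pipeline, param_grid, scoring="accuracy", cv=3)
search.fit(X_train, y_train)

print(search.best_params_)
print(search.score(X_test, y_test))
```

Selecting the parameters by cross-validation on the training split keeps the test split untouched until the final evaluation.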
6. Evaluation Metrics
The results of the detection are analyzed by showing the classification confusion matrix for each algorithm. The confusion matrix shows the true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN) for each of the classification problems in this study. From the confusion matrix, accuracy, recall, precision and F-score are calculated.
Accuracy is the ratio of the number of correct predictions to the total number of predictions. Precision shows how many samples are predicted correctly out of all the samples predicted as positive. Recall shows how many samples are predicted correctly out of all the samples that are actually positive. F-score is the harmonic mean of precision and recall.
The above measures are given as follows:
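$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{2}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{3}$$
$$\text{F-score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{4}$$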
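The four measures can be computed directly from a confusion matrix. The sketch below treats each class in turn as the positive class, which is one common way to extend the binary definitions to three classes; how the per-class values were averaged for the reported tables is not stated, so only per-class values are returned. The function name `classification_metrics` is hypothetical, and the example matrix is the SVM road surface confusion matrix from Section 7.

```python
def classification_metrics(cm):
    """Compute accuracy and per-class (precision, recall, F-score).

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j (rows = true, columns = predicted).
    """
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n_classes))
    accuracy = correct / total

    per_class = []
    for k in range(n_classes):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n_classes)) - tp  # predicted k, true other
        fn = sum(cm[k]) - tp                               # true k, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f_score = (2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
        per_class.append((precision, recall, f_score))
    return accuracy, per_class


# SVM confusion matrix for road surface conditions, copied from the paper
# (class order: full of holes, smooth, even).
svm_cm = [
    [553, 15, 41],
    [42, 2932, 85],
    [39, 77, 1208],
]
accuracy, per_class = classification_metrics(svm_cm)
print(round(accuracy, 3))
for name, values in zip(["full of holes", "smooth", "even"], per_class):
    print(name, [round(v, 3) for v in values])
```

For this matrix, 4693 of the 4992 test samples lie on the diagonal, so the accuracy is about 0.940, in line with the SVM accuracy reported for road surface conditions.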
The permutation feature importance approach is implemented to show the importance of each feature to the detection accuracy. This algorithm works by shuffling the data of a single feature at a time to destroy its information content while keeping the rest of the features unchanged. If the quality of prediction is highly impacted, the feature is very important for the predictor. Feature ranking helps in understanding how the ML algorithms work and which data are more important to them. André et al. [33] provided more information about permutation importance and its implementation.
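The shuffling procedure described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the implementation used in this work: the function name `permutation_importance_simple`, the number of repeats, the toy data and the threshold "model" are all assumptions. scikit-learn provides a ready-made version as `sklearn.inspection.permutation_importance`.

```python
import numpy as np

def permutation_importance_simple(predict, X, y, n_repeats=5, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_shuffled = X.copy()
            # Shuffle only column j; all other features stay intact.
            X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
            drops.append(baseline - np.mean(predict(X_shuffled) == y))
        importances[j] = np.mean(drops)
    return importances


# Toy data: the label depends on feature 0 only; feature 1 is pure noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 2))
y = (X[:, 0] > 0).astype(int)

# Stand-in "model" that thresholds feature 0 (assumption for illustration).
def predict(X):
    return (X[:, 0] > 0).astype(int)

importances = permutation_importance_simple(predict, X, y)
print(importances)
```

In this toy setting, shuffling feature 0 removes roughly half of the accuracy, while shuffling feature 1 changes nothing because the stand-in model never reads it.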
7. Results
7.1. Road Surface Conditions Classification Results
Table 2 shows the accuracy, the precision, the recall and the F-score of the road surface conditions classification and Table 3 shows the confusion matrix of the predictions.
Table 4 shows the top seven important features for road surface detection for the three algorithms. As shown in the results, engine coolant temperature is the most important feature for the three algorithms (decision trees, random forests and SVM). SVM was the only approach to have the longitudinal acceleration as one of the top seven important features for classification.
7.2. Road Traffic Conditions Classification Results
Table 5 shows the accuracy, the precision, the recall and the F-score of road traffic conditions classification, while Table 6 shows the confusion matrix of the predictions.
Table 7 shows the feature importance for road traffic classification using the permutation feature importance. SVM relied on the instantaneous vehicle speed for classification, while decision trees and random forests relied more on the average speed. Fuel consumption ranked as the third most important feature for decision trees and random forests, while it was not among the top seven features for SVM. Manifold absolute pressure was more important to SVM than to the other two algorithms.
7.3. Driving Style Classification Results
Table 8 shows the accuracy, the precision, the recall and the F-score of driving style classification and Table 9 shows the confusion matrix of the predictions.
Table 10 shows the feature importance for driving style classification. Fuel consumption was more important for decision trees and random forests, while manifold absolute pressure was more important for SVM.
8. Discussion
In this work, in-vehicle data were used to make predictions for road conditions and driving style using supervised machine learning algorithms. The dataset was collected from the vehicle CAN network. It includes 14 features, such as vehicle speed, longitudinal acceleration, fuel consumption and engine rpm, as shown in Table 1. The data were labeled for three categories. The first is road surface conditions, which classifies the road as full of holes, smooth or even. The second is road traffic conditions, which classifies the traffic as low, normal or high. The third is driving style, which classifies the driving as aggressive or normal.
A detailed overview of the system architecture is shown in Figure 1. The model includes a data logging system for in-vehicle data. A machine learning system was used for classification and prediction.
Three ML algorithms were implemented, i.e., decision trees, random forests and SVM. The detection results showed that random forests provided the best performance among the three algorithms. Decision trees came second, while SVM had the lowest performance. Figure 3, Figure 4 and Figure 5 show the detection results as charts for the three classification topics we covered in this work.
Due to the nature of the road conditions where the dataset was gathered, it was noticed that 61% of the road surface is smooth and only 13% is full of holes. Moreover, 75% of the traffic is normal and only 12% is high traffic. Normal driving style is 89% of the data and the rest is aggressive. This imbalanced data distribution can impact the ML models and make them biased toward one class more than others. Although the recall results in this study were good, which means that detection was accurate for the positive classes (low-sample data), it is always better to have balanced data. Future work may focus on solving this issue by increasing the amount of training data to obtain a more balanced dataset. Another solution is to use oversampling techniques to increase the number of samples of the positive classes.
The permutation feature importance technique was used to rank the input features based on their impact on the detection results. It was noticed that decision trees and random forests have almost the same feature ranking, while SVM showed a different ranking. If a voting system is developed to choose between several ML detectors, it is important to choose ML algorithms that build different classification models and therefore make different kinds of errors. Feature ranking showed that some features did not have a high impact on the detection results. For example, engine load and manifold absolute pressure data have a very low impact on driving style detection. Altitude variation and vehicle speed variation have a very low impact on driving style detection in SVM. Therefore, eliminating low-ranked features can improve the ML system performance and help avoid model overfitting; it also helps in the practical implementation of the system.
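Random oversampling, one of the simplest oversampling techniques, can be sketched as follows. The function name `random_oversample` and the toy data are hypothetical; the 89/11 label split only mimics the driving style distribution mentioned above. Oversampling would normally be applied to the training split only, so that duplicated samples do not leak into the test set.

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate minority-class rows at random until every class matches the largest."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    indices = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        indices.append(cls_idx)
        if count < target:
            # Sample with replacement to fill the gap to the majority size.
            extra = rng.choice(cls_idx, size=target - count, replace=True)
            indices.append(extra)
    indices = np.concatenate(indices)
    rng.shuffle(indices)
    return X[indices], y[indices]


# Toy labels: 89 "normal" (0) and 11 "aggressive" (1) samples.
y = np.array([0] * 89 + [1] * 11)
X = np.arange(len(y), dtype=float).reshape(-1, 1)

X_bal, y_bal = random_oversample(X, y)
print(np.bincount(y_bal))
```

After oversampling, both classes contain 89 samples; the minority rows are repeated rather than newly generated, which is the main limitation of this simple approach.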
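A voting system of the kind mentioned above could combine the three classifiers by simple majority. The sketch below is hypothetical: the function name `majority_vote`, the tie-breaking rule (the first listed model wins) and the example predictions are assumptions made for illustration.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists by majority; ties go to the first model listed."""
    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        # Among labels tied for the top count, keep the earliest model's label.
        winner = next(label for label in labels if counts[label] == best)
        combined.append(winner)
    return combined


# Hypothetical per-sample predictions from three classifiers.
forest_pred = ["smooth", "smooth", "full of holes", "even"]
tree_pred = ["smooth", "even", "full of holes", "smooth"]
svm_pred = ["even", "smooth", "full of holes", "full of holes"]

# The random forest is listed first so that it decides three-way ties.
print(majority_vote([forest_pred, tree_pred, svm_pred]))
```

Listing the strongest single model first is one possible design choice for breaking ties; weighting votes by validation accuracy would be another.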
More work can be added in the future to this research. Collecting more data from other vehicle systems, such as suspension, brake and gear, can help improve the results. Extracting some statistics from the data, such as mean, standard deviation and median, can add more value to the input features. Ranking the features and eliminating the low impact features is also a good practice to reduce system complexity.
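Statistical features of this kind can be extracted over a sliding window of the raw signal. The sketch below uses NumPy's `sliding_window_view`; the function name `rolling_statistics`, the window length of three samples and the toy speed trace are assumptions chosen to keep the example short.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_statistics(signal, window):
    """Return mean, standard deviation and median over each sliding window."""
    windows = sliding_window_view(np.asarray(signal, dtype=float), window)
    return {
        "mean": windows.mean(axis=1),
        "std": windows.std(axis=1),
        "median": np.median(windows, axis=1),
    }


# Toy vehicle speed trace in km/h with one outlier (assumption).
speed = [50, 52, 54, 80, 56, 58]
stats = rolling_statistics(speed, window=3)
print(stats["mean"])
print(stats["median"])
```

In this toy trace the median stays close to the typical speed while the mean is pulled up by the outlier, which is one reason to provide both statistics as input features.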
Fusing data from many sources provides a better understanding of the area surrounding the vehicle and thus yields a better prediction system. Therefore, fusing in-vehicle data with connected vehicle data should boost the performance of the ML system. Adding data from sensors such as cameras, radar and Lidar is also expected to improve the detection results.
A deep learning algorithm, such as a neural network, can be suggested as future work. Neural networks may perform better than conventional ML algorithms, such as random forests and SVM. However, deep learning techniques require more computation and therefore a more expensive system. Choosing between deep learning and the traditional ML algorithms is thus a trade-off between system accuracy and system cost.
9. Conclusions
In this study, an ML system is proposed to solve three categorization problems: road surface conditions, road traffic conditions and driving style. Decision trees, random forests and SVM were implemented in Python. In-vehicle CAN data were used to train and test the algorithms.
Random forests showed the best accuracy, precision, recall and F-score for all the classifications. The nature of the features and the amount of training data are what give one algorithm the advantage over another. From the results, we can conclude that random forests is the best of the three algorithms for predicting road surface conditions, road traffic conditions and driving style.
Feature importance of the algorithms was analyzed using the permutation feature importance algorithm. It was noticed that decision trees and random forests have almost the same feature importance ranking, while SVM showed a different ranking. For example, SVM showed a high rank for longitudinal acceleration for road surface detection, while decision trees and random forests showed a low rank for this feature. Feature ranking can help eliminate the low-ranked features to reduce system complexity while maintaining the ML system performance.
Finally, this work shows that the vehicle network carries rich information that can be analyzed and classified using ML to provide useful applications. In-vehicle data with traditional ML algorithms can provide a system with high accuracy and inexpensive implementation compared to more complex ML systems.
G.A.-r. developed the detection system overview and defined the system elements. He also did the ML algorithm implementation in Python. G.A.-r. contributed to the related work and conclusion. H.E. contributed to data analysis, evaluation metrics, features ranking, discussion and general paper review and corrections. M.R. wrote the introduction, contributed to the related work section, results, discussion and conclusion. All authors have read and agreed to the published version of the manuscript.
The used dataset in this work is available on the Kaggle website under the title of Traffic, Driving style, road surface conditions:
Our gratitude to the people who prepared the dataset and made it available in Kaggle.
The authors declare no conflict of interest.
The following abbreviations are used in this manuscript:
ML | Machine learning |
CAN | Controller area network |
V2V | Vehicle to Vehicle communication |
V2I | Vehicle to Infrastructure communication |
ADAS | Advanced Driver Assistance System |
OBD | On-Board Diagnostics |
FCW | Forward collision warning |
AEB | Automatic emergency braking |
ACC | Adaptive cruise control |
ECU | Electronic control unit |
GPS | Global positioning system |
SVM | Support vector machine |
RBF | Radial basis function |
KNN | K nearest neighbor |
ANN | Artificial Neural Network |
TP | True positive |
TN | True negative |
FP | False positive |
FN | False negative |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Figure 1. The figure shows the system architecture for road conditions and driving style prediction.
Figure 2. The distribution of each class in the dataset: (a) Label distribution as a percentage for the road surface conditions. (b) Label distribution as percentage for the road traffic conditions. (c) Labels distribution as a percentage for the driving style.
Figure 3. Accuracy, precision, recall and F-score for road surface conditions results.
Figure 4. Accuracy, precision, recall and F-score for road traffic conditions results.
Table 1. The dataset input features and the output classes for the road conditions, traffic conditions and the driving style.
Features | Output 1: Road Surface Conditions | Output 2: Road Traffic Conditions | Output 3: Driving Style |
---|---|---|---|
AltitudeVariation | Smooth | Low traffic | Normal style |
VehicleSpeedInstantaneous | Full of holes | Normal traffic | Aggressive style |
VehicleSpeedAverage | Even condition | High traffic | |
VehicleSpeedVariance | | | |
VehicleSpeedVariation | | | |
LongitudinalAcceleration | | | |
EngineLoad | | | |
EngineCoolantTemperature | | | |
ManifoldAbsolutePressure | | | |
EngineRPM | | | |
MassAirFlow | | | |
IntakeAirTemperature | | | |
VerticalAcceleration | | | |
FuelConsumptionAverage | | | |
Table 2. Road surface conditions classification results.
Metric | Decision Trees | Random Forests | SVM |
---|---|---|---|
Accuracy | 0.954 | 0.983 | 0.94 |
Precision | 0.974 | 0.99 | 0.942 |
Recall | 0.975 | 0.984 | 0.941 |
F-score | 0.975 | 0.989 | 0.94 |
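Table 3. Confusion matrix for road surface conditions classification (rows: true class; columns: predicted class per algorithm).
True/Predicted | Decision Trees: Full of holes | Decision Trees: Smooth | Decision Trees: Even | Random Forests: Full of holes | Random Forests: Smooth | Random Forests: Even | SVM: Full of holes | SVM: Smooth | SVM: Even |
---|---|---|---|---|---|---|---|---|---|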
Full of holes | 649 | 2 | 1 | 649 | 1 | 0 | 553 | 15 | 41 |
Smooth | 57 | 2995 | 1 | 23 | 3045 | 4 | 42 | 2932 | 85 |
Even | 52 | 43 | 1198 | 22 | 21 | 1248 | 39 | 77 | 1208 |
Table 4. Road surface conditions feature importance using the permutation feature importance approach.
Feature Importance Rank | Decision Trees | Random Forests | SVM |
---|---|---|---|
1 | Engine Coolant Temperature | Engine Coolant Temperature | Engine Coolant Temperature |
2 | Fuel Consumption Average | Intake Air Temperature | Intake Air Temperature |
3 | Vehicle Speed Average | Fuel Consumption Average | Vehicle Speed Average |
4 | Intake Air Temperature | Vehicle Speed Average | Vehicle Speed instantaneous |
5 | Engine RPM | Engine RPM | Engine RPM |
6 | Vehicle Speed Variance | Manifold Absolute Pressure | Longitudinal Acceleration |
7 | Manifold Absolute Pressure | Vehicle Speed Variance | Manifold Absolute Pressure |
Table 5. Road traffic conditions classification results.
Metric | Decision Trees | Random Forests | SVM |
---|---|---|---|
Accuracy | 0.951 | 0.979 | 0.938 |
Precision | 0.973 | 0.989 | 0.938 |
Recall | 0.973 | 0.981 | 0.938 |
F-score | 0.973 | 0.985 | 0.938 |
Table 6. Confusion matrix for road traffic conditions classification (rows: true class; columns: predicted class per algorithm).
True/Predicted | Decision Trees: High traffic | Decision Trees: Low traffic | Decision Trees: Normal traffic | Random Forests: High traffic | Random Forests: Low traffic | Random Forests: Normal traffic | SVM: High traffic | SVM: Low traffic | SVM: Normal traffic |
---|---|---|---|---|---|---|---|---|---|
High traffic | 599 | 2 | 1 | 599 | 6 | 0 | 504 | 84 | 11 |
Low traffic | 80 | 3694 | 2 | 11 | 3744 | 2 | 47 | 3693 | 35 |
Normal traffic | 39 | 43 | 549 | 25 | 35 | 549 | 13 | 119 | 486 |
Table 7. Road traffic conditions feature importance using the permutation feature importance approach.
Feature Importance Rank | Decision Trees | Random Forests | SVM |
---|---|---|---|
1 | Vehicle Speed Average | Engine Coolant Temperature | Engine Coolant Temperature |
2 | Engine Coolant Temperature | Vehicle Speed Average | Vehicle Speed instantaneous |
3 | Fuel Consumption Average | Fuel Consumption Average | Engine RPM |
4 | Intake Air Temperature | Intake Air Temperature | Intake Air Temperature |
5 | Vehicle Speed Variance | Vehicle Speed Variance | Longitudinal Acceleration |
6 | Engine RPM | Longitudinal Acceleration | Vehicle Speed Average |
7 | Longitudinal Acceleration | Engine RPM | Manifold Absolute Pressure |
Table 8. Driving style classification results.
Metric | Decision Trees | Random Forests | SVM |
---|---|---|---|
Accuracy | 0.92 | 0.95 | 0.91 |
Precision | 0.92 | 0.95 | 0.91 |
Recall | 0.92 | 0.95 | 0.91 |
F-score | 0.92 | 0.95 | 0.91 |
Table 9. Confusion matrix for driving style classification (rows: true class; columns: predicted class per algorithm).
True/Predicted | Decision Trees: Aggressive style | Decision Trees: Normal style | Random Forests: Aggressive style | Random Forests: Normal style | SVM: Aggressive style | SVM: Normal style |
---|---|---|---|---|---|---|
Aggressive style | 354 | 209 | 345 | 230 | 140 | 409 |
Normal style | 184 | 4243 | 37 | 4393 | 55 | 4393 |
Table 10. Driving style feature importance using the permutation feature importance approach.
Feature Importance Rank | Decision Trees | Random Forests | SVM |
---|---|---|---|
1 | Vehicle Speed Average | Vehicle Speed Average | Vehicle Speed instantaneous |
2 | Vehicle Speed instantaneous | Longitudinal Acceleration | Vehicle Speed Average |
3 | Longitudinal Acceleration | Fuel Consumption Average | Engine RPM |
4 | Fuel Consumption Average | Vehicle Speed instantaneous | Longitudinal Acceleration |
5 | Vehicle Speed Variance | Vehicle Speed Variance | Vertical Acceleration |
6 | Engine RPM | Vertical Acceleration | Manifold Absolute Pressure |
7 | Vertical Acceleration | Engine RPM | Vehicle Speed Variance |
References
1. Schmidgall, R. Automotive Embedded Systems Software Reprogramming. Ph.D. Thesis; Brunel University: London, UK, 2012.
2. Farsi, M.; Ratcliff, K.; Barbosa, M. An overview of controller area network. Comput. Control Eng. J.; 1999; 10, pp. 113-120. [DOI: https://dx.doi.org/10.1049/cce:19990304]
3. Makowitz, R.; Temple, C. Flexray-a communication network for automotive control systems. Proceedings of the 2006 IEEE International Workshop on Factory Communication Systems; Torino, Italy, 27–30 June 2006; pp. 207-212.
4. Matheus, K.; Königseder, T. Automotive Ethernet; Cambridge University Press: Cambridge, UK, 2021.
5. Zeadally, S.; Guerrero, J.; Contreras, J. A tutorial survey on vehicle-to-vehicle communications. Telecommun. Syst.; 2020; 73, pp. 469-489. [DOI: https://dx.doi.org/10.1007/s11235-019-00639-8]
6. Dey, K.C.; Rayamajhi, A.; Chowdhury, M.; Bhavsar, P.; Martin, J. Vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication in a heterogeneous wireless network–Performance evaluation. Transp. Res. Part C Emerg. Technol.; 2016; 68, pp. 168-184. [DOI: https://dx.doi.org/10.1016/j.trc.2016.03.008]
7. Ziebinski, A.; Cupek, R.; Grzechca, D.; Chruszczyk, L. Review of advanced driver assistance systems (ADAS). AIP Conference Proceedings; AIP Publishing LLC: Melville, NY, USA, 2017; Volume 1906, 120002.
8. Park, K.-Y.; Hwang, S. Robust range estimation with a monocular camera for vision-based forward collision warning system. Sci. World J.; 2014; 2014, 923632. [DOI: https://dx.doi.org/10.1155/2014/923632] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/24558344]
9. Hsu, Y.W.; Lai, Y.H.; Zhong, K.Q.; Yin, T.K.; Perng, J.W. Developing an on-road object detection system using monovision and radar fusion. Energies; 2019; 13, 116. [DOI: https://dx.doi.org/10.3390/en13010116]
10. Heidari, A.; Navimipour, N.J.; Unal, M. Applications of ML/DL in the management of smart cities and societies based on new trends in information technologies: A systematic literature review. Sustain. Cities Soc.; 2022; 85, 104089. [DOI: https://dx.doi.org/10.1016/j.scs.2022.104089]
11. Lattanzi, E.; Freschi, V. Machine learning techniques to identify unsafe driving behavior by means of in-vehicle sensor data. Expert Syst. Appl.; 2021; 176, 114818. [DOI: https://dx.doi.org/10.1016/j.eswa.2021.114818]
12. Alvarez-Coello, D.; Klotz, B.; Wilms, D.; Fejji, S.; Gómez, J.M.; Troncy, R. Modeling dangerous driving events based on in-vehicle data using Random Forest and Recurrent Neural Network. Proceedings of the 2019 IEEE Intelligent Vehicles Symposium (IV); Paris, France, 9–12 June 2019.
13. Wang, W.; Xi, J. A rapid pattern-recognition method for driving styles using clustering-based support vector machines. Proceedings of the 2016 American Control Conference (ACC); Boston, MA, USA, 6–8 July 2016; pp. 5270-5275.
14. Osman, O.A.; Hajij, M.; Bakhit, P.R.; Ishak, S. Prediction of near-crashes from observed vehicle kinematics using machine learning. Transp. Res. Rec.; 2019; 2673, pp. 463-473. [DOI: https://dx.doi.org/10.1177/0361198119862629]
15. Moreira-Matias, L.; Farah, H. On developing a driver identification methodology using in-vehicle data recorders. IEEE Trans. Intell. Transp. Syst.; 2017; 18, pp. 2387-2396. [DOI: https://dx.doi.org/10.1109/TITS.2016.2639361]
16. Ghadge, M.; Pandey, D.; Kalbande, D. Machine learning approach for predicting bumps on road. Proceedings of the 2015 International Conference on Applied and Theoretical Computing and Communication Technology (iCATccT); Davangere, India, 29–31 October 2015; pp. 481-485.
17. Dhiman, A.; Klette, R. Pothole detection using computer vision and learning. IEEE Trans. Intell. Transp. Syst.; 2019; 21, pp. 3536-3550. [DOI: https://dx.doi.org/10.1109/TITS.2019.2931297]
18. Kim, T.; Ryu, S.-K. Review and analysis of pothole detection methods. J. Emerg. Trends Comput. Inf. Sci.; 2014; 5, pp. 603-608.
19. Bernas, M.; Płaczek, B.; Korski, W.; Loska, P.; Smyła, J.; Szymała, P. A survey and comparison of low-cost sensing technologies for road traffic monitoring. Sensors; 2018; 18, 3243. [DOI: https://dx.doi.org/10.3390/s18103243] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/30261665]
20. Martinelli, F.; Mercaldo, F.; Nardone, V.; Orlando, A.; Santone, A. Who’s Driving My Car? A Machine Learning based Approach to Driver Identification. Proceedings of the 4th International Conference, ICISSP 2018; Funchal, Portugal, 22–24 January 2018; pp. 367-372.
21. Martinelli, F.; Mercaldo, F.; Santone, A. Machine learning for driver detection through CAN bus. Proceedings of the 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring); Antwerp, Belgium, 25–28 May 2020; pp. 1-5.
22. Goh, C.C.; Kamarudin, L.M.; Zakaria, A.; Nishizaki, H.; Ramli, N.; Mao, X.; Syed Zakaria, S.M.; Kanagaraj, E.; Abdull Sukor, A.S.; Elham, M.F. Real-time in-vehicle air quality monitoring system using machine learning prediction algorithm. Sensors; 2021; 21, 4956. [DOI: https://dx.doi.org/10.3390/s21154956] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/34372192]
23. Bai, R.; Chen, X.; Chen, Z.L.; Cui, T.; Gong, S.; He, W.; Jiang, X.; Jin, H.; Jin, J.; Kendall, G. et al. Analytics and machine learning in vehicle routing research. Int. J. Prod. Res.; 2021; pp. 1-27. [DOI: https://dx.doi.org/10.1080/00207543.2021.2013566]
24. Kaggle. Available online: https://www.kaggle.com/datasets/gloseto/traffic-driving-style-road-surface-condition (accessed on 18 July 2022).
25. Ruta, M.; Scioscia, F.; Loseto, G.; Pinto, A.; Di Sciascio, E. Machine learning in the Internet of Things: A semantic-enhanced approach. Semant. Web; 2019; 10, pp. 183-204. [DOI: https://dx.doi.org/10.3233/SW-180314]
26. Myles, A.J.; Feudale, R.N.; Liu, Y.; Woody, N.A.; Brown, S.D. An introduction to decision tree modeling. J. Chemom.; 2004; 18, pp. 275-285. [DOI: https://dx.doi.org/10.1002/cem.873]
27. Biau, G.; Scornet, E. A random forest guided tour. Test; 2016; 25, pp. 197-227. [DOI: https://dx.doi.org/10.1007/s11749-016-0481-7]
28. Noble, W.S. What is a support vector machine? Nat. Biotechnol.; 2006; 24, pp. 1565-1567. [DOI: https://dx.doi.org/10.1038/nbt1206-1565]
29. Scikit-learn. Available online: https://scikit-learn.org/stable/ (accessed on 18 July 2022).
30. NumPy. Available online: https://numpy.org/ (accessed on 18 July 2022).
31. Pandas. Available online: https://pandas.pydata.org/ (accessed on 18 July 2022).
32. Matplotlib. Available online: https://matplotlib.org/ (accessed on 18 July 2022).
33. Altmann, A.; Toloşi, L.; Sander, O.; Lengauer, T. Permutation importance: A corrected feature importance measure. Bioinformatics; 2010; 26, pp. 1340-1347. [DOI: https://dx.doi.org/10.1093/bioinformatics/btq134]
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Many network protocols, such as Controller Area Network (CAN) and Ethernet, are used in the automotive industry to allow vehicle modules to communicate efficiently. These networks carry rich data from the different vehicle systems, such as the engine, transmission and brakes. This in-vehicle data can be used with machine learning algorithms to predict valuable information about the vehicle and the road. In this work, a low-cost machine learning system that uses in-vehicle data is proposed to solve three classification problems: road surface condition, road traffic condition and driving style. Random forest, decision tree and support vector machine algorithms were evaluated on labeled CAN data. Each algorithm was used to classify the road surface condition as smooth, even or full of holes, the road traffic condition as low, normal or high, and the driving style as normal or aggressive. The detection results are presented and analyzed. The random forest algorithm achieved the highest detection accuracy, with an overall accuracy score between 92% and 95%.