Wireless traffic prediction is vital for network planning and management, enabling real-time decisions and both short- and long-term forecasting. Accurate and efficient techniques improve cellular networks by optimizing resource allocation, adapting to dynamic user behavior, and ensuring high-quality service through pattern recognition in network traffic. This facilitates proactive management, including load balancing and beam coordination. This paper develops eight models for cellular network traffic prediction using the Telecom Italia Big Data Challenge dataset, which provides a comprehensive view of urban activities and telecommunications in the city of Milan. These models comprise seasonal autoregressive integrated moving average (SARIMA), Facebook Prophet, adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), long short-term memory (LSTM), convolutional neural network (CNN), a hybrid CNN-LSTM, and an ensemble model that combines the outputs of CNN and LSTM. The models were applied to predict different types of network traffic, namely Internet, SMS, and call traffic, across distinct geographic regions: city center, commercial, residential, and business. Each region exhibited unique temporal traffic patterns influenced by weekdays, weekends, and local activities. The models are evaluated based on performance metrics and computational time. The results demonstrate that the ensemble CNN+LSTM is the most accurate model, achieving R2 values of 0.990 for Internet, 0.986 for call, and 0.976 for SMS traffic, followed by the hybrid CNN-LSTM and LSTM models. These models, however, carry a high level of computational complexity. Meanwhile, the AdaBoost and XGBoost models offer practical alternatives that balance accuracy with computational efficiency. Finally, the ensemble CNN+LSTM surpasses prior research, demonstrating enhanced predictive reliability across all network traffic types.
Introduction
Cellular networks have evolved significantly over the decades, starting with the first generation (1G) in the late 1970s, which provided voice communication via analog technology but suffered from poor quality and limited coverage. The 1990s saw the advent of 2G networks, which introduced digital transmission, improved reliability and security, and added SMS capabilities. The transition to 2.5G brought general packet radio service (GPRS) for Internet services. Third-generation (3G) networks enhanced Internet connectivity and multimedia services through the universal mobile telecommunication system (UMTS), CDMA2000, and other technologies. Fourth-generation (4G) networks, like long-term evolution (LTE), further improved speed and enabled high-quality audio and video streaming. The latest evolution, 5G networks, promises even more significant advancements, offering significantly higher data transmission rates, reduced latency, and enhanced connectivity. This enables more reliable and faster communication services, supporting various applications, from mobile broadband to the Internet of things (IoT) and autonomous vehicles. 5G technology is set to revolutionize communication networks by providing seamless connectivity and fostering innovations across various sectors [1]. According to the Cisco annual Internet report (2018–2023), 5G devices and connections are projected to make up over 10% of global mobile devices and connections by 2023 [2].
The increased number of devices also complicates the system architecture [3]. Researchers are exploring the potential challenges for future wireless communications, starting with 6G and beyond. These challenges include higher requirements for latency, reliability, security, and privacy [4]. A large user group would generate significant traffic across several services, requiring more efficient allocation and scheduling of network resources to support enhanced cellular networks [5]. This can be achieved by implementing effective traffic prediction techniques utilizing artificial intelligence. The utilization of the cellular network would increase energy consumption. However, this issue may be reduced by employing adaptive network strategies. For instance, intelligent base station sleeping techniques can be developed using traffic prediction [6].
Recently, cellular networks have become the backbone of modern society, supporting many services such as Internet access, SMS, and voice calls. Cellular traffic prediction is challenging due to the complex internal patterns hidden within historical traffic data. Although there are multiple challenges in predicting network traffic, ML models can provide benefits in multiple real-world scenarios, both in the short term and the long term [7, 8]. From an ML perspective, optimal resource planning, congestion control, and packet routing can be achieved by applying short-term prediction ranging from milliseconds to minutes. Long-term forecasts can be used to analyze future capacity requirements, support network security, and reduce unnecessary operational costs by optimizing bandwidth and energy [8]. Accurate traffic prediction is crucial for several reasons. Firstly, in terms of quality of service (QoS) [9], predicting traffic patterns enables operators to preemptively manage network resources, ensuring consistent and reliable service even during peak usage. Secondly, for capacity planning, understanding future traffic trends allows operators to make informed decisions about infrastructure investments, avoiding over-provisioning or under-provisioning network resources [10]. Thirdly, from a cost-efficiency perspective, efficient traffic management helps reduce operational costs by optimizing the use of network resources and minimizing unnecessary expenditures. Lastly, regarding energy efficiency and environmental impact, effective traffic prediction contributes to energy saving by allowing operators to power down underutilized network components during low-traffic periods, thereby reducing the carbon footprint of cellular networks and promoting a greener environment.
Given the temporal nature of traffic data, time series prediction is essential for capturing patterns and trends over time, providing valuable insights into future traffic behavior. Currently, numerous ML and deep learning (DL) models are utilized to achieve accurate traffic prediction. The Seasonal Auto-Regressive Integrated Moving Average (SARIMA) model can capture seasonality, trends, and noise [11]. Cellular network traffic exhibits periodic patterns, including daily or weekly cycles, which can be effectively modeled by SARIMA through its seasonal differencing and seasonal ARIMA components. Correia et al. [12] demonstrated SARIMA's precision in predicting network key performance indicators (KPIs) by leveraging these capabilities to efficiently model the time series data. Additionally, SARIMA's ability to stabilize data trends through differencing while managing noise via its moving average (MA) component makes it robust against unpredictable traffic spikes. Its computational efficiency and interpretability further enhance its practicality for real-world applications in network traffic prediction. The SARIMA model's success has also been validated in domains like IoT traffic analysis [13], demonstrating its versatility and transferability to similar contexts. Another model, known as Facebook Prophet (FB Prophet), is widely recognized as a robust tool for time series forecasting, particularly for its efficacy in managing missing data, seasonality, and abrupt trend changes [14]. Unlike traditional statistical models, FB Prophet incorporates a flexible and modular framework that seamlessly integrates user-defined seasonalities and holiday effects. This feature makes it particularly well-suited for modeling irregularities in cellular network traffic patterns, where periodicity and special events can significantly influence data trends. Guesmi et al. [15] demonstrated the effectiveness of the FB Prophet model in forecasting scenarios characterized by complex seasonal trends and irregular fluctuations. The FB Prophet model's capacity to adapt to missing or incomplete datasets ensures forecast reliability even when data irregularities occur, such as those due to temporary outages or corrupted logs. Moreover, FB Prophet's intuitive design facilitates ease of customization, allowing users to tailor it to specific forecasting needs, and it provides interpretable results, making it highly practical for real-world applications.
By transitioning to ML models, the adaptive-boosting (AdaBoost) model has been demonstrated to be a useful tool for enhancing the accuracy of time series predictions in a variety of domains by integrating multiple weak learners into a robust predictive model [16]. Zhang [17] highlighted AdaBoost's capability to enhance long short-term memory (LSTM) based Internet traffic forecasting, emphasizing its strength in improving weak learners and addressing complex data patterns. Moreover, the extreme gradient boosting (XGBoost) model has been extensively applied in time series prediction due to its efficiency and ability to handle large datasets with complex patterns [18, 19]. For instance, Khattak et al. [20] integrated XGBoost into a hybrid framework for wind shear intensity forecasting, where it outperformed other statistical models. On the other hand, LSTM networks are powerful for time series prediction due to their ability to capture long-term dependencies and patterns in sequential data [21]. LSTM networks are designed to address the vanishing gradient problem found in traditional neural networks, making them suitable for predicting traffic patterns over extended periods. Also, convolutional neural networks (CNNs), when adapted for time series prediction, are proficient at identifying local patterns and trends via their convolutional layers [22]. CNNs have been utilized for proactive IoT network optimization, harnessing their ability to extract time-series features for the identification of network traffic [23].
Based on the aforementioned introduction, this paper is structured into multiple sections. Section 2 offers an extensive review of related work in traffic prediction, encompassing the diverse strategies and methodologies previously investigated in the literature. Section 3 presents a detailed problem statement, clearly identifying the challenges and providing a comprehensive formulation of the proposed solution; this section also highlights the authors' specific contributions, emphasizing the novelty and significance of their work in the study context. Section 4 describes the experimental configuration, including the utilized datasets, the data preprocessing techniques, and the data profiling for traffic prediction. Section 5 outlines the experimental results, providing a comprehensive quantitative and visual assessment of the efficacy of the developed prediction models. Section 6 concludes the paper by summarizing the findings and highlighting the most effective models based on various criteria. Lastly, Sect. 7 highlights the scope for future research, focusing on enhancing model scalability, incorporating real-time data processing, and applying advanced deep learning techniques to broader and more complex network environments.
Related work
This section investigates the related works on traffic prediction. The challenge of accurately predicting network traffic has become increasingly pronounced in recent years. The exponential growth in mobile data traffic, driven by technological advancements in communication and networking, has significantly increased the number of devices connected to cellular networks. Moreover, the widespread adoption of social media platforms such as Instagram and Facebook has further amplified the volume and complexity of network traffic. Table 1 summarizes the various models applied in prior studies, along with the most effective model in each. A significant percentage of mobile traffic is randomly unpredictable, as demonstrated by Xu et al. [24], who examined the traffic patterns of over 9000 base stations (BSs) in an urban region. Huang et al. [25] identified several obstacles ML models face in network traffic prediction. These challenges encompass data collection, class imbalance, and handling large datasets [26]. Nevertheless, the literature has introduced numerous methods to address network traffic prediction effectively. Various methods have been developed to analyze the dynamics of cellular networks, with time-series analytic techniques being widely utilized for predicting mobile traffic patterns. Most studies employed approaches such as ARIMA to capture the patterns in the temporal evolution of mobile traffic [27, 28]. One limitation of these methods is that they perform poorly on rapidly changing time series, as the estimates tend to regress toward the average of previously observed values [6]. Furthermore, these techniques were applied to homogenous time-series data, where both the input and the forecast were within the same range of values. Alekseeva et al. [29] explored ML and DL techniques to enhance network operations in complex environments. They reviewed the machine learning applications currently used in communication, identified significant challenges, and suggested potential solutions.
Table 1. Summary of applied Machine learning models in prior literature
| Study | Year | Application | Implemented models | Best model |
|---|---|---|---|---|
| Huang et al. [25] | 2017 | Mobile traffic prediction | CNN, RNN, and combined CNN-RNN | CNN-RNN |
| Fang et al. [30] | 2018 | Spatial and temporal traffic prediction | SARIMA, graph convolutional network (GCN), LSTM, and combined GCN-LSTM | GCN-LSTM |
| Andreoletti et al. [31] | 2019 | Network traffic prediction | Diffusion convolutional recurrent neural network (DCRNN), LSTM, CNN, and CNN-LSTM | DCRNN |
| Wang et al. [7] | 2019 | Spatial and temporal traffic prediction | SARIMA, LSTM, Holt-Winters, and GNN | GNN |
| Aldhyani et al. [32] | 2020 | Time series traffic prediction | ANFIS and LSTM | LSTM |
| Dommaraju et al. [33] | 2020 | Spatial and temporal traffic prediction | Multilayer perceptron deep neural learning (MPDNL), spatio-temporal convolutional networks (STCNet), and GNN | MPDNL |
| Lin et al. [6] | 2021 | Sleeping control strategy based on cellular traffic prediction | Multi-graph convolutional network (MGCN), ARIMA, LSTM, and MGCN-LSTM | MGCN-LSTM |
| Alekseeva et al. [29] | 2021 | Traffic prediction of a real wireless network | Linear, Huber, and Bayesian regression; SVM; Bagging; Random Forest; and Gradient Boosting | Gradient Boosting |
| Richerzhagen et al. [8] | 2022 | Spatial-temporal traffic prediction | LSTM, Bidirectional LSTM, CNN, CNN 1D-LSTM, CNN 2D-LSTM, DenseNet, and DenseNet Fusion | CNN 2D-LSTM |
| Hassan et al. [34] | 2022 | Dynamic bandwidth slice prediction | LSTM with six local smoothing techniques (LOWESS, LOESS, Moving Average, Savitzky–Golay, RLOESS, and RLOWESS) | Hybrid Moving Average LSTM |
| Ferreira et al. [35] | 2023 | Spatial and temporal traffic prediction | RNN, LSTM, Gated Recurrent Unit (GRU), and CNN | 3D CNN |
| Guesmi et al. [15] | 2024 | Traffic forecasting in emerging cellular networks | SARIMA, FB Prophet, LSTM, and GRU | LSTM |
| Zhang [17] | 2024 | Internet traffic prediction | AdaBoost, LSTM, and hybrid AdaBoost-LSTM | AdaBoost-LSTM |
| Yuliana et al. [36] | 2024 | Network traffic prediction | KNN, Random Forest, and XGBoost | XGBoost |
A comparative analysis of methods like Linear Regression, Gradient Boosting, and SVM on LTE network edge traffic using a public dataset revealed distinct performance metrics [29]. Aldhyani et al. [32] focused on improving QoS in network management by predicting traffic loads using advanced time series models enhanced with machine intelligence. Their study combines fuzzy c-means clustering and weighted exponential smoothing as preprocessing steps to augment ML models, specifically LSTM and the adaptive neuro-fuzzy inference system (ANFIS). When tested on real network traffic data, the proposed model significantly reduces prediction errors and achieves more reliable results. Azari et al. [37] and Madan and Mangipudi [38] conducted comparative studies of ARIMA and LSTM in terms of the impact of various parameters on the models. Their simulation results demonstrate the superiority of LSTM over ARIMA, mainly when the training time series is sufficiently long; however, ARIMA provides nearly optimal performance with reduced complexity in certain situations. The research conducted by Yuliana et al. [36] explored traffic and throughput prediction in cellular networks using ML to enhance network performance. Three machine learning models, namely K-nearest neighbors (KNN), random forest (RF), and XGBoost, were evaluated, and the XGBoost model delivered the best results for traffic prediction. Based on these studies, the effectiveness of ML models, particularly gradient boosting, in enhancing resource allocation and network efficiency is evident, contributing to advancements in 5G technologies.
Deep learning techniques exhibit an exceptional ability to capture the complex and nonlinear relationships concealed within wireless communications. The CNN model is one of the most robust and effective options among DL models and has proven highly successful in various fields, including computer vision and natural language processing (NLP). Traffic prediction has likewise benefited from the effective application of CNNs. Zhang et al. [39] proposed a novel method for citywide traffic pattern prediction utilizing the CNN model, using it mainly to model the spatial and temporal correlation of traffic across various network cells. Numerous studies have been conducted to analyze the dynamic properties of mobile network traffic using methods such as ARIMA, ML, and DL [40, 41]. Andreoletti et al. [31] proposed a novel method for traffic prediction using a diffusion convolutional recurrent neural network (DCRNN), implementing a model to predict network congestion by predicting the traffic. The authors also compared the developed model with other models, including fully connected neural networks and LSTM; the results indicated that the proposed model outperformed the referenced counterparts regarding network congestion prediction. To enhance prediction accuracy and reduce computational time, the expected conditional maximization clustering and Ruzicka regression-based multilayer perceptron deep neural learning (ECMCRR-MPDNL) technique was introduced [33]. This method processes spatial and temporal cellular data through a deep neural network with multiple layers. By leveraging an activation function at the output layer, the technique predicts traffic with high precision. Evaluated on real-world datasets, ECMCRR-MPDNL demonstrated superior performance, achieving 98% accuracy, a 20% reduction in prediction time, and a lower false-positive rate compared to state-of-the-art methods. Furthermore, a graph convolutional network (GCN) has been used to model spatial relevancy through a dependency graph based on spatial distances, while LSTM captures temporal dependencies [30]. The integrated graph convolutional LSTM (GCN-LSTM) model further enhances performance by simultaneously modeling spatial and temporal aspects. Experimental results in [30] demonstrated that GCN-LSTM significantly outperformed standalone GCN, LSTM, and traditional ARIMA models, providing superior forecasting accuracy and granularity for cellular demand prediction. Hassan et al. [34] enhanced traffic forecasting accuracy by integrating LSTM neural networks with six local smoothing techniques, creating a dynamic learning framework for adaptive bandwidth slice prediction. Their results demonstrate that the hybrid moving average LSTM (MLSTM) and robust locally weighted scatter plot smoothing LSTM (RLWLSTM) significantly improve prediction accuracy, achieving up to 100% enhancement in certain scenarios. However, the pattern of mobile network traffic is highly complex due to various factors, including the mobility and diversity of user equipment (UE). Consequently, it quickly becomes clear that linear models are ineffective in this complex setting, and adopting new models based on deep learning is essential [42].
Problem statement and research gap
In the modern age of rapid technological advancement and broad mobile communication, cellular networks have become essential, providing a wide range of services, including Internet access, SMS, and calls. However, the exponential growth in demand for these services poses significant challenges for network operators in managing and optimizing network traffic to maintain high-quality service. The inability to accurately predict network traffic can lead to congestion, degraded QoS, inefficient resource allocation, increased operational costs, and higher energy consumption. Current methods for traffic management often fail to adequately address the dynamic and complex nature of network traffic, which exhibits temporal patterns, seasonal variations, and abrupt changes. Consequently, there is a pressing need for advanced predictive models that can provide accurate and reliable forecasts of network traffic. This will enable proactive resource management, energy savings, and improved service quality. Our proposed solution to this challenge is to use advanced ML and DL models to predict cellular network traffic time series.
The primary objective of this research is to develop and evaluate the efficacy of different models for cellular network traffic prediction. These models comprise SARIMA (a traditional statistical model), ML models (FB Prophet, AdaBoost, and XGBoost), DL models (LSTM and CNN), and integrated models (the hybrid CNN-LSTM and the ensemble CNN+LSTM). The selection of these models is based on their ability to effectively capture different aspects of time series data, including seasonality, trends, noise, and long-term dependencies. The study identifies the most effective models for predicting cellular network traffic based on various criteria, including different statistical regression metrics and computational time. These models are applied to predict various types of cellular network traffic, including (1) Internet traffic records, characterized by high variability and strong temporal patterns; (2) SMS traffic records, which exhibit periodic patterns influenced by social and cultural factors; and (3) call traffic records, which demonstrate daily and weekly patterns. Cellular network operators can achieve highly accurate traffic predictions using advanced statistical, machine learning (ML), deep learning (DL), and hybrid models. These insights enable optimized 5G network performance, cost efficiency, and a minimized environmental impact. The key contributions of this research are as follows:
A comprehensive evaluation of various predictive models for cellular network traffic prediction is conducted, including SARIMA, FB Prophet, AdaBoost, XGBoost, LSTM, CNN, hybrid CNN-LSTM, and ensemble CNN+LSTM.
The significance of spatiotemporal traffic prediction is highlighted, incorporating both spatial dimensions (different geographic zones) and temporal patterns (weekdays, weekends, and peak hours).
An innovative sampling technique is proposed to reduce dataset size while preserving essential traffic patterns, ensuring predictive accuracy, and enabling efficient evaluation on smaller, more manageable datasets.
The best predictive models for real-time applications, where computational efficiency is critical, are identified, and the models achieving the highest accuracy across different traffic types and regions are highlighted.
Methodology
Figure 1 illustrates a methodological framework adopted in the current study. The approach begins with collecting a comprehensive dataset covering 10,000 areas over 62 days throughout the city of Milan [43]. This extensive data collection is vital as it forms the foundation upon which all subsequent analyses and model validations are built. Data preprocessing follows, which is essential for maintaining the quality and usability of the data. During this phase, standardization of time formats and resampling ensure that the dataset is uniform. Also, handling missing values through appropriate imputation methods prevents biases that could skew the study’s results, while normalization ensures that different scales of data do not distort the predictive models’ performance. Then, the dataset analysis phase of the research involves temporal and spatial profiling, alongside the generation of a correlation heatmap. Temporal profiling helps identify time-related data trends, crucial for developing prediction models by highlighting cyclical and seasonal patterns. Spatial profiling examines geographic variability, essential for understanding regional differences in telecommunications data.
[See PDF for image]
Fig. 1
Methodology flowchart
Eight predictive models were implemented, with the dataset split into 40 days for training and 10 days for testing. This predictive modeling stage utilizes a range of statistical, ML, and DL models to generate predictions. This diversity in modeling approaches is crucial for benchmarking and understanding the strengths and limitations of different methodologies. As an integral part of the modeling process, the hyperparameters of each model were determined through grid search and random search. This step fine-tunes the models to achieve the best possible performance by systematically exploring various configurations. The developed models were then evaluated and validated using visual representations (e.g., violin boxplots and dynamic curves) and quantitative measurements (e.g., regression metrics and uncertainty analysis) [44]. Additionally, the trade-off between accuracy and computational time was examined by employing the gated recurrent unit (GRU) technique [45].
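For illustration, the 40/10-day chronological split described above could be implemented as follows, assuming the hourly traffic of one grid square is stored in a pandas Series indexed by timestamp (a minimal sketch with illustrative names, not the authors' exact code):

```python
import pandas as pd

def chronological_split(series: pd.Series, train_days: int = 40, test_days: int = 10):
    """Split an hourly traffic series into contiguous train/test windows.

    The first `train_days` days are used for fitting and the following
    `test_days` days for evaluation, preserving temporal order.
    """
    hours_per_day = 24
    train = series.iloc[: train_days * hours_per_day]
    test = series.iloc[train_days * hours_per_day: (train_days + test_days) * hours_per_day]
    return train, test

# Example with synthetic hourly data covering 50 days
idx = pd.date_range("2013-11-01", periods=50 * 24, freq="h")
traffic = pd.Series(range(len(idx)), index=idx, name="internet")
train, test = chronological_split(traffic)
print(len(train), len(test))  # 960 training hours, 240 test hours
```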
The next step involves employing a sampling technique to reduce the dataset size. This reduction is crucial for optimizing computational resources and enhancing the efficiency of the model deployment phase. After sampling the dataset, the best-performing models were selected using a reduced dataset of 15 days, with 11 days for training and 4 days for testing. The generalizability of the most effective model is also assessed and validated using an independent dataset, different from the one utilized in the current study. Finally, a comparative analysis benchmarked the developed models against prior studies, demonstrating their superior effectiveness.
Database collection
A European telecommunications service provider provided cellular traffic data as part of the "Big Data Challenge" [43]. The raw dataset was collected from November 1, 2013, to January 1, 2014, with a temporal interval of 10 min, covering the entire city of Milan. As shown in Fig. 2a, Milan is divided into a grid of 100 × 100 squares, each square measuring 0.235 km × 0.235 km. Within each square, three types of cellular traffic are recorded: short message service (SMS), call service (Call), and Internet service. The original dataset includes the following fields: Square ID, Timestamp (in epoch Unix format, representing the number of milliseconds since January 1, 1970, with each timestamp covering 10 min, or 600,000 milliseconds), SMS-in, SMS-out, Call-in, Call-out, and Internet. This study adjusts the time interval for traffic aggregation to one hour instead of ten minutes to decrease the number of records for each type of traffic.
[See PDF for image]
Fig. 2
Milan city grids
Figure 2b illustrates the intensity heatmap distribution of cellular network traffic across various areas in Milan city, showcasing spatial variability in traffic intensity. The city’s central region exhibits a significantly higher concentration of cellular network activity, as indicated by the red and orange hues on the heatmap, suggesting that this area experiences the densest usage. In contrast, the peripheral regions are characterized by lower traffic intensity, represented by blue hues, which likely correspond to less populated or less network-demanding zones. This gradient in traffic intensity reflects the urban dynamics of Milan, wherein cellular activity progressively declines from the central business and residential districts toward the peripheral regions.
The call detail records (CDRs) utilized in this research were provided by the semantics and knowledge innovation lab (SKIL) of Telecom Italia. Each time a user interacts with the telecommunication network, the operator assigns a radio base station (RBS) to facilitate communication through the network. Subsequently, a new CDR is created, recording the time of the interaction and the RBS that handled it. To spatially aggregate the CDRs within the grid, each interaction is associated with the coverage area of the RBS that managed it. Telecom Italia records various types of CDRs, capturing different user activities, as shown in Table 2. The shared datasets are created by combining all this anonymized information with a temporal aggregation into time slots of ten minutes. The number of records in the datasets follows the rule:
$$\text{Records} = k' \times \text{(actual number of interactions)} \qquad (1)$$
Table 2. CDRs and user activities
| CDRs* | Activities |
|---|---|
| Received SMS | The user receives an SMS |
| Sent SMS | The user sends an SMS |
| Incoming call | The user receives a call |
| Outgoing call | The user makes a call |
| Internet | The user initiates or terminates an Internet connection |

*An additional Internet CDR is generated during a single connection if the connection lasts for more than 15 min or if the user transfers more than 5 MB of data
where $k'$ is a constant that Telecom Italia defines to obscure the actual number of calls, SMS, and connections. This ensures the anonymity and confidentiality of the user data while still allowing for the analysis of traffic patterns.
Within the same framework, Table 3 presents a comprehensive summary of the collected data's characteristics, providing descriptive statistics for the Internet, SMS, and Call traffic. Each traffic type is characterized by its minimum, maximum, mean, median, standard deviation, kurtosis, and skewness, highlighting its distribution, central tendency, and dispersion. Internet traffic exhibits a high mean value, indicating that, on average, Internet usage is significantly greater than SMS and call usage. This is further supported by its high standard deviation and the large difference between its minimum and maximum, suggesting that Internet traffic fluctuates substantially among users. Additionally, the skewness (2.84) and kurtosis (7.49) values indicate a highly right-skewed and leptokurtic distribution, implying that a small number of users contribute disproportionately high traffic.
Table 3. Descriptive statistics of the collected database
| Traffic type | Mean | Median | Standard deviation | Kurtosis | Skewness | Minimum | Maximum |
|---|---|---|---|---|---|---|---|
| Internet | 2533.99 | 280.18 | 5804.14 | 7.49 | 2.84 | 16.748 | 33606.87 |
| SMS | 317.43 | 21.06 | 784.18 | 6.32 | 2.72 | 0.020 | 4744.77 |
| Calls | 295.97 | 25.99 | 729.42 | 6.10 | 2.71 | 0.001 | 4027.28 |
SMS traffic, in contrast, has a considerably lower mean value and standard deviation, illustrating that SMS usage is more stable and less variable than Internet traffic. The median is much lower than the mean, confirming a right-skewed distribution (skewness = 2.72). The relatively high kurtosis (6.32) suggests that SMS traffic follows a peaked distribution, where extreme values are more likely than in a normal distribution. Similarly, call traffic shows a right-skewed distribution (skewness = 2.71). The minimum and maximum of the call traffic further highlight a wide range of traffic values, indicating that while most users generate relatively low call traffic, a few individuals engage in significantly higher call activity. Overall, the analysis reveals that Internet traffic exhibits the highest variability and usage, with a significantly larger range and dispersion than SMS and calls. While all three traffic types display right-skewed and leptokurtic distributions, indicating the presence of a small subset of high-traffic users, SMS and call traffic demonstrate more consistent usage patterns with lower means and variability.
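For reference, the summary statistics in Table 3 can be reproduced with pandas, assuming the hourly records are held in a DataFrame with one column per traffic type (column names are illustrative):

```python
import pandas as pd

def describe_traffic(df: pd.DataFrame) -> pd.DataFrame:
    """Return the summary statistics reported in Table 3 for each traffic column."""
    stats = pd.DataFrame({
        "mean": df.mean(),
        "median": df.median(),
        "std": df.std(),
        "kurtosis": df.kurtosis(),   # pandas uses the Fisher (excess) definition
        "skewness": df.skew(),
        "min": df.min(),
        "max": df.max(),
    })
    return stats.round(2)

# usage: describe_traffic(hourly_df[["internet", "sms", "calls"]])
```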
Data preprocessing and preparation
The preprocessing of the dataset involves several key steps to ensure data quality and usability for analysis. First, timestamps in the dataset, originally recorded in Epoch Unix format, are converted to a human-readable date and time format. This transformation enhances the interpretability of the data and facilitates subsequent analysis. Next, the data is resampled to an hourly interval. Since the original data is recorded at 10-min intervals, this process involves aggregating the measurements into 1-h intervals. Resampling reduces the data’s granularity, making it more manageable while retaining critical temporal patterns and trends.
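A minimal pandas sketch of these preprocessing steps, assuming the raw CDR table has an epoch-millisecond timestamp column and one column per traffic field (all names are illustrative, not the dataset's exact headers):

```python
import pandas as pd

def preprocess_cdrs(raw: pd.DataFrame) -> pd.DataFrame:
    """Convert epoch timestamps, resample 10-min records to hourly totals,
    and impute any missing hours by time-based interpolation."""
    df = raw.copy()
    # Epoch milliseconds -> human-readable datetime index
    df["datetime"] = pd.to_datetime(df["timestamp_ms"], unit="ms")
    df = df.set_index("datetime").sort_index()

    # Aggregate the 10-minute records into 1-hour intervals
    hourly = df[["sms_in", "sms_out", "call_in", "call_out", "internet"]].resample("1h").sum()

    # Fill missing hours (if any) by interpolation over time
    hourly = hourly.interpolate(method="time")
    return hourly
```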
To maintain data integrity, a thorough check for missing values is conducted. The total dataset size is expected to be 24 × 62 × 10,000 records, representing 24 h per day, 62 days, and 10,000 grid areas. If missing values are identified, they are addressed through interpolation or other appropriate imputation methods to ensure completeness. Finally, the data is normalized using Min–Max scaling, which transforms the values to a fixed range, typically between 0 and 1. The Min–Max mapping technique can be expressed as follows:
$$X_n = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \qquad (2)$$
where $X_n$ represents the normalized data; $X_{\min}$ and $X_{\max}$ denote the smallest and largest values of the CDRs, respectively; and $X$ stands for the original data undergoing the rescaling process. This normalization is crucial for optimizing the performance of many ML algorithms by ensuring that features are on a comparable scale and preventing bias during model training.
Dataset analysis
Spatial profiling and sampling
An analysis of traffic data across 10,000 grid squares was conducted to understand spatial usage patterns. As an instance of Internet records, Fig. 3 illustrates the relationship between grid number and mean usage. It highlights significantly greater traffic in central grid regions relative to peripheral areas. Activity decreases substantially with increasing distance from the city center, highlighting the central area's considerable demand for network resources. Four representative grid squares were selected from the 10,000 grid squares to assess the efficacy of different traffic prediction models. These selections were based on their activity levels, serving as a representative sample for applying the various traffic prediction models [46]:
City center region (grid square 5161).
Commercial region (grid square 1884).
Residential region (grid square 1684).
Business region (grid square 7121).
[See PDF for image]
Fig. 3
Mean Internet usage for all grid squares
Figure 4 compares the average total traffic records, including Internet, calls, and SMS, during the weekdays and weekends. The dataset is presented in four grid squares, each representing a distinct area type: city center, commercial, residential, and business. The city center exhibits more balanced traffic, with a slight increase at weekends, as illustrated in Fig. 4a. This area generally comprises a combination of commercial, residential, and recreational facilities, attracting people throughout the week. The weekend increase suggests heightened activity as people visit for shopping, dining, and entertainment. In commercial areas, as illustrated in Fig. 4b, the traffic is higher during weekdays. This can be attributed to business hours when people visit shops, malls, and offices. The decline in weekend traffic indicates diminished commercial activities, as numerous businesses close or shorten their operating hours. The residential area depicted in Fig. 4c exhibits an increase in traffic records over weekends. This pattern is likely due to residents at home or engaging in local community activities. During weekdays, traffic decreases as people leave for work or school, leading to lower network usage. Lastly, the traffic in the business area is considerably higher on weekdays, as illustrated in Fig. 4d. These areas are typically home to offices, corporate headquarters, and other business facilities that operate primarily during standard business hours. The sharp drop in traffic during weekends indicates that these areas are vacated when businesses are closed.
[See PDF for image]
Fig. 4
Weekday versus weekend average traffic comparison
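The weekday/weekend comparison of Fig. 4 can be reproduced along the following lines, assuming an hourly DataFrame per region indexed by datetime with one column per traffic type (a sketch with illustrative names):

```python
import pandas as pd

def weekday_weekend_means(hourly: pd.DataFrame) -> pd.DataFrame:
    """Average traffic per traffic type, split into weekdays and weekends."""
    is_weekend = hourly.index.dayofweek >= 5  # Saturday=5, Sunday=6
    grouped = hourly.groupby(is_weekend).mean()
    grouped.index = ["weekday", "weekend"]    # groups are sorted False (weekday) then True
    return grouped

# usage: weekday_weekend_means(hourly_city_center[["internet", "sms", "calls"]])
```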
Temporal profiling and sampling
First, the data is explored across individual days. Figure 5 illustrates Internet traffic records in randomly selected grid areas, where random days were chosen from the 62-day period to analyze daily activity patterns. This figure highlights the differences in traffic patterns between weekdays and weekends, with a particular focus on peak hours. Weekdays demonstrate higher traffic levels during specific peak periods, commonly corresponding to work hours such as the morning and evening commute. In contrast, weekend traffic patterns (e.g., Day 2) show variability, with peaks occurring at different times of the day due to changes in user behavior and activity. Understanding these distinctions is crucial for analyzing network traffic dynamics, as it significantly influences the design and accuracy of predictive models. By accounting for these variations, models can be more effectively tailored to manage network resources.
[See PDF for image]
Fig. 5
Random days usage over 62 days
Figure 6 presents an analysis of Internet traffic records within randomly chosen grid areas. The comparison shows that the last two weeks exhibited patterns different from the preceding weeks, largely influenced by the Christmas and New Year holidays. Because of this seasonal impact, there are deviations from the typical traffic patterns, most likely caused by changes in user activity during this period. Such anomalies have the potential to introduce biases into the modeling process, which in turn can make the model's predictions less accurate. Therefore, the final two weeks of data were excluded from the analysis to avoid these biases and improve predictive performance.
[See PDF for image]
Fig. 6
Internet traffic in random grid squares over 62 days
The analysis of the traffic patterns for Internet, calls, and SMS in the four selected regions (city center, commercial, residential, and business) is depicted in Fig. 7. The Internet exhibited the highest traffic compared with call and SMS records. Therefore, the Internet traffic pattern has the predominant magnitude and accurately reflects the overall behavior of the data. This suggests that Internet traffic is the principal factor influencing the identified traffic patterns, capturing the predominant trends and variations within the dataset. By focusing on Internet traffic, the overall traffic patterns can be accurately understood and predicted, as it serves as a holistic indicator of network activity. Consequently, in the second scenario of the present study (i.e., the sampled dataset), the Internet records will be employed to predict traffic.
[See PDF for image]
Fig. 7
Distribution of traffic records usage for the studied regions
To minimize the volume of data used in each model, it is proposed to aggregate the Internet traffic records weekly into two specific datasets derived from the actual data. The first dataset represents the average values for each of the 24 h during workdays (Monday to Friday). Averaging the data for these five days reduces the data volume while preserving the essential trends and patterns of the typical workweek. The second dataset represents the average values for each of the 24 h during the weekends (Saturday and Sunday), capturing the distinct patterns and behaviors that occur during these days. This approach significantly reduces the overall data volume by summarizing the data into two key weekly datasets, one for workdays and one for weekends; thus, the dataset spans 15 days instead of 50 days. Consequently, it maintains a manageable dataset size without losing critical temporal patterns and variations. This data reduction technique ensures that the models remain efficient and effective in processing and analyzing time series data [47]. In this reduction stage, the most effective models derived from the original dataset were employed to mitigate the high computational time typically associated with deep learning models. This approach ensures the retention of performance while optimizing computational efficiency.
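A sketch of this weekly aggregation step, assuming the hourly Internet traffic of one region is held in a pandas Series indexed by timestamp (a minimal illustration, not the authors' exact code):

```python
import pandas as pd

def weekly_profiles(internet: pd.Series):
    """Collapse the series into two average 24-hour profiles:
    one for workdays (Mon-Fri) and one for weekends (Sat-Sun)."""
    weekend_mask = internet.index.dayofweek >= 5
    workdays = internet[~weekend_mask]
    weekends = internet[weekend_mask]
    workday_profile = workdays.groupby(workdays.index.hour).mean()
    weekend_profile = weekends.groupby(weekends.index.hour).mean()
    return workday_profile, weekend_profile  # each holds 24 hourly averages
```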
Correlation heatmap
Understanding the interdependencies between the traffic records is crucial, and this can be achieved by analyzing the correlation between variables. The Pearson correlation coefficient (r) is the most commonly used metric for this purpose and is essential in comprehending these relationships [48, 49]. It is calculated as the ratio of the covariance (cov) of two traffic records to the product of their standard deviations, as illustrated in Eq. (3).
$$r = \frac{\sum_{i=1}^{n}\left(a_i - \bar{a}\right)\left(b_i - \bar{b}\right)}{\sqrt{\sum_{i=1}^{n}\left(a_i - \bar{a}\right)^{2}}\,\sqrt{\sum_{i=1}^{n}\left(b_i - \bar{b}\right)^{2}}} \qquad (3)$$
where $n$ is the number of data points, and $\bar{a}$ and $\bar{b}$ are the means of traffic records $a$ and $b$, respectively. The correlation heatmap presented in Fig. 8 provides valuable insights into the interdependencies among the traffic variables (calls, SMS, and Internet) across four distinct regions: the city center, commercial area, residential area, and business area. The analysis highlights varying degrees of correlation in each region, reflecting differences in communication patterns and network demand. Each cell in the matrix indicates the correlation coefficient (CC) for the corresponding pair of traffic records represented by the row and column. As expected, the diagonal cells reflect a perfect self-correlation for each variable, with a CC of 1.0.
[See PDF for image]
Fig. 8
Heatmap correlation matrix of Calls, SMS, and Internet for each region
In the city center region (Fig. 8a), the CCs reveal a highly interconnected relationship between all traffic variables. The strongest correlation is observed between calls and SMS, with an r-value of 0.98, indicating nearly perfect synchronization between these two modes of communication. Calls and Internet usage also show a strong correlation of 0.92, reflecting the significant overlap in voice and data demands in such high-density areas. Similarly, SMS and Internet traffic have a robust correlation of 0.94, further underscoring the integrated usage patterns characteristic of city centers. In commercial regions (Fig. 8b), the CC values are slightly lower but still demonstrate notable interdependencies between traffic variables. The relationship between calls and SMS is strong, with an r-value of 0.94, indicating frequent concurrent usage in professional and retail environments. Calls and Internet usage exhibit a correlation of 0.84, suggesting a moderate dependence between voice and data services. SMS and Internet traffic show a correlation of 0.86, implying that messaging and data usage are somewhat less interdependent in commercial contexts.
In residential regions (Fig. 8c), the correlations among traffic variables are weaker compared to the above areas, reflecting distinct communication habits. Calls and SMS show a strong relationship, with an r-value of 0.94, indicating that traditional modes of communication dominate in these zones. However, calls and Internet usage have the weakest correlation at 0.81, while SMS and Internet traffic show an r-value of 0.87. These results suggest that residential users are more likely to use specific communication modes independently, such as relying on SMS or calls for personal communication and Internet traffic for separate, data-driven activities. In contrast, the business region demonstrates strong correlations across all traffic variables, highlighting the intensive and multitasking nature of communication in professional environments (Fig. 8d). Calls and SMS exhibit a correlation of 0.92, reflecting frequent usage of both services for workplace communication. Calls and Internet usage show a similarly high correlation, also at 0.92, indicating a close link between voice and data traffic. SMS and Internet traffic, while slightly less correlated, still display a strong relationship with an r-value of 0.88. These findings suggest that network activity in business areas is highly integrated, with all communication modes contributing significantly to overall traffic demand.
Overall, in examining the correlation of communication metrics across various urban grids, each designated with specific functional attributes, a nuanced understanding of the integration and utilization of communication technologies unfolds. The city center region displays uniformly high correlations, exemplifying a paradigm of high connectivity, where the convergence of Internet, SMS, and calls is most pronounced. In contrast, commercial regions exhibit moderate correlations, indicating a balanced distribution of traffic across services driven by diverse professional and retail activities. Residential regions show weaker correlations, suggesting segmented communication habits where traffic types are used more independently. Meanwhile, business regions demonstrate strong interdependencies, reflecting the multitasking and high connectivity demands typical of professional environments.
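The per-region correlation matrices of Fig. 8 can be computed and drawn along the following lines, assuming an hourly DataFrame per region with columns calls, sms, and internet (illustrative names):

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

def plot_region_heatmap(region_df: pd.DataFrame, title: str) -> None:
    """Compute the Pearson correlation matrix of Calls, SMS, and Internet
    traffic for one region and draw it as an annotated heatmap."""
    corr = region_df[["calls", "sms", "internet"]].corr(method="pearson")
    sns.heatmap(corr, annot=True, fmt=".2f", vmin=0, vmax=1, cmap="coolwarm")
    plt.title(title)
    plt.show()

# usage: plot_region_heatmap(city_center_hourly, "City center")
```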
Overview of prediction models
Various models were utilized to predict traffic patterns, spanning statistical, machine learning (ML), and deep learning (DL) approaches. The statistical model employed was the seasonal autoregressive integrated moving average (SARIMA), while the ML models included adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), and Facebook Prophet (FB Prophet). The DL models comprised the convolutional neural network (CNN) and long short-term memory (LSTM), along with integrated approaches such as the hybrid CNN-LSTM and the ensemble CNN+LSTM models.
The SARIMA model extends the ARIMA model for forecasting univariate time series data that exhibit seasonal characteristics, making it appropriate for data with consistent patterns over fixed intervals. SARIMA has two groups of hyperparameters: seasonal and non-seasonal. The seasonal parameters encompass P (order of the seasonal autoregressive part), D (degree of seasonal differencing), Q (order of the seasonal moving average part), and S (number of time steps per seasonal period). Meanwhile, the non-seasonal parameters encompass p (order of the autoregressive part), d (degree of differencing to make the series stationary), and q (order of the moving average part). Identifying the optimal SARIMA parameters entails iterating through multiple combinations and selecting the model with the lowest Akaike Information Criterion (AIC) [50], thereby balancing model fit and complexity.
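A minimal sketch of this AIC-based selection using statsmodels' SARIMAX, with a daily seasonal period of S = 24 for hourly data; the candidate ranges are illustrative assumptions, not the authors' exact search space:

```python
import itertools
import warnings
from statsmodels.tsa.statespace.sarimax import SARIMAX

def select_sarima(train, seasonal_period=24):
    """Fit candidate SARIMA(p,d,q)(P,D,Q,S) models and keep the lowest-AIC fit."""
    best_aic, best_fit, best_cfg = float("inf"), None, None
    pdq = list(itertools.product(range(0, 3), range(0, 2), range(0, 3)))
    seasonal = list(itertools.product(range(0, 2), range(0, 2), range(0, 2)))
    warnings.filterwarnings("ignore")  # silence convergence warnings during the search
    for (p, d, q) in pdq:
        for (P, D, Q) in seasonal:
            try:
                fit = SARIMAX(train, order=(p, d, q),
                              seasonal_order=(P, D, Q, seasonal_period)).fit(disp=False)
                if fit.aic < best_aic:
                    best_aic, best_fit, best_cfg = fit.aic, fit, (p, d, q, P, D, Q)
            except Exception:
                continue  # skip configurations that fail to converge
    return best_fit, best_cfg, best_aic
```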
Transitioning to the ML models, the AdaBoost model is a powerful ensemble learning method that combines weak learners into a strong predictor [51]. Optimizing its performance involves selecting parameters such as the number of estimators and the learning rate, typically achieved through grid search to balance complexity and accuracy. Similarly, the XGBoost model, a flexible gradient-boosting method [52], requires a careful selection of parameters such as the number of trees, learning rate, maximum depth, and colsample_bytree. In addition to these ML techniques, the FB Prophet model, a time series forecasting tool developed by Meta's research and development team, is effective for datasets with strong seasonal patterns and holiday effects. It decomposes time series data into trend, seasonality, and holiday components and can be configured as additive or multiplicative. Key parameters include the type of growth (linear or logistic), seasonality, holiday effects, and trend modeling with automatic changepoint selection. Seasonality is modeled using Fourier series, while holiday effects are modeled as independent shocks.
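For illustration, the boosting models can be tuned with scikit-learn's grid search over a small parameter grid; the lagged-feature construction and parameter ranges below are assumptions for the sketch, not the authors' exact setup (AdaBoost can be tuned analogously):

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from xgboost import XGBRegressor

def make_lagged(series, n_lags=24):
    """Build a supervised matrix mapping the previous n_lags hours to the next hour."""
    values = np.asarray(series, dtype=float)
    X = np.array([values[i:i + n_lags] for i in range(len(values) - n_lags)])
    y = values[n_lags:]
    return X, y

def tune_xgboost(train_series):
    X, y = make_lagged(train_series)
    grid = {
        "n_estimators": [100, 200],
        "max_depth": [5, 7],
        "learning_rate": [0.01, 0.1],
        "colsample_bytree": [0.7, 1.0],
    }
    search = GridSearchCV(XGBRegressor(objective="reg:squarederror"),
                          grid, cv=TimeSeriesSplit(n_splits=3),
                          scoring="neg_root_mean_squared_error")
    search.fit(X, y)
    return search.best_estimator_, search.best_params_
```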
Switching to the DL models, the CNN model effectively predicts time series, capturing local patterns and temporal dependencies [22]. A typical CNN architecture (see Fig. 9a) includes convolutional layers with specified filters and kernel sizes, pooling layers to reduce dimensionality, and dense layers for prediction. Key parameters include the number of filters, kernel size, pooling size, dense layer units, learning rate, and activation functions. The second employed DL model was the LSTM network, a recurrent neural network (RNN) that excels in time series prediction by processing sequences and retaining information over long periods. Important parameters include the number of layers and units, batch size, epochs, learning rate, and choice of optimizer. Regularization techniques like dropout prevent overfitting. Normalizing input data and using time-based cross-validation to choose optimal parameters is crucial. Setting up an LSTM involves iterative testing to find the optimal configuration for the data (see Fig. 9b). The LSTM networks have four essential components, known as gates, which manage the flow of information: the input gate, forget gate, output gate, and cell state update gate. The input gate, defined by Eq. (4), determines how much of the new input should be added to the cell state. The forget gate, given by Eq. (5), decides how much of the previous cell state should be forgotten. Similarly, the output gate, described in Eq. (6), determines the output of the current cell state to the next hidden state. The cell state update, represented by Eq. (7), creates a candidate cell state to be added to the actual cell state where tanh is a hyperbolic tangent activation function. These gates work in unison to maintain, update, and propagate information through the LSTM network, enabling it to capture long-term dependencies in sequential data [46].
$$i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \qquad (4)$$
$$f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \qquad (5)$$
$$o_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \qquad (6)$$
$$\tilde{c}_t = \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \qquad (7)$$
where $W_i$, $U_i$, and $b_i$ are the weight matrix for the input, the weight matrix for the previous hidden state, and the bias of the input gate, respectively; $\sigma$ represents the sigmoid activation function, and the parameters of the other gates are defined analogously according to their type.
[See PDF for image]
Fig. 9
Architectures of the employed deep learning models
The hybrid CNN-LSTM model excels in time series prediction by combining CNN layers for feature extraction and LSTM layers for sequence modeling [53]. It leverages the strengths of both CNN and LSTM networks, with the CNN capturing patterns and the LSTM capturing temporal dependencies, making it suitable for complex traffic prediction tasks. In the hybrid CNN-LSTM, the output of the CNN is utilized as the input for the LSTM (see Fig. 9c), so the LSTM receives features that the CNN layers have already extracted from the input data. The setup involves data normalization and segmentation into sequence windows suitable for CNN and LSTM processing. The architecture typically includes CNN layers, reshaping steps for the LSTM layers, and possibly dense layers for specific output tasks. Parameter tuning involves selecting layer counts, convolutional filter sizes, LSTM units, and other hyperparameters. Additionally, an ensemble CNN+LSTM architecture can leverage both CNN and LSTM models in parallel [54], with each processing the input data separately (see Fig. 9d). An ensemble of CNN and LSTM models combines the predictive power of multiple models, enhancing accuracy and robustness. This model employs a parallel architecture in which the CNN and LSTM operate independently on the input data; their outputs are concatenated and passed to a fully connected layer. This ensemble approach can harness the strengths of both architectures and potentially improve performance by blending different types of feature extraction and temporal analysis.
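A Keras functional-API sketch of such a parallel CNN+LSTM ensemble, with hyperparameter values in the spirit of Table 5; the window length, filter counts, and unit sizes are illustrative assumptions, not the authors' exact configuration:

```python
from tensorflow.keras import Input, Model, layers

def build_ensemble_cnn_lstm(window: int = 24, n_features: int = 1) -> Model:
    """Parallel CNN and LSTM branches whose outputs are concatenated
    and passed through a fully connected layer for one-step prediction."""
    inputs = Input(shape=(window, n_features))

    # CNN branch: local pattern extraction
    cnn = layers.Conv1D(filters=64, kernel_size=7, padding="same", activation="relu")(inputs)
    cnn = layers.MaxPooling1D(pool_size=2)(cnn)
    cnn = layers.Flatten()(cnn)

    # LSTM branch: long-term temporal dependencies
    lstm = layers.LSTM(100)(inputs)

    # Concatenate both branches and map to a single-step forecast
    merged = layers.concatenate([cnn, lstm])
    merged = layers.Dense(100, activation="relu")(merged)
    output = layers.Dense(1)(merged)

    model = Model(inputs, output)
    model.compile(optimizer="adam", loss="mse")
    return model
```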
Assessment of traffic prediction models
Assessing model performance is crucial for ensuring the scientific reliability and practical applicability of predictive outcomes [55]. In the current research, the evaluation of predictive models is conducted through two primary methods: visual assessments and quantitative metrics. Visual assessments are graphical representations that offer clear insights into model performance, frequently uncovering patterns, trends, or anomalies that numerical measures may overlook; violin plots are particularly effective in visualizing data distribution, as they combine aspects of both box plots and kernel density plots. Quantitative metrics are numerical evaluation measures that provide an objective assessment of a model's predictive accuracy, facilitating direct comparisons across different models. Together, these visual and quantitative methods offer complementary insights, ensuring a robust and multi-dimensional evaluation of model performance.
Regression metrics
This study employs three regression metrics: the mean absolute error (MAE), root mean squared error (RMSE), and coefficient of determination (R2). Table 4 presents the formal equations for these metrics, including their ideal values. R2 quantifies the proportion of variance in the target variable accounted for by the independent variables. Its range is from 0.0 to 1.0, with values approaching 1.0 signifying a superior model fit; high R2 values indicate that the model effectively accounts for the majority of the variability in the target data. RMSE quantifies the average magnitude of prediction errors. This metric exhibits heightened sensitivity to outliers because the errors are squared prior to averaging, rendering it useful in situations where large errors are particularly unwelcome. Finally, the MAE represents the average absolute difference between predicted and actual values.
Table 4. Employed performance metrics that assessed the adopted models
| Regression metric | Equation | Ideal value |
|---|---|---|
| Mean absolute error | $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\lvert y_i - \hat{y}_i\rvert$ | 0 |
| Root mean squared error | $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}$ | 0 |
| Determination coefficient | $R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^{2}}$ | 1 |

where $y_i$ and $\hat{y}_i$ are the actual and predicted $i$th values, respectively, and $\bar{y}$ is the mean of the actual values
Uncertainty evaluation
Uncertainty analysis is an essential method for assessing and quantifying the uncertainty related to the predictive accuracy of models. Systematic quantification of uncertainties enhances the interpretation of model outputs and guides decision-making according to the confidence level of predictions. In this research, the uncertainty measure U95 is employed, calculated to represent the 95% confidence interval for model predictions. The formula is presented as follows [56]:
$$U_{95} = 1.96\sqrt{\mathrm{SD}^{2} + \mathrm{RMSE}^{2}} \qquad (8)$$
In this context, 1.96 is the z-value corresponding to a 95% confidence level in a standard normal distribution. This formulation allows for a clear quantification of uncertainty by integrating the variability in prediction errors (SD) with the RMSE. Such measures are essential for evaluating model reliability, particularly in applications where high prediction accuracy is critical.
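As a reference implementation, the regression metrics of Table 4 and the U95 measure can be computed as follows; the U95 expression follows the form reconstructed in Eq. (8), and the function name is illustrative:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def evaluate(y_true, y_pred):
    """Return MAE, RMSE, R^2, and the 95% uncertainty measure U95."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    mae = mean_absolute_error(y_true, y_pred)
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    r2 = r2_score(y_true, y_pred)
    sd = np.std(y_true - y_pred)              # variability of prediction errors
    u95 = 1.96 * np.sqrt(sd**2 + rmse**2)     # 95% confidence band, Eq. (8)
    return {"MAE": mae, "RMSE": rmse, "R2": r2, "U95": u95}
```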
Results and discussion
Hyperparametric configuration of models
Hyperparameter tuning is essential for optimizing the performance of machine learning models. Commonly employed techniques for this purpose include Grid Search (GS), Random Search (RS), and Bayesian Optimization (BO) [57]. The GS technique was utilized in training statistical and machine learning models in the present study. After an extensive grid search, the critical tuning parameters for the models have been determined. The GS technique methodically assesses every possible hyperparameter combination within a specified range, rendering it a comprehensive and efficient approach to developing models with acceptable computational requirements. This systematic approach ensures that the best-performing combinations of hyperparameters are identified for each model.
A common challenge in training deep learning models, especially CNNs and LSTMs, is overfitting, where the model learns noise or undesirable patterns from the training data, thereby limiting its ability to generalize effectively to new, unseen data. To resolve this, various regularization techniques were utilized. Dropout was employed to randomly deactivate a proportion of neurons during training, preventing the model from over-relying on specific pathways and promoting a more robust learning process. Furthermore, batch normalization was employed to normalize the inputs to each layer, thereby stabilizing and accelerating the training process while enhancing the model's generalization abilities. Finally, early stopping was implemented to monitor performance on a validation set and terminate the training process when improvements plateaued, reducing the risk of overfitting from prolonged training. These techniques collectively assisted in reducing overfitting and improving model performance.
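A brief Keras sketch showing how dropout, batch normalization, and early stopping could be attached to an LSTM model; the layer sizes, dropout rate, and patience value are illustrative, not the authors' exact settings:

```python
from tensorflow.keras import Sequential, layers, callbacks

def build_regularized_lstm(window: int = 24, n_features: int = 1):
    """LSTM with dropout and batch normalization to curb overfitting."""
    model = Sequential([
        layers.Input(shape=(window, n_features)),
        layers.LSTM(100),
        layers.BatchNormalization(),
        layers.Dropout(0.2),
        layers.Dense(100, activation="relu"),
        layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Stop training once the validation loss stops improving
early_stop = callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                     restore_best_weights=True)
# usage: model.fit(X_train, y_train, validation_split=0.1,
#                  epochs=100, batch_size=32, callbacks=[early_stop])
```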
Random search was employed for the deep learning models instead of GS because of its efficiency and practicality. GS systematically assesses every potential combination of hyperparameters, which becomes computationally intensive and time-consuming as the number of parameters grows. Conversely, RS samples hyperparameter combinations randomly, allowing more efficient exploration of a wider search space; this approach often produces good results with fewer evaluations, making it a more practical option for deep learning model optimization. The CNN model parameters include the number of filters, kernel size, pooling size, dense layer units, optimizer type, learning rate, number of epochs, batch size, activation functions, and output layer configuration. The LSTM model parameters include the number of LSTM layers and units, fully connected layer units, dropout rate, optimizer type, learning rate, batch size, and number of epochs. The ensemble and hybrid models, such as the CNN-LSTM combinations, take parameters similar to those of the CNN and LSTM models; in the ensemble CNN+LSTM, a fully connected layer with 100 units combines the outputs of the two branches. Table 5 presents the optimal hyperparameters identified through the tuning process for the prediction models.
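The sketch below illustrates the random-search idea with a plain sampling loop over an assumed LSTM search space; `build_and_score` is a hypothetical helper that would train a model with the sampled configuration and return its validation RMSE.

```python
import random

search_space = {
    "lstm_units": [50, 100, 150],
    "dropout": [0.1, 0.2, 0.3],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [16, 32, 64],
}

best_rmse, best_config = float("inf"), None
for _ in range(20):                                   # 20 random trials instead of a full grid
    config = {name: random.choice(values) for name, values in search_space.items()}
    rmse = build_and_score(config)                    # hypothetical: train and validate a model
    if rmse < best_rmse:
        best_rmse, best_config = rmse, config
print(best_config, best_rmse)
```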
Table 5. Optimal hyperparameters used in the development of the prediction models
Model | Hyperparameters | Values | Model | Hyperparameters | Values |
|---|---|---|---|---|---|
SARIMA | p, d, q | 1, 0, 2 | XGBoost | Estimators | 100 |
| P, D, Q | 1, 0, 1 | | Max-depth | 7 |
| S | 24 | | Alpha | 10 |
| | | | Col_sample by tree | 0.7 |
AdaBoost | Estimators | 50 | FB Prophet | Changepoint range | 0.5 |
| Max-depth | 7 | | Changepoint prior scale | 0.01 |
| Learning rate | 0.001 | | n_changepoints | 50 |
| | | | Weekly seasonality | 100 |
| | | | Daily seasonality | 100 |
| | | | Seasonality mode | Additive |
CNN | Filters | 64 | LSTM | LSTM layer | 100 units |
| Kernel size | 7 | | Dense layer | 100 units |
| Pool size | 2 | | Dropout | 0.2 |
| Dense layer | 50 units | | Optimizer | Adam |
| Optimizer | Adam | | Learning rate | 0.001 |
| Learning rate | 0.001 | | Epochs | 100 |
| Epochs | 100 | | Batch size | 32 |
| Batch size | 32 | | Activation | ReLU |
| Activation | ReLU | | Output layer | 1 unit |
| Output layer | 1 unit | | | |
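To make the ensemble CNN+LSTM architecture concrete, the following is a hedged Keras sketch that wires a CNN branch and an LSTM branch, concatenates their outputs, and merges them through a 100-unit fully connected layer, using the Table 5 hyperparameters; the framework, the 24-step input window, and the exact wiring are assumptions rather than the authors' released code.

```python
from tensorflow.keras import layers, models, optimizers

inputs = layers.Input(shape=(24, 1))                   # 24 hourly steps, one feature (assumed)

# CNN branch (filters=64, kernel=7, pool=2, dense=50 units, per Table 5)
c = layers.Conv1D(64, kernel_size=7, activation="relu", padding="same")(inputs)
c = layers.MaxPooling1D(pool_size=2)(c)
c = layers.Flatten()(c)
c = layers.Dense(50, activation="relu")(c)
cnn_out = layers.Dense(1)(c)                           # CNN branch output

# LSTM branch (100 units, dense=100 units, dropout=0.2, per Table 5)
l = layers.LSTM(100)(inputs)
l = layers.Dropout(0.2)(l)
l = layers.Dense(100, activation="relu")(l)
lstm_out = layers.Dense(1)(l)                          # LSTM branch output

# Combine the two branch outputs through a 100-unit fully connected layer
merged = layers.concatenate([cnn_out, lstm_out])
merged = layers.Dense(100, activation="relu")(merged)
output = layers.Dense(1)(merged)

model = models.Model(inputs, output)
model.compile(optimizer=optimizers.Adam(learning_rate=0.001), loss="mse")
```

The hybrid CNN-LSTM, by contrast, would typically feed the CNN feature maps into the LSTM in a single sequential pipeline, which is the usual construction for such hybrids.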
Assessment of prediction models via the original dataset
Violin boxplots
Figures 10, 11, 12 present violin plots illustrating the distribution of the Calls, Internet, and SMS records for various regions for each developed model. Each plot includes a marker representing the median value, providing an intuitive reference for the central tendency of the data. In addition to the median, the plots convey the data distribution’s shape, spread, and skewness, offering a comprehensive view of the model’s prediction behavior. The violin plots in Fig. 10 compare the predicted distributions of Call records for each model and the actual distribution. The analysis of call records across different geographical areas reveals distinct patterns in model performance. The city center (Fig. 10a), with the highest median call records (2244), exhibits heavy call traffic and requires highly accurate models. Ensemble CNN+LSTM and CNN stand out in this area, closely aligning with the actual distribution, while FB Prophet and XGBoost significantly underestimate the median. Hybrid CNN-LSTM also performs well, mirroring the actual data’s spread. These findings underscore the city center’s complexity and demand for robust predictive models. In the commercial region (Fig. 10b), the actual median of 96.38 indicates moderately high call traffic. Ensemble CNN+LSTM and hybrid CNN-LSTM excel, closely predicting the actual median and accurately capturing the data’s distribution. However, models like SARIMA and XGBoost falter, underestimating the central tendency. Compared to the city center, the commercial region has a lower median and a wider spread, reflecting less intense but still variable call traffic.
[See PDF for image]
Fig. 10
Violin plots of developed models for call records in a city center, b commercial, c residential, and d business regions
[See PDF for image]
Fig. 11
Violin plots of developed models for Internet records in a city center, b commercial, c residential, and d business regions
[See PDF for image]
Fig. 12
Violin plots of developed models for SMS records in a city center, b commercial, c residential, and d business regions
The residential region (Fig. 10c), with a median of 47.33, reflects moderate and more uniform call behavior. Ensemble CNN+LSTM, Hybrid CNN-LSTM, and LSTM demonstrate strong predictive performance, closely aligning with the actual data. CNN and XGBoost models, however, underestimate the median, revealing limitations in this area. The narrower interquartile range in residential areas suggests steadier communication patterns compared to the variability seen in urban regions like city centers and commercial regions. The business region records the lowest median call volume (25.00) and exhibits the most predictable call traffic (Fig. 10d). Most models perform exceptionally well here, with ensemble CNN+LSTM, hybrid CNN-LSTM, and LSTM providing accurate predictions. Interestingly, unlike in higher-traffic regions, FB Prophet also delivers reliable results in the business region. SARIMA and CNN show slight underestimation but are still reasonably close to the actual median. The narrower spread in this area reflects the uniformity of business communication trends. Overall, ensemble CNN+LSTM, hybrid CNN-LSTM, and LSTM models consistently provide the most reliable predictions across all regions, demonstrating their versatility and robustness. SARIMA and XGBoost, however, struggle in high-traffic regions like the city center and commercial regions but perform relatively better in lower-traffic residential and business regions.
Figure 11 illustrates the analysis of Internet records predictions, highlighting distinct differences in model performance based on regional characteristics. In the city center region (Fig. 11a), the highest median of 2244 Internet records highlights the region's intense traffic and high variability, making it the most challenging area to predict. Ensemble CNN+LSTM, LSTM, and CNN demonstrate strong performance, closely aligning with the actual data, while the SARIMA and AdaBoost models slightly underestimate it. FB Prophet and XGBoost show significant underestimations, underscoring their limitations in high-traffic environments. The city center's wide variability amplifies the need for robust and adaptable models. The commercial region (Fig. 11b), with a median of 491, presents a moderate traffic level and more consistent distributions compared to the city center. FB Prophet provides the closest predictions, followed by SARIMA and ensemble CNN+LSTM, with only minor deviations. While AdaBoost overestimates slightly, CNN and LSTM tend to underestimate. The narrower interquartile ranges in the commercial region highlight its steadier traffic patterns, making it less complex to predict than the city center but more variable than the residential or business regions.
In the residential region (Fig. 11c), with a median of 495, Internet records reflect moderate and uniform traffic. Ensemble CNN+LSTM and AdaBoost provide the most accurate predictions, while SARIMA significantly overestimates, failing to align with the actual data. LSTM and hybrid CNN-LSTM deviate slightly but remain close to the actual median. The residential region's predictable patterns make modeling easier, with ensemble CNN+LSTM consistently capturing its distribution. The business region (Fig. 11d), with the lowest median of 94, demonstrates minimal traffic and low variability, making it the simplest region to predict. Most models, including FB Prophet, AdaBoost, and XGBoost, deliver accurate results that are close to the actual median. Hybrid CNN-LSTM and ensemble CNN+LSTM maintain robust performance, while SARIMA and LSTM slightly underestimate but remain within reasonable bounds. The business area's narrow interquartile ranges reflect its stable and predictable traffic. Across all regions, the city center poses the greatest challenge due to its high median and wide variability, while the residential and business regions are more predictable with narrower spreads. Ensemble CNN+LSTM consistently emerges as the most reliable model, performing well across all regions. Hybrid CNN-LSTM also demonstrates robustness, whereas SARIMA struggles with overestimation in the residential region and underestimation in high-traffic regions like the city center. FB Prophet and XGBoost show limitations in handling variability but perform better in less variable regions like the business and commercial regions.
The performance of predictive models for SMS records varies significantly across the regions studied, influenced by differences in traffic intensity and variability (see Fig. 12). The city center region (Fig. 12a), with the highest median SMS records (2401) and the greatest variability, poses the most challenging prediction environment. Ensemble CNN+LSTM, hybrid CNN-LSTM, and LSTM consistently align closely with the actual median, demonstrating robust performance. Models like SARIMA and AdaBoost slightly underestimate the median, while FB Prophet and XGBoost significantly underperform due to large underestimations. In the commercial region (Fig. 12b), with a median SMS traffic of 65.15, most models perform effectively. FB Prophet, XGBoost, and ensemble CNN+LSTM align closely with the actual median, while hybrid CNN-LSTM and LSTM show minor deviations. SARIMA and AdaBoost slightly overestimate, and CNN slightly underestimates the median. Compared to the city center, the commercial region has narrower interquartile ranges, reflecting less variability, which allows for more consistent predictions by most models. The residential region (Fig. 12c), with a median of 41.07, reflects moderate and consistent SMS traffic. Predictions by SARIMA, Hybrid CNN-LSTM, and AdaBoost align well with the actual median, while ensemble CNN+LSTM also performs strongly. CNN and XGBoost models exhibit slight overestimation and underestimation, respectively, but remain within acceptable ranges. The narrow interquartile ranges of the residential region reflect uniform traffic patterns, making it easier for models to predict accurately compared to the more variable city center and commercial regions.
The business region (Fig. 12d), with the lowest median SMS records (16.20), demonstrates minimal traffic and the least variability across all areas. Predictions by FB Prophet, SARIMA, and AdaBoost closely align with the actual median, while hybrid CNN-LSTM and ensemble CNN+LSTM also perform accurately. CNN and LSTM slightly underestimate, and XGBoost slightly overestimates the median. The business region’s low variability and narrow interquartile ranges make it the simplest region for models to predict accurately. Among all regions, Ensemble CNN+LSTM consistently emerges as the most reliable model, accurately capturing medians and interquartile ranges in regions with varying levels of traffic and variability. Hybrid CNN-LSTM and LSTM also perform robustly, while SARIMA and FB Prophet demonstrate improved accuracy in regions with lower variability, such as residential and business. These results underscore the need for tailoring prediction models to unique characteristics of each region, particularly in regions like the city center, where high variability poses significant challenges.
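For readers reproducing this style of distribution comparison, violin plots like those in Figs. 10, 11, 12 can be generated with seaborn from a long-format table of actual and predicted records; the toy DataFrame below is illustrative only and does not contain study data.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Toy long-format frame: one row per hourly record, stacking the actual series
# and each model's predictions (the values here are dummies, not study data).
df = pd.DataFrame({
    "model": ["Actual"] * 3 + ["LSTM"] * 3 + ["Ensemble CNN+LSTM"] * 3,
    "calls": [2200, 2300, 2250, 2150, 2320, 2240, 2210, 2290, 2260],
})
ax = sns.violinplot(data=df, x="model", y="calls", inner="quartile")
ax.set_title("Call records - city center (illustrative)")
plt.tight_layout()
plt.show()
```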
Statistical analysis
The models were evaluated through statistical analysis based on their accuracy and computation time. Tables 6, 7, 8, 9 provide detailed performance metrics of the prediction models for Internet, SMS, and call records in the city center, commercial, residential, and business regions, respectively. The comparison of the developed models is conducted from two perspectives: accuracy and computational (training) time. The bold values in the tables represent the most effective predictive models in terms of either accuracy or computational efficiency for each traffic type.
Table 6. Results of CDRs of prediction models in the city center region
Traffic type | Prediction model | MAE | RMSE | R2 | Training time (seconds) |
|---|---|---|---|---|---|
Internet | FB prophet | 1121.71 | 1640.22 | 0.962 | 13.72 |
SARIMA | 1579.89 | 2753.94 | 0.894 | 10.56 | |
AdaBoost | 1026.31 | 1561.39 | 0.966 | 0.02 | |
XGBoost | 1104.09 | 1729.18 | 0.958 | 0.05 | |
CNN | 656.62 | 980.96 | 0.987 | 21.24 | |
LSTM | 647.04 | 994.02 | 0.986 | 84.42 | |
Hybrid CNN-LSTM | 668.05 | 930.21 | 0.988 | 170.9 | |
Ensemble CNN+LSTM | 577.05 | 863.83 | 0.990 | 85.11 | |
SMS | FB prophet | 253.98 | 356.70 | 0.910 | 11.00 |
SARIMA | 261.9 | 434.07 | 0.867 | 4.93 | |
AdaBoost | 185.54 | 277.93 | 0.945 | 0.05 | |
XGBoost | 257.56 | 401.63 | 0.886 | 0.13 | |
CNN | 147.12 | 238.04 | 0.960 | 21.49 | |
LSTM | 115.89 | 195.58 | 0.973 | 81.54 | |
Hybrid CNN-LSTM | 117.87 | 184.85 | 0.976 | 144.29 | |
Ensemble CNN+LSTM | 123.66 | 194.69 | 0.973 | 84.94 | |
Calls | FB prophet | 297.16 | 396.63 | 0.876 | 12.42 |
SARIMA | 215.36 | 359.16 | 0.898 | 6.62 | |
AdaBoost | 174.33 | 262.51 | 0.946 | 0.13 | |
XGBoost | 296.24 | 468.04 | 0.827 | 0.04 | |
CNN | 91.82 | 141.60 | 0.984 | 10.82 | |
LSTM | 91.32 | 141.32 | 0.984 | 83.67 | |
Hybrid CNN-LSTM | 84.09 | 132.01 | 0.986 | 101.90 | |
Ensemble CNN+LSTM | 83.63 | 127.26 | 0.987 | 85.53 |
Table 7. Results of CDRs of prediction models in the commercial region
Traffic type | Prediction model | MAE | RMSE | R2 | Training time (seconds) |
|---|---|---|---|---|---|
Internet | FB prophet | 41.59 | 56.26 | 0.894 | 10.25 |
SARIMA | 46.73 | 65.16 | 0.858 | 8.24 | |
AdaBoost | 40.66 | 59.93 | 0.880 | 0.31 | |
XGBoost | 39.67 | 53.79 | 0.903 | 0.03 | |
CNN | 37.51 | 50.71 | 0.914 | 21.28 | |
LSTM | 33.21 | 45.83 | 0.930 | 83.72 | |
Hybrid CNN-LSTM | 32.85 | 43.70 | 0.936 | 144.95 | |
Ensemble CNN+LSTM | 31.41 | 41.98 | 0.941 | 84.47 | |
SMS | FB prophet | 6.94 | 11.62 | 0.873 | 12.88 |
SARIMA | 11.68 | 17.57 | 0.710 | 5.52 | |
AdaBoost | 7.16 | 11.68 | 0.872 | 0.05 | |
XGBoost | 6.92 | 11.62 | 0.873 | 0.03 | |
CNN | 8.59 | 14.35 | 0.806 | 21.29 | |
LSTM | 6.79 | 11.33 | 0.879 | 85.11 | |
Hybrid CNN-LSTM | 6.53 | 10.79 | 0.89 | 155.37 | |
Ensemble CNN+LSTM | 6.84 | 10.61 | 0.894 | 85.90 | |
Calls | FB prophet | 9.64 | 15.35 | 0.897 | 12.51 |
SARIMA | 18.98 | 27.46 | 0.669 | 9.33 | |
AdaBoost | 9.85 | 15.95 | 0.888 | 0.13 | |
XGBoost | 10.26 | 17.28 | 0.869 | 0.45 | |
CNN | 8.98 | 14.46 | 0.908 | 16.03 | |
LSTM | 8.06 | 12.46 | 0.932 | 83.60 | |
Hybrid CNN-LSTM | 7.19 | 12.02 | 0.937 | 143.95 | |
Ensemble CNN+LSTM | 7.58 | 11.24 | 0.945 | 84.46 |
Table 8. Results of CDRs of prediction models in the residential region
Traffic type | Prediction model | MAE | RMSE | R2 | Training time (seconds) |
|---|---|---|---|---|---|
Internet | FB prophet | 73.49 | 110.75 | 0.883 | 6.15 |
SARIMA | 136.15 | 214.78 | 0.561 | 7.18 | |
AdaBoost | 54.27 | 91.78 | 0.920 | 0.23 | |
XGBoost | 70.57 | 126.12 | 0.849 | 0.03 | |
CNN | 47.59 | 72.95 | 0.949 | 21.33 | |
LSTM | 55.58 | 80.02 | 0.939 | 59.65 | |
Hybrid CNN-LSTM | 41.68 | 65.27 | 0.959 | 104.84 | |
Ensemble CNN+LSTM | 44.33 | 72.45 | 0.95 | 85.03 | |
SMS | FB prophet | 5.05 | 8.02 | 0.862 | 9.04 |
SARIMA | 4.96 | 8.10 | 0.86 | 9.49 | |
AdaBoost | 5.37 | 8.21 | 0.856 | 0.04 | |
XGBoost | 5.10 | 8.01 | 0.863 | 0.05 | |
CNN | 6.36 | 10.14 | 0.78 | 21.28 | |
LSTM | 5.84 | 8.51 | 0.845 | 83.70 | |
Hybrid CNN-LSTM | 5.46 | 8.52 | 0.844 | 144.23 | |
Ensemble CNN+LSTM | 5.08 | 8.17 | 0.857 | 85.60 | |
Calls | FB prophet | 4.81 | 8.08 | 0.89 | 13.99 |
SARIMA | 4.95 | 7.94 | 0.894 | 8.12 | |
AdaBoost | 4.75 | 7.86 | 0.896 | 0.13 | |
XGBoost | 4.79 | 8.07 | 0.891 | 0.04 | |
CNN | 5.90 | 9.36 | 0.853 | 14.60 | |
LSTM | 5.03 | 7.64 | 0.902 | 83.67 | |
Hybrid CNN-LSTM | 4.91 | 7.47 | 0.906 | 144.06 | |
Ensemble CNN+LSTM | 4.85 | 7.47 | 0.906 | 86.20 |
Table 9. Results of CDRs of prediction models in the business region
Traffic type | Prediction model | MAE | RMSE | R2 | Training time (seconds) |
|---|---|---|---|---|---|
Internet | FB prophet | 7.78 | 10.47 | 0.906 | 7.53 |
SARIMA | 12.61 | 18.48 | 0.706 | 7.66 | |
AdaBoost | 8.78 | 11.8 | 0.88 | 0.25 | |
XGBoost | 7.83 | 10.24 | 0.91 | 0.02 | |
CNN | 9.18 | 12.72 | 0.861 | 21.36 | |
LSTM | 7.55 | 10.22 | 0.91 | 84.34 | |
Hybrid CNN-LSTM | 7.50 | 9.58 | 0.921 | 143.91 | |
Ensemble CNN+LSTM | 7.22 | 10.1 | 0.912 | 86.43 | |
SMS | FB prophet | 1.97 | 3.05 | 0.831 | 13.24 |
SARIMA | 2.61 | 4.07 | 0.700 | 6.97 | |
AdaBoost | 2.5 | 3.47 | 0.782 | 0.06 | |
XGBoost | 2.86 | 3.78 | 0.741 | 0.02 | |
CNN | 2.35 | 3.48 | 0.781 | 14.72 | |
LSTM | 2.23 | 3.10 | 0.826 | 84.04 | |
Hybrid CNN-LSTM | 1.94 | 3.01 | 0.836 | 145.05 | |
Ensemble CNN+LSTM | 1.87 | 2.94 | 0.843 | 85.47 | |
Calls | FB prophet | 1.61 | 2.33 | 0.964 | 13.2 |
SARIMA | 3.49 | 5.58 | 0.794 | 5.16 | |
AdaBoost | 1.88 | 2.86 | 0.946 | 0.14 | |
XGBoost | 1.51 | 2.38 | 0.962 | 0.02 | |
CNN | 1.51 | 2.28 | 0.966 | 15.07 | |
LSTM | 1.53 | 2.29 | 0.965 | 83.74 | |
Hybrid CNN-LSTM | 1.43 | 2.24 | 0.968 | 143.93 | |
Ensemble CNN+LSTM | 1.48 | 2.23 | 0.967 | 84.47 |
In the city center region (Table 6), the predictive models were evaluated based on their accuracy and computational efficiency. From an accuracy perspective, the models were ranked by their average RMSE across all traffic types as follows: ensemble CNN+LSTM, hybrid CNN-LSTM, LSTM, CNN, AdaBoost, FB Prophet, XGBoost, and SARIMA. SARIMA was the least accurate, while the ensemble CNN+LSTM demonstrated a significant 68.65% improvement in RMSE and a 9.60% improvement in R2-value compared to SARIMA. Similarly, the hybrid CNN-LSTM showed a 66.22% enhancement in RMSE and a 9.40% improvement in R2-value over SARIMA. From a computational time standpoint, the models were ranked from fastest to slowest: AdaBoost, XGBoost, SARIMA, FB Prophet, CNN, LSTM, ensemble CNN+LSTM, and hybrid CNN-LSTM. The AdaBoost model achieved a 35.40% improvement in RMSE and a 7.47% increase in R2-value compared to SARIMA, although it remained less accurate than the ensemble and hybrid models. The second-fastest model, XGBoost, demonstrated a 4.81% enhancement in RMSE and a 0.49% increase in R2-value, a more modest gain over the SARIMA model. The selection of models is therefore contingent upon the particular demands for precision relative to computational time in the given application.
For the commercial region (Table 7), model performance rankings differ slightly from the city center. In terms of accuracy based on the RMSE metric, the models were ranked as ensemble CNN+LSTM, hybrid CNN-LSTM, LSTM, CNN, XGBoost, AdaBoost, FB Prophet, and SARIMA, with SARIMA again being the least accurate. The ensemble CNN+LSTM model achieved a 44.75% reduction in RMSE and a 25.61% improvement in R2-value over SARIMA, while the hybrid CNN-LSTM provided a 42.58% improvement in RMSE and a 24.83% increase in R2-value. Computationally, the models were ranked from fastest to slowest as follows: XGBoost, AdaBoost, SARIMA, FB Prophet, CNN, LSTM, ensemble CNN+LSTM, and hybrid CNN-LSTM. XGBoost and AdaBoost offered faster computation times, with XGBoost achieving a 29.46% improvement in RMSE and a 19.37% increase in R2-value and AdaBoost showing a 27.82% reduction in RMSE and a 19.37% improvement in R2-value over SARIMA. Despite their slower processing times, the ensemble CNN+LSTM and hybrid CNN-LSTM models consistently delivered the highest accuracy.
In the residential region (Table 8), model rankings revealed a different trend in accuracy compared to the city center and commercial areas. The models were ranked as hybrid CNN-LSTM, ensemble CNN+LSTM, CNN, LSTM, AdaBoost, XGBoost, FB Prophet, and SARIMA. The hybrid CNN-LSTM achieved the best performance with an average RMSE of 27.09 and R2 of 0.983, reflecting a 23.45% improvement in RMSE and a 23.48% improvement in R2-value compared to SARIMA. The ensemble CNN+LSTM followed closely with a 23.77% improvement in RMSE and a 23.44% increase in R2-value over SARIMA. Regarding computational efficiency, the models were ranked as AdaBoost, XGBoost, SARIMA, FB Prophet, CNN, LSTM, ensemble CNN+LSTM, and hybrid CNN-LSTM. AdaBoost provided a significant 57.2% reduction in RMSE and a 6.4% increase in R2-value, making it the fastest and most efficient for real-time applications, while hybrid CNN-LSTM remained the most accurate but computationally demanding.
For the traffic records within the business region (Table 9), the models ranked by accuracy were hybrid CNN-LSTM, ensemble CNN+LSTM, LSTM, FB Prophet, XGBoost, AdaBoost, CNN, and SARIMA. The hybrid CNN-LSTM model achieved the highest accuracy with an average RMSE of 4.94, showing a 47.28% improvement in RMSE compared to SARIMA. The ensemble CNN+LSTM followed with an RMSE of 5.09, reflecting a 45.72% improvement over SARIMA. From a computational efficiency perspective, the models were ranked as XGBoost, AdaBoost, SARIMA, FB Prophet, CNN, LSTM, ensemble CNN+LSTM, and hybrid CNN-LSTM. XGBoost demonstrated a 41.70% reduction in RMSE and a 23.06% increase in R2-value, making it the fastest and most effective for real-time scenarios. AdaBoost also performed well, with a 35.55% improvement in RMSE and a 12.34% increase in R2-value, but it lacked the accuracy of the hybrid CNN-LSTM and ensemble CNN+LSTM models.
Across all regions and traffic types, the ensemble CNN+LSTM and hybrid CNN-LSTM models consistently provided the highest accuracy, making them the most reliable options for precise traffic prediction. However, their higher computational costs limit their suitability for real-time applications. In contrast, AdaBoost and XGBoost emerged as faster alternatives, offering reasonable accuracy with significantly reduced computation times, making them ideal for scenarios where real-time predictions are prioritized. The choice of the best model ultimately depends on the trade-off between accuracy and computational efficiency required for specific applications.
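The accuracy ranking described above follows from averaging each model's RMSE over the three traffic types; the pandas sketch below illustrates the procedure with three of the city-center models from Table 6 (the DataFrame layout itself is an assumption).

```python
import pandas as pd

results = pd.DataFrame({
    "model": ["SARIMA", "AdaBoost", "Ensemble CNN+LSTM"],
    "rmse_internet": [2753.94, 1561.39, 863.83],   # RMSE values from Table 6
    "rmse_sms": [434.07, 277.93, 194.69],
    "rmse_calls": [359.16, 262.51, 127.26],
})
results["avg_rmse"] = results[["rmse_internet", "rmse_sms", "rmse_calls"]].mean(axis=1)
print(results.sort_values("avg_rmse")[["model", "avg_rmse"]])  # most accurate first
```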
Uncertainty analysis
Figure 13 shows the performance of the adopted models based on the U95 value for uncertainty analysis across the four regions studied. The U95 value represents the upper 95% confidence interval for prediction uncertainty, providing insights into the stability and reliability of the models. For the Internet records (Fig. 13a), the hybrid CNN-LSTM and ensemble CNN+LSTM models consistently achieve the lowest U95 values across all areas, indicating their superior reliability in handling uncertainty. These models perform particularly well in high-demand regions such as the city center and commercial, where reliability is crucial due to heavy Internet usage. SARIMA and AdaBoost show the least reliability, with high U95 values across all regions, making them unsuitable for accurate Internet traffic predictions. FB Prophet and XGBoost demonstrate reasonable stability, particularly in the residential and city center regions, but their U95 values are higher than those of the hybrid and ensemble models. CNN and LSTM perform well but show slightly higher uncertainty in low-traffic zones, such as in the business region.
[See PDF for image]
Fig. 13
Performance of the prediction models based on the U95 metric for a Internet, b SMS, and c Call records
For the SMS records (Fig. 13b), ensemble CNN+LSTM emerges as the most stable model across all regions, closely followed by the XGBoost model. It is also observed that the uncertainty linked to SMS traffic predictions is greater than that of Internet traffic, as evidenced by the persistently high U95 values across all regions for SMS records. The city center shows the highest stability for most models, followed by the residential, commercial, and business regions. In contrast, SARIMA and FB Prophet exhibit the highest U95 values, signifying poor predictive reliability and greater uncertainty, particularly in the commercial and business regions. The AdaBoost and hybrid CNN-LSTM models show moderate U95 values. Notably, the performance of CNN and LSTM falls between the advanced hybrid models and the simpler statistical models, making them reliable yet slightly less robust alternatives. For the call records (Fig. 13c), unlike Internet and SMS traffic, CNN and LSTM outperform the hybrid and ensemble models in terms of uncertainty. Comparing the two combined models, the ensemble CNN+LSTM slightly outperforms the hybrid CNN-LSTM in terms of U95 values, particularly in the city center and commercial regions. AdaBoost and XGBoost demonstrate moderate uncertainty, excelling particularly in the residential and city center regions. In contrast, SARIMA and FB Prophet show the highest U95 values, reflecting poor reliability in handling call traffic, especially in high-activity zones.
Across all traffic types, the hybrid CNN-LSTM and ensemble CNN+LSTM consistently outperform other models regarding uncertainty. Their performance is especially notable in high-demand areas such as the city center and commercial zones. AdaBoost and XGBoost, while less accurate, maintain moderate uncertainty levels, making them practical for applications where speed and efficiency are prioritized. CNN and LSTM provide a middle ground, with slightly higher uncertainty compared to the hybrid models. Notably, the LSTM model exhibits reduced uncertainty relative to the CNN for traffic predictions, as evidenced by its consistently lower U95 values in the majority of regions. In contrast, SARIMA and FB Prophet show the highest U95 values, making them unsuitable for scenarios requiring low uncertainty and high prediction stability.
Trade-off between accuracy and computational time
Reducing the input sequence length is an effective strategy for simplifying the temporal dependencies a model must learn, thereby improving computational efficiency and overall feasibility. By truncating sequences into smaller, manageable segments, the model can focus on the most recent data points, which are often the most relevant for accurate predictions. A complementary way to cut computation is to replace LSTM units with Gated Recurrent Units (GRUs) [45]. GRUs simplify the recurrent architecture by merging the forget and input gates into a single update gate, which reduces the number of trainable parameters. Consequently, replacing the LSTM networks with GRUs in the current research can further enhance computational efficiency. This approach not only minimizes computational overhead but also reduces the risk of overfitting to long-term dependencies that may not contribute significantly to performance.
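Assuming a Keras implementation with the Table 5 LSTM settings, the substitution amounts to swapping the recurrent layer type, as in the minimal sketch below.

```python
from tensorflow.keras import layers, models, optimizers

model = models.Sequential([
    layers.Input(shape=(24, 1)),   # 24 hourly steps, one feature (assumed)
    layers.GRU(100),               # was layers.LSTM(100); fewer gates and trainable parameters
    layers.Dropout(0.2),
    layers.Dense(100, activation="relu"),
    layers.Dense(1),
])
model.compile(optimizer=optimizers.Adam(learning_rate=0.001), loss="mse")
```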
The impact of the GRU substitution is evident in the analysis results presented in Table 10. Applying this approach to the ensemble CNN+LSTM model in the city center region resulted in a substantial improvement in computational efficiency. The training time for GRUs was significantly lower than that of LSTMs across all traffic types. For Internet traffic, the training time decreased from 85.11 s with LSTMs to just 49.1 s with GRUs, a 42.31% computational enhancement. Similarly, for SMS traffic, training time was reduced from 84.94 s to 50.4 s, a 40.66% improvement. Calls also demonstrated a notable reduction, with training time dropping from 85.53 s to 47.2 s, a 44.81% computational enhancement. While GRUs demonstrated a clear advantage in terms of computational efficiency, a slight reduction in accuracy was observed. For Internet traffic, the R2 value decreased marginally from 0.990 with LSTMs to 0.984 with GRUs, reflecting only a 0.60% accuracy reduction. SMS traffic experienced a slightly higher reduction of 1.54%, with R2 values dropping from 0.973 to 0.958. Similarly, call traffic showed a minor accuracy reduction of 0.50%, with R2 decreasing from 0.987 to 0.982. Despite these minor reductions in accuracy, the overall performance remained within an acceptable range. Thus, GRUs deliver faster training and inference while maintaining accuracy levels comparable to LSTMs in time-series prediction.
Table 10. Comparative analysis of LSTMs and GRUs implemented into the developed CNN model
Traffic type | R2 (LSTMs) | R2 (GRUs) | Accuracy reduction (%) | Training time, LSTMs (s) | Training time, GRUs (s) | Computational enhancement (%) |
|---|---|---|---|---|---|---|
Internet | 0.990 | 0.984 | 0.60 | 85.11 | 49.1 | 42.31 |
SMS | 0.973 | 0.958 | 1.54 | 84.94 | 50.4 | 40.66 |
Calls | 0.987 | 0.982 | 0.50 | 85.53 | 47.2 | 44.81 |
The trade-off between accuracy and computational efficiency is justified, particularly for applications that require rapid processing and real-time decision-making. By adopting GRUs, the combined model becomes more streamlined and practical for real-world scenarios where computational time constraints are a concern. The significant reduction in training time, coupled with only a slight decline in accuracy, demonstrates that adopting GRUs and reducing the sequence length are highly effective strategies for enhancing the efficiency of deep learning models.
Assessment of prediction models via sampled dataset
After the full set of models had been applied to the original data, the dataset was sampled by selecting two representative days per week: an average workday and an average weekend day. The sampling was based on the Internet traffic records, as these were the primary factor influencing the identified traffic patterns. The three most accurate prediction models were then applied to this sampled dataset. Table 11 illustrates the prediction of Internet records for the various areas based on the sampled dataset. In terms of accuracy, the results indicate that the ensemble CNN+LSTM model is the most accurate and generally efficient model for predicting Internet usage across the various area types; it consistently matches the actual data closely, outperforming the hybrid CNN-LSTM and LSTM models. In terms of computational time, the hybrid CNN-LSTM model exhibited the highest time consumption, while the LSTM and ensemble CNN+LSTM models demonstrated comparable performance. Notably, the ensemble CNN+LSTM model achieved the highest accuracy with moderate computational time, making it an excellent choice for balancing accuracy and time efficiency.
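A hedged pandas sketch of this sampling step is given below: hourly Internet records are averaged into one representative workday and one representative weekend day. The file name and column names are assumptions for illustration.

```python
import pandas as pd

df = pd.read_csv("internet_city_center.csv", parse_dates=["timestamp"])  # hypothetical file
df["is_weekend"] = df["timestamp"].dt.dayofweek >= 5
df["hour"] = df["timestamp"].dt.hour

# 48 rows in total: 24 hourly averages for the workday profile + 24 for the weekend profile
sampled = df.groupby(["is_weekend", "hour"], as_index=False)["internet"].mean()
```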
Table 11. Results of Internet records based on best prediction models for different regions via the sampled dataset
Region | Prediction model | MAE | RMSE | R2 | Training time (seconds) |
|---|---|---|---|---|---|
City center | LSTM | 1168.09 | 1751.11 | 0.969 | 22.16 |
Hybrid CNN-LSTM | 1071.32 | 1767.21 | 0.968 | 71.25 | |
Ensemble CNN+LSTM | 981.42 | 1447.83 | 0.979 | 19.05 | |
Commercial | LSTM | 57.46 | 68.51 | 0.926 | 17.38 |
Hybrid CNN-LSTM | 25.81 | 36.74 | 0.949 | 84.93 | |
Ensemble CNN+LSTM | 26.59 | 36.11 | 0.951 | 23.79 | |
Residential | LSTM | 67.29 | 97.11 | 0.946 | 22.06 |
Hybrid CNN-LSTM | 45.04 | 74.00 | 0.969 | 84.18 | |
Ensemble CNN+LSTM | 40.88 | 68.52 | 0.973 | 23.19 | |
Business | LSTM | 5.24 | 7.26 | 0.941 | 17.38 |
Hybrid CNN-LSTM | 5.13 | 6.91 | 0.946 | 84.00 | |
Ensemble CNN+LSTM | 4.43 | 6.15 | 0.957 | 22.90 |
As an alternative evaluation, Fig. 14 shows dynamic plots of the hybrid CNN-LSTM, ensemble CNN+LSTM, and LSTM models for the Internet records on the sampled data. The city center area exhibits exceptionally high Internet usage on both weekdays and weekends (Fig. 14a). The commercial area demonstrates higher Internet usage on weekdays, reflecting the work-related Internet activity typical of business environments (Fig. 14b). Conversely, the residential area shows significantly higher Internet usage during weekends, indicating increased online activity when residents are at home (Fig. 14c). The business area exhibits relatively low overall Internet usage (Fig. 14d); however, there is a noticeable increase in activity on weekdays, primarily driven by work-related activities. Despite the reduction in data points, the prediction models, particularly ensemble CNN+LSTM, maintained high accuracy across all areas. This method simplified the dataset, reduced computational complexity, and ensured reliable and accurate predictions.
[See PDF for image]
Fig. 14
Dynamic plots of the Internet records prediction for the various regions using the sampled dataset
Generalizability assessment of ensemble CNN+LSTM
To assess the generalizability and reliability of the most effective model employed in this study (Ensemble CNN+LSTM) for practical applications, its performance was validated using an independent dataset distinct from the one used during model development. The validation dataset, titled “Predict Traffic of LTE Network”, was sourced from Kaggle.com and spans one year of network traffic data collection [58]. This dataset comprises 497,544 entries from 57 cellular network cells, capturing temporal patterns in mobile data usage. Given its time-dependent characteristics, the dataset is particularly well-suited for machine learning tasks focused on forecasting cellular network traffic. Figure 15 presents a comparative scatter plot illustrating the predicted versus actual traffic values for the LTE network dataset. Each point in the plot represents a single prediction, with a red dashed line denoting the ideal prediction line (0% error). The concentration of data points around this line indicates that the model effectively aligns its predictions with actual values. Furthermore, the Ensemble CNN+LSTM model achieved an R2 value of 0.71, signifying a strong correlation between predicted and actual traffic data.
[See PDF for image]
Fig. 15
Scatter plot of validation process comparing the actual and predicted traffic based on the ensemble CNN+LSTM
For comparison, the same dataset was previously analyzed by Alekseeva et al. [29], where multiple ensemble models, including Bagging, Random Forest, and Gradient Boosting, were evaluated. Their results demonstrated that Gradient Boosting achieved the highest performance among the ensemble models, attaining an R2 value of 0.602. However, despite its competitive performance, Gradient Boosting incurred a higher computational cost and significantly longer training times, making it less efficient for large-scale or real-time applications. In contrast, the validation results presented in Fig. 15 (R2 = 0.71) demonstrate that the ensemble CNN+LSTM model substantially outperforms the Gradient Boosting model from the Alekseeva et al. study [29]. This superior performance highlights the ensemble CNN+LSTM model's ability to capture complex temporal patterns effectively and its strong generalization when applied to an independent dataset for LTE network traffic prediction. These findings underscore the potential of deep learning-based hybrid models in enhancing the accuracy and efficiency of network traffic forecasting while maintaining computational feasibility.
Comparison with previous studies
Table 12 presents a comparative analysis of the most effective model in the current study (ensemble CNN+LSTM) against previous studies on the different CDR predictions. The current study employs an ensemble CNN+LSTM model to predict different types of call detail records (CDRs) with superior accuracy, as demonstrated by its R2 values for the Internet (0.990), SMS (0.976), and call (0.986) datasets. These results significantly outperform prior studies that applied various ML and DL techniques, as summarized in Table 12. Yuliana et al. [36] explored ML approaches, including KNN, random forest, and XGBoost, to predict cellular network traffic based on hourly key performance indicators (KPIs) collected from base stations in Bandung, Indonesia. Among these models, XGBoost achieved the best R2 value of 0.976. While comparable to the SMS prediction accuracy of the current study (R2 = 0.976), Yuliana et al.'s work lacks the multi-dimensional predictive power demonstrated by the ensemble CNN+LSTM model across multiple CDR types, such as Internet and call records.
Table 12. Comparative analysis of traffic prediction with prior studies
Reference | Best models | R2 |
|---|---|---|
Yuliana et al. [36] | XGBoost | 0.976 |
Zeng et al. [59] | Convolution-LSTM | 0.927 (SMS), 0.962 (Call), 0.971 (Internet) |
Convolution-GRU | 0.930 (SMS), 0.968 (Call), 0.972 (Internet) | |
Lin et al. [6] | MGCN-LSTM | 0.978 (SMS), 0.985 (Call), 0.972 (Internet) |
Aldhyani et al. [32] | LSTM | 0.979 |
ANFIS | 0.967 | |
Current study | Ensemble CNN+LSTM | 0.976 (SMS), 0.986 (Call), 0.990 (Internet) |
Zeng et al. [59] introduced attention-based multi-context spatio-temporal convolutional networks (att-MCSTCNet) using convolution-LSTM and convolution-GRU (gate recurrent unit) models to predict CDR traffic in Milan. Their convolution-LSTM model achieved R2 values of 0.927 for SMS, 0.962 for Call, and 0.971 for Internet, while the convolution-GRU model performed slightly better with R2 values of 0.930, 0.968, and 0.972, respectively. The ensemble CNN+LSTM model in the current study not only surpasses these results but also demonstrates a more balanced performance across all datasets. Specifically, the Internet R2 value of 0.990 is significantly higher, reflecting the enhanced predictive accuracy achieved by combining CNN’s feature extraction capabilities with LSTM’s sequential learning strengths. Meanwhile, the study by Lin et al. [6] proposed a spatial–temporal traffic prediction model that integrates a multi-graph convolutional network (MGCN) with LSTM to capture both spatial and temporal features. Their MGCN-LSTM model achieved the highest R2 values of 0.978 for SMS, 0.985 for call, and 0.972 for Internet records, showcasing excellent predictive accuracy. However, the current study’s ensemble CNN+LSTM model slightly outperforms Lin et al.’s results, with improvements of approximately 0.001 for call and 0.018 for Internet. These marginal but notable enhancements highlight the efficacy of combining CNN and LSTM architectures to boost prediction reliability.
Aldhyani et al. [32] applied hybrid approaches, including non-crisp fuzzy c-means (FCM) clustering with weighted exponential smoothing, to enhance LSTM and adaptive neuro-fuzzy inference system (ANFIS) models. The LSTM model achieved an R2 of 0.979, while the ANFIS model obtained an R2 of 0.967. Although these hybrid methods improved predictive accuracy, they still fall short of the current study's ensemble CNN+LSTM. Overall, the ensemble CNN+LSTM model demonstrates the highest predictive accuracy across all CDR types compared to previous studies, leveraging CNN's superior feature extraction and LSTM's temporal learning capabilities to achieve more precise and reliable predictions. These findings demonstrate the ensemble CNN+LSTM model's ability to combine CNN's feature extraction with LSTM's sequential learning strengths, offering a more robust and precise network traffic prediction solution than standalone LSTM.
Regarding training time efficiency, the study by Oliveira et al. [60] provides a useful point of comparison. Oliveira et al. [60] evaluated several models, including a multilayer perceptron (MLP), an MLP with resilient backpropagation (RBprop), an RNN, and deep-learning stacked autoencoders (SAE), all trained on 1-h interval data spanning 51 days, similar to the dataset in the present study. Their results show that the MLP required 7.32 s, the MLP with RBprop took 7.18 s, the RNN was trained in 2.84 s, and the SAE had a significantly longer training time of 837.57 s (approximately 14 min). This comparison underscores that deep learning models like SAE can require considerable training time, which may not be ideal for real-time network traffic prediction tasks. Additionally, Alekseeva et al. [29] examined Gradient Boosting models, revealing that the most accurate model required 25 s of training time, whereas the Huber Regression model, which performed poorly in terms of accuracy with an R2 value of 0.268, demonstrated the shortest training time of only 4 s.
In contrast, the models in the current study, such as AdaBoost and XGBoost, exhibited impressive training times ranging from 0.02 to 0.13 s while maintaining high accuracy. The most accurate model, the ensemble CNN+LSTM, required between 84 and 86 s for training. After optimizing this model by substituting the LSTM with a GRU, the training time was significantly reduced to a range of approximately 47 to 50 s, corresponding to a computational enhancement of roughly 41-45% (Table 10). Therefore, the models in this study, particularly AdaBoost, XGBoost, and the optimized ensemble CNN+GRU, demonstrate superior training efficiency and accuracy compared to previous models, offering significantly reduced training times while maintaining high performance.
Conclusions
The main goal of this study was to develop and assess the effectiveness of various models for cellular network traffic prediction, including SARIMA, FB Prophet, AdaBoost, XGBoost, LSTM, CNN, hybrid CNN-LSTM, and ensemble CNN+LSTM, and to compare them. These models were evaluated across various regions (city center, commercial, residential, and business) and various CDRs (i.e., Internet, SMS, and calls). The research was conducted with two datasets: the first was the original dataset, and the second was sampled by focusing on average Internet traffic records so as to include only two representative days per week, one for workdays and the other for weekend days. The main findings of the present study are summarized as follows:
Based on the accuracy performance metrics, the predictive models were ranked in descending order of effectiveness: ensemble CNN+LSTM, hybrid CNN-LSTM, LSTM, CNN, AdaBoost, XGBoost, and FB Prophet. The SARIMA model demonstrated the highest average RMSE and lowest R2 values, highlighting its limitations compared to more advanced models.
Regarding the computational time, the developed models were ranked from fastest to slowest: XGBoost, AdaBoost, SARIMA, FB Prophet, CNN, LSTM, ensemble CNN+LSTM, and hybrid CNN-LSTM. The XGBoost model was highly suitable for real-time applications due to its shortest computation time. In contrast, the hybrid CNN-LSTM model required the most computation time.
The ensemble CNN+LSTM model demonstrated superior performance, achieving R2 values of 0.990 for Internet, 0.986 for Call, and 0.976 for SMS. Although it required considerable computational time, it outperformed the SARIMA model with average RMSE improvements of 68.6% in city centers, 44.75% in the commercial region, 23.77% in the residential region, and 45.72% in the business region.
Replacing the LSTMs with gated recurrent units (GRUs) within the combined model significantly enhanced computational efficiency, reducing training times by up to 44.81% across the traffic types with only minor accuracy reductions, making GRUs a highly efficient option for optimizing deep learning models in real-time and time-series prediction tasks.
After sampling the dataset and training the top three models (i.e., ensemble CNN+LSTM, hybrid CNN-LSTM, and LSTM) on it, these models exhibited high accuracy and markedly decreased computational time. The ensemble CNN+LSTM model was particularly effective, achieving a 73.92% improvement in time efficiency relative to the original data. This reduction technique successfully balanced computational efficiency and precision, offering a practical strategy for real-world applications.
The Ensemble CNN+LSTM model exhibits exceptional generalization performance, achieving an R2 value of 0.71 on an independent LTE network traffic validation dataset, thereby significantly surpassing the predictive accuracy of previous ensemble models, such as Gradient Boosting, which attained an R2 value of 0.602.
Scope of future work
The study identifies several key directions for future research to advance cellular network traffic prediction models. A prominent approach involves the integration of advanced models, such as Transformer-based architecture and reinforcement learning algorithms, which are anticipated to enhance both predictive accuracy and computational efficiency. Another focal area is the optimization of real-time deployment, where refining complex models like CNN+LSTM for real-time applications can be achieved through model compression techniques or distributed computing. Additionally, research into predictive models that support dynamic base station power control could significantly contribute to energy efficiency and the advancement of green networking. Further investigation is warranted to assess the performance of these models in 5G and 6G networks, characterized by increased bandwidth and a higher density of connected devices. Integrating user mobility data is also essential to improve the precision of traffic predictions, particularly in urban environments where traffic patterns are highly variable. Moreover, deploying models at the network’s edge (such as base stations or small cells) holds promises for delivering real-time predictions with reduced latency, thereby alleviating computational demands on central systems. To ensure the broad applicability of these models, future work should focus on generalizing their use across diverse datasets from different regions, potentially utilizing multi-modal data integration to enhance their adaptability. Furthermore, incorporating spatial and weather features is crucial; future research should explore the impact of geographic data (e.g., location and traffic density) and external factors (e.g., weather conditions) on network traffic. By integrating these elements, predictive models can be more accurate, accounting for location-specific events and environmental conditions, ultimately facilitating the development of robust and flexible traffic prediction models that can effectively respond to a wide range of scenarios.
Acknowledgements
The authors would like to thank the journal editor and the anonymous reviewers for their editing and comments.
Author contributions
All authors contributed to the study's conception and design. Material preparation, data collection, validation, and writing of the first draft of the manuscript were performed by Alaa Ashraf Hussien. Heba Nashaat reviewed, edited, and commented on previous versions of the manuscript. Rehab Farouk Abdel-Kader performed the final review, read, and approved the final manuscript.
Funding
This work was not supported or funded by any funding agency.
Availability of data and materials
The data presented in this paper are available on request from the corresponding author.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Frauendorf, JL; Almeida de Souza, É. The different architectures used in 1G, 2G, 3G, 4G, and 5G networks. The architectural and technological revolution of 5G; 2023; Cham, Springer International Publishing: pp. 83-107. [DOI: https://dx.doi.org/10.1007/978-3-031-10650-7_7]
2. Jiang, W. Cellular traffic prediction with machine learning: a survey. Expert Syst Appl; 2022; 201, 117163. [DOI: https://dx.doi.org/10.1016/j.eswa.2022.117163]
3. Kalfas, G; Vagionas, C; Antonopoulos, A et al. Next generation fiber-wireless fronthaul for 5G mmWave networks. IEEE Commun Mag; 2019; 57, pp. 138-144. [DOI: https://dx.doi.org/10.1109/MCOM.2019.1800266]
4. Ji, B; Wang, Y; Song, K et al. A survey of computational intelligence for 6G: key technologies, applications and trends. IEEE Trans Ind Inform; 2021; 17, pp. 7145-7154. [DOI: https://dx.doi.org/10.1109/TII.2021.3052531]
5. Trinh HD, Bui N, Widmer J, et al. Analysis and modeling of mobile traffic using real traces. In: 2017 IEEE 28th annual international symposium on personal, indoor, and mobile radio communications (PIMRC). IEEE; 2017. pp. 1–6. https://doi.org/10.1109/PIMRC.2017.8292200
6. Lin, J; Chen, Y; Zheng, H et al. A data-driven base station sleeping strategy based on traffic prediction. IEEE Trans Netw Sci Eng; 2024; 11, pp. 5627-5643. [DOI: https://dx.doi.org/10.1109/TNSE.2021.3109614]
7. Wang, X; Zhou, Z; Xiao, F et al. Spatio-temporal analysis and prediction of cellular traffic in metropolis. IEEE Trans Mob Comput; 2019; 18, pp. 2190-2202. [DOI: https://dx.doi.org/10.1109/TMC.2018.2870135]
8. Richerzhagen N, Richerzhagen B, Hark R, et al. Adaptive monitoring for mobile networks in challenging environments. In: Advances in computer communications and networks from green, mobile, pervasive networking to big data computing. River Publishers; 2022. pp. 91–126. https://doi.org/10.1109/ICCCN.2015.7288371
9. Zhang S, Zhao S, Yuan M, et al. Traffic prediction based power saving in cellular networks. In: Proceedings of the 25th ACM SIGSPATIAL international conference on advances in geographic information systems. New York: ACM; 2017. pp. 1–10. https://doi.org/10.1145/3139958.3140053
10. Zhang, D; Liu, L; Xie, C et al. Citywide cellular traffic prediction based on a hybrid spatiotemporal network. Algorithms; 2020; 13, 20. [DOI: https://dx.doi.org/10.3390/a13010020]
11. Parmar, KS; Bhardwaj, R. Water quality management using statistical analysis and time-series prediction model. Appl Water Sci; 2014; 4, pp. 425-434. [DOI: https://dx.doi.org/10.1007/s13201-014-0159-9]
12. Correia D, Pinto FC, Sargento S, Georgieva P. Cluster-based approach for cellular traffic prediction with machine learning methods. In: 2024 IEEE 22nd mediterranean electrotechnical conference (MELECON). IEEE; 2024. pp. 514–519. https://doi.org/10.1109/MELECON56669.2024.10608627
13. Poonia A, Garg T, Mishra O, et al. Agriculture 4.0—integrated smart irrigation system. In: 2024 15th international conference on computing communication and networking technologies (ICCCNT). IEEE; 2024. pp. 1–8. https://doi.org/10.1109/ICCCNT61001.2024.10724377
14. Oo, ZZ; Phyu, S. Time series prediction based on Facebook prophet: a case study, temperature forecasting in Myintkyina. Int J Appl Math Electron Comput; 2020; 8, pp. 263-267. [DOI: https://dx.doi.org/10.18100/ijamec.816894]
15. Guesmi L, Mejri A, Radhouane A, Zribi K. Advanced predictive modeling for enhancing traffic forecasting in emerging cellular networks. In: 2024 15th international conference on network of the future (NoF). IEEE; 2024. pp. 209–213. https://doi.org/10.1109/NoF62948.2024.10741363
16. Barrow, DK; Crone, SF. A comparison of AdaBoost algorithms for time series forecast combination. Int J Forecast; 2016; 32, pp. 1103-1119. [DOI: https://dx.doi.org/10.1016/j.ijforecast.2016.01.006]
17. Zhang Q. A framework for enhanced time series forecasting of internet traffic based on AdaBoost-LSTM integration. In: 2024 5th international conference on computer engineering and application (ICCEA). IEEE; 2024. pp. 348–353. https://doi.org/10.1109/ICCEA62105.2024.10604099
18. Fang, Z; Yang, S; Lv, C et al. Application of a data-driven XGBoost model for the prediction of COVID-19 in the USA: a time-series study. BMJ Open; 2022; 12, [DOI: https://dx.doi.org/10.1136/bmjopen-2021-056685] e056685.
19. Hussien AA, Nashaat H, Abdel-Kader RF. Evaluating AI approaches for 5G network traffic prediction: a comparative analysis. In: Proceedings of the 11th international conference on advanced intelligent systems and informatics (AISI 2025). Cham: Springer; 2025. pp. 124–135. https://doi.org/10.1007/978-3-031-81308-5_12
20. Khattak, A; Zhang, J; Chan, P et al. A new frontier in wind shear intensity forecasting: stacked temporal convolutional networks and tree-based models framework. Atmosphere (Basel); 2024; 15, 1369. [DOI: https://dx.doi.org/10.3390/atmos15111369]
21. Trinh HD, Giupponi L, Dini P. Mobile traffic prediction from raw data using LSTM networks. In: 2018 IEEE 29th annual international symposium on personal, indoor and mobile radio communications (PIMRC). IEEE; 2018. pp. 1827–1832. https://doi.org/10.1109/PIMRC.2018.8581000
22. Zeng Z, Kaur R, Siddagangappa S, et al. Financial time series forecasting using CNN and transformer; 2023. arXiv Prepr arXiv230404912. https://doi.org/10.48550/arXiv.2304.04912
23. Priya SA, Bhat N, Kanna BR, et al. Proactive network optimization using deep learning in predicting IoT traffic dynamics. In: 2024 4th international conference on innovative practices in technology and management (ICIPTM). IEEE; 2024. pp. 1–6. https://doi.org/10.1109/ICIPTM59628.2024.10563433
24. Xu, F; Lin, Y; Huang, J et al. Big data driven mobile traffic understanding and forecasting: a time series approach. IEEE Trans Serv Comput; 2016; 9, pp. 796-805. [DOI: https://dx.doi.org/10.1109/TSC.2016.2599878]
25. Huang CW, Chiang CT, Li Q. A study of deep learning networks on mobile traffic forecasting. In: 2017 IEEE 28th annual international symposium on personal, indoor, and mobile radio communications (PIMRC). IEEE; 2017. pp. 1–6. https://doi.org/10.1109/PIMRC.2017.8292737
26. Abbasi, M; Shahraki, A; Taherkordi, A. Deep learning for network traffic monitoring and analysis (NTMA): a survey. Comput Commun; 2021; 170, pp. 19-41. [DOI: https://dx.doi.org/10.1016/j.comcom.2021.01.021]
27. Kim, H-W; Lee, J-H; Choi, Y-H et al. Dynamic bandwidth provisioning using ARIMA-based traffic forecasting for Mobile WiMAX. Comput Commun; 2011; 34, pp. 99-106. [DOI: https://dx.doi.org/10.1016/j.comcom.2010.08.008]
28. Kochetkova, I; Kushchazli, A; Burtseva, S; Gorshenin, A. Short-term mobile network traffic forecasting using seasonal ARIMA and holt-winters models. Futur Internet; 2023; 15, 290. [DOI: https://dx.doi.org/10.3390/fi15090290]
29. Alekseeva, D; Stepanov, N; Veprev, A et al. Comparison of machine learning techniques applied to traffic prediction of real wireless network. IEEE Access; 2021; 9, pp. 159495-159514. [DOI: https://dx.doi.org/10.1109/ACCESS.2021.3129850]
30. Fang, L; Cheng, X; Wang, H; Yang, L. Mobile demand forecasting via deep graph-sequence spatiotemporal modeling in cellular networks. IEEE Internet Things J; 2018; 5, pp. 3091-3101. [DOI: https://dx.doi.org/10.1109/JIOT.2018.2832071]
31. Andreoletti D, Troia S, Musumeci F, et al. Network traffic prediction based on diffusion convolutional recurrent neural networks. In: IEEE INFOCOM 2019—IEEE conference on computer communications workshops (INFOCOM WKSHPS). IEEE; 2019. pp. 246–251. https://doi.org/10.1109/INFCOMW.2019.8845132
32. Aldhyani, THH; Alrasheedi, M; Alqarni, AA et al. Intelligent hybrid model to enhance time series models for predicting network traffic. IEEE Access; 2020; 8, pp. 130431-130451. [DOI: https://dx.doi.org/10.1109/ACCESS.2020.3009169]
33. Dommaraju, VS; Nathani, K; Tariq, U et al. ECMCRR-MPDNL for cellular network traffic prediction with big data. IEEE Access; 2020; 8, pp. 113419-113428. [DOI: https://dx.doi.org/10.1109/ACCESS.2020.3002380]
34. Hassan, MK; Syed Ariffin, SH; Ghazali, NE et al. Dynamic learning framework for smooth-aided machine-learning-based backbone traffic forecasts. Sensors; 2022; 22, 3592. [DOI: https://dx.doi.org/10.3390/s22093592]
35. Ferreira, GO; Ravazzi, C; Dabbene, F et al. Forecasting network traffic: a survey and tutorial with open-source comparative evaluation. IEEE Access; 2023; 11, pp. 6018-6044. [DOI: https://dx.doi.org/10.1109/ACCESS.2023.3236261]
36. Yuliana, H; Hendrawan, I; Musashi, Y. Estimating base station traffic and throughput using machine learning based on hourly key performance indicator (KPI) network analysis. IEEE Access; 2024; 12, pp. 116285-116301. [DOI: https://dx.doi.org/10.1109/ACCESS.2024.3447098]
37. Azari A, Papapetrou P, Denic S, Peters G. Cellular traffic prediction and classification: a comparative evaluation of LSTM and ARIMA. In: Discovery science: 22nd international conference, DS 2019, Split, Croatia, October 28–30, 2019, proceedings 22. Springer; 2019. pp. 129–144. https://doi.org/10.1007/978-3-030-33778-0_11
38. Madan R, Mangipudi PS. Predicting computer network traffic: a time series forecasting approach using DWT, ARIMA and RNN. In: 2018 eleventh international conference on contemporary computing (IC3). IEEE; 2018. pp. 1–5. https://doi.org/10.1109/IC3.2018.8530608
39. Zhang, C; Zhang, H; Yuan, D; Zhang, M. Citywide cellular traffic prediction based on densely connected convolutional neural networks. IEEE Commun Lett; 2018; 22, pp. 1656-1659. [DOI: https://dx.doi.org/10.1109/LCOMM.2018.2841832]
40. Zhang, C; Patras, P; Haddadi, H. Deep learning in mobile and wireless networking: a survey. IEEE Commun Surv Tutorials; 2019; 21, pp. 2224-2287. [DOI: https://dx.doi.org/10.1109/COMST.2019.2904897]
41. Hardegen, C; Pfulb, B; Rieger, S; Gepperth, A. Predicting network flow characteristics using deep learning and real-world network traffic. IEEE Trans Netw Serv Manag; 2020; 17, pp. 2662-2676. [DOI: https://dx.doi.org/10.1109/TNSM.2020.3025131]
42. Liu, Z; Li, Z; Wu, K; Li, M. Urban traffic prediction from mobility data using deep learning. IEEE Netw; 2018; 32, pp. 40-46. [DOI: https://dx.doi.org/10.1109/MNET.2018.1700411]
43. Barlacchi, G; De Nadai, M; Larcher, R et al. A multi-source dataset of urban life in the city of Milan and the Province of Trentino. Sci Data; 2015; 2, 150055. [DOI: https://dx.doi.org/10.1038/sdata.2015.55]
44. Elshaarawy, MK; Hamed, AK. Modeling hydraulic jump roller length on rough beds: a comparative study of ANN and GEP models. J Umm Al-Qura Univ Eng Archit; 2025; [DOI: https://dx.doi.org/10.1007/s43995-024-00093-x]
45. Jaiswal R, Singh B. A hybrid convolutional recurrent (CNN-GRU) model for stock price prediction. In: 2022 IEEE 11th international conference on communication systems and network technologies (CSNT). IEEE; 2022. pp. 299–304. https://doi.org/10.1109/CSNT54456.2022.9787651
46. Li, F; Zhang, Z; Chu, X et al. A meta-learning based framework for cell-level mobile network traffic prediction. IEEE Trans Wirel Commun; 2023; 22, pp. 4264-4280. [DOI: https://dx.doi.org/10.1109/TWC.2023.3247241]
47. Nashaat, H; Mohammed, NH; Abdel-Mageid, SM; Rizk, RY. Machine learning-based cellular traffic prediction using data reduction techniques. IEEE Access; 2024; 12, pp. 58927-58939. [DOI: https://dx.doi.org/10.1109/ACCESS.2024.3392624]
48. Elshaarawy, MK; Hamed, AK. Machine learning and interactive GUI for estimating roller length of hydraulic jumps. Neural Comput Appl; 2024; [DOI: https://dx.doi.org/10.1007/s00521-024-10846-3]
49. Williams B, Halloin C, Löbel W, et al. Data-driven model development for cardiomyocyte production experimental failure prediction. In: Computer aided chemical engineering. Elsevier; 2020. pp. 1639–1644. https://doi.org/10.1016/B978-0-12-823377-1.50274-3
50. Adams, SO; Mustapha, B; Alumbugu, AI. Seasonal autoregressive integrated moving average (SARIMA) model for the analysis of frequency of monthly rainfall in Osun State, Nigeria. Phys Sci Int J; 2019; [DOI: https://dx.doi.org/10.9734/psij/2019/v22i430139]
51. Hamed, AK; Elshaarawy, MK; Alsaadawi, MM. Stacked-based machine learning to predict the uniaxial compressive strength of concrete materials. Comput Struct; 2025; 308, 107644. [DOI: https://dx.doi.org/10.1016/j.compstruc.2025.107644]
52. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. 2016. pp. 785–794. https://doi.org/10.1145/2939672.2939785
53. Xie, H; Zhang, L; Lim, CP. Evolving CNN-LSTM models for time series prediction using enhanced grey wolf optimizer. IEEE Access; 2020; 8, pp. 161519-161541. [DOI: https://dx.doi.org/10.1109/ACCESS.2020.3021527]
54. Mijan R. Different ways to combine CNN and LSTM networks for time series classification tasks. Medium. https://medium.com/@mijanr/different-ways-to-combine-cnn-and-lstm-networks-for-time-series-classification-tasks-b03fc37e91b6. Accessed 21 Sep 2024.
55. Eltarabily, MG; Hamed, AK; Elkiki, M; Selim, T. Hydraulic assessment of different types of piano key weirs. ISH J Hydraul Eng; 2024; [DOI: https://dx.doi.org/10.1080/09715010.2024.2415938]
56. Tian, W; Isleem, HF; Hamed, AK; Elshaarawy, MK. Enhancing discharge prediction over type-A piano key weirs: an innovative machine learning approach. Flow Meas Instrum; 2024; 100, 102732. [DOI: https://dx.doi.org/10.1016/j.flowmeasinst.2024.102732]
57. Ren, Y; Isleem, HF; Almoghaye, WJK et al. Machine learning-based prediction of elliptical double steel columns under compression loading. J Big Data; 2025; 12, 50. [DOI: https://dx.doi.org/10.1186/s40537-025-01081-1]
58. Predict traffic of LTE network. https://www.kaggle.com/code/kerneler/starter-predict-traffic-of-lte-network-b464a5dc-c. Accessed 1 Feb 2025.
59. Zeng, Q; Sun, Q; Chen, G; Duan, H. Attention based multi-component spatiotemporal cross-domain neural network model for wireless cellular network traffic prediction. EURASIP J Adv Signal Process; 2021; 2021, 46. [DOI: https://dx.doi.org/10.1186/s13634-021-00756-0]
60. Oliveira, TP; Barbar, JS; Soares, AS. Computer network traffic prediction: a comparison between traditional and deep learning neural networks. Int J Big Data Intell; 2016; 3, 28. [DOI: https://dx.doi.org/10.1504/IJBDI.2016.073903]
© The Author(s) 2025. This work is published under the Creative Commons BY-NC-ND 4.0 License (http://creativecommons.org/licenses/by-nc-nd/4.0/).