This study evaluates the performance and energy trade-offs of three popular data processing libraries (Pandas, PySpark, and Polars) applied to GreenNav, a CO2 emission prediction pipeline for urban traffic. GreenNav is an eco-friendly navigation app that predicts CO2 emissions and identifies low-carbon routes using a hybrid CNN-LSTM model, integrated into a complete pipeline for ingesting and processing large, heterogeneous geospatial and road data. For each library applied to the GreenNav pipeline, the study measures end-to-end execution time, cumulative CPU load, and peak RAM consumption, and then converts these metrics into energy consumption and CO2 equivalents. Experiments on datasets ranging from 100 MB to 8 GB show that Polars in lazy mode offers substantial gains: it reduces processing time by a factor of more than twenty, memory consumption by about two-thirds, and energy consumption by about 60%, while maintaining the predictive accuracy of the model (R² ≈ 0.91). These results show that careful selection of data processing libraries can reconcile high computing performance with environmental sustainability in large-scale machine learning applications.
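The abstract does not state how the measured metrics are converted into energy and CO2 equivalents. The following is only a minimal sketch of one common style of conversion, not the authors' method. The function names, the power coefficients (`cpu_watts`, `ram_watts_per_gb`), and the grid carbon intensity (`grid_g_per_kwh`) are all hypothetical placeholders that would need to be replaced with values for the actual hardware and electricity mix.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules


def estimate_energy_kwh(cpu_seconds, peak_ram_gb, wall_seconds,
                        cpu_watts=65.0, ram_watts_per_gb=0.375):
    """Rough energy estimate from CPU time and peak RAM.

    Assumptions (not from the source): the CPU draws a constant
    `cpu_watts` while busy, and RAM draws `ram_watts_per_gb` per GB
    of peak usage for the whole wall-clock duration of the run.
    """
    cpu_joules = cpu_seconds * cpu_watts
    ram_joules = wall_seconds * peak_ram_gb * ram_watts_per_gb
    return (cpu_joules + ram_joules) / JOULES_PER_KWH


def estimate_co2_grams(energy_kwh, grid_g_per_kwh=500.0):
    """Convert energy to grams of CO2-equivalent.

    `grid_g_per_kwh` is a placeholder carbon intensity; real values
    depend on the country and the time of measurement.
    """
    return energy_kwh * grid_g_per_kwh


if __name__ == "__main__":
    # Purely illustrative inputs, not measurements from the study.
    kwh = estimate_energy_kwh(cpu_seconds=1200, peak_ram_gb=4.0,
                              wall_seconds=900)
    print(f"energy: {kwh:.5f} kWh, CO2e: {estimate_co2_grams(kwh):.2f} g")
```

Under a model of this kind, shorter runtime and lower peak memory both reduce the estimate, which fits the direction of the gains reported for Polars in lazy mode.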
Details
Lahmer, Mohammed 2
Karim, Mohammed 3
1 Paragraphe Laboratory, Paris 8 University of Paris, Vincennes–Saint-Denis, 93200 Saint-Denis, France; Laboratory of Engineering, Modeling, and Systems Analysis (LIMAS), Faculty of Sciences, Sidi Mohamed Ben Abdellah University (USMBA), Fez 30000, Morocco; ESISA ANALYTICA Laboratory (LEA), Department of Artificial Intelligence, School of Engineering in Applied Sciences (ESISA), Fez 30050, Morocco
2 ESISA ANALYTICA Laboratory (LEA), Department of Artificial Intelligence, School of Engineering in Applied Sciences (ESISA), Fez 30050, Morocco; Department of Computer Engineering, High School of Technology, Moulay Ismail University, Meknes 50050, Morocco
3 Laboratory of Engineering, Modeling, and Systems Analysis (LIMAS), Faculty of Sciences, Sidi Mohamed Ben Abdellah University (USMBA), Fez 30000, Morocco