
Abstract

In the era of big data, organizations face critical decisions when selecting between data lakes and data warehouses to meet their analytics requirements. This article presents a comprehensive comparative analysis of these two predominant data management architectures, emphasizing their structural differences, functional capabilities, and suitability for diverse analytics workloads. Data lakes offer scalable, cost-effective storage for raw, unstructured, and semi-structured data, supporting advanced analytics and machine learning applications. In contrast, data warehouses provide optimized, schema-on-write frameworks for fast querying and reliable reporting on structured data. Through a detailed examination of architectural designs, integration with big data tools including Hadoop, Spark, and Kafka, and evaluations based on performance, scalability, cost, and governance, the article provides organizations with evidence-based guidance to align their data strategies with business objectives. Case studies from the healthcare and retail sectors illustrate the practical implications of each approach, while emerging trends such as lakehouse architectures, AI integration, blockchain security, edge computing, and quantum computing highlight future directions. The findings support a hybrid data management solution that leverages the strengths of both data lakes and data warehouses to enable robust, scalable, and innovative big data analytics.

Details

1009240
Title
Data lakes versus data warehouses: choosing the right approach for big data analytics
Author
Mezzoudj, Saliha 1; Khelifa, Meriem 2; Saadna, Yassmina 3

1 University of Algiers 1, Faculty of Sciences, Department of Computer Science, Algiers, Algeria (GRID:grid.472451.1) (ISNI:0000 0004 4654 9795)
2 University of Kasdi Merbah Ouargla, Department of Computer Science, Faculty of Sciences, Ouargla, Algeria (GRID:grid.442522.7) (ISNI:0000 0004 0524 3132)
3 University of Batna, Lastic Laboratory, Department of Mathematic Computer Science, Batna, Algeria (GRID:grid.440475.6) (ISNI:0000 0004 1771 734X)
Volume
12
Issue
1
Pages
89
Publication year
2025
Publication date
Dec 2025
Publisher
Springer Nature B.V.
Place of publication
Cairo
Country of publication
Netherlands
e-ISSN
23147172
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-10-27
Milestone dates
2025-10-01 (Registration); 2025-01-08 (Received); 2025-09-30 (Accepted)
First posting date
27 Oct 2025
ProQuest document ID
3265228395
Document URL
https://www.proquest.com/scholarly-journals/data-lakes-versus-warehouses-choosing-right/docview/3265228395/se-2?accountid=208611
Copyright
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-10-27
Database
ProQuest One Academic