Conference Title: 2025 IEEE/ACIS 29th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD)
Conference Start Date: 2025 June 25
Conference End Date: 2025 June 27
Conference Location: Busan, Korea, Republic of
Existing dataset recommendation (rec) systems, including ZhangRec23, WangRec22, and GDS19, face several limitations: a lack of focus on e-commerce datasets, an inability to handle complex queries, and reliance on inconsistent metadata (e.g., the data structure of the domain of the products being recommended). As a result, they return incomplete or mismatched results for complex queries such as "impact of seasonal sales on customer reviews for electronics". These traditional dataset rec systems rely on simple keyword matching, so they fail to interpret the context-sensitive queries that researchers often need and cannot capture dynamic trends in the e-commerce domain. This highlights the need for an advanced dataset rec system that improves metadata quality and integrates semantic understanding to recommend precise and relevant e-commerce datasets to researchers. This paper proposes the E-commerce Datasets Mining Rec System (EDMRec), an adaptation of the ZhangRec23 approach. EDMRec combines content-based filtering, advanced metadata processing, and machine learning in a three-layered structure consisting of (i) Data Collection, (ii) Data Processing, and (iii) Query Processing. It uses Named Entity Recognition (NER) to complete missing metadata and combines TF-IDF with Bidirectional Encoder Representations from Transformers (BERT) embeddings to capture both keyword relevance and semantic context, enhancing recommendation precision for complex queries. Experimental results show that EDMRec improves precision, recall, and F1 score by 15% over existing systems and consistently provides contextually accurate recommendations across 4,373 metadata entries from sources such as Kaggle and Google Dataset Search, making it well suited to supporting data-driven insights in e-commerce.
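The combination of keyword relevance and semantic context described in the abstract can be illustrated with a minimal sketch. The abstract does not state how EDMRec merges the two signals, so the weighted sum, the `alpha` parameter, and all function names below are assumptions made for illustration only. The TF-IDF part is a small pure-Python version, and the embeddings are passed in as plain lists of floats; in a real system they would presumably come from a BERT encoder, which is not reproduced here.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def tfidf_vectors(texts):
    """Return one sparse TF-IDF dict per text, using a smoothed IDF."""
    docs = [Counter(tokenize(t)) for t in texts]
    n = len(docs)
    doc_freq = Counter(term for d in docs for term in d)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in d.items()} for d in docs]


def cosine_sparse(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cosine_dense(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def hybrid_rank(query, query_emb, entries, alpha=0.5):
    """Rank metadata entries by a weighted sum of two similarities.

    entries: list of (name, description, embedding) tuples.
    alpha:   hypothetical weight; 1.0 uses keywords only, 0.0 uses
             embeddings only. Not a value taken from the paper.
    """
    vecs = tfidf_vectors([query] + [desc for _, desc, _ in entries])
    query_vec, entry_vecs = vecs[0], vecs[1:]
    scored = []
    for (name, _, emb), entry_vec in zip(entries, entry_vecs):
        keyword_score = cosine_sparse(query_vec, entry_vec)
        semantic_score = cosine_dense(query_emb, emb)
        scored.append((alpha * keyword_score + (1 - alpha) * semantic_score, name))
    return [name for _, name in sorted(scored, reverse=True)]
```

With toy two-dimensional embeddings, setting `alpha=1.0` ranks an entry that shares the query's words first, while `alpha=0.0` favors an entry whose embedding is closest to the query embedding even when no words overlap. This is the behavior that pure keyword matching cannot provide.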
Details
1 University of Windsor, School of Computer Science, Windsor, Canada
2 Algoma University, Faculty of Computer Science and Technology, Canada