
Abstract

In the rapidly evolving field of e-commerce, precise and efficient attribute extraction from product descriptions is crucial for enhancing search functionality, improving customer experience, and streamlining the listing process for sellers. This study proposes a large language model (LLM)-based approach for automated attribute extraction on Trendyol’s e-commerce platform. For comparison purposes, a deep learning (DL) model is also developed, leveraging a transformer-based architecture to efficiently identify explicit attributes. In contrast, the LLM, built on the Mistral architecture, demonstrates superior contextual understanding, enabling the extraction of both explicit and implicit attributes from unstructured text. The models are evaluated on an extensive dataset derived from Trendyol’s Turkish-language product catalog, using performance metrics such as precision, recall, and F1-score. Results indicate that the proposed LLM outperforms the DL model across most metrics, demonstrating superiority not only in direct single-model comparisons but also in average performance across all evaluated categories. This advantage is particularly evident in handling complex linguistic structures and diverse product descriptions. The system has been integrated into Trendyol’s platform with a scalable backend infrastructure, employing Kubernetes and Nvidia Triton Inference Server for efficient bulk processing and real-time attribute suggestions during the product listing process. This study not only advances attribute extraction for Turkish-language e-commerce but also provides a scalable and efficient NLP-based solution applicable to large-scale marketplaces. The findings offer critical insights into the trade-offs between accuracy and computational efficiency in large-scale multilingual NLP applications, contributing to the broader field of automated product classification and information retrieval in e-commerce ecosystems.
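The abstract names precision, recall, and F1-score as the evaluation metrics but does not say how extracted attributes are matched against the reference labels. The following is a minimal sketch, assuming that an extracted attribute counts as correct only when both its name and its value exactly match a reference pair. The function name `prf1` and the sample attributes are hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple

# Assumption: a product's attributes are represented as name -> value pairs.
Attributes = Dict[str, str]


def prf1(predicted: Attributes, gold: Attributes) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one product under exact-match scoring."""
    pred_pairs = set(predicted.items())
    gold_pairs = set(gold.items())
    # A true positive is a predicted (name, value) pair that also appears in the reference.
    true_positives = len(pred_pairs & gold_pairs)
    precision = true_positives / len(pred_pairs) if pred_pairs else 0.0
    recall = true_positives / len(gold_pairs) if gold_pairs else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: one correct value, one wrong value, one missed attribute.
gold = {"color": "black", "material": "cotton", "sleeve": "short"}
predicted = {"color": "black", "material": "polyester"}
precision, recall, f1 = prf1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Under this assumption, a wrong value lowers precision and a missed attribute lowers recall. Averaging such per-product or per-category scores is one way to obtain an aggregate figure, though the aggregation used in the study may differ.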

Details

1009240
Business indexing term
Title
A New Large Language Model for Attribute Extraction in E-Commerce Product Categorization
Author
Serhan, Çiftlikçi Mehmet 1; Çakmak Yusuf 1; Kalaycı, Tolga Ahmet 1; Abut Fatih 2; Akay, Mehmet Fatih 2; Kızıldağ Mehmet 3

1 Department of Data Science, Trendyol, Istanbul 34485, Turkey
2 Department of Computer Engineering, Faculty of Engineering, Çukurova University, Adana 01250, Turkey
3 BADEM Bilgi Sistemleri Danışmanlık Sağlık Hizm. Tic. Ltd. Şti, Adana 01250, Turkey
Publication title
Volume
14
Issue
10
First page
1930
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
Publication subject
e-ISSN
20799292
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
 
 
Online publication date
2025-05-09
Milestone dates
2025-03-27 (Received); 2025-05-07 (Accepted)
First posting date
2025-05-09
ProQuest document ID
3211937582
Document URL
https://www.proquest.com/scholarly-journals/new-large-language-model-attribute-extraction-e/docview/3211937582/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-05-27
Database
ProQuest One Academic