Abstract

Hierarchical Text Classification (HTC) is a specialised task in natural language processing that involves categorising text into a hierarchical structure of classes. This approach is particularly valuable in domains such as document organisation, sentiment analysis, and information retrieval, where classification schemas naturally form hierarchies. In this paper, we propose and compare two deep learning-based models for HTC. The first fine-tunes GPT-2, a large language model (LLM), adapting its extensive pre-trained knowledge to the nuances of hierarchical classification. The second uses BERT for text preprocessing and encoding, followed by a BiLSTM layer for classification. Experimental results demonstrate that the fine-tuned GPT-2 model significantly outperforms the BERT-BiLSTM model in accuracy and F1 score, underscoring the advantages of using advanced LLMs for hierarchical text classification.

Details

1009240
Title
Hierarchical Text Classification: Fine-tuned GPT-2 vs BERT-BiLSTM
Author
Djelloul, Bouchiha 1; Abdelghani, Bouziane 1; Doumi Noureddine 2; Benamar, Hamzaoui 1; Boukli-Hacene Sofiane 3

1 University Centre of Naama, Naama, Algeria
2 University of Saida, Saida, Algeria
3 University of Sidi Bel Abbes, EEDIS Lab., Sidi Bel Abbes, Algeria
Publication title
Volume
30
Issue
1
Pages
40-46
Number of pages
8
Publication year
2025
Publication date
2025
Publisher
De Gruyter Brill Sp. z o.o., Paradigm Publishing Services
Place of publication
Riga
Country of publication
Poland
Publication subject
ISSN
22558683
e-ISSN
22558691
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-03-15
Milestone dates
2024-11-07 (Received); 2025-05-26 (Accepted)
First posting date
15 Mar 2025
ProQuest document ID
3206825308
Document URL
https://www.proquest.com/scholarly-journals/hierarchical-text-classification-fine-tuned-gpt-2/docview/3206825308/se-2?accountid=208611
Copyright
© 2025. This work is published under http://creativecommons.org/licenses/by/4.0 (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-12-13
Database
ProQuest One Academic