Copyright © 2023 Majd E. Tannous et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0/

Abstract

Many unstructured documents contain segments devoted to specific topics. Extracting these segments and identifying their topics gives direct access to the required information, which can improve the quality of many NLP applications such as information extraction, information retrieval, summarization, and question answering. Resumes (CVs) are unstructured documents with diverse formats. They contain various segments such as personal information, experience, and education. Manually processing resumes to find the most suitable candidates for a particular job is difficult, and as the volume of data grows, processing resumes automatically has become necessary to save time and effort. This research presents a new algorithm, TSHD, for topic segmentation based on headings detection. We apply the algorithm to extract resume segments and identify their topics. The proposed TSHD algorithm is accurate and addresses many weaknesses in previous studies. Evaluation results show a very high F1 score (about 96%) and a very low segmentation error (about 2%). The algorithm can be easily adapted to other textual domains whose segments have headings.
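
The general idea of segmenting a document at its headings can be illustrated with a short sketch. This is not the TSHD algorithm itself, whose details are not given in this record; it is a minimal illustration that assumes headings appear on their own lines and can be matched against a small keyword lexicon. The lexicon `HEADING_KEYWORDS` and the functions `detect_heading` and `segment_by_headings` are hypothetical names introduced only for this example.

```python
import re

# Hypothetical heading lexicon (an assumption, not taken from the article):
# topic label -> heading phrases that signal the start of that topic.
HEADING_KEYWORDS = {
    "personal_information": {"personal information", "personal details", "contact"},
    "experience": {"experience", "work experience", "employment history"},
    "education": {"education", "academic background"},
}


def detect_heading(line):
    """Return a topic label if the whole line matches a known heading, else None."""
    # Normalize: lowercase, drop everything except letters and spaces.
    text = re.sub(r"[^a-z ]", "", line.lower()).strip()
    for topic, keywords in HEADING_KEYWORDS.items():
        if text in keywords:
            return topic
    return None


def segment_by_headings(document):
    """Split a document into (topic, lines) segments at detected headings.

    Text before the first heading is labeled "unknown". A heading that has
    no content lines before the next heading produces no segment.
    """
    segments = []
    current_topic, current_lines = "unknown", []
    for line in document.splitlines():
        if not line.strip():
            continue  # skip blank lines
        topic = detect_heading(line)
        if topic is not None:
            # A new heading closes the segment collected so far.
            if current_lines:
                segments.append((current_topic, current_lines))
            current_topic, current_lines = topic, []
        else:
            current_lines.append(line.strip())
    if current_lines:
        segments.append((current_topic, current_lines))
    return segments


resume = """Jane Doe
Education:
BSc in Computer Engineering
Work Experience
Software engineer, 2019-2022
"""
segments = segment_by_headings(resume)
for topic, lines in segments:
    print(topic, lines)
```

Exact matching against a fixed lexicon is the simplest possible heading detector. Resumes use diverse formats, so a practical system would likely also need layout or typographic cues and a way to handle headings that are missing from the lexicon.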

Details

Title
TSHD: Topic Segmentation Based on Headings Detection (Case Study: Resumes)
Author
Tannous, Majd E 1; Ramadan, Wassim H 2; Rajab, Mohanad A 3

1 Department of Computer Engineering, Al-Wataniya Private University, Hama, Syria; Faculty of Informatics Engineering, Al Baath University, Homs, Syria
2 Department of Computer Engineering, Al-Wataniya Private University, Hama, Syria
3 Faculty of Informatics Engineering, Al Baath University, Homs, Syria
Editor
Christos Troussas
Publication year
2023
Publication date
2023
Publisher
John Wiley & Sons, Inc.
ISSN
16875893
e-ISSN
16875907
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2777922286