Optimising Contextual Embeddings for Meaning Conflation Deficiency Resolution in Low-Resourced Languages

Abstract

Meaning conflation deficiency (MCD) is a persistent obstacle in natural language processing (NLP), especially for low-resourced and morphologically complex languages, where polysemy and contextual ambiguity diminish model precision in word sense disambiguation (WSD) tasks. This paper examines the optimisation of contextual embedding models, namely XLNet, ELMo, BART, and their optimised variants, to tackle MCD in such languages. Using Sesotho sa Leboa as a case study, the researchers devised an enhanced XLNet architecture with targeted hyperparameter optimisation, dynamic padding, early termination, and class-balanced training. Comparative assessments show that the optimised XLNet attains an accuracy of 91% with balanced precision and recall of 92% and 91%, respectively, surpassing both its baseline counterpart and competing models. Optimised ELMo attained the greatest overall metrics (accuracy: 92%, F1-score: 96%), whilst optimised BART demonstrated significant accuracy improvements (96%) despite a reduced recall. The results demonstrate that fine-tuning contextual embeddings using MCD-specific methodologies significantly improves semantic disambiguation for under-represented languages. This study offers a scalable and flexible optimisation approach suitable for other low-resource language contexts.
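
The abstract names dynamic padding, early termination, and class-balanced training without describing them. The following is a minimal, framework-free sketch of what these three techniques commonly look like; it is not the authors' implementation. The function names, the `PAD_ID` value, the inverse-frequency weighting formula, and the patience-based reading of "early termination" (as early stopping on validation loss) are all assumptions made for illustration.

```python
from collections import Counter

PAD_ID = 0  # hypothetical padding token id, chosen for illustration


def pad_batch(batch, pad_id=PAD_ID):
    """Dynamic padding: pad only to the longest sequence in this batch,
    rather than to a fixed global maximum length."""
    width = max(len(seq) for seq in batch)
    padded = [seq + [pad_id] * (width - len(seq)) for seq in batch]
    # Attention mask: 1 for real tokens, 0 for padding.
    mask = [[1] * len(seq) + [0] * (width - len(seq)) for seq in batch]
    return padded, mask


def class_weights(labels):
    """Class balancing (assumed scheme): inverse-frequency weights,
    n_samples / (n_classes * class_count), so rare senses weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {label: n / (k * c) for label, c in counts.items()}


def should_stop(val_losses, patience=2):
    """Early stopping (assumed criterion): stop once the best validation
    loss is at least `patience` epochs old."""
    if not val_losses:
        return False
    best_epoch = val_losses.index(min(val_losses))
    return len(val_losses) - 1 - best_epoch >= patience
```

In a setup like this, the weights from `class_weights` would typically be passed to a weighted loss so that infrequent word senses are not swamped by frequent ones, and `pad_batch` would be applied per mini-batch to avoid computing over unnecessary padding.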

Details

Subject

Sparsity;
Language;
Recall;
Accuracy;
Semantics;
Word sense disambiguation;
Case studies;
Natural language processing;
Optimization;
Adaptation;
Morphological complexity;
Annotations;
Polysemy;
Morphology;
Sotho languages;
Meaning;
Termination;
Ambiguity;
Languages

Identifier / keyword

meaning conflation deficiency; word sense disambiguation; low-resourced languages; contextual embeddings; XLNet optimisation; ELMo; BART; hyperparameter tuning; Sesotho sa Leboa; morphologically rich languages

Title

Optimising Contextual Embeddings for Meaning Conflation Deficiency Resolution in Low-Resourced Languages

Author

Masethe, Mosima A¹; Ojo, Sunday O²; Masethe, Hlaudi D³

¹ Department of Information Technology, Faculty of Accounting and Informatics, Durban University of Technology, Durban 4001, South Africa; Department of Computer Science and Information Technology, School of Science and Technology, Sefako Makgatho Health Sciences University, Ga-Rankuwa 0208, South Africa
² Department of Information Technology, Faculty of Accounting and Informatics, Durban University of Technology, Durban 4001, South Africa
³ Department of Data Science, Faculty of Information Communication Technology, Tshwane University of Technology, Pretoria 0001, South Africa

Publication title

Computers; Basel

Volume

Issue

First page

402

Number of pages

Publication year

2025

Publication date

2025

Publisher

MDPI AG

Place of publication

Basel

Country of publication

Switzerland

Publication subject

Computers

e-ISSN

2073-431X

Source type

Scholarly Journal

Language of publication

English

Document type

Journal Article

Publication history

Online publication date

2025-09-22

Milestone dates

2025-07-28 (Received); 2025-09-06 (Accepted)

First posting date

22 Sep 2025

DOI

https://doi.org/10.3390/computers14090402

ProQuest document ID

3254483396

Document URL

https://www.proquest.com/scholarly-journals/optimising-contextual-embeddings-meaning/docview/3254483396/se-2?accountid=208611

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Last updated

2025-11-07

Database

ProQuest One Academic
