© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Current language models have achieved remarkable success in NLP tasks. Nonetheless, individual decoding methods struggle to realize the full potential of these models, primarily because no decoding framework integrates language models with decoding methods. We introduce DecoStrat, a framework that bridges the gap between language modeling and the decoding process in data-to-text (D2T) generation. By leveraging language models, DecoStrat facilitates the exploration of alternative decoding methods tailored to specific tasks. We fine-tuned the model on the MultiWOZ dataset to meet task-specific requirements and used it to generate one or more outputs through the framework's interacting modules. The Director module orchestrates the decoding process, engaging the Generator to produce output text from the input data using the selected decoding method. The Manager module enforces a selection strategy, integrating the Ranker and Selector to identify the best result. Evaluations on MultiWOZ show that DecoStrat produces diverse and accurate output, with minimum Bayes risk (MBR) variants consistently outperforming the other decoding methods. DecoStrat with the T5-small model surpasses some baseline frameworks. Overall, the findings highlight DecoStrat's potential for optimizing decoding methods in diverse real-world applications.
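
The Ranker and Selector roles described in the abstract can be illustrated with a minimal sketch of minimum Bayes risk (MBR) style selection, in which each candidate is scored by its average agreement with the other candidates and the highest-scoring one is returned. This is not the authors' implementation: the function names `token_f1`, `rank_candidates`, and `select_best` are hypothetical, and token-overlap F1 is an assumed stand-in for whatever utility function the framework actually uses.

```python
from collections import Counter


def token_f1(hyp: str, ref: str) -> float:
    """Token-overlap F1 between two strings (an assumed stand-in utility)."""
    h, r = hyp.lower().split(), ref.lower().split()
    if not h or not r:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(h) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(h), overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def rank_candidates(candidates: list[str]) -> list[tuple[float, str]]:
    """Hypothetical Ranker: mean utility of each candidate against the rest."""
    scored = []
    for i, cand in enumerate(candidates):
        others = [c for j, c in enumerate(candidates) if j != i]
        score = sum(token_f1(cand, o) for o in others) / len(others) if others else 0.0
        scored.append((score, cand))
    # Highest expected utility first; sorted() is stable, so ties keep input order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def select_best(candidates: list[str]) -> str:
    """Hypothetical Selector: return the top-ranked candidate."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return rank_candidates(candidates)[0][1]


if __name__ == "__main__":
    # Invented candidates, standing in for outputs sampled from a Generator.
    samples = [
        "the hotel is in the north",
        "the hotel is in the north area",
        "there is a cheap restaurant",
    ]
    print(select_best(samples))
```

In this sketch the candidate that agrees most with the rest of the pool wins, which is the general idea behind MBR decoding; a real system would obtain the candidates by sampling from the fine-tuned model and would likely use a stronger utility metric.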

Details

Title
DecoStrat: Leveraging the Capabilities of Language Models in D2T Generation via Decoding Framework
Author
Elias Lemuye Jimale 1; Chen, Wenyu 2; Al-antari, Mugahed A 3; Gu, Yeong Hyeon 3; Victor Kwaku Agbesi 2; Wasif Feroze 2

1 School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; [email protected] (E.L.J.); [email protected] (V.K.A.); [email protected] (W.F.); School of Electrical Engineering and Computing, Adama Science and Technology University, Adama 1888, Ethiopia
2 School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; [email protected] (E.L.J.); [email protected] (V.K.A.); [email protected] (W.F.)
3 Department of Artificial Intelligence and Data Science, College of AI Convergence, Daeyang AI Center, Sejong University, Seoul 05006, Republic of Korea
First page
3596
Publication year
2024
Publication date
2024
Publisher
MDPI AG
e-ISSN
22277390
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3133317949