Recent advances in large language models (LLMs) have driven major breakthroughs in Text-to-SQL tasks. However, several challenges still hinder the use of SQL parsers in cross-language settings. In this article, we introduce FGCSQL, a novel three-stage pipeline framework that addresses three of these challenges: cross-language schema linking, exploiting the SQL parsing potential of LLMs, and error propagation in SQL parsers. First, a filtering encoder eliminates irrelevant database schema items. Next, a pre-trained generative LLM, fine-tuned on a carefully structured dataset, performs the SQL parsing. Finally, a correcting decoder addresses error propagation, yielding a robust system for semantic parsing tasks. Evaluated on the CSpider dataset, FGCSQL shows a substantial improvement in both exact-set-match (EM) accuracy and execution accuracy (EX), which supports the effectiveness of the pipeline architecture in mitigating the challenges typically encountered in Text-to-SQL conversion, especially in cross-lingual contexts. FGCSQL outperforms existing methods in execution accuracy, indicating the validity of the proposed method.
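To make the execution accuracy (EX) metric concrete, the following is a minimal sketch of how a single predicted query could be checked against a gold query by comparing their result sets. It is an illustration under stated assumptions, not the official CSpider evaluation script: the helper names `execution_match` and `build_demo_db`, the toy `singer` table, and the choice to compare rows as an unordered multiset are all hypothetical.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries yield the same multiset of rows.

    Hypothetical helper: a simplified stand-in for an EX check.
    Row order is ignored, which is an assumption of this sketch.
    """
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # A prediction that fails to execute counts as a miss.
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    return Counter(predicted_rows) == Counter(gold_rows)


def build_demo_db():
    """Create a tiny in-memory database (toy data, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
    )
    conn.executemany(
        "INSERT INTO singer VALUES (?, ?, ?)",
        [(1, "Ann", 31), (2, "Bo", 24), (3, "Cy", 45)],
    )
    conn.commit()
    return conn
```

Under this sketch, two queries that differ in surface form but return the same rows count as a match. This is why EX can credit a prediction that exact-set-match (EM) rejects: EM compares the structure of the SQL, whereas EX compares what the queries return.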
; Yu, Chenglong 2; Zhu, Zixuan 1; Li, Wei 1
1 Hangzhou Institute of Technology, Xidian University, Hangzhou 311231, China
2 School of Artificial Intelligence, Xidian University, Xi’an 710071, China