Text-to-SQL for spatial databases translates natural language questions into SQL queries, allowing non-experts to access spatial data easily, and it has attracted growing attention from researchers. Previous research has focused primarily on rule-based methods, which struggle with complicated or previously unseen natural language questions. Advanced machine learning models can be trained for the task, but they typically require large labeled training datasets, which are severely lacking for spatial databases. Recently, Generative Pre-trained Transformer (GPT) models, driven by carefully designed prompts, have emerged as a promising paradigm for Text-to-SQL tasks in relational databases. To address the lack of datasets for spatial databases, we have created a publicly available dataset that supports both English and Chinese. Furthermore, we propose a GPT-based method for constructing prompts for spatial databases; it incorporates geographic and spatial database knowledge into the prompts and requires only a small number of training samples (for example, 1, 3, or 5). Extensive experiments demonstrate that incorporating geographic and spatial database knowledge into prompts improves the accuracy of Text-to-SQL tasks for spatial databases. The proposed method can help non-experts access spatial databases more easily and conveniently.
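The prompt construction summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the prompt layout, the `Example` class, the `build_prompt` function, the table names, and the knowledge hints are all hypothetical, and the hints assume PostGIS-style function names (`ST_Intersects`, `ST_Area`). The sketch only shows the general idea of combining a schema, a few lines of spatial knowledge, and k labeled examples (k = 1, 3, or 5) into one prompt.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One labeled (question, SQL) pair used as a few-shot demonstration."""
    question: str
    sql: str


def build_prompt(schema, knowledge, examples, question, k=3):
    """Assemble a k-shot Text-to-SQL prompt (hypothetical layout)."""
    parts = ["### Database schema", schema.strip(), "", "### Spatial knowledge"]
    # Each hint states one piece of geographic or spatial-database knowledge.
    parts += [f"- {hint}" for hint in knowledge]
    # Keep only the first k demonstrations.
    for ex in examples[:k]:
        parts += ["", f"Question: {ex.question}", f"SQL: {ex.sql}"]
    # The final question is left open for the model to complete.
    parts += ["", f"Question: {question}", "SQL:"]
    return "\n".join(parts)


# Hypothetical schema, hints, and demonstrations (assumed, not from the paper).
SCHEMA = (
    "CREATE TABLE parks (id INTEGER, name TEXT, geom GEOMETRY);\n"
    "CREATE TABLE rivers (id INTEGER, name TEXT, geom GEOMETRY);"
)
KNOWLEDGE = [
    "ST_Intersects(a.geom, b.geom) tests whether two geometries share any point.",
    "ST_Area(geom) returns polygon area in the units of the spatial reference system.",
]
EXAMPLES = [
    Example(
        "Which parks does the river 'Haihe' flow through?",
        "SELECT p.name FROM parks p JOIN rivers r "
        "ON ST_Intersects(p.geom, r.geom) WHERE r.name = 'Haihe';",
    ),
    Example("What is the area of each park?", "SELECT name, ST_Area(geom) FROM parks;"),
]

prompt = build_prompt(SCHEMA, KNOWLEDGE, EXAMPLES,
                      "Which rivers cross the park 'Water Park'?", k=1)
print(prompt)
```

Under these assumptions, changing `k` switches between the 1-, 3-, and 5-shot settings without altering the rest of the prompt, and the resulting string would be sent to a GPT model as-is.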
Details
1 Department of Geography, Tianjin Normal University, Tianjin 300387, China
2 Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
3 Faculty of Architectural Engineering, Tianjin University, Tianjin 300350, China
4 Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing International Collaboration Base on Brain Informatics and Wisdom Services, Beijing 100124, China