
Abstract

Table-based question answering (TableQA) has made significant progress in recent years; however, most advancements have focused on English datasets and SQL-based techniques, leaving Arabic TableQA largely unexplored. This gap is especially critical given the widespread use of structured Arabic content in domains such as government, education, and media. The main challenge lies in the absence of benchmark datasets and the difficulty that large language models (LLMs) face when reasoning over long, complex tables in Arabic, due to token limitations and morphological complexity. To address this, we introduce Arabic WikiTableQA, the first large-scale dataset for non-SQL Arabic TableQA, constructed from the WikiTableQuestions dataset and enriched with natural questions and gold-standard answers. We developed three methods to evaluate this dataset: a direct input approach, a sub-table selection strategy using SQL-like filtering, and a knowledge-guided framework that filters the table using semantic graphs. Experimental results with an LLM show that the graph-guided approach outperforms the others, achieving 74% accuracy, compared to 64% for sub-table selection and 45% for direct input, demonstrating its effectiveness in handling long and complex Arabic tables.

Details

Title
Arabic WikiTableQA: Benchmarking Question Answering over Arabic Tables Using Large Language Models
Author
Fawaz, Alsolami 1; Alrayzah Asmaa 2

1 Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia; [email protected]
2 Department of Information Systems, College of Computer Science and Information Systems, Najran University, Najran 55461, Saudi Arabia
Publication title
Volume
14
Issue
19
First page
3829
Number of pages
15
Publication year
2025
Publication date
2025
Publisher
MDPI AG
Place of publication
Basel
Country of publication
Switzerland
Publication subject
e-ISSN
2079-9292
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Online publication date
2025-09-27
Milestone dates
2025-08-28 (Received); 2025-09-26 (Accepted)
First posting date
27 Sep 2025
ProQuest document ID
3261057209
Document URL
https://www.proquest.com/scholarly-journals/arabic-wikitableqa-benchmarking-question/docview/3261057209/se-2?accountid=208611
Copyright
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-10-16
Database
  • ProQuest One Academic