© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

AutoPaperBench is a benchmark generation system for automatically evaluating how well Multimodal Large Language Models (MLLMs) comprehend research papers. The system structures the content of a paper through semantic parsing and automatically generates text-based question-answer (QA) pairs and image-based visual question-answer (VQA) pairs. To ensure the quality of the generated QA pairs, we introduce a reviewer system that evaluates them against six criteria, including logic and appropriateness. In experiments on 60 research papers from the medical, natural science, and engineering fields, the generated benchmarks yield model performance rankings comparable to those of previous benchmarks, and the performance improvement from semantic parsing is validated. The system runs in a single-GPU environment and provides a framework for efficiently evaluating LLM paper comprehension.

Details

Title
AutoPaperBench: An MLLM-Based Framework for Automatic Generation of Paper Understanding Evaluation Benchmarks
Author
Kim, Min-Woo 1; Park, Hyo-Bin 1; Ahn, Hee-Jin 1; Park, Woo-Ram 1; Jeon, Jae-Wan 1; Lee, Kyong-Ha 2; Lee, Ryong 2; Choi, Dong-Geol 1

1 Department of Information and Communication Engineering, Hanbat National University, Daejeon 34158, Republic of Korea; [email protected] (M.-W.K.); [email protected] (H.-B.P.); [email protected] (H.-J.A.); [email protected] (W.-R.P.); [email protected] (J.-W.J.)
2 Department of Large-Scale AI Research Group, Korea Institute of Science and Technology Information, Daejeon 34141, Republic of Korea; [email protected]
First page
1175
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
2079-9292
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3181455844