Abstract

Timely and accurate access to financial data is crucial for empirical research in accounting and finance. However, current data collection processes are often manual, inconsistent, and difficult to scale. This study asks: How can large language models (LLMs) be effectively used to automate financial data collection? Using design science research methodology (DSRM), the author develops a modular architecture that integrates a real-time search API and auxiliary information processing into LLM workflows. The study applies the model to two tasks: extracting ESG report release dates and identifying customer firm tickers from COMPUSTAT. The system achieves 96% and 95% accuracy, respectively, comparable to human performance. This study advances LLM applications in accounting by providing a scalable, practical framework for automating financial data retrieval.

Details

Title
Collecting Financial Data From Online Sources: Enhancing Large Language Models With Real-Time Search
Author
Li, Yang (Montclair State University, USA)
Volume
37
Issue
1
Pages
1-23
Number of pages
24
Publication year
2025
Publication date
2025
Publisher
IGI Global
Place of publication
Hershey
Country of publication
United States
ISSN
15462234
e-ISSN
15465012
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
Publication history
Milestone dates
2025-01-01 (pubdate)
ProQuest document ID
3252275207
Document URL
https://www.proquest.com/scholarly-journals/collecting-financial-data-online-sources/docview/3252275207/se-2?accountid=208611
Copyright
© 2025. This work is published under https://creativecommons.org/licenses/by/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-12-29
Database
ProQuest One Academic