The FAIR principles (Findability, Accessibility, Interoperability, and Reusability) have been widely applied to data sharing. They are also crucial for sharing knowledge and software to promote open science. However, applying these principles to mathematical models in interdisciplinary contexts is challenging because new models are difficult to integrate into simulation codes. The VIPRA project addresses interoperability in pedestrian dynamics by providing a modular software framework to which scientists can contribute new models. The VIPRA Recommender System (VRS) enhances reusability by suggesting suitable models and parameter configurations for user-defined problems. It is being expanded to improve findability by using large language models (LLMs) to recommend scientific papers with relevant models. Despite this advancement, translating mathematical equations from scientific papers into executable code remains challenging. This thesis bridges this gap by developing a novel tool that extracts mathematical equations from PDFs and generates corresponding C++ code using LLMs, thus enhancing reusability by making it easier to integrate published models into the software. Three key research contributions support this tool. First, this thesis evaluates the strengths and limitations of various LLMs in converting equations into code. Second, it examines the role of prompt engineering in optimizing this process. Third, it establishes a benchmark dataset of equations to assess LLMs' effectiveness in code generation. These contributions advance open science through new capabilities and knowledge for AI-assisted computational science and promote transparency and reproducibility through an openly accessible benchmark.