Content area
This dataset was developed to support research at the intersection of web accessibility and Artificial Intelligence, with a focus on evaluating how Large Language Models (LLMs) can detect and remediate accessibility issues in source code. It consists of code examples written in PHP, Angular, React, and Vue.js, organized into accessible and non-accessible versions of tabular components. A substantial portion of the dataset was collected from student-developed Vue components, implemented using both the Options and Composition APIs. The dataset is structured to enable both a static analysis of source code and a dynamic analysis of rendered outputs, supporting a range of accessibility research tasks. All files are in plain text and adhere to the FAIR principles, with open licensing (CC BY 4.0) and long-term hosting via Zenodo. This resource is intended for researchers and practitioners working on LLM-based accessibility validation, inclusive software engineering, and AI-assisted frontend development.
Full text
1. Summary
Web accessibility is a fundamental pillar of an inclusive digital society [1]. It ensures that individuals with disabilities can perceive, understand, navigate, and interact with online content effectively, thus supporting equitable participation in economic, educational, and civic life [2]. With the increasing reliance on the web as a primary interface for services and information, accessibility is no longer a niche concern but a mainstream requirement [3]. In particular, adherence to global standards such as the Web Content Accessibility Guidelines (WCAG) has become a legal and ethical imperative across many sectors and jurisdictions [4].
Despite the formalization of accessibility standards and the proliferation of automated tools, such as WAVE [5], Axe [6], and Lighthouse [7], web accessibility remains a persistent challenge [8]. Existing tools are often limited to detecting syntactic issues and fail to capture semantic or context-dependent violations [9], especially in modern web applications that rely heavily on JavaScript frameworks, dynamic content rendering, and component-based architectures [10]. These limitations contribute to widespread accessibility gaps, particularly in websites developed without expert oversight or adequate accessibility training [11].
Artificial Intelligence (AI) and Machine Learning (ML) have recently emerged as promising avenues for addressing these limitations by enabling more nuanced and scalable evaluations of web accessibility [12]. For example, Makati [13] presented DocTTTTTQuery, a method that employs a sequence-to-sequence model (i.e., T5) to generate a set of anticipated queries for a given document. These generated queries are then appended to the original document, creating an enhanced version. The underlying principle is that by enriching the document with potential queries, the overall retrieval performance is improved. The author demonstrated that using these enhanced documents leads to superior retrieval results. Instead, Ara and Sik-Lanyi [14] evaluated website accessibility using a decision tree. The model is trained on data generated by the Mauve tool, which assesses websites against various checkpoints from the WCAG 2.1. The authors aimed to identify accessibility barriers and demonstrate the need for stronger web accessibility policies. Their experiments, validated through a confusion matrix and classification report, found that none of the tested webpages were free from accessibility errors, highlighting a significant need for improved directives to enhance social inclusiveness.
In such a context, Large Language Models (LLMs) such as GPT-4, Claude, and LLaMA have shown considerable potential due to their ability to interpret source code, understand natural language requirements, and reason about content structure and intent [15]. Rather than relying solely on rigid rule-based systems, LLMs can offer more flexible and context-aware assessments. This represents a paradigm shift in accessibility validation from conventional DOM-level checks and static analyzers to AI-powered agents that can analyze raw HTML/CSS/JS code, interpret it semantically, and suggest or apply fixes [16]. Several recent studies illustrate this emerging capability. For instance, Othman et al. [17] investigated how ChatGPT can remediate WCAG 2.1 violations identified by the WAVE tool, showing significant improvement in compliance through automated fixes. Similarly, Delnevo et al. [18] assessed LLMs’ ability to validate and correct HTML forms and tables, finding that ChatGPT could successfully detect and fix structural issues such as missing labels and incorrect semantic tags. López-Gil and Pereira [19] further explored LLMs for automating manual WCAG criteria evaluation, such as interpreting link purposes or identifying language mismatches, reaching detection accuracies beyond 85%. However, these studies also highlight the limitations of current models, especially in dealing with missing structural elements and generating consistent remediations.
The dataset described in this article was created to systematically investigate and benchmark the capabilities of LLMs in detecting and remediating accessibility issues directly at the source code level. It is motivated by the need for a structured, reproducible, and scalable resource that allows researchers to explore how LLMs perform in realistic, code-centric accessibility validation scenarios. The dataset includes a diverse set of code snippets, some intentionally injected with accessibility violations and others designed to meet best practices.
The dataset is structured into two primary directories, each targeting distinct facets of accessibility evaluation within modern web development. The first directory, titled “Dynamic Generated Content”, comprises a collection of code snippets developed using four widely adopted frontend and backend technologies, Angular, React, Vue, and PHP. Each snippet contains two functionally equivalent versions of a tabular component, with one that adheres to established WCAG and another that deliberately incorporates common accessibility violations. These violations reflect typical real-world pitfalls, such as missing ARIA attributes, poor semantic structure, or inaccessible keyboard interactions, thereby simulating realistic development scenarios where accessibility might be overlooked.
The second directory, named “Vue Table Components”, includes 25 distinct Vue-based implementations of table components. These examples were carefully selected and constructed to represent a diverse range of development patterns and accessibility practices. They vary in several dimensions, including the programming paradigm (with examples written using both the Composition API and the Options API), the degree and type of accessibility support provided, and the complexity of the internal structure (e.g., use of slots, scoped components, or dynamic data binding). This portion of the dataset is particularly valuable for studying how different architectural choices in Vue development influence the interpretability and accessibility of the resulting components.
Together, these two parts provide a rich and heterogeneous source of annotated web components for training, evaluating, and benchmarking LLMs and accessibility auditing tools, particularly in contexts involving dynamically generated content, which remains a significant challenge for traditional static analysis methods.
This dataset supports a broader line of research aimed at exploring the responsible integration of AI into web development workflows, with an emphasis on equity, sustainability, and inclusivity. The dataset is intended to serve multiple audiences, including AI researchers developing accessibility-aware models, Human–Computer Interaction (HCI) practitioners evaluating usability, accessibility auditors seeking AI assistance, and educators looking to train students in inclusive web development. By releasing this dataset publicly, we aim to facilitate a deeper understanding of the strengths and limitations of LLMs in real-world accessibility use cases.
The initial results based on this dataset have been published in [20], where GPT-4o mini, o3-mini, and Gemini 2.0 Flash were employed to systematically evaluate the capabilities of LLMs to identify and assess accessibility issues.
2. Data Description
The dataset is structured into two primary directories, each targeting distinct facets of accessibility evaluation within modern web development. The first directory, titled “Dynamic Generated Content”, comprises a collection of code snippets developed using four widely adopted frontend and backend technologies, Angular, React, Vue, and PHP. This directory is further organized into two subfolders. The “data” subfolder contains the source code for each snippet, representing the underlying implementations of the accessible and non-accessible tabular components. The “eval” subfolder includes resources for benchmarking and analysis, including a “labels.csv” file that enumerates the accessibility violations present in each snippet based on source code inspection and an “accessibility-reports” directory containing the reports automatically generated by the Mauve accessibility tool from the rendered HTML output. Together, these artifacts enable both a code-level and rendered-output evaluation of accessibility, facilitating a comparative analysis of LLM predictions against ground truth labels and tool-generated diagnostics. Each snippet contains two functionally equivalent versions of a tabular component, with one that adheres to the established WCAG and another that deliberately incorporates common accessibility violations. These violations reflect typical real-world pitfalls, such as missing ARIA attributes, poor semantic structure, or inaccessible keyboard interactions, thereby simulating realistic development scenarios where accessibility might be overlooked.
The name of files is defined as “$framework-table-$validity”, where
$framework indicates the Javascript framework (i.e., “angular”, “react”, and “vue”) or the “php” language.
$validity indicates whether the generated table will be valid according to the WCAG (“accessible”) or not (“invalid”).
Hence, the full list of files of the “Dynamic Generated Content” folder is
angular-table-accessible.js;
angular-table-invalid.js;
php-table-accessible.php;
php-table-invalid.php;
react-table-accessible.js;
react-table-invalid.js;
vue-table-accessible.html;
vue-table-invalid.html.
The second directory, named “Vue Table Components”, includes 25 distinct Vue-based implementations of table components. The former one is structured in the “data” and “eval” subfolders. The table components were carefully constructed to represent a diverse range of development patterns and accessibility practices. They vary in several dimensions, including the programming paradigm (with examples written using both the Composition API and the Options API), the degree and type of accessibility support provided, and the complexity of internal structure (e.g., use of slots, scoped components, or dynamic data binding). This portion of the dataset is particularly valuable for studying how different architectural choices in Vue development influence the interpretability and accessibility of the resulting components.
Each distinct Vue-based implementation is structured in a separate folder named “delivery-$n”, where “$n” ranges from “01” to “25”. Each folder contains the following files and folders:
[Figure omitted. See PDF]
3. Methods
3.1. Languages and Frameworks Covered
The dataset has been designed to reflect real-world patterns in contemporary web development, with a particular focus on technologies that dominate the current frontend and backend ecosystem. In selecting the technologies to be included, priority was given to JavaScript frameworks and PHP based on their widespread adoption and relevance in both legacy and modern web applications.
JavaScript remains the cornerstone of interactive web development. Within the JavaScript ecosystem, the dataset includes examples written in Vue, React, and Angular, three of the most prominent frontend frameworks. These frameworks offer declarative syntax, component-based architecture, and varying levels of support for accessibility, making them particularly relevant for studying how accessibility violations can manifest and be remediated in different environments. Their differences in templating systems, rendering strategies, and reactive paradigms also provide a useful contrast for evaluating the reasoning capabilities of LLMs when analyzing source code.
Vue.js was selected for special attention due to its flexibility in allowing developers to write components using two distinct paradigms, namely the Options API and the newer Composition API. The Options API relies on a traditional object-based syntax, promoting readability and clarity for beginners. In contrast, the Composition API uses the
In parallel, PHP was included as a representative of server-side technologies that still play a significant role in generating dynamic HTML content. Though often overlooked in modern single-page applications, PHP remains widely used in content management systems and traditional websites. Including PHP in the dataset allows for the evaluation of LLMs in contexts where accessibility violations arise from backend generated content, such as dynamic tables or forms rendered from server-side scripts.
Together, these frameworks and languages ensure that the dataset captures both client-side and server-side sources of accessibility issues, supporting comprehensive analysis and benchmarking.
3.2. Details on Development
The dataset was constructed through a two-phase process involving both professor-authored and student-authored code. The first phase involved manually crafting dynamic web snippets in Angular, React, Vue, and PHP. For each technology, two versions of a table component were created, with one being accessible, adhering to WCAG 2.1 guidelines (e.g., semantic markup, ARIA roles, keyboard support), and one being non-accessible, featuring common violations such as missing labels, improper heading structures, or lack of alternative text. These components were written to reflect typical developer practices and errors, ensuring realistic input for LLM evaluation.
The second phase involved collecting contributions from a cohort of computer science students participating in a web development course. Each student was instructed to design and implement four independent Vue table components as Single-File Components (SFCs) using
A minimal table using the Options API;
A minimal table using the Composition API;
An accessible table using the Options API;
An accessible table using the Composition API.
To guide development and promote consistency, the following constraints were imposed: Each student was required to select a unique dataset of at least 50 rows and 5 attributes, representing real-world data such as songs, movies, environmental measurements, etc. Students were encouraged to modularize their code using multiple components where appropriate and to leverage native Vue features (e.g., props, slots, reactivity). All components had to remain autonomous, meaning no shared logic or global imports were allowed across implementations. Code duplication was explicitly permitted to ensure isolation. Deliverables were submitted as compressed
To ensure methodological rigor and reproducibility, the dataset creation process was structured into a multi-step workflow. The activities were carried out iteratively by students and the professor as follows (Figure 1 provides a visual overview of the process): A set of rules and consistency guidelines for code submission was defined. Each student proposed a unique dataset through a dedicated online form, ensuring diversity and avoiding duplication across the cohort. Students implemented their Vue components according to the predefined rules and submitted their initial code. The professor performed a first review of all submissions, providing individual feedback. Students revised and corrected their code based on the professor’s comments and submitted the final version. The professor collected all the submitted works and performed manual quality checks, including Normalization of directory structures; Removal of unnecessary or temporary files (e.g., Verification of code readability, indentation, and consistency; Ensuring presence of both accessible and non-accessible versions. The entire dataset was anonymized to remove any personally identifiable information or metadata.
In parallel to the student contributions, the professor developed the dynamically generated content dataset, which has gone through its own validation phase to integrate manually created components. Once both parts were refined and validated, they were carefully integrated into a single collection that constituted the final dataset. Finally, the dataset was then uploaded to Zenodo to ensure open access, reproducibility, and long-term preservation. This workflow guaranteed both the quality and the FAIR compliance of the resulting dataset.
This rigorous protocol ensured that each submission represented a self-contained, fully functional Vue project capable of being evaluated in isolation. Importantly, the dual-paradigm structure (Options vs. Composition API) allows researchers to study whether the underlying code organization influences the likelihood of accessibility errors or the ease with which LLMs can interpret and remediate them.
3.3. Implementation Diversity Within Table Structures
A critical aspect of our benchmarking dataset is the diversity of table implementations. To effectively evaluate an LLM’s ability to navigate and interpret the DOM of complex web pages, we first categorized our collected table examples based on their underlying Vue component architecture. The results of this categorization are presented in Table 1.
As shown in the table, we identified three primary approaches to table component splitting. The first, labeled “Single file”, represents the simplest and most common implementation, where the entire table structure, including the
| AI | Artificial Intelligence |
| ML | Machine Learning |
| LLM | Large Language Model |
| SDGs | Sustainable Development Goals |
| WCAG | Web Content Accessibility Guidelines |
| HCI | Human–Computer Interaction |
| DOI | Digital Object Identifier |
| SFC | Single-File Component |
| UI | User Interface |
Footnotes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Figure 1 Methodology for the creation of the dataset. Tasks were performed accordingly by professor (P) or students (S).
Categorization of table implementations based on Vue component splitting and their corresponding delivery examples.
| Split Approach | Deliveries |
|---|---|
| Single file | 3, 4, 10, 13, 17, 18, 19, 23 |
| Body-only | 1, 8, 12 |
| Header and body | 2, 5, 6, 7, 9, 11, 14, 15, 16, 20, 21, 22, 24, 25 |
List of datasets used by students in the various deliveries.
| Delivery | Name | Reference |
|---|---|---|
| delivery-01 | Cinema anomalies | N/A |
| delivery-02 | Amazon Musical Instruments Reviews | [ |
| delivery-03 | NASA - Near-Earth Asteroids and Comets | [ |
| delivery-04 | NVIDIA-STOCK-DATA | [ |
| delivery-05 | Real World Smartphone’s | [ |
| delivery-06 | Used Car | [ |
| delivery-07 | Achieved Frames per Second (FPS) in Video Games | [ |
| delivery-08 | Pokemon | [ |
| delivery-09 | Footballers with 50+ International Goals [men] | [ |
| delivery-10 | EPL Dataset 2022/2023 | [ |
| delivery-11 | Global Heath Statistics | [ |
| delivery-12 | World Population by Country | [ |
| delivery-13 | Air Quality and Pollution Assessment | [ |
| delivery-14 | Crime Rate by Country 2024 | [ |
| delivery-15 | NBA Players | [ |
| delivery-16 | Fortnite Players Stats | [ |
| delivery-17 | League of Legends Master+ Players | [ |
| delivery-18 | Top 100 Most Streamed Songs on Spotify | [ |
| delivery-19 | The Office Dataset | [ |
| delivery-20 | Food Recipes | [ |
| delivery-21 | NASA - Earth Meteorite Landings | [ |
| delivery-22 | Google Playstore | [ |
| delivery-23 | Erasmus mobility statistics 2014_2019 | [ |
| delivery-24 | Bank marketing | [ |
| delivery-25 | Game Recommendations on Steam | [ |
1. Ferri, D.; Favalli, S. Web Accessibility for People with Disabilities in the European Union: Paving the Road to Social Inclusion. Societies; 2018; 8, 40. [DOI: https://dx.doi.org/10.3390/soc8020040]
2. Bricout, J.; Baker, P.M.A.; Moon, N.W.; Sharma, B. Exploring the Smart Future of Participation: Community, Inclusivity, and People With Disabilities. Int. J.-E-Plan. Res.; 2021; 10, pp. 94-108. [DOI: https://dx.doi.org/10.4018/IJEPR.20210401.oa8]
3. Teixeira, P.; Eusébio, C.; Teixeira, L. Understanding the integration of accessibility requirements in the development process of information systems: A systematic literature review. Requir. Eng.; 2024; 29, pp. 143-176. [DOI: https://dx.doi.org/10.1007/s00766-023-00409-8]
4. Abou-Zahra, S.; Brewer, J. Standards, Guidelines, and Trends. Web Accessibility; Springer: London, UK, 2019; pp. 225-246. [DOI: https://dx.doi.org/10.1007/978-1-4471-7440-0_13]
5. WebAIM-Web Accessibility in Mind. WAVE Web Accessibility Evaluation Tools. 2025; Available online: https://wave.webaim.org/ (accessed on 6 August 2025).
6. Deque Systems Inc. Axe: The Accessibility Engine. 2024; Available online: https://www.deque.com/axe/ (accessed on 6 August 2025).
7. Chrome for Developers. Introduzione a Lighthouse. 2025; Available online: https://developer.chrome.com/docs/lighthouse/overview?hl=en (accessed on 6 August 2025).
8. Ara, J.; Sik-Lanyi, C. Automated evaluation of accessibility issues of webpage content: Tool and evaluation. Sci. Rep.; 2025; 15, 9516. [DOI: https://dx.doi.org/10.1038/s41598-025-92192-5] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/40108199]
9. Alarcon, R.; Moreno, L.; Martinez, P. Lexical Simplification System to Improve Web Accessibility. IEEE Access; 2021; 9, pp. 58755-58767. [DOI: https://dx.doi.org/10.1109/ACCESS.2021.3072697]
10. Roumeliotis, K.I.; Tselikas, N.D. Evaluating Progressive Web App Accessibility for People with Disabilities. Network; 2022; 2, pp. 350-369. [DOI: https://dx.doi.org/10.3390/network2020022]
11. Seixas Pereira, L.; Duarte, C. Evaluating and monitoring digital accessibility: Practitioners’ perspectives on challenges and opportunities. Univers. Access Inf. Soc.; 2025; 24, pp. 2553-2571. [DOI: https://dx.doi.org/10.1007/s10209-025-01210-w]
12. Abou-Zahra, S.; Brewer, J.; Cooper, M. Artificial Intelligence (AI) for Web Accessibility: Is Conformance Evaluation a Way Forward?. Proceedings of the 15th International Web for All Conference, W4A ’18; Lyon, France, 23–25 April 2018; ACM: New York, NY, USA, 2018; pp. 1-4. [DOI: https://dx.doi.org/10.1145/3192714.3192834]
13. Makati, T. Machine learning for accessible web navigation. Proceedings of the 19th International Web for All Conference, W4A’22; Lyon, France, 25–26 April 2022; ACM: New York, NY, USA, 2022; pp. 1-3. [DOI: https://dx.doi.org/10.1145/3493612.3520463]
14. Ara, J.; Sik-Lanyi, C. Webpage Accessibility Evaluation Using Machine Learning Technique. Proceedings of the 2023 14th IEEE International Conference on Cognitive Infocommunications (CogInfoCom); Budapest, Hungary, 22–23 September 2023; IEEE: New York, NY, USA, 2023; Volume 9, pp. 000069-000074. [DOI: https://dx.doi.org/10.1109/coginfocom59411.2023.10397496]
15. Pedemonte, G.; Leotta, M.; Ribaudo, M. Improving Web Accessibility With an LLM-Based Tool: A Preliminary Evaluation for STEM Images. IEEE Access; 2025; 13, pp. 107566-107582. [DOI: https://dx.doi.org/10.1109/ACCESS.2025.3577519]
16. Oswal, S.K.; Oswal, H.K. Conversational AI for Accessible Website Design: Integrating LLM Assistants in Website Builders. New Frontiers for Inclusion; Springer Nature: Cham, Switzerland, 2025; pp. 251-261. [DOI: https://dx.doi.org/10.1007/978-3-031-84681-6_22]
17. Othman, A.; Dhouib, A.; Nasser Al Jabor, A. Fostering websites accessibility: A case study on the use of the Large Language Models ChatGPT for automatic remediation. Proceedings of the 16th International Conference on PErvasive Technologies Related to Assistive Environments, PETRA ’23; Corfu, Greece, 5–7 July 2023; ACM: New York, NY, USA, 2023; pp. 707-713. [DOI: https://dx.doi.org/10.1145/3594806.3596542]
18. Delnevo, G.; Andruccioli, M.; Mirri, S. On the Interaction with Large Language Models for Web Accessibility: Implications and Challenges. Proceedings of the 2024 IEEE 21st Consumer Communications & Networking Conference (CCNC); Las Vegas, NV, USA, 6–9 January 2024; IEEE: New York, NY, USA, 2024; pp. 1-6. [DOI: https://dx.doi.org/10.1109/ccnc51664.2024.10454680]
19. López-Gil, J.M.; Pereira, J. Turning manual web accessibility success criteria into automatic: An LLM-based approach. Univers. Access Inf. Soc.; 2024; 24, pp. 837-852. [DOI: https://dx.doi.org/10.1007/s10209-024-01108-z]
20. Andruccioli, M.; Bassi, B.; Delnevo, G.; Salomoni, P. Leveraging Large Language Models for Sustainable and Inclusive Web Accessibility. Preprints; 2025; [DOI: https://dx.doi.org/10.20944/preprints202508.0453.v1]
21. Amazon Music Reviews Dataset. Available online: https://www.kaggle.com/datasets/eswarchandt/amazon-music-reviews (accessed on 15 May 2025).
22. NASA. Near-Earth Asteroids and Comets. Available online: https://data.nasa.gov/resource/2vr3-k9wn.json (accessed on 6 August 2025).
23. NVIDIA Stock Data. Available online: https://www.kaggle.com/datasets/muhammaddawood42/nvidia-stock-data (accessed on 15 May 2025).
24. Real World Smartphones Dataset. Available online: https://www.kaggle.com/datasets/abhijitdahatonde/real-world-smartphones-dataset (accessed on 15 May 2025).
25. Used Car Dataset. Available online: https://www.kaggle.com/datasets/mohitkumar282/used-car-dataset?select=used_car_dataset.csv (accessed on 15 May 2025).
26. FPS in Video Games Dataset. Available online: https://www.kaggle.com/datasets/kritikseth/achieved-frames-per-second-fps-in-video-games (accessed on 15 May 2025).
27. Pokémon Dataset. Available online: https://www.kaggle.com/datasets/jaidalmotra/pokemon-dataset (accessed on 15 May 2025).
28. Footballers with 50+ International Goals. Available online: https://www.kaggle.com/datasets/whisperingkahuna/footballers-with-50-international-goals-men (accessed on 15 May 2025).
29. EPL Dataset 2022-2023. Available online: https://www.kaggle.com/datasets/acothaha/epl-dataset-20222023-update-every-week (accessed on 15 May 2025).
30. Global Health Statistics. Available online: https://www.kaggle.com/datasets/malaiarasugraj/global-health-statistics (accessed on 15 May 2025).
31. 2023 World Population by Country. Available online: https://www.kaggle.com/datasets/rajkumarpandey02/2023-world-population-by-country?resource=download&select=countries-table.json (accessed on 15 May 2025).
32. Air Quality and Pollution Assessment. Available online: https://www.kaggle.com/datasets/mujtabamatin/air-quality-and-pollution-assessment (accessed on 15 May 2025).
33. Crime Rate by Country 2024. Available online: https://www.kaggle.com/datasets/shahriarkabir/crime-rate-by-country-2024 (accessed on 15 May 2025).
34. NBA Players Data. Available online: https://www.kaggle.com/datasets/justinas/nba-players-data (accessed on 15 May 2025).
35. Fortnite Players Stats. Available online: https://www.kaggle.com/datasets/iyadali/fortnite-players-stats?select=Fortnite_players_stats.csv (accessed on 15 May 2025).
36. League of Legends Master Players. Available online: https://www.kaggle.com/datasets/jasperan/league-of-legends-master-players (accessed on 15 May 2025).
37. Top 100 Most Streamed Songs on Spotify. Available online: https://www.kaggle.com/datasets/pavan9065/top-100-most-streamed-songs-on-spotify/data (accessed on 15 May 2025).
38. The Office Dataset. Available online: https://www.kaggle.com/datasets/nehaprabhavalkar/the-office-dataset (accessed on 15 May 2025).
39. Recipes3k Dataset. Available online: https://www.kaggle.com/datasets/crispen5gar/recipes3k (accessed on 15 May 2025).
40. NASA Meteorite Landings (Y77D-TH95). Available online: https://data.nasa.gov/resource/y77d-th95.json (accessed on 15 May 2025).
41. Google Play Store Dataset. Available online: https://www.kaggle.com/datasets/arnikaer/googleplaystore (accessed on 15 May 2025).
42. Erasmus Mobility Statistics (2014–2019). Available online: https://www.kaggle.com/datasets/donjoeml/erasmus-mobility-statistics-2014-2019 (accessed on 15 May 2025).
43. Bank Marketing Dataset. Available online: https://www.kaggle.com/datasets/mahdiehhajian/bank-marketing (accessed on 15 May 2025).
44. Steam Game Recommendations. Available online: https://www.kaggle.com/datasets/antonkozyriev/game-recommendations-on-steam (accessed on 15 May 2025).
45. Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J.; Appleton, G.; Axton, M.; Baak, A.; Blomberg, N.; Boiten, J.W.; da Silva Santos, L.B.; Bourne, P.E.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.