Development of a BIM-based AI-driven matching tool for LCA datasets

Abstract

The construction sector significantly contributes to environmental issues and often relies on Life Cycle Assessment (LCA) for the quantification and optimization of its environmental impacts. One of the most time- and labour-intensive tasks in LCA is matching real elements (e.g., construction elements and materials) to suitable environmental datasets in order to estimate the element’s sustainability performance (emissions). In this regard, this study presents an open-access software tool that leverages artificial intelligence (AI) to support, in a semi-automatic manner, the matching of construction elements in Building Information Modelling (BIM) models with the corresponding environmental datasets. Developed in Python and using the GPT-4o mini model from OpenAI for its matching mechanism, the tool demonstrates how AI-driven digital innovation can improve efficiency, reduce manual effort, and enhance early-stage environmental assessment in construction planning, while integrating sustainability data into BIM workflows. Through a series of use cases, the software’s ability to address key challenges in the integration of BIM and LCA tools is demonstrated, showcasing a high degree of automation and interoperability. Moreover, the accessible design of the tool allows use without extensive technical knowledge. The conducted validation tests confirmed the tool’s potential for accurate LCA matching, highlighting opportunities for AI to enhance sustainability workflows while offering BIM experts a better understanding of the challenges in sustainability assessment.

Introduction

The primary industry, particularly the construction sector, contributes to approximately 21% of the global greenhouse gas emissions and, in 2022, accounted for 34% of global energy demand, as well as for 37% of energy and process-related CO₂ emissions [1]. Given the significant negative impact of the construction sector on the environment, it plays a crucial role in combating climate change, emphasizing the necessity for solutions that promote more environmentally friendly alternatives within the industry [2, 3–4].

Life Cycle Assessment (LCA) is a method for assessing the environmental impacts of, for example, a construction project over its entire life cycle [5, 6, 7–8]. Guided by the ISO 14040 and ISO 14044, LCA is considered the most robust methodology for environmental assessment and is the only internationally standardized approach for measuring environmental sustainability [9, 10, 11–12].

LCA has been applied in the construction industry for years and diverse standards regulate its implementation, including EN 15978:2011, which specifies the methodology for conducting LCA of buildings [13]. However, its application still faces challenges [14, 15]. One of the primary issues when conducting LCA is the need for collecting, compiling, and calculating a substantial amount of data, making it a time- and resource-intensive process [14]. Furthermore, LCAs are typically carried out at the end of the design phase, once all required data is available. This implies that design decisions with the most significant environmental impacts have already been made, thus preventing the potential optimization of the construction project’s environmental performance in the decision-making process [16, 17–18].

To optimise LCA applications within the construction sector, integrating Building Information Modelling (BIM) with LCA methods is potentially one of the most promising solutions [19, 20–21]. BIM is a holistic, collaborative working method for the creation, processing, and management of data in a digital, dynamic three-dimensional model of construction projects throughout their entire life cycle [22].

On the one hand, applying advanced modelling technologies and data-driven solutions provides enhanced precision, efficiency, and scalability in conducting environmental impact assessments [23]. On the other hand, integrating the digital representation of construction projects through BIM with the comprehensive assessment offered by LCA creates a dynamic process, capturing design specifics and evaluating their environmental impacts. The synergy arising from the incorporation of BIM and LCA offers the potential to modernize evaluations, improve exchange of information, and increase the accuracy and range of results [24]. Additionally, combining BIM and LCA tools enables the rapid generation of assessment results for every possible design choice, supporting decision-making throughout the design phase [25, 26–27]. With relevant information automatically retrieved, time and labour savings can be achieved, streamlining the process further [28, 29].

Alongside LCA and BIM, artificial intelligence (AI)/machine learning (ML) (in the following referred to as AI) represents an additional category of tools that supports the designers in the decision-making process during the initial phases (ex-ante) of more environmentally friendly design [30]. This early decision-making under high uncertainty represents one major challenge for LCA. AI is a remarkable technology that currently plays a significant role in the digital transformation and has gained wide acceptance throughout diverse sectors, boosting efficiency and accelerating design development [30, 31].

Considering the aforementioned potentials, a holistic approach that combines BIM, LCA, and AI is essential for the path towards more sustainable construction [30]. In particular, one of the most critical, yet time- and labour-intensive processes in an LCA study is identifying the appropriate datasets in LCA databases that correspond to the building elements [15, 19, 32]. This dataset matching process currently lacks sufficient automation, which delays early-stage environmental assessment and limits the broader adoption of LCA in design and construction workflows.

This study develops and validates an open-access software solution that leverages AI to automate the matching of construction elements within a BIM model to the corresponding LCA datasets. Based on this aim, the following sub-objectives – formulated as research questions – are addressed:

How can AI be applied to improve the matching process between BIM models and LCA datasets?
To what extent can such AI-based matching reduce the manual effort and time required to conduct an environmental assessment of a construction project?
What are the technical and practical implications of the proposed tool?

To answer these questions, we first provide a systematic literature review of the current state of the art, highlighting the main challenges faced in the integration of BIM, LCA, and AI. This is followed by a detailed description of the methodology used in this study, including the development and validation process of our proposed matching solution. The results of the tool’s implementation are presented, showcasing its achievements, limitations, and potential future applications. Finally, the discussion, limitations and conclusion sections provide a critical analysis of the findings and their implications for sustainable construction practices, along with recommendations for further research, including research beyond the construction sector.

State of the art

A systematic literature review was conducted to determine the extent to which automated matching systems for LCA datasets within planning tools have been developed. To ensure a comprehensive overview of existing publications, a structured approach consisting of three steps was used: Identification, Screening and Eligibility.

The databases Scopus and Web of Science were selected for the identification of relevant studies due to their coverage of a broad range of scientific fields and their recognition as leading indexed databases [32, 33]. The literature identification process was restricted to English and German publications from 2014 to 2024. To ensure that all relevant literature was included, no further limitations were imposed on publication types or sources. The search query was constructed by selecting keywords and their relevant synonyms that pertain to BIM, LCA, and AI. Furthermore, the Boolean operators “AND” and “OR” were used to combine these terms. The following search query was used: (“Building Information Modelling” OR “BIM”) AND (“LCA” OR “Life Cycle Assessment” OR “Life Cycle Analysis” OR “Sustainability Analysis” OR “Carbon Footprint”) AND (“AI” OR “Artificial Intelligence” OR “Machine learning” OR “Automated”). Searches were performed within the title, abstract, and keyword sections of each publication. The review was conducted in Spring 2024, identifying 51 publications in Web of Science and 62 in Scopus, with 29 duplicates removed. Ultimately, 84 publications were selected for the subsequent screening.

The screening involved a detailed evaluation of titles and abstracts to assess relevance to the research objectives. Two main inclusion criteria were applied: (1) The publications should address the development of software for BIM-based LCA dataset matching using AI. Publications that only describe theoretical approaches without actual code implementation were excluded, as the focus is on practical, programmed software solutions. (2) The software described in the publications must have open access source code to allow for transparency and potential further development.

After screening, 76 publications were excluded, leaving eight publications for the eligibility phase. In this step, full-text analyses were conducted to evaluate compliance with the inclusion criteria. Ultimately, only one publication met all criteria and was selected for in-depth analysis.
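The selection funnel described above can be retraced with simple arithmetic. The following sketch only restates the record counts given in the text; the variable names are illustrative.

```python
# Record counts as reported in the review (variable names are illustrative).
web_of_science = 51
scopus = 62
duplicates = 29
excluded_in_screening = 76

identified = web_of_science + scopus                   # hits from both databases
screened = identified - duplicates                     # unique records screened
full_text_assessed = screened - excluded_in_screening  # eligibility phase

print(identified, screened, full_text_assessed)
```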

Hermann et al. [34] introduced the BIMetrix method as a solution to automate LCAs in construction projects. The method incorporates several processes, including semantic and geometric processing of Industry Foundation Classes (IFC) models, environmental impact calculation, and result visualization. These processes are intended to be performed automatically through the ETL (Extract, Transform, Load) software Feature Manipulation Engine (FME) Workbench. According to Hermann et al. [34], the AI model, trained on data from multiple large-scale projects, matches the mapping strings to the appropriate LCAs. However, manual verification of AI-assigned Environmental Product Declarations (EPDs) remains necessary, although full automation is the long-term goal.

Several tools exist for BIM-based LCA dataset matching, including commercial software such as One Click LCA, which enables matching of construction materials or processes based on LCA data. However, most of these tools do not transparently document their use of AI, as the extent of automation is often unclear or only briefly indicated in official sources. From the conducted literature review, BIMetrix represented a notable exception, as it explicitly documents the application of AI within its matching process, making it a particularly relevant reference for this study. Nevertheless, further examination of BIMetrix and direct correspondence with the authors confirmed that BIMetrix currently lacks a user interface and is not available as an open-access tool. Furthermore, in the current development stage, dataset mapping is still performed manually by experts.

The findings of the review indicate that no open-access BIM-based LCA smart-matching software is currently available. Complementary studies outside the inclusion criteria include the work by Forth et al. [35, 36], which presents an approach based on Natural Language Processing (NLP) that links early-stage BIM elements with LCA datasets through semantic processing. Although the software is not openly accessible and was therefore excluded from the review, this study demonstrates the potential of domain-specific NLP models to achieve high matching accuracy and highlights alternative methodological directions for automated BIM-LCA integration. Building on these insights, the following section discusses current challenges in this research field and outlines the conceptual foundation for the proposed software solution.

Challenges in BIM-LCA integration

Despite the lack of publicly accessible BIM-based LCA smart-matching software tools, the conducted literature review revealed possible reasons for this research and application gap. In particular, two recent reviews provided key insights. Chen et al. [2] evaluated the role of BIM software in optimizing the LCA process to improve its efficiency and accuracy, delving into the capacities and constraints of a range of BIM software and LCA tools and presenting characteristics from cases of BIM-LCA integration. Meanwhile, Akbari et al. [37] conducted a comprehensive analysis of the integration between BIM and sustainability, highlighting overlooked issues. These studies outlined three core challenges in BIM-LCA integration:

Degree of automation: Many processes within BIM-LCA integration require manual intervention, such as data selection and the linking of information across different software platforms, which complicates the iterative design process. Automated semantic analysis tools offer a promising solution to overcome the challenges posed by manual data classification and reasoning during data exchange. An automatic or semi-automatic integration of BIM and LCA improves evaluation efficiency and enhances the usability of more complex models in the future. According to Chen et al. [2], enhancing the automation and precision of semantic analysis is one direction for advancing BIM-LCA together with other technologies.
Data exchange and interoperability: Interoperability facilitates data translation between BIM software, energy consumption tools, and LCA tools, streamlining workflows by eliminating the need for manual data copying from previous applications [38]. Despite extensive research efforts, interoperability continues to be one of the most significant challenges hindering the long-term adoption of BIM [37]. A key issue lies in the differing data structures between LCA and BIM, which hinders efficient data exchange and necessitates manual mapping of material data [2]. To enable mutual data exchange, BIM software and LCA tools must align with a common data structure [39]. The use of standardized formats for data sharing during the planning and design phase plays a critical role in achieving interoperability. The adoption of such formats is an important direction for the future of BIM-LCA integration [2]. According to Akbari et al. [37], the development of a fully automated approach using AI techniques is a potential future research direction for overcoming interoperability challenges.
Level of information need: The Level of Information Need (LOIN) in BIM enables architects, engineers, and construction professionals to define and explain the content and accuracy of Building Information Models throughout different stages in the design and construction process [40]. LOIN defines, for each use case and project stage, the necessary geometric, alphanumeric and documentation requirements. As such, it is purpose-driven and non-numeric, and is specified per exchange scenario. It is crucial to recognize that assigning a fixed LOIN for all construction components is impractical, and continuous updates to the LOIN throughout the project life cycle are critical [19]. Early-stage models (low LOIN) allow designers to quickly assess and adjust early design decisions, but these models offer limited precision in predicting environmental impacts [2, 5]. This presents a challenge where LCA can be applied in a simplified approach with limited accuracy early in the project using incomplete data, or later when all information is accessible, although this would be regarded as being too late to influence decision-making [37].

Based on this literature research and the analysed challenges, the methodology for the developed matching tool for BIM and LCA databases is presented in the following.

Methodology

This section presents the development of an innovative software approach for BIM-based, AI-driven LCA dataset matching. The proposed software solution leverages AI to achieve precise and time-efficient matching of LCA datasets with the corresponding construction elements within a BIM model. This research follows an applied, deductive approach, combining conceptual design with iterative development and testing to address the stated research objectives (see Sect. 1). This approach enables a systematic design and evaluation of the software, with the aim of improving the matching process between BIM models and LCA datasets through targeted AI integration, assessing efficiency gains and the reduction of manual effort in LCA workflows, and evaluating the technical feasibility and practical implications of the proposed tool. The goal is to provide a comprehensive understanding of the theoretical principles, practical implementation, and evaluation of this BIM-based AI-driven LCA dataset matching approach. This section is complemented by Section S1 of the Supplementary Materials, which provides an extended description and illustration of the code.

Framework for software development

The BIM-based AI-driven LCA dataset matching methodology involves three key steps, as illustrated in Fig. 1: Data Acquisition, Data Input and Data Analysis. The first step (left side Fig. 1), data acquisition, consists of two components. The first component handles the extraction of data from a BIM model, which is created using Autodesk® Revit® 2022 and exported in IFC 4 format, enabling the extraction of relevant data required for the LCA dataset matching.

[See PDF for image]

Fig. 1

Matching tool workflow

Broad compatibility is crucial for the software development in this study, as it allows the integration of data from various BIM tools, ensuring that the AI-driven LCA dataset matching methodology can be applied universally. IFC also enhances the performance, effectiveness, and efficiency of life cycle management by enabling information sharing between BIM and other IFC-compatible environments [41]. This capability is particularly important for the software, as it supports a continuous flow of information throughout the project life cycle, enabling the consistent application of environmental assessments.

The second key component is the connection to an LCA database. The database chosen for this work is ÖKOBAUDAT, a standardised German database offering extensive LCA datasets for construction materials, as well as for processes like construction, transport, energy, and disposal [42]. ÖKOBAUDAT provides open access data and contains generic, as well as company- or association-specific datasets [42]. The decision to use ÖKOBAUDAT is based on its comprehensiveness, reliability and accessibility. Moreover, ÖKOBAUDAT is specifically tailored to the requirements of the construction industry in Germany. Furthermore, the datasets are standardised and verified. The ÖKOBAUDAT datasets are downloaded as Comma-separated values (CSV) export for importing and processing in the software and correspond to the most current version of the database, Release 2023-I [42].

The second step comprises the data input (middle square Fig. 1). For each construction element in the IFC model, values required for the matching process with the LCA datasets are queried. These parameters include IFC type, name, and a variety of property sets for each construction element. We use a rule-based extractor that targets IFC-native sources and standardized property sets rather than project-specific ad-hoc property sets. We read element/type and material associations (e.g., IfcMaterialLayerSet(Usage)), quantities (BaseQuantities: NetVolume, NetArea; Revit-fallback PSet_Revit_Dimensions), and standardized common property sets (e.g., Pset_WallCommon.IsExternal, LoadBearing). Project-specific or custom property sets are supported via a configurable list of aliases and used only as fallback when standardized sources are missing. In parallel to the BIM data query, essential parameters are extracted from each LCA dataset, including the name, category, and country of origin of the material or product. The automated linking of these two data sources is optimized through the targeted query and comparison of relevant parameters from the IFC model and the LCA datasets. The systematic extraction of location information, typifications and specific properties ensures improved reliability and accuracy of the matching process.
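The fallback order of the rule-based extractor can be illustrated with a minimal sketch. It assumes that the property sets of an element are already available as a nested dictionary (in an ifcopenshell-based pipeline, a structure of this shape could be obtained, for example, from ifcopenshell.util.element.get_psets). The function names, the property name "Volume" in the Revit set, and the custom alias are assumptions made for illustration and are not taken from the tool's source code.

```python
from typing import Any, Dict, Iterable, Optional, Tuple

PropertySets = Dict[str, Dict[str, Any]]
Source = Tuple[str, str]  # (property set name, property name)

# Lookup order following the text: standardized quantities first, then the
# Revit-specific set. The property name "Volume" is an assumption.
STANDARD_SOURCES = [("BaseQuantities", "NetVolume")]
REVIT_FALLBACK = [("PSet_Revit_Dimensions", "Volume")]


def first_available(psets: PropertySets, sources: Iterable[Source]) -> Optional[Any]:
    """Return the first non-missing value among the given sources."""
    for pset_name, prop_name in sources:
        value = psets.get(pset_name, {}).get(prop_name)
        if value is not None:
            return value
    return None


def extract_volume(psets: PropertySets,
                   custom_aliases: Iterable[Source] = ()) -> Optional[Any]:
    """Try standardized sources, then the Revit fallback, then custom aliases."""
    for sources in (STANDARD_SOURCES, REVIT_FALLBACK, list(custom_aliases)):
        value = first_available(psets, sources)
        if value is not None:
            return value
    return None


# Hypothetical elements, written as nested dictionaries {pset: {property: value}}.
wall_standard = {"BaseQuantities": {"NetVolume": 2.5},
                 "PSet_Revit_Dimensions": {"Volume": 2.6}}
wall_revit_only = {"PSet_Revit_Dimensions": {"Volume": 1.2}}
wall_custom = {"MyProject_Quantities": {"Vol": 0.8}}

print(extract_volume(wall_standard))    # standardized value takes priority
print(extract_volume(wall_revit_only))  # Revit fallback is used
print(extract_volume(wall_custom, [("MyProject_Quantities", "Vol")]))  # alias
```

Keeping the lookup order in plain lists makes the priority explicit and lets project-specific aliases be added without touching the extraction logic.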

Within this step, a pre-matching procedure is conducted to ensure the reliability and consistency of the information extracted from the IFC model and the LCA datasets before initiating the matching step. This procedure addresses several important aspects. First, the consistency of the reference units is verified by extracting the functional unit of each LCA dataset and calculating the corresponding quantity (e.g., volume, area, mass) from the geometric representation of the IFC entity. Second, coverage is checked by identifying missing values in the property sets of each building element. Although the workflow assumes that the full IFC model is used “as-is”, matches are still initiated for elements with incomplete feature sets. Finally, mapping preparation is performed through the generation of unique fingerprints for each building component, which reduces noise and ambiguity caused by incomplete BIM information.
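Two of the pre-matching checks, the reference-unit check and the coverage check, can be sketched as follows. This is a simplified illustration under stated assumptions: elements are plain dictionaries, the unit labels ("m3", "m2", "kg") and field names are hypothetical, and mass is derived from volume and an assumed density rather than from the geometric representation itself.

```python
from typing import Any, Dict, List, Optional

# Assumed mapping from a dataset's reference unit to the element quantity it needs.
UNIT_TO_QUANTITY = {"m3": "volume", "m2": "area", "kg": "mass"}


def reference_quantity(element: Dict[str, Any], reference_unit: str) -> Optional[float]:
    """Return the element quantity expressed in the dataset's reference unit."""
    quantity = UNIT_TO_QUANTITY.get(reference_unit)
    if quantity is None:
        return None  # unknown unit: leave the decision to the user
    if quantity == "mass":
        volume, density = element.get("volume"), element.get("density")
        if volume is None or density is None:
            return None
        return volume * density  # mass derived from volume and an assumed density
    return element.get(quantity)


def missing_properties(element: Dict[str, Any], required: List[str]) -> List[str]:
    """List required properties that are absent or empty (coverage check)."""
    return [name for name in required if element.get(name) in (None, "")]


# Hypothetical wall element with an empty material name.
wall = {"volume": 4.0, "area": 20.0, "density": 2400.0,
        "material": "", "is_external": True}

print(reference_quantity(wall, "m3"))
print(reference_quantity(wall, "kg"))
print(missing_properties(wall, ["material", "is_external", "load_bearing"]))
```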

The final step (right side Fig. 1) is the data analysis, which involves the matching mechanism of the extracted elements from the BIM model with LCA datasets from ÖKOBAUDAT. The matching is conducted by the GPT-4o mini model from OpenAI. GPT-4o mini achieved an accuracy of 82% in Massive Multitask Language Understanding (MMLU), a benchmark for textual intelligence and reasoning [43]. These capabilities are crucial for accurately identifying the similarities between the construction elements in the BIM model and the LCA datasets. GPT-4o mini is also characterized by high cost-efficiency, making it more affordable for users and enabling a wider adoption [43].

It should be highlighted that the current implementation focuses on the core task of LCA dataset matching. Other fundamental LCA aspects, such as the definition of the functional unit, system boundaries, allocation procedures, or specific scenarios, are not yet integrated, but could be integrated in future versions of the software. In its current state, the main goal of the tool is to demonstrate the feasibility of AI-supported matching and lay the foundation for more comprehensive LCA automation in subsequent development phases.

The approach is implemented in Python, which is advantageous with regard to ease of use and range of applications [44], e.g. ML and data analysis [45]. IFC files are processed with the ifcopenshell library, and Application Programming Interface (API) calls to the Large Language Model (LLM) are made with OpenAI’s own Python library. We differentiate between two variants of our workflow: an interactive matching approach and an automatic matching approach, the latter of which we use for the comparison with manual LCA dataset matching in Sect. 3.3.

Interactive matching process

The interactive matching process focuses on correctness and applicability in realistic scenarios, where a supervisor validates the matching process consistently. Fig. 2 depicts the process for interactive matching of construction elements with the most appropriate LCA datasets. This process ensures that the environmental datasets are accurately aligned with the characteristics of the construction element (in terms of quality, materials, production processes, etc.) and system boundaries, incorporating location-based variations, user input, and iterative refinement to achieve optimal results, while achieving a semi-automated matching process.

[See PDF for image]

Fig. 2

Matching process

The process is structured as a user dialog, which begins by verifying whether the country of the construction site is the same as the country from which the construction element originates. This user input helps to narrow the choice of appropriate LCA datasets, which might otherwise compete with similar products from unrealistic locations. Next, the most informative properties of the building element are extracted from the IFC file and used to generate a request to the LLM. This information set comprises the IFC type, the name, and the associated property set describing the material. Beforehand, the environmental database is provided to the LLM as a vector store to allow efficient data access. The comparison focuses on aligning the attributes of the element with the corresponding fields in the database, specifically the LCA dataset name and category. Based on this comparison, the most suitable LCA dataset for the element is automatically identified and suggested to the user.
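The first two dialog steps, restricting the candidates by country and composing the request text, might look as follows in a minimal sketch. The dataset records, field names and the wording of the request are hypothetical; the vector-store lookup and the actual LLM call are deliberately left out.

```python
from typing import Any, Dict, List


def filter_by_country(datasets: List[Dict[str, str]], site_country: str,
                      same_country: bool) -> List[Dict[str, str]]:
    """Keep only datasets from the site country if the user confirmed it."""
    if not same_country:
        return list(datasets)
    return [d for d in datasets if d["country"] == site_country]


def build_request(element: Dict[str, Any]) -> str:
    """Compose a short natural-language request from IFC type, name and properties."""
    parts = [f"Please suggest an LCA dataset for the {element['ifc_type']} "
             f"named '{element['name']}'."]
    for key, value in element.get("properties", {}).items():
        parts.append(f"{key}: {value}.")
    return " ".join(parts)


# Hypothetical dataset records and element.
datasets = [
    {"name": "Concrete C30/37", "category": "Mineral building products", "country": "DE"},
    {"name": "Ready-mix concrete", "category": "Mineral building products", "country": "AT"},
]
element = {
    "ifc_type": "IfcWall",
    "name": "Basis Wall STB 200",
    "properties": {"Material": "Reinforced Concrete", "IsExternal": True},
}

candidates = filter_by_country(datasets, "DE", same_country=True)
request = build_request(element)
print([d["name"] for d in candidates])
print(request)
```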

The user evaluates whether the selected LCA dataset sufficiently corresponds to the element’s characteristics. This verification step ensures that the dataset accurately reflects the environmental impact profile of the selected element. In the event of a mismatch, the user is presented with the option to reassess the element’s property sets. A further refined description may include more specific attributes or additional contextual information about the element, enhancing the accuracy of subsequent comparisons. This iterative approach — comprising comparison, refinement, and re-evaluation — continues until a suitable LCA dataset is successfully identified.

The process concludes when a matching LCA dataset is confirmed to align with the construction element. The resulting matching can be stored as an additional property set of the respective component in the IFC file.
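The iterative loop of suggestion, verification and refinement can be summarised in a short sketch. The LLM call and the user dialog are replaced by simple stand-in functions, and all names, the dataset labels and the round limit are assumptions made for illustration only.

```python
from typing import Callable, Optional


def interactive_match(description: str,
                      suggest: Callable[[str], str],
                      confirm: Callable[[str], bool],
                      refine: Callable[[str, str], str],
                      max_rounds: int = 5) -> Optional[str]:
    """Repeat suggest -> confirm -> refine until a suggestion is accepted."""
    for _ in range(max_rounds):
        candidate = suggest(description)
        if confirm(candidate):
            return candidate
        description = refine(description, candidate)
    return None  # no accepted match within the allowed number of rounds


# Stand-ins for the LLM call and the user dialog (all hypothetical).
def toy_suggest(description: str) -> str:
    return "Concrete C30/37" if "C30/37" in description else "Concrete C20/25"


def toy_confirm(candidate: str) -> bool:
    return candidate == "Concrete C30/37"


def toy_refine(description: str, rejected: str) -> str:
    return f"{description} Not {rejected}; the concrete class is C30/37."


result = interactive_match("Reinforced concrete wall, 20 cm.",
                           toy_suggest, toy_confirm, toy_refine)
print(result)
```

Passing the three steps in as functions keeps the loop independent of how suggestions are produced, so the same control flow would apply whether the suggestion comes from an LLM or from a rule-based matcher.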

While the proposed workflow requires user verification, it differs from existing BIM-LCA plug-ins available in the market. Conventional solutions typically import quantities and classifications from BIM models but still require users to manually browse the LCA database and assign datasets to each element, often repeating this process for numerous building components depending on the complexity of the model. Some tools offer algorithm-based automated matching (e.g., One Click LCA), but unsuccessful matches are not addressed, leaving users to manually resolve potentially large numbers of mismatches in complex models. In contrast, our tool proposes alternative datasets automatically based on the properties of the building elements. In this case, the user only needs to verify and, if necessary, refine the matching.

Automatic matching process

The focus of the automatic matching process is speed and efficiency. While the core approach works similarly, the IFC entities are first automatically extracted and split into entities with and without multiple layers. Since BIM models are in most cases created by reusing similar wall structures and material layers, re-matching similar elements must be avoided. This is achieved by fingerprinting unique elements, i.e., assigning keys corresponding to the IFC type, the material association and, if applicable, the layer structure. For each fingerprint, the GUIDs of the matching elements are collected so that they can be matched within a single run. Afterwards, the requests are generated as in step two of the interactive approach. However, we restrict the model to replying with the UUID of the matched LCA dataset, its name and the model’s confidence in the correctness of the match. This allows the respective property sets to be created automatically for each element in the IFC file. Fig. 3 depicts an example of matched datasets for the layers of an external wall.
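The fingerprinting idea can be sketched in a few lines. The example assumes that each element has already been reduced to a dictionary with its IFC type, material and layer list; the field names and GUID placeholders are hypothetical, and the exact key composition used in the tool may differ.

```python
from collections import defaultdict
from typing import Any, Dict, Hashable, List


def fingerprint(element: Dict[str, Any]) -> Hashable:
    """Key built from IFC type, material association and layer structure."""
    layers = tuple((layer["material"], layer["thickness"])
                   for layer in element.get("layers", []))
    return (element["ifc_type"], element.get("material"), layers)


def group_by_fingerprint(elements: List[Dict[str, Any]]) -> Dict[Hashable, List[str]]:
    """Collect the GUIDs of all elements sharing the same fingerprint."""
    groups: Dict[Hashable, List[str]] = defaultdict(list)
    for element in elements:
        groups[fingerprint(element)].append(element["guid"])
    return dict(groups)


# Hypothetical elements; the GUIDs are placeholders, not real IFC GlobalIds.
elements = [
    {"guid": "guid-001", "ifc_type": "IfcWallStandardCase",
     "layers": [{"material": "Plasterboard", "thickness": 0.0125},
                {"material": "Concrete", "thickness": 0.2}]},
    {"guid": "guid-002", "ifc_type": "IfcWallStandardCase",
     "layers": [{"material": "Plasterboard", "thickness": 0.0125},
                {"material": "Concrete", "thickness": 0.2}]},
    {"guid": "guid-003", "ifc_type": "IfcDoor", "material": "Wood"},
]

groups = group_by_fingerprint(elements)
print(len(elements), "elements ->", len(groups), "requests")
```

Each group then needs only one request, and the reply can be written back to every GUID collected under that fingerprint.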

[See PDF for image]

Fig. 3

Example of dataset property sets of each layer of an external wall in IFC. “EPD assignment” and “EPD Name” refer to the LCA dataset assignment and LCA dataset name, respectively
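How a restricted model reply could be turned into such a property set is sketched below. The sketch assumes that the reply is returned as JSON with the keys "uuid", "name" and "confidence" (in percent) and that "EPD assignment" holds the dataset UUID; the reply format, the additional "Confidence" property and the placeholder UUID are assumptions, not details confirmed by the tool.

```python
import json
from typing import Any, Dict


def reply_to_property_set(reply_text: str) -> Dict[str, Any]:
    """Parse a JSON reply and map it to the property names shown in Fig. 3."""
    reply = json.loads(reply_text)
    confidence = float(reply["confidence"])
    if not 0.0 <= confidence <= 100.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return {
        "EPD assignment": reply["uuid"],   # assumed to hold the dataset UUID
        "EPD Name": reply["name"],
        "Confidence": confidence,          # property name is an assumption
    }


# Hypothetical reply; the UUID is a placeholder, not a real dataset identifier.
reply_text = ('{"uuid": "00000000-0000-0000-0000-000000000000", '
              '"name": "Concrete C30/37", "confidence": 90}')

print(reply_to_property_set(reply_text))
```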

Testing–interactive matching

To test the developed tool and validate the AI-driven LCA dataset matching, seven hypothetical IFC elements already available in Autodesk® Revit® 2022 were created in combination with shortened versions of the CSV database. The selected elements are five walls, one door and one window. The property set of the wall elements included information regarding whether it is external or load-bearing, or whether or not it extends to the structure above, as well as the thermal transmittance of the element. The property set of the door and window included whether or not these elements were external. These shortened databases were used to minimise the number of tokens during the tests and to focus on the OpenAI model’s approach to specific matching situations. Among the entries in the shortened databases, one or more LCA datasets were defined as the most suitable for each element before each test.

All selected components are modelled using basic elements in Autodesk® Revit® 2022. These base elements are generalised, which increases the tolerance for a suitable LCA dataset matching. However, this increased tolerance could lead to the AI providing identical similarity scores for different LCA datasets and eventually making a random selection of the most suitable LCA dataset among those with the same similarity scores. To check this and avoid any double counting, the tests are performed ten times under the same conditions. In the following, one of the seven tested IFC elements is introduced as an example for the tests, with the remaining tests provided as Supplementary Materials (Section S2).
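The evaluation of repeated runs can be expressed as a simple tally. The sketch below counts how often each dataset was selected and which share of the selections falls within the set defined as suitable beforehand. The run outcomes are placeholders invented for illustration and are not results of this study.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def selection_stability(selections: List[str],
                        suitable: Set[str]) -> Tuple[Dict[str, int], float]:
    """Count how often each dataset was chosen and the share of suitable picks."""
    counts = Counter(selections)
    suitable_picks = sum(n for name, n in counts.items() if name in suitable)
    return dict(counts), suitable_picks / len(selections)


# Placeholder outcomes of ten runs (invented for illustration, not study results).
runs = ["dataset A"] * 6 + ["dataset B"] * 3 + ["dataset C"]
suitable = {"dataset A", "dataset B"}

counts, share = selection_stability(runs, suitable)
print(counts, share)
```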

One of the seven different IFC elements was defined and characterized as a reinforced concrete wall. This wall was modelled with a thickness of 20 cm and paired with the shortened database as shown in Fig. 4 (extracted from ÖKOBAUDAT).

[See PDF for image]

Fig. 4

Shortened database for ‘Basis Wall STB 200’

Six LCA datasets were selected for the shortened database (Fig. 4), with each of the four available concrete types defined as suitable datasets for the reinforced concrete wall. A precast concrete wall was included in the database to check how the AI prioritizes the term ‘Wall’ compared to ‘Reinforced Concrete’ (translation used in ÖKOBAUDAT: ‘Wand’ compared to ‘Stahlbeton’). The LCA datasets for a reinforced concrete pipe and a reinforced concrete shaft were also added to test the sensitivity to the key term ‘Reinforced Concrete’. The four concrete types were chosen to see if the AI model breaks down the term ‘Reinforced Concrete’ into its name components and accordingly selects a concrete type.

Since the concrete type cannot be determined from the element’s properties, an exact match is not expected. This situation illustrates the value of an additional element description provided by the user. The goal of this test is to examine how the model processes the term ‘Reinforced Concrete’ and whether it prioritizes a concrete type, the wall, or the reinforced concrete pipe and shaft.

In this test, the AI model chose different concrete types across multiple runs (see Fig. 5), demonstrating its ability to break down the key term ‘Reinforced Concrete’ and successfully prioritize concrete types. To further illustrate the value of the optional element description, the model was informed that the selected concrete type was incorrect and that concrete C30/37 should be selected instead. The AI model adjusted its selection accordingly and chose the correct concrete type. This test clearly shows that the additional element description has a decisive influence on the matching process and leads to a precise selection of the suitable LCA dataset.

[See PDF for image]

Fig. 5

Example: AI decision on LCA dataset when searching for ‘reinforced concrete’

Case study

To further evaluate the potential for a faster and more efficient initial estimate of the Global Warming Potential (GWP) of a planned construction, a case study was conducted on the widely known IFC example building Duplex_A published by the American National Institute of Building Sciences [46]. The automatic matching process developed in this manuscript was used and compared with the manual assignment by an expert. To minimize the number of influential factors that may compromise the results of the comparison, the component dimensions were taken from the property set of the Autodesk® Revit® dimensions available in the IFC file. These data were then used to derive the total GWP. The assignment was also limited to the physical elements relevant for the structure, disregarding other elements such as staircase railings and furniture. Fig. 6 shows the model coloured according to the confidence scores assigned by the LLM to each match, where the colour thresholds are defined as follows:

≥ 85%: green
≥ 70%: orange
< 70%: red
ignored elements: blue
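The colour thresholds listed above can be written as a small lookup function. The following is a minimal sketch; the function name `confidence_colour` and the `ignored` flag are illustrative assumptions and are not taken from the published tool.

```python
def confidence_colour(confidence_percent, ignored=False):
    """Map a confidence score (in percent) to the colour class used in Fig. 6.

    Hypothetical helper: name and signature are assumptions for illustration.
    """
    if ignored:
        return "blue"      # element excluded from the matching
    if confidence_percent >= 85:
        return "green"
    if confidence_percent >= 70:
        return "orange"
    return "red"           # below 70%
```

Checking the thresholds from the highest to the lowest keeps the orange class implicitly bounded to the range from 70% up to, but not including, 85%.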

[See PDF for image]

Fig. 6

Coloured Duplex_A model according to the LLM’s confidence in the correct match (bottom), compared to normal colouring (top)

Although the confidence is not linked to a consistent metric, low values closely corresponded to elements whose name and material were not directly interpretable, which made it difficult for the model to assign a dataset. The pre-processing step for request generation resulted in a total of 55 requests for 296 individual elements in the model. The distribution of assignment confidence scores is depicted in Fig. 7.
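One plausible way to obtain fewer requests than elements is to group elements that would produce an identical request text. The sketch below illustrates this idea under stated assumptions: the grouping key, the dictionary field names, and the function names are hypothetical, and the request wording follows the example request quoted in this section. It is not necessarily the pre-processing implemented in the tool.

```python
from collections import defaultdict


def request_key(element):
    # Assumption: elements sharing all of these attributes can share one request.
    return (
        element["ifc_class"], element["storey"], element["layer"],
        element["material"], element["thickness_m"],
        element["external"], element["load_bearing"],
    )


def build_request(key):
    # Assemble the request text from the grouping key.
    ifc_class, storey, layer, material, thickness, external, load_bearing = key
    return (
        f"Please assign an EPD to Layer {layer} of {ifc_class} in storey "
        f"'{storey}': Material is '{material}', thickness {thickness} m. "
        f"External: {external}. Load-bearing: {load_bearing}."
    )


def group_requests(elements):
    """Return a mapping from request text to the IDs of the elements it covers."""
    groups = defaultdict(list)
    for element in elements:
        groups[request_key(element)].append(element["id"])
    return {build_request(key): ids for key, ids in groups.items()}
```

With such a grouping, each reply can be written back to every element ID stored under the corresponding request, so the number of API calls depends on the number of distinct descriptions rather than on the number of elements.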

[See PDF for image]

Fig. 7

Distribution of the LLM’s confidence in the matched entities

As shown in the figure, six elements were matched with a low perceived confidence of 75%, one example being an IfcRoof element with the material information ‘Site–Grass’. This element was mapped to the dataset ‘EPD plastic roof window—Roto Frank DST—RotoQ double glazing’, which is in fact an incorrect match. The other elements illustrate the occasional instability of the response. In one of those cases, a layer of an IfcWallStandardCase with material ‘Plasterboard’ was incorrectly matched with a multi-layer parquet, whereas other Plasterboard layers were matched differently: a layer in an element of type IfcCovering was matched with an available ‘Plasterboard/Tool’ dataset, and a layer of an IfcWall element with Plasterboard material was matched to a ‘Gypsum Plaster’ dataset. Still, when comparing the outputs of the manual evaluation and the automatic matching, the total GWP is estimated at 36.6 t CO₂ eq. and 28.0 t CO₂ eq., respectively (a negative GWP was obtained due to the accounting of biogenic carbon in elements containing wood).

The manual matching process took approximately 3 h, while the automatic matching took approximately 11 min of processing time. The latter included the time for request creation, processing of replies, storing the datasets in new element-wise property sets and evaluating the total GWP. The cost of one full pass of requests via the OpenAI platform totals around 0.06 US dollars (USD) for approximately 538,000 tokens in our small example, where we make a minimalistic request such as: “Please assign an EPD to Layer 0 of IfcWallStandardCase in storey ‘T/FDN’: Material is ‘Concrete—Cast In Situ’, thickness 0.417 m. External: True. Load-bearing: True.”
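The figures reported above can be put into proportion with some simple arithmetic. The sketch below only combines values stated in this section (3 h, 11 min, 0.06 USD, approximately 538,000 tokens, 55 requests, 296 elements); the derived quantities are rough, illustrative estimates, and the variable names are chosen for this example.

```python
# Values reported in the case study
manual_minutes = 3 * 60        # approx. 3 h of manual matching
automatic_minutes = 11         # approx. 11 min of automatic processing
cost_usd = 0.06                # cost of one full pass of requests
tokens = 538_000               # approximate number of tokens in that pass
requests = 55
elements = 296

# Derived, illustrative quantities
speed_up = manual_minutes / automatic_minutes            # about 16.4
usd_per_million_tokens = cost_usd / tokens * 1_000_000   # about 0.11 USD
tokens_per_request = tokens / requests                   # about 9,800 tokens
elements_per_request = elements / requests               # about 5.4 elements

print(f"Speed-up factor: {speed_up:.1f}")
print(f"Implied cost per million tokens: {usd_per_million_tokens:.3f} USD")
print(f"Average tokens per request: {tokens_per_request:.0f}")
print(f"Average elements per request: {elements_per_request:.1f}")
```

The high average token count per request compared with the short request text suggests that most tokens stem from the context supplied with each request rather than from the element description itself; this is an interpretation of the numbers, not a measured result.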

To qualitatively evaluate the matching quality, each match was assessed by an expert to determine whether it is realistic. While this is not a hard metric, because multiple datasets of various types (i.e., generic, specific, average) and with different scopes in terms of product and life cycle stages exist, it still allows the reliability of the automatic matching to be estimated. The evaluation yielded an accuracy of 66%, while the correlation between confidence values and correctness of the match was 0.4125, indicating that the model is not capable of conducting the matching fully autonomously. Fig. 8 shows the relation between confidence and correctness. The findings support the hypothesis that, at this point, the interactive approach is more useful when only a limited set of information is provided.

[See PDF for image]

Fig. 8

Visual relationship between confidence and correctness of an automatic matching
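The two evaluation figures discussed above, accuracy and the correlation between confidence and correctness, can be computed as sketched below. The text does not state which correlation coefficient was used; the sketch assumes a Pearson coefficient between the confidence value and a binary correctness flag (a point-biserial correlation). The sample values are invented for illustration and are not the case study data.

```python
from math import sqrt


def accuracy(correct_flags):
    """Share of matches that the expert judged to be realistic."""
    return sum(correct_flags) / len(correct_flags)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Illustrative values only (not the data of the case study)
confidence = [95, 90, 85, 75, 70, 60]   # LLM confidence in percent
correct = [1, 1, 0, 1, 0, 0]            # expert judgement: 1 = realistic match

print(f"Accuracy: {accuracy(correct):.2f}")
print(f"Correlation: {pearson(confidence, correct):.2f}")
```

A coefficient well below 1 means that a high confidence value alone does not guarantee a correct match, which is consistent with the conclusion that expert review remains necessary.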

Finally, it must be highlighted that the LOIN of the Duplex_A model is low. Therefore, much of the information in the property sets that is relevant for correct LCA dataset assignments is missing or non-specific. In these cases, the Retrieval-Augmented Generation (RAG) approach of the LLM might be unstable due to missing cues for a correct embedding.

Discussion

One of the major obstacles identified in existing BIM-LCA integrations is the reliance on manual processes, particularly in selecting and linking data across various software platforms. This manual intervention often leads to inefficiencies in environmental evaluations, especially in complex design scenarios. Automating the integration of BIM and LCA while enhancing the precision of semantic analysis is seen as a crucial step toward improving evaluation efficiency and model usability. Furthermore, automation makes LCA more accessible to BIM users without a background in the method by streamlining one of its most complex steps. Additionally, it facilitates the incorporation of environmental considerations in early design stages.

The developed software addresses the automation challenge by incorporating AI, specifically the GPT-4o mini model. It provides an initial automatic match of construction components to LCA datasets without requiring user corrections for each dataset allocation. This significantly reduces manual effort, enabling automated assessments as soon as the BIM model is uploaded. By reducing the need for manual data classification, the proposed approach not only saves time but also has the potential to enhance accuracy by decreasing human error. Additionally, the integration of user-defined prompts allows for varying levels of automation. Users can manually refine results, providing a flexible balance between full automation and user control, as demonstrated by the two approaches: controlled interactive matching and fully automated matching with a focus on efficiency.

It is important to note that this software serves primarily as a proof-of-concept. Its main goal is to highlight the challenges of AI-supported LCA dataset matching and to facilitate collaboration between sustainability experts and programmers. The OpenAI model was selected for its accessibility and ease of implementation, demonstrating the feasibility of integrating a general-purpose LLM into this workflow. This exploratory implementation lays the foundation for further iterations, refinements, and methodological advancements.

Regarding interoperability, the primary issue lies in the differing data structures between BIM software and LCA tools, which necessitates manual mapping of material data, resulting in inefficiencies and potential errors. Achieving interoperability between these platforms is crucial for improving workflow automation. This issue is addressed in our process by utilizing the IFC standard, which ensures wide compatibility across BIM platforms and consistency regarding building component properties. Although the tool proposed in this study was demonstrated using Autodesk® Revit® models, it is compatible with any BIM software that can export IFC files, making it broadly applicable across different platforms. IFC enables the generic extraction of relevant data from BIM models and its matching with LCA datasets. Using standardized formats for data sharing plays a critical role in mapping entities to EPDs and LCA datasets, and represents an important step towards BIM-LCA integration. By integrating AI techniques into the matching process, our process automates data translation, reducing the need for manual and especially static mapping. This addresses one of the primary obstacles to seamless information flow between BIM and LCA systems throughout the project life cycle, helping to keep environmental assessments up to date as the project progresses.

Another key challenge in BIM-LCA integration is the varying LOIN across different project stages and construction components. BIM objects with lower LOIN provide limited detail, which can result in imprecise environmental assessments during early design phases. Conversely, higher LOINs typically become available later in the project, when opportunities to influence design decisions are more restricted. We mitigate this problem through our LLM-based matching, which provides LOIN flexibility in the sense that, if a building element carries less information, the matching will consequently select a more general dataset. In line with ISO 19650 and EN 17412-1, we specify the LOIN for each information exchange instead of using legacy Level of Development (LOD) 100–500 scales. For the proposed LCA matching, we formalize LOIN so that the model reliably provides: (a) geometry and unit (e.g., net area or net volume), (b) alphanumeric parameters (e.g., material name, layer thickness, density, classification, LCA dataset ID) and (c) documentation (LCA dataset source and/or version). Higher-quality LOIN at later stages improves LCA precision, while early-stage LOIN enables coarse but decision-relevant assessments. This makes the proposed approach applicable even during the early design stages and forms the foundation for future software iterations that enable the comparison of multiple LCA datasets during the initial design phases, allowing users to assess and explore different design options. As a result, design changes can be evaluated based on environmental impacts early on, providing valuable insights that can influence key decisions before the project progresses to more detailed stages. Moreover, the simplicity with which users can input the element description makes the software highly user-friendly, as it does not require extensive technical knowledge or expertise. This feature enhances accessibility, allowing a wide range of users to easily refine the LCA dataset matching process, further improving the software’s flexibility and ease of use.
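The three LOIN categories named above, (a) geometry and unit, (b) alphanumeric parameters and (c) documentation, can be expressed as a simple record with a completeness check. The following is a minimal sketch; the class name `LayerInfo`, the field names and the helper `missing_loin_fields` are illustrative assumptions and do not describe the data model of the published tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LayerInfo:
    # (a) geometry and unit
    quantity: Optional[float] = None           # e.g. net area or net volume
    unit: Optional[str] = None                 # e.g. "m2" or "m3"
    # (b) alphanumeric parameters
    material_name: Optional[str] = None
    layer_thickness_m: Optional[float] = None
    density_kg_m3: Optional[float] = None
    classification: Optional[str] = None
    lca_dataset_id: Optional[str] = None
    # (c) documentation
    lca_dataset_source: Optional[str] = None   # source and/or version


def missing_loin_fields(layer):
    """Return the names of all fields that carry no information yet."""
    return [name for name, value in vars(layer).items() if value is None]


# Early-stage element: geometry and material are known, the rest is open.
early_stage = LayerInfo(quantity=12.5, unit="m2", material_name="Plasterboard")
print(missing_loin_fields(early_stage))
```

A list of missing fields of this kind could be used to decide whether a generic dataset is sufficient or whether the user should be asked for an additional element description.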

Limitations

While the proposed LCA dataset matching software demonstrates several innovative features, certain limitations were encountered during its development and testing. These limitations highlight challenges in both the operation of the software and its potential for broader application in the current BIM-LCA integration.

One limitation is the cost and energy consumption associated with using the OpenAI API. Since the software relies on the OpenAI model for LCA dataset matching, every request made to the AI incurs a cost in terms of token usage. This cost increases with the complexity and size of the BIM model being processed, particularly when large amounts of data are involved. Additionally, there is a token limit per request, which restricts the amount of data that can be processed in a single API call. This constraint can reduce the software’s ability to handle larger BIM models effectively, as it limits the comprehensiveness of the LCA dataset matching process. Furthermore, the use of LLMs is associated with high energy consumption, particularly when compared to the manual matching process. It is estimated that the energy use of data centres will rise above 1000 TWh in 2026 due to the rapid growth of AI [47]. For context, Germany’s total electricity consumption in 2024 was 464 TWh [48]. To further illustrate the scale, a single Google search consumes around 0.3 Wh of electricity, whereas a ChatGPT request consumes around 2.9 Wh [47]. Therefore, despite the significant improvements in speed achieved with the developed tool, it is important to recognize that its use is associated with substantial energy consumption.
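To relate the cited energy figures to the case study, the following back-of-the-envelope calculation may help. It assumes, purely for illustration, that each of the 55 API requests of the case study consumes as much energy as the cited ChatGPT request (2.9 Wh); the actual consumption of GPT-4o mini requests was not measured in this study and may differ considerably.

```python
# Figures cited in the text
wh_per_llm_request = 2.9        # ChatGPT request [47]
wh_per_web_search = 0.3         # Google search [47]
data_centre_twh_2026 = 1000     # estimated data centre energy use [47]
germany_twh_2024 = 464          # Germany's electricity consumption [48]
requests_in_case_study = 55     # requests generated for Duplex_A

# Illustrative estimates (assumption: 2.9 Wh applies to each API request)
case_study_wh = requests_in_case_study * wh_per_llm_request   # about 160 Wh
request_vs_search = wh_per_llm_request / wh_per_web_search    # about 9.7
data_centres_vs_germany = data_centre_twh_2026 / germany_twh_2024  # about 2.2

print(f"Estimated energy for one full pass: {case_study_wh:.1f} Wh")
print(f"LLM request vs. web search: {request_vs_search:.1f}x")
print(f"Data centres (2026) vs. Germany (2024): {data_centres_vs_germany:.1f}x")
```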

Another issue is the incompleteness of the LCA datasets of ÖKOBAUDAT. While the database provides a broad range of LCA datasets, it does not cover all possible construction components. Without full data coverage, certain construction elements may not have corresponding LCA datasets, leading to gaps in the environmental assessment. This limitation can affect the accuracy and reliability of the LCA dataset matching process, hindering a fully automated and comprehensive BIM-LCA solution.

Furthermore, there is no publicly accessible API available for the used database. Consequently, the database CSV file was used as an alternative, which also poses limitations. The structure of the CSV file often includes duplicate entries for the same LCA dataset, which can complicate the matching process and unnecessarily increase token usage. Additionally, the CSV format itself is not ideal for handling large and complex datasets due to its lack of data density. The original database contains much richer and more detailed information than the simplified CSV files provided by the database, leading to potential mismatches.
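Duplicate entries of the kind described above can be removed before the data are passed to the model. The sketch below keeps the first row per dataset identifier using Python's standard `csv` module. The column name `UUID`, the semicolon delimiter and the sample rows are assumptions made for this example and should be checked against the actual export; keeping only the first row is appropriate only for the matching step, where the dataset name is needed, and not for the later impact calculation.

```python
import csv
import io


def deduplicate_rows(csv_text, key_column="UUID", delimiter=";"):
    """Keep the first row for every value of key_column.

    key_column and delimiter are assumptions about the file layout.
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    seen = set()
    unique_rows = []
    for row in reader:
        key = row[key_column]
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Invented sample rows for illustration
sample = (
    "UUID;Name;Module\n"
    "a1;Concrete C30/37;A1-A3\n"
    "a1;Concrete C30/37;C3\n"
    "b2;Gypsum Plaster;A1-A3\n"
)
for row in deduplicate_rows(sample):
    print(row["UUID"], row["Name"])
```

Since every duplicate row would otherwise be sent to the model as context, removing such rows directly reduces the number of tokens per request.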

A further limitation, not only of the matching software developed in this manuscript but also of similar tools, is that many existing buildings lack “as-is” BIM models. Therefore, comprehensive and up-to-date digital representations that reflect the current conditions of these structures are often missing. This gap limits the immediate applicability of BIM-based automation tools, whose performance and accuracy heavily depend on input data quality.

A further aspect of classification systems and matching processes that should be examined in more detail in future studies concerns the input data. Within the scope of this study, the proposed matching process relies exclusively on semantic metadata (e.g., element name, type, property sets). While this can be effective in well-structured models, the omission of geometric data gives rise to potential limitations, especially in cases of ambiguity or incomplete information.

Finally, the lack of transparency regarding how the OpenAI model assesses similarity between BIM elements and LCA datasets is a further limitation. The internal scoring mechanism used by the AI to generate similarity values is not disclosed, which reduces the transparency of the software. This lack of insight into the AI’s decision-making process makes it difficult to interpret why certain matches are deemed more appropriate than others, ultimately reducing the user’s confidence in the system’s results.

In addition, compared to more specialized NLP-based models, such as the approach by Forth et al. [35], the use of a general-purpose model, such as those offered by OpenAI, involves some trade-offs. While general-purpose models are easy to implement, flexible, and require no domain-specific training data, they provide less control and interpretability, and potentially lower matching accuracy, than fine-tuned models trained specifically for BIM-LCA tasks. For this study, the general-purpose LLM was chosen to enable rapid prototyping and proof-of-concept testing, but future work should examine domain-specific models that may deliver higher precision and transparency. Some of these drawbacks could potentially be mitigated by introducing simple rule-based checks to filter out implausible matches, or by lightly adapting the model with a small set of domain-specific examples. Such measures could improve reliability without losing the ease of use.
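A rule-based check of the kind mentioned above could look as follows. This is a minimal sketch under stated assumptions: the rule table `EXPECTED_KEYWORDS`, its entries and the function names are hypothetical, and which dataset keywords count as plausible for a given material is a domain decision that would have to be made by an LCA expert.

```python
import re


def tokens(text):
    """Lower-case alphabetic tokens of a material or dataset name."""
    return set(re.findall(r"[a-z]+", text.lower()))


# Hypothetical rule table: material keyword -> keywords accepted in dataset names.
# Assumption: gypsum-based datasets are treated as plausible for plasterboard.
EXPECTED_KEYWORDS = {
    "plasterboard": {"plasterboard", "gypsum"},
    "concrete": {"concrete"},
}


def is_plausible(material, dataset_name):
    """Return False if a rule applies and the dataset name violates it."""
    material_tokens = tokens(material)
    dataset_tokens = tokens(dataset_name)
    for keyword, accepted in EXPECTED_KEYWORDS.items():
        if keyword in material_tokens:
            return bool(accepted & dataset_tokens)
    return True  # no rule applies: leave the decision to the user


print(is_plausible("Plasterboard", "Multi-layer parquet"))   # rule violated
print(is_plausible("Plasterboard", "Plasterboard/Tool"))     # rule satisfied
```

Matches flagged as implausible could be routed to the interactive mode instead of being accepted automatically, which keeps the rule set small while still catching mismatches such as the plasterboard layer assigned to a parquet dataset.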

To address these limitations, future studies need to explore alternative AI models with higher token limits or design more efficient data compression methods, which could help alleviate the constraints caused by high API usage costs and token limitations. Furthermore, integrating alternative LCA databases would improve data coverage and the reliability of the matching process, which would also broaden the scope by opening the system up to other sectors beyond the construction sector. Promoting the development of a publicly accessible API for the used database or refining the CSV file to minimize duplicate entries and increase data density would also facilitate more efficient and accurate data handling. Additionally, future research needs to focus on improving the transparency of the AI model’s similarity assessment by creating custom scoring mechanisms that provide clearer insights into the matching criteria. Such advancements would not only enhance user confidence but also promote wider adoption of the BIM-based AI-driven LCA dataset matching software.

Future research

To further improve the matching process and reduce reliance on user-generated descriptions, additional fine-tuning of the model should be a focus of future research. A further key improvement would be to make the API of the LCA database (ÖKOBAUDAT) publicly accessible, as this would enable direct access to the most current data and ensure higher information availability. Furthermore, to increase the international applicability of the software, the code could be extended to integrate with other global publicly available LCA databases. An example of such a database is the Embodied Carbon in Construction Calculator (EC3), which digitizes print construction LCA datasets and publishes them on a standardized, open-access platform [49]. This would allow the software to support a variety of databases, making it a versatile tool for environmental assessments worldwide.

Another important prospect is the development of a plug-in for Autodesk® Revit® based on the existing code. This would facilitate seamless integration of the software into common BIM workflows, enabling users to conduct LCA dataset matching directly within the BIM environment. Further refinement of the exclusion criteria for BIM elements would also optimize the matching process. Adjustments should be made to include or exclude specific elements based on the use case to ensure that all relevant components are considered in the matching process. Furthermore, a formal validation protocol could be developed to systematically assess parameter coverage, detect unit mismatches, and quantify unmapped elements, providing an additional layer of quality control and further enhancing the robustness and transparency of the workflow.

For construction elements with low LOINs, the software could be extended to display the LCA datasets with the highest similarity scores, allowing users to choose from these options. Furthermore, the software could recommend LCA datasets based on the environmental impacts to guide the decision-making process more effectively and learn from user preferences over time. Establishing clear distinctions for different system boundaries is also crucial to ensure accurate environmental assessments. Future research should focus on implementing these boundaries as customizable options within the software.

A new functionality could be developed for construction elements composed of multiple individual components that exist in the ÖKOBAUDAT database but not as a combined element. This feature could enable the selection of such components and the summation of their individual environmental impact values, thereby providing a representation of the entire element. Additionally, a calculation feature should be introduced, where the functional unit of the LCA dataset is first queried, and then the value of this unit is extracted from the construction element in the IFC file. This would enable direct calculation of emissions, thereby improving the utility of the software.
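The proposed calculation feature, in which the functional unit of each dataset is compared with the quantity extracted from the IFC file and the component impacts are summed, can be sketched as follows. The field names, the function name `element_gwp` and all numerical values are placeholders chosen for illustration; they are not ÖKOBAUDAT values and do not describe an existing feature of the tool.

```python
def element_gwp(layers):
    """Sum the GWP of all layers of one construction element.

    Each layer is a dict with:
      quantity      amount taken from the IFC element (e.g. net volume)
      unit          unit of that quantity
      dataset_unit  functional unit of the matched LCA dataset
      gwp_per_unit  GWP in kg CO2 eq. per functional unit
    """
    total = 0.0
    for layer in layers:
        if layer["unit"] != layer["dataset_unit"]:
            # A unit mismatch would require a conversion (e.g. via density).
            raise ValueError(
                f"Unit mismatch: {layer['unit']} vs {layer['dataset_unit']}"
            )
        total += layer["quantity"] * layer["gwp_per_unit"]
    return total


# Placeholder values for a two-layer wall (not taken from any database)
wall_layers = [
    {"quantity": 2.0, "unit": "m3", "dataset_unit": "m3", "gwp_per_unit": 200.0},
    {"quantity": 10.0, "unit": "m2", "dataset_unit": "m2", "gwp_per_unit": 3.0},
]
print(f"{element_gwp(wall_layers):.1f} kg CO2 eq.")
```

Raising an error on a unit mismatch rather than silently multiplying incompatible quantities makes unmapped or wrongly scaled elements visible, in line with the validation protocol suggested for future work.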

A further aspect that needs to be addressed in future research is the quantification of the time savings and error reduction achieved when datasets are selected with the solution proposed by the authors rather than manually (status quo). Moreover, with regard to the costs associated with the tool, moving towards modularity in the software could enable a more flexible plug-and-play design, which in turn would facilitate the use of local instances of open-source LLMs such as Large Language Model Meta AI (LLaMA) and make it possible to dynamically select models at different points on the cost-complexity spectrum of LLMs. Additionally, integrating a custom GPT with the selected environmental database could significantly reduce the number of tokens needed per query.

Implementing these recommendations would significantly enhance the functionality and versatility of the BIM-based AI-driven LCA dataset matching software, making it a powerful tool for sustainable construction design on a global scale.

Conclusions

The research presented in this study demonstrates a well-structured and methodologically consistent progression from conceptual foundations to empirical validation. Each research component is clearly defined and logically connected, creating a coherent narrative that aligns with the overarching objective: the development and validation of an AI-driven matching tool that links Building Information Modelling (BIM) elements with Life Cycle Assessment (LCA) datasets to enhance automation, interoperability, and sustainability assessment efficiency within the construction sector.

Based on identified persistent challenges in integrating LCA into early design stages, the research gap is defined: the absence of open-access, AI-supported tools capable of automating the matching between BIM elements and LCA datasets. The systematic literature review functions as a critical research component that not only maps the current state of knowledge but also lays the conceptual groundwork for the proposed solution. The review’s structure — comprising identification, screening, and eligibility phases — ensures methodological transparency. It highlights that existing BIM–LCA tools either lack automation, rely heavily on manual intervention, or are inaccessible due to proprietary restrictions. The synthesis of the reviewed studies leads to the formulation of three key challenges: the degree of automation, data exchange and interoperability, and the level of information need (LOIN). The methodology is clearly aligned with these identified challenges and is designed to provide a proof-of-concept for addressing them. The three-step framework — data acquisition, data input, and data analysis — translates the theoretical discussion into a practical, operational workflow. The choice of the IFC standard for BIM data exchange directly addresses interoperability, while the integration of the GPT-4o mini model for semantic matching tackles the automation gap. Likewise, the consideration of different levels of detail in BIM models and their corresponding influence on LOIN ensures methodological adaptability across varying project phases. The explicit differentiation between interactive and automatic matching modes strengthens the study’s internal logic by allowing the evaluation of both user-guided precision and fully automated efficiency, thus empirically addressing the dual focus of accuracy and time savings posed in the research questions.

The following testing and validation components serve as a direct operationalization of the methodological framework. The use of controlled hypothetical IFC elements and a real-world case study (Duplex_A) enables both experimental rigor and practical applicability. The comparative analysis between manual and automated matching processes demonstrates clear time efficiencies, confirming one of the core hypotheses regarding reduced manual effort. Moreover, the inclusion of expert validation and multiple test iterations introduces a qualitative layer of assessment, supporting the reliability of the findings. The results, while modest in accuracy (66%), provide substantial empirical evidence of feasibility and confirm that the proposed approach significantly enhances efficiency in LCA dataset matching.

The discussion section is deeply interwoven with the preceding empirical findings and the theoretical framework established earlier in the paper. It revisits the main challenges identified in the literature (automation, interoperability, and LOIN) and evaluates how effectively the developed tool addresses each. The reflection on AI interpretability, model transparency, and the balance between user control and automation demonstrates an understanding of the broader implications of integrating LLMs into sustainability assessment workflows. Importantly, this section maintains conceptual alignment with the study’s original objectives by interpreting the findings not merely as technical achievements but as steps toward fostering interdisciplinary collaboration between sustainability science and computational design. The inclusion of limitations related to dataset structure and the lack of API access demonstrates awareness of the external dependencies affecting interoperability, thereby closing the loop with one of the initially identified research challenges. This paper also seeks to draw the interest of sustainability experts to AI technologies that could improve workflow design efficiency, while offering computer scientists a deeper understanding of the challenges associated with sustainability assessment.

Connecting these two fields creates potential for synergies and encourages collaborative solutions. While a first automated and open-access matching between a planning tool and LCA data has been established, several challenges remain, and the future approaches outlined above will help bring this first approach to broader application and further optimize the developed AI system.

Author contributions

D.P.: investigation, data curation, formal analysis, methodology, writing—original draft, writing—review and editing, visualization. P.H.: conceptualization, data curation, methodology, writing—review and editing, funding acquisition, project administration, resources, supervision, validation. J.G.B.: conceptualization, methodology, writing—review and editing, funding acquisition, supervision. D.C.: conceptualization, data curation, methodology, resources, supervision, validation, visualization, writing—review and editing. J.B. & M.T.: funding acquisition, project administration, resources, supervision, writing—review and editing.

Funding

Open Access funding enabled and organized by Projekt DEAL. This contribution is collaborative research between subprojects B01 and C02 of the CRC/TRR 339, Project ID 453596084, funded by the German Research Foundation (DFG). The financial support by the DFG is gratefully acknowledged.

Data availability

The datasets generated and/or analysed during the current study are available in the GitHub repository, https://github.com/DavidCrampen/EPDMatching.git

Declarations

Ethics approval and Consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Abbreviations

AI: Artificial intelligence
API: Application programming interface
BIM: Building information modelling
CO₂: Carbon dioxide
CSV: Comma-separated values
DIN: Deutsches Institut für Normung e.V.
EC3: Embodied carbon in construction calculator
EPD: Environmental product declaration
ETL: Extract, transform, load
FME: Feature manipulation engine
GWP: Global warming potential
IFC: Industry foundation classes
ISO: International Organization for Standardization
LCA: Life cycle assessment
LLM: Large language model
LOD: Level of development
LOIN: Level of information need
ML: Machine learning
MMLU: Massive multitask language understanding
NLP: Natural language processing
RAG: Retrieval-augmented generation

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. United Nations Environment Programme, & Global Alliance for Buildings and Construction, Global Status Report for Buildings and Construction–Beyond foundations: Mainstreaming sustainable solutions to cut emissions from the buildings sector, United Nations Environment Programme, 2024.

2. Chen, Z; Chen, L; Zhou, X; Huang, L; Sandanayake, M; Yap, PS. Recent technological advancements in BIM and LCA integration for sustainable construction: a review. Sustainability; 2024; [DOI: https://dx.doi.org/10.3390/su16031340]

3. International Energy Agency and the United Nations Environment Programme, 2018 Global Status Report: towards a zero‐emission, efficient and resilient buildings and construction sector.

4. Soust-Verdaguer, B; Llatas, C; García-Martínez, A. Simplification in life cycle assessment of single-family houses: a review of recent developments. Build Environ; 2016; 103, pp. 215-227. [DOI: https://dx.doi.org/10.1016/j.buildenv.2016.04.014]

5. Cavalliere, C; Habert, G; Dell’Osso, GR; Hollberg, A. Continuous BIM-based assessment of embodied environmental impacts throughout the design process. J Clean Prod; 2019; 211, pp. 941-952. [DOI: https://dx.doi.org/10.1016/j.jclepro.2018.11.247]

6. Mesa, JA; Fúquene, CE; Maury-Ramírez, A. Life cycle assessment on construction and demolition waste: a systematic literature review. Sustainability; 2021; 13, 14 7676. [DOI: https://dx.doi.org/10.3390/su13147676]

7. Nwodo, MN; Anumba, CJ. A review of life cycle assessment of buildings using a systematic approach. Build Environ; 2019; [DOI: https://dx.doi.org/10.1016/J.BUILDENV.2019.106290]

8. Sibiude G, Lasvaux S, Lebert A, Nibel S, Peuportier B, Bonnet R, Senegas J, Raquin T. Survey on LCA results analysis, interpretation and reporting in the construction sector, Barcelone, 2017.

9. DIN EN ISO 14040, Umweltmanagement–Ökobilanz–Grundsätze und Rahmenbedingungen (ISO 14040:2006 + Amd 1:2020); Deutsche Fassung EN ISO 14040:2006 + A1:2020 2021.

10. DIN EN ISO 14044, Umweltmanagement – Ökobilanz – Anforderungen und Anleitungen (ISO 14044:2006 + Amd 1:2017 + Amd 2:2020); Deutsche Fassung EN ISO 14044:2006 + A1:2018 + A2:2020 2021.

11. Hollberg A. Parametric life cycle assessment: introducing a time-efficient method for environmental building design optimization. Dissertation, Bauhaus-Universitätsverlag Weimar, Weimar, 2017.

12. Meex, E; Hollberg, A; Knapen, E; Hildebrand, L; Verbeeck, G. Requirements for applying LCA-based environmental impact assessment tools in the early stages of building design. Build Environ; 2018; 133, pp. 228-236. [DOI: https://dx.doi.org/10.1016/j.buildenv.2018.02.016]

13. CEN, Sustainability of construction works–assessment of environmental performance of buildings–calculation method 91.040.99, 2012.

14. Almeida, R; Chaves, L; Silva, M; Carvalho, M; Caldas, L. Integration between BIM and EPDs: evaluation of the main difficulties and proposal of a framework based on ISO 19650:2018. J Build Eng; 2023; 68, 106091. [DOI: https://dx.doi.org/10.1016/J.JOBE.2023.106091]

15. Arenas, NF; Shafique, M. Recent progress on BIM-based sustainable buildings: state of the art review. Dev Built Environ; 2023; 15, 100176. [DOI: https://dx.doi.org/10.1016/j.dibe.2023.100176]

16. Basbagill, J; Flager, F; Lepech, M; Fischer, M. Application of life-cycle assessment to early stage building design for reduced embodied environmental impacts. Build Environ; 2013; 60, pp. 81-92. [DOI: https://dx.doi.org/10.1016/J.BUILDENV.2012.11.009]

17. Bueno, C; Fabricio, MM. Comparative analysis between a complete LCA study and results from a BIM-LCA plug-in. Autom Constr; 2018; 90, pp. 188-200. [DOI: https://dx.doi.org/10.1016/J.AUTCON.2018.02.028]

18. Bueno, C; Pereira, LM; Fabricio, MM. Life cycle assessment and environmental-based choices at the early design stages: an application using building information modelling. Archit Eng Des Manag; 2018; 14, pp. 332-346. [DOI: https://dx.doi.org/10.1080/17452007.2018.1458593]

19. Hollberg, A; Genova, G; Habert, G. Evaluation of BIM-based LCA results for building design. Autom Constr; 2020; 109, 102972. [DOI: https://dx.doi.org/10.1016/j.autcon.2019.102972]

20. Obrecht, TP; Röck, M; Hoxha, E; Passer, A. BIM and LCA integration: a systematic literature review. Sustainability; 2020; [DOI: https://dx.doi.org/10.3390/SU12145534]

21. Sobhkhiz, S; Taghaddos, H; Rezvani, M; Ramezanianpour, AM. Utilization of semantic web technologies to improve BIM-LCA applications. Autom Constr; 2021; [DOI: https://dx.doi.org/10.1016/J.AUTCON.2021.103842]

22. Bundesministerium für Verkehr und digitale Infrastruktur, Stufenplan Digitales Planen und Bauen Einführung moderner, IT-gestützter Prozesse und Technologien bei Planung, Bau und Betrieb von Bauwerken 2015.

23. Boje, C; Hahn Menacho, ÁJ; Marvuglia, A; Benetto, E; Kubicki, S; Schaubroeck, T; Navarrete Gutiérrez, T. A framework using BIM and digital twins in facilitating LCSA for buildings. J Build Eng; 2023; 76, 107232. [DOI: https://dx.doi.org/10.1016/j.jobe.2023.107232]

24. Siverio Lima, MS; Duarte, S; Exenberger, H; Fröch, G; Flora, M. Integrating BIM-LCA to enhance sustainability assessments of constructions. Sustainability; 2024; [DOI: https://dx.doi.org/10.3390/su16031172]

25. Ajayi, SO; Oyedele, LO; Ceranic, B; Gallanagh, M; Kadiri, KO. Life cycle environmental performance of material specification: a BIM-enhanced comparative assessment. Int J Sustain Build Technol Urban Dev; 2015; 6, pp. 14-24. [DOI: https://dx.doi.org/10.1080/2093761X.2015.1006708]

26. Najjar, M; Figueiredo, K; Palumbo, M; Haddad, A. Integration of BIM and LCA: evaluating the environmental impacts of building materials at an early stage of designing a typical office building. J Build Eng; 2017; 14, pp. 115-126. [DOI: https://dx.doi.org/10.1016/J.JOBE.2017.10.005]

27. Theißen, S; Höper, J; Drzymalla, J; Wimmer, R; Markova, S; Meins-Becker, A; Lambertz, M. Using open BIM and IFC to enable a comprehensive consideration of building services within a whole-building LCA. Sustainability; 2020; [DOI: https://dx.doi.org/10.3390/SU12145644]

28. Crippa, J; Boeing, LC; Caparelli, APA; Costa, M; Scheer, S; Araujo, AM; Bem, D. A BIM–LCA integration technique to embodied carbon estimation applied on wall systems in Brazil. Built Environ Proj Asset Manag; 2018; [DOI: https://dx.doi.org/10.1108/BEPAM-10-2017-0093]

29. Jrade, A; Abdulla, R. Integrating building information modeling and life cycle assessment tools to design sustainable buildings. In: Proceedings of the CIB W78 2012: 29th international conference–Beirut, 2012.

30. Płoszaj-Mazurek, M; Ryńska, E. Artificial intelligence and digital tools for assisting low-carbon architectural design: merging the use of machine learning, large language models, and building information modeling for life cycle assessment tool development. Energies; 2024; [DOI: https://dx.doi.org/10.3390/en17122997]

31. Egwim, CN; Alaka, H; Demir, E; Balogun, H; Olu-Ajayi, R; Sulaimon, I; Wusu, G; Yusuf, W; Muideen, AA. Artificial Intelligence in the construction industry: a systematic review of the entire construction value chain lifecycle. Energies; 2024; [DOI: https://dx.doi.org/10.3390/en17010182]

32. Teng, Y; Xu, J; Pan, W; Zhang, Y. A systematic review of the integration of building information modeling into life cycle assessment. Build Environ; 2022; 221, 109260. [DOI: https://dx.doi.org/10.1016/J.BUILDENV.2022.109260]

33. Karunarathna, I; Alvis, K; Gunasena, P; Hapuarachchi, T; Ekanayake, U; Gunawardana, K; Aluthge, P; Gunathilake, S; Bandara, S; Jayawardana, A. Mastering the literature review: a framework for scholars, 2024.

34. Hermann, F; Max, PC; Kunz, M. Automatisierte, BIM-basierte Ökobilanzierung am Beispiel des Infrastrukturbaus. Bautechnik; 2024; 101, pp. 128-133. [DOI: https://dx.doi.org/10.1002/bate.202300115]

35. Forth, K; Abualdenien, J; Borrmann, A. NLP-based semantic model healing for calculating LCA in early building design stages. In: Scherer, RJ; Sujan, SF; Hjelseth, E (eds). ECPPM 2022–eWork and eBusiness in architecture, engineering and construction 2022; 2023; London, CRC Press: pp. 77-84. [DOI: https://dx.doi.org/10.1201/9781003354222-10]

36. Forth, K; Abualdenien, J; Borrmann, A. Calculation of embodied GHG emissions in early building design stages using BIM and NLP-based semantic model healing. Energy Build; 2023; 284, 112837. [DOI: https://dx.doi.org/10.1016/j.enbuild.2023.112837]

37. Akbari, S; Sheikhkhoshkar, M; Rahimian, FP; El Haouzi, HB; Najafi, M; Talebi, S. Sustainability and building information modelling: integration, research gaps, and future directions. Autom Constr; 2024; [DOI: https://dx.doi.org/10.1016/J.AUTCON.2024.105420]

38. Utkucu, D; Sözer, H. Interoperability and data exchange within BIM platform to evaluate building energy performance and indoor comfort. Autom Constr; 2020; [DOI: https://dx.doi.org/10.1016/J.AUTCON.2020.103225]

39. Yang, X; Hu, M; Wu, J; Zhao, B. Building-information-modeling enabled life cycle assessment, a case study on carbon footprint accounting for a residential building in China. J Clean Prod; 2018; 183, pp. 729-743. [DOI: https://dx.doi.org/10.1016/J.JCLEPRO.2018.02.070]

40. BIMForum, Level of development (LOD) specification, 2018. http://bimforum.org/lod/ (accessed 14 July 2025).

41. Jia, J; Ma, H; Zhang, Z. Integration of industry foundation classes and ontology: data, applications, modes, challenges, and opportunities. Buildings; 2024; [DOI: https://dx.doi.org/10.3390/BUILDINGS14040911]

42. ÖKOBAUDAT, ÖKOBAUDAT sustainable construction information portal. https://www.oekobaudat.de/en.html (accessed 15 July 2025).

43. OpenAI, GPT-4o mini: advancing cost-efficient intelligence, 2024. https://openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence/.

44. Li, Y. Python data analysis and attribute information extraction method based on intelligent decision system. Mob Inf Syst; 2022; [DOI: https://dx.doi.org/10.1155/2022/2495166]

45. Shashanth, A; Patil, P; Surya, KM; Bhat, SV; Kumar, KK. Social media assisting platform using sentiment analysis. Indiana J Multidiscip Res; 2024; [DOI: https://dx.doi.org/10.5281/zenodo.12679682]

46. National Institute of Building Sciences, National Institute of Building Sciences. https://nibs.org/ (accessed 10 July 2025).

47. Håkans, O. The environmental impact of using Artificial Intelligence: exploring the environmental impact of ChatGPT: a literature study and user perception analysis. Degree programme in information and communication technology, 2025.

48. Statista, Nettostromverbrauch in Deutschland in den Jahren 1991 bis 2024, 2025. https://de.statista.com/statistik/daten/studie/164149/umfrage/netto-stromverbrauch-in-deutschland-seit-1999/ (accessed 9 July 2025).

49. Building Transparency, Building a better future, together. https://www.buildingtransparency.org/ (accessed 9 July 2025).


© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
