Abstract: This study proposes an AI-driven framework to automate the extraction and analysis of Customer Requirements (CRs) and Engineering Characteristics (ECs) from large-scale product review data. Traditional Quality Function Deployment (QFD) methods are labor-intensive, costly, and lack scalability and real-time responsiveness. To address these issues, the framework leverages recent advances in generative AI: instruction tuning improves task-specific comprehension, Retrieval-Augmented Generation (RAG) enhances contextual grounding, and prompt engineering ensures structured, actionable outputs. A domain-specific CR-EC dictionary aligns customer language with technical attributes, while instruction-response training improves model interpretability. The framework includes a scalable pipeline for data segmentation, inference, and post-processing. By enabling real-time demand sensing and reducing VOC collection costs, it supports agile product development and quality management. Future research will focus on validating the framework across domains and refining it based on empirical findings, contributing to AI-enabled, customer-driven innovation and automated product quality assessment.
Keywords: Quality Function Deployment; Generative AI; Customer Reviews; Product Innovation; AI Framework
1 Introduction
In the era of rapid technological change and evolving customer expectations, many companies face increasing pressure to adapt quickly and effectively to maintain competitiveness in the global market (Schaller et al., 2022). Accurately identifying Customer Requirements (CRs) and incorporating them into product development processes is essential for delivering customer-centered innovations and achieving sustainable growth (Sudirjo, 2023). One widely adopted method for translating customer needs into product specifications is Quality Function Deployment (QFD). QFD enables the systematic collection of the Voice of the Customer (VOC), maps it to Engineering Characteristics (ECs), and organizes this information into a structured relationship matrix known as the House of Quality (HOQ) (Akao, 1972). This approach allows firms to prioritize customer demands and allocate resources efficiently during early product planning. However, traditional QFD processes and related tools, such as the Kano model, rely heavily on time-consuming and costly methods, including market surveys, interviews, and focus groups. These methods often suffer from subjectivity, limited scalability, and an inability to capture real-time market trends (Shen et al., 2022; Wang & Chen, 2020). As a result, there has been growing interest in using online customer reviews as a scalable, low-cost alternative data source. These reviews provide rich, real-time feedback directly from end-users, offering insights into product performance, satisfaction, and unmet needs. Recent advances in Artificial Intelligence (AI), especially in Natural Language Processing (NLP) and deep learning, have enabled the automatic extraction of customer insights from vast amounts of unstructured text data (Al Rabaiei et al., 2021). 
More recently, the emergence of generative AI models, such as OpenAI's GPT, has opened new possibilities for understanding the semantics and context of customer language with unprecedented precision (Brynjolfsson et al., 2023). These models can be fine-tuned to generate high-quality outputs, interpret complex instructions, and extract structured information from informal textual sources such as customer reviews. Despite this progress, significant challenges remain. Current approaches often struggle with high costs in data preprocessing and lack integrated frameworks that jointly analyze CRs and ECs. In many cases, CRs and ECs are extracted independently, leading to fragmented insights and missed opportunities for product improvement. Moreover, the absence of real-time, automated systems hinders agile decision-making in Research and Development (R&D) and quality management.
To address these limitations, this study proposes an AI-based framework that leverages generative models to automate the extraction and analysis of CRs and ECs from large-scale customer review data. The proposed framework utilizes instruction tuning, Retrieval-Augmented Generation (RAG), and prompt engineering to improve accuracy, context awareness, and scalability. By doing so, it enables real-time demand sensing, optimizes resource allocation, and supports automated quality assessment. Ultimately, this approach aims to enhance product innovation, improve customer satisfaction, and strengthen competitive positioning in dynamic global markets.
2 Literature review
Quality Function Deployment
QFD has been widely adopted across various industries to develop customer-oriented products and improve product and service quality by reflecting CRs. It systematically collects the VOC and translates it into ECs, constructing a structured matrix known as the HOQ. The HOQ identifies the relationships between CRs and ECs, helps prioritize technical efforts, and ultimately contributes to enhanced customer satisfaction in product and service development. Traditionally, CRs have been collected through surveys and interviews, but these methods are often time-consuming, expensive, and subject to bias. To address this, recent studies have explored the use of online customer reviews as alternative sources for VOC data. With the advancement of AI, especially deep learning techniques, there has been increasing research on automated frameworks that analyze such review data to extract customer insights. These approaches aim to reduce the cost and time required for CR collection while improving the scalability of QFD applications.
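To make the prioritization role of the HOQ concrete, the following minimal sketch computes EC importance scores as the weighted sum of CR importance and CR-EC relationship strength. The CRs, ECs, weights, and the 9/3/1 relationship scale are illustrative assumptions (the scale is a common QFD convention, not a value taken from this study).

```python
# Hypothetical House of Quality (HOQ) example; every name and number is illustrative.
crs = {"long battery life": 5, "lightweight": 3, "fast charging": 4}  # CR -> importance weight
ecs = ["battery capacity (mAh)", "device mass (g)", "charger power (W)"]

# Relationship strengths on a 9/3/1 scale (0 = no relationship), one row per CR,
# one column per EC in the order of `ecs`.
relationships = {
    "long battery life": [9, 3, 0],
    "lightweight": [3, 9, 0],
    "fast charging": [1, 0, 9],
}


def ec_priorities(crs, ecs, relationships):
    """Return {EC: (absolute importance, relative share)} from CR weights."""
    absolute = [
        sum(weight * relationships[cr][j] for cr, weight in crs.items())
        for j in range(len(ecs))
    ]
    total = sum(absolute)
    return {ec: (score, score / total) for ec, score in zip(ecs, absolute)}


priorities = ec_priorities(crs, ecs, relationships)
for ec, (score, share) in sorted(priorities.items(), key=lambda kv: -kv[1][0]):
    print(f"{ec}: {score} ({share:.1%})")
```

Under these assumed inputs, battery capacity receives the highest score because it relates strongly to the most heavily weighted CR; in practice the weights would come from the collected VOC rather than being fixed by hand.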
Natural Language Processing and Generative AI
NLP enables computers to understand and interpret human language. Early embedding models such as Word2Vec and FastText converted linguistic inputs into numerical vectors, facilitating semantic analysis. Later advancements, including recurrent neural network models like LSTM and Seq2Seq, enhanced sentence-level understanding by incorporating sequential information. The advent of transformer-based pre-trained language models (PLMs), particularly BERT (Bidirectional Encoder Representations from Transformers), represented a significant breakthrough, improving performance on various language tasks through fine-tuning on specific domains (Reimers, 2019). Generative AI models extend NLP capabilities by not only understanding language but also generating coherent and contextually appropriate content. OpenAI's GPT (Generative Pre-trained Transformer) series exemplifies this evolution, particularly ChatGPT, which leverages Reinforcement Learning from Human Feedback (RLHF) and Proximal Policy Optimization (PPO) to align outputs closely with human expectations. This advancement allows generative models to effectively analyze unstructured text such as customer reviews, thus automating the extraction of structured CR and EC data with high accuracy and contextual relevance (Brynjolfsson et al., 2023).
3 Research framework
This study presents a comprehensive framework that leverages generative AI to extract CRs and ECs from large-scale product review data. The overall process is composed of five sequential stages, each designed to improve precision, contextual relevance, and automation capability in product quality analysis. The conceptual flow of this framework is illustrated in Figure 1. The process begins with data collection and domain-specific dictionary construction. Review data from over 1,000 product categories were gathered from reliable sources and organized into a structured dictionary that captures key expressions and technical attributes. This dictionary serves as the foundational resource for contextualizing customer input and aligning it with engineering interpretation. To improve model performance, instruction tuning was applied using tailored command-response pairs (e.g., "extract key quality attributes from the review"), enabling the model to better understand domain-specific tasks. This step enhances consistency and interpretability in the generation process. Next, a RAG mechanism was incorporated using a vector database, allowing the model to retrieve relevant knowledge in real time. This improves precision by grounding generated outputs in reliable external sources. In parallel, prompt engineering was used to define structured output formats that include justification texts, further increasing usability for decision-makers and analysts. The trained model will be evaluated across ten instruction-tuned product categories and two unseen categories to assess its generalization ability. Evaluation metrics include precision, recall, and F1-score. Pilot test results will inform iterative model refinement. Furthermore, the system's reasoning capabilities will help identify discrepancies between generated CRs and source review data. Finally, the research includes the development of a scalable, automated pipeline that handles data segmentation, inference, and post-processing.
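The segmentation, retrieval, prompting, and post-processing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the dictionary entries, the bag-of-words retriever (a stand-in for a vector database), the prompt wording, the JSON output schema, and the `stub_model` placeholder for the tuned generative model are all hypothetical.

```python
import json
import math
import re
from collections import Counter

# Hypothetical CR-EC dictionary entries: customer expression -> engineering characteristic.
CR_EC_DICTIONARY = [
    {"cr": "battery lasts long", "ec": "battery capacity"},
    {"cr": "feels heavy to carry", "ec": "device mass"},
    {"cr": "screen is bright outdoors", "ec": "display luminance"},
]
REQUIRED_KEYS = {"cr", "ec", "justification"}  # assumed output schema


def segment(review):
    """Split a review into sentence-level segments."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def bag_of_words(text):
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing tokens
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(segment_text, k=1):
    """Return the k dictionary entries most similar to the segment (toy retriever)."""
    query = bag_of_words(segment_text)
    ranked = sorted(CR_EC_DICTIONARY, key=lambda e: cosine(query, bag_of_words(e["cr"])), reverse=True)
    return ranked[:k]


def build_prompt(segment_text, context):
    """Combine the instruction, retrieved entries, and a structured-output request."""
    entries = "\n".join(f'- "{e["cr"]}" relates to {e["ec"]}' for e in context)
    return (
        "Extract key quality attributes from the review.\n"
        f"Reference entries:\n{entries}\n"
        f"Review: {segment_text}\n"
        'Answer as JSON with keys "cr", "ec", and "justification".'
    )


def stub_model(prompt):
    """Placeholder for the tuned generative model; always returns the same JSON string."""
    return json.dumps({"cr": "long battery life", "ec": "battery capacity",
                       "justification": "The reviewer comments on battery duration."})


def postprocess(raw, allowed_ecs):
    """Keep the model output only if it is well-formed JSON with a known EC."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict) or not REQUIRED_KEYS <= set(record):
        return None
    return record if record["ec"] in allowed_ecs else None


def run_pipeline(review, model=stub_model):
    allowed = {e["ec"] for e in CR_EC_DICTIONARY}
    results = []
    for seg in segment(review):
        record = postprocess(model(build_prompt(seg, retrieve(seg))), allowed)
        if record is not None:
            results.append({"segment": seg, **record})
    return results
```

Calling `run_pipeline("Great phone. The battery lasts a long time!")` processes each sentence separately; replacing `stub_model` with a call to an actual model and `retrieve` with a vector-database query would leave the surrounding control flow unchanged.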
Potential issues such as contextual retrieval errors or prompt conflicts will be addressed through refinement of the RAG logic and prompt templates. This ensures adaptability to evolving customer language and dynamic review trends, supporting agile product innovation and resource optimization.
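The evaluation metrics named in this section (precision, recall, and F1-score) can be illustrated with a short sketch. It assumes that extracted and reference CRs are compared as sets of exactly matching labels; the matching criterion and the example labels are assumptions, since the matching rule is not specified here.

```python
def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F1 for extracted versus reference labels."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical extraction result for one product category.
predicted_crs = ["battery life", "weight", "price"]
gold_crs = ["battery life", "weight", "screen brightness", "durability"]

p, r, f1 = precision_recall_f1(predicted_crs, gold_crs)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Here two of the three predicted CRs appear in the reference set of four, so precision is 2/3, recall is 1/2, and F1 is 4/7. A looser matching rule (for example, semantic similarity between labels) would change these counts and would need to be reported alongside the scores.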
4 Conclusion
This study proposes a generative AI-based framework for extracting CRs and ECs from product review data. By integrating instruction tuning, RAG, and prompt engineering, the framework aims to address the limitations of traditional QFD methods. A domain-specific CR-EC dictionary has been constructed, and command-response pairs are used to improve the model's instruction understanding and consistency. Real-time retrieval and structured outputs further enhance usability and reliability. The framework is currently under experimental validation across selected product categories. If successful, it is expected to provide significant benefits in terms of reducing VOC collection costs, supporting early-stage product planning, and enabling real-time quality analysis. Moreover, the combination of generative AI and retrieval mechanisms could offer a scalable and adaptive approach to monitoring customer feedback in dynamic markets. As this research is still in progress, future work will focus on broader validation across domains and the refinement of the system based on empirical results. The proposed framework has the potential to evolve into a practical tool for customer-driven innovation and automated quality assessment.
References and Notes
Akao, Y. (1972). New product development and quality assurance-quality deployment system. Standardization and Quality Control, 25(4), pp. 7-14.
Al Rabaiei, K., Alnajjar, F., & Ahmad, A. (2021). Kano model integration with data mining to predict customer satisfaction. Big Data and Cognitive Computing, 5(4), Article 66. https://doi.org/10.3390/bdcc5040066
Brynjolfsson, E., Li, D., & Raymond, L. (2023). Generative AI at work. National Bureau of Economic Research. Retrieved from https://www.nber.org/papers/w31161
Reimers, N. (2019). Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084.
Schaller, A.-M., Schaller, A.-A., & Vatananan-Thesenvitz, R. (2022). What are the general mechanisms that push a company to transform digitally? 2022 Portland International Conference on Management of Engineering and Technology (PICMET). IEEE.
Shen, Y., Zhou, J., Pantelous, A. A., Liu, Y., & Zhang, Z. (2022). A voice of the customer real-time strategy: An integrated quality function deployment approach. Computers & Industrial Engineering, 169, Article 108233.
Sudirjo, F. (2023). Marketing strategy in improving product competitiveness in the global market. Journal of Contemporary Administration and Management (ADMAN), 1(2), pp. 63-69.
Wang, Z., & Chen, Q. (2020). Monitoring online reviews for reputation fraud campaigns. Knowledge-Based Systems, 195, Article 105685.
Copyright The International Society for Professional Innovation Management (ISPIM) 2025