Abstract

In response to the increasing complexity and volume of patent applications, this research introduces a semiautomated system to streamline the literature review process for Indonesian patent data. The proposed system combines multilabel classification techniques built on natural language processing (NLP) algorithms. The methodology centres on an iterative, modular design, with each step visualised in detailed flowcharts. The system comprises five modules: data collection and preprocessing, multilabel classification model development, model optimisation, query and prediction, and results presentation. In experiments, the multilabel classification model achieved a micro F1 score of 0.6723 and a macro F1 score of 0.6009. The model, a OneVsRestClassifier with LinearSVC as the base classifier, performs reasonably well on a bilingual dataset of 15,097 patent documents. The optimal configuration uses TfidfVectorizer with 20,000 features, including bigrams, and a C parameter of 0.1 for LinearSVC. Performance varies across IPC classes, which indicates areas for further improvement. The discussion highlights the implications of the proposed system for researchers, patent examiners and industry professionals, for whom it facilitates efficient searches within patent databases. The study acknowledges the potential of semiautomated systems to make patent analysis more efficient, and it emphasises the need for further research on the challenges identified, such as class imbalance and performance variations across patent categories. This research paves the way for further developments in automated patent classification, aiming to improve efficiency and accuracy in international patent systems while recognising the crucial role of human experts in the patent classification process.
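The configuration reported in the abstract can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' implementation: the toy documents and IPC-style labels are hypothetical stand-ins for the 15,097-document dataset, `ngram_range=(1, 2)` is an assumed reading of "including bigrams", and the scores it prints are computed on the training documents and have no relation to the reported 0.6723 and 0.6009.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Hypothetical toy corpus mixing English and Indonesian text; the study's
# dataset is not reproduced here.
docs = [
    "battery electrode lithium cell",
    "baterai elektroda litium sel",
    "pharmaceutical composition tablet compound",
    "komposisi farmasi tablet senyawa",
    "battery coating for pharmaceutical compound storage",
    "engine piston combustion cylinder",
    "mesin piston pembakaran silinder",
]

# Hypothetical IPC-style label sets, one list per document.
labels = [
    ["H01M"],
    ["H01M"],
    ["A61K"],
    ["A61K"],
    ["A61K", "H01M"],
    ["F02B"],
    ["F02B"],
]

# Turn the label lists into a binary indicator matrix (one column per class).
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# Configuration named in the abstract: TF-IDF capped at 20,000 features with
# unigrams and bigrams (assumed), and one LinearSVC with C=0.1 per label.
model = make_pipeline(
    TfidfVectorizer(max_features=20000, ngram_range=(1, 2)),
    OneVsRestClassifier(LinearSVC(C=0.1)),
)
model.fit(docs, Y)

# Scored on the training documents purely to show the API; a real evaluation
# would use held-out data.
pred = model.predict(docs)
micro = f1_score(Y, pred, average="micro", zero_division=0)
macro = f1_score(Y, pred, average="macro", zero_division=0)
print(f"micro F1 = {micro:.4f}, macro F1 = {macro:.4f}")
```

Micro F1 pools true and false positives over all labels, so frequent classes dominate it, whereas macro F1 averages the per-class scores equally. A macro score below the micro score, as in the abstract, is consistent with weaker performance on some classes, possibly the less frequent ones.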

Details

1009240
Location
Title
Multilabel Classification of Bilingual Patents Using OneVsRestClassifier: A Semiautomated Approach
Author
Volume
16
Issue
1
Publication year
2025
Publication date
2025
Publisher
Science and Information (SAI) Organization Limited
Place of publication
West Yorkshire
Country of publication
United Kingdom
ISSN
2158-107X
e-ISSN
2156-5570
Source type
Scholarly Journal
Language of publication
English
Document type
Journal Article
ProQuest document ID
3168740464
Document URL
https://www.proquest.com/scholarly-journals/multilabel-classification-bilingual-patents-using/docview/3168740464/se-2?accountid=208611
Copyright
© 2025. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2025-02-24
Database
ProQuest One Academic