Abstract
The full text may take 40-120 seconds to translate; larger documents may take longer.
Entity resolution is essential for data integration, facilitating analytics and insights from complex systems. Multi-source and incremental entity resolution address the challenges of integrating diverse and dynamic data, which is common in real-world scenarios. A critical question is how to classify matches and non-matches among record pairs from new and existing data sources. Traditional threshold-based methods often yield lower quality than machine learning (ML) approaches, while incremental methods may lack stability depending on the order in which new data is integrated. Additionally, reusing training data and existing models for new data sources is unresolved for multi-source entity resolution. Even the approach of transfer learning does not consider the challenge of which source domain should be used to transfer model and training data information for a certain target domain. Naive strategies for training new models for each new linkage problem are inefficient. This work addresses these challenges and focuses on creating as well as managing models with a small labeling effort and the selection of suitable models for new data sources based on feature distributions. The results of our method StoRe demonstrate that our approach achieves comparable qualitative results. Regarding efficiency, StoRe outperforms both a multi-source active learning and a transfer learning approach, achieving efficiency improvements of up to 48 times faster than the active learning approach and by a factor of 163 compared to the transfer learning method.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer