© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

In long-tail scenarios, models depend heavily on high-quality data. Information augmentation, an important class of data-centric methods, aims to improve model performance by increasing the richness and quantity of tail-class samples. However, the mechanisms that make these methods effective remain underexplored, so applying them to long-tail recognition tasks still relies on empirical and intricate fine-tuning. In this work, we jointly consider the richness gain and the distribution shift introduced by information augmentation and propose effective information gain (EIG) to explore the mechanisms behind the effectiveness of these methods. We find that information augmentation methods reach their full performance when the value of EIG appropriately balances richness gain against distribution shift. Comprehensive experiments on the long-tail benchmark datasets CIFAR-10-LT, CIFAR-100-LT, and ImageNet-LT demonstrate that using EIG to filter augmented data can further enhance model performance without any modification to the model’s architecture. Thus, alongside new model architectures, data-centric approaches also hold significant potential for long-tail recognition.

Details

Title
Tradeoffs Between Richness and Bias of Augmented Data in Long-Tail Recognition
Author
Dai, Wei 1; Ma, Yanbiao 2; Chen, Jiayi 1; Chen, Xiaohua 3; Li, Shuo 2

1 School of Telecommunications Engineering, Xidian University, Xi’an 710071, China; [email protected] (W.D.); [email protected] (J.C.)
2 School of Artificial Intelligence, Xidian University, Xi’an 710071, China; [email protected]
3 Department of Automation, Tsinghua University, Beijing 100190, China; [email protected]
First page
201
Publication year
2025
Publication date
2025
Publisher
MDPI AG
e-ISSN
1099-4300
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
3170908363