Introduction
Current data-driven artificial intelligence (AI) has achieved remarkable success across various fields1–8, primarily through model training and evaluation using datasets9–17. However, most datasets contain inherent biases, leading models to learn and exploit unintended task-correlated features or shortcuts, a phenomenon referred to as shortcut learning18–22. This issue undermines the assessment of AI models’ true capabilities, limits our understanding of their underlying mechanisms, and hinders their explainability and robust deployment in critical areas such as healthcare and autonomous driving. As illustrated in Fig. 1a, both humans and AI models may rely on these unintended features when evaluated using such biased datasets, resulting in biased assessments that reflect models’ preferences rather than their true abilities. Given the importance of trustworthy AI applications, developing a shortcut-free evaluation methodology is vital, yet it poses a considerable challenge.
Fig. 1
Reliable evaluation of the capabilities of AI models on shortcut-free datasets.
a When individuals or AI models learn from datasets containing shortcuts, they may rely on features other than the intended one to recognize the same samples, leading to misleading evaluation results. In contrast, when learning from shortcut-free datasets, different individuals or AI models will use only the intended feature to recognize the same samples, thus producing reliable evaluation results. b The curse of shortcuts comprises two challenges. The first is covering all possible shortcut features, because the number of features in high-dimensional data grows exponentially with the data dimension. The second is intervening in the covered shortcut features: the overall label is coupled with local features, so intervening in a local feature inevitably affects the overall label. c SHL includes a model suite composed of models with different inductive biases and learns the SH of a high-dimensional dataset through the intersection of the features learned by each model. The more models in the suite, the more accurately the SH is learned. Diversity in the models’ inductive biases significantly accelerates SH learning, and directly learning the SH avoids intervening in the data’s features, thereby addressing both challenges mentioned in (b).
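The intersection step described in (c) can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's implementation: the function name, the representation of each model's learned features as a plain set of labels, and the toy feature names are all assumptions introduced here for exposition.

```python
# Illustrative sketch (assumed, not SHL's actual implementation):
# approximate the shortcut hull (SH) as the intersection of the
# feature sets learned by models with different inductive biases.

def approximate_shortcut_hull(model_feature_sets):
    """Intersect the features learned by each model in the suite.

    model_feature_sets: list of sets, one per model, each holding the
    (hypothetical) features that model relies on for the task.
    """
    if not model_feature_sets:
        return set()
    hull = set(model_feature_sets[0])
    for features in model_feature_sets[1:]:
        hull &= features  # each added model can only shrink the estimate
    return hull

# Toy example: three models with different inductive biases each latch
# onto a different mix of features; only the shared ones survive.
cnn_features = {"texture", "shape", "background"}
vit_features = {"shape", "background", "color"}
mlp_features = {"shape", "background", "pixel_stats"}

hull = approximate_shortcut_hull([cnn_features, vit_features, mlp_features])
print(sorted(hull))  # ['background', 'shape']
```

The monotonicity visible in the loop mirrors the caption's point that adding more models to the suite makes the SH estimate more accurate: each additional model can only remove features from the intersection, never add spurious ones.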