Abstract - Ablation studies have been widely used in the field of neuroscience to uncover structure and organization in complex biological systems. In this paper, we transfer the principle of ablation studies to two types of trained artificial neural networks in order to investigate the structure of their learned representations. We found that features distinct to the local and global structure of the training data are selectively, and sometimes redundantly, represented in specific parts of the network. Further, we found that the importance of these specific parts for the learning task can be determined from the distribution of incoming weights of single units. Finally, we found that the network's class-specific accuracy can be partly increased after training via ablations.
Keywords: Ablations, Artificial Neural Networks, Learning Representations, AI Transparency, Explainable AI
1 Introduction
Recent research on deep learning has brought forth a number of remarkable applications for different problems in a variety of domains. Prominent examples are visual object recognition, object detection and semantic segmentation in the field of computer vision [1-5], speech recognition and speech separation in the field of natural language processing [6-10], and self-learning agents based on deep reinforcement learning for video games [11-14], classic board games [15-17], as well as locomotion and robotic control [18-23]. During the last few years, the strong increase in the availability of computational resources, combined with new computing paradigms such as GPU programming [1] and asynchronous methods for training deep neural networks (DNNs) [24, 25], has led to an increase in the average size, i.e., the number of trainable parameters, of state-of-the-art DNNs. This development enabled the use of more complex algorithms and brute-force methods, so the main research focus has been placed on increasing the performance and speed of trained networks on specific benchmarks. Meanwhile, the development of new methods and perspectives for a deeper understanding of the structure of the learned representations in these complex networks has been largely neglected.
In this paper, we follow a neuroscience-inspired approach based on the idea of ablation studies to analyze the structure of learned representations in DNNs. In such studies, neural tissue is damaged in a controlled manner, and the effect of the inflicted damage on the brain's capability to perform a specific...
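Transferred to an artificial network, an ablation amounts to silencing a unit in a trained model and measuring how the network's behavior changes. The following is a minimal sketch of this idea, not the paper's actual procedure: it uses a toy two-layer network with random stand-in weights (no real training) and measures, for each hidden unit, how often the lesioned network's predictions still agree with the intact network's. All names and the data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "trained" network: random weights stand in for learned parameters.
W1 = rng.normal(size=(4, 8))   # input -> hidden
W2 = rng.normal(size=(8, 3))   # hidden -> class scores

def forward(X, W1, W2):
    h = np.maximum(X @ W1, 0.0)  # ReLU hidden layer
    return h @ W2                # class scores

def ablate_unit(W1, W2, unit):
    """Lesion one hidden unit by zeroing its incoming and outgoing weights."""
    W1a, W2a = W1.copy(), W2.copy()
    W1a[:, unit] = 0.0
    W2a[unit, :] = 0.0
    return W1a, W2a

# Synthetic inputs; the intact network's predictions serve as the baseline.
X = rng.normal(size=(100, 4))
baseline = forward(X, W1, W2).argmax(axis=1)

# Ablate each hidden unit in turn and measure agreement with the baseline:
# units whose removal changes many predictions are more important.
for unit in range(W1.shape[1]):
    W1a, W2a = ablate_unit(W1, W2, unit)
    pred = forward(X, W1a, W2a).argmax(axis=1)
    agreement = (pred == baseline).mean()
    print(f"unit {unit}: agreement with intact network = {agreement:.2f}")
```

In a real experiment, the baseline would be the ground-truth accuracy of a properly trained model, and the drop in class-specific accuracy per ablated unit would be the quantity of interest; the loop structure, however, is the same.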