Full text

Turn on search term navigation

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Recent work has shown that deep neural networks are vulnerable to backdoor attacks. In comparison with the success of backdoor-attack methods, existing backdoor-defense methods face a lack of theoretical foundations and interpretable solutions. Most defense methods are based on experience with the characteristics of previous attacks, but fail to defend against new attacks. In this paper, we propose IBD, an interpretable backdoor-detection method via multivariate interactions. Using information theory techniques, IBD reveals how the backdoor works from the perspective of multivariate interactions of features. Based on the interpretable theorem, IBD enables defenders to detect backdoor models and poisoned examples without introducing additional information about the specific attack method. Experiments on widely used datasets and models show that IBD achieves a 78% increase in average in detection accuracy and an order-of-magnitude reduction in time cost compared with existing backdoor-detection methods.

Details

Title
IBD: An Interpretable Backdoor-Detection Method via Multivariate Interactions
Author
Xu, Yixiao  VIAFID ORCID Logo  ; Liu, Xiaolei  VIAFID ORCID Logo  ; Ding, Kangyi; Bangzhou Xin
First page
8697
Publication year
2022
Publication date
2022
Publisher
MDPI AG
e-ISSN
14248220
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2739457309
Copyright
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.