Data Min Knowl Disc (2015) 29:626–688 DOI 10.1007/s10618-014-0365-y
Graph based anomaly detection and description: a survey
Leman Akoglu · Hanghang Tong · Danai Koutra
Received: 5 April 2013 / Accepted: 12 June 2014 / Published online: 5 July 2014 © The Author(s) 2014
Abstract Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, with graph data becoming ubiquitous, techniques for structured graph data have been of focus recently. As objects in graphs have long-range correlations, a suite of novel technology has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we give a general framework for the algorithms categorized under various settings: unsupervised versus (semi-)supervised approaches, for static versus dynamic graphs, for attributed versus plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the "why", of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field.
Responsible editor: G. Karypis.
L. Akoglu (B)
Department of Computer Science, Stony Brook University, Stony Brook, NY 11794, USA e-mail: [email protected]
H. Tong
Department of Computer Science, City College, City University of New York, New York, NY 10031, USA e-mail: [email protected]
D. Koutra
Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15217, USA e-mail: [email protected]
Keywords Anomaly detection · Graph mining · Network anomaly detection · Event detection · Change point detection · Fraud detection · Anomaly description · Visual analytics
1 Introduction
When analyzing large and complex datasets, knowing what stands out in the data is often at least as important and interesting as, if not more so than, learning about its general structure. The branch of data mining concerned with...