Abstract
Modern web applications often consist of hundreds of services distributed across different servers or tiers. On one hand, this architecture provides abstraction and modularity for software development and reuse. On the other hand, it makes the behavior of the system difficult to predict, as each tier has its own functionality, configuration, and demands for computing resources. Anomaly detection therefore becomes an important aspect of the management and operation of multi-tier web systems. To track their operation and aid in the analysis of their behavior, web systems expose numerous metrics in all tiers. However, collecting and analyzing every available metric degrades system performance, imposing non-negligible communication, storage, and processing overhead. Another concern is the nature of the workload of these systems, which may fluctuate widely over time. One approach to supporting anomaly detection in web systems is to use stable correlations among monitoring metrics. This approach, called correlation-based monitoring, requires neither a deep understanding of the system internals or metric semantics nor the existence of fault data. In addition, since only the metrics involved in stable correlations are periodically collected, the monitoring overhead is reduced. Stable correlations also have the desirable property of holding for long periods of time before becoming invalid due to workload fluctuations. The challenge, however, is to identify the stable correlations. In this work, we address this challenge by proposing three novel strategies based on partial correlation, a statistical tool commonly employed to summarize the relevant information of complex systems. We evaluate our strategies using traces obtained from an e-commerce web transaction benchmark deployed in our testbed. Results show that our best strategy allows the construction of a monitoring network with fewer metrics than a state-of-the-art solution while achieving greater fault coverage. They also show that the correlations are reasonably stable, and the models can be applied for sufficiently long periods of time (at least 50 times the training time) before they become invalid.
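The abstract does not describe how the partial correlations are obtained; the sketch below is only a minimal illustration (not the authors' strategies) of estimating pairwise partial correlations among monitoring metrics over a training window, using the standard precision-matrix identity. The metric names, synthetic data, and the 0.7 stability threshold are assumptions made for the example.

# Minimal sketch: pairwise partial correlations among monitoring metrics.
# Metric names, data, and the threshold below are illustrative assumptions.
import numpy as np

def partial_correlations(samples: np.ndarray) -> np.ndarray:
    """samples: (n_observations, n_metrics) window of collected metric values.
    Returns the partial correlation between every pair of metrics,
    controlling for all remaining metrics, via the inverse correlation matrix."""
    corr = np.corrcoef(samples, rowvar=False)      # plain Pearson correlations
    precision = np.linalg.pinv(corr)               # pseudo-inverse tolerates collinearity
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)            # pcorr_ij = -P_ij / sqrt(P_ii * P_jj)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

# Toy usage: three hypothetical metrics sampled over a training window.
rng = np.random.default_rng(0)
requests = rng.normal(100, 10, 500)
cpu = 0.8 * requests + rng.normal(0, 2, 500)       # driven by the request rate
disk_io = 0.5 * cpu + rng.normal(0, 2, 500)        # driven indirectly, through CPU
window = np.column_stack([requests, cpu, disk_io])

pc = partial_correlations(window)
# Metric pairs whose partial correlation stays above a threshold across
# training windows would be candidates for "stable correlations" to monitor.
stable_pairs = [(i, j) for i in range(pc.shape[0]) for j in range(i + 1, pc.shape[1])
                if abs(pc[i, j]) > 0.7]
print(stable_pairs)

In this toy example, the requests-disk_io pair shows a high plain correlation but a weak partial correlation once CPU is controlled for, which is the kind of distinction partial correlation is used to capture.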
Index Terms
Correlation analysis; Software industry; Workloads; Time; Modularity; Software; Storage; Semantics; Architecture; Reuse; Interpersonal communication; Applied behavior analysis; Electronic commerce; Applications programs; Complex systems; Monitoring; Correlation; Workload; Software reuse; Anomalies; Software development
1 Instituto de Informática, Universidade Federal de Goiás (UFG), Goiânia, GO, Brazil
2 Departamento de Informática, Universidade Federal do Paraná (UFPR), Curitiba, PR, Brazil