FANT: Flexible Attention-Shifting Network

Full text

Turn on search term navigation

1. Introduction

In recent years, the scale of data center networks has grown significantly due to the continuous expansion of cloud services and virtualization. To manage and control these large-scale networks effectively, various network applications such as congestion control [1], load balancing [2], and network verification [3] have been developed. These network applications require accurate network measurements that cover a broader range of metrics (the content supported for monitoring, such as port rate, queue occupancy, and timestamps, is named differently by various standardization organizations, such as metadata or data fields. For simplicity, in this paper, we will uniformly refer to each type of content supported for monitoring as a metric), offer finer granularity, and provide higher measurement frequency. Traditional network measurement solutions, such as SNMP [4] and SFlow [5], cannot meet these requirements. To address these needs, a contemporary measurement solution, inband network telemetry (INT), has been introduced. INT provides precise in-device network visibility, making it an increasingly preferred choice for network monitoring.

INT is a network measurement mechanism that depends on network infrastructure, such as switches. It can be broadly categorized into two types based on the nature of the data packets that carry the INT header: passive inband network telemetry (PINT) and active inband network telemetry (AINT).

PINT: PINT embeds the INT header into data packets of users’ traffic. Consequently, PINT can insert network status into all packets and achieve more accurate and timely measurements. This capability facilitates real-time control of traffic in data planes, supporting network applications that require rapid decision-making, such as congestion control [1] or load balancing [2]. The basic process of PINT is as follows:

(i.). The INT source switches embed an INT header into the user’s data packets.
(ii.). As these packets traverse the network, switches insert their internal state (e.g., queue length, queuing time, port speed) into the INT header.
(iii.). The INT sink switches extract the network status from the packets’ INT header and report it to the INT controller.

AINT: AINT embeds the INT header into specially made probe packets. Since these probe packets are generated by the AINT system, they can be sent to any location in the network at any time (PINT can only measure the network status when users’ flows go through the switches or links). Thus, AINT primarily provides network status information for applications that require a comprehensive view of the network state at low frequency (around every 10 s), such as traffic engineering [6], network-wide monitoring [7], and network verification [3]. The basic process of AINT is as follows: (i.) INT generators/collectors dispatch probe packets into the network. (ii.) As these probe packets traverse the network, switches insert their internal state into the INT header. (iii.) The INT generators/collectors then extract the network status from the packets and report it to the INT controller. Although AINT’s measurement frequency is relatively slower, it covers a broad range.

In this paper, we primarily focus on active inband network telemetry (AINT). AINT offers comprehensive network visibility with low network overhead and can be seamlessly integrated into existing data centers. However, as network applications continue to evolve—such as the automation of network fault diagnosis, expansion of network monitoring, and refined traffic scheduling—there is a growing need for more dimensions of network status to enhance the capabilities of network applications. Consequently, the range of metrics that needs to be reported by the telemetry system is increasing (from 96 bit/probe to 128 bit/probe), as shown in Figure 1. The initially required reporting content of the measurement system for the application was a few coarse metrics such as port rate, queue length, and device packet loss rate. Now, the metrics required for the application have expanded to include a variety of new metrics, such as specific protocol rates, queue waiting times, and port/queue packet loss rates. In addition, as the inband network telemetry protocol evolves, we can also see an increase in the number of measurement metrics, with newer versions (INT specification v0.5 [8], INT specification v1.0 [9], INT specification v2.0 [10], INT specification v2.1 [11]) generally supporting more metrics. The increase in the number of monitored metrics will inevitably increase the bandwidth consumption of measurements, which subsequently affects the performance of business flows.

FANT does not monitor all metrics continuously across all locations. Instead, it categorizes measurements into two groups: basic (low attention) and detailed (high attention). Based on the analysis from the previous cycle’s measurements, FANT determines which links in the next cycle require comprehensive metrics and which can function with simplified metrics. FANT dynamically adjusts its attention to offer either comprehensive or simplified measurements for the upcoming cycle, based on the decisions. To ensure measurement precision, we propose a strategy for measurement metrics categorization. To further reduce overhead, we model the problem of dynamic attention shifting as a weighted set cover problem (WSCP), which is NP-hard. We introduce a heuristic algorithm to address this challenge. By flexibly shifting its focus, FANT conserves bandwidth while still providing comprehensive, multi-dimensional measurements. Our major contributions are summarized as follows:

We propose the FANT system (the proposed method, encompassing the system architecture, probe-switching mechanism, metrics categorization strategy, attention switching strategy, and experimental validation, was all developed entirely by the authors), a network-wide measurement system based on an attention mechanism that reduces the increase in measurement costs caused by a surge in measurement metrics through flexible attention shifting.
We introduce a measurement metrics category and an attention-shifting strategy to enhance the effectiveness of the aforementioned mechanism, thereby avoiding impacts on accuracy and promoting cost savings.
Our experiments validate that our approach reduces costs significantly, consuming only 42.6% of the bandwidth used by state-of-the-art solutions. For scenarios requiring high-speed computation, our algorithm can accelerate by $10^{5}$ × with a 6.4% performance loss, keeping computation times under 10 ms.

2. Related Work

Zhou et al. developed NetSeer [12], an event-monitoring network measurement framework. NetSeer precisely identifies packets encountering specific events through programmable data planes, aggregates packets experiencing the same flow-level events, and performs lossless compression. The compressed results are then reported to the INT collector, reducing measurement overhead. Jia et al. introduced a prediction-based network measurement framework called INT-filter [13] to reduce network overhead. The INT-filter’s data plane retrieves the current network status as recorded values and uses predictive algorithms to generate estimated values. If the discrepancy between the estimated and recorded values exceeds a set threshold, the data plane uploads the recorded values; otherwise, it uploads nothing. When the control plane receives no measurement records, it uses the same algorithms to reconstruct the estimated values, which are then stored in the database as the official records for that measurement. Song et al. proposed INT-label [14], which employs a time-based probabilistic tagging algorithm on the data plane, allowing fewer packets to carry more INT information and reducing measurement overhead. Additionally, INT-label uses a feedback mechanism that adaptively changes the tagging frequency to compensate for telemetry resolution degradation due to packet loss. Leveraging the programmability of the data plane, these works analyze current measurements and dynamically adjust the reported metrics based on the analysis results, effectively conserving bandwidth. However, these strategies require that all network switches be programmable, demanding high hardware specifications as shown in Table 1.

Pan et al. introduced INT Path [15], which utilizes Eulerian circuits to design an algorithm for probe generation and routing. This method generates probe traces that achieve theoretically optimal network-wide measurements with a single probe generator/collector in the topology. INT Path effectively minimizes overhead and reduces measurement costs. Building on INT Path, Lin et al. introduced INT-Probe [16], which reduced the packet length of probes and enhanced the robustness of the ANT system. By partitioning the network graph into multiple subgraphs, INT-Probe provided a theoretically optimal set of probe traces for scenarios with multiple probe generators/collectors, thereby further reducing measurement overhead. Zhang et al. developed INT Balance [17], which involves dividing the original network graph at odd nodes into multiple path segments and subsequently reassembling these segments to form multiple probe traces of approximately equal length. This approach optimizes both the delay incurred by probe measurements and the overall network overhead. These strategies carefully organize INT-Probes to minimize the measurement overhead in AINT systems. However, as shown in Table 1 these methods treat all measurement metrics as a single entity, which does not effectively reduce overhead when the number of measurement metrics increases.

3. FANT System

In this section, we first outline the essential requirements that FANT must meet as a monitoring system. Next, we describe the architecture and workflow of FANT. Finally, we detail the design of both the data plane and the control plane of the FANT system.

3.1. Design Requirements

FANT is designed to efficiently handle the increased measurement metrics; however, as a monitoring system, it also needs to meet the following requirements: Satisfying Measurement requirements. R1.1: Areas identified for basic measurement based on previous measurements must report monitoring results that include all basic measurement metrics. R1.2: Areas identified for detailed measurement based on previous measurement data must report monitoring results that include all metrics. Reducing Measurement Cost. R2.1: The system avoids injecting additional measurement probes into the network. R2.2: The system minimizes the measurement bandwidth usage. Deployability. R3.1: The system relies on functions that are already supported by existing devices or that can be easily expanded in the future.

3.2. Architecture

As shown in Figure 2, the FANT architecture consists of a data plane (forwarding devices) and a control plane (INT controller and probe generator/collector).

Forwarding devices are switches or routers equipped with Segment Routing (SR) and inband telemetry functionalities. Their responsibilities include:

(i.). Forwarding business traffic;
(ii.). Forwarding INT-Probes based on SR instructions;
(iii.). Embedding the in-device status into the probe packets’ INT header as per the instructions in the INT header.

Probe generators/collectors are virtual machines or servers mounted at specified locations of the topology. For example, in a data center, they are generally allowed to be mounted on certain Top-of-Rack (TOR) switches according to the network operator’s network planning. This is because mounting them on higher-layer switches (i.e., core switches or aggregation switches) could impact network throughput. Their duties involve:

(i.). Constructing and sending probe packets;
(ii.). Receiving probe packets and extracting monitoring results from the INT header;
(iii.). Reporting the monitoring results to the INT controller.

The INT controller is a cluster of servers with the following tasks: (i.) Analyzing measurement results from the current measurement cycle and determining the measurement requirements for various locations in the next cycle and (ii.) utilizing the attention-switching strategy to control the probe generators/collectors for focus adjustment

3.3. Workflow

For clarity, let us assume the system’s initial measurement occurs at time t, and subsequent measurements occur at time $t + n * δ t$ (FANT optimizes the existing AINT system, and thus $δ t$ is the same as the frequency of the AINT systems). The workflow of FANT is outlined as follows:

A (not shown in the Figure 2). Prior to initiating measurements, FANT executes a metrics categorization strategy (Section 4), categorizing the measurement metrics into basic and detailed measurement metrics. Note that unlike FINT [18] and SFANT [19], which support multiple types of measurements to enhance flexibility, the FANT system only supports two types of measurements: basic and detailed. This approach reduces the hardware requirements [R3.1].

B. The INT controller activates an existing network-wide probe generation algorithm to create a set of probes for basic network-wide measurement [R1.1]. Note that the system supports any user-selected probe generation algorithm for probe trace planning, thus maintaining the fundamental characteristics of the original solutions. For example, the INT-Probe algorithm can achieve theoretically optimal overhead; the INT balance algorithm can approximate the completion time for each probe’s measurements; and traversing every link in the topology can enhance the system’s robustness.

C. The INT controller directs the probe generators/collectors to inject probes (for either detailed or basic measurements) into the data plane at time t. When t is 0, all probes are set to perform basic measurements. These probes gather network status according to the planned traces and specified measurement instructions and then report back to the INT controller after being aggregated by the probe generators/collectors.

D. The INT controller analyzes time t’s measurement results to determine which links require detailed measurements (suspicious links) and which require basic measurements (normal links) at time $t + 1 * δ t$ . FANT analyzes suspicious links using historical absolute deviation and historical difference deviation (calculated based on the links metrics values from the last few cycles). Additional methods, such as AI-based techniques, can also be used to detect suspicious links. Since this is not the primary focus of FANT, it is not discussed here. Details on how to analyze suspicious links based on the measurement result and the complexity of the methods are provided in Appendix B.

E. Based on the analysis of measurement needs from step D, the INT controller implements an attention-switching strategy (Section 5) to determine which basic measurement probes should switch to detailed measurement probes and which detailed measurement probes should switch to basic measurement probes at time $t + 1 * δ t$ [R1.2 & R2.2]. The results are then communicated to the INT-Probe generator/collector.

F. The probe generators/collectors update the INT instructions for some probes to enable or disable detailed measurements, based on the probe-switching instructions received. At time $t + 1 * δ t$ , both updated and unchanged probes are injected into the data plane [R2.1].

Note that FANT does not inject additional probes for detail measurements. This is because we prioritize ensuring that packets per second (pps) does not increase, before striving to reduce bits per second (bps) as much as possible. This approach is taken because the measurement system introduces both additional bps and pps, but typically, the impact of higher pps is more significant than the increase in bps in current data center networks due to two reasons: (i.) The chips within transmission devices (i.e., switches and network cards) process network traffic (i.e., forwarding, counting, querying, and queuing) on a per-packet basis. Consequently, the processing latency of these devices exhibits an inverse correlation with the increase in pps while showing no direct correlation with bps. Similarly, the CPUs within analysis devices also handle interrupts on a per-packet basis, making them more susceptible to the increase in pps. (ii.) Although the single-port rate of transmission devices in data centers has increased significantly, the growth in buffer size has been modest. This suggests that in current data centers, even with higher traffic input, there is no urgent need to expand buffer capacity to accommodate more bits. This indirectly demonstrates that the bottleneck in device rates is more related to pps (processing rate) and relatively less to bps. Therefore, FANT does not introduce new probes, even though introducing new probes might save some bps compared to the existing switching strategy.

G. These probes collect network state information based on INT instructions from switches and report back to the INT controller after being aggregated by the INT generators/collectors.

H. Update $δ t$ to $n * δ t$ when the ${(n + 1)}^{t h}$ measurement occurs. Repeat steps D to G.

3.4. Data Plane

In the FANT monitoring system, the data plane is responsible for forwarding both user data packets and measurement-related probe packets. The architecture of the probe packets is as illustrated. In the data plane, probe packets first go through the same basic data plane logic as user packets to ensure the accuracy of measurement, and then they also pass through SR logic and telemetry logic. The SR logic is focused on routing and forwarding based on the SR protocol. Therefore, switches that support either SR-MPLS [20] or SRv6 [21] implementations are compatible. The INT logic is primarily responsible for collecting network states within the device. Since FANT only requires support for basic and detailed measurements, switches that support INT specification [11] and IOAM [22] are suitable, and those supporting IFA [23] can also be adapted with simple extensions. For detailed implementation, see Appendix A.

3.5. Control Plane

The architecture of the FANT control plane is shown in Figure 2. The INT controller, based on the topology and the mounting locations of the probe collector/generator, invokes an existing probe generation algorithm to create a set of probe traces that cover the entire network. These probe traces for measurement are assigned a number ranging from 1 to the maximum probe number and recorded in the INT controller. Then, the INT controller also adds a record of each probe’s type. Initially, all probes are set for basic measurement, but in subsequent cycles, some are adjusted to detailed measurement based on changing requirements. Based on the record of probe traces and their types, the Probe Configure Module configures which probe generator/collector needs to send which numbered probe. It also specifies the particular SR protocol stack and INT instructions that need to be added to each probe.

Once these instructions are issued, the probe generator/collector’s Probe Construction Module is configured to generate packets containing the specified SR protocol stack and INT instructions. Based on the configuration, the probe generator/collector calls the Probe Generation Module to generate the probe packets containing the specified SR protocol stack and INT instructions. When time is t, these packets are injected into the data plane to fetch the network status.

When the packets are forwarded back to the probe generator/collector by the data plane, they first pass through a Parsing Module that extracts the network measurement information carried by the probes. This information is then written locally via Redis for suspicious metrics detection and transmitted to a distributed database via a Kafka Producer for long-term storage (after which, the data can be consumed by network applications). For each metric, the Result Analysis Module retrieves data from Redis to assess both its absolute value and its variation over time. If the absolute values or their changes exceed predefined thresholds, these metrics are flagged as suspicious. The links associated with these suspicious metrics are then updated to a notification table within Redis.

The Requirement Collection Model subscribes to this table and is notified when updates occur. Based on the notification, the model updates a record for links which require more comprehensive measurement. After a specified waiting period, the INT controller invokes an attention-switching method based on the suspicious links record and probe traces record, subsequently updating the record of each probe’s type. Utilizing the record of the updated probe traces and each probe’s type, the Probe Configure Module reconfigures the generator/collector to construct the SR stack and INT instructions for the probe packets for the $t + 1 * δ t$ measurement cycle. When time reaches $t + 1 * δ t$ , the probe generator/collector dispatches these probe packets.

4. Metrics Categorization Strategy

To ensure the visibility of the FANT system network, basic metrics and detailed metrics should be carefully categorized as follows:

Basic metrics require timely measurements and are essential for the continuous operation of network applications.
Detailed metrics are determined by variations in basic metrics, and delays in their measurement have minimal impact on the functionality of network applications.

4.1. Basic Idea

The fundamental idea of the metrics categorizing strategy is based on the attributes of the metrics themselves or the relationships between metrics. Specifically:

The Attribute of the Metric Itself. Continuous Monitoring: If a metric is continuously monitored by a network application (such as the Rx/Tx Packets of a port, which are consistently required for a network-wide status panel), it should be classified as a basic metric. Impact on Functionality: A metric classified as detailed and triggered in the subsequent measurement cycle should not impair the functionality of network applications. For example, in a classic ANT application like a network-wide status panel, if an increase in port packet loss is observed at time t, and further increases in port packet loss and IPv4 packet loss are observed at time $t + 1 * δ t$ , the application’s functionality should remain virtually unaffected. Otherwise, if any metric does impair the functionality, it must be categorized as a basic measurement.

The Relationships Between Metrics.Dependency: If metric A is unavailable without the value of metric B, we say metric A depends on metric B. For instance, a queue length is only valid if the queue id is valid. If B is a detailed measurement, A can be either a basic or a detailed measurement; if B is a basic measurement, then A must also be a basic measurement. Inference: If a change in metric A might cause a change in metric B, but not necessarily vice versa, we say metric A can infer metric B. For example, if the bandwidth utilization of a port increases, the packet loss rate might also increase. In such cases, if metric A and metric B do not have any other relationships, we can categorize A as a basic metric and B as a detailed metric. Similarity: If metrics such as A, B, C, etc., can infer each other, we consider them similar, for instance, queue length, queue occupancy rate, and queuing time. If any one of these is in basic measurement, the others can be inferred. In such cases, when not influenced by other relationships, it is advisable to choose the one with the smallest bit width to minimize overhead.

4.2. Implementation

Algorithm 1 outlines the pseudocode of the metrics categorization strategy.

Algorithm 1: Metrics Categorization Strategy

Initialization: The input to the algorithm is a set E ( $e \in E$ ) that consists of all metrics e and can be monitored by the FANT system. The output of the algorithm includes a set $e \in E_{b}$ of basic metrics and a set $e \in E_{d}$ of detailed metrics. Initially, both $E_{b}$ and $E_{d}$ are empty sets.

Calculation: Lines 2–7: Analyze any two distinct metrics in E to determine if there is a dependency or similarity relationship. If a dependency exists between e and $e_{i}$ , add e to set $T (e_{i})$ . If a similarity exists between e and $e_{i}$ , add e to set $S (e_{i})$ .

Check any two distinct metrics in E for an inference relationship. If any other metric can be inferred by e, classify e as a basic metric (lines 8–9).

As described in Lines 10–11, Algorithm 1 classifies metrics in E based on the metric’s attributes. If a metric cannot tolerate delay (impact on functionality) or requires continuous monitoring by some applications, add the metric to $E_{b}$ .

Lines 12–18: Use the relationships between metrics to further identify which should be classified as basic metrics: First, iterate through the current $E_{b}$ . If e is in $E_{b}$ but the metrics e depends on are not, then add all its dependent metrics to $E_{b}$ . Next, iterate through $E_{b}$ again. During this iteration, if a metric and its similar metrics are not in $E_{b}$ , add e whose report overhead is smallest to $E_{b}$ .

As reported from Line 19 to Line 20, detailed metrics are obtained by subtracting the basic metrics from the complete set of metrics.

5. Attention-Switching Strategy

The attention-switching strategy aims to determine which probes should conduct detailed measurements and which should perform basic ones to satisfy requirement R1.2 while minimizing network overhead. Therefore, in this section, we first model this problem, then introduce the fundamental concepts of the algorithm, and finally provide a detailed implementation of the algorithm.

5.1. Problem

Formally, FANT addresses the following weighted set cover problem (WSCP), where sets may contain overlapping elements. Given:

L: L The universal set containing all links l in the topology.
$R \subseteq L$ : The target links that need comprehensive measurement.
P: A collection of probe cycle sets, where each probe cycle $p \in P$ is a set of links l.
If $l \in p$ , $I_{l, p} = 1$ .
$C (p)$ : The cost associated with each probe cycle p.
$x_{p}$ : A binary decision variable where $x_{p} = 1$ if probe cycle p is selected, and $x_{p} = 0$ otherwise.

Objective:

(1) $min \sum_{p \in P} C (p) \cdot x_{p}$

Constraints:

(2) $\sum_{p \in p} x_{p} * I_{l, p} \geq 1, \forall l \in R$

(3) $C (p) = | p |, \forall p \in P$

The Objective is to minimize the total cost of the selected probe cycles. A probe cycle is considered ’selected’ when it is configured for detailed measurement. Constraint (1) ensures that every target link in subset R is covered by at least one of the selected probe cycles. Constraint (2) requires that the cost of a probe cycle p is determined by the number of links it contains, reflecting the length of the probe.

To clearly introduce the algorithm, as shown in Figure 3, we present an example using a topology with seven links and seven probes (i.e., Probe 1 visits links 2 and 5; Probe 2 visits link 1; Probe 3 visits links 2, 3, and 4; Probe 4 visits links 1, 2, 4, and 6; Probe 5 visits links 4 and 6; Probe 6 visits links 3 and 7; Probe 7 visits links 1, 6, and 7).

5.2. Basic Idea

The basic workflow of our algorithm is as follows: Step 1: Based on an existing probe generation algorithm, we generate probe traces for network-wide measurements. Step 2: We transform the relationship between probes and links into a 0–1 matrix (if probe p measures link l, then $Matrix_huge (p, l) \leftarrow 1$ ). Step 3: We identify the suspicious links in the measurement cycle t as links 1, 3, 4, and 7. Subsequently, we remove columns not corresponding to these links and delete rows that become entirely zeros following this column removal. This process yields a smaller matrix, which serves as the actual input for the algorithm. Step 4: As shown in the data structure in Figure 3, we store this smaller matrix in a Two-Dimensional Doubly Circular Linked List (TDDCL) [24,25] to enhance the efficiency of subsequent computations. Step 5: We conduct a search using rapid pruning and iterative deepening techniques to find the most cost-effective combination that satisfies requirement R1.2 (i.e., setting probes 2, 3, and 6 for detailed measurements at time $t + 1 * δ t$ ).

In this workflow, we introduce four optimizations to significantly enhance the efficiency of the algorithm.

A. Reducing the Input: Based on the suspicious links (marked in blue) from basic measurements, we compress the original matrix that represents the coverage relationship between the entire network’s probes and links to generate a smaller input matrix. Specifically, we first remove all non-blue columns ( $l_{2}$ , $l_{5}$ , $l_{6}$ ). After deleting these columns, rows that do not cover any links (such as $p_{1}$ , $p_{5}$ ) are also removed. (Step 3).

B. Cost-Based Iterative Deepening Search: To manage space and time efficiently while solving NP-hard problems, FANT employs a cost-based iterative deepening search. By limiting the cost of the current solution (search node), FANT avoids getting stuck in deep search branches, which could consume excessive time and resources. For instance, the solution ( $p_{2}$ , $p_{6}$ ) is only attempted when the allowable cost exceeds 4. When the allowable cost is 2, the solution ( $p_{2}$ ) does not try to add $p_{6}$ to the solution. Moreover, the allowable cost does not increase incrementally by 1 like traditional iterative deepening search algorithms [26]. Instead, it updates based on the minimum value that exceeded the allowable cost during the previous search (the allowable cost jumps to 4 from 2).

C. Early Pruning Based on Quick Estimation: When the allowable cost is set at 5 and the solution is ( $p_{4}$ ), FANT does not attempt to construct complete solutions such as ( $p_{4}$ , $p_{3}$ , $p_{6}$ ), ( $p_{4}$ , $p_{3}$ , $p_{7}$ ), or ( $p_{4}$ , $p_{6}$ ) and then calculate their costs to decide whether to backtrack. Instead, a conservative quick estimation is used to assess the minimum additional cost required to achieve full coverage based on the current solution. If the current cost plus the estimated minimum cost exceeds the limit, the search backtracks immediately. The quick estimation process operates as follows: when solution ( $p_{4}$ ) is selected, it is observed that $l_{3}$ remains uncovered. Consequently, FANT utilizes all paths that include $l_{3}$ , such as $p_{3}$ and $p_{6}$ , to estimate the coverage of links ( $l_{3}$ , $l_{7}$ ). Then, FANT computes the estimated cost by adding the minimum costs of $p_{3}$ and $p_{6}$ , which is 2. Therefore, FANT estimates that choosing $p_{4}$ , regardless of how optimally the remaining selections are made, will necessitate a minimum total cost of 6 (2+4). This results in the immediate backtracking of the search.

D. Fast Recursion and Backtracking: The attention-shifting strategy, employing a TDDCL data structure to store the current search state, enhances the efficiency of iteration and backtracking by accelerating the deletion and restoration of rows and columns. As shown in Figure 3, when the root node (where P is an empty set) traverses to the node marked by the red circle ( $P {p_{7}}$ ), it involves three transformations: the deletion of column $l_{1}$ , row $p_{7}$ , and column $l_{7}$ . Conversely, backtracking from the red circle stage to the root node involves three transformations in reverse order: adding column $l_{7}$ , adding row $p_{7}$ , and then adding column $l_{1}$ . As illustrates in Figure 4, we selected several deletion and restoration operations as examples to more clearly illustrate this basic idea.

When deleting column $l_{1}$ , the right pointers of the nodes immediately to the left of all nodes in $l_{1}$ are updated to point to the nodes indicated by the right pointers of the $l_{1}$ nodes. Similarly, the left pointers of the nodes immediately to the right of all nodes in $l_{1}$ are updated to point to the nodes indicated by the left pointers of the $l_{1}$ nodes. In the process of deleting row $p_{7}$ , the down pointers of the nodes immediately above all nodes in $p_{7}$ are updated to point to the nodes indicated by the down pointers of the $p_{7}$ nodes. Similarly, the up pointers of the nodes immediately below all nodes in $p_{7}$ are updated to point to the nodes indicated by the up pointers of the $p_{7}$ nodes.

The process of restoring rows and columns is the inverse of the deletion process. For example, to restore row $p_{7}$ , first access the node immediately above $p_{7}$ using the up pointer of $p_{7}$ , and then update that node’s down pointer to reconnect to $p_{7}$ . Similarly, access the node immediately below $p_{7}$ using the down pointer of $p_{7}$ , and update that node’s up pointer to reconnect to $p_{7}$ . To restore column $l_{1}$ , first access the node immediately to the left of $l_{1}$ using the left pointer of $l_{1}$ , and then update that node’s right pointer to reconnect to $l_{1}$ . Similarly, access the node immediately to the right of $l_{1}$ using the right pointer of $l_{1}$ , and update that node’s left pointer to reconnect to $l_{1}$ .

5.3. Implementation

To make the algorithm’s implementation clearer, we use the following notations: $c . r o w s$ refers to the set of paths that include all paths in M containing the link c. $M . c o l u m n s$ refers to the set of links that includes all links in M.

Input and Output The algorithm takes the following inputs: (i.) A set of links R ( $l \in R$ ). (ii.) A probe trace set $p \in P$ generated by a path generation algorithm chosen by the user. (iii.) Acceleration factors $a_{1}$ and $a_{2}$ (ranging from 0 to 1, where a higher value speeds up the algorithm but the solution deviates more from the theoretical optimum). The output is a set of paths A ( $p \in A$ ), where each path p in A is set for detailed measurement in the upcoming measurement cycles.

Initialization Lines 1–4: Initialize A as an empty set and M as the head node of a doubly circular linked list (initially empty). For each path p in P, check if it shares any links with R. If so, update p to the intersection of p and R, and add p as a row of M to form the data structure (optimization D). If not, do not add p to M. The function M.add_rows works as in the following steps: We first create column header nodes for all links l ( $l \in p$ ) that are not already present in $M . c o l u m n s$ . Then, create a row header node for p. Next, for each links in p, create a node and vertically link it to the bottom of its respective column node. Finally, connect all these newly created nodes horizontally to form a circular doubly linked list.

The $f u n c t i o n$ Estimate Coverage Cost (ECC, Lines 5–10) takes the current data structure M as input and returns an estimated cost to cover all columns in this state.

Calculation First we initialize cover as an empty set (storing unique links already covered), $C o u n t_o u t$ as 0 (tracking the number of coverage iterations during the estimation), and $C o u n t_i n$ as 0 (tracking the number of paths in each coverage iteration). The estimation process involves iterating over uncovered columns c in M, selecting the union of all paths containing c as the coverage capability for this iteration and choosing the minimum cost among all paths containing c as the cost for this iteration. Update $C o u n t_i n$ , $C o u n t_O u t$ , and $C o s t H$ (where the update amount for $C o s t H$ is slightly amplified by the acceleration factor $a_{1}$ ). When cover includes all links in M, it indicates that our estimation has covered the current state of M. Finally, we slightly amplify the cost based on the number of coverages and the acceleration factor $a_{2}$ , then output the result.

The $f u n c t i o n$ Explore Path Combinations Recursively (EPCR, Lines 11–33) is a recursive function that takes M, A, and T as inputs (the pseudocode in Algorithm 2 is written recursively to help readers understand. However, implementing the algorithm iteratively, by manually maintaining a stack, would be more efficient than implementing it recursively). T represents the maximum allowed cost in the search process (similar to the allowed search depth in traditional iterative deepening search algorithm [26]).

Lines 12–15 check if the estimated cost to cover the current state is still less than the allowed cost. If it is, proceed with the logic; otherwise, return false and begin backtracking. Before backtracking, update $T_{m i n}$ based on the $c o s t_{s u m}$ (the minimum value of the actual cost plus the estimated cost). $T_{m i n}$ maintains the minimum $c o s t_{s u m}$ that exceeds T. Lines 16–17 check if M is empty (i.e., the head node of the doubly linked list points to itself) to determine if the recursion of EPCR should stop because a feasible solution has been found. After verifying the above two conditions, EPCR is recursively invoked. The implementation of EPCR involves three steps: (A) a downward search, (B) recursive calls, and (C) upward backtracking:

A. The downward search consists of three deletion steps:

D1 (Lines 18–19): Iterate over all columns in M and remove the current column c using the function $r e m o v e_c o l u m n s$ .
D2 (Lines 20–21): For each path p (by traversing horizontally from the column node, we can find each p’s path header and add p to A) that contains column c, add p to A and remove this row p from M (details on row and column removal are provided in Section 5.2).
D3 (Lines 22–23): Remove all links (columns) contained in the row p (the row currently being removed in D2), thereby completing the necessary updates to M. Note that, unlike Donald Knuth’s approach [25] of removing all rows containing the column ( $c . r o w s$ ), only the currently selected row p is removed from $c . r o w s$ .

B. After completing the downward search, Lines 24–25 recursively call EPCR with the updated M, A, and the cost reduced by the cost of the selected path p. If the return value of the call is true, the recursion of EPCR will stop, indicating that a feasible solution has been found. If not, we perform upward backtracking.

C. Upward backtracking is the reverse process of the downward search and includes three restoration steps:

U1 (Lines 27–28): Restore the columns that were previously removed due to the selection of p by calling $r e s t o r e_c o l u m n$ for each $c^{'}$ in p.
U2 (Line 29): Then, restore row p by calling $restore_row$ and remove p from A.
U3 (Line 30–31): Continue to select another suitable p. If all selections of p are unsuccessful, further backtrack by restoring the initially covered element c (previously removed in step D1) using $restore_column.$ . If all recursive searches have been finished without finding a feasible solution, return false.

Algorithm 2: Attention-Switching Strategy

Line 32 calls ECC with the initialized M to determine the allowed cost for the first search. Lines 33–38 continuously execute the following logic: FANT initializes $T_{m i n}$ to a high value to facilitate updating it to the minimum value during the recursion of PECR. Then, check if the EPCR function returns true. If true, output the current A as the solution of the entire algorithm. Otherwise, update the allowed cost of the solution to the larger of $T_{m i n}$ and $T + 1$ , and call the EPCR function again. Note that R is a subset of the union of P. Therefore, when the allowed cost of the solution is sufficiently large, EPCR will inevitably return true. This avoids non-termination of the search.

6. Experiment

Our evaluation focuses on two primary questions:

Is the FANT capable of supporting the normal operation of network applications?
How does the FANT system perform under various conditions and configurations?

To address these questions:

We built a prototype of the FANT system and used this prototype to provide measurement results for a network application.
We compared the performance of the FANT algorithm with other comparative methods under various conditions and using different parameter settings.

6.1. Testbed

Experiments were conducted on a machine equipped with an Intel i7-13700 processor, 32 GB of memory, and running Ubuntu 18.04. The metrics categorization strategy and attention switching strategy were implemented in Python. We used an existing probe generation algorithm that is implemented in C++ and open source on github [27].

Prototype: We implemented the forwarding logic of the device with P4 and loaded this logic into the P4 software (1.15.0) switch (to ensure that the experimental results were not influenced by the performance of BMv2, we disabled log printing by modifying the build flag of BMv2 to accelerate packet forwarding. Additionally, we set BMv2’s queue depth to 4096 packets to prevent packet loss due to uneven traffic flows) (Behavioral Model 2, BMv2) to serve as the switch in the prototype. The hosts were instantiated using Linux network namespaces, and business flows were generated using iperf. The switches were interconnected and also connected to hosts using virtual Ethernet (veth) pairs.

INT generator/collectors were implemented in Python using Scapy to construct and receive probe packets. INT controllers were also developed in Python and communicated with switches via gRPC and with INT generators/collectors via sockets. We employed Redis to implement the high-speed database, and for simplicity, we did not integrate the Kafka database into the prototype. The label management functionality of SR (i.e., the mapping relationship between adjacency labels and ports) was statically written into .json files and read by BMv2.

We implemented a network verification application. The application can compare whether network traffic changes as expected when new network configurations are added, thereby verifying the correctness of the configuration settings.

Large Scale Simulation: The topology generation, INT-Probe generator/collector selection, and suspicious link generation methods were all implemented in Python.

6.2. Comparative Methods

Since there are no existing methods specifically addressing our problem, we introduce two baseline solutions to better evaluate the performance of our solution, FANT, with a heuristic algorithm (FANT-H).

INT-P: INT-Probe is the state-of-the-art solution that minimizes overhead and has demonstrated its theoretical optimality. We implemented INT-Probe (INT-P) without the attention-switching mechanism for comparison. FANT-G: We modeled the probe-switching strategy as a WSCP, which is generally solved using a greedy algorithm. We refer to the FANT system that utilizes a greedy algorithm for the attention-shifting strategy as FANT-Greedy (FANT-G), and we implemented FANT-G for comparison.

6.3. Experiment Setting

Network applications supported by the ANT system, such as Network Verification and Network Performance Analysis, always require a robust measurement system in real networks. Therefore, to closely mimic real network conditions, we have enhanced the robustness of the INT-Probe algorithm and selected this enhanced version as the probe generator algorithm for three methods. Specifically, we invoke the INT-Probe method twice with two independent sets S to generate two mutually independent path traces. We then output the union of these path traces as the output of the enhanced algorithm. For simplicity, we refer to the enhanced INT-Probe algorithm for probe generation as EINP.

We extended the metrics in the INT specification [10] and established 15 measurement metrics as the comprehensive set supported for monitoring by FANT, as detailed in Table 2. We employ a Metrics Categorization Strategy to classify these metrics. The metric marked by an asterisk in Table 2 is categorized as a basic measurement by the Metrics Categorization Strategy. Consequently, the instruction bitmap for basic measurements is 0xF161, and the instruction bitmap for detailed measurements is 0xFFFF. Unless otherwise stated, the attention-switching strategy is configured to operate with the acceleration factors set as $a_{1} = 0$ and $a_{2} = 0$ .

Prototype: We deployed our FANT prototype in a 4-pod fat-tree topology as shown in Figure 5. We assigned identifiers 1–16 to the hosts connected to the Top-of-Rack (TOR) switches from left to right. We designated hosts 1 and 5 as the first set of probe generators/collectors and hosts 8 and 9 as the second set of probe generators/collectors. If the value of any result in this measurement cycle is more than twice the value from the previous cycle, we identify the links associated with that result as suspicious. Each host is instructed to send traffic at a rate of 71 KB/s to all other hosts, resulting in a background traffic rate of 1 MB/s at all links.

Large Scale Simulation: We utilize topology generation methods to create diverse network topologies, including various sizes (detailed in the figure axis) of fat-tree or spine-leaf (the spine-leaf topology in our experiment, specified with a total number of switches n, consists of $\frac{1}{3} * n$ switches in the spine layer and $\frac{2}{3} * n$ switches in the leaf layer. Each switch in the spine layer is interconnected with every switch in the leaf layer, and there are no connections between switches within the same layer). We set $| S |$ as 10% of the total number of hosts. We randomly select $| S |$ hosts to form the first set of probe generators/collectors, and then randomly select another $| S |$ hosts that do not overlap with the first set to form the second set of probe generators/collectors. Based on topology and two distinct S, we use the EINP to get P. In subsequent performance evaluations, we set the abnormal rate at $x %$ . This means that in each measurement cycle, $x %$ of all links are randomly chosen as the suspicion area R. To minimize the impact of randomness, the results of the experiment are averaged over 100 cycles. Based on P and R, we invoke FANT-H and FANT-G to obtain the performance and time consumption.

6.4. Experimental Result

In this section, we first present a use case of a network application supported by FANT. Next, we discuss the basic performance of FANT. Finally, we explore the impact of different parameter configurations on FANT.

6.4.1. Use Case

In the prototype, we generate a 5MB/s business flow (IPv4 TCP) on host 6 and direct this flow along a predetermined path (marked in red) to host 9 by configuring an SR-Policy on switch 15 at time 2. This setup is being used to implement a use case that mimics a scenario where a new network rule is applied to manage business traffic. We then check the monitoring results of the Network Application along the path. For simplicity, we only display the monitoring result of the port on switch 7 that connects to switch 15, as shown in Figure 6. From the Network Verification application supported by INT-P, an increase in traffic is detected in the application’s port statistics (Egress Interface Tx Utilization), segment routing statistics (SR Byte count), and IPv4 statistics (IPv4 Byte Count) at port 5. Similarly, the Network Verification application supported by FANT-H also detects an increase in traffic in port statistics (Egress Interface Tx Utilization), segment routing statistics (SR Byte count), and IPv4 statistics (IPv4 Byte Count) at port 5. We observe that FANT has a minimal impact on ANT network applications, which focus on monitoring the sustained, comprehensive network-wide status.

6.4.2. Basic Performance

In actual network deployments, the number of suspicious links detected in each monitoring cycle should be low; hence, we set the anomaly rate at 3%.

Performance Across Different Sizes or Topologies: As shown in Figure 7a, within a topology consisting of 1125 switches, INT-Probe consumed 4.7 MBps of bandwidth, while FANT-G and FANT-H utilized 3.2 MBps and 2.0 MBps (only 42.6% of INT-P), respectively. This result indicates that FANT introduces the least overhead compared to other approaches. As the network scale increases from 125 to 1125 switches, we observe that the bandwidth difference between FANT-H and INT-P grows from 0.12 MBps to 2.0 MBps, which is larger than the bandwidth difference between FANT-G and INT-P. This highlights the efficiency of our heuristic algorithm.

Figure 7a and Figure 8a illustrate that the bandwidth usage in spine-leaf topologies is higher than in fat-tree topologies. This is because spine-leaf topologies have more links for the same number of switches, which consequently enlarges the algorithm input (R and P).

We normalize the overhead of each method based on the bandwidth usage of INT-Probe, as depicted in Figure 7b and Figure 8b. The normalized bandwidth of FANT-H remains stable when number of switches increase, demonstrating that the FANT-H strategy effectively reduces bandwidth across various network sizes. Additionally, in spine-leaf topologies, both FANT-H and FANT-G exhibit lower normalized bandwidth than in fat-tree topologies, indicating that our approach performs better in topologies with stronger connectivity.

Figure 7c shows that without configured acceleration factors, the algorithm’s execution time almost reaches $10^{2}$ s in topologies with more than $10^{3}$ switches, significantly exceeding the measurement cycle (i.e., one second). This delay impacts the normal functioning of attention-switching strategies. FANT employs two methods to address this issue: (i.) Enhance the script’s execution speed: FANT can improve execution efficiency by converting the logic implemented in Python to C++ using PyPy (using PyPy, we only need 10 s with 1125 switches) or by multithreading the program. (ii.) Set acceleration factors: Higher values of $a_{1}$ and $a_{2}$ can effectively accelerate the algorithm’s operation, as demonstrated in the following experiments.

Performance Across Different Anomaly Rate: The anomaly rate in the experiment is a predefined value that is influenced by two factors: (i.) Conditions for identifying suspicious links in the current cycle: If we use strict criteria for identification, the anomaly rate will be higher; if the criteria are more lenient, the anomaly rate will be lower. (ii.) Network stability or rapid changes: A network that changes slowly tends to have a lower anomaly rate.

As illustrated in Figure 9, regardless of the anomaly rate, FANT-H consumes the least bandwidth. As the anomaly rate increases from 0% to 30%, the bandwidth consumed by FANT-G and FANT-H increases from 1.3 MBps to 2 MBps and from 1.2 MBps to 6 MBps, respectively. The growth rate of bandwidth usage in FANT-H is slower than that of FANT-G. The increase in bandwidth usage by FANT as the anomaly rate rises is due to the need for more detailed measurements at higher numbers of anomaly points, which diminishes the bandwidth-saving effect of FANT. At an anomaly rate below 10%, the bandwidth consumed by FANT grows rapidly. However, as the anomaly rate exceeds 10%, the rate of bandwidth increase gradually slows down. This trend occurs because it is challenging for a single probe to cover multiple suspicious links when there are few random anomalies. However, as the number of anomalies increases, it becomes easier for a probe to cover multiple links.

6.4.3. Impact of Parameters

In this section, we explore the impact of various parameters on algorithm performance and execution time. To more clearly demonstrate these effects, we set the anomaly rate at 10%, which is higher than in previous experiments.

Time Consumption with Different $a_{1}$ and $a_{2}$ : In previous experiments, we observe that the algorithm’s execution time exceeds the measurement cycle interval. This means that the INT controller cannot set correct INT instructions for the probe generator/collector before the next measurement cycle. Consequently, in the actual FANT system, we accelerate measurements by setting acceleration factors. As illustrated in Figure 10, when ( $a_{1}$ = 0), the execution time is 85.7 s; when ( $a_{1}$ = 0.9), the execution time is 0.22 ms. The results indicate that increasing $a_{1}$ can effectively accelerate the algorithm, enhancing efficiency by $3.85 * 10^{5}$ ×. Similarly, when we set $a_{1}$ to 0 and adjust $a_{2}$ , we find that with ( $a_{2}$ = 0), the execution time remains 85.7 s; when ( $a_{2}$ = 0.15), the execution time changes to 0.69 ms. The results demonstrate that increasing $a_{2}$ also significantly accelerates the algorithm, with an efficiency increase of $1.24 * 10^{5}$ ×. We also note that when $a_{1}$ and $a_{2}$ are already high, further increases in $a_{1}$ and $a_{2}$ result in diminishing reductions in execution time. Additionally, for the same configuration parameters, larger topologies do not always result in longer execution times. This is primarily because the FANT takes probe traces as input. And the probe traces generated by INT-Probes may vary in different sizes and thus influence the execution time more than the topology size itself. Therefore, we recommend that users conduct tests with the same topology to determine the optimal configuration of acceleration factors when using FANT.

Performance with Different $a_{1}$ and $a_{2}$ : By configuring acceleration factors, we have successfully accelerated the algorithm’s operation. However, when the acceleration factor is nonzero, the algorithm cannot guarantee achieving the theoretical optimum for two reasons: (i.) The estimation function may not ensure values are less than the actual values, leading to excessive pruning and premature termination of the search for the optimal solution. (ii.) The allowable cost is updated to $M a x (T_{m i n}, T + 1)$ rather than $T + 1$ during the search and thus might cause an excessive increase in allowable costs, leading to the output of non-theoretical optimal solutions.

As shown in Figure 11a,b, when $a_{1}$ is 0.1 and $a_{2}$ is 0.07, the algorithm still maintains the theoretical optimum. However, as $a_{1}$ and $a_{2}$ gradually increase, the algorithm begins to deviate from this optimum. This deviation becomes more pronounced in larger topologies. That is because a larger topology has a large search tree, thereby increasing the likelihood of the algorithm finding non-optimal solutions when cost limits are excessively updated. Nevertheless, these non-optimal solutions are still sufficiently effective. Even when $a_{1}$ and $a_{2}$ are set to higher values, compared to the theoretical optimum, their bandwidth only increases by 20.1% ( $a_{1}$ = 0.9, bandwidth usage is 3.4 Mbps) and 6.4% ( $a_{2}$ = 0.15, bandwidth usage is 3.0 Mbps).

By analyzing Figure 10 and Figure 11 together, we conclude that configuring $a_{1}$ and $a_{2}$ enables us to achieve a speed improvement of over $10^{5}$ ×. However, this comes at the cost of a 6.4% reduction in bandwidth-savings ability.

Based on all the experimental results presented in this section, we can conclude that compared to state-of-the-art solutions, the FANT system:

Achieves measurement accuracy that is nearly equivalent.
Significantly reduces overhead, particularly in stable network conditions, where it can save up to 65%, and even in highly unstable network conditions, it still manages to save 30%.
The reduction in overhead with the FANT system leads to networks using this system having more available bandwidth, thereby increasing the throughput of users’ traffic.

7. Limitations and Future Work

7.1. Limitations

FANT is tailored for any AINT-based full-network measurement system, effectively mitigating the increased overhead as metrics rise. However, FANT is not suitable for PINT-based full-network measurement systems.
FANT relies on probe generators/collectors to detect anomalous metrics based on local information. This approach, which does not use global information, requires a cautious strategy to avoid missing measurements. As a result, more metrics that are not actually anomalous might be incorrectly identified as suspicious, thereby increasing the system’s overhead.

7.2. Future Work

FANT optimizes probe-switching decisions in each cycle. Nonetheless, during the probe-switching process, the controller must issue switching instructions to multiple probe generators/collectors every cycle. Consequently, (i.) as the network size expands, the number of probe generators/collectors increases, causing the control costs to rise; (ii.) similarly, when the cycle frequency is high, the control costs also increase. Thus, we plan to evolve the existing model from a weighted set cover problem focused solely on measurement costs to a dynamic weighted set cover problem that balances both control and measurement costs in the future.

8. Conclusions

In this paper, we introduce FANT, an AINT monitoring system that employs an attention-shifting strategy to reduce overhead as metrics increase. This reduction is achieved by concentrating more on links that were suspicious in the previous measurement cycle and less on those that performed normally. To minimize overhead, FANT does not inject extra probe packets for measurement; instead, it reallocates attention by modifying the INT instructions of existing measurement probes. To optimize this shifting strategy, we model the problem as a weighted set cover problem. Considering that the model is NP-hard, we utilize a heuristic algorithm to address it. Our method surpasses the state-of-the-art solution by achieving bandwidth savings of 30% to 65%, and it can compute rapidly around $10^{5}$ × fast with 6.4% reduction in performance when configured with appropriate acceleration factors.

Author Contributions

Conceptualization, M.C.; methodology, M.C.; software, M.C. and Y.P.; validation, M.C. and Y.P.; writing—original draft preparation, M.C.; writing—review and editing, M.C. and F.Y. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Figures and Tables

Figure 1. The increase in the range of metrics leads to higher bandwidth consumption.

Figure 2. Architecture of the FANT system.

Figure 3. Basic idea of attention-switching strategy.

Figure 4. Fast recursion and backtracking based on data structures in idea D.

Figure 5. Experiment topology (a 4-pod Fat-Tree) using FANT prototype.

View Image - Figure 6. Functional verification of network applications based on the prototype. (a) INT-P-based Network Verification application. (b) FANT-based Network Verification application.

Figure 6. Functional verification of network applications based on the prototype. (a) INT-P-based Network Verification application. (b) FANT-based Network Verification application.

View Image - Figure 7. Performance and execution time of three methods across various sizes of spine-leaf topology. (a) Bandwidth consumption. (b) Normalized bandwidth consumption. (c) Strategy’s execution time.

Figure 7. Performance and execution time of three methods across various sizes of spine-leaf topology. (a) Bandwidth consumption. (b) Normalized bandwidth consumption. (c) Strategy’s execution time.

View Image - Figure 8. Basic performance of three methods in different sizes of spine-leaf topology. (a) Bandwidth consumption. (b) Normalized bandwidth consumption.

Figure 8. Basic performance of three methods in different sizes of spine-leaf topology. (a) Bandwidth consumption. (b) Normalized bandwidth consumption.

View Image - Figure 9. Attention-shifting strategy performance of different abnormal rates. (a) Bandwidth consumption. (b) Normalized bandwidth consumption.

Figure 9. Attention-shifting strategy performance of different abnormal rates. (a) Bandwidth consumption. (b) Normalized bandwidth consumption.

View Image - Figure 10. Execution time of FANT-H across different [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.] settings. (a) Execution time across various [Forumla omitted. See PDF.] settings. (b) Execution time across various [Forumla omitted. See PDF.] settings.

Figure 10. Execution time of FANT-H across different [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.] settings. (a) Execution time across various [Forumla omitted. See PDF.] settings. (b) Execution time across various [Forumla omitted. See PDF.] settings.

View Image - Figure 11. Bandwidth usage of FANT-H across different [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.] settings. (a) Performance across various [Forumla omitted. See PDF.] settings. (b) Performance across various [Forumla omitted. See PDF.] settings.

Figure 11. Bandwidth usage of FANT-H across different [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.] settings. (a) Performance across various [Forumla omitted. See PDF.] settings. (b) Performance across various [Forumla omitted. See PDF.] settings.

Table 1

Comparison of network-wide measurement solutions.

Research	Device Requirement	Overhead ¹	Deployability
INT-filter [13]	High	Data	Data
INTplabel [14]	High	Data	Data
INT Path [15]	Low	High	Low
INT-Probe [16]	Low	High	High
INT Balance [17]	Low	High	High
FANT(Ours)	Low	Low	High

¹ Overhead associated with increasing number of metrics.

Table 2

Metrics and corresponding instruction bitmap field values in experiments.

Position (bit)	Description and Width (bits)
0 *	Node ID (32 bits)
1 *	Level 1 Ingress Interface ID (16 bits) + Egress Interface ID (16 bits)
2 *	Hop latency (32 bits)
3 *	Queue ID (8 bits) + Queue occupancy (24 bits)
4	Ingress timestamp (64 bits)
5	Egress timestamp (64 bits)
6	Level 2 Ingress Interface ID (32 bits) + Egress Interface ID (32 bits)
7 *	Egress interface Tx utilization (32 bits)
8	Buffer ID (8 bits) + Buffer occupancy (24 bits)
9 *	Egress Interface Tx packets count (32 bits)
10 *	Drop packets count (32 bits)
11	Drop bytes count (32 bits)
12	Segment Routing packets count (32 bits) + bytes count (32 bits)
13	IPv4 packets count (32 bits) + bytes count (32 bits)
14	IPv6 packets count (32 bits) + bytes count (32 bits)
15 *	Checksum Complement

The metrics underlined are newly extended in our experiment. Metrics marked with an asterisk “*” are categorized as basic measurement metrics.

Appendix A

Our investigation reveals that the telemetry supported by existing switch chips primarily follows three standards: IOAM [22], IFA [23], and INT specification [11]. FANT can be seamlessly integrated with some current switches, while others may require minor logic modifications.

Figure A1. IOAM header format.

The IOAM standard allows for the insertion of a variable number of metrics. The metrics inserted into the telemetry result field are determined by the node length field and trace-type field. FANT’s telemetry requirements can be met by utilizing different trace types to obtain basic measurement and detail measurement.

Figure A2. INT spec packet format.

Similarly, the INT specification supports the insertion of a variable number of metrics. Switches determine the metrics embedded into the INT header based on the instruction bitmap field. By specifying two distinct combinations of INT instructions, devices supporting the INT specification can provide measurement results with two different numbers of metrics.

Figure A3. IFA header format.

Under IFA2.0, the switch embeds predefined INT metadata (which includes a fixed number of metrics) into packets when it detects an IFA header. To meet FANT requirements, we utilize one bit from the last six-bit reserved field of the action vector in the IFA header to represent two different measurement commands, enabling switching between basic and detailed measurements. As a result, a FANT-enabled switch can return two different fixed-length measurement results based on the IFA header with varying action vectors. Although current switches supporting IFA 2.0 cannot directly support the FANT system, this additional functionality (inserting a fixed-length telemetry result based on the value of the specific one bit) can be easily implemented.

Thus, FANT can be directly applied to switches supporting telemetry based on the INT specification (Intel Barefoot switches) and IOAM (Cisco switches supporting telemetry) standards. However, our method cannot yet be applied to switches that support telemetry based on the IFA (Broadcom switches supporting telemetry) standard.

Appendix B

FANT identifies suspicious links by analyzing metrics from basic measurements that exhibit abnormal deviations from historical data. Links associated with any suspicious metrics are added to the set of suspicious links. Below, we detail the methods described in the main text for analyzing metric anomalies based on historical absolute deviation and historical difference deviation.

For clarity, we introduce the following notation:

$m_{i}$ : One of the metrics in the basic measurement metric set.
$e_{j}$ : A direct edge in the topology.
$p_{h}$ : One of the probe cycles in the set P.
$v_{i, j, h, t}$ : The measurement result of metric $m_{i}$ , obtained by probe $p_{h}$ on edge $e_{j}$ at time t.
$t h r e s h o l d_{1}$ & $t h r e s h o l d_{2}$ : Two predefined thresholds. If the absolute change in a metric’s measurement exceeds $t h r e s h o l d_{1}$ or the rate of change surpasses $t h r e s h o l d_{2}$ , FANT considers this metric and its associated edge to be suspicious.

FANT maintains two Redis lists for each < $m_{i}, e_{j}, p_{l}$ > tuple:

$L i s t_{1}$ sequentially stores values from $v_{i, j, h, (t_{s t a r t} + 1 * δ t)}$ , $v_{i, j, h, (t_{s t a r t} + 2 * δ t)}$ , …, $v_{i, j, h, (t_{s t a r t} + k * δ t)}$ .

$L i s t_{2}$ sequentially stores values from $δ v_{i, j, h, (t_{s t a r t} + 1 * δ t)}$ , …, $δ v_{i, j, h, (t_{s t a r t} + (k - 1) * δ t)}$ . $δ v_{i, j, h, l}$ can be calculated as following: $δ v_{i, j, h, (t_{s t a r t} + n * δ t)} = | v_{i, j, h, (t_{s t a r t} + n * δ t)} - v_{i, j, h, (t_{s t a r t} + (n + 1) * δ t)} |$

Additionally, FANT maintains two (key, value) pairs for each < $m_{i}, e_{j}, p_{h}$ > tuple:

( $k e y_{1}$ , $v_{s u m}$ ): The sum of all values in List 1, $v_{s u m} = \sum_{n = 0}^{k} v_{i, j, l, (t_{s t a r t} + n * δ t)}$ .

( $k e y_{2}$ , $δ_{s u m}$ ): The sum of all values in List 2, $δ_{s u m} = \sum_{n = 0}^{k - 1} δ v_{i, j, h, (t_{s t a r t} + n * δ t)}$ .

As Algorithm A1 shows, since the calculation process for any $v_{i, j, h, t}$ is executed only a constant number of times, the complexity of analyzing each cycle is O (the sum of the number of metrics received by all probes).

Algorithm A1: Analysis of Suspicious Links

[Image omitted. Please see PDF.]

References

1. Li, Y.; Miao, R.; Liu, H.H.; Zhuang, Y.; Feng, F.; Tang, L.; Cao, Z.; Zhang, M.; Kelly, F.; Alizadeh, M. et al. HPCC: High precision congestion control. Proceedings of the ACM Special Interest Group on Data Communication; Beijing, China, 19–24 August 2019; pp. 44-58.

2. Gandhi, R.; Liu, H.H.; Hu, Y.C.; Lu, G.; Padhye, J.; Yuan, L.; Zhang, M. Duet: Cloud scale load balancing with hardware and software. ACM SIGCOMM Comput. Commun. Rev.; 2014; 44, pp. 27-38. [DOI: https://dx.doi.org/10.1145/2740070.2626317]

3. Horn, A.; Kheradmand, A.; Prasad, M. Delta-net: Real-time network verification using atoms. Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17); Boston, MA, USA, 27–29 March 2017; pp. 735-749.

4. Fedor, M.; Schoffstall, M.L.; Davin, J.R.; Case, D.J.D. Simple Network Management Protocol (SNMP). RFC 1157; 1990; Available online: https://www.rfc-editor.org/info/rfc1157 (accessed on 1 December 2024).

5. Panchen, S.; McKee, N.; Phaal, P. InMon Corporation’s sFlow: A Method for Monitoring Traffic in Switched and Routed Networks. RFC 3176; 2001; Available online: https://www.rfc-editor.org/info/rfc3176 (accessed on 1 December 2024).

6. Agarwal, S.; Kodialam, M.; Lakshman, T. Traffic engineering in software defined networks. Proceedings of the 2013 Proceedings IEEE INFOCOM; Turin, Italy, 14–19 April 2013; pp. 2211-2219.

7. Jeyakumar, V.; Alizadeh, M.; Geng, Y.; Kim, C.; Mazières, D. Millions of little minions: Using packets for low latency network programming and visibility. ACM SIGCOMM Comput. Commun. Rev.; 2014; 44, pp. 3-14. [DOI: https://dx.doi.org/10.1145/2740070.2626292]

8. In-Band Network Telemetry (INT) Dataplane Specification Version v0.5. Available online: https://p4.org/p4-spec/docs/INT_v0_5.pdf (accessed on 1 December 2024).

9. In-Band Network Telemetry (INT) Dataplane Specification Version v1.0. Available online: https://p4.org/p4-spec/docs/INT_v1_0.pdf (accessed on 1 December 2024).

10. In-Band Network Telemetry (INT) Dataplane Specification Version v2.0. Available online: https://p4.org/p4-spec/docs/INT_v2_0.pdf (accessed on 1 December 2024).

11. In-Band Network Telemetry (INT) Dataplane Specification Version v2.1. Available online: https://p4.org/p4-spec/docs/INT_v2_1.pdf (accessed on 1 December 2024).

12. Zhou, Y.; Sun, C.; Liu, H.H.; Miao, R.; Bai, S.; Li, B.; Zheng, Z.; Zhu, L.; Shen, Z.; Xi, Y. et al. Flow event telemetry on programmable data plane. Proceedings of the Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication; New York, NY, USA, 10–14 August 2020; pp. 76-89.

13. Song, E.; Pan, T.; Jia, C.; Cao, W.; Zhang, J.; Huang, T.; Liu, Y. Int-filter: Mitigating data collection overhead for high-resolution in-band network telemetry. Proceedings of the GLOBECOM 2020—2020 IEEE Global Communications Conference; IEEE, Taipei, Taiwan, 7–11 December 2020; pp. 1-6.

14. Song, E.; Pan, T.; Jia, C.; Cao, W.; Zhang, J.; Huang, T.; Liu, Y. INT-label: Lightweight in-band network-wide telemetry via interval-based distributed labelling. Proceedings of the IEEE INFOCOM 2021-IEEE Conference on Computer Communications; IEEE, Vancouver, BC, Canada, 10–13 May 2021; pp. 1-10.

15. Pan, T.; Song, E.; Bian, Z.; Lin, X.; Peng, X.; Zhang, J.; Huang, T.; Liu, B.; Liu, Y. Int-path: Towards optimal path planning for in-band network-wide telemetry. Proceedings of the IEEE INFOCOM 2019-IEEE Conference On Computer Communications; IEEE, Paris, France, 29 April–2 May 2019; pp. 487-495.

16. Pan, T.; Lin, X.; Song, H.; Song, E.; Bian, Z.; Li, H.; Zhang, J.; Li, F.; Huang, T.; Jia, C. et al. Int-probe: Lightweight in-band network-wide telemetry with stationary probes. Proceedings of the 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS); IEEE, Washington, DC, USA, 7–10 July 2021; pp. 898-909.

17. Zhang, Y.; Pan, T.; Zheng, Y.; Song, E.; Liu, J.; Huang, T.; Liu, Y. INT-Balance: In-Band Network-Wide Telemetry with Balanced Monitoring Path Planning. Proceedings of the ICC 2023-IEEE International Conference on Communications; IEEE, Rome, Italy, 28 May–1 June 2023; pp. 2351-2356.

18. Xie, S.; Hu, G.; Xing, C.; Zu, J.; Liu, Y. FINT: Flexible In-band Network Telemetry method for data center network. Comput. Netw.; 2022; 216, 109232. [DOI: https://dx.doi.org/10.1016/j.comnet.2022.109232]

19. Liu, Y.; Xia, Y.; Zhang, W.; Jia, W.; Wu, J. SFANT: A SRv6-Based Flexible and Active Network Telemetry Scheme in Programming Data Plane. IEEE Trans. Netw. Sci. Eng.; 2023; 11, pp. 2415-2425. [DOI: https://dx.doi.org/10.1109/TNSE.2023.3277000]

20. Bashandy, A.; Filsfils, C.; Previdi, S.; Decraene, B.; Litkowski, S.; Shakir, R. Segment Routing with the MPLS Data Plane. RFC 8660; 2019; Available online: https://www.rfc-editor.org/info/rfc8660 (accessed on 1 December 2024).

21. Filsfils, C.; Camarillo, P.; Leddy, J.; Voyer, D.; Matsushima, S.; Li, Z. Segment Routing over IPv6 (SRv6) Network Programming. RFC 8986; 2021; Available online: https://www.rfc-editor.org/info/rfc8986 (accessed on 1 December 2024).

22. Data Fields for In Situ Operations, Administration, and Maintenance (IOAM) RFC 9197. Available online: https://datatracker.ietf.org/doc/rfc9197/ (accessed on 1 December 2024).

23. Kumar, J.; Anubolu, S.; Lemon, J.; Manur, R.; Holbrook, H.; Ghanwani, A.; Cai, D.; Ou, H.; Li, Y.; Wang, X. Inband Flow Analyzer. Internet-Draft draft-kumar-ippm-ifa-08, Internet Engineering Task Force. 2024; Available online: https://datatracker.ietf.org/doc/draft-kumar-ippm-ifa/08/ (accessed on 1 December 2024).

24. Hitotumatu, H.; Noshita, K. A technique for implementing backtrack algorithms and its application. Inf. Process. Lett.; 1979; 8, pp. 174-175. [DOI: https://dx.doi.org/10.1016/0020-0190(79)90016-4]

25. Knuth, D.E. Dancing links. arXiv; 2000; arXiv: cs/0011047

26. Korf, R.E. Depth-first iterative-deepening: An optimal admissible tree search. Artif. Intell.; 1985; 27, pp. 97-109. [DOI: https://dx.doi.org/10.1016/0004-3702(85)90084-0]

27. Topology Generator. Available online: https://github.com/graytower/INT_Probe/tree/main/MDCPP_set/GraphGenerator (accessed on 1 December 2024).

Word count: 10823

Show less

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Translate

As data center networks grow in scale and complexity, the active inband network telemetry (AINT) system collects a broader range of network status metrics to provide comprehensive visibility for AINT-related network applications, but it also leads to higher measurement costs. To address this issue, we introduce the Flexible Attention-shifting Network Telemetry (FANT), which dynamically focuses on critical links in each measurement cycle. Specifically, FANT employs a metric categorization strategy that divides all measurement metrics into two categories: basic measurements, which are lightweight but cover fewer metrics, and detailed measurements, which are comprehensive but incur higher overhead. Based on the analysis of the previous cycle’s measurements, FANT identifies which links are suspicious and then activates certain probe traces through an attention-shifting mechanism to collect detailed measurements of these links in the current cycle. To further save bandwidth, we model the attention-shifting process and apply heuristic algorithms to solve it. Our experiments show that FANT effectively supports the operation of ANT network applications. In a fat-tree topology with 30 pods, FANT significantly reduces bandwidth usage to 42.6% of the state-of-the-art solution. For scenarios requiring rapid computation, FANT can accelerate algorithm execution $10^{5}$ × by setting acceleration factors, with only a 6.4% performance loss.

Details

Title

FANT: Flexible Attention-Shifting Network Telemetry

Author

Cui, Mingwei¹

; Peng, Yufan¹; Yang, Fan²

¹ State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China; [email protected] (M.C.); [email protected] (Y.P.)
² State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China; [email protected] (M.C.); [email protected] (Y.P.); Purple Mountain Laboratories, Nanjing 211111, China

First page

892

Publication year

2025

Publication date

2025

Publisher

MDPI AG

e-ISSN

20763417

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.3390/app15020892

ProQuest document ID

3159291604