The deployment of parallel Service Function Chains (SFCs) in Network Function Virtualization (NFV) environments presents significant challenges in jointly optimizing Virtual Network Function (VNF) parallelization and placement decisions. Traditional approaches typically decouple these decisions, leading to suboptimal performance and inefficient resource utilization. This paper proposes HGNN-PSFC, a novel heterogeneous graph neural network-assisted multi-agent deep reinforcement learning framework that jointly optimizes VNF parallelization and placement for parallel SFC deployment. Our approach employs cooperative agents: a Parallelization Agent that determines optimal VNF parallelization structures, and multiple Placement Agents that make VNF placement decisions. The framework utilizes a heterogeneous graph representation to capture complex relationships between VNFs, substrate network topology, and current VNF placement states. Through Multi-Agent Proximal Policy Optimization (MAPPO) training within a Centralized Training with Decentralized Execution (CTDE) paradigm, our method achieves effective coordination between parallelization and placement decisions. Extensive experimental results demonstrate that HGNN-PSFC achieves near-optimal performance with approximately 92% of the optimal algorithm’s effectiveness while maintaining polynomial computational complexity.
Introduction
The exponential growth of network services demands increasingly sophisticated processing capabilities to meet stringent quality of service (QoS) requirements. Traditional network architectures, where network functions are implemented through proprietary hardware devices, face significant limitations in terms of flexibility, cost-efficiency, and scalability (Mijumbi et al. 2015). These hardware-based approaches often result in rigid network infrastructures that struggle to adapt to evolving service requirements and dynamic traffic patterns, leading to suboptimal resource utilization and increased operational complexity (Herrera and Botero 2016).
Network Function Virtualization (NFV) and Software-Defined Networking (SDN) have emerged as transformative paradigms that address these limitations by decoupling network functions from dedicated hardware and separating control logic from data forwarding. NFV enables the implementation of Virtual Network Functions (VNFs) as software components that can execute on general-purpose hardware (Yi et al. 2018), while SDN provides programmable control over network traffic by centralizing network intelligence (Kreutz et al. 2014). Together, these technologies create a more agile and efficient networking environment, enabling dynamic deployment and management of network services.
In the NFV paradigm, network services are commonly deployed as Service Function Chains (SFCs), which are ordered sequences of VNFs that process network traffic to deliver specific services (Mijumbi et al. 2015). While NFV and SDN enable flexible SFC deployment, the traditional serial execution model of SFCs presents significant performance challenges, particularly for latency-sensitive applications. As packets traverse each VNF sequentially, the end-to-end delay increases proportionally with chain length, creating bottlenecks for ultra-low latency services such as autonomous driving, telemedicine, and interactive gaming. Additionally, virtualization overhead can further degrade performance compared to hardware-based implementations (Sun et al. 2017a).
To address the aforementioned issues, one potential solution is to accelerate each individual network function (Li et al. 2016). However, this approach yields only a limited reduction in latency. In contrast, most recent studies aim to reduce SFC latency by introducing parallelism among VNFs. Studies have revealed that 53.8% of VNF pairs in an enterprise network can operate in parallel, while 41.5% of VNF pairs can be parallelized without incurring additional resource overhead (Sun et al. 2017b). Parallel processing of VNFs can significantly reduce end-to-end packet latency by exploiting functional independence and computational parallelism.
Early approaches to VNF parallelization, such as those by Sun et al. (2017b) and Zhang et al. (2017), focus on deploying VNFs in parallel on the same server to minimize packet duplication overhead through shared memory access. However, due to limited computational resources on individual servers, collocating all parallel VNFs on the same physical host is often impractical and restricts overall SFC deployment flexibility. When VNFs are distributed across multiple servers, two main challenges emerge:
Over-parallelization overhead
To achieve parallel execution across distributed servers, each packet must be duplicated and forwarded to parallel VNF instances, and their outputs subsequently merged. This duplication and merging process increases both processing latency and network bandwidth consumption, particularly under high-throughput or bursty traffic conditions. When individual VNFs have low intrinsic execution times, parallelization can paradoxically increase end-to-end delay due to synchronization and merging costs.
Packet reordering and buffering issues
Disparate processing delays among parallel VNF instances cause out-of-order packet arrivals at the merging module, necessitating additional memory buffers to cache early arrivals and wait for stragglers. Such buffering can exacerbate latency and memory overhead, degrading overall system performance.
Consequently, as observed by Lin et al. (2021), partial parallelization often yields the optimal trade-off between performance and resource consumption. Therefore, determining the optimal parallelization degree, in particular the construction of parallel SFC branches, is essential prior to deploying parallel SFCs. The number of parallel branches and the processing time of VNFs within each branch significantly influence parallelization effectiveness. Since VNF processing times vary across different nodes due to heterogeneous hardware capabilities and varying load conditions (Troia et al. 2023), the placement of parallel VNFs directly impacts their construction efficiency. This interdependence necessitates a joint optimization approach for both the construction and placement of parallel VNFs in order to achieve optimal performance.
Figure 1 illustrates the parallel SFC placement process. Beyond determining VNF placement and routing paths, it is critical to configure an optimal degree of parallelism to minimize end-to-end service chain delay. Existing research has explored heuristic (Luo et al. 2020; Lin and Chou 2021) and reinforcement learning (RL) approaches (Cai et al. 2021; Zhang et al. 2024) for parallel SFC deployment. However, compared to serial SFC deployment, parallel SFC deployment presents additional complexities, and several open challenges remain unaddressed: 1) Separate treatment of parallelization and placement. Most studies decouple SFC parallelism decisions from VNF placement strategies, leading to suboptimal end-to-end performance. 2) Lack of RL-based parallelism adaptation. There is a scarcity of RL methods that dynamically determine the optimal parallelization configuration by capturing network state variations. 3) Insufficient distributed placement solutions. There is a lack of RL-based distributed placement methods for parallel SFCs. For example, some studies consider placing mutually parallel VNFs only on a single physical node, while others treat VNF placement as a deployed-instance selection problem (Lin et al. 2021).
[See PDF for image]
Fig. 1
Illustration of parallel SFC placement
Our method comprises two types of cooperative agents: a Parallelization Agent dedicated to determining the degree and structure of SFC parallelism, and multiple Placement Agents for VNF placement decisions. A multi-agent cooperative learning paradigm is employed to simplify the joint action space and accelerate training. This multi-agent approach is superior to single-agent alternatives as it decomposes the complex joint optimization problem into more manageable sub-tasks, allowing each specialized agent to focus on its specific objective while maintaining coordination through shared information. Both agent policies are trained offline using Multi-Agent Proximal Policy Optimization (MAPPO) (Yu et al. 2022). We represent both the substrate network and the parallel SFC request as graph-structured data and use a heterogeneous GNN extractor to capture diverse node and edge features effectively.
The main contributions of this paper are summarized as follows:
A novel heterogeneous graph neural network-assisted multi-agent deep reinforcement learning algorithm for parallelized SFC deployment is proposed, which jointly optimizes SFC parallelization and VNF placement, leading to significant improvements in deployment efficiency.
We propose, to our knowledge, the first RL-based solution to the SFC parallelization problem. Our approach employs a novel VNF parallelism graph representation that captures dependencies between VNFs while substantially reducing action space complexity. The framework integrates parallelization decisions with placement optimization in a unified learning process, avoiding suboptimal outcomes of decoupled approaches.
We develop a comprehensive heterogeneous graph representation that captures complex relationships between VNFs, substrate network topology, and existing VNF-to-node mappings, enabling more effective feature learning for parallel SFC deployment decisions.
Related works
SFC parallelization
Recent research has proposed parallelizing SFCs to reduce service latency by enabling simultaneous VNF processing. Zhang et al. (2017) and Sun et al. (2017b) pioneered the exploration of parallelism between different VNFs by analyzing dependencies between VNFs in SFCs to determine parallelization feasibility. These dependencies are determined by VNFs’ read and write operations on packet headers or payloads. These studies establish guidelines for identifying parallelizable VNFs in SFCs and deploy all parallelizable VNFs on the same server, duplicating only packet headers while accessing payloads through shared memory to avoid high overhead of complete packet duplication. Building on this foundation, Cai et al. (2020) extended the study of VNF dependencies to determine parallelization feasibility for any two VNFs in distributed networks. They designed a parallelization algorithm to convert serial SFCs into parallelized SFCs represented as service graphs. Zhang et al. (2019) proposed a parallelization framework supporting SFC parallelism across multi-core servers with a HybridSFC controller enabling parallelism at both VNF and traffic levels.
However, when parallelizable VNFs cannot be served by the same server due to limited capacity constraints, VNFs must be deployed on different servers, causing additional overhead from over-parallelization and packet reordering problems. Therefore, recent research by Wang et al. (2019), Lin et al. (2021), and Zhang et al. (2023a) has focused on determining optimal partial VNF parallelism that yields minimum end-to-end latency. Wang et al. (2019) proposed ParaNF, a delay-balanced parallelized SFC processing framework for delay coordination between parallel branches. They introduced a delay-balanced SFC parallelism optimization problem to eliminate unnecessary parallelism, though ParaNF lacks deployment flexibility due to strict requirements that newly created branch processing latency must not exceed maximum branch processing latency.
Similarly, Lin et al. (2021) investigated optimal parallel processing of distributed VNFs under different SFC scenarios and proposed PPC (Partially Parallelized Chain processing). The authors enumerate all feasible parallel structures for each requesting SFC and develop a heuristic approach to find optimal partial parallelism. Their results indicate that when packet processing delay exceeds VNF execution delay, SFC parallelization benefits are minimal. Zhang et al. (2023a) proposed a parallel SFC deployment model assuming VNF sojourn time as a nonlinear function of arrival traffic and allocated computational resources, determining location and routing of parallel SFCs to address imbalance problems while minimizing deployment cost under end-to-end delay constraints.
Parallelized SFC deployment
Beyond identifying parallelizable VNFs, research has increasingly focused on mapping parallelized service chains onto physical network infrastructure. Existing approaches can be categorized into heuristic algorithms and reinforcement learning-based methods.
Heuristic approaches
Early research primarily focused on latency minimization, recognizing the key advantage of SFC parallelization in reducing end-to-end service delay. Bao et al. (2020) proposed a Directed Acyclic Graph (DAG) abstraction model for parallel SFCs and introduced the Prune and Plant (P&P) scheme, which first prunes the original DAG into a series-parallel graph to maintain parallelism while eliminating NP-hardness, then optimizes VNF placement to minimize delay. Luo et al. (2020) developed ParaSFC, a dynamic programming-based approach that sequentially deploys parallelized SFCs while finding optimal deployment paths, introducing concepts of resource abusiveness and node competitiveness to coordinate resource competition. Xie et al. (2020) emphasized joint optimization of parallelization and placement decisions with FlexChain, enabling VNFs on the same server to work in parallel while allowing SFC placement across multiple servers.
Recognizing traditional delay models’ limitations, Zhang et al. (2023a) proposed a deployment model considering traffic-dependent delay characteristics, where VNF sojourn time is modeled as a nonlinear function of both traffic and allocated resources. Their approach incorporates VNF sharing and addresses imbalance between parallel branches through flexible resource allocation. Concurrently, Zheng et al. (2022) introduced a comprehensive framework for parallelism-aware service chaining and embedding (PSFCE) that jointly optimizes processing and propagation delays. They formalized parallel relationships using Parallel Graphs and quantified processing delay savings through Parallel Block Gain (PBG).
A significant advancement in addressing distributed VNF parallelism challenges came from Lin and Chou (2021), who identified that parallel functions distributed across multiple servers introduce substantial coordination overhead. They proposed a scoring-based VNF embedding algorithm that clusters independent functions to ensure homogeneous delays among parallel functions. Lin et al. (2021) further refined this approach with their Partial Parallel Chaining (PPC) framework, which identifies optimal partial parallelism by jointly considering function placement and parallelization costs, demonstrating that strategic partial parallelization often outperforms both full parallelism and sequential chaining.
Beyond delay optimization, research has expanded to address multiple objectives in parallelized SFC deployment. Lin et al. (2018) developed breadth-first search-based greedy methods to minimize VNF rental and link costs. Chintapalli et al. (2023) introduced ERASE, the first scheme to jointly consider energy consumption minimization and reliability guarantees in parallelized SFC deployment, employing a three-stage process with novel backup strategies. For dynamic network environments, Lin et al. (2022) proposed SFT-Box, an architecture enabling real-time response to multiple hybrid SFC requests through a pre-computing strategy with modules for problem decomposition, sub-solution calculation, and fast online solution generation.
While these heuristic approaches have made substantial progress in addressing various aspects of parallelized SFC deployment, they typically rely on problem-specific assumptions and may not adapt well to dynamic network conditions or varying traffic patterns.
Reinforcement learning approaches
Recognizing limitations of heuristic approaches in adapting to dynamic network environments, researchers have increasingly explored reinforcement learning techniques for parallelized SFC deployment. Cai et al. (2021) pioneered this direction with their Adaptive Parallel Processing Mechanism (APPM), which introduced joint optimization combining parallelism optimization and deployment scheduling using a Q-learning-based Joint Optimization algorithm (JORL). APPM first optimizes the parallel structure using a Parallelism Optimization Algorithm (POA) based on bin-packing principles, then applies JORL for deployment and scheduling. This work demonstrated RL’s effectiveness in reducing delay while maintaining high resource utilization. Huang et al. (2022) developed a Delay-Aware traffic Scheduling Mechanism (DASM) that transformed Parallel SFCs (PSFCs) into multiple serial SFCs with reduced VNF dependencies, then employed Q-learning for delay-aware traffic scheduling. Zhang et al. (2023b) introduced PVRDU (Parallelizing VNFs under Resource Demand Uncertainty), formulating the problem using deep reinforcement learning with Markov-based approximation algorithms to handle dynamic resource requests. Similarly, Zhang et al. (2024) proposed LVPRU (Latency-aware VNF Parallelization strategy under Resource demand Uncertainty), employing Quadratic Integer Programming combined with Asynchronous Advantage Actor-Critic (A3C) algorithms. LVPRU introduced VNF reuse for computational acceleration and demonstrated the effectiveness of actor-critic methods in handling complex state and action spaces. In the context of mobile edge computing, Lu and Long (2025) developed a Network Function Parallelism (NFP)-enhanced SFC placement framework that addresses a fairness-aware throughput maximization problem.
Further advancing multi-objective optimization, Xu et al. (2023) developed DSPPR (Dynamic SFC Placement scheme with Parallelized SFCs and Reuse of initialized VNFs), which integrated A3C algorithms with VNF Queue Network for extracting the distribution of initialized VNFs over time. DSPPR effectively balanced five competing objectives: deployment success rate, resource consumption, resource utilization, throughput, and delay cost. Huang et al. (2024) introduced PVFP (Parallel VNF Placement), the first approach to realize parallel VNF placement via Federated Deep Reinforcement Learning (FDRL), combining federated learning with deep reinforcement learning to train global models while preserving data privacy across different administrative domains. Recently, Jang et al. (2025) proposed DeepNFP, a deep RL-based network function parallelism configuration scheme operating over Segment Routing over IPv6 (SRv6). DeepNFP dynamically selects sequential or parallel execution for each service function and implements the resulting processing policy via an SRv6-based data plane protocol to reduce configuration overhead.
Despite significant progress, existing RL-based approaches for parallelized SFC deployment face several limitations. Most current methods treat parallelization and placement as separate sequential decisions, leading to sub-optimal solutions due to the lack of joint optimization. Additionally, the action space complexity remains a significant challenge when dealing with large-scale networks and complex parallel structures. The representation of SFC parallel structures in RL frameworks often lacks the expressiveness needed to capture complex VNF dependencies and parallelization opportunities effectively.
Network model and problem formalization
This section presents a comprehensive formalization of the parallelized SFC deployment problem, addressing two interconnected subproblems: (1) constructing parallel SFCs from VNF parallelism relationships, and (2) optimally placing these parallel VNFs in substrate networks while considering resource constraints and delay requirements. We develop mathematical models for both subproblems and formulate a joint optimization objective.
Service function parallelism graph
Prior research by Sun et al. (2017b) has systematically investigated the conditions under which service functions can be executed in parallel. To effectively represent and analyze these parallelism relationships, we introduce the Service Function Parallelism Graph (SFPG). Unlike traditional representation methods that use notation such as parentheses to group parallelizable service functions, this graph-based model provides a formal foundation for describing concurrent execution possibilities. The SFPG not only captures complex parallelism relationships but also facilitates the study of mappings between theoretical parallelism opportunities and implementable parallel service function chains.
Definition 1
(Service Function Parallelism Graph) A Service Function Parallelism Graph (SFPG) is an undirected graph G^s = (F, E^s), where:

F represents the set of VNFs in the service function chain, with each vertex corresponding to a VNF in the chain.

E^s denotes the set of edges between vertices, where an edge (f_j, f_k) exists if and only if VNFs f_j and f_k can be executed in parallel without compromising the functional correctness of the service chain.
The SFPG represents potential parallelism, while the derived structure implementing these relationships is called the Parallel Service Function Graph (PSFG). Figure 2 illustrates both an SFPG and its corresponding Parallel Service Function Graph.
[See PDF for image]
Fig. 2
An example of an SFPG and its corresponding PSFG
The SFPG exhibits several important properties that directly inform parallel SFC deployment strategies. Not all SFPGs can be mapped to valid PSFGs. Specifically, an SFPG containing what we define as a Parallelism Conflict Subgraph (as illustrated in Fig. 3) cannot be mapped to a valid implementation:
[See PDF for image]
Fig. 3
Parallelism Conflict Subgraph
Definition 2
(Parallelism Conflict Subgraph) A Parallelism Conflict Subgraph is a substructure within an SFPG where VNFs f_a, f_b, f_c, and f_d form a path such that:

Edges (f_a, f_b), (f_b, f_c), and (f_c, f_d) exist in the SFPG

Edges (f_a, f_c), (f_b, f_d), and (f_a, f_d) do not exist in the SFPG
Theorem 1
An SFPG containing a Parallelism Conflict Subgraph cannot be mapped to a valid PSFG.
Proof of Theorem 1
We prove that an SFPG containing a Parallelism Conflict Subgraph cannot be mapped to a valid Parallel Service Function Graph through the following steps:
First, we demonstrate that a Parallelism Conflict Subgraph itself cannot be mapped to a valid implementation. Consider VNFs f_a, f_b, f_c, and f_d forming a path in the SFPG. By definition, f_a can execute in parallel with f_b, f_b with f_c, and f_c with f_d. However, f_a cannot execute in parallel with f_c, f_a cannot execute in parallel with f_d, and f_b cannot execute in parallel with f_d. If we attempt to position these functions in a Parallel Service Function Graph:

If f_a and f_b are placed in parallel, and f_b and f_c are placed in parallel, then f_a and f_c must not be in parallel (as no edge exists between f_a and f_c). This means f_a and f_c must form a sequential chain that together runs in parallel with f_b.

If f_b and f_c are placed in parallel, and f_c and f_d are placed in parallel, then f_b and f_d must not be in parallel (as no edge exists between f_b and f_d). This means f_b and f_d must form a sequential chain that together runs in parallel with f_c.

Consequently, f_d must be in parallel with the sequential chain containing f_c (from the first condition); but that chain also contains f_a, and f_d cannot be parallel with f_a (as no edge exists between f_a and f_d), creating a contradiction. Since any valid PSFG would induce a valid parallel structure on this subset of VNFs, an SFPG containing a Parallelism Conflict Subgraph admits no valid PSFG.
An SFPG may be a disconnected graph, where each connected component forms a parallel module. These disconnected components maintain sequential relationships with each other, with their execution order determined by the specified sequential constraints. Through our analysis, we observe that the complement graph of an SFPG represents the inverse parallelism relationship, where an edge indicates that two VNFs cannot be executed in parallel.
Notably, the complement of a Parallelism Conflict Subgraph is isomorphic to itself. We define an SFPG without any Parallelism Conflict Subgraph as a "feasible SFPG". The corresponding PSFG derived from a feasible SFPG represents a fully parallelized SFC. For implementation flexibility, we can also derive partially parallelized SFCs by selectively removing certain edges from a feasible SFPG, representing different implementation trade-offs.
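The feasibility test implied by Definition 2 can be implemented directly: a Parallelism Conflict Subgraph is four VNFs forming an induced path, so a brute-force scan over vertex quadruples suffices for typical chain lengths. The sketch below is illustrative only (function and variable names are not from the paper); it assumes VNFs are represented by hashable identifiers.

```python
from itertools import permutations

def has_conflict_subgraph(vnfs, parallel_edges):
    """Return True if the SFPG contains a Parallelism Conflict Subgraph,
    i.e., four VNFs a-b-c-d with edges (a,b), (b,c), (c,d) present and
    (a,c), (b,d), (a,d) absent (an induced path on four vertices)."""
    edges = {frozenset(e) for e in parallel_edges}
    for a, b, c, d in permutations(vnfs, 4):
        path = {frozenset((a, b)), frozenset((b, c)), frozenset((c, d))}
        chords = {frozenset((a, c)), frozenset((b, d)), frozenset((a, d))}
        if path <= edges and not (chords & edges):
            return True
    return False
```

For example, an SFPG whose only edges form the chain f1-f2-f3-f4 is infeasible, whereas a fully connected SFPG on the same four VNFs is feasible. The O(n^4) scan is acceptable because service chains are short; larger instances would call for a proper cograph-recognition algorithm.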
Substrate networks
We model the substrate network as a weighted undirected graph G = (N, L), where N denotes the set of substrate physical nodes and L refers to the set of substrate physical links. Each substrate node n ∈ N has available computing resources C_n. The processing delay for a VNF instance of type m at node n is denoted as d_{m,n}. Each physical link l ∈ L connecting adjacent nodes n_1 and n_2 has bandwidth capacity B_l and transmission delay d_l.
Table 1. Notation Summary

| Symbol | Description |
|---|---|
| G_i^s | Service Function Parallelism Graph for request i |
| G_i^d | Parallel Service Function Graph for request i |
| G = (N, L) | Substrate network with nodes N and links L |
| R | Set of SFC requests |
| r_i = (s_i, t_i, b_i) | SFC request i with source s_i, destination t_i, rate b_i |
| f_j^i | The j-th VNF in request i |
| K_i | Number of VNFs in request i |
| M | Set of VNF types |
| m_j^i | VNF type of f_j^i |
| C_n | Computing capacity of node n |
| B_l | Bandwidth capacity of link l |
| c_{m,n} | Computing requirements of VNF type m at node n |
| d_{m,n} | Processing delay of VNF type m at node n |
| d_l | Transmission delay of link l |
| y_{j,k}^i | Binary variable: edge (j, k) retained in modified SFPG |
| x_{j,n}^i | Binary variable: VNF f_j^i placed on node n |
| z_{j,k,l}^i | Binary variable: virtual link (j, k) uses substrate link l |
| P_i | Set of parallel branches in PSFG |
| D_p | Delay of parallel branch p |
| D_i | End-to-end delay of request i |
For the sake of clarity, all the variables and symbols introduced in this section are summarized in Table 1.
Parallel service function graph
We model the optimally parallelized SFC requests that need to be deployed into the substrate network as Directed Acyclic Graphs (DAGs), termed Parallel Service Function Graphs (PSFGs). For a set of parallelized SFC requests R, each parallelized SFC request r_i ∈ R is represented as r_i = (G_i^s, G_i^d, s_i, t_i, b_i), where G_i^s denotes the Service Function Parallelism Graph representing the parallelism relationships, G_i^d denotes the PSFG structure, s_i represents the source node where packets enter the SFC, t_i represents the destination node where packets exit the SFC, and b_i represents the data rate requirement.

Each PSFG is formally defined as G_i^d = (F_i, E_i^d), where F_i represents the set of vertices and E_i^d represents the set of directed edges. The vertex set includes VNF instances f_j^i, j = 1, ..., K_i, where f_j^i denotes the j-th VNF in parallel SFC request i, and K_i is the number of VNFs in the SFC. To facilitate subsequent formulations, we introduce two virtual functions: f_0^i and f_{K_i+1}^i, representing the virtual ingress and egress functions corresponding to source node s_i and destination node t_i, respectively.

The directed edges E_i^d represent virtual links (VLs) that define traffic flow between VNFs. The set of all VNF types across all requests is represented by M, where |M| is the total number of distinct VNF types. Let m_j^i ∈ M denote the VNF type corresponding to the j-th VNF of service request r_i. The resource requirements for deploying a VNF of type m on node n are characterized by computing resources c_{m,n}.
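As a concrete illustration of this DAG model, the sketch below assembles a PSFG successor structure, attaching the virtual ingress (node 0) to every VNF without a predecessor and the virtual egress (node K+1) to every VNF without a successor. The function and its interface are illustrative assumptions, not the paper's data model.

```python
from collections import defaultdict

def build_psfg(num_vnfs, virtual_links):
    """Return successor lists for a PSFG with VNFs 1..num_vnfs, plus a
    virtual ingress (node 0) and a virtual egress (node num_vnfs + 1)."""
    succ = defaultdict(list)
    for j, k in virtual_links:
        succ[j].append(k)
    has_pred = {k for _, k in virtual_links}
    for j in range(1, num_vnfs + 1):
        if j not in has_pred:          # no predecessor: fed by the ingress
            succ[0].append(j)
        if not succ[j]:                # no successor: drains to the egress
            succ[j].append(num_vnfs + 1)
    return dict(succ)
```

For three VNFs where f_1 and f_2 run in parallel before f_3, `build_psfg(3, [(1, 3), (2, 3)])` yields ingress edges to f_1 and f_2 and an egress edge from f_3.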
Problem definition and formulation
Given a substrate network and a set of parallel SFC requests represented by SFPG graphs, the parallel SFC deployment problem aims to: (1) determine the optimal partial parallelization by constructing valid PSFGs from given SFPGs, and (2) optimally place the VNFs and route virtual links to minimize end-to-end delay while satisfying resource constraints.
Specifically, the problem consists of two phases: In the first phase, we construct a valid PSFG from a given SFPG. The SFPG represents all potential parallelism relationships between VNFs, but not all of these relationships can be simultaneously implemented in a valid deployment. Therefore, we must identify a feasible subset of parallelism relationships that can be realized in optimal partial parallelization. Given an original SFPG G_i^s = (F_i, E_i^s), we aim to derive a modified SFPG G̃_i^s = (F_i, Ẽ_i^s) with Ẽ_i^s ⊆ E_i^s by selectively removing certain edges (possibly none if the original SFPG is already conflict-free). This modified SFPG uniquely corresponds to a valid PSFG that will be deployed as the parallel SFC. In the second phase, given a constructed PSFG converted from a modified SFPG, we place all network function nodes.
Decision variables
We define the following decision variables that jointly represent parallelization and VNF placement for request i:
y_{j,k}^i ∈ {0, 1}: equals 1 if edge (f_j^i, f_k^i) ∈ E_i^s is retained in the modified SFPG, and 0 otherwise.    (1)

x_{j,n}^i ∈ {0, 1}: equals 1 if VNF f_j^i is placed on substrate node n, and 0 otherwise.    (2)

z_{j,k,l}^i ∈ {0, 1}: equals 1 if virtual link (f_j^i, f_k^i) ∈ E_i^d is routed over substrate link l, and 0 otherwise.    (3)
Constraints
The parallel SFC deployment process is subject to several critical constraints, which we elaborate below:
Parallelization constraint
The modified SFPG G̃_i^s must not contain any Parallelism Conflict Subgraph. Treating y_{a,b}^i = 0 for pairs not in E_i^s, this requires, for all distinct VNFs f_a, f_b, f_c, f_d ∈ F_i:

y_{a,b}^i · y_{b,c}^i · y_{c,d}^i · (1 − y_{a,c}^i) · (1 − y_{b,d}^i) · (1 − y_{a,d}^i) = 0    (4)
Resource constraint
For each substrate node n ∈ N, the computing resource constraint must be satisfied:

Σ_{i ∈ R} Σ_{j=1}^{K_i} x_{j,n}^i · c_{m_j^i,n} ≤ C_n, ∀ n ∈ N    (5)
where c_{m_j^i,n} denotes the required computational resources for VNF type m_j^i placed on node n. For each substrate link l ∈ L, the bandwidth constraint ensures that the total bandwidth used by the SFCs does not exceed the link capacity:

Σ_{i ∈ R} Σ_{(j,k) ∈ E_i^d} z_{j,k,l}^i · b_i ≤ B_l, ∀ l ∈ L    (6)
VNF placement constraint
Each VNF must be deployed on exactly one physical node, ensuring uniqueness in placement:
Σ_{n ∈ N} x_{j,n}^i = 1, ∀ i ∈ R, ∀ j ∈ {1, ..., K_i}    (7)
Moreover, the total number of VNFs deployed for request r_i must match the number of required VNFs:

Σ_{j=1}^{K_i} Σ_{n ∈ N} x_{j,n}^i = K_i, ∀ i ∈ R    (8)
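The placement constraints above lend themselves to a simple programmatic check: a candidate placement is feasible if every VNF maps to exactly one node and the aggregate compute demand on each node stays within capacity. The dictionary-based interface below is an illustrative assumption, not the paper's data model.

```python
def placement_feasible(placement, cpu_req, node_cap):
    """Check a candidate VNF placement: the dict maps each VNF to exactly
    one node (uniqueness), and the summed compute demand on every used
    node must stay within that node's capacity."""
    used = {}
    for vnf, node in placement.items():
        used[node] = used.get(node, 0) + cpu_req[(vnf, node)]
    return all(demand <= node_cap[node] for node, demand in used.items())
```

For instance, with capacities {"n1": 10, "n2": 4}, placing two VNFs needing 6 and 3 units on n1 is feasible, while moving a 5-unit VNF to n2 is not.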
Optimization objective
The primary objective of this work is to minimize the end-to-end delay in the deployment of parallel SFCs. Let P_i denote the set of all possible parallel branches in PSFG G_i^d. For each branch p ∈ P_i, let F_p be the set of VNFs on branch p and E_p be the set of virtual links on branch p.
The delay in a parallel SFC has several components that affect the end-to-end performance:
Processing delay
The processing delay at node n for VNF f_j^i is denoted as d_{m_j^i,n}, which depends on both the node capabilities and the service function requirements.
Transmission delay
The transmission delay over a substrate link l for traffic of rate b_i is denoted as d_l.
For parallel execution, we must also account for the additional delays introduced by packet duplication and merging operations, assuming that distributed parallel deployment requires copying only packet headers (Sun et al. 2017b).
Duplication delay
The time required to duplicate packet headers for parallel processing is D_dup = q · δ_dup, where δ_dup is the time to duplicate one packet header and q is the number of parallel branches at the branching node.
Merging delay
The time required to merge results from parallel branches is D_mrg = q · δ_mrg, where δ_mrg is the time to merge one packet header at the merge node.
The delay of branch p is:

D_p = Σ_{f_j^i ∈ F_p} Σ_{n ∈ N} x_{j,n}^i · d_{m_j^i,n} + Σ_{(j,k) ∈ E_p} Σ_{l ∈ L} z_{j,k,l}^i · d_l + D_dup + D_mrg    (9)
The end-to-end delay of request i is determined by the critical path:

D_i = max_{p ∈ P_i} D_p    (10)
The objective is to optimize the parallelization and placement decisions, denoted as y, x, and z, to minimize end-to-end delay while ensuring the successful deployment of SFCs. The objective function is formulated as:

min_{y, x, z} Σ_{i ∈ R} D_i,  s.t. constraints (4)–(8)    (11)
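To make the delay model concrete, the sketch below evaluates a branch delay as the sum of processing, transmission, duplication, and merging terms, and takes the end-to-end delay as the maximum over parallel branches. All numbers and function names are illustrative assumptions.

```python
def branch_delay(proc_delays, trans_delays, dup_delay=0.0, merge_delay=0.0):
    """Delay of one parallel branch: VNF processing plus link transmission
    plus the packet-header duplication and merging overheads."""
    return sum(proc_delays) + sum(trans_delays) + dup_delay + merge_delay

def end_to_end_delay(branch_delays):
    """End-to-end delay of a parallel SFC: the critical (slowest) branch."""
    return max(branch_delays)
```

For two branches, one with processing delays [2.0, 1.5] ms over a 0.3 ms link and one with [3.0] ms over a 0.2 ms link, each paying 0.1 ms duplication and 0.1 ms merging cost, the critical path is the first branch at 4.0 ms: parallelization helps only while no single branch dominates.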
The inherent complexity stems from the VNF placement subproblem being NP-hard, rendering the overall problem computationally intractable for large-scale networks. This motivates our multi-agent deep reinforcement learning approach.
[See PDF for image]
Fig. 4
Overview of the Proposed HGNN-PSFC Approach
Proposed multi-agent reinforcement learning based approach
Overview of the proposed approach
This section presents our novel approach, HGNN-PSFC, for parallel SFC deployment using multi-agent reinforcement learning (MARL) assisted by heterogeneous graph neural networks (Hetero-GNN). We first introduce the multi-agent framework design, followed by the heterogeneous graph neural network architecture for feature extraction.
Multi-agent framework design
Parallel SFC deployment encompasses two distinct subtasks: determining VNF parallelization and placing VNFs across physical nodes. These tasks address fundamentally different aspects and require specialized agents with distinct policy strategies. To address this complexity, we propose a novel MARL framework employing two types of collaborative agents: a Parallelization Agent (PA) and multiple Placement Agents (PLAs), as illustrated in Fig. 4. The Parallelization Agent makes decisions regarding the parallelization structure of SFC components, while each Placement Agent is assigned to a specific VNF in the service chain and determines its optimal physical node placement within the network infrastructure.
This dual-agent architecture facilitates efficient coordination between parallelization and placement strategies through a sequential two-phase process. As illustrated in Fig. 5, in the initial phase, an initial SFPG is constructed based on the pairwise parallel relationships among the SFs in the SFC request. The PA then observes the network state and the SFPG, performing edge-removing actions to eliminate Parallelism Conflict Subgraphs while simultaneously constructing an optimal partial parallel structure. This feasible SFPG is subsequently transformed into a partial parallel PSFG structure using Algorithm 1. In the second phase, PLAs observe the network state and the generated PSFG to determine optimal physical nodes for hosting each VNF component.
[See PDF for image]
Fig. 5
Workflow of the dual-agent architecture
Algorithm 1 implements the transformation from a feasible SFPG to a PSFG structure. The algorithm leverages the fundamental principle that edges in the original graph represent potential for parallelism between VNFs; hence, disconnected components must be executed sequentially. Conversely, in the complement graph, edges imply sequential dependencies, and thus disconnected components can be executed in parallel. As illustrated in Fig. 5, starting from a feasible SFPG, the algorithm recursively constructs the complement graph for each connected component and determines the execution semantics (parallel or sequential) based on these rules. This recursive process continues until the complete PSFG is constructed.
[See PDF for image]
Algorithm 1
Transformation from Feasible SFPG to PSFG
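The recursion behind Algorithm 1 can be sketched as follows. All names are assumptions, not the paper's code; termination relies on the input being a feasible SFPG (i.e., free of Parallelism Conflict Subgraphs, so the graph is complement-reducible).

```python
# Illustrative sketch of the SFPG-to-PSFG recursion. SFPG edges mark
# parallelizable VNF pairs, so disconnected SFPG components are composed
# sequentially; in the complement graph the semantics flip, and
# disconnected components are composed in parallel.

def components(nodes, adj):
    """Connected components of the subgraph induced by `nodes`."""
    seen, comps = set(), []
    for s in sorted(nodes):          # sorted for deterministic output
        if s in seen:
            continue
        stack, comp = [s], set()
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend((adj[u] & nodes) - comp)
        seen |= comp
        comps.append(comp)
    return comps

def to_psfg(nodes, adj, parallel=False):
    """Build a nested ('SEQ' | 'PAR', children) expression from a feasible SFPG."""
    if len(nodes) == 1:
        return next(iter(nodes))
    comps = components(nodes, adj)
    if len(comps) == 1:
        # Connected: recurse on the complement graph and flip the semantics.
        comp_adj = {u: (nodes - {u}) - adj[u] for u in nodes}
        return to_psfg(nodes, comp_adj, not parallel)
    tag = 'PAR' if parallel else 'SEQ'
    return (tag, [to_psfg(c, adj, parallel) for c in comps])
```

For example, with VNFs a and b parallelizable but c sequential with both, the sketch yields a sequential composition of the parallel pair (a, b) followed by c.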
Effective coordination between the parallelization and placement phases is critical to deployment success. In our dual-agent architecture, decisions made by the PA directly shape the observation spaces of the PLAs, creating an implicit information flow that facilitates coordination during execution.
Heterogeneous graph network feature extraction
We model the system state as a heterogeneous graph at each decision step t, where denotes the heterogeneous node collection, represents the heterogeneous edge collection, contains node attribute matrices, and encompasses edge attribute matrices. Here, the term heterogeneous refers to the structural differences between the two types of graphs involved in our problem: the parallel SFC request graph and the substrate network graph, which differ in node types, edge semantics, and connectivity patterns. The heterogeneous node set comprises:
VNF nodes representing virtual network functions derived from the SFPG vertex set, where each node corresponds to a VNF in the service function chain.
Physical nodes representing substrate network infrastructure.
The heterogeneous edge set encompasses:
VNF dependency edges representing parallelization dependencies within service chains, which are derived from the SFPG. It is important to note that the construction of dependency edges differs between agents: for the parallelization agent, dependency edges are derived from the original SFPG, while for the placement agent, dependency edges are derived from the modified SFPG that has been updated by the parallelization agent through selective edge removal decisions.
Physical network edges representing substrate connectivity.
Deployment mapping edges indicating current or potential VNF-to-node assignments.
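The two node types and three edge types above can be assembled into a simple, framework-agnostic structure, sketched below. The dictionary layout and names are illustrative assumptions; an actual implementation would more likely use a graph learning library's heterogeneous graph container.

```python
# Minimal sketch of the heterogeneous state graph at one decision step.
# Node/edge type names are assumptions, not the paper's identifiers.

def build_hetero_state(vnfs, phys_nodes, sfpg_edges, phys_edges, mapping):
    """Assemble the two node types and three edge types.

    vnfs:       {vnf_id: feature_vector}     VNF nodes from the SFPG
    phys_nodes: {node_id: feature_vector}    substrate nodes
    sfpg_edges: [(vnf_u, vnf_v), ...]        parallelization dependencies
    phys_edges: [(node_u, node_v), ...]      substrate links
    mapping:    [(vnf, node), ...]           current VNF-to-node assignments
    """
    return {
        "nodes": {"vnf": vnfs, "phys": phys_nodes},
        "edges": {
            ("vnf", "depends", "vnf"): list(sfpg_edges),
            ("phys", "links", "phys"): list(phys_edges),
            ("vnf", "mapped", "phys"): list(mapping),
        },
    }
```

Note that, as described above, the ("vnf", "depends", "vnf") edges would be rebuilt from the modified SFPG before the placement agents observe the state.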
To extract comprehensive topological and resource information, we develop an enhanced TransformerConv-based architecture (Shi et al. 2020). Initially, we employ separate MLPs to generate base representations:
12
We apply K successive layers of heterogeneous GNNs. At layer , we deploy specialized TransformerConv operations:
13
14
15
We employ additive aggregation to construct unified node representations:
16
17
Finally, we incorporate residual connections to obtain final node embeddings:
18
This heterogeneous graph architecture enables our framework to capture multi-faceted interactions between service requirements, substrate capabilities, and network topology constraints, providing rich contextual embeddings that facilitate informed decision-making across both parallelization and placement optimization tasks.
Reinforcement learning framework
To effectively address the parallel SFC deployment challenge through multi-agent reinforcement learning, we formalize our approach within the Decentralized Partially Observable Markov Decision Process (Dec-POMDP) (Oliehoek et al. 2016) framework. This section establishes the theoretical foundations of our model and details the specific implementation of state spaces, action spaces, and reward functions for our dual-agent architecture.
State space definition
VNF node features
For VNF nodes , the feature vector encapsulates:
19
where represent normalized computational resource demands, captures the relative resource intensity of node compared to the mean computational requirements across all VNF nodes, and serves as a sequential identifier within the service chain architecture.
Physical node features
For physical nodes , the feature vector contains:
20
Here, denotes the available computational capacity remaining at node u. The metric quantifies the resource consumption ratio: values below unity indicate sufficient capacity to accommodate the VNF requirements, while values exceeding unity signal insufficient resources, suggesting either non-availability or the need for alternative placement strategies. The term establishes the relative resource standing of node u against the network-wide average, facilitating load-balancing decisions. The normalized processing delay quantifies computational delay characteristics, where serves as the network-wide delay benchmark. Finally, captures the node's connectivity degree within the network topology.
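A minimal sketch of assembling the physical-node feature vector of Eq. (20) follows. All parameter names are assumptions; it simply packages the five quantities described above (remaining capacity, consumption ratio, relative standing, normalized delay, and degree).

```python
# Hedged sketch of the physical-node feature vector (Eq. 20).
# Assumes cap_u > 0, delay_ref > 0, and degree_max > 0.

def phys_node_features(cap_u, demand, caps_all, delay_u, delay_ref,
                       degree_u, degree_max):
    mean_cap = sum(caps_all) / len(caps_all)
    return [
        cap_u,                  # available computational capacity
        demand / cap_u,         # consumption ratio (<1 means the VNF fits)
        cap_u / mean_cap,       # standing vs. the network-wide average
        delay_u / delay_ref,    # normalized processing delay
        degree_u / degree_max,  # normalized connectivity degree
    ]
```
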
Edge features
SFC dependency edges carry features:
21
Physical network edges are characterized by:
22
where denotes the unused bandwidth capacity on link . The utilization factor evaluates whether current bandwidth reserves can satisfy SFC traffic demands: values below one indicate adequate capacity, while higher values signal potential bottlenecks requiring traffic rerouting. The comparative bandwidth metric positions the link's capacity within the broader network context. Furthermore, represents the normalized current delay on link , where is the maximum delay across all links in the network.
Mapping edges represent deployment relationships with features:
23
where is a binary indicator variable: if function has been deployed on node , and otherwise.
Action space definition
Parallelization agent
The action space for the Parallelization Agent consists of edge-removal operations on the SFPG, defined as:
24
where indicates that edge in the SFPG should be removed, and indicates that edge should be preserved. Each combination of edge removals results in a specific parallelization structure, with the constraint that the resulting graph must be a feasible SFPG without Parallelism Conflict Subgraphs. The PA must learn to select edge removals that optimize parallelization potential while ensuring deployability.
Placement agent
The action space for each Placement Agent at time step t consists of selecting a physical node from the set of available nodes:
25
where each action corresponds to selecting a specific physical node for hosting the VNF. The action creates a new mapping edge and updates the heterogeneous graph state.
Reward function design
The reward function design is critical for guiding the multi-agent learning process toward optimal parallel SFC deployment strategies. Given the sequential nature of our two-phase approach and the distinct objectives of parallelization and placement decisions, we develop a hierarchical reward mechanism that provides appropriate incentives for both individual agent performance and collaborative optimization.
Our reward system employs a three-tier structure: immediate rewards for phase-specific feedback, intermediate rewards for partial deployment assessment, and global rewards for overall deployment quality evaluation.
Immediate rewards
During the parallelization phase, the PA receives immediate feedback based on the validity of the generated SFPG structure:
26
where denotes the number of detected Parallelism Conflict Subgraphs, is the conflict penalty coefficient, is the valid parallelization reward coefficient, and represents the ratio of preserved parallelization edges, encouraging maximal parallelization potential while maintaining feasibility.
For placement agents, immediate rewards provide feedback on constraint satisfaction and placement validity:
27
where represent penalty coefficients for resource violations, invalid placements, and success rewards for valid placements, respectively.
Intermediate rewards
During the placement phase, when some PLAs have attempted placement while others remain unassigned, we provide differentiated rewards based on each agent’s current status and performance:
28
where represents the end-to-end delay of the currently deployed partial path that includes successfully placed VNFs, is a baseline delay reference (e.g., random placement), is the failure penalty coefficient, and is the weighting parameter that encourages placements contributing to lower partial delays.
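The intermediate reward of Eq. (28) can be sketched as below. Coefficient names and the exact functional form are assumptions; the sketch only captures the three cases described above (unassigned, failed, and successfully placed agents).

```python
# Illustrative sketch of the intermediate placement reward (Eq. 28).
# A placed agent is rewarded when the partial-path delay beats a baseline
# (e.g., random placement); a failed placement is penalized; agents that
# have not yet acted receive no intermediate reward.

def intermediate_reward(status, partial_delay=None, baseline_delay=None,
                        beta=1.0, fail_penalty=5.0):
    if status == "unassigned":
        return 0.0
    if status == "failed":
        return -fail_penalty
    # Positive when the partial deployment is faster than the baseline,
    # encouraging placements that contribute to lower partial delays.
    return beta * (baseline_delay - partial_delay) / baseline_delay
```
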
[See PDF for image]
Fig. 6
Framework of the MAPPO algorithm within the CTDE paradigm for parallel SFC deployment
Global rewards
Upon successful completion of the entire deployment process, all agents receive a shared global reward based on the overall system performance:
29
where is the end-to-end delay for request i, and is the global optimization coefficient.
Final reward allocation
The final reward for each agent combines immediate feedback, intermediate assessment, and global performance:
30
31
where is the global reward weight that ensures all agents benefit from successful collaborative deployment while maintaining sensitivity to individual performance. To ensure training stability and convergence, we apply reward normalization using exponential moving averages.
Training process and algorithms
In this work, our novel MARL framework, HGNN-PSFC, for parallel SFC deployment is trained using MAPPO, an extension of the Proximal Policy Optimization (PPO) algorithm (Schulman et al. 2017) designed for multi-agent environments. As illustrated in Fig. 6, MAPPO operates within the Centralized Training with Decentralized Execution (CTDE) paradigm (Lowe et al. 2017), which is particularly well-suited for our dual-agent architecture. During training, the Parallelization Agent and Placement Agents share information to enable a centralized critic to evaluate joint actions accurately, while maintaining separate policy networks for decentralized execution. This approach facilitates effective coordination between parallelization decisions and subsequent placement strategies while ensuring that each agent can operate independently during deployment.
Each agent type has separate actor and critic networks. The actor produces a probability distribution over actions given local observations, and the critic estimates the state value for advantage computation. As an on-policy algorithm, MAPPO improves data efficiency through importance sampling and a clipping mechanism to prevent excessive policy updates. The importance sampling ratio is defined as , where represents the policy parameters used to collect experience tuples . The clipped objective function in MAPPO is given by:
32
where is the advantage function estimated using generalized advantage estimation (GAE) (Schulman et al. 2015):
33
with , where is the discount factor and is the GAE parameter. Our approach employs two distinct categories of agents, the Parallelization Agent and multiple Placement Agents, each with specialized actor and critic networks tailored to their respective roles. Denoting a generic agent by index k (where PA or ), the actor objective is:
34
where B denotes the batch size and represents the number of agents. The probability ratio between updated and previous policies is denoted as , and is the advantage function for agent k.
For the critic networks, the objective is to minimize the value function loss, defined as:
35
where represents the predicted value of the state , and is the discounted reward-to-go for agent k.
The overall training process is outlined in Algorithm 2, which iterates over multiple episodes until convergence. During each episode, both agent types interact with the environment in their prescribed sequence, collecting observations, rewards, and next states. These experiences are then used to update the actor and critic networks according to the MAPPO algorithm.
[See PDF for image]
Algorithm 2
MAPPO for Parallel SFC Deployment.
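Two core ingredients of the MAPPO update, generalized advantage estimation (Eq. 33) and the clipped surrogate objective (Eq. 32), can be sketched in plain Python. Hyperparameter names are the standard PPO/GAE ones, not taken verbatim from the paper.

```python
# Pure-Python sketch of GAE and the PPO clipping mechanism used by MAPPO.

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation:
    delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
    A_t     = sum_l (gamma * lam)^l * delta_{t+l}
    `values` has one more entry than `rewards` (the bootstrap value)."""
    advantages, running = [0.0] * len(rewards), 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running   # backward recursion
        advantages[t] = running
    return advantages

def clipped_objective(ratio, advantage, eps=0.2):
    """Per-sample clipped surrogate: min(r*A, clip(r, 1-eps, 1+eps)*A).
    Clipping bounds the policy update when the importance ratio drifts."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)
```

Taking the minimum makes the objective pessimistic: a large ratio cannot inflate a positive advantage beyond 1+eps, and a small ratio cannot mute a negative advantage below 1-eps, which is what prevents excessive policy updates.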
By leveraging MAPPO within the CTDE framework, our approach enables effective coordination between the Parallelization Agent and Placement Agents, balancing the trade-offs between parallelization potential and resource constraints. The Parallelization Agent learns to identify and eliminate Parallelism Conflict Subgraphs while maximizing parallelization opportunities, and the Placement Agents learn to select optimal physical nodes that satisfy the constraints imposed by the parallelization structure.
Complexity analysis
In the HGNN-PSFC method, the training phase is performed offline, allowing computationally intensive model updates without affecting real-time system performance. Its computational complexity is proportional to both the size of the training data and the training duration. The offline training phase ensures that, once completed, the system can efficiently handle SFC deployment tasks using the trained heterogeneous graph neural network model. Since the online scenario is where computational efficiency is most critical, we focus our analysis on the online complexity during real-time execution of SF placement decisions.
The complexity of the online execution is mainly driven by the feedforward pass of the heterogeneous graph neural network for graph representation learning. HGNN-PSFC exhibits a computational complexity of , where |F| and denote the number of VNF nodes and SFC dependency edges, |N| and denote the number of physical nodes and substrate links, K denotes the number of GNN layers, and d denotes the embedding dimension.
Concretely, when constructing a solution for one parallel SFC deployment instance, HGNN-PSFC performs inference |F| times with the GNN policy for placement decisions, plus one additional inference for parallelization decisions. One TransformerConv layer has complexity , where and denote the number of nodes and edges. Our heterogeneous enhancement incorporates three specialized attention computation types: VNF dependency modeling with complexity , physical network topology modeling with complexity , and cross-graph mapping relationship modeling with complexity .
Considering that |F| is significantly smaller than and typically in practical network systems, the overall complexity of HGNN-PSFC is .
Because training is performed offline and inference is polynomial in substrate size and embedding dimension, HGNN-PSFC provides predictable online latency for repeated decision-making. By contrast, exact combinatorial solutions (e.g., ILP formulations or dynamic programming approaches such as PPC) exhibit worst-case super-polynomial or exponential growth in problem parameters. In particular, the SF-cluster assignment step of PPC has been shown to admit worst-case growth of order in chain length |F| (cf. Lin et al. 2021), making such methods practical only for small |F| or when exact optimality is required.
Performance evaluation and analysis
This section presents the experimental evaluation of the proposed reinforcement learning-based parallel SFC deployment framework. The evaluation is conducted in two parts: the first part validates the effectiveness of parallel SFC placement by comparing with baseline methods when parallel SFC requests (represented as PSFGs, as described earlier) are pre-generated, focusing solely on the placement decision. The second part evaluates the complete framework performance where both parallelization and placement decisions are made jointly by the cooperative agents.
Baseline methods and algorithms compared
To the best of our knowledge, no existing RL-based approach explicitly addresses the parallelization decision in parallel SFC deployment. Therefore, for a fair comparison, we construct hybrid baselines in which the parallelization configuration is determined by an enumeration-based non-RL algorithm, while the VNF placement is handled by representative RL algorithms.
We compare the proposed method with four baseline algorithms:
Partial Parallel Chaining (PPC) (Lin et al. 2021): This approach identifies optimal partial parallelism configuration and instance selection to minimize end-to-end delay. PPC consists of two phases: partial parallelism enumeration using dynamic programming to find all feasible configurations, and NF instance assignment using layer graph construction to minimize overall delay considering execution, transmission, and parallelization costs.
Sequential Chaining (SEQ-PPC): This baseline represents PPC with a constrained partial parallelism enumeration phase that only considers sequential execution configurations. In this variant, all network functions are processed in strict sequential order without any parallelization, essentially selecting the fully sequential option from PPC’s configuration space. The NF instance assignment phase remains identical to PPC, using layer graph construction for optimal instance selection.
Full Parallelization (FULL-PPC): This baseline represents PPC with a constrained partial parallelism enumeration phase that only considers maximal parallelization configurations. All independent network functions identified by dependency analysis are executed simultaneously, essentially selecting the fully parallel option from PPC’s configuration space. The NF instance assignment phase follows the same layer graph approach as PPC.
PPC + MADRL-P&R (Wang et al. 2022): This hybrid baseline combines PPC’s partial parallelism configuration phase with the MADRL-P&R algorithm for VNF placement. MADRL-P&R utilizes a multi-agent deep reinforcement learning framework to jointly optimize placement and routing.
Evaluation settings
We conduct experiments using the GEANT topology from the SNDlib (Orlowski et al. 2010) dataset, which represents the European academic network infrastructure. This real-world topology provides a realistic evaluation environment for testing the effectiveness of our parallel SFC deployment framework. The network node and link resources, including processing and transmission delays, are parameterized as follows: , , , . The number of VNF types and the SFC length are both set to 5. The VNFs used in our simulations are typical middleboxes (firewall, load balancer, NIDS, gateway, and monitor); for each SFC, the VNFs are selected randomly. Parallelizability relationships between VNF pairs are assigned following the parallelization/duplication criteria described in Sun et al. (2017b). These settings are consistent with those used in prior work (Wang et al. 2022; Lin et al. 2021).
For the reinforcement learning framework, the heterogeneous GNN feature extractors consist of three TransformerConv layers, and the embedding dimension is 128. Training uses the MAPPO algorithm with default hyperparameters from RLlib (Liang et al. 2018).
[See PDF for image]
Fig. 7
CDF of end-to-end delay for parallel SFC placement
Evaluation results
Parallel SFC placement effectiveness
In this evaluation, we provide pre-generated parallel SFC requests where the parallelization decisions have been made, and algorithms only decide the placement of parallel VNFs. In this scenario, the parallelization agent remains inactive as the parallelization structure is predetermined. Since parallelization decisions are not considered in this phase, we compare our approach against SEQ-PPC, FULL-PPC, and the hybrid PPC + MADRL-P&R algorithms to evaluate the placement effectiveness under different parallelization strategies.
Delay performance
We evaluate the placement effectiveness by deploying 200 parallel SFC requests across the substrate network. Figure 7 shows the cumulative distribution function (CDF) of end-to-end delay for different placement algorithms. The HGNN-PSFC algorithm exhibits superior performance compared to the SEQ-PPC and the hybrid PPC + MADRL-P&R baseline, while remaining competitive with the benchmark FULL-PPC method. Specifically, HGNN-PSFC achieves an average delay of 39.78, closely aligning with FULL-PPC’s 36.65, and outperforming the hybrid PPC + MADRL-P&R baseline which records a slightly higher average delay of 43.27. By contrast, SEQ-PPC shows substantially worse performance with an average delay of 53.75. Remarkably, approximately 75% of the delay values for HGNN-PSFC are distributed within the range of 0 to 50 units, compared to only about 44% for SEQ-PPC and roughly 67% for the hybrid PPC + MADRL-P&R baseline. These results underscore that HGNN-PSFC effectively manages delay metrics within the parallel SFC deployment context.
Figure 8 further illustrates the average delay comparison among the three algorithms. HGNN-PSFC achieves approximately 92% of FULL-PPC’s performance while outperforming the hybrid PPC + MADRL-P&R baseline and significantly surpassing SEQ-PPC, demonstrating the effectiveness of our learning-based approach. This near-optimal performance is particularly noteworthy considering that HGNN-PSFC operates without the exponential computational overhead associated with PPC’s exhaustive search strategy.
[See PDF for image]
Fig. 8
Average end-to-end delay comparison for parallel SFC placement
Joint parallelization and placement performance
This evaluation validates the complete HGNN-PSFC framework’s capability for joint optimization of both parallelization and placement decisions. In this scenario, SFC requests are provided in SFPG format, allowing the cooperative agents to make both parallelization structure decisions and subsequent placement decisions collaboratively. We compare against PPC, SEQ-PPC, FULL-PPC, and the hybrid PPC + MADRL-P&R algorithms to assess the effectiveness of the joint optimization approach.
Delay performance
Figure 9 shows the CDF of end-to-end delay across different algorithms. HGNN-PSFC achieves performance that approaches PPC’s optimal results while substantially outperforming both SEQ-PPC and FULL-PPC baseline methods. Specifically, HGNN-PSFC attains an average delay of 33.90, which represents approximately 91.8% of PPC’s performance (31.13), while delivering significant improvements over SEQ-PPC (53.75) and FULL-PPC (36.65). Compared with the hybrid PPC + MADRL-P&R (39.73), HGNN-PSFC also achieves a notably lower average delay. Although the hybrid baseline uses PPC to determine a near-optimal parallelization configuration, its subsequent placement stage (being trained or designed separately) lacks the simultaneous coordination with parallelization that HGNN-PSFC performs. Notably, approximately 70% of HGNN-PSFC’s delay values fall within the optimal range of 0 to 40 units, compared to 62% for FULL-PPC and only 23% for SEQ-PPC. This distribution pattern underscores the effectiveness of joint optimization in achieving consistently low delay across diverse network scenarios.
Figure 10 further illustrates the average delay comparison among the four algorithms. The results clearly demonstrate that HGNN-PSFC achieves near-optimal performance compared to PPC. The 8.2% performance gap relative to PPC represents an excellent trade-off considering the substantial reduction in computational complexity from exponential to polynomial time. Moreover, HGNN-PSFC consistently outperforms the hybrid PPC + MADRL-P&R baseline, indicating the advantage of end-to-end joint learning over approaches that decouple parallelization and placement.
Figure 11 shows that HGNN-PSFC achieves a level of parallelization that is close to that of the benchmark PPC, indicating that while the learned policy may occasionally select suboptimal parallelization structures, it maintains high overall efficiency. This slight deviation in parallelization optimality is consistent with the delay results discussed previously, where the performance gap can be attributed to suboptimalities in both placement and parallelization decisions.
[See PDF for image]
Fig. 9
CDF of end-to-end delay for joint optimization
[See PDF for image]
Fig. 10
Average end-to-end delay comparison for joint optimization
[See PDF for image]
Fig. 11
Degree of parallelization comparison
[See PDF for image]
Fig. 12
Acceptance rate under varying arrival rates
[See PDF for image]
Fig. 13
Average delay by topology
Acceptance rate under varying arrival rates
To evaluate scalability under different traffic loads, we measure the acceptance rate as the SFC request arrival rate increases. SFC requests are generated according to a Poisson process with arrival rates varied between 50 and 300 arrivals per 100 time units; the lifetime of each request is modeled as an independent exponential random variable with mean 1000 time units. Figure 12 reports results for the three algorithms considered in this work (PPC, HGNN-PSFC, and the hybrid PPC + MADRL-P&R). While PPC attains the highest acceptance rate, HGNN-PSFC closely approaches PPC’s performance and consistently outperforms the hybrid PPC + MADRL-P&R across the tested range of arrival rates. For example, at the upper end of the tested load (300 arrivals per 100 time units), HGNN-PSFC sustains an acceptance rate of 41%, noticeably below PPC (52%) but substantially higher than the hybrid PPC + MADRL-P&R (21.5%). These results indicate that HGNN-PSFC effectively balances parallelization and placement decisions under resource contention.
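The load model described above can be reproduced with a toy simulation, sketched below. This is an assumed setup, not the paper's simulator: Poisson arrivals, exponential lifetimes with mean 1000 time units, and a simple capacity cap standing in for the placement algorithm's admission decision.

```python
# Toy acceptance-rate simulation under Poisson arrivals (assumed setup).
# Requests arrive at `arrival_rate_per_100` per 100 time units, live for an
# exponential time with mean 1000, and are accepted while the substrate
# (abstracted here as a simple slot capacity) has room.

import heapq
import random

def acceptance_rate(arrival_rate_per_100, capacity, horizon=10_000, seed=0):
    rng = random.Random(seed)
    lam = arrival_rate_per_100 / 100.0       # arrivals per time unit
    t, active, accepted, total = 0.0, [], 0, 0
    while t < horizon:
        t += rng.expovariate(lam)            # next Poisson arrival
        while active and active[0] <= t:     # release expired requests
            heapq.heappop(active)
        total += 1
        if len(active) < capacity:           # "placement" succeeds if room
            accepted += 1
            heapq.heappush(active, t + rng.expovariate(1 / 1000))
    return accepted / total
```

As expected, raising the arrival rate while holding capacity fixed drives the acceptance rate down, which is the qualitative trend Fig. 12 reports for all three algorithms.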
Performance across different network topologies
We further verify the generality of HGNN-PSFC by evaluating it on the Abilene and ChinaNet (Knight et al. 2011) topologies in addition to GEANT. Figure 13 presents the average end-to-end delay across all three topologies. HGNN-PSFC achieves at least 90% of PPC’s performance in all cases, confirming that the learned policies are not overfitted to a specific network structure. These results highlight the adaptability of the proposed method to different topological scales and connectivity patterns.
The experimental results comprehensively validate that the proposed HGNN-PSFC framework effectively addresses both parallelization and placement decisions through joint optimization, while maintaining computational efficiency and adaptability to dynamic network environments.
Conclusion
This paper addresses the critical challenge of jointly optimizing parallelization and placement for parallel SFC deployment in NFV environments. Traditional methods that decouple these decisions often result in suboptimal performance due to the complex trade-offs between parallelization benefits and overhead costs.
We propose HGNN-PSFC, a novel framework that combines heterogeneous graph neural networks with multi-agent deep reinforcement learning to jointly optimize VNF parallelization and placement. The architecture introduces a Parallelization Agent for deciding VNF parallel structures and multiple Placement Agents for determining placements, coordinated via Centralized Training with Decentralized Execution. This design decomposes the joint optimization into manageable subtasks while maintaining coordination. HGNN-PSFC is the first RL-based solution for SFC parallelization, leveraging a new VNF parallelism graph to capture VNF dependencies and reduce action space complexity. A comprehensive heterogeneous graph representation further enables effective feature learning by modeling the interplay between VNFs, the substrate network, and current placements.
Extensive experiments show that HGNN-PSFC achieves around 92% of the performance of the optimal PPC algorithm while maintaining polynomial computational complexity, indicating its potential scalability and applicability to real-time decision-making in large problem instances. Future work will focus on incorporating additional objectives such as energy efficiency and security, and conducting in-depth evaluations under dynamic traffic conditions and fault scenarios to further assess robustness for real-world deployment.
Author Contributions
Yintan Ai: Conceptualization, Methodology, Software, Writing - original draft. Hua Li: Supervision, Project administration, Funding acquisition. Hongwei Ruan: Validation, Writing - review & editing. Hanlin Liu: Data curation, Investigation.
Funding
This work was supported by the National Natural Science Foundation of China (No. 61862047, No. 62262047), the Inner Mongolia Science and Technology Program (No.201802028, No.2020GG0186), the fund of Supporting the Reform and Development of Local Universities (Disciplinary Construction) and the special research project of First-class Discipline of Inner Mongolia A. R. of China (YLXKZX-ND-036) and the Inner Mongolia Key Research and Development and Achievement Transformation Project (2025YFHH0101).
Data Availability
Data will be made available on reasonable request.
Declarations
Competing interests
The authors declare that they have no competing interests.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
Bao, W; Yuan, D; Zhou, BB et al. Prune and plant: Efficient placement and parallelism of virtual network functions. IEEE Trans Comput; 2020; 69,
Cai, J; Huang, Z; Luo, J et al. Composing and deploying parallelized service function chains. J Netw Comput Appl; 2020; 163, 102637. [DOI: https://dx.doi.org/10.1016/j.jnca.2020.102637]
Cai, J; Huang, Z; Liao, L et al. Appm: adaptive parallel processing mechanism for service function chains. IEEE Trans Netw Serv Manage; 2021; 18,
Chintapalli, VR; Killi, BR; Partani, R et al. Energy-and reliability-aware provisioning of parallelized service function chains with delay guarantees. IEEE Trans Green Commun Network; 2023; 8,
Herrera, JG; Botero, JF. Resource allocation in nfv: A comprehensive survey. IEEE Trans Netw Serv Manage; 2016; 13,
Huang, Z; Li, D; Wu, C et al. Reinforcement learning-based delay-aware path exploration of parallelized service function chains. Math; 2022; 10,
Huang H, Tian J, Min G, et al (2024) Parallel placement of virtualized network functions via federated deep reinforcement learning. IEEE/ACM Trans Network
Jang, S; Ko, N; Kyung, Y et al. Network function parallelism configuration with segment routing over ipv6 based on deep reinforcement learning. ETRI J; 2025; 47,
Knight, S; Nguyen, HX; Falkner, N et al. The internet topology zoo. IEEE J Sel Areas Commun; 2011; 29,
Kreutz, D; Ramos, FM; Verissimo, PE et al. Software-defined networking: a comprehensive survey. Proc IEEE; 2014; 103,
Liang E, Liaw R, Nishihara R, et al (2018) Rllib: Abstractions for distributed reinforcement learning. In: International conference on machine learning. PMLR, pp 3053–3062
Lin, KCJ; Chou, PL. Vnf embedding and assignment for network function parallelism. IEEE Trans Netw Serv Manage; 2021; 19,
Lin, IC; Yeh, YH; Lin, KCJ. Toward optimal partial parallelization for service function chaining. IEEE/ACM Trans Network; 2021; 29,
Lin, X; Guo, D; Shen, Y et al. Sft-box: An online approach for minimizing the embedding cost of multiple hybrid sfcs. IEEE/ACM Trans Network; 2022; 31,
Lin X, Guo D, Shen Y, et al (2018) Dag-sfc: Minimize the embedding cost of sfc with parallel vnfs. In: Proceedings of the 47th international conference on parallel processing, pp 1–10
Li B, Tan K, Luo L, et al (2016) Clicknp: Highly flexible and high performance network processing with reconfigurable hardware. In: Proceedings of the 2016 ACM SIGCOMM Conference, pp 1–14
Lowe R, Wu YI, Tamar A, et al (2017) Multi-agent actor-critic for mixed cooperative-competitive environments. Adv Neural Inf Process Syst 30
Lu, D; Long, S. Enhancing network function parallelism in mobile edge computing using deep reinforcement learning. ICT Express; 2025; 11
Luo, J; Li, J; Jiao, L et al. On the effective parallelization and near-optimal deployment of service function chains. IEEE Trans Parallel Distrib Syst; 2020; 32
Mijumbi, R; Serrat, J; Gorricho, JL et al. Network function virtualization: state-of-the-art and research challenges. IEEE Commun Surveys Tutor; 2015; 18
Oliehoek, FA; Amato, C. A concise introduction to decentralized POMDPs; 2016; Springer. [DOI: https://dx.doi.org/10.1007/978-3-319-28929-8]
Orlowski S, Wessäly R, Pióro M, et al (2010) SNDlib 1.0—survivable network design library. Networks 55(3):276–286
Schulman J, Moritz P, Levine S, et al (2015) High-dimensional continuous control using generalized advantage estimation. arXiv:1506.02438
Schulman J, Wolski F, Dhariwal P, et al (2017) Proximal policy optimization algorithms. arXiv:1707.06347
Shi Y, Huang Z, Feng S, et al (2020) Masked label prediction: Unified message passing model for semi-supervised classification. arXiv:2009.03509
Sun, C; Bi, J; Zheng, Z et al. HYPER: a hybrid high-performance framework for network function virtualization. IEEE J Sel Areas Commun; 2017; 35
Sun C, Bi J, Zheng Z, et al (2017b) NFP: enabling network function parallelism in NFV. In: Proceedings of the conference of the ACM special interest group on data communication, pp 43–56
Troia, S; Savi, M; Nava, G et al. Performance characterization and profiling of chained cpu-bound virtual network functions. Comput Netw; 2023; 231, 109815. [DOI: https://dx.doi.org/10.1016/j.comnet.2023.109815]
Wang, S; Yuen, C; Ni, W et al. Multiagent deep reinforcement learning for cost- and delay-sensitive virtual network function placement and routing. IEEE Trans Commun; 2022; 70
Wang R, Luo J, Dong F, et al (2019) ParaNF: enabling delay-balanced network function parallelism in NFV. In: 2019 IEEE 23rd international conference on computer supported cooperative work in design (CSCWD). IEEE, pp 392–397
Xie, S; Ma, J; Zhao, J. FlexChain: bridging parallelism and placement for service function chains. IEEE Trans Netw Serv Manage; 2020; 18
Xu, H; Fan, G; Sun, L et al. Dynamic SFC placement scheme with parallelized SFCs and reuse of initialized VNFs: an A3C-based DRL approach. J King Saud Univ-Comput Inf Sci; 2023; 35
Yi, B; Wang, X; Li, K et al. A comprehensive survey of network function virtualization. Comput Netw; 2018; 133, pp. 212-262. [DOI: https://dx.doi.org/10.1016/j.comnet.2018.01.021]
Yu, C; Velu, A; Vinitsky, E et al. The surprising effectiveness of PPO in cooperative multi-agent games. Adv Neural Inf Process Syst; 2022; 35, pp. 24611-24624.
Zhang C, Sato T, Oki E (2023) Service deployment for parallelized function chains considering traffic-dependent delay. IEEE Trans Netw Serv Manage 21(2):2266–2286
Zhang, K; Zhou, Y; Zhang, S et al. Towards deploying SFC with parallelized VNFs under resource demand uncertainty in mobile edge computing. J King Saud Univ-Comput Inf Sci; 2023; 35
Zhang Y, Anwer B, Gopalakrishnan V, et al (2017) ParaBox: exploiting parallelism for virtual network functions in service chaining. In: Proceedings of the Symposium on SDN Research, pp 143–149
Zhang D, Wang L, Rezaeipanah A (2024) Towards deploying parallelized service function chains under dynamic resource request in multi-access edge computing. IEEE Trans Netw Serv Manage
Zhang Y, Zhang ZL, Han B (2019) HybridSFC: accelerating service function chains with parallelism. In: 2019 IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN). IEEE, pp 1–7
Zheng, D; Shen, G; Cao, X et al. Towards optimal parallelism-aware service chaining and embedding. IEEE Trans Netw Serv Manage; 2022; 19
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”).