Network interconnection critically impacts the performance of data centers (DCs) and high-performance computing (HPC) systems, with scalability becoming vital as computing demands grow. This necessitates interconnection architectures that meet stringent latency, bandwidth, cost, and power consumption requirements. Optical interconnections provide cost efficiency, reduced power consumption, and the scalability to fulfill bandwidth needs. However, optical switches lack optical buffers, complicating the operation of all-optical networks. To this end, we propose HiveNet, a novel hybrid interconnect architecture based on dual-port nodes and arrayed waveguide grating routers (AWGRs). HiveNet integrates low-radix electrical switches at the lower layers to reduce cable complexity and construction costs, while AWGR-based optical connections at the upper layers ensure fast switching and high bandwidth. The dual-port capability enables robust fault tolerance and supports all communication types (node-switch and node-node). Furthermore, a customized routing algorithm significantly enhances performance. Simulations conducted under various traffic patterns demonstrate that HiveNet achieves controlled delay and superior aggregate throughput. For large-scale networks (104,976 nodes at 10 Gb/s), HiveNet's construction cost is only about 49.3%, 26.4%, 32.7%, 54.1%, and 59.3% of that of Fat-Tree, H-LION, Leaf-Spine, BCube, and Lotus, respectively. Additionally, HiveNet decreases power consumption by 34.8%, 48.2%, 29.8%, and 23.1% compared to Fat-Tree, BCube, Leaf-Spine, and Lotus, respectively.
[Fig. 1: The servers in HiveNet are endowed with dual ports]
Introduction
Data centers (DCs) and High-Performance Computing (HPC) systems are experiencing exponential traffic growth due to the proliferation of cloud applications, the Internet of Things (IoT), and the increasing scale of big data analytics (Hou et al. 2025). As a result, modern interconnection architectures must meet stringent demands regarding latency, scalability, bandwidth, power consumption, and infrastructure cost (Hintemann and Hinterholzer 2019; Ullah et al. 2025; Bari et al. 2012; Al-Makhlafi et al. 2020).
Traditional multi-layer electrical interconnection architectures, though widely used, face significant scaling challenges. When the number of interconnected nodes exceeds 10,000 and data rates surpass 10 Gb/s, these architectures suffer from increased latency and require a massive number of optical-electrical-optical (O/E/O) converters, which significantly raise both operational cost and energy consumption. Mitigating latency in hierarchical electrical designs often requires expensive high-radix switches; however, such switches are increasingly limited by the density constraints of Ball Grid Array (BGA) packaging and I/O capacity in ASICs (Cisco Systems, Inc. 2024; Yan et al. 2018). Consequently, even with advanced electrical designs, scalability and energy efficiency become bottlenecks at large scale (Ghiasi 2015).
On the other hand, all-optical interconnects offer a promising alternative due to their inherent advantages, such as high bandwidth, low latency, minimal interference, and the potential for non-blocking switching through wavelength-division multiplexing. Technologies like Arrayed Waveguide Grating Routers (AWGRs) enable contention-free, all-to-all communication with low configuration time and passive operation. When combined with fast Tunable Wavelength Converters (TWCs), AWGRs can efficiently manage large volumes of data with minimal delay (Yin et al. 2012).
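To make the all-to-all property concrete, the sketch below models the cyclic wavelength routing commonly used to describe an AWGR; the specific port/wavelength convention shown here is an illustrative assumption rather than a property taken from the cited works.

```python
def awgr_routing_table(m: int) -> dict:
    """Cyclic wavelength-routing model of an m x m AWGR (illustrative
    convention: input port i reaches output port (i + w) mod m on
    wavelength index w)."""
    return {(i, w): (i + w) % m for i in range(m) for w in range(m)}

# Every input port reaches every output port on a distinct wavelength,
# which is why an m x m AWGR supports contention-free all-to-all
# communication with only m wavelengths and no active switching element.
table = awgr_routing_table(4)
assert {table[(0, w)] for w in range(4)} == {0, 1, 2, 3}
```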
However, purely optical architectures are not without drawbacks. One of the most critical challenges is the absence of optical buffering, which makes it difficult to manage packet-level granularity and dynamic traffic patterns (Christodoulopoulos et al. 2015). Furthermore, implementing complex traffic control mechanisms and fine-grained scheduling in the optical domain is still technologically restrictive and expensive.
To address the complementary limitations of purely electrical and purely optical architectures, this paper proposes a hybrid interconnection design, HiveNet, which combines low-radix electrical switches in the lower layers with AWGR-based optical switching in the upper layers. In HiveNet, AWGRs are utilized at the core to handle aggregated inter-pod traffic, taking advantage of their high port count, passive operation, and contention-free all-to-all communication capabilities. In the lower layers, electrical pods composed of low-radix switches and dual-port programmable computing nodes manage intra-pod communication and provide local buffering to absorb burst traffic. These programmable nodes play a critical role in routing, computing, and flow shaping, thereby reducing the burden on the electrical switches and enabling the network to scale effectively without increasing port requirements at the pod level. Additionally, the architecture supports all connection types (node-to-node, node-to-switch, and switch-to-switch), as shown in Fig. 1, offering flexibility and scalability that are essential for large-scale, high-performance computing environments.
This design effectively avoids the need for costly optical buffering at the packet level by confining optical switching to the upper layer and ensuring that only aggregated traffic enters the optical domain. HiveNet thus achieves an optimal balance between performance, cost, and scalability, addressing the inherent limitations of purely electrical or optical solutions through architectural co-design.
Furthermore, recent advancements, such as Google's deployment of MEMS-based Optical Circuit Switches (OCS), highlight the growing role of optical technologies in data centers (Poutievski et al. 2022; Porter et al. 2013; Farrington et al. 2013; Peng et al. 2014; Fiorani et al. 2014). However, MEMS-based OCS still suffer from millisecond-scale reconfiguration delays, which limits their applicability to bursty, dynamic traffic (Kamei et al. 2003; Zheng et al. 2025; Ballani et al. 2020; Grani et al. 2017). By contrast, the use of AWGRs with nanosecond-level tuning provides far more responsive switching, positioning them as a forward-looking solution for the next generation of high-throughput, low-latency DCN and HPC systems.
The main contributions of this paper can be summarized in the following manner:
A new, scalable, cost-effective, and low-latency architecture named HiveNet is proposed for HPC or DCN systems. The proposed architecture incorporates hybrid optical/electrical switches and dual-port nodes. The inclusion of electrical switches in the lower layers allows for pod establishment, enabling incremental network scalability while preserving all of its topological characteristics. The AWGRs used in the upper layer of the network are in charge of inter-pod connections, decreasing switching latency. Moreover, the dual-centric design combines the benefits of switch-centric and server-centric architectures, offering cost efficiency, fault tolerance, and flexibility while maintaining low power consumption.
A comprehensive theoretical analysis is conducted to examine and compare the topological attributes of the HiveNet architecture with five distinct architectures: Fat-Tree, Leaf-Spine, H-LION, BCube, and Lotus. This analysis encompasses crucial aspects such as scalability, bisection bandwidth, network diameter, and the total number of switches and links, offering significant insight into the comparative efficiency and features of these architectures. Additionally, the study evaluates the costs and power consumption of these architectures.
To leverage path diversity and enhance the network’s performance, an optimal and shortest-path routing algorithm is presented. The suggested routing scheme utilizes balanced construction to ease load balancing within the network.
To enable a comprehensive evaluation, multiple network architectures, including HiveNet, Fat-Tree, BCube, H-LION, and Lotus, are constructed and simulated to assess key performance metrics such as network throughput and latency, as well as to analyze the effects of switch and link failures on the reliability of network paths.
Related work
The innovations in DC and HPC systems dominate today's electronic infrastructure landscape. In scientific computing, HPC systems enable simulations, visualization, and extreme-scale data analysis. Both DC and HPC systems contain thousands of tightly interconnected servers or computing nodes. In large-scale computing systems, the interconnection network has become more significant than CPU efficiency, as execution time often depends more on communication latency than on computation time (Kumar et al. 1994). Several interconnection architectures have been proposed to address these challenges in DC and HPC environments, such as Al-Makhlafi et al. (2022), Al-Fares et al. (2008), Chen et al. (2015), Kitayama et al. (2015), Liao et al. (2015), Guo et al. (2009), Proietti et al. (2015), Xia et al. (2017), Moeen et al. (2024), and Proietti et al. (2015), each offering unique scalability, latency, and fault tolerance trade-offs. We will briefly discuss some of the most common DC and HPC architectures.
Fat-tree
Fat-Tree (Al-Fares et al. 2008) is a multi-tier, switch-centric architecture in which the switches are responsible for the routing task. Fat-Tree is built from three layers of switches: the core, the aggregation, and the edge layers. It is considered the most common interconnection architecture owing to its abundance of links and the multiple routes available between nodes. The TH-2 interconnection system is an instance of employing a Fat-Tree network (Liao et al. 2015) to realize low network latency and wide bandwidth. Although Fat-Tree can provide full bisection bandwidth, the number of core switch ports limits its scalability: with 64-port electrical switches, it does not scale beyond roughly 100,000 nodes. Moreover, the more layers in the network, the higher the network cost, energy consumption, and latency.
BCube
BCube (Guo et al. 2009) is a server-centric architecture in which the servers are equipped with multiple Network Interface Card (NIC) ports. In turn, the servers are in charge of the forwarding and computing tasks, while the switches act as dummy crossbars. BCube is constructed of several levels; lower levels are used to build higher levels in a recursive structure. The building block of the architecture is created from an n-port switch and n servers, and higher levels are then constructed recursively from these blocks and additional n-port switches. BCube achieves high network throughput and bisection bandwidth. However, it bears a high cost and energy consumption, and its scalability is insufficient, limiting its potential for expansion and adaptability.
Lotus
The Lotus network (Proietti et al. 2015) is a topology that combines electrical switches with AWGRs. It is built using a hierarchical multilevel structure consisting of direct switches, indirect switches, and AWGRs. A complete bipartite graph topology is used within each group, with direct switches connected to computing nodes and indirect switches linking multiple groups. The Lotus network comprises numerous groups, each containing computing nodes and 2n switches. These switches are connected within each group through a complete bipartite graph, providing high bisection bandwidth and scalability. Communication between adjacent groups is enhanced by AWGRs, which offer path diversity and improved reliability. However, the Lotus network also has some limitations. The primary challenge lies in its complexity due to integrating electrical and optical components, which can increase the cost and management overhead. Additionally, the lack of a unified architecture for all types of connections limits its scalability compared to more streamlined solutions.
H-LION
An understanding of the topological interconnection is essential for achieving scalable interconnection architectures. The utilization of wavelength-routing protocols within AWGRs is a key aspect of Hierarchical Lightwave Optical Interconnect Networks (H-LION) (Proietti et al. 2015). Moreover, H-LION leverages nodes/servers with embedded switches to achieve low latency and high throughput. By leveraging passive AWGRs, the interconnect topology within the racks and clusters is designed to be hierarchical and to support all-to-all connectivity. Exploiting a distributed and flat Thin-CLOS topology (Proietti et al. 2013) assists H-LION in handling communication among the clusters. The network achieves outstanding performance; however, its physical cost is considerable.
[Fig. 2: The construction of the HiveNet architecture based on AWGRs and dual-port nodes, where m = 4]
Concisely, switch-centric architectures are characterized by their ability to switch packets at extremely high speeds, utilizing switches that are fast but relatively less programmable and often costly in large-scale systems. Conversely, server-centric architectures have a low construction cost due to the deployment of low-end switches. However, these architectures assign the forwarding and computing tasks to their servers/computing nodes, which burdens the servers with relaying packets. Furthermore, the processing delay in nodes is higher than that of switches, resulting in more critical delays in these architectures. HiveNet addresses these limitations of previous works and leverages the advantages that a tree-based construction can offer, deploying and exploiting communication among nodes for both computing and routing purposes. The proposed architecture is a dual-centric architecture that assigns both switches and nodes to perform computing and forwarding tasks. This arrangement harnesses the impressive switching capabilities of switches together with the significant programming abilities of servers. As a result, all connection types can be made (node-node, node-switch, and switch-switch).
Furthermore, the AWGR is exploited in HiveNet to provide a large number of switch ports and to achieve fast switching times by employing fast-tunable lasers; the AWGRs carry the communication among the pods within the system. The HiveNet network allows the gradual addition of servers/computing nodes by adding pods to the network while preserving the system's entire topological features and the connections within each pod. In other words, when HiveNet expands, it exclusively adds more ports to its AWGRs while keeping the number of electrical switch ports unchanged. In contrast, in other architectures, such as Fat-Tree, increasing the network's size results in growth in both the number of switches and switch ports. Thus, the total number of switch ports grows only slightly compared to hierarchical architectures.
While AWGRs offer low power consumption, high port counts, and passive operation, challenges remain in fabricating high-performance AWGRs at large scales. However, advancements in photonic integration are making this more feasible. HiveNet’s hybrid architecture, with electrical switches for intra-pod communication and AWGRs for inter-pod communication, ensures compatibility with current standards and allows for incremental deployment without overhauling existing infrastructures. Similar to MEMS-based OCS deployments at Google, HiveNet could complement optical interconnects in data centers. However, its large-scale adoption will depend on cost reductions and ongoing standardization efforts in optical switching. We believe that HiveNet’s scalable and flexible design positions it as a viable solution for future data center networks.
Interconnection network
First, we will outline the physical architecture, which utilizes a dual-centric approach for connecting nodes and switches. Next, we will explain the connection rules for both node-switch and node-node interactions. Finally, we will highlight the key properties of the HiveNet network.
Table 1. Description table of some important symbols

| Symbol | Description |
|---|---|
| N | Total number of the nodes |
| m | Number of ports at an AWGR |
| k | Degree of the local and intermediary switches |
| | Degree of the AWGR switches |
| p | ID of the pods |
| c | ID of the clusters |
| l | ID of the local switches |
| n | ID of the nodes |
HiveNet physical architecture
The proposed hybrid structure is symmetrical, scalable, low-latency, and has a wide bandwidth. It is based on dual-centric interconnection, which incorporates both server-centric and switch-centric components, as illustrated in Fig. 2. This network consists of three hierarchical layers. The lower layer utilizes dual-port nodes to create the server-centric interconnection. The second and third layers employ hybrid (electronic/optical) and optical switches to establish the switch-centric interconnection. The symbols used in the HiveNet network and their descriptions are detailed in Table 1.
Definition 1
The HiveNet network is divided into m pods; each pod comprises m clusters. Each cluster is constructed of m racks and contains m² nodes, where m indicates the number of ports of an AWGR. To preserve the symmetrical network structure, m must be an even number not less than 4, i.e., 4, 6, 8, etc.
A pod includes two types of switches: local switches (LS) and intermediary switches (IS). Electrical links connect the servers to each other and to the local switches, and connect the local switches to the intermediary switches. To construct the network and enable communication among all the pods, optical cables are utilized to connect the intermediary switches across multiple pods through AWGRs (Grani et al. 2017). As a passive optical switch, the AWGR facilitates fast and non-blocking transmission between any pair of ports within the device.
Each cluster contains m² nodes, and each node is identified by a 4-coordinate (p, c, l, n). To achieve the server-centric interconnection within the cluster, the nodes are responsible for forwarding packets, and each node is endowed with two ports. The first port is linked to its local switch, while the other port is linked to another node in the same cluster according to (1) and (2).
Equation (1) specifies, for every pair of nodes belonging to the same cluster c, when the two nodes are directly linked through their second ports, and Equation (2) covers the special case l = n. In addition, every cluster has m local switches, each defined by a 3-coordinate (p, c, l). The degree of a local switch is m + 1, and its port identifiers range from 0 to m: ports 0 to m − 1 are connected to the m nodes, while port m is linked to an intermediary switch in the same pod via an electrical link.
Each pod contains m clusters and m hybrid optical-electronic packet switches (intermediary switches), each equipped with a pair of optical transceivers. These switches are identified by a 2-coordinate (p, l). The intermediary switches are responsible for the communication among all the clusters in the same pod. Each intermediary switch is equipped with m + 1 ports, numbered from 0 to m: ports 0 to m − 1 are connected to m local switches in different clusters through electrical links, while port m is linked to an AWGR in the upper layer via an optical link. Thus, the degree of both intermediary and local switches is m + 1.
Each pod contains electrical links that connect the local switches to the nodes and the nodes to each other within the same cluster, as well as electrical links that interconnect the local switches and the intermediary switches in the same pod. Besides that, each pod is equipped with m optical links, which connect its intermediary switches to the AWGRs to achieve connectivity among all the pods. The optical links provide higher bandwidth, facilitating faster transmission of the nodes' aggregated traffic. The network interconnection rules of both the server-centric and switch-centric parts are shown in Algorithm 1.
[Algorithm 1: The procedures of construction and interconnection in the HiveNet network]
In the network, there are m AWGRs, the same as the number of pods. Each AWGR is endowed with m ports, i.e., there are m² AWGR ports in the network. Using optical links, these ports are connected directly to the intermediary switches at the lower layer in different pods based on the connection rules shown in Algorithm 1. Therefore, data can be transferred between pods with just one hop. This communication pattern enhances network reliability and minimizes the connection overhead among the pods.
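A minimal sketch of the wiring just described, written in Python. The node-to-node pairing rule, the mapping of each local switch to an intermediary switch, and the AWGR port assignment are illustrative assumptions standing in for Equations (1)-(2) and Algorithm 1, which are not reproduced here; the aggregate counts, however, match those reported for the 1,296-node (m = 6) configuration in Table 4.

```python
from itertools import product

def build_hivenet(m: int):
    """Sketch of HiveNet wiring.  Nodes are (p, c, l, n); local switches
    ("LS", p, c, l); intermediary switches ("IS", p, i); AWGRs indexed 0..m-1."""
    elec_links, opt_links = [], []
    for p, c, l, n in product(range(m), repeat=4):
        elec_links.append((("node", p, c, l, n), ("LS", p, c, l)))  # first node port
        if l < n:  # assumed node-to-node pairing for the second port
            elec_links.append((("node", p, c, l, n), ("node", p, c, n, l)))
    for p, c, l in product(range(m), repeat=3):
        elec_links.append((("LS", p, c, l), ("IS", p, l)))          # assumed LS-to-IS mapping
    for p, i in product(range(m), repeat=2):
        opt_links.append((("IS", p, i), ("AWGR", i)))               # assumed AWGR assignment
    return elec_links, opt_links

elec, opt = build_hivenet(6)
# m = 6: 1,296 node uplinks + 540 node-node links + 216 LS-IS links + 36 optical links
assert len(elec) + len(opt) == 2088
```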
Definition 1 provides the basis for deriving Theorem 1.
Theorem 1
The numbers of electrical switches, AWGRs, nodes, and links in HiveNet can be expressed deterministically as m³ + m², m, m⁴, and (3m⁴ + m³ + 2m²)/2, respectively.
HiveNet can accommodate staged expansion by provisioning AWGRs with a higher port count than initially required to support incremental scalability in practical deployments. In this approach, some AWGR ports remain unused during early deployment phases but are reserved for future pod additions. This strategy allows the network to grow without structural redesign or major reconfiguration, while preserving the architecture’s symmetrical topology and the contention-free communication enabled by AWGRs. By planning for anticipated growth through overprovisioned AWGRs, HiveNet maintains both its scalability and performance advantages over time.
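As a quick sanity check, the expressions in Theorem 1 can be evaluated against the configurations later listed in Table 4; the helper below simply restates those formulas.

```python
def hivenet_counts(m: int):
    """Component counts given by Theorem 1."""
    nodes = m ** 4
    electrical_switches = m ** 3 + m ** 2   # local + intermediary switches
    awgrs = m
    links = (3 * m ** 4 + m ** 3 + 2 * m ** 2) // 2
    return nodes, electrical_switches, awgrs, links

# Cross-check against the 1,000-, 10,000- and 100,000-node rows of Table 4.
for m, expected_links in [(6, 2088), (10, 15600), (18, 160704)]:
    nodes, switches, awgrs, links = hivenet_counts(m)
    assert (nodes, awgrs, links) == (m ** 4, m, expected_links)
```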
Shortest distance of internode
Computations of path lengths vary based on the type of architecture (switch-centric or server-centric). The path length in switch-centric structures is computed as the number of links along the route (Li et al. 2016; Liu et al. 2013). However, the path length in server-centric architectures is calculated as the number of nodes (excluding the source and destination nodes) along the path (Guo et al. 2009, 2008; Li et al. 2009). Since the HiveNet network is based on a dual-centric interconnection, the computation of path lengths combines the two conventions: server-centric routing and switch-centric routing. Therefore, for any pair of nodes a and b in HiveNet, Theorem 2 can be drawn as follows, where d(a, b) denotes the length of the shortest path from a to b.
Theorem 2
For any pair of nodes a and b in HiveNet, the shortest distance d(a, b) takes one of the following values, according to the cases detailed in the proof: 1 or 2 when a and b belong to the same pod and cluster (Cases 1.1-1.3); 4, 5, or 6 when they belong to the same pod but to different clusters (Cases 2.1-2.3); and 2 more than the corresponding intra-pod value when they belong to different pods (Case 3).
Proof
Per the connection rules of HiveNet, routes can be established between any pair of nodes a and b, and the distance between them can be proven for all possible cases.

Case 1: a and b belong to the same pod and cluster.

Case 1.1: a and b share the same local switch. A direct path can be constructed through that local switch; as a result, d(a, b) = 1.

Case 1.2: a and b are attached to different local switches, but b can be reached through the second port of a following the node-to-node connection rules; hence d(a, b) = 1.

Case 1.3: a and b are in the same cluster, but neither of the previous subcases applies. A route of length 1 is first constructed through the local switch of a, and the shortest path of Case 1.2 is then applied to reach b. Thus, d(a, b) = 2.

Case 2: a and b belong to the same pod but to different clusters.

Case 2.1: First, a route of length 1 is created from a to its local switch. Then, a path of length 2 is established through an intermediary switch to a local switch in the cluster of b. Finally, a route of length 1 reaches b. Therefore, d(a, b) = 4.

Case 2.2: A route of length 1 is established through the second port of a, after which the shortest path of Case 2.1 is applied to reach b. Hence, d(a, b) = 1 + 4 = 5.

Case 2.3: To acquire the shortest path, a route of length 1 is first constructed through the local switch of a, and the shortest path of Case 2.2 is then used to reach b. Consequently, d(a, b) = 1 + 5 = 6.

Case 3: a and b belong to different pods. In the network, the route between any two pods traverses 2 hops at the upper layer, passing through an AWGR. Therefore, to compute the path length from a to b under the cases above, 2 is added to each case.
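The coarse classification behind Theorem 2 can be summarized in a few lines of Python; only the worst-case value of each traffic class is modelled here, and the subcases that exploit a direct node-to-node link (distances 1, 4, and 5) are deliberately omitted.

```python
def distance_upper_bound(a, b):
    """Worst-case shortest-path distance between nodes a = (p, c, l, n)
    and b, following the case structure of Theorem 2."""
    pa, ca, la, _ = a
    pb, cb, lb, _ = b
    if pa == pb and ca == cb:
        return 1 if la == lb else 2   # Case 1: intra-cluster
    if pa == pb:
        return 6                      # Case 2: intra-pod, inter-cluster
    return 8                          # Case 3: inter-pod (2 extra hops via the AWGR layer)

assert distance_upper_bound((0, 0, 0, 1), (0, 0, 0, 2)) == 1   # same local switch
assert distance_upper_bound((0, 0, 0, 1), (1, 2, 3, 0)) == 8   # different pods
```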
Network diameter
The diameter is considered one of the key metrics that characterize the structure and efficiency of a network. The diameter is the maximum shortest-path distance over all pairs of nodes, providing valuable insight into the communication delay within the system. Networks with smaller diameters generally exhibit better connectivity, shorter communication paths, and more efficient information flow. Therefore, the diameter serves as a significant indicator of the topological properties, performance, and overall effectiveness of interconnection architectures (Li et al. 2016).
Theorem 3
HiveNet has a diameter of 2 for intra-clusters, 6 for intra-pods, and 8 for inter-pods.
Proof
Theorem 2 examines the shortest internode distance in three cases, the first two of which are divided into six subcases. Therefore, the maximum distances between any node pair for intra-cluster, intra-pod, and inter-pod communication are 2, 6, and 8, respectively. Whatever the size of the network, the HiveNet diameter remains constant and is bounded by 8.
Bisection width
Another essential property of the HiveNet architecture is the bisection width. The bisection width (BW) is the minimum number of links that must be removed to split the whole network into two subnetworks of identical size. Therefore, BW reflects the capacity and the resilience of the network: a wider BW offers greater network capacity and indicates significant resistance against faults.
Theorem 4
The bisection width of HiveNet can be expressed as BW = N/(4m), where N is the total number of nodes in the network and m is the number of ports of an AWGR.
Proof
Since HiveNet comprises multiple pods, we can calculate the bisection width (BW) by dividing the network into two halves. Based on the structure of HiveNet, the network consists of m pods interconnected through optical links, and each pair of pods is connected by m connections. We assume that the pods are split evenly between the two halves, with m/2 pods in each half. Each pod in the first half is then connected to the m/2 pods in the second half through m · (m/2) links. The calculation of BW can therefore be expressed as:

BW = (m/2) · (m/2) · m = m³/4 = N/(4m).   (3)
Routing in HiveNet
The HiveNet Routing (HNR) algorithm, as illustrated in Algorithm 2, employs a single-path routing approach that optimizes the network's routing efficiency. HNR utilizes the dual-centric architecture of HiveNet to ensure effective load balancing across the network. Consider a pair of nodes (source and destination) performing the communication; they are addressed by the 4-coordinates src and dest, which identify the pod, cluster, local switch, and node, respectively.
[Algorithm 2: HiveNet Routing (HNR)]
As HiveNet is a dual-centric, interconnection-based network, HNR is divided into two key procedures: Server-Centric Routing (for intra-pod communication) and Switch-Centric Routing (for inter-pod communication via AWGRs). In general, when the packet is generated within the source node src, it will check the packet header (destination). The traffic in HNR is divided into three cases: (a) intra-cluster traffic, (b) intra-pod (inter-cluster) traffic, and (c) inter-pod traffic.
Case 1: Intra-cluster (server-centric) traffic occurs when src and dest share the same pod and cluster. This traffic can be divided into three subcases.

Case 1.1: src and dest share a local switch. Hence, to enable packet transmission, a route is created from the first port of src to its local switch; applying the local switch routine then transmits the packet to dest.

Case 1.2: In this configuration, src and dest are connected to different local switches, and the node ID of src equals the local switch ID of dest. Based on these conditions, a route is constructed directly from the second port of src to transmit the packet.

Case 1.2.1: Here, the local switch ID of src equals the node ID of dest; the node reached through the second port of src is therefore dest itself, which means the packet has reached its destination.

Case 1.2.2: Otherwise, the routine of Case 1.1 is used to construct a route from the intermediate node to dest for packet transmission.

Case 1.3: In this case, src and dest are located in the same cluster but have no direct connection. Using the first port of src, a path is established from src through its local switch to an intermediate node, which forwards the packet. The procedure of Case 1.2 is then applied to deliver the packet to dest.
Case 2: Intra-pod (inter-cluster) traffic occurs when src and dest share the same pod but are located in different clusters. In this case, the packet is forwarded from src to its local switch, then to an intermediary switch of the pod, and from there to a local switch in the cluster of dest; the procedure of Case 1 then completes the delivery within that cluster.

Case 3: Inter-pod (switch-centric) traffic arises when the src and dest nodes are in different pods. In this case, to enable packet forwarding, a route is found from src to dest as follows. First, using the first port of src, a path is established from src to its local switch. This local switch then forwards the packet to an intermediary switch in the same pod. Next, the packet passes from the intermediary switch of the src pod, through an AWGR, to an intermediary switch of the dest pod. After that, the intermediary switch in the dest pod sends the packet to a local switch within the dest cluster. Finally, the procedure of Case 1 creates a route that delivers the packet to dest.
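The stage sequence traversed in each of the three traffic classes can be sketched as follows; device identities and the intra-cluster subcases (which may add node relays at either end) are simplified away, so this is an illustration of the case structure rather than Algorithm 2 itself.

```python
def hnr_stages(src, dest):
    """Return the sequence of device types a packet crosses under HNR,
    given src and dest coordinates (p, c, l, n)."""
    sp, sc, sl, _ = src
    dp, dc, dl, _ = dest
    if (sp, sc) == (dp, dc):                       # Case 1: intra-cluster
        if sl == dl:                               # Case 1.1: shared local switch
            return ["src", "local switch", "dest"]
        return ["src", "node relay (2nd port)", "local switch", "dest"]
    if sp == dp:                                   # Case 2: intra-pod, inter-cluster
        return ["src", "local switch", "intermediary switch",
                "local switch", "dest"]
    return ["src", "local switch", "intermediary switch",   # Case 3: inter-pod
            "AWGR", "intermediary switch", "local switch", "dest"]

print(hnr_stages((0, 0, 0, 1), (2, 3, 1, 0)))
# ['src', 'local switch', 'intermediary switch', 'AWGR',
#  'intermediary switch', 'local switch', 'dest']
```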
The above analysis reveals that the routing scheme, in both its server-centric and switch-centric parts, only requires checking which of a fixed set of conditions the source and destination addresses satisfy in order to derive the next-hop address. Therefore, the proposed routing algorithm has a constant time complexity with respect to the size of the network. Since the algorithm operates independently of the current network conditions, data packets keep moving forward until they reach their destination rather than being repeatedly passed among the switches. Moreover, the complete dual-centric graph employed within each pod does not cause deadlocks. In addition, the AWGRs used for the connection among all pods through optical links support rapid and non-blocking transmission between any two ports.
The path diversity and dual-centric interconnections that characterize HiveNet lead to remarkable fault tolerance when a switch or link failure occurs within the network. Moreover, the HiveNet network exploits the dual ports of the nodes to avoid faults at the root switch, which represent the main drawback of switch-centric networks, e.g., Fat-Tree, where such a fault can cause the entire network to collapse.
Each cluster in a pod has m local switches (root switches) connected to m nodes each, with each node having two ports. If a fault happens in a local switch, a node can still forward the packet by using its second port, which is connected to another node and hence to another local switch in the same cluster. The intermediary switches are responsible for the pods' communication. Similarly, if a fault occurs in an intermediary switch, the packet can still be forwarded via one of the remaining intermediary switches in the same pod. As a result, the nodes in a pod become unable to communicate with each other only if all m local or intermediary switches fail simultaneously. Communication among the nodes of different pods passes through the m AWGRs. When an AWGR or its link fails, the packet can still be forwarded by the remaining AWGRs. The communication among the pods cannot be completed unless all AWGRs or all m intermediary switches fail, whether in the source or the destination pod. The failure scenarios described above have a small probability of occurrence. Fault-tolerant routing will be investigated in more depth in future work.
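The first-hop decision under a local switch failure can be sketched as follows; the node-to-node pairing used for the fallback is the same illustrative assumption as in the construction sketch above.

```python
def first_hop(node, failed_switches):
    """Choose the first hop for a packet leaving `node` = (p, c, l, n).
    If the node's own local switch has failed, fall back to the second
    port and relay through the paired node (which reaches another local
    switch in the same cluster)."""
    p, c, l, n = node
    primary = ("LS", p, c, l)
    if primary not in failed_switches:
        return primary
    if l != n:
        return ("node", p, c, n, l)   # assumed second-port neighbour
    return None                        # no direct fallback modelled for l == n

assert first_hop((0, 0, 1, 2), {("LS", 0, 0, 1)}) == ("node", 0, 0, 2, 1)
```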
Performance evaluations
This section investigates the HiveNet network performance under different traffic manners in terms of network latency and throughput, as well as the impact of switch and link failures on the reliability of network paths. Then, the proposed network’s performance is compared with that of the Fat-Tree, H-LION, BCube, and Lotus networks.
Network latency and throughput
An HPC or DCN architecture can be evaluated through a variety of measurements; here we study two significant measures, namely network latency and throughput. Offered load refers to the rate at which source nodes inject packets into the network; when this demand exceeds the network's capacity, congestion occurs, leading to performance degradation. As congestion builds up, the latency increases, measured here as end-to-end (ETE) latency. ETE latency is the total time a packet takes to travel from its source to its destination, encompassing transmission, processing, propagation, and queuing delays. Throughput, defined as the number of packets successfully delivered per unit time, rises with the offered load. Eventually, throughput reaches a saturation point, where further increases in the offered load no longer result in higher throughput, as the network becomes fully congested.
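The end-to-end latency just described decomposes additively over the hops of a path; a standard formulation (with notation introduced here only for illustration) is:

```latex
T_{\mathrm{ETE}} \;=\; \sum_{\text{hops}}
\left( \frac{L}{B} \;+\; d\,\tau \;+\; T_{\mathrm{proc}} \;+\; T_{\mathrm{queue}} \right),
```

where L is the packet size, B the link bandwidth (so L/B is the transmission delay), d the cable length, τ the per-meter propagation delay, and T_proc and T_queue the processing and queuing delays at the traversed device.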
[Fig. 3: End-to-end latency (a)-(c) and average throughput (d)-(f) of the HiveNet, Fat-Tree, H-LION, BCube, and Lotus networks (with 256 nodes) under the all-to-all, nearest-neighbor, and hot-spot traffic patterns, respectively]

[Fig. 4: End-to-end latency (a)-(c) and average throughput (d)-(f) of the HiveNet, Fat-Tree, H-LION, BCube, and Lotus networks (with 4096 nodes) under the all-to-all, nearest-neighbor, and hot-spot traffic patterns, respectively]

[Fig. 5: The impact of switch and link failures on path failures in the class A networks]
We use the OPNET simulator (OPNET Modeler 2025) to evaluate the latency and throughput of the HiveNet, Fat-Tree, H-LION, BCube, and Lotus networks. The reported latency and throughput results under the different traffic patterns are the average values over the simulation runs. In the simulation environment, the packet size is set to 512 bytes. A DGX-1 system includes four 100 Gb/s InfiniBand EDR interfaces and two 10 Gb/s Ethernet interfaces (NVIDIA Corporation 2025); therefore, the optical and electrical transmission bandwidths are set to 100 Gb/s and 10 Gb/s, respectively. A delay of 40 ns is assumed for each electrical switch, and a delay of 5 ns per meter for the cables (Proietti et al. 2015). To address the traffic aggregation in the network's lower layers, 100 Gb/s optical transmission technology is used (Kuschnerov et al. 2015); thus, the optical transmission bandwidth is ten times the electrical bandwidth.
To comprehensively evaluate the latency and throughput of the HiveNet, Fat-Tree, H-LION, BCube, and Lotus networks, we simulate two different network sizes. First, we set HiveNet with 256 nodes (m = 4), Fat-Tree with 252 nodes (16-port switches), H-LION with 256 nodes (p = 2, = 4, and W = 7), BCube with 256 nodes (n = 4), and Lotus with 234 nodes (n = 6 and W = 2). Second, we set HiveNet with 4096 nodes (m = 8), Fat-Tree with 4096 nodes (32-port switches), H-LION with 4032 nodes (p = 7, = 3, and W = 7), BCube with 4096 nodes (n = 8), and Lotus with 4100 nodes (n = 10 and W = 4).
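With these parameters, the fixed (non-queuing) delay of a single hop can be worked out directly; the cable lengths below and the treatment of the AWGR as delay-free passive fiber are illustrative assumptions.

```python
PACKET_BITS = 512 * 8           # 512-byte packets
ELEC_BW, OPT_BW = 10e9, 100e9   # 10 Gb/s electrical, 100 Gb/s optical
SWITCH_DELAY = 40e-9            # 40 ns per electrical switch
CABLE_DELAY = 5e-9              # 5 ns per metre of cable

def hop_delay(bandwidth_bps, cable_m, electrical_switch=True):
    """Serialization + propagation (+ switch) delay of one hop."""
    serialization = PACKET_BITS / bandwidth_bps
    return serialization + cable_m * CABLE_DELAY + (SWITCH_DELAY if electrical_switch else 0.0)

# Example: an electrical hop over 2 m of cable vs. an optical hop over 10 m
# of fiber through a passive AWGR (assumed to add no switching delay).
print(f"electrical hop: {hop_delay(ELEC_BW, 2) * 1e9:.1f} ns")                           # ~459.6 ns
print(f"optical hop:    {hop_delay(OPT_BW, 10, electrical_switch=False) * 1e9:.1f} ns")  # ~91.0 ns
```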
All-to-all traffic
Under the all-to-all scenario, the average latency of the different networks is illustrated in Fig. 3(a). The performance of the Fat-Tree, HiveNet, H-LION, and Lotus structures is similar, with injected-load saturation points at about 95%, 86%, 84%, and 76%, respectively. Since the diameter of BCube is large and increases with the number of network layers, its performance is the worst, with a saturation point at 64% of the injected load. In contrast, Fat-Tree and HiveNet have small and stable diameters irrespective of the size of the network, leading to low latency. Furthermore, the AWGRs used in the upper layer of the HiveNet network address the traffic convergence issue and minimize transmission latency.
On the other hand, Fig. 3(d) illustrates the throughput performance of Fat-Tree, HiveNet, H-LION, Lotus, and BCube. Considering the injected load in these networks, the average throughputs are saturated at about 92%, 83%, 80%, 76% and 60%, respectively.
As the number of nodes in the Fat-Tree, HiveNet, H-LION, Lotus, and BCube networks increases, their performance decreases, as demonstrated in Fig. 4(a) and (d). Compared to the small-scale networks (256 nodes), the latency and throughput of Fat-Tree and HiveNet at large scale (4096 nodes) decrease only slightly; more precisely, their saturation points are lower by approximately 5% than those of the small-scale networks. This is because the diameters of Fat-Tree and HiveNet are stable and unaffected by network scaling. However, the performance of BCube deteriorates significantly because its network diameter increases as the network expands.
Nearest-neighbor traffic
In this traffic pattern, every node in a pod transmits data to and receives data from every adjacent pod. The latency and throughput performance of Fat-Tree, HiveNet, H-LION, Lotus, and BCube under this traffic is almost similar to that under all-to-all traffic, as illustrated in Figs. 3(b) and (e) and 4(b) and (e). The abundance of resources in the connections between adjacent groups in these networks is the reason behind this result. Additionally, HiveNet and BCube exploit the node ports for forwarding data, creating different paths between adjacent clusters.
Hot-spot traffic
The end-to-end network latency of Fat-Tree, HiveNet, H-LION, Lotus, and BCube is illustrated in Fig. 3(c). In the HiveNet network, all the packets are transmitted from pods 1-3 to pod 0. Similarly, in the Fat-Tree, H-LION, Lotus, and BCube networks, all the packets from the other clusters are transmitted to a single cluster. BCube's end-to-end delay under this traffic is the worst among the networks, saturating at 2.8% of the injected load. On the other hand, the saturation points of Fat-Tree, HiveNet, H-LION, and Lotus are at about 12.1%, 10.4%, 9.1%, and 8.2% of the injected load, respectively.
The throughput of the networks mentioned above is shown in Fig. 3(f). The BCube network has the lowest performance compared to Fat-Tree, HiveNet, H-LION, and Lotus due to the limited number of global links at the high level of the network interconnecting its groups. In contrast, the AWGRs at the higher level of HiveNet provide a faster mechanism for inter-pod communication, significantly relieving the bottleneck.
The performance of BCube in the larger-scale network decreases significantly, as shown in Fig. 4(c) and (f); its network delay reaches saturation at about 0.09% of the injected load. For Fat-Tree, HiveNet, H-LION, and Lotus, the saturation points of the injected load are 8.4%, 7.1%, 6.8%, and 5.4%, respectively.
[Fig. 6: The impact of switch and link failures on path failures in the class B networks]
Table 2. A comprehensive comparison of the HiveNet architecture with other network architectures

| | HiveNet | Fat-tree | H-LION | Leaf-Spine | BCube | Lotus |
|---|---|---|---|---|---|---|
| Scalability | High | Medium | High | Medium | High | High |
| Incremental scalability | Quite high | Quite high | Low | Quite high | Not necessary | Low |
| Server number | m⁴ | | | | | |
| Electrical switches | m³ + m² | 5N/n | – | 5N/2m | | |
| AWGRs | m | – | | – | – | |
| Links number | (3m⁴ + m³ + 2m²)/2 | | | | | |
| Diameter | 8 | | | 4 | | 9 |
| Bisection width | N/(4m) | N/2 | | N/2 | N/2 | |
Fault-tolerance
Given that the failure of any switch or link can result in significant degradation of network performance, the high resilience of a network architecture to various types of failures is a distinguishing characteristic. It enables the identification of the most robust architectures, which can serve as the underlying infrastructure for highly trustworthy applications. To ensure resilience against the various types of failures within the network, redundant physical interconnections and resources become essential in DCNs. This subsection examines the impact of switch and link failures on the reliability of network paths.
Two sets of simulations were conducted to assess the behavior of the HiveNet network under switch and link failures, comparing it with the Fat-Tree, H-LION, BCube, and Lotus networks. The first set evaluates networks with approximately 150 servers (class A), while the second set evaluates networks with approximately 600 servers (class B). It’s important to note that the networks being compared accommodate roughly the same number of servers.
As shown in Fig. 5(a), the HiveNet network in class A demonstrates superior robustness against link failures when compared to the other networks. Specifically, HiveNet improves robustness by 58.4%, 13.8%, 74.5%, and 79.2% over Fat-Tree, H-LION, BCube, and Lotus, respectively. Likewise, Fig. 6(a) shows that the HiveNet network in class B exhibits the highest robustness compared to the other architectures, achieving a resilience improvement of 26.6%, 13.3%, 52.4%, and 66.7% over Fat-Tree, H-LION, BCube, and Lotus, respectively, in terms of link failure resilience.

The curves shown in Figs. 5(b) and 6(b) highlight the remarkable superiority of the HiveNet network in comparison to the other networks. In Fig. 5(b), the HiveNet network in class A shows the highest robustness against switch failures compared to the other architectures. More specifically, HiveNet achieves a robustness improvement of 29.1%, 12.9%, 61.2%, and 79.1% over Fat-Tree, H-LION, BCube, and Lotus, respectively, in terms of switch failure resilience. Furthermore, as shown in Fig. 6(b), the HiveNet network outperforms the Fat-Tree, H-LION, BCube, and Lotus networks in class B by margins of 24.3%, 14.6%, 52.4%, and 69.5%, respectively.
Overall, the HiveNet network demonstrates superior resilience to both switch and link failures in comparison to other architectures. This superior performance is attributed to the variety of paths and dual-centric interconnections inherent in HiveNet, which provide exceptional fault tolerance in the event of a switch or link failure. Additionally, HiveNet leverages dual ports in the nodes to mitigate faults at the root switch, a critical vulnerability in switch-centric networks, which can lead to the collapse of the entire network.
Construction evaluations
This section compares the topological characteristics, diameter, bisection width, scalability, construction costs, and power consumption of HiveNet with other widely used DCN architectures, including Fat-Tree, H-LION, Leaf-Spine, BCube, and Lotus. In order to ensure a fair comparison, all architectures are configured with an identical number of servers and switches, each having the same number of ports.
The topological characteristics of a DCN significantly impact its performance. This study includes a thorough analysis of the topological features of HiveNet, Fat-Tree, H-LION, Leaf-Spine, BCube, and Lotus. A summarized overview of these characteristics is provided in Table 2, which includes details on the network diameter, bisection bandwidth, and the numbers of switches, servers, and links for each DCN architecture.
Diameter
Table 2 reveals that the Leaf-Spine architecture exhibits the smallest diameter. However, its network size is constrained by the number of switch ports, n, in the top layer. Once the switch ports in the top layer are fully utilized, the network size of the Leaf-Spine architecture becomes considerably smaller than that of other architectures. Similarly, the size of the Fat-Tree network is also constrained by the number of switch ports in the top layer, and its diameter equals twice its height. In both the H-LION and BCube networks, the diameter grows as k, the network level, increases, which makes these architectures less suitable for DCNs. In contrast, HiveNet and Lotus exhibit a linear increase in diameter, which results in improved performance, particularly in terms of transmission delay. As outlined in Section 3, the HiveNet diameter is bounded, with intra-cluster, intra-pod, and inter-pod diameters of 2, 6, and 8, respectively, independent of the DCN size or network level. This ensures that HiveNet can effectively support real-time applications.
Bisection width
The bisection widths of the Leaf-Spine, BCube, and Fat-Tree structures are all equal to N/2, i.e., half of the total number of servers in their respective networks. The exact bisection widths of H-LION and Lotus are not known in closed form, although lower bounds can be established. The bisection width of HiveNet, N/(4m), increases as the number of servers grows, which suggests that HiveNet becomes more efficient in large-scale DCNs. Thus, HiveNet offers a wide range of potential paths among servers, enhancing its inherent fault tolerance. Furthermore, HiveNet can achieve a full bisection width of N/2, accomplishing this at a significantly lower cost than the Fat-Tree, Leaf-Spine, and BCube architectures. The cost efficiency of this advantage is explored in more detail in the following subsection.
Scalability
Scalability is a key factor in the construction of DCNs, as cloud computing services rely on the ability to scale efficiently. The DCNs must support the interconnection of a vast number of servers while maintaining performance. Furthermore, the construction must facilitate the gradual expansion of network capacity, guaranteeing that upgrades to DCN components do not require substantial reconfiguration of the network topology.
Table 2 illustrates that the scalability of BCube is constrained by the number of server ports. While this architecture can support additional servers, doing so requires the addition of more NIC ports. In Fat-Tree, H-LION, Leaf-Spine, and Lotus, scalability is limited by the number of switch ports: once all switch ports are fully occupied, no additional servers can be integrated into the architecture. To address this challenge, switches with a significantly higher number of ports must be used to build the DCN. The scalability of HiveNet is constrained only by the number of ports of the top-layer switches. As HiveNet expands to include more servers, only the ports of the top-layer switches need to be extended. To scale up the HiveNet structure, pods can be added with a flexible number of servers, allowing continuous and efficient expansion. This approach facilitates seamless expansion and significantly improves the network's scalability. In more detail, adding one pod introduces m new clusters, each capable of hosting m² servers; as a result, each pod added to the HiveNet network grows its capacity by m³ servers. Based on the analysis above, it is clear that a flexible number of servers can be added without compromising the overall topological characteristics of the network.
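A short sketch of the resulting growth pattern, using the node count from Theorem 1 (N = m⁴ for a fully populated network of m pods) and the per-pod increment of m³ servers described above:

```python
def total_capacity(m: int) -> int:
    """Nodes in a fully populated HiveNet with m pods (Theorem 1)."""
    return m ** 4

def added_capacity_per_pod(m: int) -> int:
    """Servers gained by adding one pod: m clusters of m**2 nodes each."""
    return m ** 3

for m in (4, 6, 8, 10, 18):
    print(f"m = {m:>2}: {total_capacity(m):>7,} nodes total, "
          f"+{added_capacity_per_pod(m):,} per added pod")
# m = 4: 256; m = 8: 4,096; m = 18: 104,976 -- the sizes used in the evaluation.
```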
Cost and energy consumption
This section assesses the costs and energy consumption based on the switches (optical and electrical), the NICs, and the number of links for different network sizes. In addition, it accounts for the prices of the tunable wavelength converters that must be used in conjunction with the AWGRs, whose laser tuning speed can be at the nanosecond level (Yin et al. 2012). We then compare the construction cost and energy consumption of the optical (H-LION; Proietti et al. 2015), hybrid (Lotus), and electrical (Fat-Tree, BCube, and Leaf-Spine; Al-Fares et al. 2008; Guo et al. 2009; Alizadeh and Edsall 2013) structures with HiveNet. The number of switches and links in each architecture is determined based on the formulas outlined in Table 2. Fat-Tree and Leaf-Spine are made of three layers and two layers of electronic switches, respectively, while BCube contains multiple layers of electronic switches. H-LION is built of passive AWGRs and fast-tunable lasers. Lotus is a hybrid topology integrating electrical switches and AWGRs. Since the nodes contribute equally to all architectures, only the network's cost and energy consumption are estimated. Moreover, to achieve a fair comparison, there is no oversubscription at the TORs in any architecture. The numbers of switches, links, and NICs for the various network sizes are obtained from the network configurations.
Table 3. Prices and power consumption of the interconnection networks' components

| Switches & TXs | Ports | Cost ($) | Power (W) |
|---|---|---|---|
| Electrical switch | P ≤ 128 | 20 per port | 2 per port |
| | P = 256 | 10,922 | 622 |
| | P = 512 | 65,532 | 2,490 |
| | P = 1024 | 131,064 | 5,050 |
| AWGR | P = 8 | 300 | – |
| | P = 16 | 1,200 | – |
| | P = 24 | 2,700 | – |
| | P = 32 | 4,800 | – |
| NIC | 1 | 5 | 5 |
| | 2 | 10 | 7.5 |
| | 4 | 20 | 10 |
| Tunable | – | 75 | 0.2 |
Table 4. Parameters for constructing several scales of interconnection architectures

1,000 nodes:

| | HiveNet | H-LION | Fat-tree | Leaf-Spine | BCube | Lotus |
|---|---|---|---|---|---|---|
| Electrical switch: amount (radix) | 216 (7) L1, 36 (7) L2 | – | 128 (16) L1, 128 (16) L2, 64 (16) L3 | 25 (80) L1, 8 (128) L2 | 256 (16) | 136 (10) L1, 136 (16) L2 |
| AWGR | 6 (6) | 448 (4) | – | – | – | 17 (8) |
| Links | 2,088 | 2,432 | 3,072 | 2,048 | 3,072 | 2,618 |
| Tunable | 36 | 1,024 | – | – | – | 136 |
| NICs (radix) | 1,296 (2) | 1,024 (1) | 1,024 (1) | 1,024 (1) | 1,024 (2) | 1,088 (1) |

10,000 nodes:

| | HiveNet | H-LION | Fat-tree | Leaf-Spine | BCube | Lotus |
|---|---|---|---|---|---|---|
| Electrical switch: amount (radix) | 1,000 (11) L1, 100 (11) L2 | – | 626 (32) L1, 626 (32) L2, 313 (32) L3 | 156 (128) L1, 19 (512) L2 | 1,452 (22) | 876 (18) L1, 876 (22) L2 |
| AWGR | 10 (10) | 2,736 (8) | – | – | – | 73 (12) |
| Links | 15,600 | 26,496 | 30,048 | 19,968 | 31,944 | 24,528 |
| Tunable | 100 | 10,368 | – | – | – | 876 |
| NICs (radix) | 10,000 (2) | 10,368 (1) | 10,016 (1) | 9,984 (1) | 10,648 (3) | 10,512 (1) |

100,000 nodes:

| | HiveNet | H-LION | Fat-tree | Leaf-Spine | BCube | Lotus |
|---|---|---|---|---|---|---|
| Electrical switch: amount (radix) | 5,832 (19) L1, 324 (19) L2 | – | 2,780 (72) L1, 2,780 (72) L2, 1,390 (72) L3 | 1,560 (128) L1, 98 (1,024) L2 | 23,328 (18) | 4,862 (32) L1, 4,862 (44) L2 |
| AWGR | 18 (18) | 10,545 (32) | – | – | – | 221 (22) |
| Links | 160,704 | 229,824 | 300,240 | 199,680 | 419,904 | 243,100 |
| Tunable | 324 | 103,968 | – | – | – | 4,862 |
| NICs (radix) | 104,976 (2) | 103,968 (1) | 100,080 (1) | 99,840 (1) | 104,976 (4) | 106,946 (1) |
[Fig. 7: The normalized number of links per node for different architectures]
The component costs and energy consumption of the electrical and optical structures are shown in Table 3 (Elpeus Technology 2025; Industrial Networking Solutions 2025; Yan et al. 2015; Arista Networks 2025; Yu et al. 2013). The component prices and power consumption figures are taken from laboratory prototypes and currently available commercial components, and may vary as new technologies are adopted; they should be considered approximations of the actual values.
Assume that N nodes in total are connected to the network and that each node operates at a data rate of b Gb/s; the total traffic to be aggregated at the TOR layer is then N·b Gb/s. To better understand the energy consumption and cost of the various structures and their components, we investigate expanding the network at a fixed data rate of 10 Gb/s (b = 10) with the number of nodes varying from 1,000 to 100,000. Table 4 presents all network elements (the quantities and port counts of the optical and electronic switches, as well as the numbers of NICs, links, and tunable transmitters) needed to interconnect about 1,000, 10,000, and 100,000 nodes via the optical and electrical structures.
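For illustration, the aggregation itself is straightforward; the sketch below prices the HiveNet column of Table 4 for the ~1,000-node (m = 6) case using the per-unit figures of Table 3. The per-port rate applied to the small (7-port) switches, the price assumed for the 6-port AWGR (not listed in Table 3), and the inclusion of the dual-port NICs are assumptions of this sketch rather than the paper's exact accounting.

```python
# Per-unit figures taken (or assumed) from Table 3.
ELEC_COST_PER_PORT, ELEC_POWER_PER_PORT = 20, 2   # <=128-port electrical switch
AWGR_COST = 300                                    # nearest listed AWGR size (8 ports)
TUNABLE_COST, TUNABLE_POWER = 75, 0.2
NIC2_COST, NIC2_POWER = 10, 7.5                    # dual-port NIC

def hivenet_cost_power(m: int):
    """Rough construction cost ($) and power (W) of the HiveNet fabric."""
    n_switch_ports = (m**3 + m**2) * (m + 1)       # electrical switches, radix m + 1
    cost = (n_switch_ports * ELEC_COST_PER_PORT
            + m * AWGR_COST                         # passive AWGRs (no power entry)
            + m**2 * TUNABLE_COST                   # one tunable TX per intermediary switch
            + m**4 * NIC2_COST)
    power = (n_switch_ports * ELEC_POWER_PER_PORT
             + m**2 * TUNABLE_POWER
             + m**4 * NIC2_POWER)
    return cost, power

cost, power = hivenet_cost_power(6)
print(f"m = 6 (1,296 nodes): about ${cost:,} and {power:,.1f} W")
```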
Figure 7 shows the total number of links normalized by the number of interconnected nodes for the various architectures and network sizes. Based on the above analysis of the HiveNet configuration, the number of ports per electrical switch at the lower tier remains constant as the network scales. Consequently, the proposed architecture uses fewer links than the other architectures, regardless of network size. More precisely, at the large network scale HiveNet needs only 53.5% and 38.3% of the links used in Fat-Tree and BCube, respectively. Furthermore, HiveNet uses 30.1%, 19.5%, and 33.8% fewer links than H-LION, Leaf-Spine, and Lotus, respectively, at the large network scale.
[Fig. 8: The normalized cost per node for different architectures]
Figure 8 demonstrates the overall costs of the various architectures and network sizes normalized by the number of interconnected nodes. Compared to the other architectures, H-LION is more expensive because of the high cost of its tunable devices. HiveNet is the most cost-effective architecture for all network sizes (1,000, 10,000, and 100,000 nodes). In exact terms, for the small-scale network, the construction cost of HiveNet is about 49.6% and 35.7% of that of Fat-Tree and H-LION, respectively; compared with Leaf-Spine, BCube, and Lotus, the HiveNet architecture saves 23.1%, 72.8%, and 53.7% of the cost, respectively. Moreover, as the network expands, the electrical architectures become more costly than the HiveNet architecture, mainly because of their excessive use of electrical switches. For the large-scale network, HiveNet's construction cost is about 49.3%, 26.4%, 32.7%, 54.1%, and 59.3% of that of Fat-Tree, H-LION, Leaf-Spine, BCube, and Lotus, respectively.
[Fig. 9: The normalized power consumption per node for different architectures]
Figure 9 illustrates the aggregate power consumption of the architectures normalized by the number of interconnected nodes. The normalized energy consumption of the electrical architectures (Fat-Tree, BCube, and Leaf-Spine) is the highest among the structures due to their reliance on electrical switches. H-LION has the lowest power consumption because of its adoption of passive AWGRs. The HiveNet architecture has lower power consumption than the electrical structures because it takes advantage of passive AWGRs and limits the use of large numbers of electrical switches. In other words, for the large-scale network, HiveNet achieves significant power savings of 34.8%, 48.2%, 29.8%, and 23.1% compared to Fat-Tree, BCube, Leaf-Spine, and Lotus, respectively.
Design trade-offs and complexity
While AWGRs offer low-latency, high-throughput communication, their lack of optical buffering can introduce latency under bursty traffic conditions. In contrast, electrical switches in the lower layers offer power efficiency and electronic buffering, which reduces latency for bursty traffic but increases energy consumption. HiveNet’s hybrid approach strikes a balance by combining low-latency optical switching with energy-efficient buffering, ensuring optimal performance while maintaining scalability. Furthermore, a comparative complexity analysis of HiveNet against other hybrid architectures, focusing on hardware complexity and scalability, is provided in the above section. This analysis demonstrates how HiveNet’s dual-centric design facilitates incremental scalability and cost-effective deployment, positioning it as a viable solution for large-scale DCNs and HPC systems.
Recognizing that no singular DCN architecture can universally excel across all performance metrics without compromises is imperative. Each network design prioritizes certain characteristics over others based on the specific demands it seeks to meet. In the case of HiveNet, the architecture has been meticulously designed to strike a balanced compromise, emphasizing scalability, cost-efficiency, and power consumption, which are critical for the sustainable expansion of modern DCs. Indeed, as noted, the Fat-tree architecture demonstrates a marginally superior latency performance, exceeding HiveNet by approximately 9%. This is attributable to the Fat-tree’s extensive use of connections and switches, which, while beneficial for reducing latency, also result in increased complexity, higher costs, and elevated power consumption. Conversely, HiveNet’s design philosophy embraces a more holistic approach, aiming to deliver a scalable and economically viable solution without disproportionately escalating operational expenses or energy usage. As detailed in this Section, HiveNet distinguishes itself by offering remarkable scalability and cost-efficiency alongside significant power savings, all while maintaining a bisection width comparable to that of the Fat-tree. These attributes are particularly advantageous for DC operators seeking to navigate the challenges of rapid growth and environmental sustainability. By judiciously balancing these considerations, HiveNet represents a forward-thinking solution that addresses the multifaceted demands of contemporary DC environments.
Conclusion
This paper presented HiveNet, a novel hybrid interconnect architecture based on AWGRs and dual-port nodes, and thoroughly examined its features. The architecture uses low-radix electrical switches to reduce cable complexity and construction costs while exploiting the network's full potential. HiveNet comprises several pods, allowing nodes to be added incrementally while preserving its topological properties. By leveraging the dual-port functionality of the nodes, HiveNet supports multiple communication types, including node-to-switch and node-to-node connections, which enhances fault tolerance and increases the bisection width. A customized routing algorithm is proposed to improve routing efficiency. System performance is numerically assessed through simulations under three different traffic patterns. The results show that HiveNet exhibits controlled delay and strong aggregate throughput, making it an excellent candidate for DCNs and HPC systems. Additionally, when accommodating 104,976 nodes operating at 10 Gb/s each, the construction cost of HiveNet is only about 49.3%, 26.4%, 32.7%, 54.1%, and 59.3% of that of Fat-Tree, H-LION, Leaf-Spine, BCube, and Lotus, respectively. Furthermore, compared to Fat-Tree, BCube, Leaf-Spine, and Lotus, HiveNet reduces power consumption by 34.8%, 48.2%, 29.8%, and 23.1%, respectively.
HiveNet strategically localizes the optical-electrical interface to a single high-speed port on each intermediary switch, ensuring that inter-pod traffic transitions efficiently into the optical domain. This design minimizes conversion overhead and maintains high throughput even as traffic scales. By combining selective optical integration with electrical switching at the lower layers, HiveNet achieves a balanced trade-off between performance, scalability, and cost, making it robust for both current workloads and future network growth. Nevertheless, several challenges must be acknowledged. The absence of optical buffering remains a fundamental limitation, particularly for bursty traffic, since AWGRs cannot buffer traffic at packet-level granularity. In addition, the scalability of the architecture depends heavily on the scalability of AWGRs. To address these limitations, future work will incorporate AI-driven routing techniques and advanced traffic management strategies to enhance HiveNet's adaptability, improve traffic flow control, and optimize performance.
Data Availability
The data are available from the corresponding author on reasonable request.
Declarations
Competing Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this article.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
Al-Fares, M; Loukissas, A; Vahdat, A. A scalable, commodity data center network architecture. ACM SIGCOMM Comp Commun Rev; 2008; 38,
Alizadeh M, Edsall T (2013) On the data path performance of leaf-spine datacenter fabrics. In: 2013 IEEE 21st annual symposium on high-performance interconnects. IEEE, pp 71–74
Al-Makhlafi, M; Gu, H; Almuaalemi, A et al. Ribsnet: a scalable, high-performance, and cost-effective two-layer-based cloud data center network architecture. IEEE Trans Netw Serv Manage; 2022; 20,
Al-Makhlafi M, Gu H, Yu X et al (2020) P-cube: a new two-layer topology for data center networks exploiting dual-port servers. IEICE Trans Commun, p 2019EBP3219
Arista Networks (2025) 7500 series universal spine and cloud networks. https://www.arista.com/en/products/7500-series
Ballani H, Costa P, Behrendt R et al (2020) Sirius: a flat datacenter network with nanosecond optical switching. In: Proceedings of the annual conference of the ACM special interest group on data communication on the applications, technologies, architectures, and protocols for computer communication, pp 782–797. https://doi.org/10.1145/3387514.3405887
Bari, MF; Boutaba, R; Esteves, R et al. Data center network virtualization: a survey. IEEE Commun Surv Tutor; 2012; 15,
Chen K, Wen X, Ma X et al (2015) Wavecube: a scalable, fault-tolerant, high-performance optical data center architecture. In: 2015 IEEE Conference on Computer Communications (INFOCOM). IEEE, pp 1903–1911
Christodoulopoulos, K; Lugones, D; Katrinis, K et al. Performance evaluation of a hybrid optical/electrical interconnect. J Opt Commun Netw; 2015; 7,
Cisco Systems, Inc. (2024) Cisco catalyst 9400 series switch data sheet. https://www.cisco.com/c/en/us/products/collateral/switches/catalyst-9400-series-switches/nb-06-cat9400-ser-data-sheet-cte-en.html. Accessed 24 July 2025
Elpeus Technology (2025) Product categories. http://www.elpeus.com/categories
Farrington, N; Forencich, A; Porter, G et al. A multiport microsecond optical circuit switch for data center networking. IEEE Photonics Technol Lett; 2013; 25,
Fiorani, M; Aleksic, S; Casoni, M et al. Energy-efficient elastic optical interconnect architecture for data centers. IEEE Commun Lett; 2014; 18,
Ghiasi, A. Large data centers interconnect bottlenecks. Opt Express; 2015; 23,
Grani P, Proietti R, Akella V et al (2017) Design and evaluation of awgr-based photonic noc architectures for 2.5 d integrated high performance computing systems. In: 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, pp 289–300
Guo C, Lu G, Li D et al (2009) Bcube: a high performance, server-centric network architecture for modular data centers. In: Proceedings of the ACM SIGCOMM 2009 conference on Data communication, pp 63–74
Guo C, Wu H, Tan K et al (2008) Dcell: a scalable and fault-tolerant network structure for data centers. In: Proceedings of the ACM SIGCOMM 2008 conference on data communication, pp 75–86
Hintemann R, Hinterholzer S (2019) Energy consumption of data centers worldwide. Business, Computer Science (ICT4S)
Hou, S; Hu, Y; Tian, L et al. Poly-hqos: a polymorphic packet scheduler for traffic isolation in multi-tenant cloud environment. J King Saud Univ - Comp Inf Sci; 2025; 37,
Industrial Networking Solutions (2025) Product categories. http://www.industrialnetworking.com/Category
Kamei S, Ishii M, Itoh M et al (2003) 64 × 64-channel uniform-loss and cyclic-frequency arrayed-waveguide grating router module. Electron Lett 39(1):83–84. https://doi.org/10.1049/el:20030054
Kitayama, KI; Huang, YC; Yoshida, Y et al. Torus-topology data center network based on optical packet/agile circuit switching with intelligent flow management. J Lightwave Technol; 2015; 33,
Kumar, V; Grama, A; Anshul, G et al. Introduction to parallel computing: design and analysis of algorithms. Benjamin/Cummings Publishing Company; 1994; 18, pp. 82-109.
Kuschnerov, M; Mangan, BJ; Gong, K et al. Transmission of commercial low latency interfaces over hollow-core fiber. J Lightwave Technol; 2015; 34,
Li, D; Wu, J; Liu, Z et al. Towards the tradeoffs in designing data center network architectures. IEEE Trans Parallel Distrib Syst; 2016; 28,
Liao, XK; Pang, ZB; Wang, KF et al. High performance interconnect network for tianhe system. J Comput Sci Technol; 2015; 30,
Li D, Guo C, Wu H et al (2009) Ficonn: using backup port for server interconnection in data centers. In: IEEE INFOCOM 2009. IEEE, pp 2276–2285
Liu Y, Muppala JK, Veeraraghavan M et al (2013) Data center networks: topologies, architectures and fault-tolerance characteristics. Springer Briefs in Computer Science. https://doi.org/10.1007/978-3-319-01949-9
Moeen AM, Gu H, Almuaalemi A et al (2024) Alveolinet: an incrementally scalable, cost-effective, and high-performance two-layer based architecture for data centers. IEEE Trans Netw Sci Eng
NVIDIA Corporation (2025) NVIDIA DGX-1 with Tesla V100 system architecture. https://images.nvidia.com/content/pdf/dgx1-v100-system-architecture-whitepaper.pdf
OPNET Modeler (2025) Opnet network simulator. https://opnetprojects.com/opnet-network-simulator/. Accessed 2025
Peng S, Simeonidou D, Zervas G et al (2014) A novel sdn enabled hybrid optical packet/circuit switched data centre network: the lightness approach. In: 2014 European Conference on Networks and Communications (EuCNC). IEEE, pp 1–5
Porter, G; Strong, R; Farrington, N et al. Integrating microsecond circuit switching into the data center. ACM SIGCOMM Comp Commun Rev; 2013; 43,
Poutievski L, Mashayekhi O, Ong J et al (2022) Jupiter evolving: transforming google’s datacenter network via optical circuit switches and software-defined networking. In: Proceedings of the ACM SIGCOMM 2022 conference, pp 66–85
Proietti, R; Yin, Y; Yu, R et al. Scalable optical interconnect architecture using awgr-based tonak lion switch with limited number of wavelengths. J Lightwave Technol; 2013; 31,
Proietti, R; Cao, Z; Nitta, CJ et al. A scalable, low-latency, high-throughput, optical interconnect architecture based on arrayed waveguide grating routers. J Lightwave Technol; 2015; 33,
Ullah, Y; Roslee, M; Mitani, SM et al. A survey on ai-enabled mobility and handover management in future wireless networks: key technologies, use cases, and challenges. J King Saud Univ - Comput Inf Sci; 2025; 37,
Xia Y, Sun XS, Dzinamarira S et al (2017) A tale of two topologies: exploring convertible data center network architectures with flat-tree. In: Proceedings of the conference of the ACM special interest group on data communication, pp 295–308
Yan, F; Xue, X; Calabretta, N. Hifost: a scalable and low-latency hybrid data center network architecture based on flow-controlled fast optical switches. J Opt Commun Netw; 2018; 10,
Yan F, Miao W, Dorren H et al (2015) On the cost, latency, and bandwidth of lightness data center network architecture. In: 2015 International Conference on Photonics in Switching (PS). IEEE, pp 130–132
Yin, Y; Proietti, R; Ye, X et al. Lions: an awgr-based low-latency optical switch for high-performance computing and data centers. IEEE J Sel Top Quantum Electron; 2012; 19,
Yu, R; Cheung, S; Li, Y et al. A scalable silicon photonic chip-scale optical switch for high performance computing systems. Opt Express; 2013; 21,
Zheng, G; Gao, H; He, N et al. Sdass: secure data auditing for sharing matrix based secret share cloud storage supporting data dynamics. J King Saud Univ - Comput Inf Sci; 2025; 37,
© The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”).