Data Centers (DCs) are critical infrastructure supporting the digital world, and they require fast and reliable information transmission to remain sustainable. Ensuring their reliability and efficiency is essential for minimizing risk and maintaining operations. This study presents a novel availability-driven approach to optimizing maintenance costs in DC Uninterruptible Power Supply (UPS) systems configured in a parallel k-out-of-n arrangement. The model integrates reliability and availability metrics into a dynamic optimization framework that determines the number of components needed to achieve the desired availability at minimum maintenance cost. In simulations and a case study using variable failure rates and monthly maintenance costs, the model achieves a combined system availability of 99.991%, which exceeds the Tier 1 DC requirement of 99.671%. A sensitivity analysis with ±10% variations in Mean Time Between Failures (MTBF), Mean Time To Repair (MTTR), and maintenance costs demonstrates the model’s robustness and adaptability across diverse operational conditions. The analysis also evaluates how different k-out-of-n UPS configurations influence overall availability and maintenance costs, and it identifies the feasible configurations that achieve the required system availability while balancing operational costs. Finally, the optimal number of UPS components and the associated minimum costs are compared across DC tiers, highlighting the impact of varying availability requirements on maintenance strategies. These results show the model’s effectiveness in supporting critical maintenance planning, giving DC managers a robust tool for balancing operational expenses against uptime.
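The availability side of a k-out-of-n arrangement can be illustrated with a minimal sketch. It is not the study's dynamic optimization model: it assumes identical, independent UPS units, a steady-state unit availability of MTBF / (MTBF + MTTR), and a maintenance cost that grows linearly with the number of units. The function names, the MTBF and MTTR values, and the unit cost are hypothetical and chosen only for illustration; the 99.671% target is the Tier 1 requirement quoted above.

```python
from math import comb


def component_availability(mtbf_hours, mttr_hours):
    """Steady-state availability of a single unit."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def k_out_of_n_availability(k, n, a):
    """Probability that at least k of n identical, independent units are up."""
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i) for i in range(k, n + 1))


def min_units_for_target(k, a, target, unit_cost, max_n=20):
    """Smallest n >= k that meets the target availability.

    Assumes cost is linear in n (an illustrative simplification), so the
    smallest feasible n is also the cheapest one.
    Returns (n, availability, total_cost), or None if max_n is not enough.
    """
    for n in range(k, max_n + 1):
        availability = k_out_of_n_availability(k, n, a)
        if availability >= target:
            return n, availability, n * unit_cost
    return None


# Hypothetical inputs: 1000 h MTBF, 10 h MTTR, 2 units required to carry the load.
a_unit = component_availability(mtbf_hours=1000.0, mttr_hours=10.0)
result = min_units_for_target(k=2, a=a_unit, target=0.99671, unit_cost=500.0)
print(result)
```

Re-running the search with MTBF, MTTR, or the unit cost scaled by 0.9 and 1.1 gives a rough analogue of the ±10% sensitivity analysis described in the abstract.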
Details
Maintenance management;
Optimization;
Failure rates;
Availability;
Operations management;
Balancing;
Operating costs;
Fault tolerance;
Energy consumption;
MTBF;
Ventilation;
Adaptability;
Configurations;
Efficiency;
Data centers;
Computer centers;
Reliability engineering;
Maintenance costs;
Infrastructure;
Artificial intelligence;
Air conditioning;
Reliability;
Decision making;
Cost analysis;
Information processing;
Cost control;
Cloud computing;
Risk reduction;
Uninterruptible power supplies;
Critical infrastructure
