Content area

Abstract

This paper analyzes a finite-capacity GI/M/2/N queue with two heterogeneous servers operating under a multiple working-vacation policy, Bernoulli feedback, and customer impatience. Using the supplementary-variable technique in tandem with a tailored recursive scheme, we derive the stationary distributions of the system size as observed at pre-arrival instants and at arbitrary epochs. From these, we obtain explicit expressions for key performance metrics, including blocking probability, average reneging rate, mean queue length, mean sojourn time, throughput, and server utilizations. We then embed these metrics in an economic cost function and determine service-rate settings that minimize the total expected cost via the Bat Algorithm. Numerical experiments implemented in R validate the analysis and quantify the managerial impact of the vacation, feedback, and impatience parameters through sensitivity studies. The framework accommodates general renewal arrivals (GI), thereby extending classical (M/M/2/N) results to more realistic input processes while preserving computational tractability. Beyond methodological interest, the results yield actionable design guidance: (i) they separate Palm and time-stationary viewpoints cleanly under non-Poisson input, (ii) they retain heterogeneity throughout all formulas, and (iii) they provide a cost–optimization pipeline that can be deployed with routine numerical effort. Methodologically, we (i) characterize the generator of the augmented piecewise–deterministic Markov process and prove the existence/uniqueness of the stationary law on the finite state space, (ii) derive an explicit Palm–time conversion formula valid for non-Poisson input, (iii) show that the boundary-value recursion for the Laplace–Stieltjes transforms runs in linear time O(N) and is numerically stable, and (iv) provide influence-function (IPA) sensitivities of performance metrics with respect to (μ1,μ2,ν,α,ϕ,β).

Full text

Turn on search term navigation

1. Introduction

Modern service systems routinely face finite buffers, rework loops, energy-aware operation, and behavioral customer responses. Treating these features jointly and without restricting arrivals to be Poisson is essential for credible design. Queuing theory (In 1909, Agner Krarup Erlang, a Danish engineer employed at the Copenhagen Telephone Exchange, published the pioneering paper that laid the foundation for what is now known as queuing theory [1]. Erlang modeled the arrival of telephone calls at an exchange as a Poisson process and subsequently provided solutions to the M/D/1 queue in 1917 and the M/D/k queuing model in 1920 [2])—aptly characterized as the mathematics of waiting lines—furnishes a rigorous probabilistic language for analyzing congestion, delay, and capacity allocation in service systems driven by stochastic demand. Its core analytic tools include Markovian and renewal models, embedded chains, transform and generating-function techniques, matrix-analytic methods for quasi-birth–death (QBD) structures, and diffusion/heavy-traffic limits; together, these enable the derivation of stability conditions, stationary distributions, and performance descriptors such as waiting times, loss probabilities, utilizations, and throughput (see the classics [3,4,5,6,7,8,9,10] and Little’s law [11]). For recent references, see [12,13,14,15,16,17,18,19,20,21]. This apparatus is not merely descriptive: it supports normative decisions in mission-critical environments—including emergency response, communication networks, cloud-computing platforms, healthcare delivery, transportation, and logistics—where service quality and economic efficiency must be optimized jointly under uncertainty (cf. [22,23,24]). By articulating principled trade-offs between limited resources and stochastic variability, queuing theory has become a cornerstone for designing resilient, cost-effective service systems.

A salient evolution in the literature is the explicit modeling of customer impatience, which manifests primarily as (i) balking, whereby an arrival observing unfavorable conditions (e.g., long visible queues or high predicted delay) elects not to join, and (ii) reneging, whereby a customer who has joined subsequently abandons after an excessive waiting time. From an operational perspective, impatience fundamentally reshapes effective load, stability, and welfare: it removes demand endogenously, introduces state- and history-dependent losses, and alters equilibrium congestion levels in ways that are highly sensitive to information structure and behavioral assumptions. Classical and modern studies have established its impact across telecommunications, web services, healthcare, and manufacturing [25,26,27,28,29]. From an analytical standpoint, impatience complicates standard embeddings (e.g., arrival-epoch Markov chains) and undermines the convenience of PASTA outside the Poisson regime; even under Poisson input, the state dependence of abandonment and balking may require careful Palm-calculus distinctions between time-average and arrival-average measures [4].

A complementary stream of work studies working-vacation (WV) policies, under which servers reduce—rather than suspend—their service rate during vacation periods. The M/M/1/WV model introduced by [30] demonstrated that partial service during vacations can yield favorable sojourn-time and queue-length distributions relative to hard shutdowns, by tempering transient congestion while preserving energy or labor savings. Subsequent generalizations have spanned bulk arrivals [31,32], Erlang and general service distributions [33,34,35], multi-server configurations M/M/c [36,37,38,39], and discrete-time analogues [40,41]. Methodologically, WV policies induce a modulated service mechanism that couples vacation phases with the service process; tractability often follows from phase-type representations or supplementary variables that capture the WV phase jointly with the queue state.

Real traffic is frequently over-dispersed, bursty, or correlated, rendering Poisson arrivals an inadequate approximation. Renewal-input models GI/·/· capture such features parsimoniously. For GI/M/1 and finite-capacity variants with WV, notable contributions include explicit sojourn-time characterizations, interruption mechanisms, and phase-type vacation structures [31,42,43,44,45,46]. The passage from Poisson to general renewal input removes PASTA and necessitates a careful separation of time-stationary quantities from arrival-stationary (Palm) quantities. In particular, performance evaluation at pre-arrival epochs typically requires either (i) Markovianization by tracking residual inter-arrival time as a supplementary variable, or (ii) matrix-analytic constructions on an augmented state space; passage between Palm and time averages demands size-biasing and residual-life arguments.

Combining impatience with WV dynamics raises subtle questions about (i) state-dependent stability and ergodicity, (ii) the structural properties of stationary laws, and (iii) optimal control under explicit economic objectives. For Markovian baselines, explicit probability-generating functions across vacation and busy periods have been obtained in [47], while finite buffers, policy comparisons, and cost optimization under impatience/WV interactions appear in [48,49,50,51,52,53,54,55]. A related and practically pervasive mechanism is feedback: after service completion, a customer returns to the queue with fixed probability, modeling rework loops or packet retransmissions. Originating with Takács [56], feedback has been analyzed in Markovian and renewal settings [57,58,59,60,61,62,63,64]. Analytically, feedback alters the effective arrival rate in a state-dependent manner and couples departure streams to subsequent congestion, generating nontrivial interactions with vacation phases and impatience.

Most multi-server studies posit homogeneous servers, an assumption natural for automated systems but frequently violated in human-service or mixed-technology environments. Heterogeneity (μ1μ2) destroys exchangeability, induces state-dependent departure rates tied to server occupancy configurations, and generally breaks product-form heuristics. Early work on heterogeneity includes [65,66,67,68,69], with focused developments for M/M/2 (including WV) and transient analyses under impatience [70,71,72]. In contrast, renewal-input GI/M/2/N queues with heterogeneous servers remain comparatively underexplored; to our knowledge, [73] addresses performance metrics but leaves open a broader structural and economic treatment that accommodates impatience, feedback, and WV control simultaneously.

This paper: model, analytic approach, and scope. We consider a finite-buffer GI/M/2/N system with two heterogeneous exponential servers, a working-vacation policy, customer impatience in the form of balking and reneging, and Bernoulli feedback. The system captures operational features that co-occur in practice: (i) non-Poisson variability in demand (renewal inter-arrivals), (ii) endogenous demand attrition (balking, reneging), (iii) flexible, partially productive capacity modulation (working vacations), (iv) rework/retransmission loops (feedback), and (v) finite-capacity constraints (blocking). Formally, let {Ak}k1 be i.i.d. inter-arrival times with cdf A(u), density a(u), and mean 1/λ(0,); service times are exponential with heterogeneous rates μ1,μ2>0. A WV policy modulates the active server’s rate according to a finite-state vacation phase (standard variants include single and multiple vacation schemes). Balking is captured by a state-dependent acceptance probability θ¯(i)[0,1] when i customers are present upon arrival (with θ¯(N)=1 under finite capacity), while reneging is modeled via exponential patience with hazard α(i) that may depend on the queue length. Feedback is Bernoulli with probability β¯[0,1), applied to each service completion. Customers are served under FCFS; ties in server selection are resolved by a fixed priority or randomized rule. The system state records the queue length (including any in service), server-occupancy configuration (which server(s) are busy), WV phase, and remaining inter-arrival time. Because N<, the augmented process is irreducible and aperiodic on a finite state space, hence admitting a unique stationary distribution for any fixed parameter vector; all performance measures below are computed in this stationary regime. We emphasize that finite capacity and exponential impatience together preclude null recurrence even under heavy traffic; positive recurrence is automatic in our augmented finite-state description. Please see Table 1 for details.

Contributions and organization. Our analysis proceeds via the supplementary-variable technique [77], which Markovianizes the renewal input by tracking the remaining inter-arrival time and couples it with the WV phase and occupancy configuration. The resulting piecewise-deterministic Markov process admits a system of linear balance equations whose structure allows a recursive solution for (i) pre-arrival (Palm) stationary probabilities and (ii) time-stationary probabilities at arbitrary epochs, together with explicit Palm–time conversions to reconcile performance metrics that require one or the other viewpoint. This framework facilitates the computation of loss probability, mean queue length, waiting- and sojourn-time descriptors, and server utilizations, while preserving the heterogeneity of μ1 and μ2 and the modulation induced by WV phases. We then embed the stationary descriptors into an economic objective that trades off (a) capacity costs (including heterogeneous service rates and WV control parameters), (b) congestion costs (delay and queue-length penalties), (c) abandonment costs (reneging-induced losses), and (d) rework costs arising from feedback. The resulting design problem is nonconvex due to the nonlinear dependence of performance measures on (μ1,μ2), WV parameters, and impatience/feedback primitives; we therefore adopt the Bat Algorithm [74], a metaheuristic with robust exploration–exploitation behavior and favorable convergence in applied queuing design [75,76]. To our knowledge, this is the first study to combine renewal arrivals, heterogeneity, working vacations, impatience (balking and reneging), and Bernoulli feedback in the same finite-capacity two-server model with a full Palm–time reconciliation.

Unified state augmentation under renewal input. By including the remaining inter-arrival time in the state, we retain exactness for general GI input without resorting to Poissonization or diffusion scaling. This preserves compatibility with balking and reneging rules that depend on instantaneous congestion and the WV phase, while enabling Palm-compatible pre-arrival descriptors. Heterogeneity-aware service dynamics. We treat μ1μ2 explicitly and distinguish occupancy configurations—(0), (1 on server 1), (1 on server 2), or (2 in service)—which generate distinct departure hazards and interact with WV states; this non-exchangeability is retained in all performance expressions. Impatience and feedback coupling. Balking modifies the effective arrival stream in a state-dependent way at pre-arrival epochs; reneging introduces additional departures from the queue that depend on both the waiting population and the WV phase; and feedback couples completions to subsequent arrivals. Our formulation keeps these mechanisms explicit and compatible, rather than approximating by constant effective rates. Palm–time reconciliation. Many design metrics (e.g., loss on arrival, experienced waiting time) are naturally Palm quantities, whereas costs tied to time-average congestion require time-stationary laws. We make these distinctions explicit and provide the necessary conversions, avoiding implicit PASTA assumptions that fail under GI input. Economic design under WV control. We formalize a cost functional that transparently prices (i) heterogeneity in service effort, (ii) responsiveness under WV modulation, and (iii) behavioral losses due to impatience; the resulting optimization illustrates how calibrated heterogeneity and WV tuning jointly improve responsiveness and cost.

Let π(A) denote the pre-arrival stationary law on the augmented state space and π the time-stationary law. We first establish that, for any fixed parameter vector with finite buffer N<, the augmented process forms an irreducible and aperiodic Markov chain with a unique stationary distribution; hence, both π(A) and π exist and are unique. For completeness, we further delineate Foster–Lyapunov conditions under which these conclusions extend to N=, while our numerical investigations focus on the finite-buffer setting. Next, we obtain closed recursive characterizations of π(A) and π by balancing over renewal cycles and WV phases, and we develop Laplace–Stieltjes transform representations that render computations tractable for finite N. Building on these representations, we derive explicit performance formulas: (i) the loss probability arising from balking or blocking, (ii) the mean queue length together with the utilizations of heterogeneous servers, (iii) the waiting-time and sojourn-time distributions from both Palm and time perspectives, and (iv) sensitivities with respect to μ1,μ2, the WV parameters, and the impatience/feedback primitives. Finally, within a convex–nonconvex composite cost framework, we demonstrate that a Bat Algorithm search identifies operating points that outperform homogeneous-server baselines and hard-vacation controls across broad regions of the parameter space [74,75,76].

Structure of the paper. Section 2 formalizes the model state space, WV mechanism, balking/reneging rules, feedback, and supplementary-variable augmentation for renewal input, with a precise definition of pre-arrival (Palm) versus time-average observables. Section 3 develops the stationary analysis: recursive balance equations, Laplace–Stieltjes transform solutions, and Palm–time conversions. Section 4 reports performance characteristics and sensitivity analyses, including heterogeneity-induced asymmetries in utilization and their interaction with WV phases. Section 5 formulates the economic objective and presents the optimization methodology. Section 6 provides numerical experiments and design maps that expose managerial trade-offs and robustness. Section 7 illustrates the implications on a flexible production facility benchmark. Section 8 concludes with research directions, including diffusion approximations for large N, control under partial state information, and learning-based calibration of impatience and feedback.

2. Model Mathematical Description

This study investigates a single-class GI/M/2/N queue with two heterogeneous servers, a common FIFO queue, state-dependent balking, exponential reneging, Bernoulli feedback, and multiple working vacations taken by the slower server. A schematic of the system appears in Figure 1. The mathematical description relies on the following notation and assumptions.

Inter-arrival times. Inter-arrival times of successive arrivals form an i.i.d. sequence with cumulative distribution function A(u) (density a(u) for u0), Laplace–Stieltjes transform

A*(s)=[0,)esudA(u),s0,

and mean inter-arrival time

E[U]=ddsA*(s)|s=0=1λ(0,).

Finite capacity. The total system capacity is NN, counting customers both in service and waiting. Arrivals that would exceed capacity are blocked. Finite N ensures positive recurrence of the augmented Markov chain.

State-dependent balking. Upon arrival to a system containing i{0,1,,N} customers, a customer is admitted with probability θi and balks with probability θ¯i:=1θi. We assume

θ0=θ1=1,0θi+1θi1(2iN1),θN=0,

so no one joins a full system and admission propensities are nonincreasing in congestion. (Feedback customers are subject to the same rule; see below) Alternative information structures (delayed or noisy observations) can be incorporated by redefining (θi); our analysis is unchanged.

Service structure and discipline. There is a single common FIFO queue feeding two heterogeneous servers. Server 1 and Server 2 provide exponential service with rates μ1 and μ2, respectively, with μ2μ1. Server 1 is always available (no vacations).

Working vacations of Server 2. If Server 2 becomes idle when the queue is empty, it immediately enters a working vacation, whose duration is exponential with rate ϕ. During a working vacation, Server 2 remains active but at a reduced exponential service rate νμ2. At the vacation’s end, Server 2 switches to regular service if at least one customer is waiting; otherwise, it initiates a new working vacation. This induces “multiple” working vacations separated by possible service epochs.

Reneging (impatience). Whenever both servers are busy and the system contains i2 customers, the queue length equals i2. Each waiting customer runs an independent exponential impatience timer TExp(α). A customer whose service has not started before their timer expires abandons the system permanently. Impatience clocks are independent of the queue length process and of all other primitives.

Bernoulli feedback. Upon service completion (either during regular operation or while Server 2 is on a working vacation), a customer departs the system with probability β and, with complementary probability β¯:=1β, instantaneously feeds back to the input stream and behaves as a fresh arrival (i.e., is subjected to balking and capacity constraints in the current state). This mechanism models rework/retx loops; its coupling with balking is crucial in finite capacity.

All primitive sources of randomness—inter-arrival times, regular and working vacation service times, vacation durations, impatience timers, and feedback decisions—are mutually independent.

3. Steady-State Solution

This section analyzes the queuing system in steady-state using the supplementary variable technique to derive the system’s long-term behavior. The state of the system at time t is modeled as a continuous-time Markov process {(L(t),J(t),W(t)),t0}, with the state space defined as

Ω={(0,0,u)(1,0,u)(i,1,u)(i,2,u):1iN,u0},

where the system variables are

L(t): The number of customers in the system at time t, including those in service.

J(t): The state of the server at time t, defined as

J(t)=0,ifServer2isidleduringtheworkingvacationperiod,1,ifServer2isbusyduringtheworkingvacationperiod,2,ifServer2isbusyduringtheregularserviceperiod.

W(t): The remaining inter-arrival time for the next customer arrival at time t.

We define the joint probabilities for the system states as follows:

Pi,0(u,t)du=P(L(t)=i,uW(t)<u+du,J(t)=0),u0,i=0,1,Pi,j(u,t)du=P(L(t)=i,uW(t)<u+du,J(t)=j),u0,j=1,2,1iN.

In steady-state, as t, we have

Pi,0(u)=limtPi,0(u,t),i=0,1,Pi,j(u)=limtPi,j(u,t),j=1,2,1iN.

By employing the probabilistic arguments and using the remaining inter-arrival time as the supplementary variable, we observe the state of the system at two consecutive time epochs, t and t+dt. Taking limt and after some simplification, we have the following set of differential difference equations.

(1)P0,0(1)(u)=β[μ1P1,0(u)+νP1,1(u)+μ2P1,2(u)],

(2)P1,0(1)(u)=βμ1P1,0(u)+β[νP2,1(u)+μ2P2,2(u)]+a(u)P0,0(0),

(3)P1,1(1)(u)=(ϕ+βν)P1,1(u)+βμ1P2,1(u),

(4)P2,1(1)(u)=[β(μ1+ν)+ϕ]P2,1(u)+[β(μ1+ν)+α]P3,1(u)+a(u)[P1,0(0)+P1,1(0),

(5)Pi,1(1)(u)=[β(μ1+ν)+ϕ+(i2)α]Pi,1(u)+[β(μ1+ν)+(i1)α]Pi+1,1(u)+a(u)[θi1Pi1,1(0)+θ¯iPi,1(0)],3iN1,

(6)PN,1(1)(u)=[β(μ1+ν)+ϕ+(N2)α]PN,1(u)+a(u)[θN1PN1,1(0)+PN,1(0)],

(7)P1,2(1)(u)=βμ2P1,2(u)+ϕP1,1(u)+βμ1P2,2(u),

(8)Pi,2(1)(u)=[β(μ1+μ2)+(i2)α]Pi,2(u)+[β(μ1+μ2)+(i1)α]Pi+1,2(u)+ϕPi,1(u)+a(u)[θi1Pi1,2(0)+θ¯iPi,2(0)],2iN1,

(9)PN,2(1)(u)=[β(μ1+μ2)+(N2)α)PN,2(u)+ϕPN,1(u)+a(u)[θN1PN1,2(0)+PN,2(0)],

where Pi,0(0),i=0,1 and Pi,j(0),j=1,2,1iN are the respective rate probabilities with the remaining inter-arrival time equal to zero denoting that an arrival is about to occur. Then, we introduce the following Laplace–Stieltjes transforms of the steady-state probabilities as follows:

Pi,0*(s)=0esuPi,0(u)du,i=0,1,Pi,j*(s)=0esuPi,j(u)du,j=1,2,1iN.

Let Pi,0=Pi,0*(0) for i{0,1} and Pi,j=Pi,j*(0) for j{1,2} and 1iN denote the steady–state probabilities of having i customers in the system when the server is in state j{0,1,2}, observed at an arbitrary epoch. For 2iN, define

δi:=β(μ1+ν)+ϕ+(i2)α,ϑi:=β(μ1+μ2)+(i2)α.

Multiplying Equations (1)–(9) by esu and integrating with respect to u over [0,), we obtain the transformed relations

(10)sP0,0*(s)=P0,0(0)+βμ1P1,0*(s)+νP1,1*(s)+μ2P1,2*(s).

(11)(βμ1s)P1,0*(s)=P1,0(0)+βνP2,1*(s)+μ2P2,2*(s)+A*(s)P0,0(0).

(12)(ϕ+βνs)P1,1*(s)=P1,1(0)+βμ1P2,1*(s).

(13)(δ2s)P2,1*(s)=P2,1(0)+(δ3ϕ)P3,1*(s)+A*(s)P1,0(0)+P1,1(0)+θ¯2P2,1(0).

(14)(δis)Pi,1*(s)=Pi,1(0)+(δi+1ϕ)Pi+1,1*(s)+A*(s)θi1Pi1,1(0)+θ¯iPi,1(0),3iN1.

(15)(δNs)PN,1*(s)=PN,1(0)+A*(s)θN1PN1,1(0)+PN,1(0).

(16)(βμ2s)P1,2*(s)=P1,2(0)+ϕP1,1*(s)+βμ1P2,2*(s).

(17)(ϑis)Pi,2*(s)=Pi,2(0)+ϕPi,1*(s)+ϑi+1Pi+1,2*(s)+A*(s)θi1Pi1,2(0)+θ¯iPi,2(0),2iN1.

(18)(ϑNs)PN,2*(s)=PN,2(0)+ϕPN,1*(s)+A*(s)θN1PN1,2(0)+PN,2(0).

Summing (10)–(18) and simplifying yields

(19)A*(s)i=01Pi,0(0)+i=1NPi,1(0)+Pi,2(0)=si=01Pi,0*(s)+i=1NPi,1*(s)+Pi,2*(s).

Differentiating (19) with respect to s, letting s0, and using the normalization condition

i=01Pi,0+i=1NPi,1+Pi,2=1,

we obtain

(20)i=01Pi,0(0)+i=1NPi,1(0)+Pi,2(0)=λ.

Equation (20) expresses conservation of flow: the long–run average inflow to the system per unit time equals the mean arrival rate λ. For i{0,1}, define Pi,0 as the steady–state probability (at a pre–arrival epoch) that an arriving customer sees i customers in the system and the server in state j=0. For j{1,2} and 1iN, define Pi,j analogously as the steady–state probability that an arrival sees i customers and server state j. Our first goal is to connect these pre-arrival probabilities to the stationary descriptors at an arbitrary time.

Let L(t) denote the number of customers in the system at time t, let S(t){0,1,2} denote the server state, and let U(t) denote the remaining inter-arrival time at t. By Bayes’ formula for conditional probabilities, for each admissible (i,j), we have

Pi,j=limtPL(t)=i,S(t)=j|U(t)=0=limtPL(t)=i,S(t)=j,U(t)=0PU(t)=0.

Consequently,

(21)Pi,j=1λPi,j(0),j=0,i{0,1},j{1,2},1iN,

where λ is the arrival rate and Pi,j(0) is the corresponding boundary value at s=0 of the transform (defined below). To determine the pre-arrival probabilities, it suffices to compute

Pi,0(0)(i=0,1),Pi,j(0)(j=1,2,1iN),Pi,0*(s)(i=0,1),Pi,j*(s)(j=1,2,1iN),

recursively in terms of PN,0(0) (details below). Throughout, A*(·) denotes the relevant transform and A*(k) its kth derivative.

Step 1: Eliminating PN1,1(0).

Setting s=δN in (15) yields PN1,1(0) in terms of PN,1(0):

(22)PN1,1(0)=ψN1PN,1(0),ψN1:=1A*(δN)A*(δN)θN1.

Step 2: An expression for PN,1*(s).

Substituting (22) back into (15) gives

(23)(δNs)PN,1*(s)=A*(s)θN1ψN1+ψNψNPN,1(0),

where we set ψN1. Hence, for sδN,

(24)PN,1*(s)=A*(s)θN1ψN1+ψNψNδNsPN,1(0).

Step 3: The value at s=δN and higher derivatives.

Differentiating (23) with respect to s and then setting s=δN yields

(25)PN,1*(δN)=A*(1)(δN)θN1ψN1+ψNPN,1(0).

Differentiating (23) a total of l times with respect to s gives, for l1,

(26)(δNs)PN,1*(l)(s)lPN,1*(l1)(s)=A*(l)(s)θN1ψN1+ψNPN,1(0).

Step 4: Compact representation.

Combining (24)–(26), we obtain

(27)PN,1*(s)=ζN,sPN,1(0),

where

ζN,s=A*(s)θN1ψN1+ψNψNδNs,sδN,A*(1)(s)θN1ψN1+ψN,s=δN,

and

ζN,s(l)=A*(l)(s)θN1ψN1+ψN+lζN,s(l1)δNs,sδN,A*(l+1)(s)θN1ψN1+ψNl+1,s=δN,

with ζN,s(l) denoting the lth derivative of ζN,s with respect to s. Equations (21)–(27) provide the desired bridge from arbitrary-epoch quantities to pre-arrival probabilities and furnish the recursive ingredients needed to express all required terms in PN,0(0). For i=N1, taking s=δN1 in Equation (14) and using Equation (22), we obtain PN2,1(0) in terms of PN,1(0) as follows:

(28)PN2,1(0)=ψN2PN,1(0),

with

ψN2=ψN1(δNϕ)ζN,δN1A*(δN1)θ¯N1ψN1A*(δN1)θN2.

By applying Equation (28) to Equation (14) for i=N1, we can express PN1,1*(s) in terms of PN,1(0) as follows:

(29)PN1,1*(s)=ζN1,sPN,1(0),

where

ζN1,s=A*(s)(θN2ψN2+θ¯N1ψN1)+(δNϕ)ζN,sψN1(δN1s),ifsδN1,(A*(1)(s)(θN2ψN2+θ¯N1ψN1)+(δNϕ)ζN,s(1)),ifs=δN1,

where

ζN1,s(l)=A*(l)(s)(θN2ψN2+θ¯N1ψN1)+(δNϕ)ζN,s(l)+lζN1,s(l1)(δN1s),ifsδN1,A*(l+1)(s)(θN2ψN2+θ¯N1ψN1)+(δNϕ)ζN,s(l+1)l+1,ifs=δN1.

For i=N2,N3,,3 in Equation (14), Pi,1(0), for 2iN3, and Pi,1*(s), for 3iN2, are expressed in terms of PN,1(0) as follows:

(30)Pi1,1(0)=ψi1PN,1(0),i=N2,N3,,3,

where

ψi1=ψi(δi+1ϕ)ζi+1,δiA*(δi)θ¯iψiA*(δi)θi1i=N2,N3,,3,

and

(31)Pi,1*(s)=ζi,sPN,1(0),i=N2,N3,,3,

where

ζi,s=A*(s)(θi1ψi1+θ¯iψi)+(δi+1ϕ)ζi+1,sψi(δis),ifsδi,(A*(1)(s)(θi1ψi1+θ¯iψi)+(δi+1ϕ)ζi+1,s(1)),ifs=δi,

where

ζi,s(l)=A*(l)(s)(θi1ψi1+θ¯iψi)+(δi+1ϕ)ζi+1,s(l)lζi,s(l1)(δis),ifsδi,(δi+1ϕ)ζi+1,δi(l+1)+A*(l+1)(s)(θi1ψi1+θ¯iψi)l+1,ifs=δi.

Substituting s=δ2 into Equation (13) yields P1,1(0) in terms of PN,1(0) and P1,0(0) as follows:

(32)P1,1(0)=ψ1PN,1(0)+ωP1,0(0),

where

ψ1=ψ2(δ3ϕ)ζ3,δ2A*(δ2)θ¯2ψ2A*(δ2),andω=A*(δ2)A*(δ2)=1.

Using Equation (32) in Equation (13), we obtain P2,1*(s) in terms of PN,1(0) as follows:

(33)P2,1*(s)=ζ2,sPN,1(0),

where

ζ2,s=ψ2+(δ3ϕ)ζ3,s+A*(s)(ψ1+θ¯2ψ2)(δ2s),ifsδ2,((δ3ϕ)ζ3,s(1)+A*(1)(s)(ψ1+b¯2ψ2)),ifs=δ2,

and

ζ2,s(l)=(δ3ϕ)ζ3,s(l)+A*(l)(s)(ψ1+θ¯2ψ2)lζ2,s(l1)(δ2s),ifsδ2,(δ3ϕ)ζ3,s(l+1)+A*(l+1)(s)(ψ1+b¯2ψ2)l+1,ifs=δ2.

From Equation (12), we have

(34)P1,1*(s)=ζ1,sPN,1(0)+τ1,sP1,0(0),

where

ζ1,s=βμ1ζ2,sψ1(ϕ+βνs),ifsϕ+βν,βμ1ζ2,s(1),ifs=ϕ+βν.τ1,s=ω(ϕ+βνs),ifsϕ+βν,0,ifs=ϕ+βν,

where

ζ1,s(l)=βμ1ζ2,s(l)lζ1,s(l1)(ϕ+βνs),ifsϕ+βν,βμ1ζ2,s(l+1)l+1,ifs=ϕ+βν.τ1,s(l)=lτ1,s(l1)(ϕ+βνs),ifsϕ+βν,0,ifs=ϕ+βν.

Putting s=ϑN in Equation (18) and using the relation of PN,1*(s), we obtain PN1,2(0) in terms of PN,1(0) and PN,2(0) as follows:

(35)PN1,2(0)=ηN1PN,2(0)+γN1PN,1(0),

where

ηN1=1A*(ϑN)A*(ϑN)θN1andγN1=ϕζN,ϑNA*(ϑN)θN1.

Using Equation (35) in Equation (18), we get

(36)PN,2*(s)=ρN,sPN,2(0)+χN,sPN,1(0),

where

ρN,s=A*(s)(θN1ηN1+ηN)ηNϑNs,ifϑNs,A*(1)(s)(θN1ηN1+ηN),ifϑN=s.

χN,s=ϕζN,s+A*(s)θN1γN1ϑNs,ifϑNs,(ϕζN,s(1)+A*(1)(s)θN1γN1),ifϑN=s,

where

ρN,s(l)=A*(l)(s)(θN1ηN1+ηN)+lρN,s(l1)(ϑNs),ifϑNs,A*(l+1)(ϑN)(θN1ηN1+ηN)l+1,ifϑN=s.

χN,s(l)=ϕζN,s(l)+A*(l)(s)θN1γN1+lχN,s(l1)ϑNs,ifϑNs,ϕζN,ϑN(l+1)+A*(l+1)(ϑN)θN1γN1l+1,ifϑN=s,

where ηN=1 and γN=0. Following the procedure carried out for obtaining Pi,1(0) and Pi,1*(s), the probabilities Pi,2(0),1iN2, and Pi,2*(s),2iN1, can be obtained from Equation (17) in terms of PN,1(0) and PN,2(0) as

(37)Pi1,2(0)=ηi1PN,2(0)+γi1PN,1(0),2iN1,

where

ηi1=ηiϑi+1ρi+1,ϑiA*(ϑi)θ¯iηiA*(ϑi)θi1,γi1=γiϑi+1χi+1,ϑiA*(ϑi)θ¯iγiϕζi,ϑiA*(ϑi)θi1.

By using Equation (37) in Equation (17), we get

(38)Pi,2*(s)=ρi,sPN,2(0)+χi,sPN,1(0),

where

ρi,s=ηi+ϑi+1ρi+1,s+A*(s)(θi1ηi1+θ¯iηi)(ϑis),ifϑis,(ϑi+1ρi+1,s(1)+A*(1)(s)(θi1ηi1+θ¯iηi)),ifϑi=s.χi,s=γi+ϕζi,s+ϑi+1χi+1,s+A*(s)((θi1γi1+θ¯iγi))ϑis,ifϑis,(ϕζi,s(1)+ϑi+1χi+1,s(1)+A*(1)(s)((θi1γi1+θ¯iγi))),ifϑi=s,

where

ρi,s(l)=ϑi+1ρi+1,s(l)+A*(l)(s)(θi1ηi1+θ¯iηi)+lρi,s(l1)(ϑis),ifϑis,ϑi+1ρi+1,s(l+1)+A*(l+1)(s)(θi1ηi1+θ¯iηi)l+1,ifϑi=s.

χi,s(l)=ϕζi,s(l)+ϑi+1χi+1,s(l)+A*(l)(s)(θi1γi1+θ¯iγi)+lχi,s(l1)ϑis,ifϑis,ϕζi,s(l+1)+ϑi+1χi+1,s(l+1)+A*(l+1)(s)(θi1γi1+θ¯iγi)l+1,ifϑi=s.

Taking s=βμ1 in Equation (11), we get P0,0(0) in terms of P1,0(0), PN,1(0), and PN,2(0) as

(39)P0,0(0)=0P1,0(0)+1PN,1(0)+2PN,2(0),

where

0=1A*(βμ1),1=βνζ2,βμ1βμ2χ2,βμ1A*(βμ1),2=βμ2ρ2,βμ1A*(βμ1).

Taking s=ϕ+βν in Equation (12), using the expressions of P1,1(0) and P2,1*(s), we get P1,0(0) in terms of PN,1(0) as follows:

(40)P1,0(0)=κ1PN,1(0),

where

κ1=ψ1βμ1ζ2,ϕ+βν.

Taking βμ2=s into Equation (16) and using the expressions for P1,2(0),P1,1*(s) and P2,2*(s), along with Equation (40), we obtain PN,2(0) in terms of PN,1(0) as follows:

(41)PN,2(0)=κ2PN,1(0),

where

κ2=ϕζ1,βμ2+βμ1χ2,βμ2+ϕκ1τ1,βμ2γ1η1βμ1ρ2,βμ2.

From Equations (22), (28) and (30), and by applying Equation (40) into Equation (32), Equation (41) into Equations (35) and (37), and Equations (40) and (41) into Equation (38), we can express all Pi,j(0) in terms of PN,1(0). The only unknown PN,1(0) can be determined from Equation (21).

PN,1(0)=λκ10+1+κ22+ψ1+i=2Nψi+i=2N(γi+κ2ηi)1.

Now, the steady-state probabilities at a pre-arrival epoch Pi,j can be simply obtained from the rate probabilities Pi,j(0) through Equation (21). Finally, we formulate the steady-state probabilities at an arbitrary epoch in terms of the corresponding probabilities at a pre-arrival epoch. In particular, setting s=0 in Equations (10)–(18), and subsequently invoking relation (21) together with the requisite algebraic reductions, yields

PN,1=λδNθN1PN1,1,Pi,1=δi+1ϕδiPi+1,1+λδiθi1Pi1,1θiPi,1,i=N1,,3,P2,1=δ3ϕδ2P3,1+λδ2P1,0+P1,1θ2P2,1,

P1,1=βμ1ϕ+βνP2,1λϕ+βνP1,1,PN,2=ϕϑNPN,1+λϑNθN1PN1,2,Pi,2=ϑi+1ϑiPi+1,2+ϕϑiPi,1+λϑiθi1Pi1,2θiPi,2,i=N1,,2,P1,2=μ1μ2P2,2+ϕβμ2P1,1λβμ2P1,2,P1,0=νμ1P2,1+μ2μ1P2,2+λβμ1P0,0P1,0,

where P0,0 can be computed by using the normalization condition, that is,

P0,0=1P1,0i=1N(Pi,1+Pi,2).

4. Performance Measures

To guide the design and optimization of the queuing system, we evaluate standard performance indices computed from the steady-state probabilities at arbitrary epochs obtained in the previous section. Throughout, let λ denote the external arrival rate, N the system capacity, α the per-customer reneging rate when applicable, and θi[0,1] the probability that an arriving customer decides to join upon seeing i customers in the system (so that θ¯i:=1θi is the balking probability). We write Pi,0 for the probability of being in a state with i customers while the server is idle on a working vacation, Pi,1 for a state with i customers and the server in a (reduced-rate) working–vacation service mode, and Pi,2 for a state with i customers and the server in the regular busy mode. The loss probability upon arrival is denoted by

ΠL:=PN,1+PN,2,

i.e., the probability that an arriving customer finds the system full (pre-arrival probabilities).

(i). Mean system size and mean sojourn time.

By definition and Little’s law applied to the effective throughput λ(1ΠL), we have

(42)Ls=P1,0+i=1NiPi,1+Pi,2,

(43)Ws=Lsλ(1ΠL).

(ii). Server-state occupancy probabilities.

The long-run fractions of time that the server is (a) idle during a working vacation, (b) busy during a working vacation, and (c) busy during a normal busy period are

(44)ΠI=i=01Pi,0,ΠW=i=1NPi,1,ΠB=i=1NPi,2.

(iii). Flow rates of joining, balking, and reneging.

Let Js denote the average rate of customers that actually enter service (i.e., join the system), Br the average balking rate, and Rr the average reneging rate. Then,

(45)Js=λP0,0+P1,0+P1,1+P1,2+i=2NλθiPi,1+Pi,2,

(46)Br=i=2Nλθ¯iPi,1+Pi,2,

(47)Rr=i=2N(i2)αPi,1+Pi,2.

The indices (42)–(43) together with (ΠI, ΠW, ΠB) and (Js, Br, Rr) provide a coherent basis for performance evaluation and, in particular, for selecting service-rate parameters that balance congestion, abandonment, and utilization under capacity constraints.

Remark 1.

Section 4 reports deterministic evaluations of the stationary distribution and associated indices via the exact recursions and transform-based formulae developed earlier; hence, no stochastic simulation or hardware-dependent averaging was involved. For completeness, we will specify the computational environment (standard workstation (O(N)) complexity per evaluation) and provide a clear rationale for input choices: service and vacation rates were selected to satisfy the structural ordering (ν<μ2<μ1); balking and reneging rates were set within ranges consistent with classical impatience models; and buffer sizes were varied to capture light, moderate, and heavy traffic regimes while ensuring stability.

5. Cost Model and Optimization Study

Building on the performance indices derived in the previous section, we now formulate the steady-state expected total cost per unit time for the proposed queuing system. The decision vector comprises three continuous, nonnegative service/vacation rates,

(μ1,μ2,ν)R+3,

where μ1 and μ2 are the normal service rates of the two servers (with Server 1 faster than Server 2), and ν is the working-vacation service rate of Server 2. In practice, queuing managers aim to reduce operating costs subject to stability and operational constraints. Our objective is to determine optimal values

(μ1*,μ2*,ν*)argmin(μ1,μ2,ν)F(μ1,μ2,ν),

where F denotes the steady-state cost function defined below.

Unit cost elements.

Let C1,,C90 denote the following unit costs:C1:

Cost per unit time when Server 2 is idle during a working-vacation period;

C2:

Cost per unit time when Server 2 is busy during a working-vacation period;

C3:

Cost per unit time when Server 2 is busy during a normal (non-vacation) busy period;

C4:

Holding cost per unit time per customer present in the system;

C5:

Penalty cost per unit time due to balking or reneging;

C6:

Penalty cost per unit time per lost customer when the system is blocked;

C7:

Cost per unit of normal service effort;

C8:

Cost per unit of feedback service effort;

C9:

Fixed purchase (or capacity) cost per server unit.

Performance indices (given).

Let ΠI,ΠW,ΠB,ΠL[0,1] denote, respectively, the steady-state probabilities that Server 2 is idle on working vacation, and busy on working vacation, busy in normal operation, and the system is blocked; let Ls0 be the steady-state expected system size; let Br and Rr be the steady-state rates of balking and reneging, respectively; let λ>0 be the external arrival rate; and let β¯[0,1] denote the mean feedback probability. These indices are determined by the stochastic model specified in the previous section.

Total expected cost.

The steady-state expected total cost per unit time is

(48)F(μ1,μ2,ν)=C1ΠI+C2ΠW+C3ΠB+C4Ls+C5(Br+Rr)+C6λΠL+(μ1+μ2+ν)C7+β¯C8+2C9.

Optimization problem.

We seek the cost-minimizing rates under the natural ordering and feasibility (stability) constraints:

(49)(μ1*,μ2*,ν*)argmin(μ1,μ2,ν)F(μ1,μ2,ν)subjectto0<ν<μ2<μ1,andthesteady-state/stabilityconditionsofthemodel.

Solution approach.

Because F in (48) is a highly nonlinear function of (μ1,μ2,ν) through the embedded performance indices, closed-form optimality conditions are generally intractable. We therefore adopt a numerical approach based on the Bat algorithm, a metaheuristic inspired by echolocation, to efficiently explore the feasible region in (49) and obtain high-quality approximations to (μ1*,μ2*,ν*).

Comments

The selection of the decision variables (μ1,μ2,ν) and the cost components (C1,,C9) is motivated by both modeling fidelity and practical relevance. We briefly elaborate on their roles.

Service and vacation rates.

The triplet (μ1,μ2,ν) forms the natural decision vector because service rates constitute the most direct levers for system performance. In particular,

μ1 and μ2 represent the normal operating capacities of two heterogeneous servers. Empirically, such asymmetry is common in practice, where one server is technologically superior or operated by a more skilled resource. Distinguishing them allows us to capture realistic differences in throughput and workload distribution.

ν models the effective rate during a working-vacation period, a concept increasingly relevant in systems where partial service continues during maintenance, energy-saving modes, or off-peak operation. Including ν in the optimization enables one to quantify the trade-off between reduced productivity during vacations and the associated cost savings.

Cost components.

The decomposition of the total cost into nine elements (C1,,C9) ensures that all critical operational aspects are explicitly represented:

C1C3 differentiate between idle and busy states of Server 2 across normal and vacation regimes, thereby capturing the utilization-dependent expenditure structure.

C4C6 encode congestion and dissatisfaction costs: holding cost for customers in queue, penalties for impatience (balking/reneging), and loss penalties under blocking. This reflects the service quality dimension and its direct impact on customer retention.

C7C8 measure the marginal effort of normal versus feedback service, thereby penalizing overuse of resources and accounting for the additional burden of reprocessing tasks.

C9 accounts for fixed capacity investments, ensuring that expansion or contraction of server capability is consistently weighed against operational benefits.

Modeling rationale.

This parameterization balances analytical tractability with operational richness. By separating performance-determining rates from cost-incurring states, the model provides a transparent mapping from managerial decisions to economic outcomes. The chosen structure also aligns with the literature on cost optimization in heterogeneous and vacation-based queuing systems, ensuring comparability and theoretical robustness.

Remark 2.

The proposed framework can, in principle, be extended to accommodate multi-class customer populations or priority-based service policies. In the multi-class setting, the state descriptor must be augmented to reflect both the number and the class of customers in service or waiting, leading to vector-valued probability generating functions and Laplace–Stieltjes transforms, together with class-dependent recursive relations. Similarly, under priority-based disciplines (preemptive or non-preemptive), the balance equations and boundary conditions must be reformulated to incorporate class-specific service rules, which naturally give rise to block-structured transition dynamics. Analytically, such extensions often require the use of matrix-analytic or quasi-birth–death techniques to preserve tractability. While feasible, these generalizations substantially increase the model’s dimensional and computational complexity, and their rigorous treatment lies beyond the scope of the present work.

6. Numerical Results

In this section, we provide a detailed numerical investigation of the multiple working vacation GI/M/2/N queuing model. The inter-arrival process is examined under a range of canonical distributions, namely deterministic, exponential, and Erlang-2, so as to capture a spectrum of arrival-time variability. The numerical experiments, presented through carefully constructed graphs and tables, are intended to elucidate the influence of key queuing parameters on both the principal performance measures and the associated cost-optimization criteria. To this end, we have developed an R program, (R version 4.5.2), implemented by the authors, which computationally validates and illustrates the analytical formulas derived in the preceding sections. For the entire numerical analysis, representative values of the system parameters and cost coefficients are chosen in an illustrative yet systematic manner, ensuring that the essential qualitative behaviors of the model are faithfully exhibited.

In Table 2, we report the steady-state joint distributions observed at both pre-arrival and arbitrary epochs for three representative inter-arrival time laws—deterministic, exponential, and Erlang–2—all calibrated with rate parameter λ=0.9. The remaining system parameters are specified as μ1=2.7, μ2=2.3, ν=1.9, β¯=0.2, α=0.6, and ϕ=1.0. Customer balking is modeled through the state-dependent function θ¯i=i/N for 2iN1, where the total system capacity is fixed at N=9. A noteworthy feature emerges in the exponential case: because of the memoryless property of the exponential distribution, the pre-arrival probabilities Pi,j coincide exactly with the arbitrary-epoch probabilities Pi,j, reflecting the fundamental Markovian nature of this setting.

6.1. Numerical Results of System Performance Measures

Table 3 and Figure 2, Figure 3 and Figure 4 report a sensitivity analysis of the principal performance measures with respect to the service rates during normal and working-vacation regimes, namely μ1, μ2, and ν. Throughout, inter-arrival times are taken to be deterministic with mean 1/λ. Unless otherwise stated, the baseline parameter vector is

(λ,μ1,μ2,ν,β¯,α,ϕ)=(1.2,3.3,2.5,1.6,0.2,1.1,5.0),

and the balking function is

θ¯i=11i+1,2iN1,N=7.

Main findings. The numerical results show that increasing any of the service rates μ1, μ2, or ν leads to a monotone decrease in the congestion and impatience indicators:

Ls,Ws,Br,Rr,

where Ls is the mean system size, Ws the mean sojourn time, Br the average balking rate, and Rr the average reneging rate. In parallel, the effective joining rate Js increases with faster service, reflecting improved responsiveness and reduced anticipated delay.

Regarding server-state utilization, the probability that Server 2 is busy during normal (non-vacation) periods, denoted ΠB, decreases as each of μ1, μ2, and ν increases. During working vacations, the busy probability of Server 2, denoted ΠW, decreases with μ1 and with ν, but increases with μ2. This asymmetric response is consistent with the heterogeneity of the two servers and the division of work between normal and vacation regimes.

Effect of balking. Comparisons between systems with and without balking reveal that

Ls,Ws,Js,Rr,ΠBarealllargerintheno-balkingsystem,

while

Br(trivially)andΠWarelargerwhenbalkingispresent.

These patterns are qualitatively in line with queuing intuition: allowing balking filters out customers facing long anticipated waits, thereby lowering congestion measures and shifting occupancy from normal periods to working vacations.

Secondly, in Table 4 and Figure 5, Figure 6 and Figure 7, we analyze the impact of feedback probability β¯, arrival rate λ, working vacation rate ϕ, and impatience rate α on various performance measures. Here, we assume that the inter-arrival times follow an Erlang-2 distribution. The balking function is taken as θ¯i=i/N2, for 2iN1, where N=7. The parameters are taken as λ=1.0, μ1=2.8, μ2=2.4, ν=1.8, α=0.9, and ϕ=2.0. From Table 4 and Figure 5, Figure 6 and Figure 7, we observe the following:

With fixed values of β¯, ϕ, and α, an increase in λ leads to increases in Ls and Ws, as expected. This in turn increases Js, Br, and Rr. Consequently, the probability of customer loss due to system size limitation ΠL significantly rises. Notably, the probability that Server 2 is idle during the working vacation period ΠI decreases with λ.

With fixed values of β¯, λ, and α, when the working vacation rate ϕ increases, Server 2 rapidly switches to the normal busy period at which the customers are served at a higher rate. Therefore, the characteristics Ls, Rr, Br, Ws, and ΠL decrease. This implies an increase in Js, and in ΠI because the mean working vacation time 1/ϕ decreases. This trend matches absolutely with the realistic situation.

With fixed values of β¯, λ, and ϕ, when the impatience rate α increases, the system characteristics Js, Rr, and ΠI increase, while ΠL, Br, Ls, and Ws decrease. This relationship highlights that higher impatience rates lead to smaller system sizes on average and higher average reneging rates.

For fixed values of α, ϕ, and λ, an increase in the feedback probability β¯ leads to higher values of Ls and Ws, as intuitively expected. Consequently, this increase results in elevated values of Br, Rr, and the probability of customer loss due to system size limitation ΠL. Conversely, Js and the probability that Server 2 is idle during the working vacation period ΠI monotonically decrease.

Table 4

Effects of λ, ϕ, α, and β¯ on system characteristics.

β ¯ = 0.3 β ¯ = 0.5 β ¯ = 0.7
λ = 0 . 7 λ = 1 . 0 λ = 0 . 7 λ = 1 . 0 λ = 0 . 7 λ = 1 . 0
L s 0.3649478 0.5285393 0.5156073 0.7485630 0.8706453 1.2585121
W s 0.5213540 0.5285393 0.7365818 0.7485630 1.2437789 1.2585122
J s 0.6995574 0.9984455 0.6989834 0.9965871 0.6967290 0.9899939
B r 0.0004426 0.0015545 0.0010166 0.0034129 0.0032710 0.0100061
R r 0.0003104 0.0013975 0.0011433 0.0047919 0.0068955 0.0251020
β ¯ = 0 . 3 β ¯ = 0 . 5 β ¯ = 0 . 7
ϕ = 2 . 0 ϕ = 4 . 0 ϕ = 2 . 0 ϕ = 4 . 0 ϕ = 2 . 0 ϕ = 4 . 0
L s 0.5285393 0.5253591 0.7485630 0.7430044 1.2585121 1.2486714
W s 0.5285393 0.5253591 0.7485630 0.7430044 1.2585122 1.2486714
J s 0.9984455 0.9984870 0.9965871 0.9966743 0.9899939 0.9901889
B r 0.0015545 0.0015130 0.0034129 0.0033257 0.0100061 0.0098111
R r 0.0013975 0.0013191 0.0047919 0.0045670 0.0251020 0.0243441
β ¯ = 0 . 3 β ¯ = 0 . 5 β ¯ = 0 . 7
α = 0 . 6 α = 0 . 9 α = 0 . 6 α = 0 . 9 α = 0 . 6 α = 0 . 9
L s 0.5288857 0.5285393 0.7501344 0.7485630 1.2703945 1.2585121
W s 0.5288857 0.5285393 0.7501344 0.7485630 1.2703948 1.2585122
J s 0.9984379 0.9984455 0.9965519 0.9965871 0.9897164 0.9899939
B r 0.0015621 0.0015545 0.0034481 0.0034129 0.0102836 0.0100061
R r 0.0010108 0.0013975 0.0035746 0.0047919 0.0199789 0.0251020
Figure 5

Effects of λ,β¯, and ϕ on ΠL and ΠI.

[Figure omitted. See PDF]

Figure 6

Effects of ϕ,β¯, and α on ΠL and ΠI.

[Figure omitted. See PDF]

Figure 7

Effects of α,β¯, and λ on ΠL and ΠI.

[Figure omitted. See PDF]

6.2. Numerical Results of Cost Optimization

Our aim in this subsection is to determine the optimal service rates μ1, μ2, and ν that minimize the total expected cost function F(μ1,μ2,ν) under various operational scenarios. The unit cost coefficients are prescribed as

C1=30,C2=20,C3=10,C4=15,C5=12,C6=10,C7=10,C8=4,C9=5.

Customer balking is modeled by the quadratic function

θ¯i=i2N2,2iN1,

with the system capacity fixed at N=5. Unless otherwise specified, the parameters

β¯=0.3,ϕ=0.2,λ=1.8,μ2=0.9,ν=0.6

remain constant throughout the analysis. For the numerical optimization, we implement the Bat Algorithm, a population-based metaheuristic, with a population size of P=20 and a maximum number of iterations Tmax=30.

In Figure 8 and Figure 9, the convexity and optimality of the total expected cost function F(μ1,μ2,ν) are clearly observed. These figures illustrate that there exist specific values of the service rates μ1,μ2, and ν that minimize the total expected cost function for the selected model parameters.

According to Table 5, as the arrival rate λ increases, we observe that the optimal service rates at the primary and secondary stations, μ1* and μ2*, increase, whereas the auxiliary-rate ν* decreases. Consequently, the optimal expected cost

Fμ1*,μ2*,ν*

is strictly increasing in λ. This behavior is natural: higher inflow intensifies congestion, enlarging the expected system size and—holding the optimality conditions fixed—raising the total expected cost. By contrast, when the impatience (reneging) rate α increases, all three optimal rates μ1*, μ2*, and ν* rise, and so does the objective Fμ1*,μ2*,ν*. This accords with intuition: greater impatience penalizes delay more heavily, incentivizing faster service to mitigate reneging, but the higher service intensities themselves entail larger operating costs, thereby elevating the overall optimum. Moreover, the presence of balking amplifies these effects. In models with balking, the optimal rates μ1*, μ2*, and ν*, together with the resulting optimal cost Fμ1*,μ2*,ν*, exceed their counterparts in otherwise identical systems without balking. Thus, classical impatience behaviors—balking and reneging—unambiguously worsen total expected cost by compelling more aggressive (and costlier) service policies. The comparative statics reported in Table 5 are consistent with operational practice and provide actionable guidance for real-time implementations: appropriate tuning of the service rates μ1, μ2, and ν can hedge against higher arrival intensities and customer impatience, while explicitly accounting for balking to avoid systematic under-provision of capacity.

6.3. Discussions

Remark 3.

On the compact feasible set {νminνμ2μ1μ1,max}, the mapping (μ1,μ2,ν)F is continuous (indeed locally Lipschitz) since it is a smooth composition of linear recursions in the stationary probabilities. Hence, a global minimizer exists. The terms C1ΠI+C2ΠW+C3ΠB price server phases (idle/WV/regular), C4Ls penalizes congestion, C5(Br+Rr) penalizes behaviorally lost work, C6λΠL accounts for blocked demand, and (C7+β¯C8)(μ1+μ2+ν) proxies energy/labor with rework overhead.

We initialize bat uniformly over the feasible cone νμ2μ1, project after each move to enforce ordering, and use logarithmic step sizes for rates to improve exploration across scales. A small amount of Gaussian jitter on the best incumbent accelerates escape from flat basins. We report the best feasible incumbent after T iterations and confirm robustness through five independent runs.

Remark 4.

We adapted the Bat algorithm in three essential ways to reflect the mathematical structure of our queuing system. First, the decision variables are constrained by the intrinsic service-rate ordering

0<ν<μ2<μ1,

and rather than introducing penalty terms, every candidate solution is projected directly onto this feasible cone (after proposing moves in log-rate space to preserve positivity and scale-invariance). Second, fitness evaluation is based not on simulation but on the exact recursion formulas derived from the Laplace–Stieltjes transform of the system, with special analytic limits at the singular points δi,ϑi to maintain numerical stability and continuity. Third, because the resulting objective is deterministic, piecewise–smooth, and nonconvex, we modified the Bat dynamics to add controlled jitter around the incumbent best solution in order to escape plateaus, while keeping acceptance and loudness schedules less aggressive than in the stochastic case. These adaptations ensure that the search respects the system’s structural constraints, that each evaluation is mathematically exact and stable, and that the algorithm remains effective on the nonconvex landscape induced by the working-vacation queue with balking, reneging, and feedback.

Remark 5.

We would like to stress that the numerical results in Section 6 are based on exact recursive formulations of the stationary distribution and therefore are not subject to sampling variability as in simulation-based studies. Consequently, conventional statistical tests are not directly applicable. To ensure robustness, however, we performed several independent optimization runs with different initializations and obtained highly consistent solutions, indicating that the reported outcomes are stable with respect to the metaheuristic search. The performance measures themselves are deterministic functionals of the stationary probabilities, so their accuracy hinges on numerical stability rather than statistical fluctuations, which has been carefully verified in the implementation.

Remark 6.

The convergence of the proposed optimization procedure must be understood in a numerical rather than purely analytical sense. Because the underlying cost functional is evaluated exactly (through the backward recursions and analytic limits at singular points), the sequence of best solutions generated by the Bat algorithm is deterministic and exhibits monotone improvement, eventually stabilizing within a narrow tolerance. However, as is well known for population-based metaheuristics, global optimality cannot be guaranteed theoretically; what can be claimed is convergence to a numerically stable and high-quality local optimum, with robustness reinforced by projection into the ordered feasible cone and by independent replications of the search. From a computational perspective, the recursion structure makes each fitness evaluation linear in the buffer size, so the optimization remains tractable: in practice, for realistic system dimensions, the entire optimization completes within seconds to minutes on standard hardware. This ensures that the solution time is fully acceptable for decision-support and design purposes, even though a theoretical guarantee of global optimality is absent.

Remark 7.

In practical environments such as call centers, healthcare units, or flexible manufacturing, the model would be implemented by first calibrating its structural parameters—arrival intensity, patience (reneging) distributions, balking probabilities, service-time laws under regular versus working-vacation modes, and feedback frequencies—directly from operational data. Concretely, this requires access to high-resolution transactional logs (arrival timestamps, abandonment times, service completions, switch-overs to reduced-capacity states, and post-service return behavior). Standard statistical procedures (maximum likelihood, nonparametric hazard estimation, survival analysis) would then be used to fit the inter-arrival and service-time distributions as well as the patience and balking functions. With these empirically derived inputs, the queuing model can be parameterized and the optimization framework applied to support staffing, scheduling of reduced-capacity periods, and service-level decisions tailored to the specific application domain.

Remark 8.

The adoption of the Bat Algorithm was motivated by the structure of our optimization problem, which involves a deterministic yet piecewise-smooth objective derived from LST-based recursions with removable singularities. Such a landscape is nonconvex and may exhibit flat regions, making derivative-free methods particularly suitable. The Bat Algorithm, with its built-in balance of global exploration and adaptive local exploitation, proves efficient when coupled with our log-space parameterization and projection scheme enforcing (ν<μ2<μ1). Although alternative metaheuristics (e.g., GA, PSO, DE) could also be applied and are expected to produce comparable optima, the difference would primarily concern convergence speed and robustness. Our choice therefore reflects adequacy and computational efficiency rather than exclusivity, and the methodology remains transferable to other metaheuristic frameworks.

7. Model Application in Practice

. We consider a flexible manufacturing facility that must concurrently support make-to-order (MTO) jobs—customer-specific production released upon demand—and opportunistic make-to-stock (MTS) runs that replenish standard inventory. The shop comprises two heterogeneous machines (servers) operating under a first-in, first-out (FIFO) discipline. Both machines primarily process MTO work. Incoming MTO orders that find both machines busy enter a finite buffer of size N; an arrival that encounters a full buffer is blocked and lost; otherwise, it joins the queue.

. To sustain high utilization during light-load periods while preserving responsiveness to MTO demand, one machine (Server 2) may enter a working vacation (WV) whenever it becomes idle. During a WV, Server 2 continues to process jobs at a deliberately reduced service rate ν (for example, under a low-power or maintenance mode). The other machine (Server 1) remains fully available for MTO work at rate μ1. This mechanism reflects the widely used modeling idea that a server need not switch completely off; instead, it can operate at a lower productive rate rather than halt altogether [30,78,79]. At the completion of a WV, Server 2 adapts to the system state: if the system is empty, it immediately initiates a new WV, capturing extended low-load spells; otherwise, it returns to normal operation and serves at its regular MTO rate μ2>ν. This interruption/continuation logic, standard in the WV literature, captures practical rules that preserve responsiveness when backlog exists [78,79].

. Customer impatience is represented through two complementary behaviors. Balking allows a newly arriving order to decline entry when congestion is visible; in a finite-capacity setting, this is modeled by a state-dependent joining probability θi that depends on the observed system size i, generalizing the classical economics-of-balking perspective [80]. Reneging captures abandonment by customers who have joined but lose patience while waiting; we posit an exponential patience (hazard) rate α while in the queue, following standard models of queues with abandonment [81,82,83]. Finally, we incorporate Bernoulli feedback at service completion: with probability 1β, a job is routed back to the queue for rework (e.g., remedial processing or quality corrections), whereas with probability β, it departs permanently. This abstraction is classical for rework loops and repeated service attempts [84,85].

. In aggregate, the model integrates four operational realities—finite capacity, working vacations with interruption, impatience (balking and reneging), and rework via Bernoulli feedback—within a unified and tractable framework. The purpose is to support design decisions such as sizing the buffer N, tuning the aggressiveness of WV operation (choice of ν and the vacation policy), and quantifying the trade-offs among loss, abandonment, throughput, and cycle time in a two-server, heterogeneous MTO–MTS environment [86].

Remark 9.

While the present work is developed in a purely theoretical framework, the incorporation of empirical operational logs would offer a natural and valuable extension. Such data could serve to calibrate the impatience and feedback parameters—typically idealized in analytical models—by confronting them with observed inter-arrival, patience, and retry patterns. This would not only validate the adequacy of the exponential-type assumptions but also reinforce the practical relevance of the theoretical findings without altering their mathematical consistency.

Remark 10.

In large-scale systems, the cumulative cost of repeated fitness evaluations may pose challenges; however, this can be overcome through closed-form recursions, analytic treatment of singularities, and projection-based feasibility enforcement, which together ensure numerical stability and preserve scalability without simulation overhead.

8. Conclusions and Perspectives

We developed a steady-state analysis for a finite-buffer GI/M/2/N queue with two heterogeneous servers, impatience in the form of balking and reneging, Bernoulli feedback, and multiple working vacations. Leveraging the supplementary-variable technique and a tailored recursion, we derived balance equations and system-size distributions at both pre-arrival and arbitrary epochs. From these, we obtained standard performance measures, including blocking and reneging probabilities, mean queue length, mean sojourn time, throughput, and server utilizations. On the managerial side, we formulated an economic objective that trades off congestion, abandonment, and capacity costs, and we optimized the decision variables using the Bat Algorithm. A numerical study implemented in R demonstrated how service heterogeneity, vacation parameters, buffer size, feedback probability, and impatience intensities jointly shape delay, loss, and total cost, thereby providing actionable guidance for the design of call centers, flexible manufacturing facilities, and telecommunication access networks.

Our analysis assumes exponential service times and a renewal input process. While these choices encompass a broad range of applications, richer service-time models and strongly correlated arrivals would require additional structure such as phase-type or Markov-modulated input. A further limitation is the assumption of FCFS with no routing asymmetry; incorporating priority or skill-based dispatching under heterogeneity would improve fidelity in many service contexts.

Perspectives. Several extensions are both impactful and technically tractable. One avenue replaces exponential service with phase-type distributions, yielding a GI/PH/2/N counterpart amenable to matrix-analytic and matrix-geometric methods while retaining the WV/feedback/impatience structure [66,87]. Another introduces nonstationarity and burstiness through Markov-modulated or batch Markovian arrivals, or via time-varying staffing, following classical MAP/MMPP/BMAP frameworks and time-dependent staffing models [88,89,90,91]. Reliability considerations can be incorporated by adding breakdown and repair states coupled with WV control and impatience/feedback, connecting to network performance bounds and product-form insights [92,93]. Endogenizing dispatching and staffing with heterogeneous speeds—for example, via skill-based or priority routing—opens the door to the joint design of routing and WV control, building on established service-system modeling [94,95]. Multi-class demand with class-dependent balking, reneging, and feedback would enable analysis of fairness and service-level guarantees using scheduling and prioritization tools [95]. On the data side, renewal laws and impatience hazards can be estimated from operational logs, and feedback/WV parameters can be calibrated by likelihood or Bayesian approaches that align with the supplementary-variable framework [96,97]. Learning-augmented control—combining analytical models with online policy learning to adapt service rates, admission, or vacation durations under stability constraints—is a promising direction, with index policies and reinforcement learning furnishing principled mechanisms [98,99]. Risk-sensitive and robust design objectives, such as CVaR or chance constraints, would address tail performance and model uncertainty [100,101], while rare-event techniques and large deviations can quantify extreme blocking and delay beyond average-case metrics [102,103,104]. Finally, computational acceleration exploiting sparsity and Toeplitz structure in the balance equations—together with state-of-the-art matrix-analytic libraries and fast matrix-exponential routines—would scale the numerics to large buffers and richer phase structures [87,105].

Palm–time viewpoint. Across these extensions, maintaining an explicit Palm–time conversion under non-Poisson input is essential, since performance targets often mix arrival-based and time-based metrics. This separation prevents inadvertent PASTA assumptions and leads to correct comparisons of customer-experienced versus system-averaged quantities.

On the case N=. When the buffer is unbounded, positive recurrence is no longer guaranteed. A Foster–Lyapunov function V(i,j)=i yields

LV(i,j)λθiμ1+μ21{j=2}+ν1{j=1}+(i2)+α,

so stability obtains whether the effective service capacity—accounting for the WV mix and reneging—dominates the effective admission rate. Our focus on finite N ensures positive recurrence without additional assumptions.

Closing remark. By unifying heterogeneity, impatience, feedback, and working vacations within a renewal-input framework, this study provides analytical foundations and deployable levers for modern service operations. The outlined directions connect the model to established methodological pillars and point toward robust, data-informed, and learning-enabled designs.

Author Contributions

Conceptualization, A.G.; methodology, A.G.; validation, A.G. and S.B.; formal analysis, A.G.; investigation, A.G.; writing—original draft preparation, A.G. and S.B.; writing—review and editing, A.G. and S.B. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors gratefully acknowledge the Editor-in-Chief, the Associate Editor, and the three anonymous referees for their insightful comments and valuable suggestions, which greatly enhanced both the clarity and the overall quality of this work.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Notations

Notation for the queuing model with working vacations, balking, and reneging.

Primitives and Parameters
λExternal arrival rate; mean inter-arrival time 1/λ.
A(u),a(u),A*(s)Inter-arrival cdf, density, and Laplace–Stieltjes transform.
NTotal capacity (in service + waiting).
θi,θ¯i=1θiJoin/balking probability on seeing i customers.
αExponential impatience (reneging) rate per waiting customer.
β,β¯=1βDeparture vs. feedback probability at service completion.
μ1,μ2Regular service rates of Server 1 and Server 2 (μ2μ1).
νServer 2 service rate during a working vacation.
ϕWorking-vacation termination rate.
State processes and stationary objects
L(t)Total customers in system at time t (queue + in service).
J(t){0,1,2}Server 2 state: 0 = idle on WV; 1 = busy on WV; 2 = busy (regular).
W(t)Remaining inter-arrival time at t.
Pi,j(u,t)P(L(t)=i,J(t)=j,uW(t)<u+du) (density in u).
Pi,j(u)Stationary version of Pi,j(u,t) as t.
Pi,j*(s)LST of Pi,j(u) in u.
Pi,j(0)Boundary (rate) probability at W=0 (arrival instant).
Pi,jArbitrary-epoch stationary probability of (L,J)=(i,j).
Pi,jPre-arrival (Palm) probability; Pi,j=λ1Pi,j(0).
δiβ(μ1+ν)+ϕ+(i2)α (WV balance parameter).
ϑiβ(μ1+μ2)+(i2)α (regular balance parameter).
Performance measures
ΠI,ΠW,ΠBFractions of time: Server 2 idle on WV/busy on WV/busy regular.
ΠLLoss probability on arrival: PN,1+PN,2.
LsMean system size: P1,0+i=1Ni(Pi,1+Pi,2).
WsMean sojourn time: Ws=Lsλ(1ΠL).
JsEffective joining rate.
Br,RrAverage rates of balking and reneging.
Cost model
C1,,C9Unit cost coefficients used in the economic objective.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Figures and Tables

Figure 1 Schematic representation of the GI/M/2/N queuing model.

View Image -

Figure 2 Effects of μ1,μ2, and ν on ΠB and ΠW.

View Image -

Figure 3 Effects of μ2,μ1, and ν on ΠB and ΠW.

View Image -

Figure 4 Effects of ν,μ2, and μ1 on ΠB and ΠW.

View Image -

Figure 8 Effects of μ1 and μ2 on the total expected cost function.

View Image -

Figure 9 Effects of μ1 and ν on the total expected cost function.

View Image -

State-of-the-art summary.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.