Improved hybrid estimation of distribution

Full text

Turn on search term navigation

INTRODUCTION

The deepening of manufacturing globalization has led to an increasing demand for distributed production processes by enterprises. The large-scale discrete manufacturing system based on multi-variety and small-batch production mode promotes the emergence of distributed manufacturing [1]. Therefore, distributed manufacturing holds important research value, and distributed shop scheduling has been widely considered by experts and scholars as an important part of distributed manufacturing [2-8].

The Distributed Assembly Permutation Flow Shop Scheduling Problem (DAPFSP) is a type of distributed scheduling problem proposed by Sara et al. [9]. This problem widely exists in the manufacturing process of enterprises. It has high research value and is a hot spot of distributed pipeline scheduling research Paz and Jose [10]. Li et al. [11] proposed an improved cuckoo algorithm in solving the hybrid flow shop scheduling problem. Li et al. [12], Huang et al. [13], Ochi and Driss [14], Pan et al. [15], Zhang et al. [16], Huang et al. [17], Sang et al. [18] and other scholars have made specific achievements in the research of the DAPFSP. Pan et al. [19] propose a knowledge-based two-population optimisation algorithm to solve the distributed energy-efficient parallel machines scheduling problem, minimising total energy consumption and tardiness through extensive simulation comparisons demonstrating its effectiveness. Zhao et al. [20] introduced a hyperheuristic with Q-learning to address the energy-efficient distributed blocking flow shop scheduling problem, showing significant improvements in efficiency and effectiveness through extensive benchmark testing. Zhao et al. [21] applied an improved iterative greedy algorithm with a Q-learning mechanism to the distributed no-idle permutation flowshop scheduling problem, achieving satisfactory results in optimising makespan and total tardiness through comprehensive benchmarking.

Based on DAPFSP, this paper further considers the case of multiple parallel assemblers completing product assembly tasks. We propose a distributed parallel assembly permutation flow shop scheduling problem (DPAPFSP) and establish a mathematical model. According to an extensive literature review and actual industrial surveys, it is found that parallel assembly machines are commonly present in various manufacturing factories, highlighting the practical relevance of DPAPFSP. As an extension of the DAPFSP, the DPAPFSP has a high potential for application in the field of assembly manufacturing. Meanwhile, compared with the DAPFSP, the DPAPFSP is more complex and rarely reported, so it has a high research value.

DAPFSP is one of the integrated flexible job shop scheduling problems Li et al. [22]. Intelligent optimisation algorithms are widely used to study this kind of problems, which include many kinds of intelligent optimisation algorithms, such as the genetic algorithm (GA), simulated annealing algorithm (SA), estimation of distribution algorithm (EDA) and so on. EDA is an intelligent optimisation algorithm based on statistics. EDA adopts a macro-level evolution method based on search space Wang et al. [23], evaluates global information and historical data by using a probability matrix, and achieves population update through iterative sampling, showcasing strong global search capabilities and high search speed. Therefore, EDA finds wide-ranging applications Ahmed et al. [24], Jiang et al. [25], Ricardo and Arturo [26], Pratap Chandran et al. [27], Du et al. [28]. In the field of production scheduling, Sun et al. [29] proposed a hybrid EDA for robot assembly line balancing; Wang et al. [30] applied EDA to flexible manufacturing system scheduling based on Petri nets; Faraji Amiri and Behnamian [31] used the distribution estimation algorithm to solve the multi-objective green flow shop scheduling problem under uncertain conditions; Tang et al. [32] developed an HDPSO-SA algorithm to solve a multi-objective flexible job-shop scheduling model. Qin et al. [33] proposed a multi-target casting production scheduling model and designed a hybrid discrete multi-objective grey wolf optimiser. Ouyang and Utamima [34] applied the mixed distribution estimation algorithm to solve the single-row facility layout problem. Huang et al. [35] developed a two-phase evolutionary algorithm for multi-objective distributed DAPFSP. Feng et al. [36] proposed an improved distribution estimation algorithm for the flow-shop scheduling problem with sequence-dependent setup times.

However, EDA still exhibits defects such as weak local search ability and lack of population diversity in late iteration. Therefore, this paper proposes an improved hybrid distribution estimation algorithm (IHEDA) to solve DPAPFSP. The algorithm has the following improvement measures: firstly, it adopts a decoding strategy based on the earliest finish time (EFT) rules to enhance the uniform distribution of the workpiece in the factory; Secondly, five kinds of neighborhood structures are designed to improve the local search capability of the algorithm by using local neighbourhood search based on critical path; Thirdly, the double sampling strategy based on repetition rate is adopted to improve the diversity of the late population and avoid premature convergence; Fourthly, simulated annealing search is used to enhance the search for the optimal solution of iterative stagnation to improve the fine search ability of the algorithm. Experimental results show that the algorithm is effective.

THE DESCRIPTION OF DPAPFSP

Symbol definition

The mathematical symbols and definitions involved in this paper's model construction are shown in Table 1.

TABLE 1 The symbol table.

Symbol	Introduction
J	A set of n workpieces. J = {J₁, J₂, …, J_n}
F	The number of factories in the workpiece processing stage
M	The number of machines in each factory during the workpiece processing stage
r	The number of assembly machines in the product assembly stage
P	The number of products
θ^f	The workpiece processing sequence of the f factory in the workpiece processing stage. θ = [θ(1), θ(2), …, θ(n_f)]
θ^f(i)	The workpiece number of the i-th workpiece processed in the f factory
${C}_{{\theta }^{f}(i),m}$	The processing completion time of workpiece θ_f(i) at machine m
${t}_{{\theta }^{f}(i),m}$	The processing time of workpiece θ^f(i) at machine m
n_f	The number of workpiece to be processed in factory f
θ^p	θ^p = [θ, θ, …, θ]
θ^p(k)	The product number of the k − th product to be assembled in product assembly stage
${C}_{{\theta }^{p}(k)}^{S}$	The latest machining completion time of the workpiece to which θ^p(k) belongs
${Q}_{{\theta }^{f}(i),{\theta }^{p}(k)}$	Boolean variable. It is 1 if the workpiece θ^f(i) belongs to the product θ^p(k), otherwise it is 0
${C}_{{\theta }^{p}(k)}$	The assembly completion time of product θ^p(k) in product assembly stage
${t}_{{\theta }^{p}(k)}$	The assembly time of product θ^p(k) in product assembly stage
${C}_{j,k}^{M}$	The earliest idle time of assembly machine during the selection of θ^p(k) in product assembly stage
V_j,k	Boolean variable. It is 1 if θ^p(k) is assembled in assembly machine j. Otherwise it is 0

Problem description

DPAPFSP consists of workpiece processing stage (Stage 1) and product assembly stage (Stage 2) as shown in Figure 1.

[IMAGE OMITTED. SEE PDF]

In Stage 1, n workpieces {J₁, J₂, …, J_n} are first assigned to factories, and the corresponding factories are selected from F isomorphic factories for workpiece processing; Secondly, the processing sequence of the workpiece in the same factory is sorted, so that the workpiece J_i assigned to the factory is processed by M machines $\left\{Mec{h}_{{f}_{1}},Mec{h}_{{f}_{2}},\text{\ldots },Mec{h}_{{f}_{M}}\right\}$ in accordance with the processing sequence {O₁, O₂, …, O_M}.

In Stage 2, assembly of P products $\left\{{P}_{{d}_{1}},{P}_{{d}_{2}},\text{\ldots },{P}_{{d}_{P}}\right\}$ are completed on any of r parallel assembly machines $\left\{Mec{h}_{{A}_{1}},Mec{h}_{{A}_{2}},\text{\ldots },Mec{h}_{{A}_{r}}\right\}$ according to the workpiece-product assembly relationship for all finished workpiece. The assumptions are as follows:

(1)
Workpieces are not allowed to change factories after completion of factory distribution;
(2)
Stage 1 and Stage 2 only consider the processing time of the workpiece in the machine and the assembly time of the product in the machine, other time is ignored;
(3)
One machine processes only one workpiece or one product and one workpiece or one product is processed only on one machine at a moment;
(4)
The process of each workpiece is continuous and can not be interrupted;
(5)
Each workpiece has M + 1 operation, and the first M operations can be completed by M machines in any factory of Stage 1, and the M + 1th operation is carried out in the assembly shop.
(6)
A workpiece belongs to only one product;
(7)
The assembly operation of the product can be carried out after all the workpiece belonging to the product has been processed.

Model building

For DPAPFSP, optimisation is conducted with the goal of minimising the makespan (C_max). The expression is shown in Equation (1).

1 $min{C}_{\max }$ The mathematical model of Stage 1 is as follows: 2 ${C}_{{\theta }^{f}(1),1}={t}_{{\theta }^{f}(1),1},\quad \forall f=1,2,\text{\ldots },F$ 3 ${C}_{{\theta }^{f}(i),1}={C}_{{\theta }^{f}(i-1),1}+{t}_{{\theta }^{f}(i),1},\quad f=1,2,\text{\ldots },F;i=1,2,\text{\ldots },{n}_{f}$ 4 ${C}_{{\theta }^{f}(1),m}={C}_{{\theta }^{f}(1),m-1}+{t}_{{\theta }^{f}(1),m},\quad f=1,2,\text{\ldots },F;m=1,2,\text{\ldots },M$ 5 $\begin{align*}\hfill & {C}_{{\theta }^{f}(i),m}=\max \left\{{C}_{{\theta }^{f}(i-1),m},{C}_{{\theta }^{f}(i),m-1}\right\}+{t}_{{\theta }^{f}(i),m},\hfill \\ \hfill & f=1,2,\text{\ldots },f;i=1,2,\text{\ldots },{n}_{f};m=2,\text{\ldots },M\hfill \end{align*}$

Equation (2) represents the processing completion time of the first workpiece processed on the first machine in the factory f; Equation (3) represents the processing completion time of the i-th workpiece processed on the first machine in the factory f; Equation (4) represents the processing completion time of the i-th workpiece processed on the first machine in the factory f; Equation (5) represents the processing completion time of the i-th workpiece processed on the m-th machine in the factory f.

After finishing the processing phase, the workpiece goes to the assembly shop for product assembly. The mathematical model of Stage 2 is as follows: 6 $\begin{align*}\hfill & {C}_{{\theta }^{p}(k)}^{S}=\max \left\{{Q}_{{\theta }^{f}(i),{\theta }^{p}(k)}\cdot {C}_{{\theta }^{f}(i),M}\right\},\hfill \\ \hfill & \forall k=1,2,\text{\ldots },P;\forall f=1,2,\text{\ldots },F;\forall i=1,2,\text{\ldots },{n}_{f}\hfill \end{align*}$ 7 $\sum\limits _{k=1}^{P}{Q}_{{\theta }^{f}(i),{\theta }^{p}(k)}=1,\forall k=1,2,\text{\ldots },P;\forall f=1,2,\text{\ldots },F;\forall i=1,2,\text{\ldots },{n}_{f}$ 8 ${C}_{{\theta }^{p}(1)}={C}_{{\theta }^{p}(1)}^{S}+{t}_{{\theta }^{p}(1)}$ 9 ${C}_{{\theta }^{p}(k)}=\max \left\{\min \left\{{C}_{j,k}^{M}\right\},{C}_{{\theta }^{p}(k)}^{s}\right\}+{t}_{{\theta }^{p}(k)},\forall k=1,2,\text{\ldots },P;\forall j=1,2,\text{\ldots },r$ 10 ${C}_{j,1}^{M}=0,\forall j=1,2,\text{\ldots },r$ 11 ${C}_{j,k}^{M}=\max \left\{{C}_{j,k-1}^{M},{V}_{j,k-1}\cdot {C}_{{\theta }^{p}(k-1)}\right\},\forall k=2,3,\text{\ldots },P;\forall j=1,2,\text{\ldots },r$ 12 ${C}_{\max }=\max \left\{{C}_{{\theta }^{p}(k)}\right\},\forall k=2,3,\text{\ldots },P$

Equation (6) represents the latest processing completion time of the workpiece to which the k-th assembly product belongs in the product assembly sequence θ^p. Equation (7) represents the assembly relation between the product and the workpiece. If the workpiece θ^f(i) belongs to the product θ^p(k), it is 1; otherwise, it is 0; Equation (8) represents the processing completion time of the first product in the product sequence; Equations (9)–(11) represent the processing completion time of the k-th product and the earliest idle time of the current parallel assembly machine in the product sequence; Equation (12) represents the makespan of the scheduling scheme. DPAPFSP is a NP-hard problem, and the optimisation objective should be transformed into an approximate optimal solution within a limited time, so as to minimise the makespan.

IHEDA

The introduction of EDA

The estimation of distribution algorithm, as an intelligent optimization algorithm, is based on the principle of statistics and implements population updating by constructing probability statistical model and sampling. EDA has strong global search ability compared to other intelligent optimization algorithms. It uses a probability model to analyze the global information of the solution space's information and the search process's historical information. It takes this as the probability index of the future population sampling.

The key steps of EDA are probability model construction and population sampling. The probability model is built based on the current search probability and statistics of the historical information of the learning process to establish the probability model for describing the solution space. Population sampling is a random method, such as the Monte Carlo method, to generate new populations. Therefore, the probability model is the core of EDA Wang et al. [23].

Encoding and decoding

In the DPAPFSP solution, IHEDA adopts single-layer displacement coding, which involves sorting the numbers of the processed workpieces. The coding length is the total number of workpieces to be processed. Figure 2 shows a string of an 8-bit single-layer permutation codes. For the above coding sequencing, this algorithm uses the EFT decoding strategy Iwata et al. [37] to determine the factory allocation and processing sequencing of the workpieces to be processed. According to EFT rules, each workpiece to be processed is placed in each factory, and the current processing completion time of the workpiece in each factory is calculated. The factory with the shortest processing completion time is selected as the processing factory of the workpiece. If the alternative factories have the same processing time, one of them will be selected for allocation. If the alternative factories have the same processing time, one of them will be selected for allocation. This method improves the uniformity of the workpiece selection in the factory and avoids the phenomenon of a completely idle factory. Initialization uses random generation to generate initial solutions.

[IMAGE OMITTED. SEE PDF]

Local neighbourhood search based on critical path

As an effective operation method, local neighborhood search can refine the algorithm's search scope and improve the algorithm's local optimization ability. For a scheduling scheme aiming at C_max, the critical path is the decisive factor affecting the scheduling goal. The critical path refers to the sequential series of workpieces within a project that possesses the maximum processing completion time, ultimately dictating the minimum time required for the project's completion. The critical path comprises two distinct components: the production and assembly phase. Key workpieces are defined as the processing workpieces on the critical path. The goal of the scheduling scheme can be reduced only if key workpieces are adjusted. For a critical path, the key workpieces all exist in a factory. IHEDA designed five local neighborhood search operations combined with the critical path to enhance the fine search ability of the algorithm and realize the purpose of effective optimization.

The five kinds of local neighborhood search include three types of intra-factory search and two types of factory-to-factory search. The specific operations are as follows:

The three intra-factory search operations include swap, insert, and flip operations.

The swap operation involves randomly selecting one piece from the key workpiece and then choosing a non-repeating workpiece from the same factory as the selected workpiece. These two workpieces are exchanged to create a new solution.

The insert operation is to select a random workpiece from the key workpiece set and then choosing a random non-repeating workpiece from the factory where the workpiece is located. The key factory workpiece is extracted sequentially from the sequence, and the position of the factory in the sequence is reserved. For the extracted sequence, insert the selected key workpiece behind another workpiece. Then, the extraction sequence that completes the insertion operation is placed in the workpiece sequence to form a new solution.

The flip operation randomly selects one workpiece from the key workpiece set and then randomly selects a non-repeating workpiece from the factory where the workpiece is located. The critical factory workpiece is extracted sequentially from the sequence, and the position of the factory in the sequence is reserved. For the extracted sequence, the sequence between the selected workpieces is flipped. Then, the extraction sequence that completes the insertion operation is placed in the workpiece sequence to form a new solution.

Here is an example with 8 workpieces and 2 factories, where factory 1 = 1, 3, 4, 6 and factory 2 = 2, 5, 7, 8 are known by EFT rule assignment. Suppose the key artefact set is 1, 4, 6. Inserting key workpiece 4 into workpiece 3 is shown in Figure 3a. Flip key workpiece 1 with workpiece 3 as shown in Figure 3b.

[IMAGE OMITTED. SEE PDF]

The two types of factory-factory operation include swap and insert operation.

The swap operation is to randomly select one workpiece from the key workpiece set, and then select a non-repeating workpiece from other factories, and exchange the two workpieces to form a new solution.

The insert operation is to randomly select one workpiece from the key workpiece set, and then randomly select a non-repeating workpiece from other factories, and insert the selected key workpiece behind another workpiece to form a new solution.

In this context, each of the above operations are performed once recorded as a local search. The new solution obtained in each local search is compared with the original solution. If the evaluation value (C_max) of the new solution is superior to the old one, the new solution is used to replace the old one, otherwise the old one is retained. The above procedure will be applied to the current optimal individual, and the number of executions is denoted as C = γn, where γ represents the local search depth.

Probability model update

The probability matrix model for EDA is usually updated using an elite solution. In other words, some individuals with the best evaluation value in the population are taken as elite solutions and the probability matrix is updated by means of incremental learning. The “IHEDA” uses an n*n 2-dimensional matrix ρ(gen) to represent the gen-th generation probability model, and the initialisation is set to ρ(1) = 1/n to ensure that the artefacts are equally likely to appear at each position in the sequence in the initial state. The update formula is Equation (13): 13 ${\rho }_{i,j}(gen+1)=(1-\alpha ){\rho }_{i,j}(gen)+\frac{\alpha }{{N}_{JY}}\sum\limits _{k=1}^{{N}_{JY}}{I}_{i,j}^{k}$

In the formula, ρ_i,j(gen) is the probability of gen-th generation workpiece appearing at or before sequence position i; α is the learning factor, and its value range is [0,1]. N_JY is the number of elite individuals. Here, set η = N_JY/GT to represent the proportion of the elite solution in the population. GT is the number of individuals in the population.

The expression for the range of values of ${I}_{i,j}^{k}$ is Equation (14) 14 ${I}_{i,j}^{k}\in \left\{0,1\right\}$

If the position number where workpiece j is located is less than or equal to position i, then ${I}_{i,j}^{k}$ takes the value of 1, otherwise it takes the value of 0.

Double sampling strategy based on repetition rate

Considering that EDA has the defect of decreasing population diversity and falling into the local optimal solution in the late iteration, the sampling quantity of the new population is divided by calculating the repetition rate of the probability matrix. The sampling quantity of the probability model matrix with a high repetition rate is reduced in the late iteration. After updating the probability matrix, we introduce a dual-sampling strategy. To ensure the diversity of the algorithm in the later iteration, new population individuals are randomly generated to supplement the algorithm.

The repetition rate describes the repetition degree of the new solution obtained by sampling the probability matrix, and its calculation formula is Equation (15): 15 $R=\frac{{\sum }_{j=1}^{n}{Z}_{j}}{n}\ast 100\%$

In the formula, Z_j is a Boolean variable, indicating whether the distribution of the probability model matrix in the j row meets the “repetition condition”. If the condition is satisfied, Z_j takes 1, otherwise it takes 0.

According to the sampling rule of probability model matrix, sampling generally starts from the first sequence bit. When the sequence bit of a workpiece is determined, the probability of the column in which the workpiece is located is set to zero.

Suppose the number of non-zero or non-minimal probability workpiece corresponding to each order bit of the probability matrix equals the number of the order bits. In that case, the number of redeemable workpiece corresponding to this order bit is only 1, which satisfies the “repetition condition”. The repetition rate R is used for the proportion of bits in the statistical probability matrix that meet the “repetition condition”. The content of Pseudo-code 1 is repetition rate calculation and repetition condition judgement Algorithm 1.

Algorithm

Pseudo-code 1: Repeat Rate

Repeat = 0 for do i 1: n for do j 1: n MinNum = 0 % Initialising Minimal Value Records if then ρ(i, j) ≤ 1/(n*θ) MinNum = MinNum + 1 end if end for if then MinNum == n − i % Determine if duplicate conditions are satisfied Repeat = Repeat + 1 end if end for R = Repeat/n*100%

The relation between the repetition rate R and the quantity of the probability model matrix solution type N is shown in Equation (16): 16 $N\le \frac{n!}{(R\ast N)!}$

As can be seen from Equation (16), for a probability matrix, the higher the repetition rate, the higher the repetition degree of obtaining new solutions, the lower the diversity of new solutions, and the higher the possibility of falling into local optimal solutions. Therefore, when the repetition rate of the probability matrix is high, the number of individuals to obtain new solutions by sampling the probability matrix should be reduced to decrease the likelihood of falling into a locally optimal solution in later stages. Random generation is adopted to supplement the empty sampling individuals, ensuring algorithm diversity in subsequent iterations. The quantitative updating formula of random sampling is shown in Equation (17). When the repetition rate is lower than 0.5, the diversity of new solutions obtained through probability matrix sampling is high, so only the probability matrix sampling is performed. 17 ${n}_{\mathit{Rand}}=\left\{\begin{array}{@{}ll@{}}\lfloor {(GT-1)}^{2R-1}+0.5\rfloor ,\quad \hfill & R\ge 50\%\hfill \\ 0,\quad \hfill & R< 50\%\hfill \end{array}\right.$

Simulated annealing searching

Li et al. [38] proposed a hybrid chemical reaction optimization algorithm, which was combined with the simulated annealing (SA) algorithm's new solution acceptance method. This inspired the introduction of simulated annealing search mechanism to enhance the searching ability of IHEDA and improve the quality of the solution. In developing a hybrid discrete multi-objective imperial competition algorithm, Peng et al. [39] also incorporated a simulated annealing algorithm mechanism in the second step of improving the search mechanism. Here, set search frequency σ and the total number of iterations Max_iters. If the optimal value remains unchanged for Max_iters/σ consecutive times during the algorithm iteration, the simulated annealing algorithm is used to conduct an enhanced search for the current optimal solution.

The simulated annealing search process in IHEDA is outlined as follows:

$1$

Step

Determine whether the conditions for simulated annealing search are met, if yes, proceed to step 2; Otherwise go to step 7.

$2$

Step

Set relevant parameters and select the current optimal solution as the initial solution for the current iteration.

$3$

Step

Performs a swap operation on the current solution to generate a new one.

$4$

Step

Calculate the increment ΔE of the evaluation function for the new solution and apply the Metropolis criterion is used to determine whether the new solution is acceptable as a replacement for the current solution.

$5$

Step

Check if the specified number of searches at the current temperature have been completed. If yes, proceed to Step 6; Otherwise, return to Step 3.

$6$

Step

Update the current temperature and compare it with the final temperature. If the current temperature is lower than the final temperature, proceed to Step 7; otherwise, return to Step 3.

$7$

Step

Update the current optimal solution and exit simulated annealing search.

The flow chart of IHEDA to solute DPAPFSP is shown in Figure 4.

[IMAGE OMITTED. SEE PDF]

SIMULATION EXPERIMENT

Since DPAPFSP lacks a standard test example and shares structural similarities with the DAPFSP model, DPAPFSP is simulated using DAPFSP data (). The algorithm and test in this paper are implemented using Matlab2016a. Assuming that the parallel assembly machine is isomorphic, each product can be processed on any assembly machine with a constant processing time, and the number of assembly machines is denoted as r = 2. The dataset for this example contains four variables: N, M, F and P. These variables take values from the sets n = {8, 12, 16, 20, 24}, M = {2, 3, 4, 5}, F = {2, 3, 4}, P = {2, 3, 4}. These variables together result in a total of 180 combinations. Each combination has five different machining time options for workpieces, and one case is considered for each time option, that is, 5 cases for each combination. Consequently, the total number of experimental cases is 900.

In this paper, average relative percentage deviation (ARPD) is used as the performance index, and the calculation formula is Equation (18): 18 $ARPD=\frac{1}{U}\sum\limits _{i=1}^{U}\left(\frac{{C}_{i}-{C}_{\mathit{best}}}{{C}_{\mathit{best}}}\ast 100\%\right)$

In the formula, C_best is the optimal solution of the currently known best scheduling solution; C_i is the optimal value of the algorithm at the i-th run, and U is the number of runs.

Parameter settings

Main parameters of IHEDA are shown in Table 2. An experimental design method is adopted to analyze parameter performance. Example I_24_4_2_4_2 is selected for simulation calculation. Example I_24_4_2_4_2 has the following information: n = 24, M = 4, F = 2, p = 4, r = 2; the table of the time required for each workpiece in workpiece processing stage is shown in Table 3, the table of the time required for each workpiece in the product assembly stage is shown in Table 4, and the assembly relationship between the workpiece and the product is shown in Table 5.

TABLE 2 IHEDA main parameter table.

Parameter	α	γ	η	σ
Meaning	Learning factor	Depth of local search	Elite solution ratio	SA search frequency

TABLE 3 Time required for each workpiece in workpiece processing stage.

θ^f(i)	O₁	O₂	O₃	O₄	θ^f(i)	O₁	O₂	O₃	O₄
1	93	96	13	58	13	71	75	34	82
2	96	27	52	73	14	18	62	87	61
3	41	40	73	49	15	40	87	6	96
4	95	34	32	62	16	95	32	3	97
5	60	6	45	93	17	9	33	48	63
6	74	55	66	58	18	46	7	28	72
7	22	93	37	75	19	6	97	12	70
8	12	42	4	36	20	82	81	71	48
9	58	63	58	69	21	68	38	42	63
10	41	1	28	65	22	90	60	13	54
11	69	75	99	21	23	39	58	20	97
12	75	28	33	54	24	13	75	47	45

TABLE 4 Time required for each workpiece in the product assembly stage.

θ^p(k)	1	2	3	4
${t}_{{\theta }^{p}(k)}$	194	107	139	388

TABLE 5 Assembly relationship between the workpiece and the product.

θ^p(k)	1	2	3	4
θ^f(i)	1, 3, 13, 21, 23, 24	5, 6, 9, 10, 20	4, 11, 14, 17, 18, 22	2, 7, 8, 12, 15, 16, 19

Four groups of different level values are set for each major parameter as shown in Table 6. At the same time, orthogonal table L₁₆(4⁴) is selected, each combination is run independently with IHEDA for 10 times, Maximum number of iteration Max_Iters = 100, population size GT = 50, and the average percentage deviation ARPD is used as the influence value RV, here the optimal value run in the orthogonal experiment is selected as C_best. The orthogonal table and its results are shown in Table 7.

TABLE 6 Combination table of different parameter values.

Number	α	γ	η	σ
1	0.1	0.25	0.1	2
2	0.2	0.50	0.2	3
3	0.3	0.75	0.3	4
4	0.4	1.00	0.4	5

TABLE 7 Orthogonal table and RV values.

Number	α	γ	η	σ	RV	Number	α	γ	η	σ	RV
1	1	1	1	1	2.801	9	3	1	3	4	1.131
2	1	2	2	2	1.515	10	3	2	4	3	1.950
3	1	3	3	3	1.214	11	3	3	1	2	1.753
4	1	4	4	4	1.151	12	3	4	2	1	1.836
5	2	1	2	3	1.266	13	4	1	4	2	2.147
6	2	2	1	4	1.100	14	4	2	3	1	2.459
7	2	3	4	1	2.085	15	4	3	2	4	1.183
8	2	4	3	2	1.494	16	4	4	1	3	1.504

Group analysis was performed for each parameter in Table 7, and the average RV value was used as the evaluation value to draw the trend chart of each parameter as shown in Figure 5.

[IMAGE OMITTED. SEE PDF]

It is observed that the greater the local search depth γ and search frequency σ, the higher the algorithm performance. Considering the time cost and operation efficiency, and based on the statistical analysis of the above parameters, the algorithm parameters are set as follows: GT = 50, Max_Iters = 100, α = 0.2, γ = 1.00, η = 0.2, σ = 4.

Simulation experiment results

We conducted simulation experiments using the obtained optimal parameters. The simulation process involved IHEDA and four comparative algorithms (GA, EDA, EDA-EFT, and EDA-LS). A total of 900 instances were used for the simulation experiments, and the results obtained from running IHEDA are shown in Table 8. It is important to note that the data in column J corresponds to the values of parameter n. The 5 data (T1,T2,T3,T4, and T5) in each group are the results obtained by running the algorithm 5 times for that group. Refer to Table 1 and Section Four (Simulation Experiment) for a more comprehensive explanation.

TABLE 8 IHEDA algorithm simulation results data (partial).

No.	Instance	J	M	F	P	Replicate	T1	T2	T3	T4	T5
1	I_8_2_2_2_1	8	2	2	2	1	371	371	371	371	371
2	I_8_2_2_2_2	8	2	2	2	2	361	361	361	361	361
3	I_8_2_2_2_3	8	2	2	2	3	454	454	454	454	454
4	I_8_2_2_2_4	8	2	2	2	4	497	497	497	497	497
5	I_8_2_2_2_5	8	2	2	2	5	426	426	426	426	426
6	I_8_2_2_3_1	8	2	2	3	1	472	472	472	472	472
7	I_8_2_2_3_2	8	2	2	3	2	403	403	403	403	403
8	I_8_2_2_3_3	8	2	2	3	3	299	299	299	299	299
9	I_8_2_2_3_4	8	2	2	3	4	456	456	456	456	456
10	I_8_2_2_3_5	8	2	2	3	5	403	403	403	403	403
11	I_8_2_2_4_1	8	2	2	4	1	388	388	388	388	388
12	I_8_2_2_4_2	8	2	2	4	2	220	220	220	220	220
13	I_8_2_2_4_3	8	2	2	4	3	427	427	427	427	427
14	I_8_2_2_4_4	8	2	2	4	4	348	348	348	348	348
15	I_8_2_2_4_5	8	2	2	4	5	334	334	334	334	334
…	……	…	…	…	…	…	…	…	…	…	…
886	I_24_2_2_2_1	24	2	2	2	1	1221	1221	1221	1219	1219
887	I_24_2_2_2_2	24	2	2	2	2	944	944	944	944	944
888	I_24_2_2_2_3	24	2	2	2	3	1846	1843	1838	1838	1838
889	I_24_2_2_2_4	24	2	2	2	4	1296	1296	1296	1296	1296
890	I_24_2_2_2_5	24	2	2	2	5	1260	1257	1256	1256	1256
891	I_24_2_2_3_1	24	2	2	3	1	1114	1114	1114	1114	1114
892	I_24_2_2_3_2	24	2	2	3	2	1090	1090	1090	1073	1072
893	I_24_2_2_3_3	24	2	2	3	3	1130	1088	1088	1088	1088
894	I_24_2_2_3_4	24	2	2	3	4	1033	1029	1029	1029	1028
895	I_24_2_2_3_5	24	2	2	3	5	1316	1314	1314	1314	1314
896	I_24_2_2_4_1	24	2	2	4	1	1004	1004	1004	1004	1004
897	I_24_2_2_4_2	24	2	2	4	2	1242	1206	1202	1202	1202
898	I_24_2_2_4_3	24	2	2	4	3	1245	1245	1245	1245	1245
899	I_24_2_2_4_4	24	2	2	4	4	1016	1016	1016	1016	1016
900	I_24_2_2_4_5	24	2	2	4	5	831	815	815	815	815

The analysis of examples

To verify the performance effect of IHEDA, 900 experimental examples were used for simulation analysis and compared with GA, EDA Wang et al. [40], EDA-EFT and EDA-LS. EDA-EFT is an EDA that uses only EFT rules, and EDA-LS is an EDA that uses local neighborhood search. The average percentage deviation ARPD and the best value achievement rate Best are used as the evaluation performance indicators, and Max_Iters is set to 100. Since no optimal value has been recorded and reported in DPAPFSP, the optimal value of all the methods to be measured is taken as C_best. Each combination was run independently 5 times, and 15 combinations were established based on F and n classification. The experimental results are shown in Table 9, where the optimal results of each group are marked black. As can be seen from Table 9, IHEDA has obvious advantages in ARPD and Best compared with other algorithms, and each combination is not inferior to different algorithms. In addition, probability and statistical analysis of ARPD value of each algorithm in each combination was conducted, as shown in Table 10. The optimal results of each group were marked black. The maximum MAX and standard deviation STD of IHEDA in each combination are lower than those of other algorithms, indicating that IHEDA has high stability and can effectively improve the quality of solutions.

TABLE 9 Algorithm comparison table of ARPD and Best.

F*n	GA	EDA	EDA-EFT	EDA-LS	IHEDA
ARPD	Best	ARPD	Best	ARPD	Best	ARPD	Best	ARPD	Best
2*8	0.965	90.00	0	100	0.013	100	0.010	100	0	100
2*12	2.688	33.33	0.300	71.67	0.931	53.33	0.458	65.00	0.023	100
2*16	3.784	11.67	0.925	45.00	1.687	26.67	0.893	50.00	0.062	98.33
2*20	4.197	3.33	1.729	21.67	3.231	3.33	1.386	33.33	0.052	100
2*24	4.880	6.67	3.977	6.67	6.106	6.67	3.417	8.33	0.190	100
3*8	0.351	98.33	0	100	0.012	100	0.019	100	0	100
3*12	1.335	68.33	0.119	85.00	0.391	80.00	0.202	85.00	0.012	100
3*16	2.442	35.00	0.556	60.00	1.037	45.00	0.565	65.00	0.028	100
3*20	3.372	16.67	1.416	23.33	2.561	13.33	1.668	25.00	0.085	100
3*24	3.249	18.33	2.416	11.67	3.731	8.33	2.109	23.33	0.097	98.33
4*8	0.048	100	0	100	0.008	98.33	0.002	100	0	100
4*12	0.520	88.33	0.132	93.33	0.254	88.33	0.167	90.00	0.001	100
4*16	1.412	58.33	0.428	71.67	0.699	61.67	0.518	70.00	0.018	98.33
4*20	2.017	31.67	0.728	46.67	1.196	35.00	0.887	53.33	0.043	100
4*24	2.221	30.00	1.484	28.33	2.323	16.67	1.267	36.67	0.112	98.33
Average	2.232	46.00	0.947	57.67	1.612	49.11	0.905	60.33	0.048	99.56

TABLE 10 Algorithm comparison table of MAX and STD.

F*n	GA	EDA	EDA-EFT	EDA-LS	IHEDA
MAX	STD	MAX	STD	MAX	STD	MAX	STD	MAX	STD
2*8	3.966	0.995	0.000	0.000	0.504	0.069	0.176	0.034	0.000	0.000
2*12	6.719	1.572	1.970	0.515	5.365	1.220	4.455	0.823	0.554	0.091
2*16	8.909	1.924	6.607	1.260	7.103	1.704	6.877	1.328	0.730	0.146
2*20	10.316	2.155	7.355	1.667	10.917	2.371	4.702	1.524	0.611	0.125
2*24	12.220	2.994	13.652	2.900	15.371	3.821	10.645	2.682	1.425	0.318
3*8	4.602	0.704	0.000	0.000	0.503	0.068	0.708	0.102	0.000	0.000
3*12	5.356	1.356	1.290	0.291	3.226	0.730	1.957	0.490	0.565	0.074
3*16	8.239	1.915	6.906	1.168	6.496	1.415	6.701	1.144	0.452	0.081
3*20	8.678	2.089	6.752	1.390	7.057	1.678	7.317	1.629	1.019	0.193
3*24	9.128	2.282	7.882	2.050	10.475	2.540	8.575	2.291	0.935	0.181
4*8	1.122	0.175	0.000	0.000	0.321	0.046	0.058	0.009	0.000	0.000
4*12	5.753	0.946	3.945	0.544	5.699	0.879	5.753	0.772	0.029	0.004
4*16	4.482	1.405	4.181	0.915	4.181	0.979	4.181	0.984	0.267	0.054
4*20	6.537	1.672	4.375	1.128	4.375	1.157	8.664	1.520	0.523	0.100
4*24	6.285	1.712	6.037	1.438	8.386	1907	6356	1.401	1.389	0.255

A box graph is used to describe the discrete distribution of data. In this case, the box graph is used to conduct statistical collation and comparison of the data of each algorithm when the number of workpiece is n = 24, as shown in Figure 6.

[IMAGE OMITTED. SEE PDF]

It can be seen that compared with other algorithms, IHEDA data has high aggregation and certain advantages in the stability and quality of solutions.

In addition, for example, I_24_4_2_4_2, the five algorithms were tested for 10 times, and their convergence curves were recorded, as shown in Figure 7.

[IMAGE OMITTED. SEE PDF]

It can be seen that the convergence effect of IHEDA is significantly better than other algorithms, which confirms the effectiveness of IHEDA in DPAPFSP.

Based on the comprehensive analysis conducted above, it can be inferred that IHEDA exhibits superior stability and faster convergence speed when compared to other algorithms. Consequently, IHEDA can be considered the optimal algorithm for addressing the DPAPFSP. To illustrate, we employ IHEDA to solve the DPAPFSP case instance I_24_4_2_4_2 and present the resulting Gantt chart (Figure 8) through visualization.

[IMAGE OMITTED. SEE PDF]

Below are non-parametric tests performed on the comparative algorithm above, including the Wilcoxon two-factor rank-sum test and the Friedman multiple-factor rank-sum test. These tests serve to validate the solution's effectiveness; however, their distinction lies in the various perspectives from which they assess the solution.

(a)
Wilcoxon Two-Factor Rank Sum Test

The Wilcoxon test is based on the traditional sign test and includes difference analysis, which has advantages over convertional positive-negative comparisons. IHEDA is used as the main testing algorithm and is compared with other algorithms using the Wilcoxon test. The example analyzes the five algorithms under the condition of F = 2, F = 3, and F = 4, and the data is divided into five cases according to the number of workpieces n ∈ {8, 12, 16, 20, 24}.

For each case, the algorithm's ARPD value is taken as the comparative data for testing. R+ represents the sum of positive ranks, indicating the total sum of ranks where IHEDA outperforms the compared algorithms. R- represents the sum of negative ranks, indicating the total sum of ranks where IHEDA is inferior to the compared algorithms. The p suggests the difference between IHEDA and other compared algorithms, usually compared with the confidence parameter α′. When α′ is set to 0.05, if the p is less than α′, it indicates that within the 95% confidence interval, IHEDA has better solution quality than other compared algorithms. Similarly, when α′ is set to 0.1, if the p is less than α′, it indicates that within the 90% confidence interval, IHEDA has better solution quality than other compared algorithms.

From Table 11, it can be seen that IHEDA is significantly better than the other algorithms in the case of F = 2, except for when n = 8, where IHEDA is within the 95% confidence interval of the different comparison algorithms. However, in the case of n = 8, IHEDA is within the 95% confidence interval and is better than GA and EDA-LS. In comparison with EDA and EDA-EFT, IHEDA has a negative rank sum of 0, indicating that IHEDA is not inferior to the other comparison algorithms in the case of n = 8 and further confirming the effectiveness of IHEDA's solution for F = 2.

TABLE 11 The Wilcoxon test table for the comparison of IHEDA (F = 2).

n	vs.	R+	R−	z	p	α′ = 0.05	α′ = 0.1
8	GA	1275	0	−6.149 E+00	8.882 E−16	Y	Y
EDA	0	0	–	5.000 E−01	N	N
EDA-EFT	6	0	−1.336 E+00	1.250 E−01	N	N
EDA-LS	15	0	−1.888 E+00	3.125 E−02	Y	Y
12	GA	1770	0	−6.676 E+00	1.735 E−18	Y	Y
EDA	300	0	−4.271 E+00	5.960 E−08	Y	Y
EDA-EFT	780	0	−5.435 E+00	1.819 E−12	Y	Y
EDA-LS	351	0	−4.445 E+00	1.490 E−08	Y	Y
16	GA	1830	0	−6.732 E+00	8.674 E−19	Y	Y
EDA	1102	26	−5.688 E+00	7.596 E−12	Y	Y
EDA-EFT	1485	0	−6.389 E+00	5.551 E−17	Y	Y
EDA-LS	984	6	−5.701 E+00	7.958 E−13	Y	Y
20	GA	1830	0	−6.732 E+00	8.674 E−19	Y	Y
EDA	1540	0	−6.447 E+00	2.776 E−17	Y	Y
EDA-EFT	1711	0	−6.612 E+00	3.469 E−18	Y	Y
EDA-LS	1274	1	−6.139 E+00	1.776 E−15	Y	Y
24	GA	1711	0	−6.620 E+00	3.469 E−18	Y	Y
EDA	1596	0	−6.505 E+00	1.388 E−17	Y	Y
EDA-EFT	1653	0	−6.563 E+00	6.939 E−18	Y	Y
EDA-LS	1653	0	−6.563 E+00	6.939 E−18	Y	Y

We performed the same Wilcoxon rank-sum test for both F = 3 and F = 4 cases, yielding results consistent with those obtained for the F = 2 scenario. The Wilcoxon rank-sum test confirms the effectiveness of IHEDA in solving the DPAPFSP.

(b)
Friedman's Multivariate Rank Sum Test

The Friedman rank sum test, a non-parametric test for determining significant differences among multiple samples of interest, is performed below for the five comparison algorithms. The Friedman test sorts various samples and their data sets and calculates the mean rank. Here, the Friedman rank sum test is set up for three scenarios of plant number F = 2, F = 3, and F = 4. Each case is then divided into five groups of samples with the number of workpieces n ∈ {8, 12, 16, 20, 24}. The average percent deviation ARPD mean value of each group of samples is selected as the evaluation index. It sorted the order of the evaluation index from best to worst and attached the ordinal rank. Finally, the average of the ordinal rank of each algorithm corresponding to five groups of samples is calculated, which is the average rank of the algorithm. The lower the average rank of the algorithm, the higher the solution quality of the algorithm. Meanwhile, the Nemenyi follow-up test is introduced on this basis, which determines whether there is a significant difference between the compared algorithms in this confidence interval by calculating the corresponding confidence interval corresponding to the CD value and the average rank of the compared algorithms.

The Friedman test tables for the comparison algorithms in the cases of plant number F = 2, F = 3, and F = 4 are shown in Table 12, and the Friedman test plots are shown in Figure 9.

TABLE 12 Friedman test table.

Algorithm	Mean rank
F = 2	F = 3	F = 4
GA	4.800	4.800	4.800
EDA	2.500	2.100	2.100
EDA-EFT	4.200	4.000	4.000
EDA-LS	2.400	3.000	3.000
IHEDA	1.100	1.100	1.100
CD − α = 5%(IHEDA)	3.828	3.828	3.828
CD − α = 10%(IHEDA)	3.559	3.559	3.559

[IMAGE OMITTED. SEE PDF]

The related graph show that the average rank of IHEDA is significantly lower than other algorithms, and it is considerably better than GA and EDA-EFT in the 95% confidence interval for all three cases with different plant numbers. The experimental results illustrate the effectiveness of IHEDA in solving DPAPFSP.

CONCLUSION

In this paper, the Distributed parallel assembly permutation flow shop scheduling problem is studied, and an improved hybrid estimation of distribution algorithm is proposed according to the characteristics of the problem and the defects of EDA. The algorithm adopts a variety of improvement strategies. Firstly, the single-layer permutation encoding and decoding strategy based on EFT rules are adopted in the coding rules. Secondly, the local neighborhood search based on the critical path is carried out for the optimal solution. Then the new population sampling is done by the double sampling strategy based on the repetition rate. Finally, the simulated annealing algorithm is introduced to enhance the search. Simulation experiment and algorithm comparison verified the effectiveness of IHEDA in solving DPAPFSP.

Furthermore, the main focus of this study is on the two-stage distributed assembly flowshop scheduling problem. However, most complex assembly systems in pratical scenarios involve scheduling models with three or more stages. In the future, addressing the multi-stage distributed assembly flowshop scheduling problem with three or more stages will be a crucial direction for research in this field. The primary objective of this paper is to minimize the maximum completion time, which falls under the category of single-objective scheduling problems. In actual production settings, various conditions such as delivery deadlines, costs, and energy consumption must be considered simultaneously. Therefore, the introduction of multi-objective scheduling methods becomes necessary. Additionally, future research can explore the integration of advanced soft computing strategies, such as knowledge-based optimization algorithms and Q-learning mechanisms. It can further enhance the effectiveness and efficiency of solving complex scheduling problems.

AUTHOR CONTRIBUTIONS

Lizhen Du: Conceptualization; Funding acquisition; Project administration; Resources; Supervision; Writing – review & editing. Xintao Wang: Data curation; Formal analysis; Software; Visualization; Writing – original draft; Writing – review & editing. Jiaqi Tang: Investigation; Methodology; Validation. Chuqiao Xu: Project administration; Resources; Supervision. Guanxing Qin: Software; Visualization.

CONFLICT OF INTEREST STATEMENT

The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT

Author elects to not share data.

References

Lei, D., Li, J.: Distributed energy‐efficient assembly scheduling problem with transportation capacity. Symmetry 14(11), [eLocator: 2225] (2022). [DOI: https://dx.doi.org/10.3390/sym14112225]

Word count: 7274

Show less

© 2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Translate

Distributed assembly permutation flow shop scheduling problem is the hot spot of distributed pipeline scheduling research; however, parallel assembly machines are often in the assembly stage. Therefore, we propose and study distributed parallel assembly permutation flow shop scheduling problem (DPAPFSP). This aims to enhance the efficiency of multi‐factory collaborative production in a networked environment. Initially, a corresponding mathematical model was established. Then, an improved hybrid distribution estimation algorithm was proposed to minimize the makespan. The algorithm adopts a single‐layer permutation encoding and decoding strategy based on the rule of the Earliest Finished Time. A local neighbourhood search based on critical paths is performed for the optimal solution using five types of neighborhood design. A dual sampling strategy based on repetition rates was introduced to ensure the diversity of the population in the later periods of iteration. Simulated annealing searching was applied to accelerate the decline of optimal value. Finally, we conduct simulation experiments using 900 arithmetic cases and compare the simulation experimental data of this algorithm with the other four existing algorithms. The analysis results demonstrate this improved algorithm is very effective and competitive in solving the considered DPAPFSP.

Details

Title

Improved hybrid estimation of distribution algorithm for distributed parallel assembly permutation flow shop scheduling problem

Author

Du, Lizhen¹

; Wang, Xintao¹; Tang, Jiaqi¹; Xu, Chuqiao²; Qin, Guanxing¹

¹ Hubei Key Laboratory of Digital Textile Equipment, Wuhan Textile University, Wuhan, Hubei, China, School of Mechanical Engineering & Automation, Wuhan Textile University, Wuhan, Hubei, China
² School of Mechanical Engineering & Automation, Wuhan Textile University, Wuhan, Hubei, China

Section

ORIGINAL RESEARCH

Publication year

2024

Publication date

Sep 1, 2024

Publisher

John Wiley & Sons, Inc.

e-ISSN

25168398

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1049/cim2.12116

ProQuest document ID

3192492300

Improved hybrid estimation of distribution algorithm for distributed parallel assembly permutation flow shop scheduling problem

Jump to:

Full text

Abstract

Details

Suggested sources