1. Introduction
Job-shop scheduling plays a pivotal role in manufacturing: it governs the execution and coordination of production plans and directly influences production efficiency and product quality. The job-shop scheduling problem (JSP), a representative problem in shop floor scheduling, is complex enough to remain a prominent research topic in control science and manufacturing. The JSP is NP-hard [1], so obtaining the optimal solution by exhaustively searching all feasible solutions is impractical. Although traditional optimization methods, including mathematical programming, branch-and-bound, and enumeration techniques, have achieved certain results on job-shop scheduling problems, they are applicable only to small-scale cases. As the problem size increases, the feasible solution space expands dramatically and computation time grows sharply, so directly applying these optimization methods to large-scale JSP instances becomes impractical. Approximation methods, such as genetic algorithms [2], particle swarm algorithms [3], ant colony algorithms [4], and simulated annealing algorithms [5], can identify near-optimal solutions for large-scale job-shop scheduling problems. These techniques explore the solution space by simulating natural processes or biological evolutionary mechanisms, thereby mitigating some of the limitations exact methods face on large-scale problems. However, approximate optimization methods also have limitations: they often require long computation times, particularly on intricate problems, and they typically demand iterative parameter tuning to reach good solution quality, which adds complexity and uncertainty to the algorithms.
The rapid advancement of information technology has led to the accumulation of vast amounts of data in workshop production, which presents a novel solution for JSP. These data not only record various aspects of the production process but also contain valuable knowledge that can inform scheduling decisions. Therefore, effectively utilizing this data and extracting useful dispatching knowledge is crucial for enhancing the optimization ability of scheduling algorithms and improving job-shop processing efficiency. Specifically, through data mining technology [6], it is possible to obtain the effective dispatching knowledge in the machining process from the offline data, such as the machining sequence of the workpiece, the load of the machine, the production bottleneck, and so on [7]. Then, the dispatching knowledge is applied to the scheduling algorithm to improve its solving quality and efficiency by optimizing the algorithm’s policies and parameters. In this way, we can generate an optimal scheduling policy that is closer to the optimal solution, so as to guide the actual scheduling process of the job shop.
Dispatching rules (DRs) [8] occupy an important position in JSP: they provide effective approximate methods for resolving conflicts among jobs competing for the same machine at the same moment. These rules are usually based on the long-term experience and accumulated knowledge of scheduling experts, but in practice traditional DRs often perform unsatisfactorily, and no single rule is superior across all scenarios and metrics [9]. To address this issue, scheduling methods based on data mining algorithms have emerged, leveraging the offline production data accumulated by the job shop itself to extract effective DRs. Compared with traditional expert-knowledge-based DRs, rules extracted through data mining better reflect the actual production situation and possess higher relevance and practicality. By analyzing job sequences, processing times, machine loads, and other information in the production data, the data mining algorithm extracts implicit scheduling knowledge and generates corresponding DRs that consider factors such as job priority and machine availability while incorporating the dynamic changes and uncertainties of actual production. These DRs can effectively guide JSP solving. To further enhance solution efficiency and quality, embedding the DRs extracted by the data mining algorithm into a heuristic algorithm fully exploits their guiding role and accelerates the algorithm's convergence for real-time optimization.
Li et al. [10] first described an innovative approach to generating DRs with data mining algorithms, detailing how learning algorithms can be applied directly to production data and how data mining techniques can uncover previously unknown DRs. To overcome the excessive real-time computation that meta-heuristic algorithms face in shop-floor scheduling, Shahzad and Mebarki [11] proposed extracting If-Then shop-floor DRs offline using decision trees and successfully applied this method to a real-world online shop-floor scheduling problem. Wang et al. [12] proposed mining optimal DRs through decision trees and using neural networks to accurately predict the performance of dispatching rules. Jun and Lee [13] proposed a new method for the dynamic single-machine scheduling problem that transforms the scheduling process into training data containing the underlying scheduling decisions and then constructs dispatching rules based on decision trees; experimental results show that this method outperforms other dispatching rules on the performance index of minimizing the average total weighted delay. Zahmani et al. [14] proposed an innovative method integrating DRs, genetic algorithms, data mining algorithms, and simulation techniques; it assigns different DRs to machines in real time, effectively mitigates local optima and the unreasonable or infeasible schedules arising from inconsistency between the optimization objective function and the constraints, and improves the system's efficiency and real-time performance. Baykasoglu et al. [15] proposed a data mining algorithm for selecting DRs based on genetic programming principles, whose core objective is to decide the most appropriate scheduling plan from the specific parameters of the current production shop. Balasundaram et al. [16] used the decision tree technique to identify easily interpretable DRs from scheduling data and applied these rules to the scheduling of a job shop equipped with two machines. Wang et al. [17] proposed a decision tree DR mining strategy for JSPs that combines the branch-and-bound method with decision trees, extracting the scheduling knowledge implicit in the optimal scheduling strategy; the method demonstrated shorter completion times than other methods in real cases. Long et al. [18] adopted priority DRs to optimize simulated job-shop scheduling cases and obtained the optimal DRs for a single scheduling objective; an immune scheduling algorithm was then used to establish a control model for the selected optimal dispatching rules, realizing dynamic DR selection for multiple scheduling objectives. Huang et al. [19] proposed a PDR generation method based on graph neural networks (GNN) and reinforcement learning (RL) that self-learns and self-evolves through interaction with the scheduling environment. To combine the dynamic JSP (DJSP) and GNN closely, a disjunctive-graph-based solution representation was designed: the DJSP was expressed as a Markov decision process, and a GNN was used to fully explore the problem features of disjunctive graphs and the internal relations between vertices.
The Actor–Critic RL method was used to automatically train network parameters to optimize policies in order to schedule the best operations at each step.
According to the existing academic literature, there has been limited discussion and research on data mining algorithms for extracting dispatching rules (DRs) in the field of job-shop scheduling problems (JSP). Therefore, this paper takes the JSP as its subject of investigation and proposes a rule-based heuristic scheduling framework with minimizing the maximum completion time as the optimization performance index. The method effectively extracts dispatching knowledge from historical data and transforms it into easily interpretable If-Then rules. Considering that the beluga whale optimization algorithm (BWO) [20] is prone to local optima and slow convergence when solving the JSP, an improved beluga whale optimization algorithm (IBWO) is designed: a neighborhood search strategy based on the greedy idea is proposed, and the crossover and mutation operators of the genetic algorithm are introduced to enhance scheduling efficiency, solution quality, and local search capability. This expanded search capability aims to find a superior scheduling scheme and prevent the algorithm from becoming trapped in local optima. By combining data mining algorithms with DRs, a JSP scheduling optimization algorithm based on IBWO is developed; the mined DRs are further utilized to refine the scheduling algorithm and improve its optimization performance. Simulation experiments are conducted to validate the feasibility and effectiveness of the proposed scheduling algorithm.
2. Rule-Based Heuristic Scheduling Framework and IBWO-DDR-AdaBoost Optimization Algorithm
2.1. JSP Problem Description
The JSP [21] can be described as follows: there are n different workpieces to be processed, and each workpiece must be processed on m different machines in accordance with a specified processing sequence. Each pass of a workpiece through a machine completes one process, recorded as O_ij (the j-th process of workpiece i); each machine can process only one workpiece at a time, and a workpiece is finished once all of its processes are completed. The task of the JSP is to determine, for each process of each workpiece on its designated machine, the start time and the processing order of the workpieces on each machine, so that the scheduling result meets the expected performance indicators.
The JSP must satisfy the following constraints and assumptions (a minimal data sketch of such an instance follows this list):
All machines can be used at zero time, that is, all workpieces can start processing immediately after arriving at the machine;
All workpieces have no priority and can be processed at zero time;
Each process is processed on the designated machine, and the processing can only begin after the processing of the previous process is completed;
Each machine can only process one work piece at the same time;
Each workpiece can only be processed once on one machine;
The processing sequence and processing time of each workpiece are known in advance and remain fixed throughout scheduling;
Once any process begins processing, it is not allowed to be interrupted until the end of the process.
The preparation time and completion time of each workpiece are included in the processing time, and problems such as machine failure are not considered.
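To make the above formulation concrete, the following minimal Python sketch represents a JSP instance with the two matrices used throughout this paper (a machine-sequence matrix and a processing-time matrix) and checks the classic JSP assumption that each workpiece visits every machine exactly once. The 3 × 2 data are invented for illustration only.

```python
import numpy as np

# Hypothetical 3-workpiece x 2-machine instance (invented data, for illustration only).
# machines[i][k]: machine (0-based) on which the k-th process of workpiece i runs.
# times[i][k]:    processing time of that process (known and fixed, per the assumptions).
machines = np.array([[0, 1],
                     [1, 0],
                     [0, 1]])
times = np.array([[3, 2],
                  [2, 4],
                  [4, 3]])

n_jobs, n_mach = machines.shape

# Each workpiece visits every machine exactly once (classic JSP assumption).
for i in range(n_jobs):
    assert sorted(machines[i].tolist()) == list(range(n_mach))
```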
2.2. Rule-Based Heuristic Scheduling Framework
The present paper proposes a rule-based heuristic scheduling framework for addressing job-shop problems (JSP). This framework comprises three main components: data preprocessing, AdaBoost-CART integrated learning algorithm for mining scheduling rules, and embedding the mined rules into an enhanced beluga optimization algorithm to achieve job-shop scheduling optimization. The primary objectives of the first two components are to employ data mining algorithms in order to extract scheduling rules, while the final component aims at utilizing these rules to update and optimize solutions for scheduling problems. The input of this framework consists of a scheduling dataset composed of classical examples or workshop data, with the output being an optimal schedule. Figure 1 illustrates the structure of our rule-based heuristic approach for scheduling.
2.3. Dispatching Knowledge Mining Based on AdaBoost-CART Integrated Learning Mining Algorithm
This paper proposes the Adaptive Boosting-Classification and Regression Trees (AdaBoost-CART) algorithm, which integrates the two classification models CART [22] and AdaBoost [23]. The weak classifiers in the integrated learning model are trained using the CART classification algorithm and then iteratively optimized according to the principles of the AdaBoost integrated learning algorithm: the CART-trained classifiers serve as AdaBoost's weak classifier type, a series of weak classifiers is trained iteratively, the samples are reweighted according to each classifier's accuracy, and the weak classifiers are finally combined into a strong classifier. Its model structure is shown in Figure 2.
The core of the AdaBoost-CART scheduling knowledge mining algorithm for extracting dispatching rules is as follows: first, train the CART classification tree model on the sample set and find the optimal model using cross-validation; then use the trained CART classification tree as the weak classifier of the AdaBoost integrated classifier and iteratively optimize it according to the AdaBoost integrated learning principle to extract the scheduling rule tree. The steps of AdaBoost-CART dispatching rule mining are described below, and the algorithm flowchart is shown in Figure 3.
a. Perform data preprocessing on the scheduling dataset to determine the attributes, attribute values, and category values, and divide the data into a training set and a test set;
b. Train the CART classification tree model on the sample set, taking the attribute with the smallest Gini index as the root node of the branch;
c. On top of the root node, continue branching according to the principle of Gini index minimization and recursively build the tree; end the splitting when the training subset contains only one category;
d. Apply the obtained tree rules to the test set and observe whether the scheduling results deviate; if overfitting occurs, prune the CART tree, use cross-validation to find the optimal model, and determine and save the model parameters, repeating step c until the scheduling output of each test subset is essentially the same as that of the test set; then proceed to step e;
e. Iteratively optimize the CART classification tree model using the AdaBoost algorithm, i.e., use the CART classification tree model as a weak classifier of the AdaBoost integrated classification model;
f. Train the weak classifier using the CART algorithm;
g. Calculate the classification error rate and weighting coefficient of the weak classifier;
h. Update the classifier weight coefficients;
i. Update the sample weights;
j. Determine whether the maximum number of iterations is reached or the minimum error is satisfied (i.e., the classification error rate of the weak classifier is below the threshold); if yes, combine the weak classifiers into the final strong classifier by weighted combination and output the AdaBoost-CART tree rule diagram; otherwise, return to step f.
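A minimal sklearn-based sketch of steps a–j follows, assuming a preprocessed sample set with the three binary attributes introduced in Section 3.2.2 (Job_pt_longer, Job_rpt_longer, Job_ropn_more) and a 0/1 category value. The data here are randomly generated stand-ins, and the hyperparameter grids are illustrative, not the tuned values reported later in Tables 7 and 8.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 3))      # Job_pt_longer, Job_rpt_longer, Job_ropn_more
y = (X[:, 1] + X[:, 2] >= 1).astype(int)   # synthetic "processed first" labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Steps b-d: a CART weak learner whose depth is chosen by cross-validation
# (a simple stand-in for the pruning step).
cart = GridSearchCV(DecisionTreeClassifier(criterion="gini"),
                    {"max_depth": [2, 3, 4], "min_samples_leaf": [1, 3, 5]},
                    cv=5).fit(X_tr, y_tr).best_estimator_

# Steps e-j: AdaBoost reweights the samples each round and combines the
# CART weak classifiers into a weighted strong classifier.
# (In scikit-learn versions before 1.2 the parameter is named base_estimator.)
ada = AdaBoostClassifier(estimator=cart, n_estimators=50,
                         learning_rate=0.5, random_state=0).fit(X_tr, y_tr)
print("test accuracy:", ada.score(X_te, y_te))
```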
2.4. IBWO Scheduling Optimization Algorithm Based on AdaBoost-CART
2.4.1. Improvement of Standard Beluga Whale Optimization Algorithm
The standard BWO has good optimization ability, but it suffers from low convergence accuracy, poor adaptive ability, and a tendency to fall into local optima when solving complex optimization problems such as the NP-hard job-shop scheduling problem. In this paper, based on the original BWO algorithm, an improved beluga whale optimization algorithm (IBWO) is proposed for solving the job-shop scheduling problem with minimization of the maximum completion time as the performance index. The two improvement mechanisms are introduced below:
a. Neighborhood search strategy based on the greedy idea
In the exploitation stage, i.e., the local search stage, a neighborhood search strategy based on the greedy idea is designed to perform neighborhood search operations on the better beluga whale individuals, expanding the surveyed search space and reducing invalid searches, so as to improve scheduling efficiency and solution quality and enhance the local search ability.
b. Crossover and mutation operations of genetic operators
The crossover and mutation operations of the genetic operators are invoked to perturb the position vectors of the beluga whale individuals, expanding the search ability of the algorithm, seeking a better scheduling scheme, and preventing the algorithm from falling into a local optimum.
2.4.2. IBWO-DDR-AdaBoost Optimization Algorithm
In this study, we directly apply the proposed AdaBoost-CART integrated learning algorithm to analyze historical production data and discover new dispatching rules through data mining. The dispatching rules obtained from the accumulated shop scheduling data are then combined with the IBWO algorithm to optimize job-shop scheduling effectively by leveraging the power of data. We introduce an AdaBoost-CART-based IBWO optimization algorithm built on data mining and dispatching rules (IBWO-DDR-AdaBoost), which embeds the mined DRs into the IBWO algorithm for the real-time optimization of job-shop scheduling, resulting in improved solving speed and convergence rate. The process flow of the IBWO-DDR algorithm is illustrated in Figure 4.
The specific steps of the IBWO-DDR algorithm are as follows:
a. Set the initialization parameters and iteration termination conditions;
b. Generate the initial population using the hybrid initialization strategy dominated by the rules of the AdaBoost-CART integrated classification model, calculate the fitness value of each beluga whale individual, and record the optimal individual;
c. Perform the greedy neighborhood search strategy on the better (optimal and suboptimal) individuals, evaluate the new solutions produced, and accept a new solution if it is better than the current individual;
d. Convert the scheduling scheme into a position vector;
e. Calculate the balance factor and the whale fall probability of the BWO algorithm, update the location of each beluga whale individual at each stage, and generate a vector of random numbers corresponding to the process code;
f. Convert the position vector into a scheduling scheme and calculate the individual fitness value at each stage; if it is better than the current individual, update the individual;
g. Perform the crossover and mutation operations of the genetic operators on the selected individuals, evaluate the new solutions generated, and accept a new solution if it is better than the current individual;
h. Determine whether the update time exceeds the threshold; if so, invoke the mined AdaBoost-CART tree rules, re-select the processing order of the current beluga whale population, and compare the two fitness values; if the new fitness value is better, update the individual and jump out of the local optimum. If the threshold is not exceeded, perform the next iteration;
i. Determine whether the iteration termination condition is met; if yes, go to step j; otherwise, go back to step c;
j. End the algorithm and output the optimal scheduling result.
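The steps above can be condensed into the following runnable skeleton. The objective, position updates, and rule-based reordering are reduced to toy stand-ins (permutations with a dummy objective), so the block illustrates only the control flow of IBWO-DDR, not the actual BWO update equations or the mined tree rules.

```python
import numpy as np

rng = np.random.default_rng(1)
N, DIM, MAX_IT = 20, 12, 100            # population size, code length, iterations

def fitness(perm):                      # dummy objective standing in for the makespan
    return float(np.sum(perm * np.arange(DIM)))

def neighborhood_search(perm):          # greedy swap: keep the move only if it improves
    a, b = rng.choice(DIM, 2, replace=False)
    cand = perm.copy()
    cand[[a, b]] = cand[[b, a]]
    return cand if fitness(cand) < fitness(perm) else perm

def rule_reorder(perm):                 # stand-in for re-sequencing by the mined tree rules
    return np.sort(perm)[::-1]          # descending order minimizes the dummy objective

pop = [rng.permutation(DIM) for _ in range(N)]   # step b: hybrid init would go here
best = min(pop, key=fitness)
for it in range(MAX_IT):                         # steps c-i
    pop.sort(key=fitness)
    pop[:2] = [neighborhood_search(p) for p in pop[:2]]   # step c: elite neighborhood search
    for i, p in enumerate(pop):                           # steps e-g: perturb, keep if better
        a, b = rng.choice(DIM, 2, replace=False)
        cand = p.copy()
        cand[[a, b]] = cand[[b, a]]
        if fitness(cand) < fitness(p):
            pop[i] = cand
    if it % (MAX_IT // 4) == MAX_IT // 4 - 1:             # step h: periodic rule invocation
        cand = rule_reorder(pop[0])
        if fitness(cand) < fitness(pop[0]):
            pop[0] = cand
    best = min([best] + pop, key=fitness)                 # steps i-j: track the optimum
print("best dummy fitness:", fitness(best))
```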
(1). Encoding mechanism
In the job-shop scheduling problem, process-based encoding is used in this paper because of the simplicity and intuition of the workpiece-order-based encoding approach. The code consists of workpiece numbers, the total length of the code equals the total number of processes, and the k-th occurrence of a workpiece number represents the k-th process of that workpiece. Assume the workshop contains three workpieces, each with two processes; the process-based encoding is shown in Figure 5.
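The sketch below decodes a process-based code of the kind in Figure 5 into a semi-active schedule and its maximum completion time, reusing the invented 3 × 2 instance from Section 2.1; the example code is illustrative, not the one in the figure.

```python
import numpy as np

machines = np.array([[0, 1], [1, 0], [0, 1]])   # hypothetical 3x2 instance
times    = np.array([[3, 2], [2, 4], [4, 3]])

def decode(code):
    """code: workpiece numbers; the k-th occurrence of workpiece j is its k-th process."""
    n_jobs, n_mach = machines.shape
    op_count   = np.zeros(n_jobs, dtype=int)   # next process index per workpiece
    job_ready  = np.zeros(n_jobs)              # completion time of a job's last process
    mach_ready = np.zeros(n_mach)              # completion time of a machine's last process
    for j in code:
        k = op_count[j]
        m = machines[j, k]
        start = max(job_ready[j], mach_ready[m])   # precedence + machine constraints
        job_ready[j] = mach_ready[m] = start + times[j, k]
        op_count[j] = k + 1
    return job_ready.max()                     # maximum completion time (makespan)

print(decode([0, 1, 2, 0, 1, 2]))   # e.g. a code [1 2 3 1 2 3], written 0-based here
```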
(2). Transformation mechanism of the position vector
The position update in the BWO algorithm produces continuous values, while a JSP scheduling solution is discrete; BWO therefore cannot be applied directly to the JSP, and a conversion mechanism is needed between the algorithm's solutions in continuous space and the discrete scheduling solutions. In this paper, we use the ascending-order Ranked Order Value (ROV) rule [24] to map solutions between continuous space and discrete space.
In the conversion from discrete space to continuous space, a random value is generated for each process (the code length equals the total number of processes), and the ROV rule then assigns an ROV value to each process, i.e., the random numbers are ranked from smallest to largest; for example, the smallest random number in the figure, −2.8, corresponds to an ROV value of 1, and the largest, 2.4, corresponds to an ROV value of 6. Next, the ROV values are assigned to the corresponding process codes in the order of the process numbers. The random numbers corresponding to the adjusted ROV values form the updated position vector (i.e., traverse the process numbers from left to right and select the random values at the corresponding positions of the process code), which converts the scheduling scheme into a position vector, as shown in Figure 6.
Figure 6. Conversion of process codes to position vectors.
In the transformation from continuous space to discrete space, the position index is first assigned to the position vector; the ROV values are then sorted, and the process numbers corresponding to the position indices of the sorted ROV values are selected from left to right. For example, if the ROV value is 6 and the process number at the same position index is 3, that process is placed accordingly; the result is then converted into the process code, which converts the position vector into a scheduling scheme. The specific conversion method is shown in Figure 7.
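One compact realization of the two ROV conversions uses np.argsort; the random values below match the −2.8 and 2.4 examples of Figure 6 only loosely, since the full figure values are not reproduced here, and the operation slot list is an assumption for a 3 × 2 shop.

```python
import numpy as np

pos = np.array([-2.8, 0.7, 2.4, -1.1, 1.5, 0.2])  # continuous position vector
rov = np.argsort(np.argsort(pos)) + 1             # ROV: -2.8 -> 1, 2.4 -> 6
print(rov)                                        # [1 4 6 2 5 3]

# Continuous -> discrete: dispatch the operation slots in ascending ROV order.
slots = np.array([1, 1, 2, 2, 3, 3])              # operation slots of a 3x2 shop
code = slots[np.argsort(pos)]                     # a valid process code
print(code)                                       # [1 2 3 1 3 2]

# Discrete -> continuous: place freshly sorted random values so that their
# ROV ordering reproduces the same dispatch sequence.
vals = np.sort(np.random.default_rng(0).uniform(-3, 3, 6))
new_pos = np.empty(6)
new_pos[np.argsort(pos)] = vals                   # same ordering, fresh values
```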
(3). Mixed initial population strategy
The standard BWO initializes its population randomly. Although random initialization is simple and easy to use, the resulting population quality is low, which leads to slow early convergence and low search efficiency. Moreover, in the optimization process of the BWO algorithm, the quality of the individuals has a great influence on the accuracy of the algorithm. When solving medium- and large-scale problems, it is usually necessary to increase the population size and the number of iterations, which increases computation time, slows convergence, and still cannot guarantee solution quality.
Integrating the scheduling knowledge of the AdaBoost-CART classification tree model enables a rapid and straightforward initialization of the process-pair ordering at each decision point on each machine. The rules derived from the AdaBoost-CART tree determine the attribute values used to compare processes, thereby establishing their execution order and generating an ordered set of process pairs. However, since the extracted rules may not cover all scheduling instances, multiple processes with equal priority can arise simultaneously; in such cases, additional priority dispatching rules such as shortest processing time (SPT) and longest work remaining (LWR) are combined to determine the final order within a group of processes. To improve the quality of the initial beluga whale population, this paper designs a hybrid population initialization method that combines random initialization, the scheduling knowledge mined by the data mining algorithm, and the priority dispatching rules: 60% of the beluga whales in the population are initialized using the AdaBoost-CART tree rules, 20% using the priority dispatching rules, and 20% using random initialization. Equal-probability random initialization distributes the initial search points uniformly over the whole solution space to ensure the algorithm's global search range, and the hybrid initialization strategy enriches the diversity of the BWO population, further improving the search quality and efficiency of the optimization algorithm.
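A sketch of the 60/20/20 hybrid initialization follows; the rule-guided individuals are approximated here by a noisy SPT-style sort, a hypothetical stand-in for querying the AdaBoost-CART tree and the combined priority rules.

```python
import numpy as np

rng = np.random.default_rng(0)
POP, DIM = 70, 30
proc_times = rng.uniform(1, 10, DIM)               # toy per-operation times

def rule_individual():      # stand-in for an AdaBoost-CART-rule-guided ordering
    noise = rng.normal(0, 0.5, DIM)
    return np.argsort(proc_times + noise)          # roughly SPT order, with variety

def spt_individual():       # stand-in for a priority-dispatching-rule ordering
    return np.argsort(proc_times)                  # pure SPT order

population  = [rule_individual() for _ in range(int(0.6 * POP))]   # 60% rule-based
population += [spt_individual() for _ in range(int(0.2 * POP))]    # 20% priority rules
population += [rng.permutation(DIM)                                # 20% random
               for _ in range(POP - len(population))]
```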
(4). Fitness function
In the beluga whale optimization algorithm, the fitness function evaluates each individual to determine its degree of superiority in the search space. The optimization objective selected in this paper is the minimization of the maximum completion time: the maximum completion time is the largest completion time over all processes in the workshop, and the smaller it is, the more ideal the schedule. Therefore, f is used to represent the fitness function, as in Equation (1); the smaller the value of f, the better the fitness, and the algorithm treats individuals with smaller fitness values as superior.
f = Cmax = max{Ci}, i = 1, 2, ..., n (1)
where Ci denotes the completion time of workpiece i.
(5). Neighborhood search strategy based on the greedy idea
BWO approaches the best individual mainly by sharing individual position information in the local search phase. To prevent the best individual from falling into a local optimum when solving the JSP and failing to obtain a good scheduling solution, this paper designs a greedy neighborhood search strategy for the high-quality (optimal and suboptimal) individuals, which searches the neighborhood of the best individuals for better solutions and thus expands the surveyed search space. The specific neighborhood operations are as follows:
a. Neighborhood operation N1: select any two positional elements in the process ordering that correspond to processes of different workpieces, and exchange the selected elements.
b. Neighborhood operation N2: select a positional element in the process ordering and exchange it with its adjacent element; if the first selected element is the last element, re-select the first element.
c. Neighborhood operation N3: select any two positional elements in the process ordering and insert the latter element into the position immediately before the former.
The greedy idea is combined with these neighborhood operations. Specifically, during the iteration of the algorithm, the neighborhood transformations are carried out on the selected individuals one by one, the new solutions are evaluated to determine whether they are acceptable, and accepted new solutions are synchronously updated into the population. In particular, by setting a repetition parameter, the strategy is executed repeatedly to enhance the local search capability until the iteration termination condition is met. The greedy neighborhood search strategy is shown in Figure 8.
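The three moves can be written compactly as follows, operating on an operation-based code (workpiece labels repeated once per process); this is a sketch under the interpretation of N1–N3 given above, not the paper's exact implementation.

```python
import numpy as np
rng = np.random.default_rng(0)

def n1_swap(code):
    """N1: swap two positions that belong to different workpieces."""
    c = code.copy()
    while True:
        a, b = rng.choice(len(c), 2, replace=False)
        if c[a] != c[b]:                   # must be processes of different workpieces
            c[a], c[b] = c[b], c[a]
            return c

def n2_adjacent_swap(code):
    """N2: swap a position with its successor (the last position is never drawn)."""
    c = code.copy()
    a = rng.integers(0, len(c) - 1)
    c[a], c[a + 1] = c[a + 1], c[a]
    return c

def n3_insert(code):
    """N3: remove the latter of two positions and insert it before the former."""
    c = list(code)
    a, b = sorted(rng.choice(len(c), 2, replace=False))
    c.insert(a, c.pop(b))
    return np.array(c)

code = np.array([1, 2, 3, 1, 2, 3])
print(n1_swap(code), n2_adjacent_swap(code), n3_insert(code))
```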
(6). Genetic operator operation
In view of the strong global search ability of the genetic algorithm, the crossover and mutation operators are introduced into the beluga whale optimization algorithm to perturb the process code, extending the search ability of the algorithm, further enhancing its computational performance, and seeking a better scheduling scheme. The specific operations are as follows:
a. Crossover operation
The crossover operation adopts the classical single-point crossover [25], a common operation in genetic algorithms that generates two new offspring by choosing a random position, cutting the chromosomes of two parent individuals at that position, and exchanging the cut segments. First, a proportional selection strategy (also known as roulette selection) computes the selection probability, select_prob, from the fitness values, ensuring that fitter individuals are more likely to be chosen. The numpy function np.exp() is used to convert the fitness values into probabilities: to ensure numerical stability, each fitness value is divided by the maximum absolute fitness value and multiplied by a constant factor of 6, which acts as the temperature of the softmax function and controls how evenly the output probability distribution is spread. Dividing each exponentiated value by their sum then yields a probability distribution, as in Equations (2) and (3).
s_i = exp(6 · f_i / max_j |f_j|) (2)
select_prob_i = s_i / Σ_j s_j (3)
Subsequently, the np.random.choice() function selects eight parent individuals from the population according to the selection probabilities in each iteration. The selected pairs undergo the crossover operation, preceded by a check that sufficient individuals are available for pairing. If a randomly generated number is below the crossover probability, crossprob, the values at a randomly selected feature position (the crosspoint) are exchanged between the two individuals, creating new offspring. If a newly derived individual exhibits better fitness than its parent before crossover, its fitness value is updated and it is recorded as part of the new solution.
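A sketch of this selection and single-point crossover follows. The temperature constant 6 and the use of np.exp follow the text; the population and fitness values are invented, and the makespan is negated here so that smaller (better) values receive larger selection probabilities, an assumption about the sign convention.

```python
import numpy as np
rng = np.random.default_rng(0)

pop = [rng.permutation(6) + 1 for _ in range(10)]
fit = np.array([float(np.sum(p * np.arange(6))) for p in pop])  # toy makespans

# Softmax-style selection probability (Equations (2)-(3)); negating the makespan
# makes better individuals more likely to be chosen.
scaled = 6 * (-fit) / np.max(np.abs(fit))
select_prob = np.exp(scaled) / np.sum(np.exp(scaled))

idx = rng.choice(len(pop), size=8, replace=False, p=select_prob)
parents = [pop[i] for i in idx]

crossprob = 0.5
for p1, p2 in zip(parents[0::2], parents[1::2]):
    if rng.random() < crossprob:
        crosspoint = rng.integers(1, len(p1))        # single-point crossover
        c1 = np.concatenate([p1[:crosspoint], p2[crosspoint:]])
        c2 = np.concatenate([p2[:crosspoint], p1[crosspoint:]])
        # offspring would next be repaired/checked and kept only if fitter
```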
b. Mutation operation
In this paper, we adopt basic-site variation, i.e., the point mutation operator [26], to mutate individuals in the current population, which enhances their local search ability and lets the algorithm better explore different regions of the solution space. For each individual, np.random.uniform() generates a random number in the interval [0, 1] to determine whether mutation is required: if the random value is less than the mutation probability, mutprob, a feature position of the individual, mutpoint (the mutation point), is randomly selected through np.random.choice() and mutated to form a new individual.
Finally, the check() function verifies whether the corresponding workpiece processes satisfy the procedure conditions. If they do, the fitness value of the mutated individual is evaluated, and if it is better than that of the original individual, the fitness value is updated and the new solution is recorded. If the conditions are not satisfied, a new random value is generated, compared against the mutation probability again, and a new feature position is selected for mutation.
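A matching sketch of the point mutation with a feasibility check follows; check() here simply verifies that every workpiece still appears the correct number of times, a simplified stand-in for the procedure-condition check described in the text.

```python
import numpy as np
rng = np.random.default_rng(0)

def check(code, n_jobs, n_ops):
    """Simplified feasibility test: each workpiece appears exactly n_ops times."""
    counts = np.bincount(code - 1, minlength=n_jobs)
    return bool(np.all(counts == n_ops))

def mutate(code, mutprob=0.5, n_jobs=3, n_ops=2, max_tries=20):
    c = code.copy()
    for _ in range(max_tries):
        if rng.uniform(0, 1) >= mutprob:
            return c                               # no mutation this time
        cand = c.copy()
        mutpoint = rng.choice(len(cand))           # feature position to mutate
        cand[mutpoint] = rng.integers(1, n_jobs + 1)
        if check(cand, n_jobs, n_ops):             # keep only feasible codes;
            return cand                            # fitness comparison follows
    return c                                       # give up after max_tries re-draws

print(mutate(np.array([1, 2, 3, 1, 2, 3])))
```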
(7). Dispatching rule optimization strategy
The IBWO optimization algorithm reintegrates the mined AdaBoost-CART tree rules to achieve optimal scheduling results. Specifically, the AdaBoost-CART tree rules are invoked to re-select the processing order of the individuals in the beluga whale population, and their fitness values are updated whenever a superior solution is found, thereby avoiding local optima. To expedite the convergence of IBWO-DDR-AdaBoost, trigger conditions for the dispatching rule optimization are established: at specific update intervals during the iteration, if the elapsed update time exceeds a threshold (a multiple of one quarter of the total iteration cycle), the mined scheduling rules are called. This approach facilitates individual renewal and the stepwise optimization of the IBWO population while reducing the total number of iterations required to reach an optimal solution.
3. Results and Discussion
3.1. Simulation Environment Construction
The simulation experiments in this section were carried out on the Windows 11 operating system with an Intel i5-10200H processor (Santa Clara, CA, USA) with an acceleration frequency of 2.4 GHz and 8 GB of memory; the algorithms were written in Python (version 3.9) in the PyCharm environment. To evaluate the IBWO-DDR-AdaBoost scheduling algorithm and obtain accurate and reliable comparison results, each algorithm was independently run 10 times to obtain the optimal value, and the average (Avg), standard deviation (Std), and deviation were calculated. After several simulation tests, the experimental parameters were determined as follows: all algorithms use the same maximum iteration number (300) and the same population size (70); in the mixed-initialized population, the individuals initialized by the AdaBoost-CART rules (W1) account for 60% of the total, those initialized by the combined dispatching rules (W2) for 20%, and those initialized randomly (W3) for 20%. The maximum number of iterations in the neighborhood search is 5, and the crossover and mutation probabilities of the process code in the genetic operators are both set to 0.5.
3.2. Representation of Dispatching Rules
3.2.1. Selection of Scheduling Sample Set
The classical LA01 instance was chosen as the scheduling dataset. Table 1 presents the processing data, process constraints, and other information of the LA01 instance. For production scheduling data mining, prior knowledge of the workpiece sorting results is essential; it must be obtained from a successful instance or algorithm, and here it is taken from the optimal scheduling Gantt chart of LA01. Static scheduling datasets are then mined, and the discovered rules are applied to dynamic production scheduling to realize the three processes of "scheduling rule mining, production scheduling optimization, and job shop management". Figure 9 illustrates the Gantt chart of the optimal schedule for the LA01 instance. The LA01 instance includes ten workpieces and five machining stages, each involving one machine. The machining sequence of the workpieces is given in the table; for instance, the machining sequence of workpiece 1 is machine 2, machine 1, machine 5, machine 4, and machine 3. The working time of each workpiece on the corresponding machine is also given in the table.
3.2.2. Select Performance Indicators and Data Representation
The selection of scheduling attributes [27] is the basis for data mining algorithms to mine dispatching rules, which are designed to reflect the essential characteristics of job-shop scheduling in historical production scheduling data sets. The essence of job-shop scheduling is to arrange the sequential processing order of the workpieces waiting for the same machine at the same time. The common dispatching rules [28] involved in job-shop scheduling are shown in Table 2.
Based on the processing data, process constraints, and other information of the LA01 instance in Table 1, and taking minimization of the maximum completion time as the scheduling optimization performance index, three dispatching rules from Table 2 (SPT, LWR, and LOR) are selected to form the historical scheduling rule base of the instance, so as to determine the attributes, attribute values, and category values [29], as shown in Table 3.
The specific description of each of the above attribute values is as follows:
PT: the processing time on the machine for a particular process for a particular workpiece;
RPT: the sum of the times of all subsequent unprocessed processes when a particular workpiece is scheduled;
ROPN: the number of subsequent unprocessed processes when a process is scheduled for a particular workpiece.
For multiple processes waiting to be processed on the same machine, their PT, RPT, and ROPN attribute values are compared pairwise; after attribute preprocessing, the following new attributes are obtained:
Job_pt_longer (which of the two competing processes on the same machine has the longer processing time);
Job_rpt_longer (which of the two competing processes on the same machine has the longer remaining processing time);
Job_ropn_more (which of the two competing processes on the same machine has more remaining operations).
If two processes O_i and O_j can both be processed at a certain moment, their PT, RPT, and ROPN attributes are compared according to the scheduling dataset and the historical scheduling rule base. Assuming the PT value of O_i is greater than that of O_j, the Job_pt_longer attribute of O_i is recorded as 1; if the attribute values are equal, both are uniformly recorded as 0. Because the SPT rule is applied, O_i will then be processed later, i.e., its category value is 0; the other two attributes are handled in the same way. In the sorting result, the process that is processed first is recorded as category yes (abbreviated as 1) and the process that is processed later as category no (abbreviated as 0). For example, at moment 0 on machine M2, the processes waiting to be processed are process 1 of workpiece 1 (O_11) and process 1 of workpiece 4 (O_41). Their processing time PT, remaining processing time RPT, and remaining number of processes ROPN are calculated as shown in Table 4, and the values obtained after discretizing the continuous data are shown in Table 5.
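The attribute construction can be sketched as follows for two competing processes; the PT/RPT/ROPN numbers below are placeholders, not the actual Table 4 values.

```python
def make_sample(p1, p2):
    """p1, p2: dicts with PT, RPT, ROPN for two processes waiting on one machine.
    Returns the binary attributes of the sample for p1 (1 if greater, 0 otherwise)."""
    def longer(a, b):          # equal values are uniformly recorded as 0
        return 1 if a > b else 0
    return {
        "Job_pt_longer":  longer(p1["PT"],   p2["PT"]),
        "Job_rpt_longer": longer(p1["RPT"],  p2["RPT"]),
        "Job_ropn_more":  longer(p1["ROPN"], p2["ROPN"]),
    }

# Hypothetical O11 vs O41 on machine M2 at moment 0 (placeholder values):
o11 = {"PT": 21, "RPT": 200, "ROPN": 4}
o41 = {"PT": 36, "RPT": 190, "ROPN": 4}
print(make_sample(o11, o41))  # {'Job_pt_longer': 0, 'Job_rpt_longer': 1, 'Job_ropn_more': 0}
```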
3.2.3. Construct Sample Data Set
Two processes O_i and O_j were randomly selected from the scheduling data of the LA01 instance. For each case of deciding which of the two processes assigned to the same available machine is processed first, the scheduling sample set was preprocessed according to the sorting results and the scheduling process information, yielding the scheduling sample training set shown in Table 6. Based on this sample set, the AdaBoost-CART scheduling mining algorithm was used to construct the tree dispatching rules.
3.2.4. Rule Scheduling AdaBoost-CART Tree Rule
The training set data from Table 6 were imported into the PyCharm environment. First, the sklearn library [30] was used to split the data into training and test sets at a 4:1 ratio (test_size = 0.2) using the train_test_split function. Second, the weak-classifier CART tree model was trained on the training set, and its tree rules were applied to the test set to evaluate any potential bias in the scheduling results. If deviations occurred, the CART tree was pruned and cross-validated to find an optimal model, whose parameters were determined and saved, until consistent scheduling results were achieved across all test subsets. These optimal models and their parameters then served as the AdaBoost weak classifiers; the saved parameter values are presented in Table 7 (parameters obtained through CART cross-validation).
Next, the grid search method was used to optimize the two parameters of the number of classifiers and the learning rate in the AdaBoost integrated classification model. The optimal parameter combination of the optimized AdaBoost integrated classification model is shown in Table 8.
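This grid search can be reproduced with sklearn as sketched below; the grid values and synthetic labels are illustrative, while the tuned combination actually used in the experiments is the one reported in Table 8.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 3))          # stand-in attribute samples
y = (X[:, 1] + X[:, 2] >= 1).astype(int)       # stand-in labels

# Optimize the number of weak classifiers and the learning rate by grid search.
# (In scikit-learn versions before 1.2 the parameter is named base_estimator.)
grid = GridSearchCV(
    AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=3)),
    {"n_estimators": [25, 50, 100], "learning_rate": [0.1, 0.5, 1.0]},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_)
```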
After determining the parameters of the weak classifiers as well as the parameters of the AdaBoost integrated learning model, the CART weak classifiers were iteratively optimized using the principles of the AdaBoost algorithm, and when the maximum number of iterations and the minimum error of the classifiers were satisfied, the weak classifiers, CART, were weighted and linearly combined to form a strong classifier, and the tree dispatching rules of the AdaBoost-CART classification model were outputted.
In the AdaBoost-CART classification tree rule in Figure 10, the True and False branches are opposites because the weak-classifier CART model is a binary classification tree and each split tests whether an attribute value is less than or equal to 0.5. This leads to the "If-Then" combined dispatching rules shown in Table 9.
Figure 10 shows the optimal scheduling rule graph of the AdaBoost-CART classification tree, and Figure 11 shows the feature importance ranking of the LA01 instance. From Figures 10 and 11, it can be seen that the root-node attribute for the LA01 instance is Job_ropn_more, indicating that when two processes of two workpieces compete for the same machine, the number of remaining unprocessed processes is the most important attribute to compare; it ranks first in the feature importance graph. The next most important attribute, at the second leaf node, is Job_rpt_longer, and the attribute Job_pt_longer is considered last compared with the other two.
3.3. Case Study
The simulation analysis in this paper is based on a comprehensive evaluation of different algorithms for solving various JSP examples, aiming to facilitate algorithm comparison. This can be further divided into the following three sections:
In the first section of the simulation, to validate the efficiency of the IBWO-DDR-AdaBoost scheduling algorithm, we embedded the AdaBoost-CART tree scheduling rules discovered in the preceding section into the IBWO algorithm. In addition, we used the decision tree classification model (DT) [31] and the Random Forest (RF) integrated classification model [32] to mine effective scheduling knowledge, i.e., to extract tree-like scheduling rules from the sample training set of the LA01 instance. We thus obtained the rule-embedded scheduling algorithms IBWO-DDR-AdaBoost, IBWO-DDR-DT, and IBWO-DDR-RF. For minimizing the maximum completion time on the classical LA11 instance, the processing attributes related to the performance index were selected. In the analysis of this section's simulation results, a comparative convergence curve for each algorithm is presented along with Gantt charts of the LA11 schedules solved by each algorithm, followed by a comprehensive analysis of the convergence and solution results of all the algorithms.
In the simulation conducted in the second section, we selected LA17, LA22, LA26, LA33, and LA36 from a range of medium- to large-scale complex examples with an increasing number of machining workpieces and machines. This selection was made to verify the universality and high efficiency of the IBWO-DDR-AdaBoost rule heuristic algorithm in solving JSP problems. The simulation results provide a convergence curve comparison diagram for these examples, while introducing evaluation indices to analyze the accuracy of each algorithm’s solutions. All the data required for the example are from reference [33].
The third section selects 40 JSP benchmark test instances to evaluate the universality and efficiency of the IBWO-DDR-AdaBoost rule heuristic algorithm. A deviation measure (Dev) is introduced to assess the proximity of each algorithm to the standard optimal solution, i.e., to analyze the deviation of the optimal solution obtained by each algorithm from the known optimal solution of the standard case. The simulation results provide a comparative analysis of the optimal solution, average value, standard deviation, and deviation for the IBWO-AdaBoost, IBWO-DT, IBWO-RF, IBWO, and BWO algorithms. Finally, a box plot illustrates the range of deviations exhibited by the different algorithms over these 40 instances.
3.3.1. Simulation Analysis of LA11 Typical Example
Table 10 shows the processing data, process constraints, and other relevant information for the LA11 example. The LA11 example includes 20 machining workpieces and five machining stages. In each stage, one machining machine is involved. The machining sequence of workpieces is given in the table. For instance, the machining sequence of workpiece 1 is machine 3, machine 2, machine 1, machine 4, and machine 5. The working time of each workpiece on the corresponding machine is also given in the table. For example, the first row of data indicates that workpiece 1 is first processed in 34 units of time on machine 3, then 21 units of time on machine 2, and so on, and finally 95 units of time on machine 5.
The machine constraint matrix and the working time matrix can be obtained from the processing information of the LA11 instance in Table 10; the transposed machine constraint matrix is given as Equation (4).
[machine constraint matrix omitted; derived from Table 10] (4)
Each column of the machine constraint matrix represents the machining sequence of one workpiece on the machines. For example, the third column, "1 2 3 5 4", means that the first process of workpiece 3 is processed on machine 1, followed in order by machines 2, 3, and 5, with the last process completed on machine 4. Similarly, the working time matrix is given as Equation (5).
[working time matrix omitted; derived from Table 10] (5)
Each column of the working time matrix in Equation (5) represents the working times of the processes of one workpiece. For example, the fifth column, "83 37 34 19 64", indicates that the working times of the five processes of workpiece 5 are 83, 37, 34, 19, and 64 units of time, each on its designated machine. Modelling of the LA11 instance then starts from the machine constraint matrix and the working time matrix. The modelling process follows the IBWO-DDR-AdaBoost scheduling algorithm: workpiece encoding, the transformation mechanism, mixed population initialization, fitness calculation, neighborhood search, genetic operator crossover and mutation, and the other operations.
For the LA11 instance, the IBWO-DDR scheduling algorithms, the IBWO algorithm without embedded rules, and the standard BWO algorithm were each run 10 times, and the average of the 10 results is taken as the final outcome; the simulation results are compared to demonstrate convergence (Figure 12). Additionally, Figure 13, Figure 14, Figure 15, Figure 16 and Figure 17 present the optimal scheduling Gantt charts of the IBWO-DDR-AdaBoost, IBWO-DDR-DT, IBWO-DDR-RF, IBWO, and standard BWO algorithms, respectively. These figures plot machine number on the vertical axis against completion time on the horizontal axis.
Based on Figure 12, in conjunction with Figure 13, Figure 14, Figure 15, Figure 16 and Figure 17, it is evident that the IBWO-DDR algorithm incorporating the AdaBoost-CART rules achieves the optimal solution of 1222 time units within 44 generations. The scheduling algorithm using the decision tree (DT) rules attains its optimal solution at around 65 generations, while the Random Forest (RF) rules reach their best result at around generation 110. IBWO converges after the 140th generation with an optimal solution of 1273 units, deviating from the known optimum of the instance by nearly 50 units. The standard BWO reaches convergence at generation 120 with a current optimal solution of 1428 time units, which is still not optimal. It can be concluded that BWO augmented with the neighborhood search strategy and the genetic operators outperforms standard BWO in optimization speed and convergence, and that embedding the data-mined combinatorial dispatching rules further accelerates the optimization speed, convergence, and accuracy of the improved BWO algorithm.
3.3.2. Simulation Analysis of JSP Examples under Different Scales
In order to demonstrate the enhanced applicability of IBWO-DDR-AdaBoost in shop floor scheduling problems across different scales, this section selects LA17 (10 × 10), LA22 (15 × 10), LA26 (20 × 10), LA33 (30 × 10), and LA36 (15 × 15) as representative examples that progressively increase in terms of workpiece and machine numbers. These examples range from medium-sized to large-scale and complex scenarios, aiming to validate the effectiveness and efficiency of the IBWO-DDR-AdaBoost algorithm. The convergence curves of IBWO-DDR, compared with those of IBWO and BWO algorithms, are illustrated in Figure 18, Figure 19, Figure 20, Figure 21 and Figure 22. Additionally, Table 11 presents the optimal solution as well as the mean and standard deviation obtained by each algorithm, serving as indicators for evaluating their solution quality and performance.
From Figure 18, Figure 19, Figure 20, Figure 21 and Figure 22, it can be observed that the IBWO scheduling algorithm based on data mining and dispatching rules achieves near-optimal solutions in most cases, performing comparably to the known optima of the standard instances. Although the IBWO algorithm without embedded rules shows slightly improved search capability, its overall optimum-seeking ability and convergence are slightly inferior to those of IBWO-DDR. Moreover, as the instance size increases, the standard BWO optimization algorithm gradually loses convergence and occasionally gets trapped in local optima, with individual BWO runs failing to search globally. Consequently, the improved BWO algorithm with embedded combinatorial dispatching rules demonstrates greater stability and significantly improves solution speed and convergence compared with the other two algorithms.
The analysis of Table 11 reveals that the IBWO scheduling algorithm integrated with the AdaBoost-CART dispatching rules consistently achieves the optimal solutions (marked in bold) in all instances. Furthermore, its average values are smaller than those achieved by the other two IBWO scheduling algorithms embedded with different dispatching rules. Compared with the rule-embedded IBWO scheduling algorithms, both the IBWO and standard BWO algorithms exhibit slightly lower solution effectiveness.
3.3.3. Comparative Analysis of Solution Results under 40 Benchmark Test Cases
To further verify the universality and effectiveness of the IBWO-DDR-AdaBoost algorithm proposed in this paper, 40 benchmark instances (LA01-LA40) were used. To reflect the accuracy of the algorithm, the evaluation index of deviation was added: the deviation Dev is the degree of deviation between the optimal solution C* found by the algorithm and the known optimal solution C_opt. The formula is as follows:
Dev = (C* − C_opt) / C_opt × 100% (6)
The experimental parameter settings in this section are the same as those in Section 3.1. The solution results and evaluation indicators of the experiment are shown in Table 12, Table 13, Table 14, Table 15 and Table 16. Figure 23 shows the corresponding deviation box diagram of IBWO-DDR, IBWO, and BWO algorithms in a total of 40 examples from LA01 to LA40. In the following table, n is the number of workpieces processed and m is the number of machines.
After multiple instance validations, the experimental results in Table 12 and Table 13 show that for small- and medium-sized instances (LA01 to LA16), the three rule-embedded scheduling algorithms, IBWO-AdaBoost, IBWO-DT, and IBWO-RF, exhibit negligible differences among each other: all of them reach the optimal solutions or results that closely approximate them. Moreover, in certain instances, the optimal solution obtained by IBWO surpasses that achieved by the standard BWO algorithm.
However, for large and complex instances, as shown in Table 14, Table 15 and Table 16, the IBWO algorithm with embedded combinatorial dispatching rules exhibits superior performance compared with both the standard BWO algorithm and the plain IBWO approach. Furthermore, the optimal solution obtained by the IBWO-AdaBoost algorithm with embedded AdaBoost-CART rules closely approximates the known optimum of the standard instance. In terms of average solution result (Avg), standard deviation (Std), and deviation, the IBWO-AdaBoost algorithm outperforms the other rule-embedded scheduling algorithms, IBWO-DT and IBWO-RF, across all benchmark problems.
Considering the optimal solution of each benchmark problem, the IBWO-AdaBoost algorithm reaches the known optimal solution in 34 out of 40 benchmark cases, while IBWO-DT achieves the best value in 14 cases, IBWO-RF in 19 cases, and IBWO in only 3. Taking the mean value as the benchmark, IBWO-AdaBoost outperforms the others in 25 instances, whereas IBWO-DT and IBWO-RF lead in only a few. Taking the standard deviation as the benchmark, the three rule-embedded scheduling algorithms (IBWO-AdaBoost, IBWO-DT, and IBWO-RF) achieve superior results in 33 instances. These findings demonstrate that the improved beluga whale optimization algorithm embedded with mined combinatorial dispatching rules enhances the overall search ability by quickly identifying optimal regions of the search space, and that the embedding automatically regulates the global influence on population updates, further improving the algorithm's solution capability.
Figure 23 describes the deviation of each algorithm for solving examples at different scales.
4. Conclusions
This paper presents a rule-heuristic scheduling framework for solving the job-shop scheduling problem (JSP), taking into account the discrete, diverse, complex, and hidden nature of production data. The framework mines effective dispatching rules and integrates them into a heuristic algorithm to optimize solutions.
The second contribution of this study is the AdaBoost-CART scheduling knowledge mining algorithm, which extracts scheduling knowledge from historical data and transforms it into effective dispatching rules (DRs), resulting in the construction of a DR library. Multiple classification models are employed to extract various tree-like DRs for workshop instances of different problem sizes. To address the limitations the beluga whale optimization algorithm faces in solving job-shop scheduling problems, namely susceptibility to local optima and slow convergence, we propose the improved beluga whale optimization (IBWO) algorithm, which enhances both global and local search capabilities. The extracted tree-like rules are then effectively fused and applied to solve complex job-shop scheduling problems, and their effectiveness and practicality are validated through simulation experiments.
The subsequent work aims to conduct an in-depth investigation into the challenges associated with random machine failures, job-shop scheduling under multi-objective optimization, and the extraction of flexible shop DRs. Our objective is to enhance the accuracy and convergence performance of the scheduling algorithm solution.
Methodology, formal analysis, software, writing—original draft preparation, Z.Z.; data curation, validation, supervision, X.J.; methodology, funding acquisition, writing—review and editing, Y.W. All authors have read and agreed to the published version of the manuscript.
Data are contained within this article.
The authors declare no conflicts of interest.
Figure 3. Flow chart of the AdaBoost-CART dispatching knowledge mining algorithm for extracting dispatching rules.
Figure 9. Gantt chart of the LA01 instance, showing the optimal schedule obtained by the heuristic algorithm.
Figure 12. Comparison of the convergence curves of IBWO-DDR, IBWO, and BWO on the LA11 instance. The horizontal axis is the number of iterations, and the vertical axis is the maximum completion time achieved by each algorithm.
Figure 13. Gantt chart of LA11 based on the IBWO-AdaBoost scheduling algorithm. The horizontal axis is the completion time, and the vertical axis lists the machines (machine 1 to machine 5); the chart shows the resulting arrangement of each workpiece's operations on the machines.
Figure 14. Gantt chart of LA11 based on the IBWO-DT scheduling algorithm (axes as in Figure 13).
Figure 15. Gantt chart of LA11 based on the IBWO-RF scheduling algorithm (axes as in Figure 13).
Figure 16. Gantt chart of LA11 based on the IBWO scheduling algorithm (axes as in Figure 13).
Figure 17. Gantt chart of LA11 based on the BWO scheduling algorithm (axes as in Figure 13).
Figure 18. Comparison of the convergence curves of IBWO-DDR, IBWO, and BWO on the LA17 instance. The horizontal axis is the number of iterations, and the vertical axis is the maximum completion time achieved by each algorithm.
Figure 19. Comparison of the convergence curves of IBWO-DDR, IBWO, and BWO on the LA22 instance (axes as in Figure 18).
Figure 20. Comparison of the convergence curves of IBWO-DDR, IBWO, and BWO on the LA26 instance (axes as in Figure 18).
Figure 21. Comparison of the convergence curves of IBWO-DDR, IBWO, and BWO on the LA33 instance (axes as in Figure 18).
Figure 22. Comparison of the convergence curves of IBWO-DDR, IBWO, and BWO on the LA36 instance (axes as in Figure 18).
Figure 23. Comparison of deviation box plots for the LA01 to LA40 instances. The horizontal axis gives the name of each algorithm, and the vertical axis covers the LA01–LA40 instances of different scales. The IBWO-AdaBoost scheduling algorithm shows a compact box plot, with most data concentrated within a narrow range and the median positioned towards the lower end, indicating small deviation values and high precision in the results. The IBWO-RF algorithm follows closely in terms of stability, showing consistent performance. The IBWO-DT scheduling algorithm exhibits outliers beyond the box, though still close to it. In contrast, the IBWO and BWO algorithms present elongated box plots, implying a more dispersed distribution with larger differences between values and wider deviation ranges, and hence reduced stability.
Instance data of LA01 (10 × 5). For each job, stages 1–5 are listed from left to right as machine (M) and work time (T) pairs.

| Job | M | T | M | T | M | T | M | T | M | T |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 21 | 1 | 53 | 5 | 95 | 4 | 55 | 3 | 34 |
| 2 | 1 | 21 | 4 | 52 | 5 | 16 | 3 | 26 | 2 | 71 |
| 3 | 4 | 39 | 5 | 98 | 2 | 42 | 3 | 31 | 1 | 12 |
| 4 | 2 | 77 | 1 | 55 | 5 | 79 | 3 | 66 | 4 | 77 |
| 5 | 1 | 83 | 4 | 34 | 3 | 64 | 2 | 19 | 5 | 37 |
| 6 | 3 | 54 | 3 | 43 | 5 | 79 | 1 | 92 | 4 | 62 |
| 7 | 4 | 69 | 5 | 77 | 2 | 87 | 3 | 87 | 1 | 93 |
| 8 | 3 | 38 | 1 | 60 | 2 | 41 | 4 | 24 | 5 | 83 |
| 9 | 4 | 17 | 2 | 49 | 5 | 25 | 1 | 44 | 3 | 98 |
| 10 | 5 | 77 | 4 | 79 | 3 | 43 | 2 | 75 | 1 | 96 |
Common basic dispatching rules.

| Rule Name | Rule Description |
|---|---|
| FIFO (First In, First Out) | Process first the workpiece that entered the processing equipment first. |
| LIFO (Last In, First Out) | Process first the workpiece that entered the processing equipment last. |
| SPT (Shortest Processing Time) | Priority is given to the workpiece with the shortest processing time. |
| LPT (Longest Processing Time) | Priority is given to the workpiece with the longest processing time. |
| EDD (Earliest Due Date) | Priority is given to the workpiece with the earliest delivery date. |
| MST (Minimum Slack Time) | The workpiece with the minimum slack time is processed first. |
| LWR (Longest Work Remaining) | Priority is given to the workpiece with the longest total remaining processing time. |
| SWR (Shortest Work Remaining) | Priority is given to the workpiece with the shortest total remaining processing time. |
| MOR (Most Operations Remaining) | The workpiece with the most remaining operations is processed first. |
| LOR (Least Operations Remaining) | The workpiece with the fewest remaining operations is processed first. |
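To make the rules in the table concrete, the sketch below expresses several of them as selection functions over a queue of waiting operations. The Operation fields (proc_time, remaining_time, remaining_ops, arrival) are illustrative names introduced here, not identifiers from the paper.

```python
from dataclasses import dataclass

@dataclass
class Operation:
    job_id: int
    proc_time: float       # processing time of this operation (PT)
    remaining_time: float  # total remaining processing time of the job (RPT)
    remaining_ops: int     # number of remaining operations of the job (ROPN)
    arrival: float         # time the job entered the machine queue

# Each rule maps a queue of waiting operations to the one processed next.
DISPATCHING_RULES = {
    "FIFO": lambda q: min(q, key=lambda o: o.arrival),
    "SPT":  lambda q: min(q, key=lambda o: o.proc_time),
    "LPT":  lambda q: max(q, key=lambda o: o.proc_time),
    "LWR":  lambda q: max(q, key=lambda o: o.remaining_time),
    "SWR":  lambda q: min(q, key=lambda o: o.remaining_time),
    "MOR":  lambda q: max(q, key=lambda o: o.remaining_ops),
    "LOR":  lambda q: min(q, key=lambda o: o.remaining_ops),
}

queue = [
    Operation(job_id=1, proc_time=21, remaining_time=237, remaining_ops=4, arrival=0.0),
    Operation(job_id=4, proc_time=77, remaining_time=277, remaining_ops=4, arrival=0.0),
]
print(DISPATCHING_RULES["SPT"](queue).job_id)  # job 1: shorter operation first
```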
Attributes, attribute values, and category values.

| Attributes | Attribute Values | Category Values |
|---|---|---|
| PT | L (Low value) | 0 (Priority execution) |
| RPT | H (High value) | 1 (Priority execution) |
| ROPN | L (Low value) | 0 (Priority execution) |
Examples of processing information for O11 and O41.

| Operation | PT | RPT | ROPN | Sorting Result |
|---|---|---|---|---|
| O11 | 21 | 237 | 4 | 1 |
| O41 | 77 | 277 | 4 | 0 |
Discrete processing of the O11 and O41 processing information.

| Operation i | Operation j | Job_Pt_Longer | Job_Rpt_Longer | Job_Ropn_More | Sorting Result |
|---|---|---|---|---|---|
| O11 | O41 | 0 | 0 | 0 | 1 |
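One plausible reading of this discretization, under the assumption that each binary feature flags whether the first operation's attribute value exceeds the second's, is the following sketch; encode_pair is a hypothetical helper introduced here, not the paper's code.

```python
def encode_pair(pt_i, rpt_i, ropn_i, pt_j, rpt_j, ropn_j):
    """Encode an operation pair (O_i, O_j) as the binary comparison
    features used in the training tables: 1 if O_i's value exceeds O_j's."""
    return {
        "Job_Pt_Longer":  int(pt_i > pt_j),
        "Job_Rpt_Longer": int(rpt_i > rpt_j),
        "Job_Ropn_More":  int(ropn_i > ropn_j),
    }

# O11 vs. O41 from the table above: PT 21 vs. 77, RPT 237 vs. 277, ROPN 4 vs. 4
print(encode_pair(21, 237, 4, 77, 277, 4))
# -> {'Job_Pt_Longer': 0, 'Job_Rpt_Longer': 0, 'Job_Ropn_More': 0}
```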
Sample training data set.

| Operation i | Operation j | Job_Pt_Longer | Job_Rpt_Longer | Job_Ropn_More | Sorting Result |
|---|---|---|---|---|---|
| O21 | O51 | 0 | 1 | 0 | 1 |
| O21 | O41 | 0 | 0 | 0 | 1 |
| O61 | O41 | 0 | 0 | 0 | 0 |
| O31 | O71 | 0 | 0 | 0 | 1 |
| O71 | O91 | 1 | 1 | 0 | 0 |
| O72 | O93 | 1 | 1 | 1 | 1 |
| O14 | O52 | 1 | 0 | 0 | 1 |
| O103 | O24 | 0 | 1 | 1 | 1 |
| O11 | O41 | 0 | 0 | 0 | 1 |
| O43 | O23 | 1 | 1 | 0 | 0 |
| O32 | O63 | 1 | 1 | 1 | 0 |
| O54 | O83 | 0 | 0 | 0 | 1 |
| O32 | O55 | 1 | 1 | 1 | 1 |
| O71 | O102 | 0 | 0 | 1 | 0 |
| O63 | O72 | 1 | 0 | 0 | 0 |
| O64 | O94 | 1 | 1 | 0 | 1 |
| O51 | O82 | 1 | 1 | 1 | 1 |
| O51 | O41 | 0 | 1 | 0 | 0 |
| … | … | … | … | … | … |
| O31 | O91 | 0 | 0 | 0 | 1 |
| O21 | O61 | 1 | 0 | 0 | 1 |
| O72 | O32 | 1 | 1 | 0 | 1 |
Optimized parameters of the CART decision tree model.

| Parameter | Meaning | Value |
|---|---|---|
| max_depth | Maximum depth of the decision tree | 3 |
| min_samples_split | Minimum number of samples required to split an internal node | 2 |
| min_samples_leaf | Minimum number of samples required at a leaf node | 2 |
| max_leaf_nodes | Maximum number of leaf nodes | 4 |
Parameters of the optimized AdaBoost ensemble learning model.

| Parameter | Meaning | Value |
|---|---|---|
| n_estimators | Number of weak classifiers | 100 |
| learning_rate | Learning rate | 1.0 |
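Assuming the scikit-learn implementations that these parameter names suggest, the two tables above correspond to a model along the following lines; this is a sketch, not the authors' code, and the toy training data merely mirrors the pairwise feature format shown earlier.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# CART weak learner with the tuned parameters from the table above
cart = DecisionTreeClassifier(
    max_depth=3,
    min_samples_split=2,
    min_samples_leaf=2,
    max_leaf_nodes=4,
)

# AdaBoost ensemble over the CART base learner
# (scikit-learn >= 1.2 takes `estimator=`; older versions use `base_estimator=`)
model = AdaBoostClassifier(estimator=cart, n_estimators=100, learning_rate=1.0)

# X: binary pairwise features (Job_Pt_Longer, Job_Rpt_Longer, Job_Ropn_More)
# y: sorting result (1 = the first operation of the pair has priority)
X = [[0, 1, 0], [0, 0, 0], [1, 1, 0], [1, 1, 1], [0, 0, 1], [1, 0, 0]]
y = [1, 1, 0, 1, 0, 1]
model.fit(X, y)
print(model.predict([[0, 0, 0]]))
```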
AdaBoost-CART tree scheduling rules.

| Tree Rule | Condition | Meaning Description |
|---|---|---|
| Rule 1 | If Job_ropn_more = True | If the number of remaining processes of the first process … |
| Rule 2 | If Job_ropn_more = False | If the number of remaining processes of the first process … |
| Rule 3 | If Job_ropn_more = False | If the number of remaining processes of the first process … |
| Rule 4 | If Job_ropn_more = False | If the number of remaining processes of the first process … |
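Continuing the sketch above, tree-like rules of the kind listed in this table can be inspected by printing the boosted CART learners as if-then paths; export_text is scikit-learn's standard utility for this, and the feature names are the pairwise attributes defined earlier. A rule library can then be assembled by collecting the paths that recur across the ensemble.

```python
from sklearn.tree import export_text

feature_names = ["Job_Pt_Longer", "Job_Rpt_Longer", "Job_Ropn_More"]

# Print the first few boosted CART stumps of the fitted `model` as rules.
for weak_learner in model.estimators_[:3]:
    print(export_text(weak_learner, feature_names=feature_names))
```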
Typical example of LA11 (20 × 5). For each job, stages 1–5 are listed from left to right as machine (M) and work time (T) pairs.

| Job | M | T | M | T | M | T | M | T | M | T |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 3 | 34 | 2 | 21 | 1 | 53 | 4 | 55 | 5 | 95 |
| 2 | 1 | 21 | 4 | 52 | 2 | 71 | 5 | 16 | 3 | 26 |
| 3 | 1 | 12 | 2 | 42 | 3 | 31 | 5 | 98 | 4 | 39 |
| 4 | 3 | 66 | 4 | 77 | 5 | 79 | 1 | 55 | 2 | 77 |
| 5 | 1 | 83 | 5 | 37 | 4 | 34 | 2 | 19 | 3 | 64 |
| 6 | 5 | 79 | 3 | 43 | 1 | 92 | 4 | 62 | 2 | 54 |
| 7 | 1 | 93 | 5 | 77 | 3 | 87 | 2 | 87 | 4 | 69 |
| 8 | 5 | 83 | 4 | 24 | 2 | 41 | 3 | 38 | 1 | 60 |
| 9 | 5 | 25 | 2 | 49 | 1 | 44 | 3 | 98 | 4 | 17 |
| 10 | 1 | 96 | 2 | 75 | 3 | 43 | 5 | 77 | 4 | 79 |
| 11 | 1 | 95 | 4 | 76 | 2 | 7 | 5 | 28 | 3 | 35 |
| 12 | 5 | 10 | 3 | 95 | 1 | 61 | 2 | 9 | 4 | 35 |
| 13 | 2 | 91 | 3 | 59 | 5 | 59 | 1 | 46 | 4 | 16 |
| 14 | 3 | 27 | 2 | 52 | 5 | 43 | 1 | 28 | 4 | 50 |
| 15 | 5 | 9 | 1 | 87 | 4 | 41 | 3 | 39 | 2 | 45 |
| 16 | 2 | 54 | 1 | 20 | 5 | 43 | 4 | 14 | 3 | 71 |
| 17 | 5 | 33 | 2 | 28 | 4 | 26 | 1 | 78 | 3 | 37 |
| 18 | 2 | 89 | 1 | 33 | 3 | 8 | 4 | 66 | 5 | 42 |
| 19 | 5 | 84 | 1 | 69 | 3 | 94 | 2 | 74 | 4 | 27 |
| 20 | 5 | 81 | 3 | 45 | 2 | 78 | 4 | 69 | 1 | 96 |
Experimental results of IBWO-DDR, IBWO, and BWO on instances of different scales. In the table, bold numbers mark the best value obtained among the algorithms for each evaluation index.
| Algorithm | Metric | LA17 (10 × 10) | LA22 (15 × 10) | LA26 (20 × 10) | LA33 (30 × 10) | LA36 (15 × 15) |
|---|---|---|---|---|---|---|
| IBWO-AdaBoost | Avg | 791.7 | 936.3 | 1266.8 | 1755.1 | 1321.1 |
| | Best | 784.0 | 929.0 | 1239.0 | 1733.0 | 1296.0 |
| | Std | 9.2 | 8.3 | 37.0 | 20.1 | 19.3 |
| IBWO-DT | Avg | 813.8 | 982.8 | 1309.6 | 1787.8 | 1369.7 |
| | Best | 810.0 | 944.0 | 1289.0 | 1755.0 | 1357.0 |
| | Std | 4.7 | 35.3 | 28.1 | 31.7 | 14.7 |
| IBWO-RF | Avg | 809.0 | 990.9 | 1285.4 | 1788.0 | 1343.7 |
| | Best | 793.0 | 969.0 | 1268.0 | 1779.0 | 1321.0 |
| | Std | 16.1 | 20.5 | 21.3 | 11.8 | 27.4 |
| IBWO | Avg | 885.1 | 1180.2 | 1379.3 | 1944.1 | 1471.2 |
| | Best | 854.0 | 1168.0 | 1364.0 | 1938.0 | 1456.0 |
| | Std | 22.4 | 18.0 | 14.4 | 8.0 | 16.9 |
| BWO | Avg | 1004.3 | 1327.0 | 1703.2 | 2141.2 | 1678.4 |
| | Best | 983.0 | 1320.0 | 1693.0 | 2120.0 | 1649.0 |
| | Std | 22.7 | 6.2 | 12.7 | 26.7 | 28.4 |
Experimental results for instances LA01 to LA08. In the table, bold numbers mark the best value obtained among the algorithms for each evaluation index.
| Algorithm | Metric | LA01 (10 × 5) | LA02 (10 × 5) | LA03 (10 × 5) | LA04 (10 × 5) | LA05 (10 × 5) | LA06 (15 × 5) | LA07 (15 × 5) | LA08 (15 × 5) |
|---|---|---|---|---|---|---|---|---|---|
| IBWO-AdaBoost | Avg | 667.1 | 666.2 | 608.7 | 597.9 | 593.0 | 926.0 | 905.9 | 867.2 |
| | Best | 666.0 | 655.0 | 597.0 | 591.0 | 593.0 | 926.0 | 890.0 | 863.0 |
| | Std | 2.2 | 11.7 | 14.0 | 6.8 | 0.0 | 0.0 | 14.3 | 5.4 |
| | dev | 0.0 | 0.0 | 0.0 | 0.2 | 0.0 | 0.0 | 0.0 | 0.0 |
| IBWO-DT | Avg | 672.0 | 681.2 | 620.1 | 607.6 | 593.0 | 926.0 | 909.6 | 871.4 |
| | Best | 666.0 | 655.0 | 599.0 | 593.0 | 593.0 | 926.0 | 893.0 | 863.0 |
| | Std | 6.0 | 13.6 | 13.6 | 11.9 | 0.0 | 0.0 | 12.4 | 5.5 |
| | dev | 0.0 | 0.0 | 0.3 | 0.5 | 0.0 | 0.0 | 0.3 | 0.0 |
| IBWO-RF | Avg | 673.4 | 685.9 | 617.4 | 604.6 | 593.0 | 926.0 | 912.5 | 874.8 |
| | Best | 666.0 | 655.0 | 598.0 | 593.0 | 593.0 | 926.0 | 890.0 | 863.0 |
| | Std | 8.9 | 26.5 | 14.6 | 10.8 | 0.0 | 0.0 | 12.9 | 13.4 |
| | dev | 0.0 | 0.0 | 0.2 | 0.5 | 0.0 | 0.0 | 0.0 | 0.0 |
| IBWO | Avg | 730.0 | 718.6 | 638.5 | 649.7 | 593.0 | 926.0 | 948.6 | 932.7 |
| | Best | 730.0 | 678.0 | 626.0 | 612.0 | 593.0 | 926.0 | 920.0 | 925.0 |
| | Std | 0.0 | 26.5 | 9.0 | 22.7 | 0.0 | 0.0 | 15.8 | 7.3 |
| | dev | 9.6 | 3.5 | 4.9 | 3.7 | 0.0 | 0.0 | 3.4 | 7.2 |
| BWO | Avg | 766.0 | 806.4 | 733.3 | 727.9 | 681.0 | 1118.0 | 1111.1 | 1054.4 |
| | Best | 766.0 | 748.0 | 702.0 | 729.0 | 681.0 | 1118.0 | 1058.0 | 1017.0 |
| | Std | 0.0 | 35.4 | 22.7 | 35.0 | 0.0 | 0.0 | 24.6 | 30.2 |
| | dev | 15.0 | 14.2 | 17.6 | 23.6 | 14.8 | 20.7 | 18.9 | 17.8 |
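For reference, the evaluation indices in these tables can be reproduced from repeated runs along the following lines. The exact definition of dev is an assumption here (percent gap of the best makespan found relative to the best-known solution), inferred from the zero entries where the optimum is reached, and the choice of population standard deviation is likewise an assumption.

```python
import statistics

def summarize_runs(makespans, best_known):
    """Summarize repeated runs of one algorithm on one instance.
    Assumes `dev` is the percent gap of the best makespan found
    relative to the best-known solution (an assumption), and uses
    the population standard deviation for `Std`."""
    best = min(makespans)
    return {
        "Avg": round(statistics.mean(makespans), 1),
        "Best": float(best),
        "Std": round(statistics.pstdev(makespans), 1),
        "dev": round(100.0 * (best - best_known) / best_known, 1),
    }

# Hypothetical runs on an instance with best-known makespan 666 (LA01)
print(summarize_runs([666, 668, 670, 666, 667], best_known=666))
```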
Experimental results for instances LA09 to LA16. In the table, bold numbers mark the best value obtained among the algorithms for each evaluation index.
| Algorithm | Metric | LA09 (15 × 5) | LA10 (15 × 5) | LA11 (20 × 5) | LA12 (20 × 5) | LA13 (20 × 5) | LA14 (20 × 5) | LA15 (20 × 5) | LA16 (10 × 10) |
|---|---|---|---|---|---|---|---|---|---|
| IBWO-AdaBoost | Avg | 951.0 | 958.0 | 1227.3 | 1041.2 | 1157.1 | 1292.0 | 1222.8 | 959.3 |
| | Best | 951.0 | 958.0 | 1222.0 | 1039.0 | 1150.0 | 1292.0 | 1212.0 | 945.0 |
| | Std | 0.0 | 0.0 | 9.9 | 5.6 | 11.2 | 0.0 | 9.2 | 16.1 |
| | dev | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2 | 0.0 |
| IBWO-DT | Avg | 951.0 | 958.0 | 1225.8 | 1049.4 | 1156.3 | 1292.0 | 1234.6 | 972.8 |
| | Best | 951.0 | 958.0 | 1222.0 | 1039.0 | 1150.0 | 1292.0 | 1216.0 | 954.0 |
| | Std | 0.0 | 0.0 | 5.8 | 17.5 | 10.7 | 0.0 | 13.0 | 20.1 |
| | dev | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8 | 1.0 |
| IBWO-RF | Avg | 951.0 | 958.0 | 1226.4 | 1044.1 | 1169.1 | 1292.0 | 1228.4 | 959.7 |
| | Best | 951.0 | 958.0 | 1222.0 | 1039.0 | 1150.0 | 1292.0 | 1212.0 | 947.0 |
| | Std | 0.0 | 0.0 | 8.9 | 10.1 | 19.0 | 0.0 | 10.1 | 11.2 |
| | dev | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.4 | 0.2 |
| IBWO | Avg | 961.0 | 958.0 | 1273.0 | 1061.0 | 1259.0 | 1316.0 | 1295.2 | 1016.3 |
| | Best | 961.0 | 958.0 | 1273.0 | 1061.0 | 1259.0 | 1316.0 | 1281.0 | 994.0 |
| | Std | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 9.9 | 19.5 |
| | dev | 1.1 | 0.0 | 4.2 | 2.1 | 10.4 | 1.9 | 6.1 | 5.2 |
| BWO | Avg | 1228.0 | 1031.0 | 1428.0 | 1228.0 | 1386.0 | 1432.0 | 1485.2 | 1216.9 |
| | Best | 1228.0 | 1031.0 | 1428.0 | 1228.0 | 1386.0 | 1432.0 | 1432.0 | 1183.0 |
| | Std | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 25.6 | 26.5 |
| | dev | 29.1 | 7.6 | 16.9 | 18.2 | 20.5 | 10.8 | 18.6 | 25.2 |
Experimental results for instances LA17 to LA24. In the table, bold numbers mark the best value obtained among the algorithms for each evaluation index.
| Algorithm | Metric | LA17 (10 × 10) | LA18 (10 × 10) | LA19 (10 × 10) | LA20 (10 × 10) | LA21 (15 × 10) | LA22 (15 × 10) | LA23 (15 × 10) | LA24 (15 × 10) |
|---|---|---|---|---|---|---|---|---|---|
| IBWO-AdaBoost | Avg | 791.7 | 873.6 | 857.6 | 928.8 | 1095.1 | 936.3 | 1071.3 | 978.6 |
| | Best | 784.0 | 852.0 | 843.0 | 907.0 | 1056.0 | 929.0 | 1038.0 | 940.0 |
| | Std | 9.2 | 15.3 | 12.9 | 23.5 | 31.7 | 8.3 | 17.3 | 22.7 |
| | dev | 0.0 | 0.5 | 0.1 | 0.6 | 1.0 | 0.2 | 0.6 | 0.5 |
| IBWO-DT | Avg | 813.8 | 870.9 | 853.8 | 928.8 | 1102.8 | 982.8 | 1077.6 | 968.4 |
| | Best | 810.0 | 852.0 | 843.0 | 910.0 | 1082.0 | 944.0 | 1061.0 | 938.0 |
| | Std | 4.7 | 13.2 | 10.8 | 13.7 | 24.1 | 15.3 | 13.5 | 15.2 |
| | dev | 3.3 | 0.5 | 0.1 | 0.9 | 3.4 | 1.8 | 2.8 | 0.3 |
| IBWO-RF | Avg | 809.0 | 881.3 | 857.1 | 937.0 | 1093.3 | 990.9 | 1055.6 | 986.2 |
| | Best | 793.0 | 855.0 | 844.0 | 905.0 | 1063.0 | 969.0 | 1034.0 | 949.0 |
| | Std | 16.1 | 15.4 | 11.1 | 27.7 | 33.0 | 19.5 | 15.4 | 20.9 |
| | dev | 1.2 | 0.8 | 0.2 | 0.3 | 1.6 | 4.5 | 0.2 | 1.5 |
| IBWO | Avg | 885.1 | 949.7 | 978.3 | 1135.8 | 1219.1 | 1180.2 | 1191.5 | 1126.0 |
| | Best | 854.0 | 932.0 | 927.0 | 1100.0 | 1196.0 | 1168.0 | 1155.0 | 1094.0 |
| | Std | 22.4 | 15.9 | 36.9 | 27.6 | 12.7 | 18.0 | 14.7 | 22.7 |
| | dev | 8.9 | 9.9 | 10.1 | 22.0 | 14.3 | 26.0 | 11.9 | 17.0 |
| BWO | Avg | 1004.3 | 1136.0 | 1021.2 | 1295.1 | 1369.9 | 1327.0 | 1353.1 | 1339.0 |
| | Best | 983.0 | 1114.0 | 1115.0 | 1263.0 | 1346.0 | 1320.0 | 1324.0 | 1318.0 |
| | Std | 22.7 | 15.0 | 34.3 | 20.0 | 10.6 | 6.2 | 14.5 | 16.8 |
| | dev | 25.4 | 31.4 | 32.4 | 40.0 | 28.7 | 42.4 | 28.3 | 41.0 |
Experimental results for instances LA25 to LA32.
| Algorithm | Metric | LA25 (15 × 10) | LA26 (20 × 10) | LA27 (20 × 10) | LA28 (20 × 10) | LA29 (20 × 10) | LA30 (20 × 10) | LA31 (30 × 10) | LA32 (30 × 10) |
|---|---|---|---|---|---|---|---|---|---|
| IBWO-AdaBoost | Avg | 998.8 | 1266.8 | 1271.6 | 1258.8 | 1198.0 | 1386.9 | 1848.0 | 1919.6 |
| | Best | 985.0 | 1239.0 | 1242.0 | 1235.0 | 1163.0 | 1373.0 | 1813.0 | 1883.0 |
| | Std | 13.5 | 37.0 | 29.8 | 17.0 | 30.9 | 12.0 | 25.9 | 35.4 |
| | dev | 0.8 | 1.7 | 0.6 | 1.6 | 1.0 | 1.3 | 1.6 | 1.8 |
| IBWO-DT | Avg | 1024.3 | 1309.6 | 1288.4 | 1299.8 | 1219.18 | 1411.4 | 1878.2 | 1940.3 |
| | Best | 993.0 | 1289.0 | 1258.0 | 1261.0 | 1169.0 | 1388.0 | 1846.0 | 1887.0 |
| | Std | 26.9 | 28.1 | 19.8 | 40.8 | 47.4 | 25.9 | 32.7 | 30.9 |
| | dev | 1.6 | 5.8 | 1.9 | 3.7 | 1.5 | 2.4 | 3.5 | 2.0 |
| IBWO-RF | Avg | 1016.4 | 1285.4 | 1278.5 | 1268.9 | 1213.64 | 1406.8 | 1865.1 | 1912.6 |
| | Best | 989.0 | 1268.0 | 1241.0 | 1240.0 | 1163.0 | 1381.0 | 1830.0 | 1870.0 |
| | Std | 27.8 | 21.3 | 34.2 | 27.3 | 44.8 | 30.9 | 31.0 | 41.3 |
| | dev | 1.2 | 4.1 | 0.5 | 2.0 | 1.0 | 1.9 | 2.6 | 1.1 |
| IBWO | Avg | 1092.0 | 1379.3 | 1446.7 | 1342.4 | 1325.45 | 1471.4 | 1899.6 | 2133.0 |
| | Best | 1025.0 | 1364.0 | 1428.0 | 1320.0 | 1284.0 | 1449.0 | 1878.0 | 2082.0 |
| | Std | 44.0 | 14.4 | 19.0 | 18.9 | 24.8 | 13.2 | 22.5 | 26.6 |
| | dev | 4.9 | 12.0 | 15.6 | 8.6 | 5.6 | 6.9 | 5.3 | 12.5 |
| BWO | Avg | 1309.3 | 1703.2 | 1603.8 | 1590.9 | 1442.73 | 1662.7 | 2176.7 | 2318.3 |
| | Best | 1297.0 | 1693.0 | 1593.0 | 1549.0 | 1380.0 | 1629.0 | 2140.0 | 2290.0 |
| | Std | 10.7 | 12.7 | 16.3 | 23.6 | 34.6 | 22.6 | 32.4 | 24.8 |
| | dev | 32.8 | 39.0 | 29.0 | 27.4 | 13.5 | 20.2 | 20.0 | 23.8 |
Experimental results for instances LA33 to LA40. In the table, bold numbers mark the best value obtained among the algorithms for each evaluation index.
| Algorithm | Metric | LA33 (30 × 10) | LA34 (30 × 10) | LA35 (30 × 10) | LA36 (15 × 15) | LA37 (15 × 15) | LA38 (15 × 15) | LA39 (15 × 15) | LA40 (15 × 15) |
|---|---|---|---|---|---|---|---|---|---|
| IBWO-AdaBoost | Avg | 1755.1 | 1778.5 | 1952.9 | 1321.1 | 1440.2 | 1252.9 | 1267.5 | 1251.6 |
| | Best | 1733.0 | 1751.0 | 1908.0 | 1296.0 | 1417.0 | 1223.0 | 1247.0 | 1238.0 |
| | Std | 20.1 | 20.9 | 31.8 | 19.3 | 18.0 | 19.2 | 18.0 | 15.0 |
| | dev | 0.8 | 1.7 | 1.1 | 2.2 | 1.4 | 2.3 | 1.1 | 1.3 |
| IBWO-DT | Avg | 1787.8 | 1805.2 | 1996.5 | 1369.7 | 1482.7 | 1277.4 | 1269.8 | 1284.1 |
| | Best | 1755.0 | 1781.0 | 1926.0 | 1357.0 | 1455.0 | 1257.0 | 1267.0 | 1255.0 |
| | Std | 31.7 | 23.4 | 49.3 | 14.7 | 17.1 | 15.8 | 27.5 | 19.5 |
| | dev | 2.09 | 3.49 | 2.01 | 7.02 | 4.15 | 5.10 | 2.76 | 2.70 |
| IBWO-RF | Avg | 1788.0 | 1792.5 | 1951.2 | 1343.7 | 1465.0 | 1274.9 | 1273.1 | 1275.1 |
| | Best | 1779.0 | 1751.0 | 1904.0 | 1321.0 | 1433.0 | 1245.0 | 1258.0 | 1248.0 |
| | Std | 11.8 | 25.6 | 29.7 | 27.4 | 14.1 | 23.0 | 13.3 | 28.1 |
| | dev | 3.5 | 1.7 | 0.9 | 4.2 | 2.6 | 4.1 | 2.0 | 2.1 |
| IBWO | Avg | 1944.1 | 1965.7 | 2056.7 | 1471.2 | 1541.6 | 1476.9 | 1431.2 | 1390.0 |
| | Best | 1938.0 | 1934.0 | 2028.0 | 1456.0 | 1520.0 | 1440.0 | 1380.0 | 1362.0 |
| | Std | 8.0 | 21.3 | 35.1 | 16.9 | 19.1 | 26.6 | 32.1 | 24.6 |
| | dev | 12.7 | 12.4 | 7.4 | 14.8 | 8.8 | 20.4 | 11.9 | 11.5 |
| BWO | Avg | 2141.2 | 2154.8 | 2295.3 | 1678.4 | 1808.7 | 1657.0 | 1664.4 | 1597.1 |
| | Best | 2120.0 | 2100.0 | 2252.0 | 1649.0 | 1786.0 | 1641.0 | 1626.0 | 1556.0 |
| | Std | 26.7 | 30.7 | 31.9 | 28.4 | 17.0 | 13.0 | 26.4 | 32.7 |
| | dev | 23.3 | 22.0 | 19.3 | 30.1 | 27.9 | 37.2 | 31.9 | 27.3 |
References
1. Applegate, D.; Cook, W. A computational study of the job-shop scheduling problem. ORSA J. Comput.; 1991; 3, pp. 149-156. [DOI: https://dx.doi.org/10.1287/ijoc.3.2.149]
2. Liu, T.K.; Tsai, J.T.; Chou, J.H. Improved genetic algorithm for the job-shop scheduling problem. Int. J. Adv. Manuf. Technol.; 2006; 27, pp. 1021-1029. [DOI: https://dx.doi.org/10.1007/s00170-004-2283-4]
3. Kui, C.; Li, B. Research on FJSP of improved particle swarm optimization algorithm considering transportation time. J. Syst. Simul.; 2021; 33, pp. 845-853.
4. Zhang, S.; Li, X.; Zhang, B.; Wang, S. Multi-objective optimization in flexible assembly job shop scheduling using a distributed ant colony system. Eur. J. Oper. Res.; 2020; 283, pp. 441-460. [DOI: https://dx.doi.org/10.1016/j.ejor.2019.11.016]
5. Fontes, D.B.M.M.; Homayouni, S.M.; Gonçalves, J.F. A hybrid particle swarm optimization and simulated annealing algorithm for the job shop scheduling problem with transport resources. Eur. J. Oper. Res.; 2023; 306, pp. 1140-1157. [DOI: https://dx.doi.org/10.1016/j.ejor.2022.09.006]
6. Han, J.; Pei, J.; Tong, H. Data Mining: Concepts and Techniques; Morgan Kaufmann: Burlington, MA, USA, 2022.
7. Zhang, Y.; Liu, Y. Analysis and Application of Data Mining Algorithm Based on Decision Tree. J. Liaoning Petrochem. Univ.; 2007; 27, pp. 78-80.
8. Salama, S.; Kaihara, T.; Fujii, N.; Kokuryo, D. Dispatching rules selection mechanism using support vector machine for genetic programming in job shop scheduling. IFAC-PapersOnLine; 2023; 56, pp. 7814-7819. [DOI: https://dx.doi.org/10.1016/j.ifacol.2023.10.1149]
9. Blackstone, J.H.; Phillips, D.T.; Hogg, G.L. A state-of-the-art survey of dispatching rules for manufacturing job shop operations. Int. J. Prod. Res.; 1982; 20, pp. 27-45. [DOI: https://dx.doi.org/10.1080/00207548208947745]
10. Li, X.N.; Sigurdur, O. Discovering dispatching rules using data mining. J. Sched.; 2005; 8, pp. 515-527. [DOI: https://dx.doi.org/10.1007/s10951-005-4781-0]
11. Shahzad, A.; Mebarki, N. Learning dispatching rules for scheduling: A synergistic view comprising decision trees, tabu search and simulation. Computers; 2016; 5, 3. [DOI: https://dx.doi.org/10.3390/computers5010003]
12. Wang, K.J.; Chen, J.C.; Lin, Y.S. A hybrid knowledge discovery model using decision tree and neural network for selecting dispatching rules of a semiconductor final testing factory. Prod. Plan. Control; 2005; 16, pp. 665-680. [DOI: https://dx.doi.org/10.1080/09537280500213757]
13. Jun, S.; Lee, S. Learning dispatching rules for single machine scheduling with dynamic arrivals based on decision trees and feature construction. Int. J. Prod. Res.; 2021; 59, pp. 2838-2856. [DOI: https://dx.doi.org/10.1080/00207543.2020.1741716]
14. Habib Zahmani, M.; Atmani, B. Multiple dispatching rules allocation in real time using data mining, genetic algorithms, and simulation. J. Sched.; 2021; 24, pp. 175-196. [DOI: https://dx.doi.org/10.1007/s10951-020-00664-5]
15. Baykasoğlu, A.; Göçken, M.; Özbakir, L. Genetic programming based data mining approach to dispatching rule selection in a simulated job shop. Simulation; 2010; 86, pp. 715-728. [DOI: https://dx.doi.org/10.1177/0037549709346561]
16. Balasundaram, R.; Baskar, N.; Sankar, R.S. A new approach to generate dispatching rules for two machine flow shop scheduling using data mining. Procedia Eng.; 2012; 38, pp. 238-245. [DOI: https://dx.doi.org/10.1016/j.proeng.2012.06.031]
17. Wang, C.L.; Li, C.; Feng, Y.P.; Rong, G. Dispatching rule extraction method for job shop scheduling problem. J. Zhejiang Univ. Eng. Sci.; 2015; 49, pp. 421-429.
18. Tian, L.; Junjia, W. Multi-target job-shop scheduling based on dispatching rules and immune algorithm. Inf. Control; 2016; 45, pp. 278-286.
19. Huang, J.P.; Gao, L.; Li, X.Y.; Zhang, C.J. A novel priority dispatch rule generation method based on graph neural network and reinforcement learning for distributed job-shop scheduling. J. Manuf. Syst.; 2023; 69, pp. 119-134. [DOI: https://dx.doi.org/10.1016/j.jmsy.2023.06.007]
20. Zhong, C.; Li, G.; Meng, Z. Beluga whale optimization: A novel nature-inspired metaheuristic algorithm. Knowl.-Based Syst.; 2022; 251, 109215. [DOI: https://dx.doi.org/10.1016/j.knosys.2022.109215]
21. Xiong, H.; Shi, S.; Ren, D.; Hu, J. A survey of job shop scheduling problem: The types and models. Comput. Oper. Res.; 2022; 142, 105731. [DOI: https://dx.doi.org/10.1016/j.cor.2022.105731]
22. Loh, W.Y. Classification and regression trees. Wiley Interdiscip. Rev. Data Min. Knowl. Discov.; 2011; 1, pp. 14-23. [DOI: https://dx.doi.org/10.1002/widm.8]
23. Ying, C.; Qi-Guang, M.; Jia-Chen, L.; Lin, G. Advance and prospects of AdaBoost algorithm. Acta Autom. Sin.; 2013; 39, pp. 745-758.
24. Yuan, Y.; Xu, H.; Yang, J. A hybrid harmony search algorithm for the flexible job shop scheduling problem. Appl. Soft Comput.; 2013; 13, pp. 3259-3272. [DOI: https://dx.doi.org/10.1016/j.asoc.2013.02.013]
25. Poli, R. Exact schema theory for genetic programming and variable-length genetic algorithms with one-point crossover. Genet. Program. Evolvable Mach.; 2001; 2, pp. 123-163. [DOI: https://dx.doi.org/10.1023/A:1011552313821]
26. Thede, S.M. An introduction to genetic algorithms. J. Comput. Sci. Coll.; 2004; 20, pp. 115-123.
27. Tang, J.; Alelyani, S.; Liu, H. Feature selection for classification: A review. Data Classif. Algorithms Appl.; 2014; 37, pp. 37-64.
28. Xiong, H.; Fan, H.; Jiang, G.; Li, G. A simulation-based study of dispatching rules in a dynamic job shop scheduling problem with batch release and extended technical precedence constraints. Eur. J. Oper. Res.; 2017; 257, pp. 13-24. [DOI: https://dx.doi.org/10.1016/j.ejor.2016.07.030]
29. Wang, J.; Chen, Y. Data-driven Job Shop production scheduling knowledge mining and optimization. Comput. Eng. Appl.; 2018; 54, pp. 264-270.
30. Bergstra, J.; Komer, B.; Eliasmith, C.; Yamins, D.; Cox, D.D. Hyperopt: A python library for model selection and hyperparameter optimization. Comput. Sci. Discov.; 2015; 8, 014008. [DOI: https://dx.doi.org/10.1088/1749-4699/8/1/014008]
31. De Ville, B. Decision trees. Wiley Interdiscip. Rev. Comput. Stat.; 2013; 5, pp. 448-455. [DOI: https://dx.doi.org/10.1002/wics.1278]
32. Biau, G. Analysis of a random forests model. J. Mach. Learn. Res.; 2012; 13, pp. 1063-1095.
33. Wang, L. Job-Shop Scheduling and Its Genetic Algorithm; Tsinghua University Press: Beijing, China, 2003.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abstract
In this paper, a rule-heuristic scheduling method for solving job-shop scheduling problems (JSP) is proposed, based on an improved beluga whale optimization algorithm combined with data-mined dispatching rules via AdaBoost (IBWO-DDR-AdaBoost); the dispatching rules extracted through data mining are incorporated into the heuristic algorithm to guide the optimization process. Firstly, an AdaBoost-CART integrated learning algorithm is introduced to distill dispatching knowledge from historical data and convert it into effective dispatching rules. Secondly, to address the local optimality and slow convergence of the beluga whale optimization algorithm (BWO) when solving JSP, this study presents an improved beluga whale optimization algorithm (IBWO) that incorporates two enhancement mechanisms: a neighborhood search strategy based on greedy thinking and genetic operators. These enhancements improve both the efficiency and the quality of the scheduling solutions, ultimately leading to better scheduling schemes. Furthermore, the scheduling rules extracted through the AdaBoost-CART integrated learning algorithm are embedded into the improved beluga whale optimization algorithm, enabling real-time solution updates for optimized schedules. Finally, extensive simulation tests are conducted on JSP benchmark instances of varying scales, with minimization of the maximum completion time as the objective function for schedule optimization. The simulation results demonstrate the significant advantages of the proposed IBWO-DDR-AdaBoost rule-heuristic scheduling method in terms of accuracy, performance optimization, and convergence speed.