Int J Adv Manuf Technol (2010) 50:729–747 DOI 10.1007/s00170-010-2518-5
ORIGINAL ARTICLE
Evolving scheduling rules with gene expression programming for dynamic single-machine scheduling problems
Li Nie & Xinyu Shao & Liang Gao & Weidong Li
Received: 9 July 2009 / Accepted: 4 January 2010 / Published online: 4 February 2010 # Springer-Verlag London Limited 2010
Abstract The paper considers the problems of scheduling n jobs that are released over time on a machine in order to optimize one or more objectives. The problems are dynamic single-machine scheduling problems (DSMSPs) with job release dates and needed to be solved urgently because they exist widely in practical production environment. Gene expression programming-based scheduling rules construc- tor (GEPSRC) was proposed to construct effective sched- uling rules (SRs) for DSMSPs with job release dates automatically. In GEPSRC, Gene Expression Programming (GEP) worked as a heuristic search to search the space of SRs. Many experiments were conducted, and comparisons were made between GEPSRC and some previous methods. The results showed that GEPSRC achieved significant improvement.
Keywords Singlemachinescheduling.Dynamic scheduling . Release dates . Scheduling rules . Gene expression programming
1 Introduction
Scheduling plays an important role in a shop floor control system, which has a significant impact on the performance
L. Nie:X. Shao:L. Gao (*)
The State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, People’s Republic of China e-mail: gaoliang@mail.hust.edu.cn
W. Li
Faculty of Engineering and Computing, Coventry University, Coventry, UK
of the shop floor. Scheduling is to allocate scarce resources (usually are machines) to activities (usually are jobs) with the objective of optimizing one or more performance criteria (for instance, minimizing makespan, flow time, lateness, or tardiness) [1]. In recent years, more and more effective scheduling methods for shop floor control have emerged with the developments in scheduling methodolo- gies (in research and in practice) as well as technological advances in computing.
Scheduling problems investigated by researchers for several decades may be categorized roughly into two types, static scheduling problems and dynamic scheduling prob- lems. In static scheduling problems, it is usually assumed that the attributes of all jobs to be scheduled are available simultaneously at the start of the planning horizon and unchanged during the planning horizon. The assumption is made mainly for the sake of convenience to model the system considered and solve the scheduling problems that exist. However, the assumption does not always accord with the practical production environment, since there are always all kinds of random and unpredictable events that occur. For example, jobs arrive continuously over time, machines break down and are repaired, and the due dates of jobs are changed during processing. It is rarely possible that the attributes of all jobs to be scheduled are available at the start of planning horizon and unchanged during the horizon. Most scheduling problems that exist in such environment are called dynamic scheduling problems [2]. Dynamic scheduling problems have attracted more and more atten- tion. For example, Kianfar et al. [3] studied a flexible flow shop system considering dynamic arrival of jobs; Wang et al. [4] considered the single-machine scheduling problem with a deteriorating function, which means that the actual job processing time is a function of jobs already processed; Ham and Fowler [5] considered the scheduling of batching
730
Int J Adv Manuf Technol (2010) 50:729–747
operations with job release dates in wafer fabrication facilities. Although some of static scheduling problems are often solvable exactly in polynomial time, most of them are NP-hard. Dynamic scheduling problems are usually more difficult to solve than static ones.
The paper considers the problems of scheduling n jobs that are released over time on a machine in order to optimize one or more objectives, which are dynamic single- machine scheduling problems (DSMSPs) with job release dates. The problems are considered for the following reasons: First, dynamic scheduling problems exist widely in practice and need to be solved urgently, although they pose bigger challenges than static scheduling problems. Secondly, single-machine scheduling problems often form components of solutions for more complex scheduling environments. For example, a job shop scheduling problem may be decomposed into single-machine sub-problems [6].
Static scheduling problems have been studied for almost half of a century, and many effective methods have been proposed. At the beginning, many enumerative-based techniques have been developed. Although enumerative methods such as branch and bound usually provide optimal solutions, the cost of computation time is huge even for a moderate size problem [7]. In the last decades, therefore, many heuristic methods, including dispatching rules [8] and search-based methods, such as simulated annealing [9], tabu search [10], and genetic algorithms (GAs) [11] have been developed to solve larger problems in a reasonable time. Search-based methods usually offer high-quality solutions. However, neither enumerative-based techniques nor search-based methods are appropriate in dynamic conditions, because once the schedule is prepared, the processing sequence of all jobs is determined, and it is inevitable to modify the schedule frequently to respond to the change of the system.
Over the last two decades, much effort has been made to propose new strategies or approaches to solve dynamic scheduling problems. Aytug et al. [12] categorized roughly existing strategies into three classes: completely reactive approaches, robust scheduling approaches, and predictive– reactive approaches. Completely reactive approaches have been widely employed in a large number of scheduling systems and formed the backbone of much industrial scheduling practice. The approaches are characterized by least commitment strategies such as real-time dispatching that create partial schedules according to the current state of the shop floor and the production objectives. Many heuristics, also called dispatching rules, are frequently used to examine the jobs waiting in processing at the given machine or those that arrive in the immediate future, at each time t when the machine is idle and to compute a priority value for each job. The next job to be processed is selected from them by sorting and filtering them according to the
priority values assigned to them and selecting the job at the head of the resulting list. The priority function which is encapsulated in the heuristic and assigns values to jobs is usually called with the term scheduling rules (SRs) [1].
Several important achievements on DSMSPs with job release dates are reviewed below. An online algorithm to minimize makespan problem, now commonly called list scheduling, was firstly investigated by Graham [13]. It is a simple greedy algorithm and does not use the information about processing times of jobs. Similar to the work of Graham, other researchers made other research on the online heuristics and achieved many results. As for the total completion time problem, if all jobs are released at the same time, Smith showed that the problem can be solved optimally by the well-known shortest processing time (SPT) rule [14]. For the preemptive version, Baker’s work showed that it is easy to construct an optimal schedule online by always running the job with shortest remaining processing time (SRPT) [15]. In the case of single-machine non-preemptive scheduling for minimizing the total com- pletion time, Hoogeveen and Vestjens [16] gave online 2- approximation algorithms, called delayed SPT rule, and proved that the lower bound on online scheduling is 2. Phillips et al. [17] gave a different 2-competitive algorithm called PSW algorithm, which converts preemptive sched- ules to non-preemptive schedules while only increasing the total completion time by a small constant factor. It is noticeable that it was not the average flow time of a set of jobs that was studied in the literature. Although average flow time is equivalent to average completion time at optimality, Kellerer et al. [18] have shown that the approximability of these two criteria can be quite different. Guo et al. [19] modified the PSW algorithm to solve minimizing total flow time on a single machine with job release dates and proved that this new algorithm yields good solutions for the problem on average. Other objective functions were rarely considered under the model of the dynamic single-machine scheduling with job release dates. For a review on online scheduling results, the comprehen- sive reviews of [20, 21] are referred. Apart from these simple online heuristics, other classical scheduling rules were also reported in literatures, which are the results of decades of research [22].
The general conclusion on scheduling rules is that no rule performs consistently better than all other rules under a variety of shop configurations, operating conditions, and performance objectives, because the rules have all been developed to address a specific class of system config- urations relative to a particular set of performance criterion and generally do not perform well in another environment or for other criteria. Therefore, many researchers made effort to exploit several methods based on artificial intelligence to learn to select rules dynamically according
Int J Adv Manuf Technol (2010) 50:729–747
731
to the change of the system’s state from a number of candidates. For example, Jeong and Kim [23] and Yin and Rau [24] used simulation approach; Chen and Yih [25] and El-Bouri and Shah [26] used neural network; Aytug et al. [27] used genetic learning approach; Trappey et al. [28] used expert systems; Singh et al. [29] used the approach of identifying the worst performing criterion; and Yang and Yan [30] used adaptive approach. These methods are mainly based on learning to select a given rule from among a number of candidates rather than identifying new and potentially more effective rules. However, significant breakthrough beyond current applications of artificial intelligence to production scheduling have been made by other researchers who made it possible to automatically construct effective rules for a given scheduling environ- ments. One of the typical works was the learning system SCRUPLES proposed by Geiger et al. [31]. The system combined an evolutionary learning mechanism, i.e., Genet- ic Programming (GP) [32], with a simulation model of the industrial facility under study, which automates the tedious process of developing new scheduling rules for a given environment which involves implementing different rules in a simulation model of the facility under study and evaluates the rules through extensive simulation experi- ments. Other existing similar researches include: Dimopoulos and Zalzala [33] evolved rules with GP for single-machine tardiness problem; Yin et al. [34] evolved rules with GP for single-machine scheduling subject to breakdowns; Jakobovic and Budin [1] evolved rules with GP for dynamic single machine and job shop problem; Atlan et al.[35] and Miyashita [36] applied GP mainly to classic job shop tardiness scheduling; and Tay and Ho [37-39] focused on evolving rules with GP for flexible job shop problem. The characteristic shared by these works is that it is the space of algorithms but not that of potential scheduling solutions is searched with an evolutionary learning mech- anism. The point in the space of potential scheduling solutions presents only a solution to the specific scheduling instance, which means that a new solution must be found for different initial conditions. While the point in the space of algorithms represents a solution for all of the problems, instances in a scheduling environment with an algorithm can be used to generate a schedule [1]. However, these GP- based approaches mentioned above have a huge cost on computation time, and the constructed rules are usually formulized complexly.
In this research, gene expression programming-based SR constructor (GEPSRC) was proposed to automatically discover effective SRs for DSMSPs with job release dates. Gene Expression Programming (GEP), one of the evolu- tionary algorithms, worked as a heuristic search to search the space of algorithms but not that of potential scheduling solutions. The proposed approach was evaluated in a
variety of single-machine environment where the jobs arrive over time. GEP was usually possible to discover rules that are competitive with those evolved by GP and the classical heuristics selected from literature. In addition, the computation requirement for training GEP to discover high performing rules is much less than that of GP.
The remainder of the paper is organized as follows. Section 2 gives the statement of the DSMSPs with job release dates. Section 3 describes the heuristic for the scheduling problems. Section 4 proposes the framework and mechanism of the GEPSRC and describes the applica- tion of GEPSRC on the scheduling problems. An extensive computational study is conducted within the single-machine environment to assess the efficiency and robustness of the autonomous SRs constructing approach. The experiments and results are provided in Section 5. Section 6 is the conclusion and future work.
2 Statement of dynamic single-machine scheduling problems
The DSMSPs with job release dates is described as follows. The shop floor consists of one machine and n jobs, which are released over time and are processed once on the machine without preemption. Each job can be identified with several attributes, such as processing time pi, release date ri, due date di, and weight wi, which denotes the relative importance of job i, i = 1,…, n. The attributes of a job are unknown in advance unless the job is currently available at the machine or arrive in the immediate future. It is also assumed that the machine cannot process more than one job simultaneously. The scheduling objective is to determine a sequence of jobs on the machine in order to minimize one or more optimization criteria, in our case, makespan, total flow time, maximum lateness, and total tardiness, respectively. For the sake of convenience, we
assume all jobs relatively equal, i.e., wj = 1. Then, performance criteria considered are defined below.
Cmax 1⁄4maxðci;i1⁄41;:::;nÞ
Xn
F1⁄4 ðciriÞ
i1⁄41
Lmax 1⁄4maxðci di;i1⁄41;:::;nÞ
the four
ð1Þ
ð2Þ
ð3Þ
ð4Þ
T 1⁄4
Xn i1⁄41
maxðci di; 0Þ
732
Int J Adv Manuf Technol (2010) 50:729–747
where ci denotes the finishing time of job i. Cmax, F, Lmax, and T denote makespan, total flow time, maximum lateness, and total tardiness of a problem instance, respectively.
Since GEP is used to search the space of algorithms but not that of potential scheduling solutions in the paper; the scheduling algorithms are evaluated on a large number of training sets or test sets of problem instances, which represent different operating conditions relative to different performance criteria, respectively. In order for all the training sets or test sets to have a similar influence to the overall performance estimation of an algorithm, average criterion value over the training set or test set of problem instances are defined as below.
Xt Xt
jCmaxj 1⁄4 1 Cmax;j 1⁄4 1 max cij;i 1⁄4 1;:::;nj ð5Þ
t j1⁄41 njpj t j1⁄41 njpj
Pnj 1Xt F 1Xt cijrij
jFj1⁄4j1⁄4i1⁄41 ð6Þ t j1⁄41 njpj t j1⁄41 njpj
ð7Þ
ð8Þ
where Cmax,j, Fj, Lmax,j, and Tj denote the makespan, total flow time, maximum lateness, and total tardiness of problem instance j, respectively; nj denotes the number of job in problem instance j; pj denotes the mean processing time of all jobs in problem instance j; cij, rij, and dij denote completion time, release date, and due date of job i in problem instance j, respectively; t denotes the number of instances in a training set or test set; and |Cmax|, |F|, |Lmax|, and |T| represent the average value of makespan, flow time, maximum lateness, and tardiness over the training set or test set of problem instances. It is obvious that algorithms with less objective values of |Cmax|, |F|, |Lmax|, and |T| are better.
3 Heuristic for DSMSPs with job release dates
In static circumstance, since the attributes of all jobs to be scheduled are available at the beginning of planning horizon (referred to be time 0) and unchanged during the planning horizon, the whole schedules usually can be made at the beginning. However, it is not convenient in dynamic conditions where jobs arrive over time and the release dates cannot be known in advance. At any time, some jobs have arrived and others may arrive in some future moment. In this section, we describe a heuristic for the scheduling problems on a single machine with job release dates, and the release dates are unknown in advance unless the jobs will arrive in the immediate future. This heuristic was proposed firstly by Jakobovic and Budin [1].
jLmaxj1⁄41
Lmax;j 1⁄41
t j1⁄41
njpj
Pnj max cij dij;0
njpj
Pt
t j1⁄41 njpj
Pt max c d ;i1⁄41;:::;n ðij ij jÞ
1 Xt t j1⁄41 njpj t j1⁄41
1 Xt
jTj1⁄4 j1⁄4 i1⁄41
T
Heuristic for DSMSPs with job release dates: Initialize t = 0, where the machine is idle at time t; While there are unscheduled jobs do
JSs(t) = {all jobs satisfied with wtj < Pmin(t) };
Calculate priority values for all the jobs in JSs(t) according to a SR;
Schedule the job with the best priority on the machine, and denote the job with J*; Update the machine’s idle time, i.e. t = the completion time of J*;
End while.
Where JSs(t) represents the set of the jobs to be taken into consideration for scheduling at time t; wtj denotes the waiting time for the arrival of the job j, i.e., wtj = max{rj − t, 0}; Pmin(t)denotes the shortest processing time of the jobs that already arrived but are unscheduled at time t.
It is obvious that the JSs(t) consists of two types of jobs: the jobs that already arrived but are unscheduled at time t
(denoted with AType) and those that are expected to arrived soon and satisfy wtj