A reinforcement learning based RMOEA/D for bi-objective fuzzy flexible job shop scheduling

doi:10.1016/j.eswa.2022.117380

Expert Systems with Applications

Volume 203, 1 October 2022, 117380

https://doi.org/10.1016/j.eswa.2022.117380

Highlights

• A bi-objective FFJSP that minimizes makespan and total machine workload is considered.
• An adaptive MOEA/D with VNS is proposed.
• The results indicate the superior performance of our approach.

Abstract

The flexible job shop scheduling problem (FJSP) is significant for realistic manufacturing. However, job processing times are usually uncertain and change during manufacturing. This paper presents a multi-objective FJSP with fuzzy processing time (MOFFJSP) in which the makespan and the total machine workload are minimized. To solve the MOFFJSP, a MOEA/D based on reinforcement learning, named RMOEA/D, is proposed. RMOEA/D has four features: (i) an initialization strategy with three rules is used to obtain a high-quality initial population; (ii) a parameter adaptation strategy based on Q-learning is proposed to guide the population to choose the best parameter to increase diversity; (iii) a variable neighborhood search based on reinforcement learning is designed to guide each solution to choose the right local search method; and (iv) an elite archive is used to improve the utilization rate of abandoned historical solutions. RMOEA/D is compared with five well-known related methods, i.e., MOEA/D, NSGA-II, MOEA/D-M2M, NSGA-III and IAIS, on three benchmark suites. The results show that RMOEA/D outperforms these five state-of-the-art algorithms.

Introduction

With the development of economic globalization, traditional flexible manufacturing faces a major challenge, and it is difficult to satisfy the requirements of the market (Lang et al., 2021, Rifai et al., 2021). The flexible job shop scheduling problem (FJSP) is a classical scheduling problem that has been intensively studied over the past decades. Many heuristic algorithms have been proposed for the FJSP, such as the genetic algorithm (GA) (Yuan et al., 2020), the artificial bee colony algorithm (ABC) (Li, Huang, et al., 2020), a two-phase meta-heuristic (Lei et al., 2019), Jaya (Caldeira & Gnanavelbabu, 2021), and teaching–learning-based optimization (TLBO) (Lei et al., 2018). However, a fixed processing time is too idealized to model practical manufacturing. In practical flexible manufacturing, the processing time is uncontrollable and fluctuates within an interval (Pan et al., 2021, Zhu and Zhou, 2021), so it is necessary to fuzzify the processing time of the FJSP. The FJSP with fuzzy processing time (FFJSP) is an extension of the FJSP. The FJSP has been proved to be NP-hard, and the FFJSP is also NP-hard (Pavlov et al., 2019). It is therefore worth studying how to solve the FFJSP efficiently.

Furthermore, with the development of intelligent manufacturing and Industry 4.0, many industries have started to consider low-energy-consumption manufacturing (Lu et al., 2021). Many energy-aware scheduling models have been proposed, including the distributed hybrid flow shop (Shao et al., 2021b), the distributed permutation flow shop (Lu et al., 2022), the distributed flow shop with heterogeneous factories (Lu et al., 2021), distributed reentrant permutation flow shop scheduling (Rifai et al., 2021), and the distributed heterogeneous welding flow shop (Wang et al., 2021). However, the energy-aware FFJSP is seldom considered. Energy consumption is related to machine workload, idle time, and the number of machine on/off switches (Meng et al., 2020, Wang and Wang, 2021), and machine workload is the main contributor. Thus, this work considers the multi-objective FFJSP (MOFFJSP) that minimizes makespan and total workload.

Inspired by scalar-objective optimization, Zhang and Li (2007) proposed the multi-objective evolutionary algorithm based on decomposition (MOEA/D) for multi-objective optimization problems (MOPs). Based on reference vectors and the Tchebycheff function, MOEA/D can achieve good convergence and good diversity simultaneously. Recently, reinforcement learning (RL) has become a hot topic due to its strong decision-making and optimization ability. Cooperation between RL and evolutionary algorithms is a promising direction for solving complex optimization problems (Gong et al., 2021, Shiue et al., 2018, Zhao et al., 2019). For such problems, superior performance is expected from the synergy between RL, evolutionary computation, and problem-specific operators within the MOEA/D framework.
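To make the decomposition idea concrete, the following minimal Python sketch shows the weighted Tchebycheff scalarization commonly used in MOEA/D for two minimized objectives. The function names, the example objective values, and the ideal point `z_star` are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def tchebycheff(f: Sequence[float], weight: Sequence[float],
                z_star: Sequence[float]) -> float:
    """Weighted Tchebycheff value max_i w_i * |f_i - z*_i| (smaller is better)."""
    return max(w * abs(fi - zi) for fi, w, zi in zip(f, weight, z_star))


def uniform_weights(n: int) -> List[Tuple[float, float]]:
    """n evenly spread weight vectors for a bi-objective problem (n >= 2)."""
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]


# Hypothetical objective vectors: (makespan, total workload).
z_star = (40.0, 150.0)   # assumed ideal point (best value seen per objective)
a = (50.0, 200.0)
b = (60.0, 160.0)
w = (0.5, 0.5)

print(tchebycheff(a, w, z_star))  # 25.0
print(tchebycheff(b, w, z_star))  # 10.0
```

Under the weight vector (0.5, 0.5), solution `b` has the smaller scalarized value and would replace `a` in that subproblem, even though `a` has the better makespan. Each weight vector defines one such subproblem, which is how a single population can cover the whole front.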

The MOFFJSP has been extensively studied for several years. However, most existing methods execute their local search strategies in a fixed polling pattern, which is inefficient and blind. Moreover, the performance of an algorithm is limited by parameter selection: extensive black-box testing is needed to find the best parameter, because parameter selection lacks prior knowledge and is time-consuming. It is therefore challenging and significant to develop effective algorithms for the MOFFJSP that optimize makespan and total machine workload simultaneously. Motivated by the above problems, this paper proposes an RL-based MOEA/D (RMOEA/D), which contains the following innovations. First, to guide each solution adaptively to select the best local search strategy, a variable neighborhood search based on RL (RVNS) is proposed. Second, to make MOEA/D automatically adjust the parameter $T$, a parameter selection strategy based on Q-learning (Q-PAS) is designed. Next, an initialization method integrating a variety of initial strategies is designed to obtain a well-converged and diverse population. Then, a discrete crossover and mutation method is used to obtain a large search step. Moreover, an elite archive is applied to improve the utilization rate of abandoned solutions. Finally, to verify the performance of RMOEA/D, extensive numerical tests are carried out, and the comparative results demonstrate the effectiveness of the above designs and the superiority of the proposed algorithm in solving the MOFFJSP.

The main contributions of this paper are fivefold.

(1) An initialization strategy that combines the advantages of three strategies is designed to provide an initial population with good convergence and diversity.
(2) A parameter selection strategy based on Q-learning is proposed to let MOEA/D automatically select the best $T$ and improve the diversity of the Pareto front (PF).
(3) A variable neighborhood search method based on RL is proposed to efficiently execute several local search strategies.
(4) An elite archive is designed to collect historical elite solutions and increase the utilization rate of abandoned solutions.
(5) The performance of RMOEA/D is evaluated on 23 FFJSP instances with different features. Experimental results show that RMOEA/D is superior to state-of-the-art algorithms and converges faster.
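As an illustration of the elite-archive idea in contribution (4), the sketch below maintains a plain nondominated archive for two minimized objectives. It is a generic construction under stated assumptions: how the paper actually fills, truncates, or reuses its archive is not described in this excerpt, and the helper names `dominates` and `update_archive` are hypothetical.

```python
from typing import List, Tuple

Objectives = Tuple[float, float]  # assumed order: (makespan, total workload)


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_archive(archive: List[Objectives],
                   candidate: Objectives) -> List[Objectives]:
    """Insert candidate if it is nondominated; drop members it dominates."""
    if any(dominates(m, candidate) or m == candidate for m in archive):
        return archive  # candidate is dominated or a duplicate: archive unchanged
    return [m for m in archive if not dominates(candidate, m)] + [candidate]


archive: List[Objectives] = []
for point in [(10, 5), (8, 7), (9, 4), (12, 12), (8, 7)]:
    archive = update_archive(archive, point)

print(archive)  # [(8, 7), (9, 4)]
```

In this toy run, (9, 4) evicts (10, 5), while (12, 12) and the repeated (8, 7) are rejected, so solutions discarded by the main population can still be retained whenever they are nondominated.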

The rest of this paper is organized as follows. Section 2 reviews the literature and introduces the basic concepts of triangular fuzzy processing time. In Section 3, the problem statement and modeling are introduced. In Section 4, the details of our approach, RMOEA/D, are reported. Numerical experiments on RMOEA/D and a discussion are presented in Section 5, and conclusions are summarized in Section 6.

Section snippets

Related work and background knowledge

In this section, the literature on FFJSP, MOEA/D, and RL applied to scheduling is briefly reviewed.

Problem statement

A flexible job shop scheduling problem with fuzzy processing time arising from a real-world manufacturing process can be described as follows. $J = \{J_{1}, J_{2}, \dots, J_{i}, \dots, J_{n}\}$ is the job set and $M = \{M_{1}, M_{2}, \dots, M_{k}, \dots, M_{m}\}$ is the machine set. Each job $J_{i}$ has a set of $\Theta_{i}$ operations, $O_{i} = \{O_{i, 1}, O_{i, 2}, \dots, O_{i, j}, \dots, O_{i, \Theta_{i}}\}$, $O_{i, j} \in O_{i}$. Each operation can be processed on a subset of the machines or on all machines, and its processing time is a triangular fuzzy number (TFN) ${\tilde{P}}_{i, j, k} = (p_{1}, p_{2}, p_{3})$. The FFJSP includes two subproblems, machine assignment and operation sequencing. The
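Decoding a schedule with TFN processing times needs an addition, a ranking, and a max operator on TFNs. The Python sketch below uses conventions that are common in the fuzzy scheduling literature: componentwise addition, a three-criterion ranking, and a max that returns the higher-ranked operand. This excerpt does not state which operators the paper adopts, so treat these choices, the class name `TFN`, and the numbers as assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TFN:
    """Triangular fuzzy number (p1, p2, p3): optimistic, most likely, pessimistic."""
    p1: float
    p2: float
    p3: float

    def __add__(self, other: "TFN") -> "TFN":
        # Componentwise addition of two TFNs.
        return TFN(self.p1 + other.p1, self.p2 + other.p2, self.p3 + other.p3)

    def rank_key(self) -> Tuple[float, float, float]:
        # Assumed ranking criteria, compared lexicographically:
        # (p1 + 2*p2 + p3) / 4, then p2, then the spread p3 - p1.
        return ((self.p1 + 2 * self.p2 + self.p3) / 4, self.p2, self.p3 - self.p1)


def fuzzy_max(a: TFN, b: TFN) -> TFN:
    """Approximate max: return the operand that ranks higher."""
    return a if a.rank_key() >= b.rank_key() else b


# An operation starts when both its job predecessor and its machine are free.
job_ready = TFN(4, 6, 9)       # hypothetical completion of the previous operation
machine_free = TFN(5, 6, 7)    # hypothetical release time of the chosen machine
start = fuzzy_max(job_ready, machine_free)
completion = start + TFN(2, 3, 4)  # hypothetical fuzzy processing time

print(start)       # TFN(p1=4, p2=6, p3=9)
print(completion)  # TFN(p1=6, p2=9, p3=13)
```

Here `job_ready` ranks higher (6.25 versus 6.0 on the first criterion), so it becomes the start time. The fuzzy makespan would then be the highest-ranked completion time over all jobs.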

Framework of RMOEA/D

In this section, the framework of the proposed RL-based MOEA/D is introduced. It proceeds as follows: first, the parameters and population are initialized. Then, a variable neighborhood search is performed on each solution to improve exploitation. Next, Q-learning is applied to select a parameter $T$ for MOEA/D. MOEA/D is then executed with $T$ to generate new solutions. Finally, the Q-table is updated according to the change in the PF’s convergence and diversity. And the elite archive
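The parameter selection step can be sketched as tabular Q-learning with an epsilon-greedy policy. The state encoding, the candidate values of $T$ (in standard MOEA/D, $T$ is the neighborhood size), the reward, and the hyperparameters below are all hypothetical placeholders, since the paper's state, action, and reward definitions are not given in this excerpt.

```python
import random
from typing import List, Sequence


class QParameterSelector:
    """Tabular Q-learning over a small discrete set of candidate T values."""

    def __init__(self, n_states: int, actions: Sequence[int],
                 alpha: float = 0.1, gamma: float = 0.9,
                 epsilon: float = 0.1, seed: int = 0) -> None:
        self.actions = list(actions)              # candidate T values (assumed)
        self.q: List[List[float]] = [[0.0] * len(self.actions)
                                     for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def select(self, state: int) -> int:
        """Return an action index: random with prob. epsilon, else greedy."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.actions))
        row = self.q[state]
        return row.index(max(row))

    def update(self, state: int, action: int, reward: float,
               next_state: int) -> None:
        """Standard Q-learning update toward reward + gamma * max_a' Q(s', a')."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


# Hypothetical use: two states (e.g. front improved / not improved), three T values.
selector = QParameterSelector(n_states=2, actions=[5, 10, 15],
                              alpha=0.5, epsilon=0.0)
selector.update(state=0, action=1, reward=1.0, next_state=1)
chosen = selector.actions[selector.select(0)]
print(selector.q[0][1], chosen)  # 0.5 10
```

In a full loop, the reward would be derived from the observed change in the front's convergence and diversity after running MOEA/D with the chosen $T$, which matches the update order described in the text.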

Experimental results

In Section 4, the RMOEA/D algorithm was described in detail. In this section, we design detailed experiments to evaluate RMOEA/D’s performance. RMOEA/D and the comparison algorithms are coded in MATLAB and run on an Intel Core i7-6700 CPU @ 3.4 GHz with 8 GB RAM. For fairness, each algorithm is run 30 independent times on each instance. To verify the convergence and diversity of the proposed algorithm, the results are averaged over the 30 independent runs for performance comparison.
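The run-averaging protocol can be sketched as follows. The excerpt does not name the performance metrics, so the bi-objective hypervolume used here is only an assumed stand-in for a metric that reflects both convergence and diversity. The reference point and the fronts are made-up numbers, and two runs stand in for the 30 used in the paper.

```python
from statistics import mean
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def hypervolume_2d(front: Sequence[Point], ref: Point) -> float:
    """Area dominated by `front` and bounded by `ref` (both objectives minimized)."""
    # Keep only points strictly better than the reference point, sorted by f1.
    pts = sorted(p for p in front if p[0] < ref[0] and p[1] < ref[1])
    hv, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:  # dominated points add no new area and are skipped
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv


def average_metric(runs: Sequence[Sequence[Point]], ref: Point) -> float:
    """Mean hypervolume over independent runs of one algorithm on one instance."""
    return mean(hypervolume_2d(front, ref) for front in runs)


# Two hypothetical runs on one instance.
runs: List[List[Point]] = [[(1, 3), (2, 2), (3, 1)], [(2, 2)]]
print(average_metric(runs, ref=(4, 4)))  # 5.0
```

The first front covers an area of 6 and the second an area of 4 with respect to the reference point (4, 4), giving a mean of 5.0. A larger value indicates a front that is closer to the ideal region and better spread.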

Conclusion

This paper proposed an RMOEA/D that combines two RL techniques to solve multi-objective fuzzy flexible job shop scheduling problems. The objectives are to minimize the fuzzy makespan and the total workload. As a classical adaptation technique, RL can guide the algorithm to automatically select the best parameter or local search strategy. A novel Q-learning parameter adaptation method, including state definition, action definition, and reward definition, is designed to help multi-objective optimization

CRediT authorship contribution statement

Rui Li: Resources, Project administration, Software, Data curation, Writing – original draft, Writing – review & editing. Wenyin Gong: Funding acquisition, Supervision, Conceptualization, Methodology, Writing – review & editing. Chao Lu: Methodology, Writing – review & editing.

Acknowledgments

This work was partly supported by the National Natural Science Fund of China under Grant Nos. 62076225 and 62073300, and the Natural Science Fund for Distinguished Young Scholars of Hubei, China under Grant No. 2019CFA081. All authors approved the version of the manuscript to be published.

References (60)

Ahmadi, E. et al. (2018). A hybrid method of 2-TSP and novel learning-based GA for job sequencing and tool switching problem. Applied Soft Computing.
Caldeira, R.H. et al. (2021). A Pareto based discrete jaya algorithm for multi-objective flexible job shop scheduling problem. Expert Systems with Applications.
Dorfeshan, Y. et al. (2020). A new weighted distance-based approximation methodology for flow shop scheduling group decisions under the interval-valued fuzzy processing time. Applied Soft Computing.
Du, Y. et al. (2019). MOEA based memetic algorithms for multi-objective satellite range scheduling problem. Swarm and Evolutionary Computation.
Gao, K.Z. et al. (2015). A two-stage artificial bee colony algorithm scheduling flexible job-shop scheduling problem with new job insertion. Expert Systems with Applications.
Gao, K.Z. et al. (2016). An improved artificial bee colony algorithm for flexible job-shop scheduling problem with fuzzy processing time. Expert Systems with Applications.
Lang, S. et al. (2021). NeuroEvolution of augmenting topologies for solving a two-stage hybrid flow shop scheduling problem: A comparison of different solution strategies. Expert Systems with Applications.
Lei, D. (2012). Co-evolutionary genetic algorithm for fuzzy flexible job shop scheduling. Applied Soft Computing.
Li, Y. et al. (2020). An improved artificial bee colony algorithm for solving multi-objective low-carbon flexible job shop scheduling problem. Applied Soft Computing.
Li, J. et al. (2020). Improved artificial immune system algorithm for type-2 fuzzy flexible job shop scheduling problem. IEEE Transactions on Fuzzy Systems.
Lin, J. (2015). A hybrid biogeography-based optimization for the fuzzy flexible job-shop scheduling problem. Knowledge-Based Systems.
Lin, J. (2019). Backtracking search based hyper-heuristic for the flexible job-shop scheduling problem with fuzzy processing time. Engineering Applications of Artificial Intelligence.
Lu, C. et al. (2022). A Pareto-based collaborative multi-objective optimization algorithm for energy-efficient scheduling of distributed permutation flow-shop with limited buffers. Robotics and Computer-Integrated Manufacturing.
Luo, S. (2020). Dynamic scheduling for flexible job shop with new job insertions by deep reinforcement learning. Applied Soft Computing.
Palombarini, J.A. et al. (2019). Closed-loop rescheduling using deep reinforcement learning. IFAC-PapersOnLine.
Rifai, A.P. et al. (2021). Multi-objective distributed reentrant permutation flow shop scheduling with sequence-dependent setup time. Expert Systems with Applications.
Sakawa, M. et al. (2000). Fuzzy programming for multiobjective job shop scheduling with fuzzy processing time and fuzzy duedate through genetic algorithms. European Journal of Operational Research.
Shahrabi, J. et al. (2017). A reinforcement learning approach to parameter estimation in dynamic job shop scheduling. Computers & Industrial Engineering.
Shao, W. et al. (2021). Multi-objective evolutionary algorithm based on multiple neighborhoods local search for multi-objective distributed hybrid flow shop scheduling problem. Expert Systems with Applications.
Shiue, Y.-R. et al. (2018). Real-time scheduling for a smart factory using a reinforcement learning approach. Computers & Industrial Engineering.
Wang, G. et al. (2021). Energy-efficient distributed heterogeneous welding flow shop scheduling problem using a modified MOEA/D. Swarm and Evolutionary Computation.
Wang, J.-j. et al. (2021). A cooperative memetic algorithm with learning-based agent for energy-aware distributed hybrid flow-shop scheduling. IEEE Transactions on Evolutionary Computation.
Xu, Y. et al. (2015). An effective teaching-learning-based optimization algorithm for the flexible job-shop scheduling problem with fuzzy processing time. Neurocomputing.
Yuan, S. et al. (2020). A co-evolutionary genetic algorithm for the two-machine flow shop group scheduling problem with job-related blocking and transportation times. Expert Systems with Applications.
Zhang, J. et al. (2020). MOEA/D with many-stage dynamical resource allocation strategy to solution of many-objective OPF problems. International Journal of Electrical Power & Energy Systems.
Zhu, Z. et al. (2021). A multi-objective multi-micro-swarm leadership hierarchy-based optimizer for uncertain flexible job shop scheduling problem with job precedence constraints. Expert Systems with Applications.
Brandimarte, P. (1993). Routing and scheduling in a flexible job shop by tabu search. Annals of Operations Research.
Deb, K. et al. (2014). An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, Part I: Solving problems with box constraints. IEEE Transactions on Evolutionary Computation.
Deb, K. et al. (2002). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation.
Gao, K.Z. et al. (2015). An effective discrete harmony search algorithm for flexible job shop scheduling problem with fuzzy processing time. International Journal of Production Research.

