Joint decision-making of parallel machine scheduling restricted in job-machine release time and preventive maintenance with remaining useful life constraints

https://doi.org/10.1016/j.ress.2022.108429

Highlights

  • A MILP model for parallel machine scheduling and maintenance is proposed.

  • Job-machine release and maintenance duration/interval flexibility are considered.

  • An RUL prediction method is proposed to obtain online machine reliability.

  • Online machine reliability is used to guide all maintenance activities.

Abstract

This paper jointly considers three factors in the parallel machine scheduling problem: the machine remaining useful life (RUL), the job-machine release time, and the correlation between the maintenance duration and the machine enlistment age. A mixed integer programming model is constructed to minimize the makespan and the processing loss incurred beyond the machine RUL threshold. A discrete teaching and learning based optimization algorithm is applied to solve this NP-hard problem, and a fault mode-assisted gated recurrent unit (FGRU) life prediction method is used to guide the predictive maintenance initiation time of all machines. Two actual bearing degradation cases show that the FGRU method is more accurate than three common methods (Encoder-Decoder Recurrent Neural Network, Bidirectional Long Short-Term Memory and GRU), and three benchmark cases show that the joint decision-making can effectively reduce the time cost of manufacturing enterprises.

Introduction

In an increasingly competitive market, the job-shop scheduling problem (JSP) arises from manufacturing enterprises' pursuit of sustained profit. Constraints in the JSP generally include the job arrival time, the job delivery time, the job deterioration effect, the machine processing cost, the machine functional attributes, the geographical distribution of shops, etc. Scheduling objectives of the JSP include the makespan, the total delay time, the total production cost, etc. Its frameworks are roughly divided into single machine, parallel machine, flow shop, hybrid flow shop and other targeted modes. Traditional parallel machine scheduling can be described as follows: m constant-speed machines with similar functional properties need to process n different jobs, each consisting of a single operation. As an extension of single machine scheduling and an important research field in the scheduling family, parallel machine scheduling is extensively used in semiconductor manufacturing, wafer fabrication and call-in service centers, and has attracted extensive scholarly attention since McNaughton [1] made the first study of it.

In the relentless pursuit of better processing continuity, major industrial entities and research institutions have experimented with various production models and scheduling theories. Mönch and Shen [2] studied a parallel machine scheduling problem with weighted delivery time as the objective function in a distributed manufacturing environment. Basiri et al. [3] constructed a flexible parallel machine scheduling model with fuzzy processing times, sequence-dependent setup times and reentrant job flow, for which a Pareto-based multi-objective meta-heuristic was proposed. Li et al. [4] investigated a parallel machine scheduling problem with position-dependent deteriorating jobs and DeJong's learning effects in an uncertain system, which correspond well to intermediate products in the chemical industry and to employees in human-computer interaction processing, respectively. Xu et al. [5] studied a parallel machine scheduling rule for outside processing and the due window, motivated by flexible delivery times and the sharing of trial-manufacture jobs among groups. Xiao et al. [6] analyzed parallel machine scheduling in green manufacturing in terms of time cost and carbon emissions, in line with the national policy of carbon neutrality. Beyond the parallel machine framework, Zhang et al. [7] presented a two-stage hybrid flow shop scheduling problem (HFSP) with reentrant and limited waiting time constraints based on block painting operations. Gao et al. [8] addressed a flexible job shop scheduling problem (FJSP) with fuzzy processing time and new job insertion constraints, using an artificial bee colony algorithm. Abdelmaguid et al. [9] studied the proportionate multiprocessor open shop scheduling problem (PMOSP), in which the processing time at a given center is not job-dependent. In sum, all of these studies established mixed integer programming models suited to the attributes of interest, which has long been the direction in which the traditional parallel machine problem is made concrete.

The literature above makes the basic assumption that all jobs to be processed are released at time zero, which is reasonable for deterministic orders, such as classroom administration and port unloading. However, in many cases, e.g. call center supply, health code verification and wine-making, uncertainty in the job arrival time complicates subsequent resource allocation and personnel arrangement, and easily leads to congested transportation channels and long equipment idle times. This greatly limits the applicability of the zero-release assumption. Especially in the handling of intermediates (MI) with a deterioration effect, the assumption may bring the direct risk of MI waste and the secondary risk of equipment scrapping. The job arrival time has thus become a recognized major technical issue. To mitigate these risks, a handful of scholars have placed finite restrictions on the job arrival time. Long et al. [10] investigated the steelmaking-continuous casting production scheduling problem and resolved the dilemma of job release time fluctuation to some extent. Cevikcan and Durmusoglu [11] added a workload regulation module to the parallel machine system to determine when, where and under what conditions jobs are released to the workshop. Zhou et al. [12] addressed an energy-efficient scheduling problem with unequal job arrival times, including productivity and energy cost measures, and provided decision makers with a good approximation of Pareto solutions. To summarize, manufacturing service systems with indefinite orders rely on the ability to deal with random job arrival times.

Likewise, the importance of the machine release time in scheduling is well recognized. For instance, in systems with highly correlated components and intensive personnel requirements, e.g. engine assembly, machines may be frequently and unexpectedly occupied when new orders arrive. Nevertheless, in the literature of the past few decades, consideration of the machine release time is mainly limited to single machine scheduling [13], and other scheduling frameworks often simply presume that all machines are available at time zero [14]. In practice, to provide a feasible machine allocation table for each job within the maximum allowable waiting time, planners not only need to provide a limited buffer for deterministic orders, but also need to make the final solution comply with local regulations and other cost constraints, which makes the scheduling model more complex and difficult to solve. A readable mathematical model and an optimization algorithm with lower time complexity are therefore worth discussing. Mario [13] and Defersha [15] are two pioneers in the machine release time field; interested readers can refer to their work.

Machine availability depends not only on whether the machine is occupied by other jobs, but also on the machine's performance. As machines become more sophisticated, defect and fatigue evolution can occur at any time, causing sudden incidents or slowly developing breakdowns that make the machine unavailable during shutdown maintenance. Adiri et al. [16] first noticed the unavailability caused by machine failure and stated a single machine flow-time scheduling problem with a single breakdown. Wang et al. [17] addressed a proactive scheduling problem with stochastic machine breakdown arising from steel production. Abbas et al. [18] argued that sudden failures would disrupt pre-established planning. Together, these works provide a forward-looking theoretical discussion of predictive maintenance for different types of machine failures. In practical production, for the sake of continuous benefits and low restart costs, e.g. die change and relocation [19], mainstream maintenance has shifted fundamentally from periodic or corrective maintenance to well-timed predictive maintenance based on machine status, which can both avoid unnecessary stoppages induced by redundant interventions and prevent cascading failures induced by overdue renovation. Note that, although some studies have considered the machine unavailability constraint, they tend to focus on periodic or single maintenance and ignore the changing reliability, as in Refs. [16], [17], [18]. Such a simplification is not valid in many real-world applications. Therefore, how to balance production scheduling and preventive maintenance, how to keep machines in a good operating state with high reliability, and how to unify scheduling, maintenance and reliability remain key issues to be solved in the realization of intelligent manufacturing.

As a foundation for studying complicated systems, joint decision-making for a single machine has been widely studied. Kacem [20] proposed a single machine scheduling case with a fixed maintenance interval, where all machines had only one definite unavailability period. Yu et al. [21] assumed that the multi-cycle maintenance intervals in a single machine system with non-preemptive jobs were rigid and known in advance. Zou and Yuan [22] attached two constraints, a maximum allowable maintenance interval and a fixed maintenance duration, to single machine scheduling with job rejection. For parallel machine systems with predictive maintenance, Hsieh and You [23] considered the current maintenance cycle to be flexible and dependent only on the number of jobs in a continuous processing batch since the last maintenance. Marsili et al. [24] assumed that maintenance based on short and frequent interruptions had less impact on system operability than maintenance based on longer and sporadic interventions. Pang et al. [25] stated that the flexible maintenance cycle was applicable to a subset of machines and related only to the machine service age. In conclusion, the current quantitative criteria for machine unavailability constraints basically do not match the current reliability indicators of machines. There is still no unified standard linking the maintenance trigger to machine reliability. Fortunately, over the past few decades, the failure distribution model of manufacturing systems, a criterion for quantifying a machine's ability to perform specific functions, has been intensively investigated. Considering the deterioration effect of jobs and machines, Cui et al. [26] studied the Weibull function for a single machine system, with the strategy of performing predetermined maintenance to recover the system whenever the machine performance given by the Weibull function fell below the reliability threshold. This is also called the threshold-based maintenance strategy. Tao et al. [27] introduced a recursive decline factor and a failure rate rise factor into the Weibull distribution in an incomplete maintenance scenario. Sousa et al. [28] used a radial basis function to reduce unnecessary maintenance insertion and resource delivery for corroded pipelines. Inspired by fault perception classification, Ke et al. [29] evaluated the reliability of each machine based on the memoryless property of adjacent maintenances. It is worth noting that, for simplicity, these studies followed the condition-based maintenance strategy of “maintenance at reliability threshold”. In fact, on the one hand, early maintenance can enhance the product yield rate, reduce the overhaul cost and ensure the consistency of job physical properties. On the other hand, when machine reliability reaches the given threshold, some maintenance resources, e.g. spare parts inventory and outsourcing supply, may not meet the necessary conditions for immediate maintenance. The reliability threshold should therefore be regarded as the latest maintenance deadline rather than the maintenance trigger. Furthermore, these studies treated the maintenance duration associated with each breakdown as an expected or fixed value, whereas in many engineering practices the maintenance durations of machines in different degradation stages actually differ. In view of the gaps among the current maintenance strategy, maintenance duration and realistic maintenance concept, this paper sets a linear relationship between the maintenance duration and machine reliability, where the near-optimal maintenance initiation time is determined within the reliability threshold by the optimization algorithm described later, and is no longer entirely controlled by the time when the reliability threshold is first crossed.
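The threshold-based strategy and the linear duration rule discussed above can be illustrated with a minimal sketch. It assumes a two-parameter Weibull reliability function R(t) = exp(-(t/scale)^shape); the function names and the coefficients `base` and `slope` are hypothetical and are not taken from the paper, whose actual linear relationship is not given in this excerpt.

```python
import math


def weibull_reliability(age, shape, scale):
    """Two-parameter Weibull reliability R(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((age / scale) ** shape))


def latest_maintenance_age(threshold, shape, scale):
    """Machine age at which R(t) drops to the reliability threshold.

    Under the view argued in the text, this is the latest admissible
    maintenance start, not the mandatory trigger.
    """
    return scale * (-math.log(threshold)) ** (1.0 / shape)


def maintenance_duration(reliability, base, slope):
    """Hypothetical linear rule: lower reliability means longer maintenance.

    `base` is the duration for a machine in perfect condition (R = 1) and
    `slope` is the extra duration per unit of lost reliability. Both
    coefficients are illustrative assumptions.
    """
    return base + slope * (1.0 - reliability)
```

With such a rule, an optimizer is free to start maintenance at any age up to `latest_maintenance_age(...)`, trading a shorter maintenance duration (earlier start, higher reliability) against more frequent interruptions.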

Moreover, most studies, including [21], [22], [23], [24], [25], [26], [27], [28], [29], use a ready-made statistical model for off-line monitoring of machines as maintenance guidance, without considering online the stochastic fault modes induced by the dynamic service environment, e.g. headstock overheating caused by unlubricated friction and tool resonance associated with abrupt load changes. By the same token, although Failure Mode/Effect and Criticality Analysis (FMEA/FMECA) can predetermine the relative severity and probability of failures, it essentially deals with the failure possibility of a machine before service or in a static service environment, not the failure probability of the monitored system in operation or in a dynamic service environment, nor can it give the approximate duration of fragile parts from early failure to complete failure. Rather than acquiring the time-variant failure probability of monitored components in the dynamic environment, we prefer to predict the degradation duration of monitored components until total failure, which can help business owners plan time-bound preparations in advance.

To monitor life-cycle machine reliability online, and inspired by the ideas in [30,31], our motivation is to develop a data-driven remaining useful life (RUL) prediction model that considers the influence of stochastic fault modes on RUL prediction accuracy throughout the lead time. To the best of our knowledge, this is an early attempt to combine RUL prediction with the joint decision-making of maintenance and scheduling. Nowadays, intelligent transformation is gradually turning monitoring data from a production by-product into a key production asset [32]. Through the analysis and mining of massive degradation data, the experience- or model-driven maintenance management of manufacturing units is undergoing a significant revolution [33,34]. Indeed, this is what generalized predictive maintenance does: it integrates condition monitoring, fault diagnosis, RUL prediction, maintenance decision-making and maintenance activities into the dispatching automation system. Nonetheless, scholars in the RUL field tend to pursue improvements in RUL prediction accuracy rather than investigate the specific maintenance logic around resource supply [35,36], which to a certain extent stands aloof from industrial practice. In this respect, our modest contribution is an early exploration of applying RUL prediction in a parallel machine system.

Finally, by bringing together three latent terms in the parallel machine system, namely job-machine release times, predicted RUL (online reliability) and maintenance duration associated with enlistment age, this article formulates a joint decision-making problem that minimizes the makespan and the machining penalty beyond the RUL threshold, with the aim of obtaining the near-optimal job processing/release sequences, job-machine combination and maintenance initiation time. Nevertheless, evaluating the MRP-II coherence under uncertain maintenance duration and reliability is not straightforward and faces three challenges: (1) due to the uncertainty of the job-machine release time, the idle time spent waiting for jobs or machines is a stochastic quantity, and propagating this uncertainty to the makespan is challenging; (2) under stochastic maintenance duration and initiation time, the existing evaluation methods for maintenance insertion are no longer effective; (3) solving the parallel machine scheduling and preventive maintenance model under RUL guidance is computationally tedious and involves more constraints. Consequently, a short-cut discrete optimization algorithm is needed to reduce the computational burden.
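To make challenge (1) concrete, the following sketch shows how job and machine release times produce idle time and feed into the makespan for one fixed realization of the release times. It is a simplified illustration only: maintenance insertion and the RUL penalty are omitted, and the function name and data layout are assumptions rather than the paper's formulation.

```python
def evaluate_schedule(sequences, job_release, machine_release, proc_time):
    """Return (makespan, total idle time) of a parallel machine schedule.

    sequences       : dict mapping machine -> ordered list of jobs on it
    job_release     : dict mapping job -> earliest time the job is available
    machine_release : dict mapping machine -> earliest time it is available
    proc_time       : dict mapping job -> processing duration

    A job starts when both the machine is free and the job is released, so
    any gap between those two moments is idle time on that machine.
    """
    makespan = 0.0
    total_idle = 0.0
    for machine, jobs in sequences.items():
        if not jobs:
            continue  # an unused machine does not contribute to the makespan
        clock = machine_release[machine]
        for job in jobs:
            start = max(clock, job_release[job])
            total_idle += start - clock
            clock = start + proc_time[job]
        makespan = max(makespan, clock)
    return makespan, total_idle


example = evaluate_schedule(
    sequences={0: ["a", "b"], 1: ["c"]},
    job_release={"a": 0, "b": 5, "c": 2},
    machine_release={0: 0, 1: 3},
    proc_time={"a": 4, "b": 2, "c": 3},
)
```

In this example machine 0 finishes job a at time 4 but must wait until time 5 for job b, giving one unit of idle time and a makespan of 7. When the release times are random, this evaluation has to be repeated or averaged over realizations, which is where the difficulty described in the text arises.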

In combinatorial optimization, especially for the parallel machine system, the branch-and-bound algorithm, which can find the exact optimal solution, carries a heavy computational burden, so it is not considered here. Heuristic or meta-heuristic algorithms are more effective because they do not require the objective function to satisfy specific conditions or have favorable mathematical properties [37]. These algorithms (particle swarm optimization (PSO) [38], whale optimization algorithm (WOA) [39], fruit fly optimization algorithm (FOA) [40], etc.) have attracted increasing attention and are widely used in various engineering problems. As one of the most representative heuristic algorithms, the teaching and learning based optimization (TLBO) algorithm was developed by Rao et al. [41] and does not require any algorithm-specific parameters. Its optimization process includes a teaching stage and a learning stage. In the teaching stage, each learner learns from the teacher (the best learner). In the learning stage, each learner learns from another learner chosen at random. Owing to its relatively competitive global search and convergence performance, TLBO has been regarded as a bright new star among heuristic algorithms [42]. Despite this generality, the original meta-heuristic algorithms are designed for continuous functions. To execute the optimizer successfully on the discrete combinatorial optimization problem, this paper encodes the discrete variables, namely the job-machine combination and the job processing/release sequence, into continuous computable variables, and then establishes a discrete TLBO algorithm.
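The two TLBO stages and the idea of encoding discrete decisions as continuous variables can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it uses a random-key encoding chosen here for simplicity (the integer part of each key selects the machine, the fractional part orders jobs on that machine), it minimizes the makespan only, and it ignores maintenance and the RUL penalty. All function and parameter names are hypothetical.

```python
import random


def decode(keys, n_machines):
    """Map a real-valued key vector to per-machine job sequences."""
    slots = {j: [] for j in range(n_machines)}
    for job, key in enumerate(keys):
        machine = min(int(key), n_machines - 1)       # integer part: machine
        slots[machine].append((key - machine, job))   # fraction: position
    return {j: [job for _, job in sorted(items)] for j, items in slots.items()}


def makespan(keys, n_machines, job_release, machine_release, proc_time):
    """Makespan of the decoded schedule with job and machine release times."""
    worst = 0.0
    for machine, jobs in decode(keys, n_machines).items():
        clock = machine_release[machine]
        for job in jobs:
            clock = max(clock, job_release[job]) + proc_time[job]
        if jobs:
            worst = max(worst, clock)
    return worst


def discrete_tlbo(n_machines, job_release, machine_release, proc_time,
                  pop_size=20, iterations=100, seed=0):
    rng = random.Random(seed)
    n_jobs = len(proc_time)
    upper = n_machines - 1e-9  # keep keys inside [0, n_machines)

    def cost(keys):
        return makespan(keys, n_machines, job_release, machine_release, proc_time)

    def clip(value):
        return min(max(value, 0.0), upper)

    pop = [[rng.uniform(0.0, upper) for _ in range(n_jobs)] for _ in range(pop_size)]
    fit = [cost(x) for x in pop]
    for _ in range(iterations):
        # Teaching stage: move each learner toward the teacher (best learner)
        # and away from the class mean scaled by a teaching factor of 1 or 2.
        teacher = pop[min(range(pop_size), key=fit.__getitem__)][:]
        mean = [sum(x[d] for x in pop) / pop_size for d in range(n_jobs)]
        for i in range(pop_size):
            tf = rng.choice((1, 2))
            cand = [clip(pop[i][d] + rng.random() * (teacher[d] - tf * mean[d]))
                    for d in range(n_jobs)]
            c = cost(cand)
            if c < fit[i]:  # greedy acceptance
                pop[i], fit[i] = cand, c
        # Learning stage: each learner interacts with one random peer and
        # moves away from it if the peer is worse, toward it otherwise.
        for i in range(pop_size):
            j = rng.choice([k for k in range(pop_size) if k != i])
            sign = 1.0 if fit[i] < fit[j] else -1.0
            cand = [clip(pop[i][d] + sign * rng.random() * (pop[i][d] - pop[j][d]))
                    for d in range(n_jobs)]
            c = cost(cand)
            if c < fit[i]:
                pop[i], fit[i] = cand, c
    best = min(range(pop_size), key=fit.__getitem__)
    return decode(pop[best], n_machines), fit[best]
```

Because the continuous update rules are left untouched and only the decoding step is problem-specific, the same loop can be reused with a richer cost function. The result is a heuristic solution and carries no optimality guarantee.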

To highlight the novelty of this work relative to existing joint scheduling and predictive maintenance models, its main contributions can be summarized in the following three aspects: (1) A new joint decision-making model for scheduling planning and predictive maintenance is put forth for the parallel machine system, with the proposed discrete teaching-learning based optimization algorithm as the solver; (2) The job-machine release time, the job release sequence and the maintenance duration flexibility are considered interactively as influence factors, and a novel data-driven remaining useful life prediction method informed by stochastic fault modes is formulated to guide all maintenance activities. This is the first time that data-driven remaining useful life prediction is considered in the dynamic preventive maintenance policy of parallel machines; (3) Comparative experiments are conducted using two degradation datasets and three job-machine benchmark datasets, and the results demonstrate the efficiency and superiority of the proposed RUL prediction method and joint decision-making model for parallel machines.

The remainder of this paper is organized as follows: Section 2 describes the basic scheduling assumptions and the RUL problem definition. Section 3 formulates the new joint decision-making of parallel machine scheduling and predictive maintenance, and explains the discretization process of the TLBO algorithm. Section 4 introduces the structure of the fault mode-assisted RUL prediction process in detail. Section 5 presents the comparative tests and the experimental analysis, and Section 6 concludes this study and discusses some future work.

Section snippets

Preliminaries

In this section, to clearly state the joint decision-making (SPM) and RUL prediction problems, we roughly standardize the framework of the parallel machine system, the premise assumptions of joint decision-making and the RUL concept.

SPM formulation

In this study, there are m machines and n jobs in total. j is the machine index, and lj jobs will be assigned to machine j; that is, lj is the number of machining positions on machine j, where k is the machining position index. i and i′ denote job numbers, and the ordered release position of each job is recorded as g or g′. The release preparation duration and the release time of job i are denoted as qi and Qi, respectively, and the processing duration of job i is recorded as pi. The processing

Fault mode assisted RUL prediction

In this study, the proposed RUL prediction method is divided into four parts: (1) fault mode diagnosis, (2) determination of the first prediction time (FPT), (3) training of multiple candidate RUL prediction models based on degradation data with different fault modes, and (4) prediction of the RUL of the monitored component after the FPT using the RUL prediction model corresponding to the diagnosed fault mode.
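The online part of this four-part flow (diagnose the fault mode, find the FPT, then dispatch to the model trained for that mode) can be sketched as below. The excerpt does not state how the FPT is determined or how diagnosis is performed, so the sketch assumes a common mean-plus-n-sigma alarm rule on a scalar health indicator and treats the diagnoser and the per-mode predictors as user-supplied callables. Every name here is hypothetical, and the predictors stand in for the trained FGRU models.

```python
from statistics import mean, stdev


def first_prediction_time(indicator, baseline_len=20, n_sigma=3.0, consecutive=3):
    """Index at which degradation is considered to have started.

    Assumed rule (not from the paper): the first sample of a run of
    `consecutive` samples that all exceed mean + n_sigma * std of the
    healthy baseline. Returns None if no such run exists.
    """
    baseline = indicator[:baseline_len]
    limit = mean(baseline) + n_sigma * stdev(baseline)
    run = 0
    for idx in range(baseline_len, len(indicator)):
        run = run + 1 if indicator[idx] > limit else 0
        if run == consecutive:
            return idx - consecutive + 1
    return None


def predict_rul(indicator, diagnose, models, **fpt_options):
    """Dispatch RUL prediction to the model matching the diagnosed fault mode.

    diagnose : callable mapping the observed indicator to a fault mode label
    models   : dict mapping fault mode label -> callable that returns an RUL
               estimate from the post-FPT segment
    Returns (fpt, mode, rul), or None while the component is still healthy.
    """
    mode = diagnose(indicator)
    fpt = first_prediction_time(indicator, **fpt_options)
    if fpt is None:
        return None
    return fpt, mode, models[mode](indicator[fpt:])
```

Keeping one predictor per fault mode means a new fault mode only requires training and registering one more entry in `models`, while the dispatch logic stays unchanged.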

Case study

To verify the practicability and effectiveness of the proposed approach, we use two degradation cases for RUL prediction and three simulated data sets for the joint decision-making. All models and algorithms were coded in Matlab 2018 and run on a personal computer with a 2.10 GHzn2 CPU and 4.0 GB RAM. In the first degradation case, an accelerated degradation test platform was built for roller bearings, which are regarded as a vulnerable part of the machine, and the vibration signals were collected

Conclusions and future works

In this paper, by taking account of (1) the online uncertainty of machine reliability caused by the dynamic service environment, (2) the random release times of jobs and machines, and (3) the relationship between the flexible maintenance duration and the enlistment age of machines, a joint decision-making MIP model of parallel machine scheduling and predictive maintenance is established and evaluated with the aim of minimizing the total makespan, where the processing penalty beyond RUL

CRediT authorship contribution statement

Xinxin He: Methodology, Writing – original draft. Zhijian Wang: Conceptualization, Funding acquisition. Yanfeng Li: Writing – review & editing. Svetlana Khazhina: Methodology, Writing – review & editing. Wenhua Du: Project administration, Writing – review & editing. Junyuan Wang: Visualization, Data curation. Wenzhao Wang: Validation, Supervision, Writing – original draft.

Declaration of Competing Interest

There is no conflict of interest among the authors, and all authors agree to submit the manuscript to the journal.

Acknowledgements

The authors would like to acknowledge the Shanxi Provincial Natural Science Foundations of China (Nos. 201801D221339, 201801D221237, 201901D111239), the Open Research Fund for Key Discipline Laboratory of Super-Pressure Hurting Technology (No. DXMBJJ2019-01) and the National Natural Science Foundation of China (No. 51905496) for their support.

References (50)

  • X. Li et al. Attention-based deep survival model for time series data. Reliab Eng Syst Saf (2022)
  • T. Li et al. Hierarchical attention graph convolutional network to fuse multi-sensor signals for remaining useful life prediction. Reliab Eng Syst Saf (2021)
  • Q. Hu et al. Robust recurrent neural network modeling for software fault detection and correction prediction. Reliab Eng Syst Saf (2007)
  • G. Liao et al. Remaining useful life prediction for multi-phase deteriorating process based on Wiener process. Reliab Eng Syst Saf (2021)
  • X. Xu et al. Remaining useful life prediction of lithium-ion batteries based on Wiener process under time-varying temperature condition. Reliab Eng Syst Saf (2021)
  • H. Ouyang et al. Teaching-learning based optimization with global crossover for global optimization problems. Appl Math Comput (2015)
  • E. Zhang et al. A practical approach for solving multi-objective reliability redundancy allocation problems using extended bare-bones particle swarm optimization. Reliab Eng Syst Saf (2014)
  • X. Zhang et al. Bearing fault diagnosis using a whale optimization algorithm-optimized orthogonal matching pursuit with a combined time-frequency atom dictionary. Mech Syst Signal Process (2018)
  • Y. Zhang et al. A novel multi-scale cooperative mutation fruit fly optimization algorithm. Knowl Based Syst (2016)
  • R.V. Rao et al. Teaching-learning-based optimization: an optimization method for continuous non-linear large scale problems. Inform Sci (2012)
  • M. Črepinšek et al. A note on teaching-learning-based optimization algorithm. Inform Sci (2012)
  • H. Wang et al. Remaining useful life prediction and optimal maintenance time determination for a single unit using isotonic regression and gamma process model. Reliab Eng Syst Saf (2021)
  • Y. Yu et al. Averaged Bi-LSTM networks for RUL prognostics with non-life-cycle labeled dataset. Neurocomputing (2020)
  • S. Gao et al. Operational reliability evaluation and prediction of rolling bearing based on isometric mapping and NoCuSa-LSSVM. Reliab Eng Syst Saf (2020)
  • R. McNaughton. Scheduling with deadlines and loss functions. Manag Sci (1959)