Abstract
Given the dynamic and uncertain production environment of job shops, a scheduling strategy with adaptive features must be developed to cope with changing production factors. To this end, a dynamic scheduling system model based on multi-agent technology, comprising machine, buffer, state, and job agents, was built. A weighted Q-learning algorithm based on clustering and dynamic search was used to select the most suitable operation and to optimize production. To address the large state space caused by changes in the system state, four state features were extracted, and the dimension of the system state was reduced through clustering. To reduce the error between the actual system states and the clustered ones, a state difference degree was defined and integrated into the iteration formula of the Q function. To select the optimal state-action pair, improved search and iteration update strategies were proposed. Convergence analysis and simulation experiments indicated that the proposed adaptive strategy adapts well and is effective in different scheduling environments, and that it shows better performance in complex environments. The two contributions of this research are as follows: (1) a dynamic greedy search strategy was developed to avoid the blind searching of the traditional strategy; (2) a weighted iteration update of the Q function, including the weighted mean of the maximum fuzzy earning, was designed to improve the speed and accuracy of the learning algorithm.
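The ideas summarized in the abstract can be illustrated with a minimal Python sketch. It is not the formulation used in the article: the abstract does not give the state features, the definition of the state difference degree, or the update formula. The sketch assumes that a state is a vector of four features mapped to its nearest cluster center, that the difference degree can be stood in for by a weight that decays with distance from that center, and that "dynamic greedy search" behaves like epsilon-greedy with a shrinking exploration rate. All function names, parameters, and constants are hypothetical.

```python
import math
import random


def nearest_cluster(features, centers):
    """Return (index, distance) of the cluster center closest to the features."""
    dists = [math.dist(features, c) for c in centers]
    k = min(range(len(centers)), key=dists.__getitem__)
    return k, dists[k]


def difference_weight(distance, scale=1.0):
    """Assumed stand-in for a state difference degree.

    Equals 1 when the actual state sits on the cluster center and
    decays toward 0 as the state moves away from it.
    """
    return 1.0 / (1.0 + distance / scale)


def decayed_epsilon(step, start=0.9, floor=0.05, rate=0.01):
    """Exploration rate that shrinks as learning proceeds (assumed schedule)."""
    return max(floor, start * math.exp(-rate * step))


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice over one row of the Q table."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def weighted_q_update(q, s, a, reward, s_next, weight, alpha=0.1, gamma=0.9):
    """Standard Q-learning step with the learning rate scaled by the weight."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * weight * (target - q[s][a])
    return q[s][a]


if __name__ == "__main__":
    # Two hypothetical cluster centers over four state features,
    # and three hypothetical actions (for example, dispatching rules).
    centers = [(0.2, 0.2, 0.2, 0.2), (0.8, 0.8, 0.8, 0.8)]
    q = [[0.0, 0.0, 0.0] for _ in centers]
    rng = random.Random(0)

    s, dist = nearest_cluster((0.25, 0.2, 0.2, 0.2), centers)
    a = choose_action(q[s], decayed_epsilon(step=0), rng)
    s_next, _ = nearest_cluster((0.7, 0.8, 0.8, 0.9), centers)
    weighted_q_update(q, s, a, reward=1.0, s_next=s_next,
                      weight=difference_weight(dist))
    print(q)
```

Scaling the learning rate by the weight means that observations far from their cluster center move the Q value less, which is one simple way to limit the error introduced by treating a cluster as a single state.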







Acknowledgements
This research is supported by the National Natural Science Foundation of China under Grant No. 51705260, and by the Natural Science Foundation of the Higher Education Institution of Jiangsu Province under Grant No. 16KJD460005. I thank the editor-in-chief and the anonymous reviewers for their valuable comments and suggestions.
Cite this article
Wang, YF. Adaptive job shop scheduling strategy based on weighted Q-learning algorithm. J Intell Manuf 31, 417–432 (2020). https://doi.org/10.1007/s10845-018-1454-3
DOI: https://doi.org/10.1007/s10845-018-1454-3