Abstract
This paper presents an approach suited to Just-In-Time (JIT) production for multi-objective scheduling problems in dynamically changing shop-floor environments. The proposed distributed learning and control (DLC) approach integrates part-driven distributed arrival time control (DATC) with machine-driven control based on distributed reinforcement learning. With DATC, part controllers adjust their associated parts' arrival times to minimize due-date deviation. Within the resulting restricted pattern of arrivals, machine controllers concurrently search for optimal dispatching policies. The machine control problem is modeled as a semi-Markov decision process (SMDP) and solved using Q-learning. The DLC algorithms are evaluated by simulation for two types of manufacturing systems: family scheduling and dynamic batch sizing. Results show that the DLC algorithms achieve significant performance improvements over common dispatching rules in complex real-time shop-floor control problems for JIT production.
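The machine-side control described above can be illustrated with a minimal tabular Q-learning sketch. The state and action encoding, setup penalty, and reward shaping below are hypothetical simplifications for illustration, not the authors' implementation: a machine repeatedly chooses which of two part families to dispatch next, and switching families incurs a setup cost.

```python
import random

# Illustrative toy problem (assumed, not from the paper): the state is the
# family the machine is currently set up for; the action is which family
# to dispatch next; switching families incurs a setup penalty.
random.seed(0)
ACTIONS = [0, 1]                 # dispatch family 0 or family 1
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1

# Tabular Q-values over (state, action) pairs.
Q = {(s, a): 0.0 for s in ACTIONS for a in ACTIONS}

def reward(state, action):
    # -2 setup penalty when switching families, plus small noise standing
    # in for stochastic processing and due-date-deviation effects.
    return (-2.0 if action != state else 0.0) + random.uniform(-0.1, 0.1)

def step(state):
    # Epsilon-greedy action selection over the Q-table.
    if random.random() < EPS:
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda x: Q[(state, x)])
    r = reward(state, a)
    next_state = a               # machine is now set up for that family
    best_next = max(Q[(next_state, x)] for x in ACTIONS)
    # Standard one-step Q-learning update.
    Q[(state, a)] += ALPHA * (r + GAMMA * best_next - Q[(state, a)])
    return next_state

s = 0
for _ in range(5000):
    s = step(s)

# The learned greedy policy should prefer staying with the current family
# (i.e., batching jobs of the same family) to avoid setup penalties.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in ACTIONS}
print(policy)
```

In the paper the decision epochs are irregular (an SMDP rather than an MDP), so the discounting would depend on the sojourn time between decisions; the fixed-discount update above is only the simplest special case.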
Hong, J., Prabhu, V.V. Distributed Reinforcement Learning Control for Batch Sequencing and Sizing in Just-In-Time Manufacturing Systems. Applied Intelligence 20, 71–87 (2004). https://doi.org/10.1023/B:APIN.0000011143.95085.74