Abstract
As the maintenance market of automobile manufacturing enterprises grows, simple information technology alone is not enough to solve the problems of uneven resource allocation and low customer satisfaction in maintenance chain services. To address this, this paper abstracts automotive maintenance collaborative service into a multi-agent collaborative model based on the decentralized partially observable Markov decision process (Dec-POMDP). On top of this model, a multi-agent deep reinforcement learning algorithm based on a collaborative willingness network (CWN-MADRL) is presented. The algorithm adopts a value-decomposition-based MADRL framework, adds a collaborative willingness network alongside each agent's original action-value network, and uses an attention mechanism to strengthen the influence of inter-agent collaboration on action decision-making while saving computing resources. The evaluation results show that our CWN-MADRL algorithm converges quickly, learns effective task recommendation strategies, and achieves better system performance than other benchmark algorithms.
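To make the architecture described in the abstract concrete, the following is a minimal PyTorch sketch (not the authors' code) of the general idea: each agent keeps a standard action-value network, and an attention-based "collaborative willingness" module reweights the other agents' hidden states to adjust each agent's Q-values before a value-decomposition mixer would combine them. All layer sizes and names (`obs_dim`, `n_actions`, `CollaborativeWillingness`, etc.) are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of the CWN-MADRL idea: per-agent Q-network plus an
# attention-based collaborative willingness head. Sizes are assumptions.
import torch
import torch.nn as nn


class AgentQNet(nn.Module):
    """Per-agent action-value network producing a hidden state and Q-values."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.q_head = nn.Linear(hidden, n_actions)

    def forward(self, obs: torch.Tensor):
        h = self.encoder(obs)          # (batch, hidden)
        return h, self.q_head(h)       # hidden state and per-action values


class CollaborativeWillingness(nn.Module):
    """Attention over the team's hidden states; the attention weights act as a
    learned willingness-to-collaborate signal that adjusts an agent's Q-values."""
    def __init__(self, hidden: int = 64, n_actions: int = 5):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=hidden, num_heads=4, batch_first=True)
        self.adjust = nn.Linear(hidden, n_actions)

    def forward(self, own_h: torch.Tensor, all_h: torch.Tensor):
        # own_h: (batch, hidden); all_h: (batch, n_agents, hidden)
        query = own_h.unsqueeze(1)                        # (batch, 1, hidden)
        context, weights = self.attn(query, all_h, all_h)
        return self.adjust(context.squeeze(1)), weights   # Q adjustment + willingness weights


if __name__ == "__main__":
    n_agents, obs_dim, n_actions, batch = 3, 16, 5, 8
    agents = [AgentQNet(obs_dim, n_actions) for _ in range(n_agents)]
    cwn = CollaborativeWillingness(hidden=64, n_actions=n_actions)

    obs = torch.randn(batch, n_agents, obs_dim)
    hiddens, qs = zip(*(agents[i](obs[:, i]) for i in range(n_agents)))
    all_h = torch.stack(hiddens, dim=1)                   # (batch, n_agents, hidden)

    # Each agent's Q-values are adjusted by its willingness-weighted view of the
    # team; a QMIX-style mixer would then combine the chosen values into Q_total.
    q_adjusted = [qs[i] + cwn(all_h[:, i], all_h)[0] for i in range(n_agents)]
    print(q_adjusted[0].shape)                            # torch.Size([8, 5])
```

In a full value-decomposition pipeline, the adjusted per-agent values would feed a monotonic mixing network trained end-to-end from the team reward; the sketch only shows where the willingness attention would sit relative to the agents' action-value networks.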
Acknowledgments
This work was supported by the National Key Research and Development Program of China under Grant 2018YFB1701402 and the National Natural Science Foundation of China under Grants U1936218 and 62072037.