Abstract
As the maintenance market of automobile manufacturing enterprises grows, simple information technology alone is not enough to solve the problems of uneven resource allocation and low customer satisfaction in maintenance chain services. To address this, this paper abstracts automotive maintenance collaborative service into a multi-agent collaborative model based on the decentralized partially observable Markov decision process (Dec-POMDP). On top of this model, a multi-agent deep reinforcement learning algorithm based on a collaborative willingness network (CWN-MADRL) is presented. The algorithm adopts a value-decomposition-based MADRL framework, adds a collaborative willingness network alongside each agent's original action-value network, and uses an attention mechanism to strengthen the influence of inter-agent collaboration on action decision-making while saving computing resources. The evaluation results show that our CWN-MADRL algorithm converges quickly, learns effective task recommendation strategies, and achieves better system performance than other benchmark algorithms.
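To make the architecture described in the abstract concrete, the following is a minimal PyTorch sketch (not the authors' code) of the general idea: each agent keeps a standard action-value network, and an attention-based "collaborative willingness" module reweights the other agents' hidden states to adjust each agent's Q-values before a value-decomposition mixer would combine them. All layer sizes and names (`obs_dim`, `n_actions`, `CollaborativeWillingness`, etc.) are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch of the CWN-MADRL idea: per-agent Q-network plus an
# attention-based collaborative willingness head. Sizes are assumptions.
import torch
import torch.nn as nn


class AgentQNet(nn.Module):
    """Per-agent action-value network producing a hidden state and Q-values."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.q_head = nn.Linear(hidden, n_actions)

    def forward(self, obs: torch.Tensor):
        h = self.encoder(obs)          # (batch, hidden)
        return h, self.q_head(h)       # hidden state and per-action values


class CollaborativeWillingness(nn.Module):
    """Attention over the team's hidden states; the attention weights act as a
    learned willingness-to-collaborate signal that adjusts an agent's Q-values."""
    def __init__(self, hidden: int = 64, n_actions: int = 5):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=hidden, num_heads=4, batch_first=True)
        self.adjust = nn.Linear(hidden, n_actions)

    def forward(self, own_h: torch.Tensor, all_h: torch.Tensor):
        # own_h: (batch, hidden); all_h: (batch, n_agents, hidden)
        query = own_h.unsqueeze(1)                        # (batch, 1, hidden)
        context, weights = self.attn(query, all_h, all_h)
        return self.adjust(context.squeeze(1)), weights   # Q adjustment + willingness weights


if __name__ == "__main__":
    n_agents, obs_dim, n_actions, batch = 3, 16, 5, 8
    agents = [AgentQNet(obs_dim, n_actions) for _ in range(n_agents)]
    cwn = CollaborativeWillingness(hidden=64, n_actions=n_actions)

    obs = torch.randn(batch, n_agents, obs_dim)
    hiddens, qs = zip(*(agents[i](obs[:, i]) for i in range(n_agents)))
    all_h = torch.stack(hiddens, dim=1)                   # (batch, n_agents, hidden)

    # Each agent's Q-values are adjusted by its willingness-weighted view of the
    # team; a QMIX-style mixer would then combine the chosen values into Q_total.
    q_adjusted = [qs[i] + cwn(all_h[:, i], all_h)[0] for i in range(n_agents)]
    print(q_adjusted[0].shape)                            # torch.Size([8, 5])
```

In a full value-decomposition pipeline, the adjusted per-agent values would feed a monotonic mixing network trained end-to-end from the team reward; the sketch only shows where the willingness attention would sit relative to the agents' action-value networks.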
Acknowledgments
This work was supported by the National Key Research and Development Program of China under Grant 2018YFB1701402 and the National Natural Science Foundation of China under Grants U1936218 and 62072037.