
Integrating Reinforcement Learning with Multi-Agent Techniques for Adaptive Service Composition

Published: 25 May 2017

Abstract

Service-oriented architecture is a widely used software engineering paradigm for coping with complexity and dynamics in enterprise applications. Service composition, which provides a cost-effective way to implement software systems, has attracted significant attention from both industry and the research community. Because online services keep evolving over time and thus create a highly dynamic environment, service composition must be self-adaptive to handle unanticipated behavior that arises as services evolve. In addition, service composition should remain efficient for the large-scale service sets that are common in enterprise applications. This article presents a new model for large-scale adaptive service composition based on multi-agent reinforcement learning. The model integrates reinforcement learning with game theory: the former achieves adaptation in a highly dynamic environment, while the latter enables the agents to work toward a common task (i.e., the composition). In particular, we propose a multi-agent Q-learning algorithm for service composition that is expected to outperform both the single-agent Q-learning method and the multi-agent SARSA (State-Action-Reward-State-Action) method. Our experimental results demonstrate the effectiveness and efficiency of our approach.
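
To make the learning setup concrete, the Python sketch below shows one plausible reading of the multi-agent Q-learning formulation described above: one agent per abstract task in the workflow learns Q-values over its own candidate concrete services, and all agents update from a single shared reward, such as the aggregated QoS of the executed composite, which corresponds to the common-interest (team) view of the composition task. The CompositionEnv class, the candidate service names, and the random QoS table are hypothetical placeholders for illustration only, not the paper's actual environment, reward model, or coordination mechanism.

import random
from collections import defaultdict


class CompositionEnv:
    """Toy stand-in for a composition workflow: every agent picks one concrete
    service for its abstract task, and the shared reward is a hypothetical QoS
    score of the chosen services (here a random table, purely for illustration)."""

    def __init__(self, tasks, candidates):
        self.tasks = tasks
        self.candidates = candidates
        # hypothetical per-service QoS contribution in [0, 1]
        self.qos = {(t, s): random.random() for t in tasks for s in candidates[t]}

    def reset(self):
        return "start"

    def step(self, joint_action):
        # one-step episode: shared reward = mean QoS of the selected services
        reward = sum(self.qos[(t, s)] for t, s in joint_action.items()) / len(self.tasks)
        return "end", reward, True


class Agent:
    """One agent per abstract task; learns Q(state, candidate_service)."""

    def __init__(self, candidates, alpha=0.1, gamma=0.9, epsilon=0.2):
        self.candidates = candidates
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # (state, service) -> estimated value

    def act(self, state):
        # epsilon-greedy choice among this task's candidate services
        if random.random() < self.epsilon:
            return random.choice(self.candidates)
        return max(self.candidates, key=lambda s: self.q[(state, s)])

    def update(self, state, action, reward, next_state, done):
        # off-policy Q-learning update, bootstrapping on the greedy next value
        best_next = 0.0 if done else max(self.q[(next_state, s)] for s in self.candidates)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def train(env, agents, episodes=2000):
    # team-learning loop: every agent acts, then updates from the shared reward
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            joint = {task: agent.act(state) for task, agent in agents.items()}
            next_state, shared_reward, done = env.step(joint)
            for task, agent in agents.items():
                agent.update(state, joint[task], shared_reward, next_state, done)
            state = next_state


if __name__ == "__main__":
    tasks = ["flight", "hotel", "payment"]  # abstract tasks of the workflow
    candidates = {t: [f"{t}-svc-{i}" for i in range(5)] for t in tasks}
    env = CompositionEnv(tasks, candidates)
    agents = {t: Agent(candidates[t]) for t in tasks}
    train(env, agents)
    chosen = {t: max(candidates[t], key=lambda s: agents[t].q[("start", s)]) for t in tasks}
    print("learned composition:", chosen)

The sketch omits what makes the paper's model distinctive, notably the game-theoretic coordination among agents and the handling of evolving services; it only illustrates the Q-value bookkeeping that a team of independent learners sharing one reward would perform.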





      Published In

      ACM Transactions on Autonomous and Adaptive Systems, Volume 12, Issue 2
      June 2017
      162 pages
      ISSN: 1556-4665
      EISSN: 1556-4703
      DOI: 10.1145/3099619
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 25 May 2017
      Accepted: 01 February 2017
      Revised: 01 February 2017
      Received: 01 January 2015
      Published in TAAS Volume 12, Issue 2


      Author Tags

      1. Service composition
      2. game theory
      3. multi-agent system
      4. reinforcement learning

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • Collaborative Innovation Centers of Novel Software Technology and Industrialization and Wireless Communications Technology
      • NSFC Projects
      • Australian Research Council's Discovery Project
      • Australian Research Council's Linkage Projects funding scheme


      Cited By

      • (2025) Agentes de software basados en técnicas de aprendizaje automático. Perspectivas desde 2010 hasta 2023 [Software agents based on machine learning techniques: perspectives from 2010 to 2023]. Revista Colombiana de Tecnologías de Avanzada (RCTA) 1(45), 39-56. DOI: 10.24054/rcta.v1i45.3131. Online publication date: 1-Jan-2025.
      • (2025) Satellite Edge Computing for Mobile Multimedia Communications: A Multi-agent Federated Reinforcement Learning Approach. ACM Transactions on Autonomous and Adaptive Systems. DOI: 10.1145/3715146. Online publication date: 3-Feb-2025.
      • (2024) The State of the Art of Emergent Software Systems. IEEE Access 12, 31808-31823. DOI: 10.1109/ACCESS.2024.3369903. Online publication date: 2024.
      • (2023) Multi-Objective Service Composition Using Enhanced Multi-Objective Differential Evolution Algorithm. Computational Intelligence and Neuroscience 2023. DOI: 10.1155/2023/8184367. Online publication date: 1-Jan-2023.
      • (2023) A Solution Space Reduction Approach based on Neural Network and Clustering for Large-scale Service Composition. Proceedings of the 2023 International Conference on Artificial Intelligence, Systems and Network Security, 443-447. DOI: 10.1145/3661638.3661722. Online publication date: 22-Dec-2023.
      • (2023) A Survey on Collaborative Learning for Intelligent Autonomous Systems. ACM Computing Surveys 56(4), 1-37. DOI: 10.1145/3625544. Online publication date: 10-Nov-2023.
      • (2023) Learning in Cooperative Multiagent Systems Using Cognitive and Machine Models. ACM Transactions on Autonomous and Adaptive Systems 18(4), 1-22. DOI: 10.1145/3617835. Online publication date: 14-Oct-2023.
      • (2023) Service-Based Trajectory Planning in Multi-Drone Skyway Networks. 2023 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops), 334-336. DOI: 10.1109/PerComWorkshops56833.2023.10150327. Online publication date: 13-Mar-2023.
      • (2023) Dynamic Service Composition Method Based on Zero-Sum Game Integrated Inverse Reinforcement Learning. IEEE Access 11, 111897-111908. DOI: 10.1109/ACCESS.2023.3323584. Online publication date: 2023.
      • (2022) GOAL: Supporting General and Dynamic Adaptation in Computing Systems. Proceedings of the 2022 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, 16-32. DOI: 10.1145/3563835.3567655. Online publication date: 29-Nov-2022.
