
Integrating Reinforcement Learning with Multi-Agent Techniques for Adaptive Service Composition

Published: 25 May 2017

Abstract

Service-oriented architecture is a widely used software engineering paradigm for coping with complexity and dynamics in enterprise applications. Service composition, which provides a cost-effective way to implement software systems, has attracted significant attention from both industry and the research community. Because online services keep evolving over time and thus create a highly dynamic environment, service composition must be self-adaptive to handle unanticipated behavior that arises as services evolve. In addition, service composition should remain efficient for the large-scale service sets that are common in enterprise applications. This article presents a new model for large-scale adaptive service composition based on multi-agent reinforcement learning. The model integrates reinforcement learning with game theory: the former achieves adaptation in a highly dynamic environment, while the latter enables the agents to work toward a common task (i.e., the composition). In particular, we propose a multi-agent Q-learning algorithm for service composition that is expected to outperform both the single-agent Q-learning method and the multi-agent SARSA (State-Action-Reward-State-Action) method. Our experimental results demonstrate the effectiveness and efficiency of our approach.
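
To make the learning setup concrete, the Python sketch below shows one plausible reading of the multi-agent Q-learning formulation described above: one agent per abstract task in the workflow learns Q-values over its own candidate concrete services, and all agents update from a single shared reward, such as the aggregated QoS of the executed composite, which corresponds to the common-interest (team) view of the composition task. The CompositionEnv class, the candidate service names, and the random QoS table are hypothetical placeholders for illustration only, not the paper's actual environment, reward model, or coordination mechanism.

import random
from collections import defaultdict


class CompositionEnv:
    """Toy stand-in for a composition workflow: every agent picks one concrete
    service for its abstract task, and the shared reward is a hypothetical QoS
    score of the chosen services (here a random table, purely for illustration)."""

    def __init__(self, tasks, candidates):
        self.tasks = tasks
        self.candidates = candidates
        # hypothetical per-service QoS contribution in [0, 1]
        self.qos = {(t, s): random.random() for t in tasks for s in candidates[t]}

    def reset(self):
        return "start"

    def step(self, joint_action):
        # one-step episode: shared reward = mean QoS of the selected services
        reward = sum(self.qos[(t, s)] for t, s in joint_action.items()) / len(self.tasks)
        return "end", reward, True


class Agent:
    """One agent per abstract task; learns Q(state, candidate_service)."""

    def __init__(self, candidates, alpha=0.1, gamma=0.9, epsilon=0.2):
        self.candidates = candidates
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # (state, service) -> estimated value

    def act(self, state):
        # epsilon-greedy choice among this task's candidate services
        if random.random() < self.epsilon:
            return random.choice(self.candidates)
        return max(self.candidates, key=lambda s: self.q[(state, s)])

    def update(self, state, action, reward, next_state, done):
        # off-policy Q-learning update, bootstrapping on the greedy next value
        best_next = 0.0 if done else max(self.q[(next_state, s)] for s in self.candidates)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def train(env, agents, episodes=2000):
    # team-learning loop: every agent acts, then updates from the shared reward
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            joint = {task: agent.act(state) for task, agent in agents.items()}
            next_state, shared_reward, done = env.step(joint)
            for task, agent in agents.items():
                agent.update(state, joint[task], shared_reward, next_state, done)
            state = next_state


if __name__ == "__main__":
    tasks = ["flight", "hotel", "payment"]  # abstract tasks of the workflow
    candidates = {t: [f"{t}-svc-{i}" for i in range(5)] for t in tasks}
    env = CompositionEnv(tasks, candidates)
    agents = {t: Agent(candidates[t]) for t in tasks}
    train(env, agents)
    chosen = {t: max(candidates[t], key=lambda s: agents[t].q[("start", s)]) for t in tasks}
    print("learned composition:", chosen)

The sketch omits what makes the paper's model distinctive, notably the game-theoretic coordination among agents and the handling of evolving services; it only illustrates the Q-value bookkeeping that a team of independent learners sharing one reward would perform.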





      Published In

      ACM Transactions on Autonomous and Adaptive Systems, Volume 12, Issue 2
      June 2017
      162 pages
      ISSN: 1556-4665
      EISSN: 1556-4703
      DOI: 10.1145/3099619
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 25 May 2017
      Accepted: 01 February 2017
      Revised: 01 February 2017
      Received: 01 January 2015
      Published in TAAS Volume 12, Issue 2


      Author Tags

      1. Service composition
      2. game theory
      3. multi-agent system
      4. reinforcement learning

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • Collaborative Innovation Centers of Novel Software Technology and Industrialization and Wireless Communications Technology
      • NSFC Projects
      • Australian Research Council's Discovery Project
      • Australian Research Council's Linkage Projects funding scheme


      Cited By

      • (2025) Agentes de software basados en técnicas de aprendizaje automático. Perspectivas desde 2010 hasta 2023 [Software agents based on machine learning techniques: perspectives from 2010 to 2023]. Revista Colombiana de Tecnologías de Avanzada (RCTA) 1(45), 39-56. DOI: 10.24054/rcta.v1i45.3131. Online publication date: 1-Jan-2025.
      • (2025) Satellite Edge Computing for Mobile Multimedia Communications: A Multi-agent Federated Reinforcement Learning Approach. ACM Transactions on Autonomous and Adaptive Systems. DOI: 10.1145/3715146. Online publication date: 3-Feb-2025.
      • (2024) The State of the Art of Emergent Software Systems. IEEE Access 12, 31808-31823. DOI: 10.1109/ACCESS.2024.3369903. Online publication date: 2024.
      • (2023) Multi-Objective Service Composition Using Enhanced Multi-Objective Differential Evolution Algorithm. Computational Intelligence and Neuroscience 2023. DOI: 10.1155/2023/8184367. Online publication date: 1-Jan-2023.
      • (2023) A Solution Space Reduction Approach based on Neural Network and Clustering for Large-scale Service Composition. Proceedings of the 2023 International Conference on Artificial Intelligence, Systems and Network Security, 443-447. DOI: 10.1145/3661638.3661722. Online publication date: 22-Dec-2023.
      • (2023) A Survey on Collaborative Learning for Intelligent Autonomous Systems. ACM Computing Surveys 56(4), 1-37. DOI: 10.1145/3625544. Online publication date: 10-Nov-2023.
      • (2023) Learning in Cooperative Multiagent Systems Using Cognitive and Machine Models. ACM Transactions on Autonomous and Adaptive Systems 18(4), 1-22. DOI: 10.1145/3617835. Online publication date: 14-Oct-2023.
      • (2023) Service-Based Trajectory Planning in Multi-Drone Skyway Networks. 2023 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops), 334-336. DOI: 10.1109/PerComWorkshops56833.2023.10150327. Online publication date: 13-Mar-2023.
      • (2023) Dynamic Service Composition Method Based on Zero-Sum Game Integrated Inverse Reinforcement Learning. IEEE Access 11, 111897-111908. DOI: 10.1109/ACCESS.2023.3323584. Online publication date: 2023.
      • (2022) GOAL: Supporting General and Dynamic Adaptation in Computing Systems. Proceedings of the 2022 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, 16-32. DOI: 10.1145/3563835.3567655. Online publication date: 29-Nov-2022.
