Abstract
We introduce a learning controller framework for adaptive control in application service management environments and explore its potential. Run-time metrics collected by observing the enterprise system during normal operation and load tests are persisted, creating a knowledge base of real system states. Equipped with this knowledge, the proposed framework associates system states and high/low service level agreement (SLA) values with successful/unsuccessful control actions. These associations are used to induce decision rules, which help generate training sets for a neural network-based control decision module that operates in the application run-time. Control actions are executed against the background of the current system state, which is then monitored again and stored, extending the system state repository/knowledge base and allowing the correctness of the control actions to be evaluated frequently. This incremental learning leads to evolving controller behavior by taking into account the consequences of earlier actions in a particular situation or in other similar situations. Our tests demonstrate that the controller adapts to changing run-time conditions and workloads based on SLA definitions and effectively controls the instrumented system under overload.
Notes
The system resources can be modified in order to change important run-time characteristics—this scenario is not discussed in this paper.
The term controller is used as the name of a function of the proposed framework. Due to the distributed nature of the framework and the lack of a model of the enterprise system, the function is split into two software components that are deployed in isolation; see Sects. 5.4 and 5.6, which describe the evaluator, which generates control rules by learning from available data, and the controller API, which coordinates the work of actuator agents.
http://httpd.apache.org/docs/trunk/filter.html in Java Servlet specification since version 2.3 (Coward and Yoshida 2003).
Java-ML is a Java machine learning library, available under the GPL, http://java-ml.sourceforge.net/content/feature-selection.
240 system state points in an hour generate 5.7 K in a day, 40 K in a week, 2.1 M in a year.
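The figures in this note follow from the sampling rate alone. A minimal arithmetic check, assuming 24-hour days, 7-day weeks and a 365-day year (the note does not state which year length was used; the names below are illustrative only):

```python
POINTS_PER_HOUR = 240  # rate given in the note; about one state point every 15 s


def state_points(hours: int) -> int:
    """Number of system state points accumulated over the given number of hours."""
    return POINTS_PER_HOUR * hours


per_day = state_points(24)          # 5,760     (quoted as 5.7 K)
per_week = state_points(24 * 7)     # 40,320    (quoted as 40 K)
per_year = state_points(24 * 365)   # 2,102,400 (quoted as 2.1 M)

print(per_day, per_week, per_year)
```

The quoted values are therefore rounded down from the exact counts.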
Merge sets of states of high SLA values and “good” control to promote these control actions.
Merge sets of states of low SLA values and “bad” control to prevent harmful control decisions.
Trained networks are sent to the controller actuators, so they are directly available in the controlled application run-time.
Select the m system states S with the highest SLAs as “good” control states, where the penalties for the control SLA apc (S) were lower than the total of SLAs in the state neighborhood.
Select the m states with the lowest SLAs as “bad” control states, where the penalties for the control SLA apc (S) were higher than the SLAs (too harmful control); these rules instruct the controller to avoid applying any control in such states.
A “good” control decision is one that makes the situation better, so select system states S where the total of SLAs is lower for similar state parameters.
A “bad” control decision is one that makes the situation worse, so select states where SLAs are higher for similar state parameters.
In the simplest form, both conflicting system states are removed. Other variants of conflict resolution were researched as well.
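The notes on merging state sets and on the simplest conflict handling can be illustrated with a short sketch. It rests on several assumptions that are not taken from the paper: a system state is reduced to a hashable tuple of discretized metrics, "merge" is read as a set union, and a conflict is taken to mean the same state appearing on both the promoting and the preventing side. The function name `merge_and_resolve` and the example states are hypothetical.

```python
from typing import Hashable, Iterable, Set, Tuple

State = Hashable  # assumption: a state is a hashable tuple of discretized metrics


def merge_and_resolve(
    high_sla: Iterable[State],
    good_control: Iterable[State],
    low_sla: Iterable[State],
    bad_control: Iterable[State],
) -> Tuple[Set[State], Set[State]]:
    """Return (promote, avoid) state sets with conflicting states removed."""
    # "Merge" is interpreted here as a set union (an assumption).
    promote = set(high_sla) | set(good_control)
    avoid = set(low_sla) | set(bad_control)
    # Simplest conflict handling: a state found on both sides is dropped from both.
    conflicts = promote & avoid
    return promote - conflicts, avoid - conflicts


# Hypothetical states encoded as (cpu_bucket, request_rate_bucket).
promote, avoid = merge_and_resolve(
    high_sla=[(1, 1), (2, 1)],
    good_control=[(2, 1), (3, 2)],
    low_sla=[(3, 2), (4, 4)],
    bad_control=[(4, 3)],
)
print(sorted(promote), sorted(avoid))
```

In this example the state `(3, 2)` is labeled both ways, so it contributes to neither rule set. The other conflict-resolution variants mentioned in the note are not covered by this sketch.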
Neuroph 2.4 is an open source Java neural network framework, http://neuroph.sourceforge.net/. It contains a run-time reference to another NN library, Encog, http://code.google.com/p/encog-java/. Both are published under the Apache 2.0 license (Apache 2004).
The DDD sample application (web, 2012b) was used as a model enterprise application, with a few modifications mainly concerning the load characteristics. It was developed to demonstrate the Domain-Driven Design concepts introduced by Evans (2004) and is freely available under the MIT License, http://www.opensource.org/licenses/MIT.
Grinder (Aston and Fitzgerald 2005) is a general-purpose Java load testing framework specialized in using many load injector machines for flexible load tests (Jython scripts) in distributed environments/systems. It is freely available under the BSD license, http://www.opensource.org/licenses/bsd-license.php.
During the simulations the evaluator needed 20 to 90 s to run the algorithm described above. While the evaluator was running, CPU utilization was significant, which affected the application running under load and effectively changed the characteristics of the whole system. The system had to adapt to such conditions.
References
(2012a) Allmon, a generic system collecting and storing metrics used for performance and availability monitoring. http://code.google.com/p/allmon/
(2012b) DDD sample application, the project is a joint effort by Eric Evans’ company Domain Language and the Swedish software consulting company Citerus. http://dddsample.sourceforge.net/
Abdelzaher T, Lu C (2000) Modeling and performance control of internet servers. In: Proceedings of the 39th IEEE conference on decision and control, vol 3. IEEE, pp 2234–2239
Abdelzaher T, Shin K, Bhatti N (2002) Performance guarantees for web server end-systems: a control-theoretical approach. IEEE Trans Parallel Distrib Syst 13(1):80–96
Abdelzaher T, Stankovic J, Lu C, Zhang R, Lu Y (2003) Feedback performance control in software services. IEEE Control Syst 23(3):74–90
Abeel T, Van de Peer Y, Saeys Y (2009) Java-ML: a machine learning library. J Mach Learn Res 10:931–934
Abrahao B, Almeida V, Almeida J, Zhang A, Beyer D, Safai F (2006) Self-adaptive SLA-driven capacity management for internet services. In: Network operations and management symposium, 2006. NOMS 2006. 10th IEEE/IFIP, IEEE, pp 557–568
Apache (2004) Apache License, Version 2.0. The Apache Software Foundation
Aston P, Fitzgerald C (2005) The Grinder, a Java load testing framework. http://grinder.sourceforge.net/
Åström KJ, Wittenmark B (2008) Adaptive control. Dover Publications
Bertoncini M, Pernici B, Salomie I, Wesner S (2011) GAMES: green active management of energy in IT service centres. Information systems evolution, pp 238–252
Bigus J (1994) Applying neural networks to computer system performance tuning. In: 1994 IEEE international conference on neural networks, 1994. IEEE world congress on computational intelligence, vol 4. IEEE, pp 2442–2447
Bigus J, Hellerstein J, Jayram T, Squillante M (2000) Autotune: a generic agent for automated performance tuning. Practical application of intelligent agents and multi agent technology
Bodík P, Griffith R, Sutton C, Fox A, Jordan M, Patterson D (2009) Statistical machine learning makes automatic control practical for internet datacenters. In: Proceedings of the 2009 conference on hot topics in cloud computing, HotCloud, vol 9
Boniface M, Nasser B, Papay J, Phillips S, Servin A, Yang X, Zlatev Z, Gogouvitis S, Katsaros G, Konstanteli K, et al (2010) Platform-as-a-service architecture for real-time quality of service management in clouds. In: 2010 Fifth international conference on internet and web applications and services (ICIW). IEEE, pp 155–160
Brown B, Chui M, Manyika J (2011) Are you ready for the era of big data? McKinsey Global Institute
Buyya R, Yeo C, Venugopal S (2008) Market-oriented cloud computing: vision, hype, and reality for delivering IT services as computing utilities. In: 10th IEEE international conference on high performance computing and communications, 2008. HPCC’08. IEEE, pp 5–13
Buyya R, Yeo C, Venugopal S, Broberg J, Brandic I (2009) Cloud computing and emerging IT platforms: vision, hype, and reality for delivering computing as the 5th utility. Future Gen Comput Syst 25(6):599–616
Chen Y, Gmach D, Hyser C, Wang Z, Bash C, Hoover C, Singhal S (2010) Integrated management of application performance, power and cooling in data centers. In: 2010 IEEE network operations and management symposium (NOMS). IEEE, pp 615–622
Cleveland W, Devlin S (1988) Locally weighted regression: an approach to regression analysis by local fitting. J Am Stat Assoc 83(403):596–610
Coward D, Yoshida Y (2003) Java servlet specification version 2.3. Sun Microsystems
Emeakaroha V, Brandic I, Maurer M, Dustdar S (2010) Low level metrics to high level SLAs - LoM2HiS framework: bridging the gap between monitored metrics and SLA parameters in cloud environments. In: 2010 international conference on high performance computing and simulation (HPCS). IEEE, pp 48–54
Evans E (2004) Domain-driven design: tackling complexity in the heart of software. Addison-Wesley Professional
Fodor I (2002) A survey of dimension reduction techniques, vol 9. Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, pp 1–18
Grinshpan L (2012) Solving enterprise applications performance puzzles: queuing models to the rescue. Wiley Online Library
Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3:1157–1182
Haines S (2006) Pro Java EE 5 performance management and optimization. Apress
Hellerstein J (2004) Challenges in control engineering of computing systems. In: Proceedings of the 2004 American control conference, vol 3. IEEE, pp 1970–1979
Hellerstein J, Parekh S, Diao Y, Tilbury D (2004) Feedback control of computing systems. Wiley–IEEE Press
Herrnstein R (1970) On the law of effect. J Exp Anal Behav 13(2):243
Kandasamy N, Abdelwahed S, Hayes J (2004) Self-optimization in computer systems via on-line control: application to power management. In: Proceedings of the international conference on autonomic computing, 2004. IEEE, pp 54–61
Kiczales G, Lamping J, Mendhekar A, Maeda C, Lopes C, Loingtier J, Irwin J (1997) Aspect-oriented programming. ECOOP’97 object-oriented programming, pp 220–242
Kowall J, Cappelli W (2012) Magic quadrant for application performance monitoring. Gartner Research ID:G00232180
Kusic D, Kephart J, Hanson J, Kandasamy N, Jiang G (2009) Power and performance management of virtualized computing environments via lookahead control. Cluster Comput 12(1):1–15
Laddad R (2009) AspectJ in action: enterprise AOP with Spring applications. Manning Publications Co.
Lu Y, Abdelzaher T, Lu C, Tao G (2002) An adaptive control framework for QoS guarantees and its application to differentiated caching. In: Tenth IEEE international workshop on quality of service, 2002. IEEE, pp 23–32
Lu Y, Abdelzaher T, Lu C, Sha L, Liu X (2003) Feedback control with queueing-theoretic prediction for relative delay guarantees in web servers. In: Proceedings of the 9th IEEE real-time and embedded technology and applications symposium, 2003. IEEE, pp 208–217
Muggleton S (1999) Inductive logic programming: issues, results and the challenge of learning language in logic. Artif Intell 114(1):283–296
Muggleton S, De Raedt L (1994) Inductive logic programming: theory and methods. J Logic Program 19:629–679
Parekh S, Gandhi N, Hellerstein J, Tilbury D, Jayram T, Bigus J (2002) Using control theory to achieve service level objectives in performance management. Real Time Syst 23(1):127–141
Park L, Baek J, Woon-Ki Hong J (2001) Management of service level agreements for multimedia internet service using a utility model. IEEE Commun Mag 39(5):100–106
Patel P, Ranabahu A, Sheth A (2009) Service level agreement in cloud computing. In: Cloud workshops at OOPSLA
Powley W, Martin P, Ogeer N, Tian W (2005) Autonomic buffer pool configuration in PostgreSQL. In: 2005 IEEE international conference on systems, man and cybernetics, vol 1. IEEE, pp 53–58
Ripley BD (2012) Lowess scatter plot smoothing, R statistical data analysis, R documentation. http://stat.ethz.ch/R-manual/R-patched/library/stats/html/lowess.html
Saeys Y, Abeel T, Van de Peer Y (2008) Robust feature selection using ensemble feature selection techniques. Machine learning and knowledge discovery in databases, pp 313–325
Skinner B (1938) The behavior of organisms: an experimental analysis
Skinner B (1963) Operant behavior. Am Psychol 18(8):503
Slotine JJE, Li W et al (1991) Applied nonlinear control, vol 1. Prentice Hall, New Jersey
Stantchev V, Schröpfer C (2009) Negotiating and enforcing QoS and SLAs in grid and cloud computing. Advances in grid and pervasive computing, pp 25–35
Sun D, Chang G, Li F, Wang C, Wang X (2011) Optimizing multi-dimensional QoS cloud resource scheduling by immune clonal with preference. Acta Electron Sin 8:018
Sutton R (1984) Temporal credit assignment in reinforcement learning. PhD thesis
Sutton R, Barto A (1998) Reinforcement learning: an introduction, vol 1. MIT Press, Cambridge
Thorndike E, Bruce D (1911) Animal intelligence: experimental studies. Transaction Pub
Vaquero L, Rodero-Merino L, Caceres J, Lindner M (2008) A break in the clouds: towards a cloud definition. ACM SIGCOMM Comput Commun Rev 39(1):50–55
Wang Z, Chen Y, Gmach D, Singhal S, Watson B, Rivera W, Zhu X, Hyser C (2009) AppRAISE: application-level performance management in virtualized server environments. IEEE Trans Netw Serv Manag 6(4):240–254
Watkins C (1989) Learning from delayed rewards. PhD thesis, King’s College, Cambridge
Welsh M, Culler D (2002) Overload management as a fundamental service design primitive. In: Proceedings of the 10th workshop on ACM SIGOPS European workshop. ACM, pp 63–69
Welsh M, Culler D (2003) Adaptive overload control for busy internet servers. In: Proceedings of the 4th USENIX conference on internet technologies and systems, vol 2
Xiong P, Wang Z, Jung G, Pu C (2010) Study on performance management and application behavior in virtualized environment. In: Network operations and management symposium (NOMS), 2010 IEEE. IEEE, pp 841–844
Zhang A, Santos P, Beyer D, Tang H (2002a) Optimal server resource allocation using an open queueing network model of response time. HP Laboratories Technical Report HPL-2002-301
Zhang R, Lu C, Abdelzaher T, Stankovic J (2002b) ControlWare: a middleware architecture for feedback control of software performance. In: Proceedings of the 22nd international conference on distributed computing systems, 2002. IEEE, pp 301–310
Acknowledgments
This work was partially sponsored by Solid Software Solutions (http://www.solidsoftware.pl/).
Cite this article
Sikora, T.D., Magoulas, G.D. Neural Adaptive Control in Application Service Management Environment. Evolving Systems 4, 267–287 (2013). https://doi.org/10.1007/s12530-013-9089-2
DOI: https://doi.org/10.1007/s12530-013-9089-2