Abstract
Storage vendors increasingly find it difficult to use existing white-box and black-box modeling techniques to build robust system models that can predict system behavior in emerging dynamic, multi-tenant data centers. White-box models are becoming brittle because model builders cannot keep up with innovations in the storage system stack, and black-box models are becoming brittle because it is increasingly difficult to train a model a priori for a dynamic, multi-tenant data center environment. Thus, there is a need for innovation in the area of system model building.
In this paper we present M-LISP, a machine-learning-based black-box modeling algorithm that can predict system behavior in untrained regions for these emerging multi-tenant, dynamic data center environments. We have implemented and analyzed M-LISP in real environments, and the initial results look very promising. We also survey some common machine learning algorithms and how well they satisfy the modeling needs of the new data center environments.
Index Terms
- Model building for dynamic multi-tenant provider environments