
Model building for dynamic multi-tenant provider environments

Published: 18 December 2012

Abstract

Storage vendors increasingly find it difficult to use existing white-box and black-box modeling techniques to build robust system models that can predict system behavior in emerging dynamic, multi-tenant data centers. White-box models are becoming brittle because model builders cannot keep up with innovations in the storage system stack, and black-box models are becoming brittle because it is increasingly difficult to train a model a priori for a dynamic, multi-tenant data center environment. Thus, there is a need for innovation in the area of system model building.

In this paper, we present a machine-learning-based black-box modeling algorithm called M-LISP that can predict system behavior in untrained regions for these emerging multi-tenant and dynamic data center environments. We have implemented and analyzed M-LISP in real environments, and the initial results look very promising. We also survey some common machine learning algorithms and how they fare in meeting the modeling needs of the new data center environments.

