What to expect when you are consolidating: effective prediction models of application performance on multicores

Cluster Computing

Abstract

Consolidation of multiple applications with diverse and changing resource requirements is common in multicore systems as hardware resources are abundant. As opportunities for better system usage become ample, so do opportunities to degrade individual application performance due to unregulated performance interference between applications and system resources. Can we predict a performance region within which application performance is expected to lie under different consolidations? Alternatively, can we maximize resource utilization while maintaining individual application performance targets? In this work we provide a methodology that answers these difficult questions by constructing a queueing-theory-based tool that can be used to accurately predict application scalability on multicores. The tool can also suggest the optimal consolidations that maximize system resource utilization while meeting application performance targets. The proposed methodology is based on asymptotic analysis, which can quickly provide a range of performance values that the user should expect under various consolidation scenarios. When more accurate performance forecasting is needed, the methodology can refine these predictions using approximate mean value analysis. The methodology is lightweight, as it relies on capturing application resource demands using standard system monitoring, via non-intrusive low-level measurements.

We evaluate our approach on an IBM Power7 system using the DaCapo and SPECjvm2008 benchmark suites. From 900 different consolidations of application instances, our tool accurately predicts the average iteration time of collocated applications with an average error below 9 per cent. Experimental and analytical results are in excellent agreement, confirming the robustness of the proposed methodology in suggesting the best consolidations that meet given performance objectives of individual applications while maximizing system resource utilization.
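
To make the idea of a "performance region" concrete, the following is a minimal sketch of textbook asymptotic bounds for a single-class closed queueing network. It is an illustration only, not the multiclass analysis of the paper; the function name `asymptotic_bounds`, its parameters, and the example demands are hypothetical.

```python
from typing import Sequence, Tuple


def asymptotic_bounds(
    demands: Sequence[float], n_jobs: int, think_time: float = 0.0
) -> Tuple[float, float, float, float]:
    """Classic asymptotic bounds for a single-class closed network.

    demands    -- service demand of one job at each resource (seconds)
    n_jobs     -- number of jobs circulating (e.g. consolidated instances)
    think_time -- delay outside the resources (assumed 0 by default)

    Returns (x_lower, x_upper, r_lower, r_upper): bounds on throughput
    and on response time (here standing in for iteration time).
    """
    if n_jobs < 1 or not demands:
        raise ValueError("need at least one job and one resource")
    d_total = sum(demands)  # total demand per job
    d_max = max(demands)    # demand at the bottleneck resource

    # Throughput is capped by the bottleneck and by the no-contention rate.
    x_upper = min(n_jobs / (d_total + think_time), 1.0 / d_max)
    # Worst case: every job queues behind all the others at every resource.
    x_lower = n_jobs / (n_jobs * d_total + think_time)

    r_lower = max(d_total, n_jobs * d_max - think_time)
    r_upper = n_jobs * d_total
    return x_lower, x_upper, r_lower, r_upper


if __name__ == "__main__":
    # Hypothetical demands for three resources, four consolidated jobs.
    print(asymptotic_bounds([0.25, 0.5, 0.25], n_jobs=4))
```

The returned interval for response time is the kind of range a user could inspect before deciding whether a more precise (and more expensive) prediction is needed.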

Notes

  1. Because the applications that we use to evaluate the methodology proposed in this paper are composed of a set of iterations, we use the average “iteration time” as a measure of the application end-to-end execution time. Effectively, a scaled iteration time (multiplied by the number of iterations) expresses the application execution time.

  2. Throughout this paper we use the terms “program”, “benchmark” and “application” interchangeably.

  3. Throughout this paper we use the terms “JVM process” and “application instance” interchangeably.

  4. MVA provides the exact solution of product-form queueing networks, that is, networks whose steady-state probabilities can be expressed as a product of factors, each describing the state of one queueing node.

  5. Of course, any alternative definition of a “target” iteration would also work.

  6. See http://www.spec.org/jvm2008.

  7. Of course, this scenario can be changed and the number of primary execution instances can be any integer. We ran experiments that varied this number from 1 to 10, but they are not reported here due to lack of space. The selected number of four primary consolidated applications is representative of all experiments.
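
As an illustration of the exact MVA mentioned in note 4, here is a minimal sketch of the standard single-class recursion for a closed product-form network with queueing stations. It is not the multiclass or approximate variant used in the paper; the function name `exact_mva` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def exact_mva(
    demands: Sequence[float], n_jobs: int, think_time: float = 0.0
) -> Tuple[float, float, List[float]]:
    """Exact single-class Mean Value Analysis.

    demands    -- service demand of one job at each queueing station
    n_jobs     -- population of the closed network
    think_time -- delay outside the stations (assumed 0 by default)

    Returns (throughput, response_time, mean_queue_lengths) at n_jobs.
    """
    queue = [0.0] * len(demands)  # mean queue lengths with 0 jobs
    throughput = 0.0
    response = 0.0
    for n in range(1, n_jobs + 1):
        # Arrival theorem: an arriving job sees the network as if it
        # held n - 1 jobs, so it waits behind queue[k] jobs on average.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        # Little's law applied to the whole network.
        throughput = n / (response + think_time)
        # Little's law applied to each station.
        queue = [throughput * r for r in residence]
    return throughput, response, queue


if __name__ == "__main__":
    # Hypothetical example: two identical stations, two jobs.
    print(exact_mva([1.0, 1.0], n_jobs=2))
```

The recursion costs one pass per job in the population, which is why approximate variants become attractive when several job classes multiply the number of population states.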

References

  1. Ansaloni, D., Chen, L.Y., Smirni, E., Binder, W.: Model-driven consolidation of Java workloads on multicores. In: Proceedings of IEEE/IFIP International Conference on Dependable Systems and Networks (DSN-PDS), pp. 1–12 (2012)

  2. Apparao, P., Iyer, R., Zhang, X., Newell, D., Adelmeyer, T.: Characterization & analysis of a server consolidation benchmark. In: Proceedings of VEE, pp. 21–30 (2008)

  3. Balbo, G., Serazzi, G.: Asymptotic analysis of multiclass closed queueing networks: common bottleneck. Perform. Eval. 26(1), 51–72 (1996)

  4. Blackburn, S., Garner, R., Hoffman, C., Khan, A., McKinley, K., Bentzur, R., Diwan, A., Feinberg, D., Frampton, D., Guyer, S., Hirzel, M., Hosking, A., Jump, M., Lee, H., Moss, J., Phansalkar, A., Stefanović, D., von Dincklage, D., Wiedermann, B.: The DaCapo benchmarks: Java benchmarking development and analysis. In: Proceedings of OOPSLA, pp. 169–190 (2006)

  5. Chen, J., John, L., Kaseridis, D.: Modeling program resource demand using inherent program characteristics. In: Proceedings of SIGMETRICS, pp. 1–12 (2011)

  6. Chen, L.Y., Ansaloni, D., Smirni, E., Yokokawa, A., Binder, W.: Achieving application-centric performance targets via consolidation on multicores: myth or reality? In: Proceedings of the International Symposium on High-Performance Parallel and Distributed Computing (HPDC), pp. 37–48 (2012)

  7. Chen, L.Y., Das, A., Qin, W., Sivasubramaniam, A., Wang, Q., Harper, R., Morris, B.: Consolidating clients on back-end servers with co-location and frequency control. ACM SIGMETRICS Perform. Eval. Rev. 34, 383–384 (2006)

  8. Dey, T., Wang, W., Davidson, J., Soffa, M.: Characterizing multi-threaded applications based on shared-resource contention. In: Proceedings of ISPASS, pp. 76–86 (2011)

  9. Govindan, S., Liu, J., Kansal, A., Sivasubramaniam, A.: Cuanta: quantifying effects of shared on-chip resource interference for consolidated virtual machines. In: Proceedings of the ACM Symposium on Cloud Computing (SOCC) (2011)

  10. Hauswirth, M., Sweeney, P., Diwan, A., Hind, M.: Vertical profiling: understanding the behavior of object-oriented applications. In: Proceedings of OOPSLA, pp. 251–269 (2004)

  11. Hines, M.R., Gordon, A., Silva, M., da Silva, D., Ryu, K.D., Ben-Yehuda, M.: Applications know best: performance-driven memory overcommit with ginkgo. Tech. rep., IBM (2011)

  12. Ïpek, E., McKee, S., Caruana, R., de Supinski, B., Schulz, M.: Efficiently exploring architectural design spaces via predictive modeling. In: Proceedings of ASPLOS, pp. 195–206 (2006)

  13. Jerger, N., Vantrease, D., Lipasti, M.: An evaluation of server consolidation workloads for multi-core designs. In: Proceedings of IISWC, pp. 47–56 (2007)

  14. Knauerhase, R., Brett, P., Hohlt, B., Li, T., Hahn, S.: Using OS observations to improve performance in multicore systems. IEEE MICRO 28, 54–66 (2008)

  15. Koh, Y., Knauerhase, R.C., Brett, P., Bowman, M., Wen, Z., Pu, C.: An analysis of performance interference effects in virtual environments. In: Proceedings of ISPASS, pp. 200–209 (2007)

  16. Lee, B., Collins, J., Wang, H., Brooks, D.: CPR: composable performance regression for scalable multiprocessor models. In: Proceedings of Micro, pp. 270–281. IEEE Computer Society, Washington (2008)

  17. Lipsky, L., Lieu, C., Tehranipour, A., van de Liefvoort, A.: On the asymptotic behavior of time-sharing systems. Commun. ACM 25(10), 707–714 (1982)

  18. Menascé, D., Almeida, V., Dowdy, L.: Capacity Planning and Performance Modeling: From Mainframes to Client-Server Systems. Prentice Hall, New York (1994)

  19. Meng, X., Isci, C., Kephart, J., Zhang, L., Bouillet, E., Pendarakis, D.: Efficient resource provisioning in compute clouds via VM multiplexing. In: Proceedings of ICAC, pp. 11–20 (2010)

  20. Mi, N., Casale, G., Cherkasova, L., Smirni, E.: Burstiness in multi-tier applications: symptoms, causes, and new models. In: Proceedings of Middleware, pp. 265–286 (2008)

  21. Nathuji, R., Kansal, A., Ghaffarkhah, A.: Q-clouds: managing performance interference effects for QoS-aware clouds. In: Proceedings of EuroSys, pp. 237–250 (2010)

  22. Reiser, M., Lavenberg, S.S.: Mean-value analysis of closed multichain queuing networks. J. ACM 27, 313–322 (1980)

  23. Sharifi, A., Srikantaiah, S., Mishra, A., Kandemir, M., Das, C.: METE: meeting end-to-end QoS in multicores through system-wide resource management. In: Proceedings of SIGMETRICS, pp. 13–24 (2011)

  24. Song, X., Chen, H., Chen, R., Wang, Y., Zang, B.: A case for scaling applications to many-core with OS clustering. In: Proceedings of EuroSys, pp. 61–76 (2011)

  25. Tallent, N., Mellor-Crummey, J.: Effective performance measurement and analysis of multithreaded applications. SIGPLAN Not. 44, 229–240 (2009)

  26. Urgaonkar, B., Pacifici, G., Shenoy, P., Spreitzer, M., Tantawi, A.: An analytical model for multi-tier Internet services and its applications. In: Proceedings of SIGMETRICS, pp. 291–302 (2005)

  27. Wood, T., Cherkasova, L., Ozonat, K., Shenoy, P.: Profiling and modeling resource usage of virtualized applications. In: Proceedings of Middleware, pp. 366–387 (2008)

  28. Wood, T., Shenoy, P., Venkataramani, A., Yousif, M.: Sandpiper: black-box and gray-box resource management for virtual machines. Comput. Netw. 53, 2923–2938 (2009)

  29. Zhang, Q., Cherkasova, L., Mi, N., Smirni, E.: A regression-based analytic model for capacity planning of multi-tier applications. Clust. Comput. 11, 197–211 (2008)

  30. Zhuravlev, S., Blagodurov, S., Fedorova, A.: Addressing shared resource contention in multicore processors via scheduling. In: Proceedings of ASPLOS, pp. 129–142 (2010)

Acknowledgements

This work has been supported by IBM and the Swiss National Science Foundation (project 200021 141002). Part of this work was conducted while Danilo Ansaloni was on an internship and Evgenia Smirni was on sabbatical leave at the IBM Zurich Research Laboratory. Evgenia Smirni is partially supported by NSF grants CCF-0937925 and CCF-1218758. A preliminary version [6] of this paper appeared in the 21st International Symposium on High-Performance Parallel and Distributed Computing, HPDC’12, Delft, Netherlands, June 18–22, 2012.

Author information

Corresponding author

Correspondence to Evgenia Smirni.

About this article

Cite this article

Chen, L.Y., Serazzi, G., Ansaloni, D. et al. What to expect when you are consolidating: effective prediction models of application performance on multicores. Cluster Comput 17, 19–37 (2014). https://doi.org/10.1007/s10586-013-0273-8

  • DOI: https://doi.org/10.1007/s10586-013-0273-8
