DOI: 10.1145/2462326.2462337
research-article

Automatic virtual machine clustering based on Bhattacharyya distance for multi-cloud systems

Published: 22 April 2013

Abstract

The size and complexity of modern data centers pose scalability issues for the resource monitoring systems that support management operations such as server consolidation. In the move from cloud to multi-cloud systems, these issues are exacerbated by the need to manage geographically distributed data centers and to exchange monitored data across them. While existing solutions typically treat every Virtual Machine (VM) as a black box with independent characteristics, we claim that scalability issues in multi-cloud systems can be addressed by clustering together VMs that show similar behavior in terms of resource usage. In this paper, we propose an automated methodology to cluster VMs based on the usage of multiple resources, assuming no knowledge of the services executed on them. The methodology uses the Bhattacharyya distance to measure the similarity of the probability distributions of VM resource usage, and automatically selects the most relevant resources to consider in the clustering process. The methodology is evaluated through a set of experiments with data from a cloud provider. We show that our proposal achieves high and stable performance in terms of automatic VM clustering. Moreover, we estimate the reduction in the amount of data collected to support system management in the considered scenario, showing how the proposed methodology may reduce the monitoring requirements in multi-cloud systems.
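
The abstract does not give the formula, so the following is only a minimal sketch of the standard Bhattacharyya distance for discrete distributions: with Bhattacharyya coefficient BC = sum_i sqrt(p_i * q_i), the distance is D_B = -ln(BC). It is not the authors' implementation. The function names `histogram` and `bhattacharyya_distance`, the equal-width binning, the bin count, and the CPU samples are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def histogram(samples: Sequence[float], lo: float, hi: float, bins: int) -> List[float]:
    """Normalized equal-width histogram of samples over [lo, hi].

    Hypothetical helper: the binning strategy is an assumption,
    not something the abstract specifies.
    """
    if bins <= 0 or hi <= lo:
        raise ValueError("need bins > 0 and hi > lo")
    if not samples:
        raise ValueError("need at least one sample")
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        # Clamp so x == hi lands in the last bin and out-of-range values are kept.
        idx = min(max(int((x - lo) / width), 0), bins - 1)
        counts[idx] += 1
    total = len(samples)
    return [c / total for c in counts]


def bhattacharyya_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Bhattacharyya distance between two discrete distributions on the same bins."""
    if len(p) != len(q):
        raise ValueError("distributions must use the same bins")
    # Bhattacharyya coefficient: overlap between the two distributions.
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    if bc <= 0.0:
        return math.inf  # no overlap at all
    # Guard against rounding pushing the coefficient slightly above 1.
    return abs(math.log(min(bc, 1.0)))


# Illustrative CPU utilization samples (fractions of 1.0) for two hypothetical VMs.
vm_a = [0.10, 0.12, 0.15, 0.11, 0.60, 0.14]
vm_b = [0.55, 0.62, 0.58, 0.13, 0.65, 0.59]

p = histogram(vm_a, 0.0, 1.0, bins=5)
q = histogram(vm_b, 0.0, 1.0, bins=5)
print(bhattacharyya_distance(p, q))
```

A small distance means the two usage distributions overlap heavily, which is the sense in which two VMs would be treated as behaving similarly for one resource. Computing this for every VM pair yields a distance matrix that a clustering algorithm could consume.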



Published In

MultiCloud '13: Proceedings of the 2013 international workshop on Multi-cloud applications and federated clouds
April 2013
76 pages
ISBN:9781450320504
DOI:10.1145/2462326
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Bhattacharyya distance
  2. cloud computing
  3. spectral clustering
  4. virtual machine clustering

Qualifiers

  • Research-article

Conference

ICPE'13
Acceptance Rates

MultiCloud '13 Paper Acceptance Rate 9 of 18 submissions, 50%;
Overall Acceptance Rate 9 of 18 submissions, 50%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 3
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 20 Feb 2025

Citations

Cited By

  • (2022) Design of IT Infrastructure Multicloud Management Platform Based on Hybrid Cloud. Wireless Communications and Mobile Computing, 2022, pp. 1-12. DOI: 10.1155/2022/9227948. Online publication date: 27-Jul-2022
  • (2022) A Complete Review on the Application of Statistical Methods for Evaluating Internet Traffic Usage. IEEE Access, 10, pp. 128433-128455. DOI: 10.1109/ACCESS.2022.3227073. Online publication date: 2022
  • (2020) Multiple Samples Clustering with Second-moment Information in Stock Clustering. Proceedings of the 2020 3rd International Conference on Artificial Intelligence and Pattern Recognition, pp. 215-222. DOI: 10.1145/3430199.3430223. Online publication date: 26-Jun-2020
  • (2019) AGATE: Adaptive Gray Area-Based TEchnique to Cluster Virtual Machines with Similar Behavior. IEEE Transactions on Cloud Computing, 7:3, pp. 650-663. DOI: 10.1109/TCC.2017.2664831. Online publication date: 1-Jul-2019
  • (2017) Clustering-Based IaaS Cloud Monitoring. 2017 IEEE 10th International Conference on Cloud Computing (CLOUD), pp. 672-679. DOI: 10.1109/CLOUD.2017.90. Online publication date: Jun-2017
  • (2017) Scalable and automatic virtual machines placement based on behavioral similarities. Computing, 99:6, pp. 575-595. DOI: 10.1007/s00607-016-0498-5. Online publication date: 1-Jun-2017
  • (2016) Projection Pursuit algorithm for virtual machine clustering for monitoring purposes. 2016 International Conference on Industrial Informatics and Computer Systems (CIICS), pp. 1-6. DOI: 10.1109/ICCSII.2016.7462416. Online publication date: Mar-2016
  • (2016) An Analysis of Public Clouds Elasticity in the Execution of Scientific Applications. Journal of Grid Computing, 14:2, pp. 193-216. DOI: 10.1007/s10723-016-9361-3. Online publication date: 1-Jun-2016
  • (2015) Anatomy of Cloud Monitoring and Metering. Proceedings of the 6th Asia-Pacific Workshop on Systems, pp. 1-7. DOI: 10.1145/2797022.2797039. Online publication date: 27-Jul-2015
  • (2015) Exploiting Classes of Virtual Machines for Scalable IaaS Cloud Management. Proceedings of the 2015 IEEE 4th Symposium on Network Cloud Computing and Applications, pp. 15-22. DOI: 10.1109/NCCA.2015.13. Online publication date: 11-Jun-2015
