Authors:
Matteo Stefanini; Riccardo Lancellotti; Lorenzo Baraldi and Simone Calderara
Affiliation:
Department of Engineering "Enzo Ferrari", University of Modena and Reggio Emilia, Modena, Italy
Keyword(s):
Cloud Computing, VMs Classification, Deep Learning.
Abstract:
Cloud computing data centers are growing in size and complexity to the point where monitoring and managing the infrastructure becomes a challenge due to scalability issues. A possible approach to cope with the size of such data centers is to identify VMs exhibiting similar behavior. Existing literature has demonstrated that clustering together VMs that show similar behavior may improve the scalability of both monitoring and management of a data center. However, available clustering techniques suffer from a trade-off between the accuracy of the clustering and the time needed to achieve it. The inability to obtain an accurate clustering in a short time hinders the application of these solutions, especially in public cloud scenarios where on-demand VMs are instantiated and run for a short time span. In this paper we propose a different approach where, instead of unsupervised clustering, we rely on classifiers based on deep learning techniques to assign newly deployed VMs to a cluster of already-known VMs. The two proposed classifiers, namely DeepConv and DeepFFT, use a convolutional neural network and (in the latter model) exploit the Fast Fourier Transform to classify the VMs. Our proposal is validated using a set of traces describing the behavior of VMs from a real cloud data center. The experiments compare our proposal with state-of-the-art solutions available in the literature, such as the AGATE technique and PCA-based clustering, demonstrating that our proposal can achieve very high accuracy (compared to the best-performing alternatives) without the need to introduce the notion of a gray area to account for not-yet-assigned VMs, as in AGATE. Furthermore, we show that our solution is significantly faster than the alternatives, as it can produce a perfect classification even with just a few samples of data, such as 4 observations (corresponding to 20 minutes of data), making our proposal viable also for classifying on-demand VMs that are characterized by a short life span.