Abstract:
Web services increasingly rely on public cloud platforms. The virtualization technologies employed by public clouds can, however, trigger contention between virtual machines (VMs) for shared physical machine resources, thereby leading to performance problems for Web services. Past studies have exploited physical-machine-level performance metrics such as clock cycles per instruction to detect such platform-induced performance interference. Unfortunately, public cloud customers do not have access to such metrics. They can typically access only VM-level metrics and application-level metrics such as transaction response times, and such metrics alone are often not useful for detecting inter-VM contention. This makes it difficult for Web service operators to detect and mitigate platform-induced performance interference inside the cloud. We propose a machine-learning-based interference detection technique to address this problem. The technique applies collaborative filtering to predict whether a given transaction being processed by a Web service is suffering from interference. The results can then be used by a management controller to trigger remedial actions, e.g., reporting problems to the system manager or switching cloud providers. Results using a realistic Web benchmark show that the approach is effective. The most effective variant of our approach detects about 96% of performance interference events with almost no false alarms. Furthermore, we show that a load redistribution technique that exploits the information from our detection technique mitigates the interference more effectively than interference-agnostic techniques.
Published in: IEEE Transactions on Network and Service Management (Volume: 12, Issue: 3, September 2015)
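
The abstract does not spell out how collaborative filtering is applied, so the following is only a minimal sketch of one possible reading, not the authors' method. It assumes a neighborhood-based scheme: a baseline of interference-free observations, each pairing VM-level metrics with a transaction response time, is used to predict the response time a new transaction should have, and the transaction is flagged when the observed time exceeds that prediction by a margin. The function names, the choice of features (CPU utilization, request rate), the inverse-distance similarity, and the `tolerance` threshold are all hypothetical.

```python
import math


def similarity(u, v):
    """Inverse-distance similarity between two VM-level metric vectors (assumed measure)."""
    return 1.0 / (1.0 + math.dist(u, v))


def predict_response_time(metrics, baseline, k=2):
    """Predict the interference-free response time for `metrics`.

    `baseline` is a list of (metric_vector, response_time) pairs collected
    while no interference was present (a hypothetical training set). The
    prediction is the similarity-weighted mean of the k nearest neighbors.
    """
    scored = sorted(
        ((similarity(metrics, m), rt) for m, rt in baseline),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    return sum(sim * rt for sim, rt in scored) / total


def is_interfered(metrics, observed_rt, baseline, k=2, tolerance=1.5):
    """Flag a transaction whose observed response time is well above the prediction.

    `tolerance` is an assumed multiplicative margin, not a value from the paper.
    """
    return observed_rt > tolerance * predict_response_time(metrics, baseline, k)


# Hypothetical baseline: (CPU utilization, requests per second) -> response time in ms.
baseline = [
    ((0.2, 50.0), 100.0),
    ((0.4, 100.0), 150.0),
    ((0.8, 200.0), 300.0),
]

slow = is_interfered((0.4, 100.0), 400.0, baseline)    # far above the predicted time
normal = is_interfered((0.4, 100.0), 160.0, baseline)  # close to the predicted time
```

In this sketch the features are not normalized, so the request rate dominates the distance; a practical version would likely scale each metric first. The flag returned by `is_interfered` is the kind of signal a management controller could consume to trigger remedial actions such as load redistribution.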