Abstract
Docker, a lightweight virtualization technology characterized by resource isolation, rapid deployment, and low cost, is widely used to build cloud services. Docker-based containers have become an important foundation of core cloud businesses. To manage large-scale cloud clusters and ensure the quality of the cloud services delivered to consumers, a monitoring mechanism for container-based clouds is indispensable. In this paper, we design and implement PLMSys, a cloud monitoring system based on cluster performance and container logs. It provides the following functions: i) Multi-dimensional resource monitoring. PLMSys monitors the running states of the cluster hosts and containers, including the utilization of CPU, memory, disk, and other resources. ii) Container log collection. PLMSys centrally collects the logs generated by all containers in the cluster. iii) Rule-based exception alerts. PLMSys lets users define abnormal states of hosts and containers by creating rules, and provides multiple alerting methods. iv) Workload analysis and prediction. PLMSys extracts descriptive statistics from the cluster workloads and uses time series models to predict future workloads. v) Monitoring data visualization. The system uses rich visual charts to show the running states of cluster hosts and containers. By using PLMSys, users can better manage cluster hosts and containers.
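As an illustration of the rule-based exception alerts described in the abstract, the following is a minimal sketch of how user-defined threshold rules might be evaluated against one sample of resource metrics. It is not PLMSys's actual implementation: the `Rule` class, the `evaluate` function, and the metric names `cpu_percent` and `disk_free_gb` are all hypothetical and chosen only for this example.

```python
import operator
from dataclasses import dataclass
from typing import Dict, List

# Comparison operators a rule may use (an assumption for this sketch).
_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass(frozen=True)
class Rule:
    """A hypothetical user-defined rule: metric, comparison, threshold."""
    name: str
    metric: str
    op: str
    threshold: float


def evaluate(rules: List[Rule], sample: Dict[str, float]) -> List[str]:
    """Return the names of the rules whose condition holds for one sample."""
    fired = []
    for rule in rules:
        value = sample.get(rule.metric)
        if value is None:
            continue  # this metric was not reported in the sample
        if _OPS[rule.op](value, rule.threshold):
            fired.append(rule.name)
    return fired


# Hypothetical rules and a hypothetical metric sample from one container.
rules = [
    Rule("high_cpu", "cpu_percent", ">", 90.0),
    Rule("low_disk", "disk_free_gb", "<", 5.0),
]
sample = {"cpu_percent": 95.2, "disk_free_gb": 40.0}

fired = evaluate(rules, sample)
print(fired)
```

In a real deployment, each fired rule name would be handed to one of the alerting methods; how PLMSys dispatches alerts is not specified in the abstract, so that step is left out here.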
Acknowledgment
This work is supported by the Key-Area Research and Development Program of Guangdong Province (No. 2020B010164003), the National Natural Science Foundation of China (No. 61702492), the Science and Technology Development Fund of Macao S.A.R (FDCT) under number 0015/2019/AKP, the Shenzhen Basic Research Program (No. JCYJ20170818153016513), the Shenzhen Discipline Construction Project for Urban Computing and Data Intelligence, and the Youth Innovation Promotion Association CAS.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Sun, Y., Ye, K., Xu, CZ. (2020). PLMSys: A Cloud Monitoring System Based on Cluster Performance and Container Logs. In: Zhang, Q., Wang, Y., Zhang, LJ. (eds) Cloud Computing – CLOUD 2020. CLOUD 2020. Lecture Notes in Computer Science, vol 12403. Springer, Cham. https://doi.org/10.1007/978-3-030-59635-4_8
DOI: https://doi.org/10.1007/978-3-030-59635-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59634-7
Online ISBN: 978-3-030-59635-4
eBook Packages: Computer Science, Computer Science (R0)