Abstract
Checkpoint downtime is a pivotal performance indicator when implementing a container checkpoint strategy, and shorter downtime is especially important for systems that provide critical services. To reduce checkpoint downtime, this paper proposes an adaptive pre-replication checkpoint strategy named APR-CKPOT. Over several rounds of pre-replication, the infrequently modified container memory pages are copied preferentially, and each round saves the dirty pages generated during the previous round. The number of pre-replication checkpoints is determined adaptively from the workload of the user's operating system in the container. The strategy coordinates the fault-tolerance service capability with the performance of the container and reduces checkpoint downtime, as verified by experimental results on a Docker container system.
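The round-based idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the APR-CKPOT implementation: the function name `pre_replication_checkpoint`, the per-page write probabilities, and the stopping rule (stop when the dirty set is small or has stopped shrinking) are all hypothetical stand-ins for the workload-driven adaptation the paper describes.

```python
import random
from dataclasses import dataclass


@dataclass
class CheckpointResult:
    rounds: int             # pre-replication rounds performed
    pages_precopied: int    # pages copied while the container kept running
    pages_in_downtime: int  # pages copied while the container was paused


def pre_replication_checkpoint(write_probs, max_rounds=10, stop_threshold=8, seed=0):
    """Simulate an iterative pre-copy checkpoint (hypothetical model).

    write_probs[i] is an assumed probability that page i is written during
    one copy round; it stands in for the container workload.
    """
    rng = random.Random(seed)
    to_copy = set(range(len(write_probs)))  # the first round copies every page
    rounds = 0
    pages_precopied = 0

    while rounds < max_rounds:
        rounds += 1
        pages_precopied += len(to_copy)
        # Pages written while this round was copying must be saved again.
        dirty = {p for p, prob in enumerate(write_probs) if rng.random() < prob}
        converged = len(dirty) <= stop_threshold
        not_shrinking = len(dirty) >= len(to_copy)
        to_copy = dirty
        # Assumed adaptive rule: stop once further rounds no longer pay off.
        if converged or not_shrinking:
            break

    # Final stop-and-copy: the container is paused and the remaining dirty
    # pages are saved; their count is a proxy for checkpoint downtime.
    return CheckpointResult(rounds, pages_precopied, len(to_copy))


if __name__ == "__main__":
    # Mostly cold pages plus a few hot ones (illustrative values only).
    workload = [0.01] * 90 + [0.8] * 10
    print(pre_replication_checkpoint(workload))
```

In this model an idle container finishes after one round with nothing left to copy during the pause, while a container that rewrites every page each round gains nothing from extra rounds and stops immediately. That is the trade-off the adaptive round count is meant to handle.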
Acknowledgements
This work is supported by the Natural Science Foundation of China (No. 61762008), the Natural Science Foundation Project of Guangxi (No. 2017GXNSFAA198141), and the Key R&D Project of Guangxi (No. AB17195014).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, S., Chen, N., Zhang, H., Xue, Y., Huang, R. (2018). A High-Performance Adaptive Strategy of Container Checkpoint Based on Pre-replication. In: Wang, G., Chen, J., Yang, L. (eds) Security, Privacy, and Anonymity in Computation, Communication, and Storage. SpaCCS 2018. Lecture Notes in Computer Science, vol 11342. Springer, Cham. https://doi.org/10.1007/978-3-030-05345-1_20
DOI: https://doi.org/10.1007/978-3-030-05345-1_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05344-4
Online ISBN: 978-3-030-05345-1
eBook Packages: Computer Science, Computer Science (R0)