A High-Performance Adaptive Strategy of Container Checkpoint Based on Pre-replication

Zhang, Shuo; Chen, Ningjiang; Zhang, Hanlin; Xue, Yijun; Huang, Ruwei

doi:10.1007/978-3-030-05345-1_20

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11342)

Included in the following conference series:

International Conference on Security, Privacy and Anonymity in Computation, Communication and Storage

Abstract

Checkpoint downtime is a pivotal performance indicator for any container checkpoint strategy, and short downtime is especially important for systems that provide critical services. To reduce checkpoint downtime, this paper proposes an adaptive pre-replication checkpoint strategy named APR-CKPOT. Over several rounds of pre-replication, infrequently modified container memory pages are copied first, and each round saves the dirty pages generated during the previous round. The number of pre-replication checkpoints is determined adaptively from the workload of the user's operating system in the container. Experimental results on the Docker container system show that the strategy balances the container's fault-tolerance service capability against its performance and reduces checkpoint downtime.
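The round structure described in the abstract can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the APR-CKPOT implementation: the function name `pre_replicate`, the `dirty_trace` input, and the stopping rule (stop once the dirty set is small enough or has stopped shrinking) are hypothetical stand-ins for the paper's workload-driven adaptation, which the abstract does not specify.

```python
def pre_replicate(all_pages, dirty_trace, stop_threshold, max_rounds):
    """Simulate iterative pre-replication of container memory pages.

    all_pages      -- set of page ids belonging to the container
    dirty_trace    -- list of sets; dirty_trace[i] holds the pages written
                      while round i + 1 was copying (assumed to be known)
    stop_threshold -- stop once this few pages remain dirty (assumption)
    max_rounds     -- upper bound on pre-replication rounds (assumption)

    Returns (rounds, final_pages): the page sets copied while the container
    keeps running, and the set that must be copied during downtime.
    """
    pending = set(all_pages)  # the first round copies every page
    rounds = []
    for dirtied in dirty_trace[:max_rounds]:
        rounds.append(pending)  # copy the pending pages in this round
        # Pages written during this round must be copied again later.
        previous, pending = pending, set(dirtied) & set(all_pages)
        # Hypothetical adaptive rule: stop when the dirty set is small
        # enough, or when it no longer shrinks (write-heavy workload).
        if len(pending) <= stop_threshold or len(pending) >= len(previous):
            break
    return rounds, pending


if __name__ == "__main__":
    pages = set(range(8))
    trace = [{0, 1, 2, 3}, {0, 1}, {0}]  # made-up write pattern
    rounds, final_pages = pre_replicate(pages, trace, stop_threshold=1, max_rounds=5)
    print([len(r) for r in rounds])  # [8, 4, 2]
    print(sorted(final_pages))       # [0]
```

In this toy run only page 0, the most frequently written page, is left for the downtime window, while rarely written pages are copied once in the first round. That mirrors the stated goal of copying infrequently modified pages first.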

Acknowledgements

This work is supported by the Natural Science Foundation of China (No. 61762008), the Natural Science Foundation Project of Guangxi (No. 2017GXNSFAA198141), and the Key R&D Project of Guangxi (No. AB17195014).

Author information

Authors and Affiliations

School of Computer and Electronic Information, Guangxi University, Nanning, 530004, China
Shuo Zhang, Ningjiang Chen, Hanlin Zhang, Yijun Xue & Ruwei Huang

Corresponding author

Correspondence to Ningjiang Chen.

Editor information

Editors and Affiliations

Guangzhou University, Guangzhou, China
Guojun Wang
Swinburne University of Technology, Melbourne, VIC, Australia
Jinjun Chen
St. Francis Xavier University, Antigonish, NS, Canada
Laurence T. Yang

About this paper

Cite this paper

Zhang, S., Chen, N., Zhang, H., Xue, Y., Huang, R. (2018). A High-Performance Adaptive Strategy of Container Checkpoint Based on Pre-replication. In: Wang, G., Chen, J., Yang, L. (eds) Security, Privacy, and Anonymity in Computation, Communication, and Storage. SpaCCS 2018. Lecture Notes in Computer Science, vol 11342. Springer, Cham. https://doi.org/10.1007/978-3-030-05345-1_20

DOI: https://doi.org/10.1007/978-3-030-05345-1_20
Published: 07 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05344-4
Online ISBN: 978-3-030-05345-1
eBook Packages: Computer Science, Computer Science (R0)
