Efficient data synchronization method on integrated computing environment

The Journal of Supercomputing

Abstract

To execute scientific applications and simulations of enormous scale, the computing paradigm is evolving toward cluster computing and cloud computing, which can exploit the large number of available computing resources. To maximize the utilization of these resources, a company or research center needs a scheduler engine and an associated data space to construct a cluster computing environment. However, if a data space is shared, problems can arise concerning node security, network traffic imbalance between nodes, and data protection. To solve these issues, a manager is designed that synchronizes the shared data space for the nodes that constitute a cluster computing environment. The synchronization manager shares data in two ways. First, in the full synchronization group, nodes in the cluster environment mount a specific directory of the master node via NFS; this is used for data that can be referenced globally. Second, the partial synchronization group delivers data to assigned workers through rsync; this is used to share data locally so that it stays isolated. The partial synchronization group is superior to the full synchronization group in security and efficiency because data are shared separately. By applying the appropriate data-sharing method, the designed manager efficiently mediates data sharing as intended.
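The two sharing modes described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `SyncGroup` class, the `build_commands` function, the host names, and the paths are all hypothetical, and the sketch only builds the command lines (a standard `mount -t nfs` for the full group, `rsync -a` for the partial group) without running them.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SyncGroup:
    """Hypothetical description of one synchronization group."""
    mode: str                 # "full" (NFS mount) or "partial" (rsync delivery)
    master: str               # host name of the master node (assumed)
    path: str                 # directory on the master node to share
    workers: Tuple[str, ...]  # worker host names that belong to the group


def build_commands(group: SyncGroup) -> Dict[str, List[str]]:
    """Return, per worker, the command that would share the directory.

    Full group: the command is meant to run on the worker, which mounts
    the master's directory over NFS so the data is globally referenced.
    Partial group: the command is meant to run on the master, which
    copies the directory to one assigned worker with rsync.
    """
    if group.mode == "full":
        source = f"{group.master}:{group.path}"
        return {w: ["mount", "-t", "nfs", source, group.path]
                for w in group.workers}
    if group.mode == "partial":
        # A trailing slash makes rsync copy the directory contents.
        directory = group.path.rstrip("/") + "/"
        return {w: ["rsync", "-a", directory, f"{w}:{directory}"]
                for w in group.workers}
    raise ValueError(f"unknown synchronization mode: {group.mode!r}")


if __name__ == "__main__":
    full = SyncGroup("full", "master", "/shared", ("worker1", "worker2"))
    partial = SyncGroup("partial", "master", "/jobs/42", ("worker2",))
    for group in (full, partial):
        for worker, command in build_commands(group).items():
            print(group.mode, worker, " ".join(command))
```

In this sketch, isolation in the partial group comes from the worker list: only hosts named in `workers` receive a copy, whereas every member of a full group sees the same mounted directory.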


Acknowledgements

This work was supported by the Korea Institute of Science and Technology Information (KISTI) and by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future Planning (NRF-2016R1C1B1008330).

Author information

Corresponding author

Correspondence to Jaesung Kim.

About this article

Cite this article

Jung, D., Lee, D., Kim, M. et al. Efficient data synchronization method on integrated computing environment. J Supercomput 75, 4252–4266 (2019). https://doi.org/10.1007/s11227-018-2445-z

  • DOI: https://doi.org/10.1007/s11227-018-2445-z
