Efficient data synchronization method on integrated computing environment

Jung, Daeyong; Lee, Daewon; Kim, Myungil; Kim, Jaesung

doi:10.1007/s11227-018-2445-z

Published: 28 May 2018

The Journal of Supercomputing, Volume 75, pages 4252–4266 (2019)

Daeyong Jung¹ (ORCID: orcid.org/0000-0002-9110-3093),
Daewon Lee²,
Myungil Kim¹ &
Jaesung Kim³

Abstract

To execute scientific applications and simulations of enormous scale, the computing paradigm is evolving toward cluster computing and cloud computing, which can exploit the large number of available computing resources. To maximize the utilization of these resources, a company or research center needs a scheduler engine and an associated data space to construct a cluster computing environment. However, if a data space is shared, problems related to node security, network traffic imbalance between nodes, and data protection can arise. To solve these issues, a manager is designed that synchronizes the shared data space for the nodes that constitute a cluster computing environment. The synchronization manager shares data in two ways. First, in the cluster environment, the full synchronization group can mount a specific directory space of the master node via NFS; this is used for data that can be referenced globally. Second, the partial synchronization group delivers data to assigned workers through rsync; this can be used to share data locally, in isolation. The partial synchronization group is superior to the full synchronization group in security and efficiency because data are shared separately. By applying the appropriate data-sharing method, the designed manager efficiently mediates data sharing as intended.
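
The two sharing modes described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which is not shown in this preview. The function names, host names, and paths are hypothetical. The sketch only builds the command lines that a manager might issue (an NFS mount for the full synchronization group, an rsync push per assigned worker for the partial synchronization group) and does not execute them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SyncPlan:
    """Commands a (hypothetical) synchronization manager would issue."""
    mode: str                  # "full" (NFS mount) or "partial" (rsync push)
    commands: List[List[str]]  # one argv list per command


def plan_full_sync(master: str, export_dir: str, mount_point: str) -> SyncPlan:
    """Full synchronization: each worker mounts the master's directory via NFS.

    The returned command is meant to be run on every worker in the group,
    so all of them see the same globally referenced data.
    """
    cmd = ["mount", "-t", "nfs", f"{master}:{export_dir}", mount_point]
    return SyncPlan(mode="full", commands=[cmd])


def plan_partial_sync(src_dir: str, workers: List[str], dest_dir: str) -> SyncPlan:
    """Partial synchronization: push data only to the assigned workers via rsync.

    A trailing slash on the source makes rsync copy the directory's contents
    rather than the directory itself; --delete removes files on the worker
    that no longer exist in the source.
    """
    src = src_dir.rstrip("/") + "/"
    cmds = [["rsync", "-az", "--delete", src, f"{w}:{dest_dir}"] for w in workers]
    return SyncPlan(mode="partial", commands=cmds)


if __name__ == "__main__":
    # Hypothetical hosts and paths, for illustration only.
    full = plan_full_sync("master", "/export/shared", "/mnt/shared")
    partial = plan_partial_sync("/data/job42", ["worker1", "worker2"], "/scratch/job42/")
    for plan in (full, partial):
        for argv in plan.commands:
            print(plan.mode, ":", " ".join(argv))
```

In this sketch the partial plan contains one command per assigned worker, so data reach only those nodes, which mirrors the isolation property the abstract attributes to the partial synchronization group.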

Acknowledgements

This work was supported by the Korea Institute of Science and Technology Information (KISTI) and by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future Planning (NRF-2016R1C1B1008330).

Author information

Authors and Affiliations

Department of Supercomputing M&S Technology Development, Korea Institute of Science and Technology Information (KISTI), Daejeon, Korea
Daeyong Jung & Myungil Kim
Department of Computer Engineering, Seokyeong University, Seoul, Korea
Daewon Lee
Supercomputing Modeling and Simulation Center, KISTI, Daejeon, Korea
Jaesung Kim

Corresponding author

Correspondence to Jaesung Kim.

About this article

Cite this article

Jung, D., Lee, D., Kim, M. et al. Efficient data synchronization method on integrated computing environment. J Supercomput 75, 4252–4266 (2019). https://doi.org/10.1007/s11227-018-2445-z

Published: 28 May 2018
Issue Date: 01 August 2019
DOI: https://doi.org/10.1007/s11227-018-2445-z
