Abstract
When offering HPC as a Service middleware to remote users so that they can use an HPC infrastructure, we face the problem of uploading large data collections for processing. On the one hand, because users also perform calculations on local infrastructure, they typically rely on the most common 1 Gb/s connectivity, and the institution's connectivity to the Internet does not influence transfer rates. On the other hand, the institution offering the HPC as a Service middleware may be distant both geographically and network-wise. Furthermore, the amount of data to be transferred ranges from hundreds of gigabytes to several terabytes. In this paper, we study the common protocols used for secure data transfers and evaluate the results on the IT4Innovations infrastructure. Changes to uploads to the HEAppE middleware, which is developed at our institution, are also discussed.
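To give a sense of scale for the data volumes mentioned above, the following minimal sketch computes the ideal lower bound on upload time over a 1 Gb/s link. The two sample sizes (500 GB and 2 TB) are illustrative assumptions chosen from within the stated range, decimal units are assumed, and protocol overhead, encryption cost, and latency are ignored, so real transfers would take longer.

```python
def ideal_transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Lower bound on transfer time: payload bits divided by the line rate."""
    return size_bytes * 8 / link_bits_per_second


GIGABIT_PER_SECOND = 1e9  # 1 Gb/s line rate (decimal units, an assumption)
GB = 1e9                  # decimal gigabyte
TB = 1e12                 # decimal terabyte

# Illustrative collection sizes, not measurements from the paper.
for label, size in [("500 GB", 500 * GB), ("2 TB", 2 * TB)]:
    hours = ideal_transfer_seconds(size, GIGABIT_PER_SECOND) / 3600
    print(f"{label}: at least {hours:.1f} h at 1 Gb/s")
```

Even under these idealized assumptions a single collection occupies the link for hours, which is why the achievable throughput of the transfer protocol matters.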
Acknowledgement
This work was supported by the European Regional Development Fund in the IT4Innovations national supercomputing center – path to exascale project, project number CZ.02.1.01/0.0/0.0/16_013/0001791 within the Operational Programme Research, Development and Education.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Moravec, P., Kožusznik, J., Krumnikl, M., Klímová, J. (2019). Performance of User Data Collections Uploads to HPCaaS Infrastructure. In: Saeed, K., Chaki, R., Janev, V. (eds) Computer Information Systems and Industrial Management. CISIM 2019. Lecture Notes in Computer Science, vol 11703. Springer, Cham. https://doi.org/10.1007/978-3-030-28957-7_30
DOI: https://doi.org/10.1007/978-3-030-28957-7_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28956-0
Online ISBN: 978-3-030-28957-7
eBook Packages: Computer Science, Computer Science (R0)