Performance of User Data Collections Uploads to HPCaaS Infrastructure

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11703)

Abstract

When offering HPC as a Service (HPCaaS) middleware to remote users so that they can use an HPC infrastructure, we face the problem of uploading large data collections for processing. On the one hand, because users also perform calculations on their local infrastructure, they typically rely on the most common 1 Gb/s connectivity, and the institution’s connectivity to the Internet does not influence transfer rates. On the other hand, the institution offering the HPC as a Service middleware may be distant both geographically and network-wise. Furthermore, the amount of data to be transferred ranges from hundreds of gigabytes to several terabytes. In this paper, we study the common protocols used for secure data transfers and evaluate the results on the IT4Innovations infrastructure. Changes to uploads to the HEAppE middleware developed at our institution are also discussed.
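To give a sense of scale for the data volumes and link speed mentioned in the abstract, the following minimal sketch computes an idealized lower bound on upload time over a 1 Gb/s link. The function name and the example sizes (500 GB and 2 TB) are illustrative assumptions, not figures from the paper, and the calculation ignores protocol overhead, encryption cost, and latency, all of which only lengthen real transfers.

```python
def ideal_transfer_hours(size_bytes: float, link_bits_per_s: float = 1e9) -> float:
    """Lower bound on transfer time in hours, assuming the link is fully
    utilized and there is no protocol or encryption overhead (hypothetical helper)."""
    seconds = size_bytes * 8 / link_bits_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Illustrative sizes only (decimal units: 1 GB = 1e9 bytes).
    for label, size in [("500 GB", 500e9), ("2 TB", 2e12)]:
        print(f"{label}: at least {ideal_transfer_hours(size):.2f} h at 1 Gb/s")
```

Even under these ideal assumptions, collections of this size occupy the link for hours, so any loss of throughput due to distance or protocol choice translates directly into longer uploads.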

Notes

  1. http://www.slac.stanford.edu/~abh/bbcp/

  2. www.jcraft.com/jsch/

Acknowledgement

This work was supported by the European Regional Development Fund in the IT4Innovations national supercomputing center – path to exascale project, project number CZ.02.1.01/0.0/0.0/16_013/0001791 within the Operational Programme Research, Development and Education.

Author information

Corresponding authors

Correspondence to Pavel Moravec or Jan Kožusznik.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Moravec, P., Kožusznik, J., Krumnikl, M., Klímová, J. (2019). Performance of User Data Collections Uploads to HPCaaS Infrastructure. In: Saeed, K., Chaki, R., Janev, V. (eds) Computer Information Systems and Industrial Management. CISIM 2019. Lecture Notes in Computer Science, vol 11703. Springer, Cham. https://doi.org/10.1007/978-3-030-28957-7_30

  • DOI: https://doi.org/10.1007/978-3-030-28957-7_30

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-28956-0

  • Online ISBN: 978-3-030-28957-7

  • eBook Packages: Computer Science, Computer Science (R0)
