Modelling Network Throughput of Large-Scale Scientific Data Transfers

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1444)

Abstract

Rucio is an open-source software framework that provides scientific collaborations with the functionality to organize, manage, and access their data at scale. The data can be distributed across heterogeneous data centers at widely distributed locations [1]. Since its commissioning in 2014, Rucio has become the de facto standard for scientific data management, even outside the CERN community [6]. The wealth of data that Rucio gathers about transfers presents a unique opportunity to better understand the complex mechanisms involved in file transfers across the Worldwide LHC Computing Grid (WLCG). This work studies a recently published dataset [4] to reconstruct the lifetime of transfers, revealing information that can be used to predict the Time To Complete (TTC) of transfers across the WLCG.

Supported by CONICET - IFLP - CERN - LINTI.

References

  1. Barisits, M., et al.: Rucio - scientific data management, February 2019. arXiv e-prints. arXiv:1902.09857. https://arxiv.org/abs/1902.09857

  2. Begy, V., Barisits, M., Lassnig, M., Schikuta, E.: Forecasting network throughput of remote data access in computing grids. J. Comput. Sci. 44, 101158 (2020). https://doi.org/10.1016/j.jocs.2020.101158. http://www.sciencedirect.com/science/article/pii/S1877750320304592

  3. Bogado, J., Monticelli, F., Diaz, J., Lassnig, M., Vukotic, I.: Modelling high-energy physics data transfers. In: 2018 IEEE 14th International Conference on e-Science (e-Science), pp. 334–335 (2018). https://doi.org/10.1109/eScience.2018.00081

  4. Bogado, J., Lassnig, M., Monticelli, F., Díaz, J., Beermann, T.: ATLAS Rucio transfers dataset. Zenodo, December 2020. https://doi.org/10.5281/zenodo.4320937

  5. Kiryanov, A., Álvarez Ayllón, A., Salichos, M., Keeble, O.: FTS3 - a file transfer service for grids, HPCs and clouds. In: International Symposium on Grids and Clouds 2015, p. 028, March 2016. https://doi.org/10.22323/1.239.0028

  6. Lassnig, M., et al.: Rucio beyond ATLAS: experiences from Belle II, CMS, DUNE, EISCAT3D, LIGO/VIRGO, SKA, XENON. Technical report, ATL-SOFT-PROC-2020-017, CERN, Geneva, March 2020. https://doi.org/10.1051/epjconf/202024511006. https://cds.cern.ch/record/2711755

  7. Lassnig, M., Toler, W., Vamosi, R., Bogado, J.: Machine learning of network metrics in ATLAS distributed data management. J. Phys. Conf. Ser. 898, 062009 (2017). https://doi.org/10.1088/1742-6596/898/6/062009

  8. Zheng, A.: Evaluating Machine Learning Models. O’Reilly Media, Inc., Sebastopol (2015)

Author information

Corresponding author

Correspondence to Joaquin Bogado.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Bogado, J., Lassnig, M., Monticelli, F., Díaz, J. (2021). Modelling Network Throughput of Large-Scale Scientific Data Transfers. In: Naiouf, M., Rucci, E., Chichizola, F., De Giusti, L. (eds) Cloud Computing, Big Data & Emerging Topics. JCC-BD&ET 2021. Communications in Computer and Information Science, vol 1444. Springer, Cham. https://doi.org/10.1007/978-3-030-84825-5_10

  • DOI: https://doi.org/10.1007/978-3-030-84825-5_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-84824-8

  • Online ISBN: 978-3-030-84825-5

  • eBook Packages: Computer Science, Computer Science (R0)
