Scientific Data Management and Application in High Energy Physics

  • Conference paper
Big Scientific Data Management (BigSDM 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11473)

Abstract

High energy physics experiments already produce data at the petabyte (PB) to exabyte (EB) scale, and ambitious experimental programs are planned for the coming decades. The efficiency of data-intensive research depends closely on how fast data can be accessed and how many computational resources can be used. Changes in computing technology and large increases in data volume require new computing models. This paper gives an overview of scientific data management technologies and applications in high energy physics. It first examines the current data management framework and workflow, which cover data acquisition, data transfer, data storage, data processing, data sharing and data preservation. It then introduces ongoing research and development on data organization, management and access. Finally, it introduces EventDB, an event-based big scientific data management system. A test on more than ten billion physics events shows that queries are much faster than with a traditional file-based data management system.

Acknowledgements

This work was supported by the National Key Research Program of China “Scientific Big Data Management System” (No. 2016YFB1000605) and the National Natural Science Foundation of China (Nos. 11675201 and 11575223).

Author information

Correspondence to Yaodong Cheng.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Chen, G., Cheng, Y. (2019). Scientific Data Management and Application in High Energy Physics. In: Li, J., Meng, X., Zhang, Y., Cui, W., Du, Z. (eds) Big Scientific Data Management. BigSDM 2018. Lecture Notes in Computer Science, vol 11473. Springer, Cham. https://doi.org/10.1007/978-3-030-28061-1_11

  • DOI: https://doi.org/10.1007/978-3-030-28061-1_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-28060-4

  • Online ISBN: 978-3-030-28061-1

  • eBook Packages: Computer Science, Computer Science (R0)