Converging HPC, Big Data and Cloud Technologies for Precision Agriculture Data Analytics on Supercomputers

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12321)

Abstract

The convergence of HPC and Big Data, along with the influence of Cloud, is playing an important role in the democratization of HPC. The growing demand of Data Analytics for computational power has opened new fields of interest for HPC facilities, but it has also raised new issues such as interoperability with Cloud and ease of use. Besides typical HPC applications, these infrastructures are now asked to handle more complex workflows that combine Machine Learning, Big Data and HPC. This brings challenges at the resource management, scheduling and environment deployment layers. Hence, enhancements are needed to allow multiple frameworks to be deployed under common system management while providing the right abstraction to facilitate adoption.

This paper presents the architecture adopted for the parallel and distributed execution management software stack of the EU-funded Cybele project, which is deployed on production HPC centers to execute hybrid data analytics workflows in the context of precision agriculture and livestock farming applications. The design is based on: Kubernetes as a higher-level orchestrator of Big Data components and hybrid workflows, and as a common interface to submit HPC or Big Data jobs; Slurm or Torque for HPC resource management; and the Singularity containerization platform for the dynamic deployment of the different Data Analytics frameworks on HPC. The paper showcases precision agriculture workflows executed on this architecture and provides initial performance evaluation results and insights for the whole prototype design.
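To make the layering concrete, the following is a minimal sketch of how a job description received through a common submission interface might be turned into a Slurm batch script that runs its payload inside a Singularity container. It is not the Cybele implementation: the `HpcJob` class, its field names and the `render_slurm_script` helper are hypothetical and chosen only for illustration. The sketch assumes only the standard `#SBATCH` directives, `srun`, and `singularity exec`.

```python
from dataclasses import dataclass
from typing import Sequence
import shlex


@dataclass
class HpcJob:
    """Hypothetical job description; field names are illustrative only."""
    name: str
    image: str               # path to a Singularity image file
    command: Sequence[str]   # command to run inside the container
    nodes: int = 1
    tasks_per_node: int = 1
    walltime: str = "00:10:00"


def render_slurm_script(job: HpcJob) -> str:
    """Render a Slurm batch script that runs the job inside Singularity."""
    # Quote every argument so the command survives shell parsing unchanged.
    inner = " ".join(shlex.quote(arg) for arg in job.command)
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job.name}",
        f"#SBATCH --nodes={job.nodes}",
        f"#SBATCH --ntasks-per-node={job.tasks_per_node}",
        f"#SBATCH --time={job.walltime}",
        # srun launches the tasks; each task runs inside the container image.
        f"srun singularity exec {shlex.quote(job.image)} {inner}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = HpcJob(
        name="crop-analytics",
        image="analytics.sif",
        command=["python3", "train.py"],
        nodes=2,
    )
    print(render_slurm_script(demo))
```

In a setup like the one described above, the rendered script would be handed to the HPC resource manager (for example via `sbatch`), while the orchestrator keeps track of the job on the Kubernetes side. How that hand-off is actually performed in the project is not specified in this abstract.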

Acknowledgments

This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No. 825355.

Author information

Correspondence to Yiannis Georgiou.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Georgiou, Y. et al. (2020). Converging HPC, Big Data and Cloud Technologies for Precision Agriculture Data Analytics on Supercomputers. In: Jagode, H., Anzt, H., Juckeland, G., Ltaief, H. (eds) High Performance Computing. ISC High Performance 2020. Lecture Notes in Computer Science, vol 12321. Springer, Cham. https://doi.org/10.1007/978-3-030-59851-8_25

  • DOI: https://doi.org/10.1007/978-3-030-59851-8_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59850-1

  • Online ISBN: 978-3-030-59851-8

  • eBook Packages: Computer Science, Computer Science (R0)
