Abstract
The next generation of high-intensity light sources, microscopes, and particle accelerators enables exciting new insights and discoveries. However, the data rates generated by these sophisticated instruments are exploding due to higher sensor scan rates and increased resolution. In parallel, the vision of connecting experiments with real-time feedback, steering, and integration demands new solutions in both hardware and software. An edge supercomputer co-located with the sensors or instruments, combined with a larger supercomputer, enables real-time processing of streaming experimental data at the edge, with resource-intensive analysis, simulation, and reconstruction at the larger cluster.
Today, post-acquisition data processing is expensive in terms of both time and storage, and it is scientifically costly because many opportunities are missed during data acquisition. We will describe how a small computational infrastructure can reduce the cost and latency of using the data as it is generated.
Using applications in ptychography and light sheet microscopy as examples, this paper will show how to build data streaming pipelines that form the foundation for real-time processing, visualization, feedback, and steering. We will show how a developer can write high-performance data processing pipelines using Python and C/C++ to integrate traditional processing with the latest ML and AI techniques. We highlight end-to-end performance profiling and optimization, as well as the NVIDIA libraries and frameworks used to build these application-driven processing pipelines from the edge to the computing center.
This work pushes us towards the vision of realizing an end-to-end workflow starting with streaming directly from the instrument at the edge to the data center.
Acknowledgements
We thank our colleagues Jack Wells, Chris Porter, and Ryan Olson for their useful feedback. We additionally thank David Shapiro and Pablo Enfedaque at ALS for their collaboration; and Gokul Upadhyaula, Matthew Mueller, Thayer Alshaabi, and Xiongtao Ruan at the Advanced Bioimaging Center, University of California, Berkeley for their ongoing collaboration.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Rietmann, M., Nakshatrala, P., Lefman, J., Gupta, G. (2022). Real-Time Edge Processing During Data Acquisition. In: Doug, K., Al, G., Pophale, S., Liu, H., Parete-Koon, S. (eds) Accelerating Science and Engineering Discoveries Through Integrated Research Infrastructure for Experiment, Big Data, Modeling and Simulation. SMC 2022. Communications in Computer and Information Science, vol 1690. Springer, Cham. https://doi.org/10.1007/978-3-031-23606-8_12
DOI: https://doi.org/10.1007/978-3-031-23606-8_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23605-1
Online ISBN: 978-3-031-23606-8
eBook Packages: Computer Science, Computer Science (R0)