Abstract
ML/AI techniques, particularly those based on deep learning, will increasingly be used to accelerate scientific discovery in fusion experiments and simulation. Fusion energy devices have many disparate diagnostic instruments, capturing a broad range of interacting physics phenomena over multiple time and spatial scales. In addition, fusion experiments are increasingly built to run longer pulses, with the goal of eventually running a reactor continuously. The confluence of these facts leads to large, complex datasets with phenomena manifested over long sequences. A key challenge is enabling scientists and engineers to utilize these datasets, for example to automatically catalog events of interest, predict the onset of phenomena such as tokamak disruptions, and enable comparisons to models and simulation. Given the size, multiple modalities, and multi-scale nature of fusion data, deep learning models are attractive, but training them at these scales requires HPC resources. Many ML/AI techniques not yet fully exploited will demand even more HPC resources, such as self-supervised learning to help fusion scientists create AI models with less labelled data, and advanced sequence models which use less GPU memory at the expense of increased compute. Additionally, deep learning models will enable faster, more in-depth analysis than previously available, such as extracting physics model parameters from data using conditional variational autoencoders instead of slower techniques such as Markov chain Monte Carlo (MCMC). Comparison to simulation will also be enhanced through direct acceleration of simulation kernels using deep learning. These ML/AI techniques will give fusion scientists faster results, allowing more efficient machine use and faster scientific discovery.
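The abstract mentions amortized parameter inference with conditional variational autoencoders (CVAEs) as a faster alternative to MCMC. The sketch below illustrates that idea in minimal form, assuming a toy one-dimensional "diagnostic" signal whose amplitude and frequency play the role of physics model parameters; the dimensions, network layout, and data generator are illustrative assumptions, not code from the paper.

```python
# Minimal sketch of amortized parameter inference with a conditional variational
# autoencoder (CVAE). Dimensions, toy signal model, and training loop are all
# illustrative assumptions, not taken from the paper.
import torch
import torch.nn as nn

SIGNAL_DIM, PARAM_DIM, LATENT_DIM = 128, 2, 8

class CVAE(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder q(z | x, theta): used only during training.
        self.encoder = nn.Sequential(
            nn.Linear(SIGNAL_DIM + PARAM_DIM, 64), nn.ReLU(),
            nn.Linear(64, 2 * LATENT_DIM))
        # Conditional prior r(z | x): used alone at inference time.
        self.prior = nn.Sequential(
            nn.Linear(SIGNAL_DIM, 64), nn.ReLU(),
            nn.Linear(64, 2 * LATENT_DIM))
        # Decoder p(theta | z, x): maps latent samples to parameter estimates.
        self.decoder = nn.Sequential(
            nn.Linear(LATENT_DIM + SIGNAL_DIM, 64), nn.ReLU(),
            nn.Linear(64, PARAM_DIM))

    def forward(self, x, theta):
        mu_q, logvar_q = self.encoder(torch.cat([x, theta], dim=-1)).chunk(2, dim=-1)
        mu_p, logvar_p = self.prior(x).chunk(2, dim=-1)
        z = mu_q + torch.randn_like(mu_q) * torch.exp(0.5 * logvar_q)
        theta_hat = self.decoder(torch.cat([z, x], dim=-1))
        # Squared-error reconstruction term plus KL(q || r) between diagonal Gaussians.
        recon = ((theta - theta_hat) ** 2).sum(-1)
        kl = 0.5 * (logvar_p - logvar_q
                    + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp() - 1).sum(-1)
        return (recon + kl).mean()

    @torch.no_grad()
    def sample_posterior(self, x, n_samples=500):
        # Approximate posterior samples of theta for one observation x of shape [1, SIGNAL_DIM].
        mu_p, logvar_p = self.prior(x).chunk(2, dim=-1)
        z = mu_p + torch.randn(n_samples, LATENT_DIM) * torch.exp(0.5 * logvar_p)
        return self.decoder(torch.cat([z, x.expand(n_samples, -1)], dim=-1))

def simulate(theta):
    # Toy "diagnostic": noisy sinusoid with amplitude theta[:, 0] and frequency theta[:, 1].
    t = torch.linspace(0.0, 1.0, SIGNAL_DIM)
    clean = theta[:, :1] * torch.sin(2 * torch.pi * theta[:, 1:] * t)
    return clean + 0.05 * torch.randn(theta.shape[0], SIGNAL_DIM)

model = CVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):
    theta = torch.rand(64, PARAM_DIM) * torch.tensor([1.0, 10.0])  # draw parameters from a box prior
    loss = model(simulate(theta), theta)
    opt.zero_grad()
    loss.backward()
    opt.step()

# One forward pass per posterior sample, instead of a long MCMC chain per observation.
samples = model.sample_posterior(simulate(torch.tensor([[0.7, 4.0]])))
print(samples.mean(dim=0))
```

Once such a network is trained on simulated (signal, parameter) pairs, posterior samples for a new measurement cost one forward pass each, which is where the speed-up over per-observation MCMC comes from.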