Abstract
Solving partial differential equations (PDEs) on fine spatio-temporal scales for high-fidelity solutions is critical for numerous scientific breakthroughs. Yet this process can be prohibitively expensive owing to the inherent complexities of the problems, including nonlinearity and multiscale phenomena. To speed up large-scale computations, a process known as downscaling is employed, which generates high-fidelity approximate solutions from their low-fidelity counterparts. In this paper, we propose a novel Physics-Guided Diffusion Model (PGDM) for downscaling. Our model, initially trained on a dataset of paired low- and high-fidelity solutions across coarse and fine scales, generates new high-fidelity approximations from any new low-fidelity input. These outputs are subsequently refined through fine-tuning that minimizes the physical discrepancy defined by the discretized PDE at the finer scale. We evaluate and benchmark our model against other downscaling baselines on three categories of nonlinear PDEs. Our numerical experiments demonstrate that our model not only outperforms the baselines but also achieves a computational acceleration exceeding tenfold, while maintaining the same level of accuracy as conventional fine-scale solvers.
Data Availability
All data reported in the manuscript were generated through a Python implementation of the methods outlined in the paper. The source code is available at https://github.com/woodssss/Generative-downsscaling-PDE-solvers.
References
Apte, R., Nidhan, S., Ranade, R., Pathak, J.: Diffusion model based data generation for partial differential equations. arXiv preprint arXiv:2306.11075 (2023)
Arisaka, S., Li, Q.: Principled acceleration of iterative numerical methods using machine learning. Proceedings of the 40th International Conference on Machine Learning (2023)
Azulay, Y., Treister, E.: Multigrid-augmented deep learning preconditioners for the Helmholtz equation. SIAM J. Sci. Comput. 45(3), S127–S151 (2022)
Baño-Medina, J., Manzanas, R., Gutiérrez, J.M.: Configuration and intercomparison of deep learning neural models for statistical downscaling. Geosci. Model Dev. 13(4), 2109–2124 (2020)
Cai, S., Mao, Z., Wang, Z., Yin, M., Karniadakis, G.E.: Physics-informed neural networks (PINNs) for fluid mechanics: a review. Acta Mech. Sin. 37(12), 1727–1738 (2021)
Chen, Y., Dong, B., Xu, J.: Meta-MgNet: meta multigrid networks for solving parameterized partial differential equations. J. Comput. Phys. 455, 110996 (2022)
Dhariwal, P., Nichol, A.: Diffusion models beat GANs on image synthesis. Adv. Neural Inf. Process. Syst. 34, 8780–8794 (2021)
Farimani, A.B., Gomes, J., Pande, V.S.: Deep learning the physics of transport phenomena. arXiv preprint arXiv:1709.02432 (2017)
Goswami, S., Bora, A., Yu, Y., Karniadakis, G.E.: Physics-informed neural operators. arXiv preprint arXiv:2207.05748 (2022)
Goswami, S., Bora, A., Yu, Y., Karniadakis, G.E.: Physics-informed deep neural operator networks. In: Machine Learning in Modeling and Simulation: Methods and Applications, pp. 219–254. Springer (2023)
Groenke, B., Madaus, L., Monteleoni, C.: Climalign: Unsupervised statistical downscaling of climate variables via normalizing flows. In: Proceedings of the 10th International Conference on Climate Informatics, pp. 60–66 (2020)
Han, J., Jentzen, A., E, W.: Solving high-dimensional partial differential equations using deep learning. Proc. Natl. Acad. Sci. 115(34), 8505–8510 (2018)
Han, J., Jentzen, A., et al.: Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Commun. Math. Stat. 5(4), 349–380 (2017)
Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 33, 6840–6851 (2020)
Ho, J., Saharia, C., Chan, W., Fleet, D.J., Norouzi, M., Salimans, T.: Cascaded diffusion models for high fidelity image generation. J. Mach. Learn. Res. 23(1), 2249–2281 (2022)
Hsieh, J.T., Zhao, S., Eismann, S., Mirabella, L., Ermon, S.: Learning neural PDE solvers with convergence guarantees. In: International Conference on Learning Representations (2018)
Jiang, P., Yang, Z., Wang, J., Huang, C., Xue, P., Chakraborty, T., Chen, X., Qian, Y.: Efficient super-resolution of near-surface climate modeling using the Fourier neural operator. J. Adv. Model. Earth Syst. 15(7), e2023MS003800 (2023)
Jin, S., Ma, Z., Wu, K.: Asymptotic-preserving neural networks for multiscale time-dependent linear transport equations. J. Sci. Comput. 94(3), 57 (2023)
Joshi, A., Shah, V., Ghosal, S., Pokuri, B., Sarkar, S., Ganapathysubramanian, B., Hegde, C.: Generative models for solving nonlinear partial differential equations. In: Proc. of NeurIPS Workshop on ML for Physics (2019)
Kharazmi, E., Zhang, Z., Karniadakis, G.E.: Variational physics-informed neural networks for solving partial differential equations. arXiv preprint arXiv:1912.00873 (2019)
Kharazmi, E., Zhang, Z., Karniadakis, G.E.: hp-VPINNs: variational physics-informed neural networks with domain decomposition. Comput. Methods Appl. Mech. Eng. 374, 113547 (2021)
Kovachki, N., Li, Z., Liu, B., Azizzadenesheli, K., Bhattacharya, K., Stuart, A., Anandkumar, A.: Neural operator: Learning maps between function spaces. arXiv preprint arXiv:2108.08481 (2021)
Leinonen, J., Nerini, D., Berne, A.: Stochastic super-resolution for downscaling time-evolving atmospheric fields with a generative adversarial network. IEEE Trans. Geosci. Remote Sens. 59(9), 7211–7223 (2020)
Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A., Anandkumar, A.: Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895 (2020)
Li, Z., Zheng, H., Kovachki, N., Jin, D., Chen, H., Liu, B., Azizzadenesheli, K., Anandkumar, A.: Physics-informed neural operator for learning partial differential equations. arXiv preprint arXiv:2111.03794 (2021)
Lu, J., Lu, Y.: A priori generalization error analysis of two-layer neural networks for solving high dimensional Schrödinger eigenvalue problems. Commun. Am. Math. Soc. 2(1), 1–21 (2022)
Lu, L., Jin, P., Karniadakis, G.E.: DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators. arXiv preprint arXiv:1910.03193 (2019)
Lu, Y., Lu, J., Wang, M.: A priori generalization analysis of the deep Ritz method for solving high dimensional elliptic partial differential equations. In: Conference on Learning Theory, pp. 3196–3241. PMLR (2021)
Lu, Y., Wang, L., Xu, W.: Solving multiscale steady radiative transfer equation using neural networks with uniform stability. Res. Math. Sci. 9(3), 45 (2022)
Nikolopoulos, S., Kalogeris, I., Stavroulakis, G., Papadopoulos, V.: AI-enhanced iterative solvers for accelerating the solution of large-scale parametrized systems. Int. J. Numer. Methods Eng. 125(2), e7372 (2024)
Price, I., Rasp, S.: Increasing the accuracy and resolution of precipitation forecasts using deep generative models. In: International Conference on Artificial Intelligence and Statistics, pp. 10555–10571. PMLR (2022)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707 (2019)
Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18, pp. 234–241. Springer (2015)
Sachindra, D., Ahmed, K., Rashid, M.M., Shahid, S., Perera, B.: Statistical downscaling of precipitation using machine learning techniques. Atmos. Res. 212, 240–258 (2018)
Shu, D., Li, Z., Farimani, A.B.: A physics-informed diffusion model for high-fidelity flow field reconstruction. J. Comput. Phys. 478, 111972 (2023)
Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. arXiv preprint arXiv:2010.02502 (2020)
Song, Y., Sohl-Dickstein, J., Kingma, D.P., Kumar, A., Ermon, S., Poole, B.: Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456 (2020)
Um, K., Brand, R., Fei, Y.R., Holl, P., Thuerey, N.: Solver-in-the-loop: learning from differentiable physics to interact with iterative PDE solvers. Adv. Neural Inf. Process. Syst. 33, 6111–6122 (2020)
Vandal, T., Kodra, E., Ganguly, A.R.: Intercomparison of machine learning methods for statistical downscaling: the case of daily and extreme precipitation. Theor. Appl. Climatol. 137, 557–570 (2019)
Vandal, T., Kodra, E., Ganguly, S., Michaelis, A., Nemani, R., Ganguly, A.R.: DeepSD: Generating high resolution climate change projections through single image super-resolution. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1663–1672 (2017)
Wang, S., Perdikaris, P.: Long-time integration of parametric evolution equations with physics-informed deeponets. J. Comput. Phys. 475, 111855 (2023)
Wang, S., Sankaran, S., Wang, H., Perdikaris, P.: An expert’s guide to training physics-informed neural networks. arXiv preprint arXiv:2308.08468 (2023)
Wang, S., Wang, H., Perdikaris, P.: Learning the solution operator of parametric partial differential equations with physics-informed DeepONets. Sci. Adv. 7(40), eabi8605 (2021)
Wei, M., Zhang, X.: Super-resolution neural operator. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 18247–18256 (2023)
Weinan, E., Yu, B.: The deep Ritz method: a deep learning-based numerical algorithm for solving variational problems. Commun. Math. Stat. 6(1), 1–12 (2018)
Wilby, R.L., Wigley, T., Conway, D., Jones, P., Hewitson, B., Main, J., Wilks, D.: Statistical downscaling of general circulation model output: a comparison of methods. Water Resour. Res. 34(11), 2995–3008 (1998)
Yang, G., Sommer, S.: A denoising diffusion model for fluid field prediction. arXiv preprint arXiv:2301.11661 (2023)
Yang, Q., Hernandez-Garcia, A., Harder, P., Ramesh, V., Sattegeri, P., Szwarcman, D., Watson, C.D., Rolnick, D.: Fourier neural operators for arbitrary resolution climate data downscaling. arXiv preprint arXiv:2305.14452 (2023)
Yu, J., Lu, L., Meng, X., Karniadakis, G.E.: Gradient-enhanced physics-informed neural networks for forward and inverse pde problems. Comput. Methods Appl. Mech. Eng. 393, 114823 (2022)
Zang, Y., Bao, G., Ye, X., Zhou, H.: Weak adversarial networks for high-dimensional partial differential equations. J. Comput. Phys. 411, 109409 (2020)
Acknowledgements
YL gratefully acknowledges support from the National Science Foundation through award DMS-2343135 and from the Data Science Initiative at the University of Minnesota through a MnDRIVE DSI Seed Grant.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Neural Network Architecture and Hyperparameters
Our diffusion models are based on the DDPM architecture [14], which uses a U-Net [33] as the backbone (Table 5). In our experiments, we omit self-attention, which significantly reduces training time while maintaining similar sample quality. The base channel count, the list of Down/Up channel multipliers, and the list of middle channels are the U-Net hyperparameters detailed in Table 6. To accelerate the sampling process with DDIM, we take the subsampled time steps \(\tau \) to be \([1, 5, 10, 15, 20, 25, \cdots , T-5, T]\). The linear noise schedule is configured from \(\beta _0 = 0.0001\) to \(\beta _{T} = 0.02\).
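For concreteness, the following is a minimal sketch of how such a linear noise schedule and the subsampled DDIM time steps can be constructed; the total step count \(T\) and the variable names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Minimal sketch (T and names are assumptions): a linear beta schedule from
# 1e-4 to 0.02 and the DDIM subsampled steps tau = [1, 5, 10, ..., T-5, T].
T = 1000                                  # total diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)        # beta_1, ..., beta_T
alphas_bar = np.cumprod(1.0 - betas)      # cumulative products used in sampling
tau = np.concatenate(([1], np.arange(5, T + 1, 5)))  # 1, 5, 10, ..., T-5, T
assert tau[-1] == T and (T - 5) in tau
```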
During training, we use the Adam optimizer with a dynamic learning rate that decays every 5000 steps at a rate of 0.05. The total number of training epochs is set to 10,000.
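As a rough sketch, this optimizer setup could be expressed in PyTorch as follows; the placeholder module and the reading of the decay as a multiplicative factor of 0.05 applied via StepLR are assumptions.

```python
import torch

# Sketch under assumptions: a placeholder module stands in for the U-Net,
# and the stated decay is read as a multiplicative factor of 0.05 applied
# every 5000 optimizer steps.
model = torch.nn.Linear(8, 8)  # placeholder for the DDPM U-Net
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5000, gamma=0.05)
```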
The FNO architecture follows that described in [24]. The number of lifting channels, the number of FFT truncation modes, and the number of Fourier layers for each example are specified in Table 7. During training, we use the Adam optimizer with the same schedule, decaying the learning rate every 5000 steps at a rate of 0.05. Training continues until the loss drops below \(10^{-6}\) or the maximum of 50,000 iterations is reached.
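A minimal sketch of this stopping rule is given below; the toy model, data, and loss are placeholders rather than the paper's FNO implementation.

```python
import torch

# Toy training loop illustrating the stopping rule: stop once the loss
# drops below 1e-6 or after 50000 iterations. All components are placeholders.
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5000, gamma=0.05)
loss_fn = torch.nn.MSELoss()
x, y = torch.randn(64, 16), torch.randn(64, 16)

for it in range(50_000):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()
    if loss.item() < 1e-6:   # early stop once the loss threshold is met
        break
```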
Model training was performed on an NVIDIA RTX 3070 graphics card, while prediction and Gauss–Newton refinement were executed on an AMD Ryzen 7 3700X processor.
Levenberg–Marquardt Algorithm
In this part, we present the Levenberg–Marquardt (LM) algorithm for solving the nonlinear optimization problems (9) and (10); see Algorithm 3. In all of our numerical experiments, we fix \(\lambda = 0.5\) and \(\eta = 10^{-5}\).
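For illustration, a minimal LM iteration with these fixed values might look as follows; the residual, Jacobian, and iteration cap are assumptions and do not reproduce Algorithm 3.

```python
import numpy as np

# A minimal Levenberg-Marquardt sketch with the fixed damping lambda = 0.5
# and tolerance eta = 1e-5 stated above. The residual, its Jacobian, and
# max_iter are illustrative assumptions, not the paper's Algorithm 3.
def levenberg_marquardt(residual, jacobian, u0, lam=0.5, eta=1e-5, max_iter=100):
    u = u0.copy()
    for _ in range(max_iter):
        r = residual(u)
        J = jacobian(u)
        # Damped normal equations: (J^T J + lam * I) delta = -J^T r
        A = J.T @ J + lam * np.eye(u.size)
        delta = np.linalg.solve(A, -J.T @ r)
        u = u + delta
        if np.linalg.norm(delta) < eta:   # stop when the update is small
            break
    return u

# Toy usage: solve u^3 + u - 1 = 0 componentwise (root near 0.6823).
res = lambda u: u**3 + u - 1.0
jac = lambda u: np.diag(3.0 * u**2 + 1.0)
print(levenberg_marquardt(res, jac, np.zeros(4)))
```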
Physics-Informed Diffusion Model
For the physics-informed diffusion model [35], our numerical tests suggest that conditioning the diffusion model on both the gradient information and the coarse solution yields better performance than the vanilla PIDM, which is conditioned solely on the gradient information. When applying the PIDM to the 2D nonlinear Poisson equation, the conditioning information is defined as the gradient of the \(L^2\) misfit between the discrete PDE residual and the source term. We employ the same architecture and train the model using Algorithm 1, with the loss function modified to incorporate this gradient conditioning.
The gradient guidance strength is set to \(w=1\). We tested several time-step locations \(t_s\) in the backward diffusion process (\(t_s \in \{20, 100, 200, 400\}\)) and found that \(t_s = 20\) performs best, so it is adopted in the model. As shown in Fig. 3, DDPM outperforms PIDM; we offer a heuristic explanation below. The inputs to the two score networks differ. At a given time \(t\), the PIDM score network takes \(\varvec{x}_t, t, \varvec{u}^c, \varvec{g}\) as inputs, where \(\varvec{g}\) is the output of a fixed, problem-dependent function of \(\varvec{x}_t\) and the source term \(\varvec{a}\). In contrast, the DDPM score network takes \(\varvec{x}_t, t, \varvec{u}^c, \varvec{a}\) as inputs. Intuitively, including the gradient information as an additional input is more informative than the source term alone. However, it significantly increases training complexity, especially when the residual function is complicated and the total number of time steps \(N_t\) is large, making training much harder and potentially degrading performance when training data are limited.
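To make this contrast concrete, the following toy sketch assembles the two sets of score-network inputs; the channel-wise concatenation and the misfit_gradient stand-in are assumptions about a typical conditional U-Net interface, not the paper's code.

```python
import torch

def misfit_gradient(x_t, a):
    # Placeholder for the fixed, problem-dependent map g(x_t, a); here the
    # gradient of 0.5 * ||x_t - a||^2, i.e. simply x_t - a.
    return x_t - a

def ddpm_input(x_t, u_c, a):
    # DDPM variant: condition on the coarse solution u_c and source term a.
    return torch.cat([x_t, u_c, a], dim=1)

def pidm_input(x_t, u_c, a):
    # PIDM variant: condition on u_c and the gradient information g instead.
    return torch.cat([x_t, u_c, misfit_gradient(x_t, a)], dim=1)

x_t = torch.randn(2, 1, 64, 64)   # noisy fine-scale state at time t
u_c, a = torch.randn_like(x_t), torch.randn_like(x_t)
print(ddpm_input(x_t, u_c, a).shape, pidm_input(x_t, u_c, a).shape)
```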
Gauss–Newton Algorithm
To refine the solutions obtained from the coarse solver, the diffusion model, and the FNO, we introduce a one-step Gauss–Newton refinement process, outlined in Algorithm 4.
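As a hedged illustration, a single Gauss–Newton step on a generic discrete residual might look as follows; the toy residual and Jacobian are assumptions and do not reproduce Algorithm 4.

```python
import numpy as np

# A minimal one-step Gauss-Newton refinement sketch: starting from the
# network (or coarse-solver) prediction u0, take a single Gauss-Newton step
# on the fine-scale discrete residual.
def gauss_newton_refine(residual, jacobian, u0):
    r = residual(u0)
    J = jacobian(u0)
    # Solve the (undamped) normal equations J^T J delta = -J^T r.
    delta = np.linalg.solve(J.T @ J, -J.T @ r)
    return u0 + delta

# Toy usage: refine a rough guess for u^3 + u - 1 = 0.
res = lambda u: u**3 + u - 1.0
jac = lambda u: np.diag(3.0 * u**2 + 1.0)
print(gauss_newton_refine(res, jac, 0.7 * np.ones(4)))  # ~0.6823 after one step
```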
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lu, Y., Xu, W. Generative Downscaling of PDE Solvers with Physics-Guided Diffusion Models. J Sci Comput 101, 71 (2024). https://doi.org/10.1007/s10915-024-02709-9