DOI: 10.1145/3488560.3498448
WSDM '22 Conference Proceedings · research-article

A New Class of Polynomial Activation Functions of Deep Learning for Precipitation Forecasting

Published: 15 February 2022

Abstract

Precipitation forecasting, an important chaotic system in earth system science, cannot be solved explicitly with theory-driven models. In recent years, deep learning models have achieved great success in a variety of applications, including rainfall prediction. However, these models treat the task as image processing, regardless of the nature of the underlying physical system. We observe that the non-linear relationships learned by deep learning models, which depend mostly on the activation functions, are typically weighted piecewise continuous functions with bounded first-order derivatives. In contrast, polynomials are among the most widely used classes of functions in theory-driven models, with applications in numerical approximation, dynamical system modeling, and beyond. Researchers began using polynomial activation functions (Pacs for short) in neural networks in the 1990s, yet despite the recent surge of research applying deep learning to scientific problems, this powerful class of basis functions is rarely used. In this paper, we investigate why and argue that, although polynomials are good at information extraction, they are too fragile to train stably. We solve their serious data-flow explosion problem with Chebyshev polynomials and prepended normalization, which enables networks with Pacs to go deep. To further enhance training robustness, we propose a normalization called Range Norm. Performance on a synthetic dataset and a summer precipitation prediction task validates the necessity of this class of activation functions for simulating complex physical mechanisms. This new tool for deep learning points toward a new way of automating theoretical physics analysis.
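The abstract's recipe (normalize inputs into a bounded range, then apply a polynomial activation built on the Chebyshev basis) can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names `range_norm` and `cheb_activation`, the per-batch min/max normalization, and the example coefficients are all assumptions. The key property it demonstrates is that once inputs lie in [-1, 1], every Chebyshev basis term is bounded by 1, so stacking such activations cannot blow up the data flow the way raw powers of x would.

```python
# Hypothetical sketch of a Chebyshev polynomial activation ("Pac")
# with a prepended range normalization, in the spirit of the abstract.

def range_norm(xs, eps=1e-8):
    """Linearly rescale a batch of values into [-1, 1], the interval on
    which every Chebyshev polynomial T_k is bounded by 1.  The min/max
    formulation here is an illustrative assumption, not the paper's
    exact Range Norm."""
    lo, hi = min(xs), max(xs)
    scale = max(hi - lo, eps)
    return [2.0 * (x - lo) / scale - 1.0 for x in xs]

def cheb_activation(x, coeffs):
    """Evaluate sum_k coeffs[k] * T_k(x) via the stable recurrence
    T_0(x) = 1, T_1(x) = x, T_{k+1}(x) = 2x*T_k(x) - T_{k-1}(x)."""
    t_prev, t_curr = 1.0, x
    out = coeffs[0] * t_prev
    if len(coeffs) > 1:
        out += coeffs[1] * t_curr
    for c in coeffs[2:]:
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
        out += c * t_curr
    return out

# Normalize first, then activate: outputs stay bounded by sum(|coeffs|).
xs = range_norm([-3.0, 0.0, 5.0])          # -> [-1.0, -0.25, 1.0]
ys = [cheb_activation(x, [0.0, 1.0, 0.5]) for x in xs]
```

In a real network the coefficients would be learnable parameters per layer; the point of the recurrence is that, unlike the monomial basis 1, x, x², ..., the Chebyshev terms never exceed 1 in magnitude on the normalized domain.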

Supplementary Material

MP4 File (WSDM22-fp391.mp4)
Presentation video of the paper "A New Class of Polynomial Activation Functions of Deep Learning for Precipitation Forecasting". To approximate the non-linear relationships in the rainfall system, we use polynomial activation functions, which are more expressive but unstable to train. We solve this problem by using Chebyshev polynomials as basis functions with a prepended Range Norm normalization. Evaluation results show that our modules outperform previous methods on both synthetic and real datasets.


Cited By

  • (2025) Efficient CORDIC-Based Activation Functions for RNN Acceleration on FPGAs. IEEE Transactions on Artificial Intelligence 6(1), 199-210. DOI: 10.1109/TAI.2024.3474648. Published: Jan 2025
  • (2024) Padé-ResNet: Improving the Accuracy and Stability of Medical Image Classification. 2024 IEEE 17th International Conference on Signal Processing (ICSP), 662-667. DOI: 10.1109/ICSP62129.2024.10846648. Published: 28 Oct 2024
  • (2024) Development of Ensemble Probabilistic Machine Learning Models for Rainfall Predictions. Advances in Mathematical Modelling, Applied Analysis and Computation, 175-195. DOI: 10.1007/978-3-031-56304-1_11. Published: 29 Mar 2024
  • (2022) Nonlinear H∞ Path Following Control for Autonomous Ground Vehicles via Neural Network and Policy Iteration Algorithm. Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering 238(6), 1670-1683. DOI: 10.1177/09544070221145468. Published: 27 Dec 2022

        Published In

        WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
        February 2022
        1690 pages
        ISBN:9781450391320
        DOI:10.1145/3488560
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

        Author Tags

        1. heterogeneous data
        2. neural networks
        3. polynomials
        4. precipitation

        Qualifiers

        • Research-article

        Funding Sources

        • Hong Kong RGC Theme-based project
        • National Key Research and Development Program of China Grant
        • Hong Kong RGC AOE Project

        Conference

        WSDM '22

        Acceptance Rates

        Overall Acceptance Rate 498 of 2,863 submissions, 17%

        Article Metrics

• Downloads (last 12 months): 71
• Downloads (last 6 weeks): 6
        Reflects downloads up to 13 Feb 2025
