Neural partially linear additive model

Zhu, Liangxuan; Li, Han; Zhang, Xuelin; Wu, Lingjuan; Chen, Hong

doi:10.1007/s11704-023-2662-3

Research Article
Published: 28 December 2023

Volume 18, article number 186334 (2024)

Frontiers of Computer Science

Liangxuan Zhu¹,
Han Li¹,
Xuelin Zhang¹,
Lingjuan Wu¹ &
…
Hong Chen¹,²,³,⁴

Abstract

Interpretability has drawn increasing attention in machine learning. Most existing work focuses on post-hoc explanations rather than on building self-explaining models. We therefore propose the Neural Partially Linear Additive Model (NPLAM), which automatically distinguishes insignificant, linear, and nonlinear features in neural networks. On the one hand, the neural network construction fits data better than spline functions with the same number of parameters; on the other hand, the learnable gate design and sparsity regularization term preserve the capacity for feature selection and structure discovery. We theoretically establish generalization error bounds for the proposed method using Rademacher complexity. Experiments on both simulated and real-world datasets verify its good performance and interpretability.
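
To make the idea of gates that sort features into insignificant, linear, and nonlinear groups more concrete, the following is a minimal NumPy sketch. It is an illustration only, not the authors' implementation: the abstract does not specify the architecture, the gate parameterization, or the exact penalty, so the two-gate layout, the one-hidden-layer ReLU subnetworks, the L1 penalty on the gates, and all names (`init_params`, `forward`, `penalty`, `classify_features`, `tol`) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(n_features, hidden=8):
    """Random parameters for an illustrative gated additive model (hypothetical layout)."""
    return {
        "beta": rng.normal(size=n_features),          # linear coefficient per feature
        "W1": rng.normal(size=(n_features, hidden)),  # per-feature subnet, input weights
        "b1": np.zeros((n_features, hidden)),         # per-feature subnet, hidden biases
        "W2": rng.normal(size=(n_features, hidden)),  # per-feature subnet, output weights
        "g_lin": np.ones(n_features),                 # assumed gate on the linear part
        "g_non": np.ones(n_features),                 # assumed gate on the nonlinear part
    }

def forward(params, X):
    """Additive prediction: sum_j [g_lin_j * beta_j * x_j + g_non_j * f_j(x_j)]."""
    # Hidden activations of each feature's own subnet: (n_samples, n_features, hidden)
    h = np.maximum(0.0, X[:, :, None] * params["W1"] + params["b1"])
    f = (h * params["W2"]).sum(axis=2)                # f_j(x_j): (n_samples, n_features)
    linear = params["g_lin"] * params["beta"] * X
    nonlinear = params["g_non"] * f
    return (linear + nonlinear).sum(axis=1)

def penalty(params, lam_lin=0.1, lam_non=0.1):
    """Assumed sparsity term: L1 norm on both gate vectors."""
    return (lam_lin * np.abs(params["g_lin"]).sum()
            + lam_non * np.abs(params["g_non"]).sum())

def classify_features(params, tol=1e-3):
    """Read the structure off the gates (the threshold tol is an assumption)."""
    labels = []
    for g_lin, g_non in zip(params["g_lin"], params["g_non"]):
        if abs(g_non) > tol:
            labels.append("nonlinear")
        elif abs(g_lin) > tol:
            labels.append("linear")
        else:
            labels.append("insignificant")
    return labels

# Gates set by hand here; in practice they would be learned jointly with the weights.
params = init_params(3)
params["g_lin"] = np.array([1.0, 0.0, 0.0])
params["g_non"] = np.array([0.0, 1.0, 0.0])
print(classify_features(params))  # ['linear', 'nonlinear', 'insignificant']
```

In this sketch the model stays additive, so each feature's contribution can be inspected on its own, and the sparsity term is what would push unneeded gates toward zero during training.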

Author information

Authors and Affiliations

College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
Liangxuan Zhu, Han Li, Xuelin Zhang, Lingjuan Wu & Hong Chen
Engineering Research Center of Intelligent Technology for Agriculture (Ministry of Education), Huazhong Agricultural University, Wuhan, 430070, China
Hong Chen
Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Shenzhen, 518000, China
Hong Chen
Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518000, China
Hong Chen

Corresponding author

Correspondence to Han Li.

Additional information

Liangxuan Zhu received his BS degree from the North China Institute of Science and Technology, China in 2019. He is currently pursuing the MS degree at the College of Informatics, Huazhong Agricultural University, China. His current research interests lie in machine learning, including deep learning, learning theory and interpretability.

Han Li received her BS degree in Mathematics and Applied Mathematics from the Faculty of Mathematics and Computer Science, Hubei University, China in 2007. She received her PhD degree from the School of Mathematics and Statistics at Beihang University, China. She worked as a project assistant professor in the Department of Mechanical Engineering, Kyushu University, Japan. She now works as an associate professor in the College of Informatics, Huazhong Agricultural University, China. Her research interests include neural networks, learning theory and pattern recognition.

Xuelin Zhang received his BE degree from the China Agricultural University, China in 2019. He is currently a PhD student with the College of Science, Huazhong Agricultural University, China. His current research interests include robust machine learning and statistical learning theory.

Lingjuan Wu received her PhD degree in Microelectronics and Solid State Electronics from Peking University, China in 2013. She visited the University of California, San Diego, USA as a research scholar from 2010 to 2012. She is currently an associate professor with the College of Informatics, Huazhong Agricultural University, China. Her research interests are in machine learning and hardware security, including learning theory, machine-learning-based side-channel analysis, and hardware Trojan detection.

Hong Chen received the BS and PhD degrees from Hubei University, China in 2003 and 2009, respectively. He worked as a postdoctoral researcher at the University of Texas, USA during 2016–2017. He is currently a professor with the College of Informatics, Huazhong Agricultural University, China. His current research interests include machine learning, statistical learning theory and approximation theory.
