Abstract
Identifying, from data, the factors that most strongly influence system output is one of the most challenging tasks in science and engineering. In this work, a sensitivity analysis of the generalized Gaussian process regression (SA-GGPR) model is proposed to identify the important factors of a nonlinear counting system. In SA-GGPR, a GGPR model with Poisson likelihood is adopted to describe the nonlinear counting system. This model inherits the merits of nonparametric kernel learning and of the Poisson distribution, and can handle complex nonlinear counting systems. However, its nonparametric kernel structure makes the relationships between model inputs and output difficult to interpret. SA-GGPR addresses this issue by providing a quantitative assessment of how different inputs affect the system output. Application results on a simulated nonlinear counting system and a real steel casting-rolling process demonstrate that the proposed SA-GGPR method outperforms several state-of-the-art methods in identification accuracy.
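Abstract (translated from the Chinese version)
Identifying, from data, the key factors that strongly influence system output is one of the most challenging tasks in science and engineering. For nonlinear counting systems, this paper proposes a sensitivity-analysis-based generalized Gaussian process regression (SA-GGPR) modeling method to identify the key factors affecting system output. SA-GGPR uses a GGPR model with Poisson likelihood to describe the nonlinear counting system. The GGPR model inherits the advantages of nonparametric kernel learning and of the Poisson distribution, and can handle complex nonlinear counting systems. However, because of the nonparametric kernel learning architecture of the GGPR model, the relationships between its inputs and output are difficult to understand. SA-GGPR identifies the key factors by quantitatively assessing how different inputs affect the system output. Application results on a simulated nonlinear counting system and a real steel rolling process show that SA-GGPR outperforms several advanced methods in identification accuracy.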
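The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the inference scheme, the kernel, or the exact sensitivity measure, so everything below is an assumption. The sketch assumes an RBF kernel with fixed hyperparameters (`ell`, `sf2`), a Laplace approximation for the Poisson likelihood with a log link, and a derivative-based sensitivity (the mean squared partial derivative of the latent predictive mean over the training inputs). The toy data, the function names, and the choice of an irrelevant third input are all hypothetical.

```python
import numpy as np

def rbf(A, B, ell=1.0, sf2=2.0):
    """RBF kernel matrix between rows of A and rows of B (assumed kernel)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sf2 * np.exp(-0.5 * d2 / ell**2)

def fit_poisson_gp(X, y, ell=1.0, sf2=2.0, iters=100):
    """Laplace approximation for y_i ~ Poisson(exp(f_i)), f ~ GP(0, k).

    Returns a = K^{-1} f_hat, where f_hat is the posterior mode of f.
    """
    K = rbf(X, X, ell, sf2)
    n = len(y)
    a = np.zeros(n)
    f = K @ a

    def log_post(a_, f_):
        # Log posterior up to constants: log p(y|f) - 0.5 f^T K^{-1} f
        return np.sum(y * f_ - np.exp(f_)) - 0.5 * a_ @ f_

    for _ in range(iters):
        w = np.exp(f)                      # negative Hessian of log-likelihood
        sw = np.sqrt(w)
        b = w * f + (y - w)                # W f + gradient of log-likelihood
        B = np.eye(n) + sw[:, None] * K * sw[None, :]
        a_newton = b - sw * np.linalg.solve(B, sw * (K @ b))
        # Backtracking: halve the step until the log posterior does not drop.
        t, old = 1.0, log_post(a, f)
        while True:
            a_try = a + t * (a_newton - a)
            f_try = K @ a_try
            if log_post(a_try, f_try) >= old or t < 1e-8:
                break
            t *= 0.5
        converged = np.max(np.abs(f_try - f)) < 1e-8
        a, f = a_try, f_try
        if converged:
            break
    return a

def sensitivity(X, a, ell=1.0, sf2=2.0):
    """Mean squared derivative of the latent predictive mean m(x) = k(x, X) a,
    evaluated at the training inputs, one value per input dimension."""
    K = rbf(X, X, ell, sf2)
    diff = X[:, None, :] - X[None, :, :]   # (n_eval, n_train, d)
    grad = -np.einsum('pi,pij,i->pj', K, diff, a) / ell**2
    return (grad ** 2).mean(axis=0)

# Hypothetical counting system: inputs 0 and 1 matter, input 2 does not.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
log_rate = 1.0 + np.sin(2.0 * X[:, 0]) + X[:, 1]
y = rng.poisson(np.exp(log_rate))

a = fit_poisson_gp(X, y)
s = sensitivity(X, a)
print("relative sensitivities:", s / s.sum())
```

In this toy setting the irrelevant third input should receive a much smaller sensitivity score than the first two. Note that the score is computed on the latent log-rate rather than on the expected count, which is one of several possible design choices.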
Author information
Contributions
Xinmin ZHANG designed the research, processed the data, and drafted the manuscript. Jingbo WANG, Chihang WEI, and Zhihuan SONG revised and finalized the paper.
Additional information
Compliance with ethics guidelines
Xinmin ZHANG, Jingbo WANG, Chihang WEI, and Zhihuan SONG declare that they have no conflict of interest.
Project supported by the National Natural Science Foundation of China (Nos. 62003301 and 61833014) and the Natural Science Foundation of Zhejiang Province, China (No. LQ21F030018).
About this article
Cite this article
Zhang, X., Wang, J., Wei, C. et al. Identification of important factors influencing nonlinear counting systems. Front Inform Technol Electron Eng 23, 123–133 (2022). https://doi.org/10.1631/FITEE.2000324
DOI: https://doi.org/10.1631/FITEE.2000324
Key words
- Important factors
- Nonlinear counting system
- Generalized Gaussian process regression
- Sensitivity analysis
- Steel casting-rolling process