Recursive maximum likelihood estimation with t-distribution noise model

doi:10.1016/j.automatica.2021.109789

Automatica

Volume 132, October 2021, 109789

Abstract

In this paper, a recursive maximum likelihood estimation algorithm based on a $t$-distribution noise model is proposed for discrete-time dynamic state estimation. The proposed estimator is robust to outliers because the “thick tail” of the $t$-distribution reduces the effect of large errors in the likelihood function. A computationally efficient recursive algorithm is derived using the influence function. As the $t$-distribution reduces to the Gaussian distribution when its degrees of freedom tend to infinity, the proposed estimator reduces to the Kalman filter in that limit. The mean squared error is used to evaluate the performance of the proposed estimator. Compared with the Kalman filter, the proposed estimator is more robust to outliers in the process and measurement noise. Simulations show that for the particle filter to give a better mean squared error, its computational time must be two orders of magnitude longer than that of the proposed estimator.

Introduction

Gaussian noise is often assumed in state estimation problems. However, the Gaussian noise assumption is an approximation to reality. The occurrence of outliers, transient data in steady-state measurements, instrument failure and model nonlinearity can all induce non-Gaussian data (Abur and Gómez-Expósito, 2004, Ho et al., 2014, Ho et al., 2013, Wang and Romagnoli, 2003, Wang and Romagnoli, 2005). Outliers that are far away from the expected Gaussian distribution can give rise to misleading estimation results (Hampel et al., 2011).

The sensitivity of an estimator when the underlying noise assumption is violated has been extensively studied in the robust statistics literature (Hampel et al., 2011, Huber, 1981, Huber, 1992, Maronna et al., 2006). In particular, Huber (1992) studied the effect of outliers by contaminating the underlying noise distribution with data from an arbitrary unknown distribution. Hampel et al. (2011) proposed the influence function (IF) approach to describe the approximate effect of an observation given a distribution. In this paper, the IF is used to approximate the maximum likelihood estimation (MLE) problem. In the context of this paper, the approximate effect is the state estimate, the observation is the measurement, and the given distribution is the $t$-distribution. If the IF of an estimator is bounded and/or decreasing, or increases only slowly for large noise magnitudes, the estimator is robust to outliers.

The least-squares estimator and the Kalman filter (KF) have been widely used in real-time estimation (Shivakumar & Jain, 2008). However, it can be shown that the IFs of the least-squares estimator and the KF increase linearly with the magnitude of the noise and are unbounded (Hampel et al., 2011). This confirms the well-known fact that the least-squares estimator and the conventional KF are not robust to outliers (Abur and Gómez-Expósito, 2004, Hampel et al., 2011). On the other hand, robust estimators such as the generalized $t$-distribution estimator (Ho et al., 2014, Ho et al., 2013, Wang and Romagnoli, 2003, Wang and Romagnoli, 2005) and the quadratic-constant, quadratic-linear and multiple-segment estimators (Chen et al., 2019, Ho et al., 2017, Meliopoulos, 2004, Monticelli, 1999) have bounded or decreasing IFs, and hence various degrees of robustness (Hampel et al., 2011, Ho et al., 2017, Ho et al., 2013).
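The contrast between a linearly growing and a bounded, decreasing influence can be illustrated with a scalar sketch. The two functions below are the derivatives of the Gaussian and Student-$t$ negative log-densities with respect to a residual `r`; they are a textbook stand-in for the shape of the IF, not the IF expressions derived in the paper, and the names `gaussian_score`, `student_t_score`, `nu` and `sigma` are chosen here for illustration only.

```python
import math


def gaussian_score(r, sigma=1.0):
    """Derivative of the Gaussian negative log-density: grows linearly in r."""
    return r / sigma**2


def student_t_score(r, nu=3.0, sigma=1.0):
    """Derivative of the Student-t negative log-density.

    It peaks at r = sqrt(nu) * sigma and then decays towards zero,
    so very large residuals are progressively down-weighted.
    """
    return (nu + 1.0) * r / (nu * sigma**2 + r**2)


if __name__ == "__main__":
    for r in (0.5, 1.0, 5.0, 50.0):
        print(f"r={r:5.1f}  gaussian={gaussian_score(r):7.2f}  "
              f"student_t={student_t_score(r):6.3f}")
```

For a residual of 50 the Gaussian score is 50, while the Student-$t$ score with `nu = 3` has already fallen below its value at a residual of 1, which is the qualitative behaviour that the text associates with robustness.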

The $t$-distribution has a “thick tail” that can model outlier statistics (Kotz & Nadarajah, 2004). In addition, as a special case, the $t$-distribution reduces to the Gaussian distribution when its degrees of freedom tend to infinity (Grigelionis, 2013). Thus the $t$-distribution has the flexibility to characterize noise with Gaussian or non-Gaussian statistical properties.
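Both properties can be checked numerically with a small sketch. It uses the standard scalar location-scale Student-$t$ log-density; the function names and the parameter names `nu` (degrees of freedom) and `sigma` (scale) are assumptions made for this illustration and are not taken from the paper.

```python
import math


def gaussian_logpdf(x, sigma=1.0):
    """Log-density of a zero-mean Gaussian with standard deviation sigma."""
    return -0.5 * math.log(2.0 * math.pi * sigma**2) - 0.5 * (x / sigma) ** 2


def student_t_logpdf(x, nu, sigma=1.0):
    """Log-density of a zero-mean Student-t with nu degrees of freedom and scale sigma."""
    log_norm = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
                - 0.5 * math.log(nu * math.pi * sigma**2))
    return log_norm - 0.5 * (nu + 1.0) * math.log1p((x / sigma) ** 2 / nu)


if __name__ == "__main__":
    x = 6.0  # a residual six scale units from the mean
    for nu in (3.0, 30.0, 1e6):
        print(f"nu={nu:>9.0f}  t log-pdf={student_t_logpdf(x, nu):9.4f}")
    print(f"Gaussian        log-pdf={gaussian_logpdf(x):9.4f}")
```

With few degrees of freedom the log-density at a distant point is much larger than the Gaussian one (the thick tail), and it approaches the Gaussian value as `nu` grows.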

This paper pushes non-Gaussian state estimation work further into real-time estimation by approximating the MLE problem. A new MLE algorithm using a $t$-distribution noise model is proposed. The proposed algorithm directly uses the $t$-distribution probability density function (pdf) in its likelihood function. As the $t$-distribution reduces to the Gaussian distribution when its degrees of freedom tend to infinity (Grigelionis, 2013), the proposed estimator reduces to the KF in that limit. The IF has mainly been used as an analysis tool (Hampel et al., 2011). The main contribution of this paper is to use the IF to approximate the MLE problem and to give a recursive algorithm whose covariance matrix does not rely on actual measurements, which allows efficient real-time implementation. The covariance matrix can be calculated offline.

Robust recursive estimators with a $t$-distribution noise assumption have also been reported in the literature. Aravkin et al. (2014) and Fahrmeir and Künstler (1999) apply $t$-distribution based MLE in a recursive manner. In Fahrmeir and Künstler (1999), the estimation is formulated in a Bayesian framework by maximizing a pre-defined posterior density function and is then solved recursively. In Aravkin et al. (2014), the estimation is transformed into a quadratic programming problem and then solved iteratively. However, unlike the proposed estimator, where the covariance matrix does not rely on actual measurements and can be calculated offline, both the information matrix in Fahrmeir and Künstler (1999) and the quadratic programming in Aravkin et al. (2014) rely on real-time measurements and thus have to be calculated online. A survey of generalized Kalman smoothing is given in Aravkin et al. (2017). Besides dynamic state estimation, the generalized $t$-distribution has also been used for robust data reconciliation (Wang and Romagnoli, 2003, Wang and Romagnoli, 2005), robust parameter estimation (Ho et al., 2013) and robust filtering of the autoregressive moving-average with exogenous input model (Ho et al., 2014).

$M$-estimation based robust estimators such as the quadratic-constant estimator, the quadratic-linear estimator (Huber estimator) and the multiple-segment estimator (Hampel estimator) have been proven effective in the presence of outliers (Chen et al., 2019, Meliopoulos, 2004, Monticelli, 1999). However, these $M$-estimators are not implemented in a recursive manner and thus carry a heavy computational load because of their batch implementation (Sun et al., 2019). The proposed estimator, on the other hand, can be implemented recursively by using the IF.

Other approaches that handle non-Gaussian noise at the expense of a heavier computational load include the particle filter (PF) (Arulampalam et al., 2002), which is based on the Monte Carlo method.

The paper is organized as follows. In Section 2, the MLE with the $t$-distribution is introduced. In Section 3, the proposed recursive state estimation algorithm is derived. Examples comparing the proposed estimator with existing estimators are given in Section 4, and conclusions in Section 5.

Maximum likelihood estimation with $t$-distribution process and measurement noise

Consider the following state-space model in the discrete-time domain:

$$x(k+1) = A x(k) + B u(k) + w(k),$$
$$y(k) = C x(k) + v(k),$$

where $k = 1, \dots, N$. The state vector, system input and output are given by $x(k)$, $u(k)$ and $y(k)$, respectively. The process, input and output matrices are given by $A$, $B$ and $C$, respectively, where $A$ is assumed to be asymptotically stable, i.e. all its eigenvalues are inside the unit disc. Random variables $w(k)$ and $v(k)$ are the process noise and the measurement noise of the system, both assumed

The proposed estimator

Solving (9) for $\hat{x}_{MLE}(N)$ numerically can be time consuming for large $N$. A recursive state estimation algorithm is therefore derived using the IF.

A motivating example that can explain the IF intuitively is given in Section 3.1. The IF of the MLE problem is given in Section 3.2, and the recursive algorithm, in Section 3.3.
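To give a feel for why a batch solution becomes expensive as $N$ grows, the sketch below writes one plausible batch cost for a scalar version of the state-space model: the sum of Student-$t$ negative log-density terms over all process and measurement residuals. Equation (9) is not reproduced in this excerpt, so `batch_cost`, its argument names and the scalar setting are assumptions made for illustration, not the paper's formulation.

```python
import numpy as np


def t_neg_log_kernel(r, nu, sigma):
    """Student-t negative log-density of residual r, up to an additive constant."""
    return 0.5 * (nu + 1.0) * np.log1p((r / sigma) ** 2 / nu)


def batch_cost(x, y, u, a, b, c, nu_w, sig_w, nu_v, sig_v):
    """Hypothetical batch cost over a whole scalar trajectory x[0..N-1].

    Every state in the trajectory is an unknown, so the number of
    decision variables grows with the number of samples N.
    """
    w = x[1:] - a * x[:-1] - b * u[:-1]  # process residuals
    v = y - c * x                        # measurement residuals
    return (t_neg_log_kernel(w, nu_w, sig_w).sum()
            + t_neg_log_kernel(v, nu_v, sig_v).sum())


# Illustrative values only (not taken from the paper).
a, b, c = 0.9, 0.0, 1.0
u = np.zeros(3)
x = np.array([1.0, 0.9, 0.81])   # noise-free trajectory of x(k+1) = a x(k)
y = c * x
y_outlier = y.copy()
y_outlier[1] += 100.0            # one gross measurement outlier

clean = batch_cost(x, y, u, a, b, c, 3.0, 1.0, 3.0, 1.0)
spiked = batch_cost(x, y_outlier, u, a, b, c, 3.0, 1.0, 3.0, 1.0)
print(f"cost without outlier: {clean:.3e}")
print(f"cost with outlier:    {spiked:.3f} (quadratic penalty would be {0.5 * 100.0**2:.0f})")
```

A general-purpose optimizer applied to such a cost has to search over all $N$ states at once, which is the burden a recursive algorithm avoids. The logarithmic growth of the outlier penalty, compared with the quadratic Gaussian penalty, is also visible in the printed values.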

Examples

Consider applying the MLE on a 2-state, 3-measurement system given by (2), (3), where

$$A = \begin{bmatrix} 0.9 & 0.1 \\ 0 & 0.9 \end{bmatrix}, \qquad C = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{bmatrix},$$

with the initial condition $x(0) = [0, 0]^{T}$. The mean squared error (MSE) and computational time of the proposed estimator are compared with the KF (Lewis et al., 2007), the Modified Kalman Filter (MKF) and the PF (Arulampalam et al., 2002). The noise parameters used in the simulations are summarized in Table 2. Five simulations are carried out. Simulation 1 assumes Gaussian process and
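A minimal sketch of this kind of experiment is given below. It simulates the example system with Student-$t$ process and measurement noise and runs a standard Kalman filter as the baseline. Only $A$, $C$ and $x(0)$ come from the text: the absence of an input term, the degrees of freedom, the noise scales, the number of steps and the way the MSE is averaged are all assumptions, since Table 2 is not part of this excerpt. The proposed estimator itself is not implemented here.

```python
import numpy as np

A = np.array([[0.9, 0.1], [0.0, 0.9]])
C = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])


def simulate(n_steps, nu, scale_w, scale_v, rng):
    """Simulate x(k+1) = A x(k) + w(k), y(k) = C x(k) + v(k) with Student-t noise.

    The input term B u(k) is omitted (assumption: no input).
    """
    x = np.zeros(2)  # x(0) = [0, 0]^T as in the text
    xs, ys = [], []
    for _ in range(n_steps):
        y = C @ x + scale_v * rng.standard_t(nu, size=3)
        xs.append(x)
        ys.append(y)
        x = A @ x + scale_w * rng.standard_t(nu, size=2)
    return np.array(xs), np.array(ys)


def kalman_filter(ys, Q, R, x0, P0):
    """Standard Kalman filter; a baseline only, not the proposed estimator."""
    x, P = x0.copy(), P0.copy()
    estimates = []
    for y in ys:
        # Measurement update
        S = C @ P @ C.T + R
        K = P @ C.T @ np.linalg.inv(S)
        x = x + K @ (y - C @ x)
        P = (np.eye(2) - K @ C) @ P
        estimates.append(x)
        # Time update
        x = A @ x
        P = A @ P @ A.T + Q
    return np.array(estimates)


rng = np.random.default_rng(0)
nu, scale_w, scale_v = 3.0, 0.1, 0.5          # assumed noise parameters
xs, ys = simulate(500, nu, scale_w, scale_v, rng)

# Variance of a scaled Student-t variable is scale^2 * nu / (nu - 2) for nu > 2.
var_factor = nu / (nu - 2.0)
Q = scale_w**2 * var_factor * np.eye(2)
R = scale_v**2 * var_factor * np.eye(3)

est = kalman_filter(ys, Q, R, x0=np.zeros(2), P0=np.eye(2))
mse = np.mean(np.sum((est - xs) ** 2, axis=1))
print(f"KF mean squared error over {len(xs)} steps: {mse:.4f}")
```

Matching the KF covariances to the variance of the heavy-tailed noise is one reasonable way to set up the baseline; other tunings are possible and would change the reported MSE.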

Conclusion

This paper pushes non-Gaussian state estimation work further into real-time estimation by approximating the MLE problem. The IF is employed to give an approximate solution to the MLE problem with independent $t$-distribution process and measurement noise. The approximate solution can be formulated as a recursive algorithm, which makes it suitable for real-time applications. Under $t$-distribution noise, the simulation examples show that the proposed estimator gives an estimate with a smaller MSE

Lu Sun received the bachelor’s degree from the Department of Electrical Engineering and Automation, Honors School, Harbin Institute of Technology, Harbin, China, in 2014, and the Ph.D. degree from the Department of Electrical and Computer Engineering, Faculty of Engineering, National University of Singapore, Singapore, in 2019. He is currently a Research Scientist with the Experimental Power Grid Centre as part of the Energy Research Institute at Nanyang Technological University, Singapore. His

References (25)

Aravkin, A. et al. (2017). Generalized Kalman smoothing: Modeling and algorithms. Automatica.
Ho, W.K. et al. (2017). Variance analysis of robust state estimation in power system using influence function. International Journal of Electrical Power & Energy Systems.
Abur, A. et al. (2004). Power System State Estimation: Theory and Implementation.
Aravkin, A.Y. et al. (2014). Robust and trend-following Student’s t Kalman smoothers. SIAM Journal on Control and Optimization.
Arulampalam, M.S. et al. (2002). A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing.
Beilina, L. et al. (2017). Numerical Linear Algebra: Theory and Applications.
Chen, T. et al. (2019). Robust power system state estimation using t-distribution noise model. IEEE Systems Journal.
Fahrmeir, L. et al. (1999). Penalized likelihood smoothing in robust state space models. Metrika.
Grigelionis, B. (2013). Student’s $t$-Distribution and Related Stochastic Processes.
Hampel, F.R. et al. (2011). Robust Statistics: The Approach Based on Influence Functions (Vol. 114).
Ho, W.K. et al. (2014). Filtering of the ARMAX process with generalized t-distribution noise: The influence function approach. Industrial & Engineering Chemistry Research.
Ho, W.K. et al. (2013). Influence function analysis of parameter estimation with generalized t distribution noise model. Industrial & Engineering Chemistry Research.

Weng Khuen Ho received the B.Eng. and Ph.D. degrees in electrical engineering from the National University of Singapore, Singapore, in 1987 and 1992, respectively. He was an Engineer with the Chartered Industry of Singapore from 1987 to 1988. He joined the Department of Electrical and Computer Engineering, National University of Singapore, in 1992. He served full-time National Service in the Singapore Armed Forces as an Infantry Officer from 1979 to 1981 and in the army reserve from 1981 to 2012. He has supervised more than 30 graduate students. His research interests include control and signal processing in semiconductor manufacturing and process control. Dr. Ho was the recipient of the 2003 Best Paper Award for the IEEE Transactions on Semiconductor Manufacturing.

Keck-Voon Ling received the B.Eng. (First Class) degree in electrical engineering from the National University of Singapore, Singapore, in 1988, and the D.Phil. degree in control engineering from Oxford University, Oxford, U.K., in 1992. He is currently an Associate Professor with Nanyang Technological University, Singapore. His current research interests include model predictive control, and its embedded implementations on reconfigurable computing platforms and applications.

Tengpeng Chen received the Ph.D. degree from the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, in 2017. He is currently an Assistant Professor with the Department of Instrumental and Electrical Engineering, Xiamen University, Xiamen, China. His research interests lie in distributed optimization algorithms with application in large-scale systems, power system state estimation, energy management for microgrid, and power system dynamics.

Jan Maciejowski graduated from Sussex University in 1971 with a B.Sc. degree in Automatic Control, and from Cambridge University in 1978 with a Ph.D. degree in Control Engineering. From 1971 to 1974 he was a Systems Engineer with Marconi Space and Defence Systems Ltd, working mostly on attitude control of spacecraft and high-altitude balloon platforms. He is a Professor Emeritus of Control Engineering at Cambridge, retired since November 2018. He is also a Fellow Emeritus of Pembroke College, Cambridge, and was one of the Principal Investigators in Phase 1 of the Cambridge CARES project based in Singapore. From 2009 to 2014 he was the Head of the Information Engineering Division at Cambridge. From 2008 to 2018 he was President of Pembroke College, Cambridge. He was the President of the European Union Control Association from 2003 to 2005, and President of the Institute of Measurement and Control for 2002. He is a Chartered Engineer and a Fellow of the Institution of Engineering and Technology (IET), the Institute of Electrical and Electronics Engineers (IEEE), the Institute of Measurement and Control (InstMC), and of the International Federation of Automatic Control (IFAC). He was a Distinguished Lecturer of the IEEE Control Systems Society from 2001 to 2007. He has recently been applying optimal control methods to the study of trade-offs in the control of the Covid-19 pandemic.

☆ The authors acknowledge support by the Singapore National Research Foundation (NRF) under its Campus for Research Excellence And Technological Enterprise (CREATE) programme, specifically the Cambridge Centre for Advanced Research and Education in Singapore (Cambridge CARES, http://www.cares.cam.ac.uk), project C4T. The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Adrian George Wills under the direction of Editor Torsten Söderström.
