Abstract
This paper presents a novel method for estimating the impulse response function of Linear Time-Invariant systems from input-output data by means of Laguerre functions and Convolved Gaussian Processes. We define a new non-stationary covariance function that encodes the convolution between the Laguerre functions and the input. The input (excitation) is modelled by a Gaussian Process prior, which allows us to estimate the system's impulse response by maximum likelihood estimation over the model hyperparameters. In addition, the proposed model performs well in scenarios with missing and noisy data.
1 Introduction
Ordinary differential equations (ODEs) are used to describe Linear Time-Invariant (LTI) systems in many fields, such as biology, engineering, and economics [3]. In general, an ODE can be represented as
\[ \sum_{i=0}^{p} b_i \frac{d^{i} f(t)}{dt^{i}} = u(t), \qquad (1) \]
where p is the order of the differential equation, \(b_i\) (with \(i \in \{0, 1,\ldots ,\ p \}\)) is a coefficient that weights the i-th derivative of f(t) with respect to t, u(t) is the forcing function, and f(t) is the solution. Equation (1) can be rewritten as the convolution
\[ f(t) = \int_{0}^{t} h(t-\tau)\, u(\tau)\, d\tau, \qquad (2) \]
where h(t) is a function of all the \(b_i\) coefficients and is also known as the impulse response function (IRF).
In [2], a Gaussian Process (GP) model is introduced in which the covariance function of the output is built from the convolution of a smoothing kernel and the covariance of a white-noise process. This model was later extended into Latent Force Models (LFMs) [1] by replacing the smoothing kernel with the system's IRF; thus, in the LFM framework, the IRF is encoded within the covariance function. Moreover, LFMs are an attractive tool because system identification can be performed by learning the model from the available output data.
In many system identification applications, the order of the ODE and its parametrization are unknown. This problem has been addressed by approximating the IRF using Orthogonal Basis Functions (OBFs) [8, 10]. Laguerre functions are a type of OBF characterized by a single pole, or parameter, that controls the waveform of the basis functions. Laguerre functions have been widely used in system identification tasks, as in [5, 12].
In this paper, we propose to estimate the IRF using Laguerre functions encoded in the covariance function of a Convolved Gaussian Process (CGP). The Laguerre parameters are learned by maximizing the CGP marginal likelihood.
This paper is organized as follows. In Sect. 2, a brief overview of CGP theory is given, together with the mathematical construction of all covariance functions required to perform system identification. In Sect. 3, the Laguerre functions are introduced. Using the theory developed in the previous sections, the proposed model is presented in Sect. 4. Results showing the ability of the model to estimate the IRF under different scenarios are explored in Sect. 5. Finally, a discussion of similar approaches is given in Sect. 6.
2 Convolved Gaussian Processes
In this section, an overview of the mathematical foundation of CGPs is given. We start by considering a multiple-input multiple-output (MIMO) dynamical system with D outputs and Q inputs, in which the d-th output is described by
\[ y_d(t) = f_d(t) + \epsilon_d(t), \qquad (3) \]
with \(\epsilon _d(t) \sim \mathcal {N}(0, \sigma _d^2)\) and
\[ f_d(t) = \sum_{q=1}^{Q} \int_{0}^{t} h_d(t-\tau)\, u_q(\tau)\, d\tau, \qquad (4) \]
where \(h_d(t)\) is a smoothing kernel that corresponds to the system's IRF [1]. The inputs \(\{u_q(t)\}_{q=1}^{Q}\) are independent GPs, each defined as
\[ u_q(t) \sim \mathcal{GP}\left(0,\, k_{u_q,u_q}(t,t')\right), \]
where \(k_{u_q,u_q}(t,t')\) is the covariance function used to model the input functions. As in [1], we adopt the squared exponential covariance function, defined as
\[ k_{u_q,u_q}(t,t') = \exp\left(-\frac{(t-t')^2}{l_q^2}\right), \qquad (5) \]
where \(l_q\) is the lengthscale associated with the q-th input. From Eqs. (3), (4) and (5), we can model the set of outputs \(\{y_d(t)\}_{d=1}^{D}\) as a GP with zero mean and covariance function given by
\[ k_{y_d,y_{d'}}(t,t') = k_{f_d,f_{d'}}(t,t') + \sigma_d^2\,\delta_{d,d'}\,\delta_{t,t'}, \]
with
\[ k_{f_d,f_{d'}}(t,t') = \sum_{q=1}^{Q} k^{(q)}_{f_d,f_{d'}}(t,t'), \]
where \(k^{(q)}_{f_d, f_{d'}}(t,t')\) is defined as
\[ k^{(q)}_{f_d,f_{d'}}(t,t') = \int_{0}^{t}\int_{0}^{t'} h_d(t-\tau)\, h_{d'}(t'-\tau')\, k_{u_q,u_q}(\tau,\tau')\, d\tau'\, d\tau. \]
Additionally, we need to define the cross-covariance between outputs and inputs,
\[ k_{f_d,u_q}(t,t') = \int_{0}^{t} h_d(t-\tau)\, k_{u_q,u_q}(\tau,t')\, d\tau. \]
The covariance function \(k_{f_d,u_q}(t,t')\) allows us to make predictions from the output domain to the input domain, and vice versa. This property is exploited by the proposed model in Sect. 4.
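As a concrete illustration (ours, not part of the original derivation), the squared exponential covariance of Eq. (5) can be evaluated on a time grid in a few lines of Python; the exponent form \(\exp(-(t-t')^2/l_q^2)\) follows [1], and the function name is our own:

```python
import numpy as np

def se_cov(t, t_prime, lengthscale):
    # Squared exponential covariance k(t, t') = exp(-(t - t')^2 / l^2).
    t = np.asarray(t, dtype=float).reshape(-1, 1)
    t_prime = np.asarray(t_prime, dtype=float).reshape(1, -1)
    return np.exp(-((t - t_prime) ** 2) / lengthscale ** 2)

# A covariance matrix on a time grid: symmetric, unit diagonal,
# decaying with the distance |t - t'|.
t = np.linspace(0.0, 4.0, 50)
K = se_cov(t, t, lengthscale=0.8)
```

The resulting Gram matrix is symmetric positive semidefinite, which is what makes it usable as a GP covariance.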
3 Laguerre Functions
Laguerre functions form a complete and orthonormal basis set, where each function is defined as [5]
\[ l_m(t) = \sqrt{2\gamma}\, e^{-\gamma t}\, \mathcal{L}_m(2\gamma t), \]
where \(\gamma \) is a free parameter known as Laguerre scale [4], and
The m-th degree polynomial \(\mathcal {L}_m(t)\) is the m-th Laguerre polynomial. Note that the Laguerre function \(l_m(t)\) has m zero crossings, determined by the zeros of \(\mathcal {L}_m(t)\). Additionally, each Laguerre function represents a dynamical system characterized by a single pole with multiplicity; specifically, the Laplace transform of the m-th Laguerre function is given by
\[ L_m(s) = \sqrt{2\gamma}\, \frac{(s-\gamma)^m}{(s+\gamma)^{m+1}}, \]
where the Laguerre scale parameter \(\gamma \) controls the position of the pole and the zeros of each Laguerre function.
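For concreteness, here is a minimal Python sketch of the Laguerre basis (our own code, not from the paper). It uses the common time-domain convention \(l_m(t)=\sqrt{2\gamma}\,e^{-\gamma t}\mathcal{L}_m(2\gamma t)\), which is consistent with the Laplace transform above, and verifies orthonormality numerically:

```python
import numpy as np

def laguerre_poly(m, x):
    # m-th Laguerre polynomial via the three-term recurrence
    # (k + 1) L_{k+1}(x) = (2k + 1 - x) L_k(x) - k L_{k-1}(x).
    x = np.asarray(x, dtype=float)
    L_prev, L = np.ones_like(x), 1.0 - x
    if m == 0:
        return L_prev
    for k in range(1, m):
        L_prev, L = L, ((2 * k + 1 - x) * L - k * L_prev) / (k + 1)
    return L

def laguerre_function(m, t, gamma=1.0):
    # l_m(t) = sqrt(2*gamma) * exp(-gamma*t) * L_m(2*gamma*t).
    t = np.asarray(t, dtype=float)
    return np.sqrt(2.0 * gamma) * np.exp(-gamma * t) * laguerre_poly(m, 2.0 * gamma * t)

# Numerical check of orthonormality on [0, inf): the Gram matrix of the
# first four Laguerre functions should be close to the identity.
t = np.linspace(0.0, 25.0, 20001)

def inner(m, n):
    y = laguerre_function(m, t) * laguerre_function(n, t)
    return np.sum(0.5 * (y[:-1] + y[1:]) * np.diff(t))   # trapezoidal rule

G = np.array([[inner(m, n) for n in range(4)] for m in range(4)])
```

Because the products decay like \(e^{-2\gamma t}\), truncating the integral at t = 25 (for \(\gamma = 1\)) introduces negligible error.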
4 Proposed Method
In LFMs, the IRF is assumed to be known beforehand, which allows the CGP to predict the input values solely from the output data. Here, we are interested in approximating the IRF by means of Laguerre functions, which requires input-output data. Thus, the IRF of the d-th output is approximated by the Laguerre functions as
\[ h_d(t) \approx \sum_{m=0}^{M-1} c_{d,m}\, l_m(t), \qquad (6) \]
where \(c_{d,m}\) weights the m-th Laguerre function of the d-th output. Then (4) becomes
\[ f_d(t) = \sum_{q=1}^{Q} \sum_{m=0}^{M-1} c_{d,m} \int_{0}^{t} l_m(t-\tau)\, u_q(\tau)\, d\tau. \qquad (7) \]
Next, we define the covariance functions for the new model described in (7). The covariance function \(k^{(q)}_{f_d, f_{d'}}(t,t')\) becomes
\[ k^{(q)}_{f_d,f_{d'}}(t,t') = \sum_{m=0}^{M-1}\sum_{m'=0}^{M-1} c_{d,m}\, c_{d',m'} \int_{0}^{t}\int_{0}^{t'} l_m(t-\tau)\, l_{m'}(t'-\tau')\, k_{u_q,u_q}(\tau,\tau')\, d\tau'\, d\tau, \]
and \(k_{f_d,u_q}(t,t')\) is rewritten as
\[ k_{f_d,u_q}(t,t') = \sum_{m=0}^{M-1} c_{d,m} \int_{0}^{t} l_m(t-\tau)\, k_{u_q,u_q}(\tau,t')\, d\tau. \]
Note that, for \(m>0\), the convolutions in the above covariance functions have no closed form. Consequently, we approximate them by discrete sums, as in [6]. This increases the computation time for evaluating the covariance function, but it also allows the model to use any covariance function for the inputs.
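The discrete-sum approximation of the cross-covariance \(k_{f_d,u_q}(t,t')\) can be sketched as follows. This is our own illustration: the coefficient values, Laguerre scale, and lengthscale are hypothetical, and a simple left Riemann sum stands in for whatever quadrature rule [6] actually uses:

```python
import numpy as np

def laguerre_function(m, t, gamma=1.0):
    # l_m(t) = sqrt(2*gamma) * exp(-gamma*t) * L_m(2*gamma*t), with L_m the
    # m-th Laguerre polynomial evaluated by its three-term recurrence.
    t = np.asarray(t, dtype=float)
    x = 2.0 * gamma * t
    L_prev, L = np.ones_like(x), 1.0 - x
    if m == 0:
        L = L_prev
    for k in range(1, m):
        L_prev, L = L, ((2 * k + 1 - x) * L - k * L_prev) / (k + 1)
    return np.sqrt(2.0 * gamma) * np.exp(-gamma * t) * L

def se_cov(a, b, lengthscale=1.0):
    # Squared exponential covariance exp(-(a - b)^2 / l^2).
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(1, -1)
    return np.exp(-((a - b) ** 2) / lengthscale ** 2)

def k_fu(t, t_prime, c, gamma=1.0, lengthscale=1.0, n_quad=200):
    # Discrete-sum approximation of
    #   k_{f,u}(t, t') = sum_m c_m * int_0^t l_m(t - tau) k_u(tau, t') dtau.
    t = np.asarray(t, dtype=float)
    t_prime = np.asarray(t_prime, dtype=float)
    K = np.zeros((t.size, t_prime.size))
    for i, ti in enumerate(t):
        tau = np.linspace(0.0, ti, n_quad)     # quadrature grid on [0, t]
        dtau = tau[1] - tau[0]
        h = sum(c_m * laguerre_function(m, ti - tau, gamma)
                for m, c_m in enumerate(c))
        K[i] = (h @ se_cov(tau, t_prime, lengthscale)) * dtau
    return K
```

For example, `k_fu(np.linspace(0, 4, 6), np.linspace(0, 4, 6), c=[1.0, 0.5])` evaluates a 6-by-6 cross-covariance block for a two-term Laguerre expansion; each row costs one quadrature pass, which is the extra computation the paragraph above refers to.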
4.1 Hyperparameter Learning
Let us assume we are given observations of the Q inputs and D outputs in vectors \(\{\mathbf {u}_q\}_{q=1}^{Q}\) and \(\{\mathbf {y}_d\}_{d=1}^D\), respectively. Conditioning the proposed GP model on this finite set of observations yields the multivariate normal distribution
\[ {\mathbf {g}} \sim \mathcal{N}\left(\mathbf{0},\, {\mathbf {K}}_{g,g}\right), \]
where \({\mathbf {g}}\) is obtained by stacking the vectors \(\mathbf {y}\) and \(\mathbf {u}\) into a single column. The covariance matrix for \({\mathbf {g}}\) is defined as follows,
\[ {\mathbf {K}}_{g,g} = \begin{bmatrix} {\mathbf {K}}_{y,y} & {\mathbf {K}}_{f,u} \\ {\mathbf {K}}_{u,f} & {\mathbf {K}}_{u,u} \end{bmatrix}, \qquad (8) \]
where each \({\mathbf {K}}_{i,j}\) (with i and j each being y, f or u) is obtained by evaluating the covariance function \(k_{i,j}(t,t')\) at the time inputs \(\mathbf {t}\) associated with the observations. From the above definitions, the set of parameters \(\theta = \{l_q, \sigma ^2_d, \alpha _d, c_{d,m} \}\) (where m indexes the Laguerre functions as described in (6)) is learned by maximizing the logarithm of the marginal likelihood,
\[ \log p({\mathbf {g}} \mid \theta) = -\frac{1}{2}\,{\mathbf {g}}^{\top}{\mathbf {K}}_{g,g}^{-1}{\mathbf {g}} - \frac{1}{2}\log\left|{\mathbf {K}}_{g,g}\right| - \frac{N}{2}\log 2\pi, \]
where N is the total number of data points [7].
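The log marginal likelihood above is the standard GP training objective [7]; as an illustration (function names ours), it can be evaluated stably through a Cholesky factorization:

```python
import numpy as np

def log_marginal_likelihood(g, K, jitter=1e-6):
    # log N(g | 0, K) = -0.5 g^T K^{-1} g - 0.5 log|K| - (N/2) log(2 pi),
    # computed via a Cholesky factorization for numerical stability.
    g = np.asarray(g, dtype=float)
    N = g.size
    L = np.linalg.cholesky(K + jitter * np.eye(N))        # jitter for stability
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, g))   # alpha = K^{-1} g
    return (-0.5 * g @ alpha
            - np.sum(np.log(np.diag(L)))                  # = 0.5 log|K|
            - 0.5 * N * np.log(2.0 * np.pi))
```

In practice this function would be passed, with its gradient or via finite differences, to a numerical optimizer to learn \(\theta\).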
5 Results
In this section, two different numerical problems are given to illustrate the properties of the proposed model.
5.1 Approximation of the Impulse Response
In this experiment, we show the ability of the proposed model to recover the impulse response function by means of the Laguerre functions encoded within the covariance function. We first generate 50 equally spaced data points over the range [0, 4] s by sampling \(\mathbf {u}\) from the GP defined in (5) with \(l_q = 0.8\). The output data \(\mathbf {f}\) are then obtained by passing \(\mathbf {u}\) through a second-order dynamical system characterized by the following IRF
\[ h(t) = \frac{1}{\omega}\, e^{-b_1 t/2} \sinh(\omega t), \]
with \(b_0 = 1\), \(b_1 = 4\) and \(\omega = \sqrt{b_1^2- 4b_0}/2\). By stacking the data vectors \(\mathbf {f}\) and \(\mathbf {u}\), we learn a model with \(M=10\) Laguerre basis functions using the procedure described in Sect. 4.1. The model is learned 10 times with different initializations of the parameters \(\theta \). Figure 1 shows the mean and two standard deviations computed from the 10 IRFs learned by the proposed model. The mean function fits the true IRF well, but the standard deviation indicates that some of the obtained approximations tend to be less accurate towards the end of the available data. This effect arises when high-order Laguerre functions are used to fit smooth IRFs. As discussed in [5], this problem can be addressed by forcing the higher-order weights \(c_{d,m}\) to zero; in other words, overfitting can be reduced by setting to zero the coefficients of the Laguerre functions related to high-frequency components.
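The data-generation step of this experiment can be sketched as follows (our own reconstruction). We take \(h(t)=e^{-b_1 t/2}\sinh(\omega t)/\omega\), the impulse response of an overdamped second-order system, as an assumption consistent with the stated parameters (\(b_1^2 > 4 b_0\)), and approximate the convolution on the sampling grid:

```python
import numpy as np

rng = np.random.default_rng(42)

# 50 equally spaced points over [0, 4] s, as in the experiment.
t = np.linspace(0.0, 4.0, 50)
dt = t[1] - t[0]

# Sample the input u from a zero-mean GP with squared exponential
# covariance and lengthscale l_q = 0.8 (jitter added for stability).
K_u = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.8 ** 2)
u = rng.multivariate_normal(np.zeros(t.size), K_u + 1e-8 * np.eye(t.size))

# Assumed overdamped second-order IRF:
# h(t) = exp(-b_1 t / 2) * sinh(omega * t) / omega.
b0, b1 = 1.0, 4.0
omega = np.sqrt(b1 ** 2 - 4.0 * b0) / 2.0
h = np.exp(-b1 * t / 2.0) * np.sinh(omega * t) / omega

# Output data: discrete approximation of the convolution f = h * u.
f = dt * np.convolve(h, u)[: t.size]
```

The pair (u, f) then plays the role of the input-output data stacked and fed to the learning procedure of Sect. 4.1.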
5.2 Prediction of Missing Input/Output Values
For this experiment, we are interested in predicting missing input and output values. First, we configure a MIMO system with 2 inputs and 2 outputs, described by the following equations
with \(\bar{u}(t) = \sum _{q=1}^{2}u_q(t)\). Then, 50 data points are generated for the input and output variables, following steps similar to those described in Sect. 5.1. Additionally, the output data are corrupted by an additive white-noise process, as shown in (3). To show the ability of the proposed model to deal with missing data, the dataset is divided into training and testing subsets, as shown in Fig. 2.
The missing data of output 2 are predicted from the data of output 1 and the inputs, while the missing data of input 1 are predicted only from the outputs (using the covariance function defined in (8)). Figure 3 shows the IRFs estimated for both outputs. The estimated IRF for output 1 closely matches the true IRF, whereas the estimation for output 2 is clearly affected by the missing data. Nevertheless, the predictions for the testing data fit well.
6 Discussion and Future Work
Our experiments demonstrate that the proposed model is able to estimate the continuous-time IRF of LTI systems in the presence of noise corruption and missing input and output data. The IRF can also be approximated in a non-parametric manner by placing a GP prior over this function. In [11], the Gaussian Process Convolution Model (GPCM) is introduced; the GPCM is described as a continuous-time non-parametric-window moving average process. One advantage of this model is that it can also be analysed in the frequency domain. Another non-parametric approximation is proposed in [9], where a specific covariance matrix is built using a stable spline kernel.
The above-mentioned methods require approximating the posterior distribution of the IRF with specialized algorithms: the GPCM requires a variational inference approach, while the work in [9] adopts an Expectation-Maximization algorithm. In contrast, the model proposed here estimates the IRF using a set of orthogonal basis functions (Laguerre functions) that carry a dynamical-system representation. Additionally, the proposed model is learned using the standard GP training method [7].
This model can easily be extended to other sets of orthonormal basis functions. In particular, selecting a basis set that admits closed-form convolutions would be advantageous.
References
Álvarez, M.A., Luengo, D., Lawrence, N.D.: Linear latent force models using Gaussian processes. IEEE Trans. Pattern Anal. Mach. Intell. 35(11), 2693–2705 (2013)
Boyle, P., Frean, M.: Dependent Gaussian processes. In: Saul, L.K., Weiss, Y., Bottou, L. (eds.) Advances in Neural Information Processing Systems, vol. 17, pp. 217–224. MIT Press (2005). http://papers.nips.cc/paper/2561-dependent-gaussian-processes.pdf
Dass, S.C., Lee, J., Lee, K., Park, J.: Laplace based approximate posterior inference for differential equation models. Stat. Comput. 27(3), 679–698 (2017). https://doi.org/10.1007/s11222-016-9647-0
Haber, R., Keviczky, L.: Nonlinear System Identification - Input-Output Modeling Approach. Kluwer Academic, Dordrecht (1999)
Israelsen, B.W., Smith, D.A.: Generalized Laguerre reduction of the Volterra kernel for practical identification of nonlinear dynamic systems. CoRR abs/1410.0741 (2014). http://arxiv.org/abs/1410.0741
Lawrence, N.D., Sanguinetti, G., Rattray, M.: Modelling transcriptional regulation using Gaussian processes. In: Schölkopf, B., Platt, J.C., Hoffman, T. (eds.) Advances in Neural Information Processing Systems, vol. 19, pp. 785–792. MIT Press (2007). http://papers.nips.cc/paper/3119-modelling-transcriptional-regulation-using-gaussian-processes.pdf
Rasmussen, C., Williams, C.: Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning. MIT Press, Cambridge (2006)
Reginato, B.C., Oliveira, G.H.C.: On selecting the MIMO generalized orthonormal basis functions poles by using particle swarm optimization. In: 2007 European Control Conference (ECC), pp. 5182–5188, July 2007
Risuleo, R.S., Bottegal, G., Hjalmarsson, H.: Kernel-based system identification from noisy and incomplete input-output data. In: 2016 IEEE 55th Conference on Decision and Control (CDC), pp. 2061–2066, December 2016
Stanislawski, R., Hunek, W.P., Latawiec, K.J.: Modeling of nonlinear block-oriented systems using orthonormal basis and radial basis functions. In: International Conference on Systems Engineering, pp. 55–58 (2008)
Tobar, F., Bui, T.D., Turner, R.E.: Learning stationary time series using Gaussian processes with nonparametric kernels. In: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 28, pp. 3501–3509. Curran Associates, Inc. (2015)
Wahlberg, B.: System identification using Laguerre models. IEEE Trans. Autom. Control 36(5), 551–562 (1991)
Acknowledgements
The authors would like to thank Convocatoria 567 from the Administrative Department of Science, Technology and Innovation of Colombia (COLCIENCIAS) for the support and funding of this work.
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Guarnizo, C., Álvarez, M.A. (2018). Impulse Response Estimation of Linear Time-Invariant Systems Using Convolved Gaussian Processes and Laguerre Functions. In: Mendoza, M., Velastín, S. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2017. Lecture Notes in Computer Science(), vol 10657. Springer, Cham. https://doi.org/10.1007/978-3-319-75193-1_34
Print ISBN: 978-3-319-75192-4
Online ISBN: 978-3-319-75193-1