A new kernel-based approach to hybrid system identification☆
Introduction
Hybrid (switched) systems have been the subject of much research in recent years. Their importance stems from their ability to describe, in a unified setting, several processes evolving through continuous/discrete dynamics and logic rules (Bemporad, Ferrari-Trecate, & Morari, 2000). They can represent, for example, linear complementarity systems (Heemels, De Schutter, & Bemporad, 2001) as well as interactions between affine systems and finite automata (Sontag, 1996). Hybrid systems can also be used to approximate nonlinear dynamics with arbitrary accuracy by linearization around different working points; see Breiman (1993) and Lin and Unbehauen (1992) for universal approximation properties. Examples and applications of these models can be found in many different fields, including model predictive/nonlinear systems control, state estimation, computer vision, and air traffic management (Bemporad and Morari, 1999, Liberzon, 2003, Paoletti et al., 2007).
This paper deals in particular with the identification of a discrete-time hybrid system composed of $m$ affine submodels, each defined by a (column) vector $\theta_i$. A discrete state variable $s_t$ evolves over time and selects the $i$th submodel if $s_t = i$. For $t = 1, \ldots, N$, the measurement model is
$$y_t = x_t^\top \theta_{s_t} + e_t \qquad (1.1)$$
where $y_t$ is the system output corrupted by a zero-mean white Gaussian noise $e_t$ of variance $\sigma^2$, i.e. $e_t \sim \mathcal{N}(0, \sigma^2)$, while $x_t$ is an observable regression (column) vector. In particular, as a concrete and important example, we assume
$$x_t = [\,1 \;\; y_{t-1} \; \ldots \; y_{t-p} \;\; u_{t-1} \; \ldots \; u_{t-p}\,]^\top \qquad (1.2)$$
where $u_t$ is the system input measured at instant $t$ and $p$ is the system order/memory. With this definition, the first component of $\theta_i$ defines the offset, while the others contain two “impulse responses”. Examples built using (1.2) are given in Table 1, which reports six popular hybrid systems taken from the literature. Starting from the measurements $\{x_t, y_t\}_{t=1}^{N}$, our problem is to reconstruct the vectors $\theta_i$.
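As a rough illustration of the measurement model (1.1) with the regressor structure (1.2), the sketch below simulates a switched ARX system. The function name `simulate_hybrid_arx`, the zero initial conditions, the ordering of past outputs before past inputs inside the regressor, and the 0-based submodel indices are all assumptions made for this example, not details taken from the paper.

```python
import random


def simulate_hybrid_arx(thetas, states, inputs, p, sigma, seed=0):
    """Simulate y_t = x_t^T theta_{s_t} + e_t.

    thetas : list of parameter vectors, each of length 2*p + 1
    states : active submodel index at each instant (0-based, assumed known here)
    inputs : input sequence u_t
    p      : order/memory of the ARX submodels
    sigma  : standard deviation of the white Gaussian noise
    """
    rng = random.Random(seed)
    n = len(inputs)
    y = [0.0] * n
    regressors = []
    for t in range(n):
        # Past samples before the start of the record are taken as zero.
        past_y = [y[t - k] if t - k >= 0 else 0.0 for k in range(1, p + 1)]
        past_u = [inputs[t - k] if t - k >= 0 else 0.0 for k in range(1, p + 1)]
        x = [1.0] + past_y + past_u  # leading 1 multiplies the offset
        theta = thetas[states[t]]
        y[t] = sum(a * b for a, b in zip(x, theta)) + rng.gauss(0.0, sigma)
        regressors.append(x)
    return y, regressors


# Two first-order submodels with a deterministic periodic switching signal.
demo_thetas = [[0.5, 0.2, 1.0], [-0.5, -0.3, 0.5]]
demo_inputs = [random.Random(1).uniform(-1.0, 1.0) for _ in range(20)]
demo_states = [(t // 5) % 2 for t in range(20)]
demo_y, demo_x = simulate_hybrid_arx(demo_thetas, demo_states, demo_inputs, p=1, sigma=0.1)
```

The identification problem is the reverse of this simulation: only the outputs and regressors are observed, while both the parameter vectors and the switching sequence have to be recovered.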
Switched (also called segmented) models arise e.g. when the state variable $s_t$ follows a deterministic (possibly periodic) trajectory, see e.g. System 1 in Table 1, or when the $s_t$ are modeled as random variables independent of the regressors $x_t$, as in System 2 of Table 1 (where each state is a random variable uniform on a finite set). The opposite situation arises when the regressor space is partitioned into $m$ subsets $\mathcal{X}_1, \ldots, \mathcal{X}_m$ and the switching rule becomes $s_t = i$ if $x_t \in \mathcal{X}_i$. This leads to the popular piecewise affine (PWA) models which, under (1.2), specialize to the important subclass of piecewise auto-regressive with exogenous input (PWARX) models. Examples are Systems 3–6 in Table 1. Combining this switching condition with (1.1) and (1.2), one can see that a PWARX model determines the active submodel only on the basis of the last $p$ input–output samples. A variant is obtained by neglecting the autoregressive part, i.e. $x_t = [\,1 \;\; u_{t-1} \; \ldots \; u_{t-p}\,]^\top$, thus leading to PWFIR models.
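The three switching mechanisms described above can be contrasted with a small sketch. All function names, the 0-based submodel indices, and the choice of a single separating hyperplane for the piecewise affine case are illustrative assumptions; the regions used by the systems in Table 1 are not reproduced here.

```python
import random


def periodic_state(t, period, m):
    """Deterministic switching: the active submodel changes every `period` samples."""
    return (t // period) % m


def random_state(rng, m):
    """Random switching: the state is uniform on {0, ..., m-1}, independent of the regressor."""
    return rng.randrange(m)


def pwa_state(x, hyperplane):
    """Piecewise affine switching with two regions.

    The regressor x (including its leading 1) is assigned to region 0
    if hyperplane^T x >= 0 and to region 1 otherwise.
    """
    return 0 if sum(h * xi for h, xi in zip(hyperplane, x)) >= 0 else 1


rng = random.Random(0)
demo_random_states = [random_state(rng, 3) for _ in range(10)]
demo_region = pwa_state([1.0, 0.4, -2.0], [0.0, 0.0, 1.0])  # region decided by the last input
```

In the first two cases the state carries no information about the regressor, whereas in the piecewise affine case the state is a deterministic function of it, which is what makes the regions themselves part of the identification problem.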
The difficulty of hybrid system identification lies in the need to jointly classify the data (assigning each regressor to the submodel most likely to be active) and estimate the system parameters. Furthermore, the input–output hybrid map can be discontinuous along the boundaries of the submodel regions. This hampers the use of standard kernel-based approaches, e.g. support vector regression and regularization/neural networks (Evgeniou et al., 2000, Fausett, 1994, Schölkopf and Smola, 2001), which postulate function smoothness.
The approach in Ferrari-Trecate, Muselli, Liberati, and Morari (2003) tackles these difficulties by combining clustering, linear identification and pattern recognition techniques. In particular, the algorithm is based on the assumption that regressors close to each other likely belong to the same ARX submodel. In Roll, Bemporad, and Ljung (2004), mixed-integer linear and quadratic programming is proposed to identify two subclasses of PWARX models. The approach in Bemporad, Garulli, Paoletti, and Vicino (2005) is instead inspired by set-membership identification techniques (Milanese & Vicino, 1991). The identification error is assumed to be bounded by a known quantity, and a search for the minimum number of feasible subsystems is then performed. This problem is however NP-hard, and a suboptimal algorithm based on thermal relaxations is proposed. A Bayesian framework is introduced in Juloski, Weiland, and Heemels (2005). Here, the submodel parameter vectors are treated as random vectors and classification corresponds to extracting data with highest a posteriori probability. This step is performed by designing an approximated Bayes estimator implemented by particle filters (Andrieu, Doucet, & Holenstein, 2010).
Hybrid system identification is approached algebraically in Vidal, Chiuso, and Soatto (2002): by exploiting polynomial factorization and hyperplane clustering, an exact solution is obtained, but only in the noiseless case. While a recursive estimation scheme is described in Vidal (2008), more recent approaches rely on convex relaxation and sparse optimization. In particular, in Ohlsson and Ljung (2013) the combinatorial nature of the problem is tackled by first introducing an overparametrized model. The submodel parameters are then estimated by least squares regularized via a sum-of-norms penalty. A regularization parameter balances adherence to experimental data against the number of submodels. In Bako (2011), identification is instead performed by solving a sequence of (non-regularized) problems defined by weighted (and reweighted) losses. An analysis of the algorithm is also provided under noiseless assumptions.
It is worth noting that all of the aforementioned approaches to hybrid system identification assume that the order of the ARX submodels is known. In addition, all the proposed algorithms have been tested only on quite simple hybrid systems (e.g. those in Table 1). This appears to be an important drawback for real applications, where systems can be more complex and the submodel order is typically unknown. This is a central issue in system identification: it is crucial to find a suitable model structure with the right model complexity, yielding a good bias–variance tradeoff (Ljung, 1999, Söderström and Stoica, 1989). In light of this, the aim of this paper is to design a new regularized technique which also determines the complexity of the submodels from data. This will be achieved by extending the stable spline estimator proposed in Pillonetto, Chiuso, and De Nicolao (2011) and Pillonetto and De Nicolao (2010) (and further discussed in Chen, Ohlsson, & Ljung, 2012 and Pillonetto, Dinuzzo, Chen, De Nicolao, & Ljung, 2014). We interpret hybrid system identification as a functional estimation problem, addressing its ill-posedness/ill-conditioning in a Bayesian framework (Rasmussen & Williams, 2006). In particular, the impulse responses of the submodels are modeled as zero-mean Gaussian processes with autocovariances equal to the stable spline kernel. In this way, information on the exponential stability of the predictor of each isolated subsystem is included in the estimation process.
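To make the role of the kernel concrete, the sketch below builds the covariance matrix of a discrete-time impulse response under a first-order stable spline kernel, whose (i, j) entry is `scale * alpha ** max(i, j)`. Using this first-order form, as well as the names `scale` and `alpha`, is an assumption for illustration; the exact kernel variant adopted in the paper is not visible in this excerpt.

```python
def stable_spline_kernel(n, scale, alpha):
    """Covariance matrix of the first n impulse response coefficients.

    Entry (i, j), with lags counted from 1, equals scale * alpha**max(i, j).
    Since 0 <= alpha < 1, the prior variance decays exponentially with the lag,
    which encodes exponential stability of the impulse response.
    """
    if scale < 0.0:
        raise ValueError("scale must be nonnegative")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    return [
        [scale * alpha ** max(i, j) for j in range(1, n + 1)]
        for i in range(1, n + 1)
    ]


demo_kernel = stable_spline_kernel(30, scale=1.0, alpha=0.8)
demo_prior_variances = [demo_kernel[i][i] for i in range(30)]
```

A high nominal order such as 30 is harmless under this prior: coefficients at large lags receive a vanishing prior variance, so the effective complexity is governed by the two kernel parameters rather than by the number of coefficients.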
The stable spline estimator for linear system identification depends on two (unknown) hyperparameters: the scale factor and the stability parameter, which regulates how fast the impulse response decays to zero. In comparison with classical parametric approaches, one important feature of this estimator is that the difficult model order selection step can be replaced by hyperparameter estimation. In particular, in Chen et al. (2012) and Pillonetto and De Nicolao (2010) these hyperparameters are estimated by optimizing the marginal likelihood (ML), i.e. the marginal density of the measurements obtained after integrating out the dependence on the impulse response (MacKay, 1992). This operation is also known as Empirical Bayes (Maritz & Lwin, 1989). Several merits of ML are documented in the literature, e.g. the fact that it automatically embodies Occam’s razor (MacKay, 1992). Recent studies have also clarified why ML may work well in the presence of deviations from the stochastic model, i.e. when undermodeling affects the kernel-based impulse response description (Pillonetto and Chiuso, 2014, Pillonetto and Chiuso, 2015).
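For the linear case, the marginal likelihood has a closed form that can be written down under standard Gaussian assumptions. In the sketch below, the symbols $\lambda$ (scale factor), $\alpha$ (stability parameter), $\Phi$ (regression matrix built from the input), $g$ (impulse response) and $K_\alpha$ (kernel matrix with unit scale) are notation chosen for this illustration and may differ from the paper's.

```latex
% Model: y = \Phi g + e, with g ~ N(0, \lambda K_\alpha) and e ~ N(0, \sigma^2 I_N) independent.
% Integrating out g gives a zero-mean Gaussian marginal for the N measurements:
\begin{align}
  y \mid \lambda, \alpha &\sim \mathcal{N}\bigl(0, \Sigma_{\lambda,\alpha}\bigr),
  \qquad
  \Sigma_{\lambda,\alpha} = \lambda\, \Phi K_\alpha \Phi^\top + \sigma^2 I_N, \\
  -\log p(y \mid \lambda, \alpha)
  &= \tfrac{1}{2} \log\det\bigl(2\pi \Sigma_{\lambda,\alpha}\bigr)
   + \tfrac{1}{2}\, y^\top \Sigma_{\lambda,\alpha}^{-1} y, \\
  (\hat{\lambda}, \hat{\alpha})
  &= \arg\min_{\lambda \ge 0,\; 0 \le \alpha < 1} \; -\log p(y \mid \lambda, \alpha).
\end{align}
```

The log-determinant term penalizes flexible priors while the quadratic term rewards data fit, which is one way to read the Occam's razor property mentioned above.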
However, unlike in the linear scenario, hybrid system identification involves further unknown variables: the state variables, which indicate which submodel is active at every instant. The main idea explored in this paper is to consider these classification variables as additional hyperparameters which can be estimated via ML optimization. Due to its combinatorial nature, this problem would seem infeasible. We will instead show how an approximate optimization can be performed efficiently through a Markov chain Monte Carlo (MCMC) approach (Gilks, Richardson, & Spiegelhalter, 1996). Our scheme is completely automatic: it generates a Markov chain exploring the ML without the need to specify any proposal density or tuning parameter. Experimental results show that running a few short Markov chains can already lead to very accurate classifications. Then, once the state variables are determined via ML optimization, stable spline estimators are used to reconstruct the submodels.
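One generic way to explore a score over discrete labels without choosing a proposal density is Gibbs sampling, sketched below. This is not claimed to be the scheme designed in the paper: the functions `gibbs_sweep` and `explore`, the 0-based labels, and the toy `log_score` used in the demonstration are all assumptions. In an actual application `log_score` would return the log marginal likelihood of the data for a given labeling.

```python
import math
import random


def gibbs_sweep(labels, m, log_score, rng):
    """Resample each label in turn from its full conditional distribution.

    With the other labels held fixed, the conditional probability of label k
    at position t is proportional to exp(log_score(labels with labels[t] = k)).
    """
    labels = list(labels)
    for t in range(len(labels)):
        scores = []
        for k in range(m):
            labels[t] = k
            scores.append(log_score(labels))
        top = max(scores)  # subtract the maximum for numerical stability
        weights = [math.exp(s - top) for s in scores]
        labels[t] = rng.choices(range(m), weights=weights)[0]
    return labels


def explore(initial, m, log_score, sweeps, seed=0):
    """Run a short chain and keep the highest-scoring labeling visited."""
    rng = random.Random(seed)
    current = list(initial)
    best, best_score = list(current), log_score(current)
    for _ in range(sweeps):
        current = gibbs_sweep(current, m, log_score, rng)
        score = log_score(current)
        if score > best_score:
            best, best_score = list(current), score
    return best, best_score


# Toy score: sharply peaked around a hypothetical "true" labeling.
demo_target = [0, 0, 1, 1, 0]


def demo_log_score(labels):
    return -50.0 * sum(a != b for a, b in zip(labels, demo_target))


demo_best, demo_best_score = explore([1, 1, 1, 1, 1], 2, demo_log_score, sweeps=5)
```

Each sweep costs one score evaluation per label and per candidate submodel, so the practical cost depends on how cheaply the score can be updated when a single label changes.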
The paper is organized as follows. In Section 2, we introduce the stable spline model for hybrid systems adopted to classify and distribute data to the submodels. Section 3 then describes how the classification problem is solved by the Hybrid Stable Spline (HSS) algorithm via ML optimization. In particular, an MCMC scheme to efficiently explore the support of ML is designed. In Section 4 the description of the algorithm is completed by showing how the submodels are reconstructed by HSS once the estimates of the state variables are available. Section 5 introduces some indices related to classification and impulse response reconstruction. We also present two oracle-based procedures, and related indices, which define useful performance references for assessing the effectiveness of a hybrid system identification procedure. Section 6 reports some numerical experiments. First, HSS is used to identify the six systems in Table 1 without precise information on the orders of the ARX submodels. Next, we set up another Monte Carlo study where HSS is employed to reconstruct more complex (randomly generated) PWFIR models of 30th order. Finally, HSS is tested using real data coming from a pick-and-place machine whose aim is to place electronic components on a printed circuit board (Juloski, Heemels, & Ferrari-Trecate, 2004). Conclusions end the paper.
Modeling hybrid systems using stable spline kernels
In this section, we introduce the stable spline stochastic model adopted for outputs classification (the first phase of HSS). The model is also graphically described in Fig. 1 through a Bayesian network (Jensen, 2001, Magni et al., 1998). Note that the system inputs are not reported in the model since they are assumed deterministic and known. Equivalently, one can assume that the inputs and the noises are independent and think of all the probability density functions reported in the
Stable spline classifier
We now describe a new classifier which forms the first step of HSS. Even if the aim of the classifier is only to return the estimates of the (and of if it is unknown), our strategy is to optimize ML w.r.t. all the hyperparameter vector Above, the set embeds the constraints , , and . We also use to indicate the hyperparameter vector obtained after removing some of its components, e.g.
HSS
The Hybrid Stable Spline algorithm is now described. The first step is the stable spline classifier already discussed, while the second and final step is the reconstruction of the submodels. This is obtained using the estimates of the state variables (and, if it is unknown, of the noise variance) coming from the classifier. As for the kernel parameters, new estimates are derived as follows.
Consider the new Bayesian network in Fig. 2 where the are function of the estimates of . As in Fig. 1, each is a zero-mean Gaussian vector.
However, to
Classification and impulse responses fits
We introduce two indices useful to measure the performance of a hybrid system identification algorithm.
Without loss of generality, hereby we assume that the submodels estimates are ordered so that is the vector closest to according to the Euclidean norm . The identification data are then partitioned into pieces by the classifier in such a way that the outputs used to compute via (4.2), (4.3) are associated to the th submodel. Then, the first index, related to data
Identification of the six benchmark hybrid systems via HSS
HSS is now used to identify the hybrid systems reported in Table 1. For each of the six models, a Monte Carlo study of 100 runs is performed.
At every run, HSS has to reconstruct the submodels from 500 input–output pairs, with the noise variance known and using a 30th-order PWARX. Thus, for the
Conclusions
We have proposed a new algorithm, called HSS, which extends the stable spline estimator to the hybrid system identification scenario. The difficult segmentation step is performed by interpreting the classification variables as hyperparameters. These are then estimated by optimizing ML via a stochastic simulation scheme. Once the estimates become available, stable spline estimators are employed to reconstruct the submodels composing the hybrid structure.
One of the key features of HSS is the use
References (44)
Bako (2011). Identification of switched linear systems via sparse optimization. Automatica.
Bemporad & Morari (1999). Control of systems integrating logic, dynamics, and constraints. Automatica.
Chen, Ohlsson, & Ljung (2012). On the estimation of transfer functions, regularizations and Gaussian processes–revisited. Automatica.
Ferrari-Trecate, Muselli, Liberati, & Morari (2003). A clustering technique for the identification of piecewise affine systems. Automatica.
Heemels, De Schutter, & Bemporad (2001). Equivalence of hybrid dynamical models. Automatica.
Juloski, Heemels, & Ferrari-Trecate (2004). Data-based hybrid modelling of the component placement process in pick-and-place machines. Control Engineering Practice.
Milanese & Vicino (1991). Optimal estimation theory for dynamic systems with set membership uncertainty: An overview. Automatica.
Bayesian system identification via MCMC techniques. Automatica (2010).
Ohlsson & Ljung (2013). Identification of switched linear regression models using sum-of-norms regularization. Automatica.
Paoletti et al. (2007). Identification of hybrid systems: a tutorial. European Journal of Control.
Optimal smoothing of non-linear dynamic systems via Monte Carlo Markov chains. Automatica.
Pillonetto & Chiuso (2015). Tuning complexity in regularized kernel-based regression and linear system identification: the robustness of the marginal likelihood estimator. Automatica.
Pillonetto, Chiuso, & De Nicolao (2011). Prediction error identification of linear systems: a nonparametric Gaussian regression approach. Automatica.
Pillonetto & De Nicolao (2010). A new kernel-based approach for linear system identification. Automatica.
Roll, Bemporad, & Ljung (2004). Identification of piecewise affine systems via mixed-integer programming. Automatica.
Vidal (2008). Recursive identification of switched ARX systems. Automatica.
Andrieu, Doucet, & Holenstein (2010). Particle Markov chain Monte Carlo methods. Journal of the Royal Statistical Society. Series B (Statistical Methodology).
Bemporad, Ferrari-Trecate, & Morari (2000). Observability and controllability of piecewise affine and hybrid systems. IEEE Transactions on Automatic Control.
Bemporad, Garulli, Paoletti, & Vicino (2005). A bounded-error approach to piecewise affine system identification. IEEE Transactions on Automatic Control.
Breiman (1993). Hinging hyperplanes for regression, classification, and function approximation. IEEE Transactions on Information Theory.
Evgeniou et al. (2000). Regularization networks and support vector machines. Advances in Computational Mathematics.
Gianluigi Pillonetto was born on January 21, 1975 in Montebelluna (TV), Italy. He received the Doctoral degree in Computer Science Engineering cum laude from the University of Padova in 1998 and the Ph.D. degree in Bioengineering from the Polytechnic of Milan in 2002. In 2000 and 2002 he was visiting scholar and visiting scientist, respectively, at the Applied Physics Laboratory, University of Washington, Seattle. In 2005, he became Assistant Professor of Control and Dynamic Systems at the Department of Information Engineering, University of Padova where he currently serves as an Associate Professor.
His research interests are in the field of system identification and machine learning.
Dr. Pillonetto is an Associate Editor of Automatica and Systems and Control Letters.
☆ This research has been partially supported by the MIUR FIRB project RBFR12M3AC-Learning meets time: a new computational approach to learning in dynamic systems and by the Progetto di Ateneo CPDA147754/14-New statistical learning approach for multi-agents adaptive estimation and coverage control. The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Er-Wei Bai under the direction of Editor Torsten Söderström. The author would like to thank Prof. Henrik Ohlsson for providing the real data coming from a pick-and-place machine.
1 Tel.: +390498277607.