A new kernel-based approach to hybrid system identification

doi:10.1016/j.automatica.2016.03.011

Automatica

Volume 70, August 2016, Pages 21-31

https://doi.org/10.1016/j.automatica.2016.03.011

Abstract

All approaches to hybrid system identification that have appeared in the literature assume that the model complexity is known. Popular models are, e.g., piecewise ARX with orders fixed a priori. Furthermore, the numerical procedures developed so far have been tested only on simple systems, e.g. composed of ARX subsystems of order 1 or at most 2. This represents a major drawback for real applications. This paper proposes a new regularized technique for the identification of piecewise affine systems, namely the hybrid stable spline algorithm (HSS). HSS exploits the recently introduced stable spline kernel to model the submodels' impulse responses as zero-mean Gaussian processes, including information on the stability of the submodels' predictors. The algorithm consists of a two-step procedure. First, exploiting the Bayesian interpretation of regularization, the problem of classifying and distributing the data to the subsystems is cast as marginal likelihood optimization. We show how an approximate optimization can be performed efficiently by a Markov chain Monte Carlo scheme. Then, the stable spline algorithm is used to reconstruct each subsystem. Numerical experiments on real and simulated data are included to test the new procedure. They show that HSS not only solves all the most popular benchmark problems proposed in the literature without exact information on the order of the ARX subsystems, but can also identify more complex (high-order) piecewise affine systems. MATLAB code implementing the approach, called Hybrid Stable Spline Toolbox, is also made available.

Introduction

Hybrid (switched) systems have been the subject of much research in recent years. Their importance stems from their ability to describe, in a unified setting, several processes evolving through continuous/discrete dynamics and logic rules (Bemporad, Ferrari-Trecate, & Morari, 2000). This makes it possible, e.g., to represent linear complementarity systems (Heemels, De Schutter, & Bemporad, 2001) as well as interactions between affine systems and finite automata (Sontag, 1996). Hybrid systems can also be used to approximate nonlinear dynamics (with arbitrary accuracy) by linearization around different working points; see Breiman (1993) and Lin and Unbehauen (1992) for universal approximation properties. Examples and applications of these models can be found in many different fields, including model predictive/nonlinear systems control, state estimation, computer vision and air traffic management (Bemporad and Morari, 1999, Liberzon, 2003, Paoletti et al., 2007).

This paper deals in particular with the identification of a discrete-time hybrid system composed of $s$ affine submodels, each defined by a (column) vector $θ_{k}$. A discrete state variable $x_{t}$ evolves over time $t$ and selects the $k$th submodel if $x_{t} = k$. For $t = 1, \dots, N$, the measurement model is $y_{t} = ρ_{t}^{⊤} θ_{k} + e_{t} \quad \text{for } x_{t} = k$ (1.1), where $y_{t}$ is the system output corrupted by a zero-mean white Gaussian noise $e_{t}$ of variance $σ^{2}$, i.e. $e_{t} \sim \mathcal{N}(0, σ^{2})$, while $ρ_{t}$ is an observable regression (column) vector. In particular, as a concrete and important example, hereafter we assume $ρ_{t} = {[1 \; y_{t - 1} \; \dots \; y_{t - m} \; u_{t - 1} \; \dots \; u_{t - m}]}^{⊤}$ (1.2), where $u_{t}$ is the system input measured at instant $t$ and $m$ is the system order/memory. With this definition, the first component of $θ_{k}$ defines the offset, while the remaining $2m$ components contain two “impulse responses”. Examples built using (1.2) are given in Table 1, which reports six popular hybrid systems taken from the literature. Starting from the measurements $\{u_{t}, y_{t}\}_{t = 1}^{N}$, our problem is to reconstruct the $s$ vectors $θ_{k}$.
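To make the measurement model and the ARX regressor concrete, the following minimal sketch simulates a switched system with NumPy. The function names, the two first-order submodels, the zero initial conditions and the 0-based mode labels are illustrative assumptions, not taken from the paper or from Table 1.

```python
import numpy as np

def arx_regressor(y, u, t, m):
    """Return rho_t = [1, y_{t-1}, ..., y_{t-m}, u_{t-1}, ..., u_{t-m}] (0-based t >= m)."""
    past_y = y[t - m:t][::-1]   # y_{t-1}, ..., y_{t-m}
    past_u = u[t - m:t][::-1]   # u_{t-1}, ..., u_{t-m}
    return np.concatenate(([1.0], past_y, past_u))

def simulate_switched_arx(thetas, x, u, sigma, rng):
    """Simulate y_t = rho_t^T theta_{x_t} + e_t, with the first m outputs set to zero."""
    m = (thetas.shape[1] - 1) // 2
    N = len(u)
    y = np.zeros(N)
    for t in range(m, N):
        rho = arx_regressor(y, u, t, m)
        y[t] = rho @ thetas[x[t]] + sigma * rng.standard_normal()
    return y

rng = np.random.default_rng(0)
N = 200
# Two hypothetical first-order submodels, each stored as [offset, a_1, b_1].
thetas = np.array([[0.0, 0.5, 1.0],
                   [1.0, -0.3, 0.5]])
u = rng.standard_normal(N)
x = rng.integers(0, 2, size=N)   # modes labelled 0..s-1 here, 1..s in the text
y = simulate_switched_arx(thetas, x, u, sigma=0.1, rng=rng)
```

The identification problem is the reverse direction: given only `u` and `y`, recover the rows of `thetas` without knowing `x`.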

Switched (also called segmented) models arise, e.g., when the state variable follows a deterministic (possibly periodic) trajectory, see System 1 in Table 1, or when the $x_{t}$ are modeled as random variables independent of the regressors $ρ_{t}$, as in System 2 of Table 1 ($x \sim U_{A}$ indicates a random variable $x$ uniform on the set $A$). The opposite situation is found when the regressor space $X$ is partitioned into $s$ subsets $X_{k}$ and the switching rule becomes $x_{t} = k ⟺ ρ_{t} \in X_{k}$. This leads to the popular piecewise affine (PWA) models which, under (1.2), specialize to the important subclass of piecewise auto-regressive with exogenous input (PWARX) models. Examples are Systems 3–6 in Table 1. Combining the switching rule $x_{t} = k ⟺ ρ_{t} \in X_{k}$ with (1.1), (1.2), one can see that a PWARX model determines the active submodel only on the basis of the last $m$ input–output samples. A variant is obtained by neglecting the autoregressive part, i.e. $ρ_{t} = {[1 \; u_{t - 1} \; \dots \; u_{t - m}]}^{⊤}$, thus leading to PWFIR models.
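The regressor-driven switching rule can be illustrated as follows. In this sketch the partition of the regressor space is encoded, purely as an assumption, by a matrix `H` whose rows score each region, with the active region being the one with the largest score (this yields a polyhedral partition). The matrix `H`, the submodels and the function names are hypothetical.

```python
import numpy as np

def pwa_mode(rho, H):
    """Return the index k of the region X_k containing rho (largest linear score)."""
    return int(np.argmax(H @ rho))

def simulate_pwarx(thetas, H, u, sigma, rng):
    """Simulate a PWARX system: the mode x_t is a function of the regressor rho_t."""
    m = (thetas.shape[1] - 1) // 2
    N = len(u)
    y = np.zeros(N)
    x = np.zeros(N, dtype=int)
    for t in range(m, N):
        rho = np.concatenate(([1.0], y[t - m:t][::-1], u[t - m:t][::-1]))
        x[t] = pwa_mode(rho, H)
        y[t] = rho @ thetas[x[t]] + sigma * rng.standard_normal()
    return y, x

rng = np.random.default_rng(0)
N = 200
thetas = np.array([[0.0, 0.5, 1.0],
                   [1.0, -0.3, 0.5]])
# Hypothetical partition for m = 1: mode 1 if u_{t-1} > 0, mode 0 otherwise.
H = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
u = rng.standard_normal(N)
y, x = simulate_pwarx(thetas, H, u, sigma=0.1, rng=rng)
```

Unlike the randomly switched case, the mode sequence `x` is here fully determined by the last `m` input-output samples, which is what distinguishes PWARX models from generic switched ARX models.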

The difficulty of hybrid system identification lies in the need to jointly classify the data (assigning each regressor to the submodel most likely to be active) and estimate the system parameters. Furthermore, the input–output hybrid map can be discontinuous along the boundaries of the submodel regions. This hinders the use of standard kernel-based approaches, e.g. support vector regression and regularization/neural networks (Evgeniou et al., 2000, Fausett, 1994, Schölkopf and Smola, 2001), which postulate function smoothness.

The approach in Ferrari-Trecate, Muselli, Liberati, and Morari (2003) tackles these difficulties by combining clustering, linear identification and pattern recognition techniques. In particular, the algorithm is based on the assumption that regressors close to each other are likely to belong to the same ARX submodel. In Roll, Bemporad, and Ljung (2004), mixed-integer linear and quadratic programming is proposed to identify two subclasses of PWARX models. The approach in Bemporad, Garulli, Paoletti, and Vicino (2005) is instead inspired by set-membership identification techniques (Milanese & Vicino, 1991). The identification error is assumed to be bounded by a known quantity, and then the search for a minimum number of feasible subsystems is performed. This problem is however NP-hard, and a suboptimal algorithm based on thermal relaxations is proposed. A Bayesian framework is introduced in Juloski, Weiland, and Heemels (2005). Here, the $θ_{k}$ are random vectors and classification corresponds to extracting the data with the highest a posteriori probability. This step is performed by designing an approximate Bayes estimator implemented by particle filters (Andrieu, Doucet, & Holenstein, 2010).

Hybrid system identification is addressed in an algebraic fashion in Vidal, Chiuso, and Soatto (2002): by exploiting polynomial factorization and hyperplane clustering, an exact solution is obtained, but only in the noiseless case. While a recursive estimation scheme is described in Vidal (2008), more recent approaches rely on convex relaxation and sparse optimization. In particular, in Ohlsson and Ljung (2013) the combinatorial nature of the problem is tackled by first introducing an overparametrized model. Then, the submodel parameters are estimated by least squares regularized via a sum-of-norms penalty. A regularization parameter is introduced to balance adherence to the experimental data against the number of submodels. In Bako (2011), identification is instead performed by solving a sequence of (non-regularized) problems defined by weighted (and reweighted) $ℓ_{1}$ losses. An analysis of the algorithm is also obtained under noiseless assumptions.

It is worth noting that all of the aforementioned approaches to hybrid system identification assume that the order $m$ of the ARX submodels is known. In addition, all the proposed algorithms have been tested only on quite simple hybrid systems (e.g. in Table 1 one has $m = 2$ at most). This is an important drawback for real applications, where systems can be more complex and $m$ is typically unknown. This is a central issue in system identification: it is crucial to find a suitable model structure with the right model complexity, yielding a good bias–variance tradeoff (Ljung, 1999, Söderström and Stoica, 1989). In light of this, the aim of this paper is to design a new regularized technique which also determines the submodel complexity from data. This will be achieved by extending the stable spline estimator proposed in Pillonetto, Chiuso, and De Nicolao (2011) and Pillonetto and De Nicolao (2010) (and further discussed in Chen, Ohlsson, & Ljung, 2012 and Pillonetto, Dinuzzo, Chen, De Nicolao, & Ljung, 2014). We interpret hybrid system identification as a functional estimation problem, addressing its ill-posedness/ill-conditioning in a Bayesian framework (Rasmussen & Williams, 2006). In particular, the submodels' impulse responses are modeled as zero-mean Gaussian processes with autocovariances equal to the stable spline kernel. In this way, information on the exponential stability of the predictor of each isolated subsystem is included in the estimation process.

The stable spline estimator for linear system identification depends on two (unknown) hyperparameters: the scale factor $λ$ and the stability parameter $α$, which regulates how fast the impulse response decays to zero. In comparison with classical parametric approaches, one important feature of this estimator is that the difficult model order selection can be replaced by hyperparameter estimation. In particular, in Chen et al. (2012) and Pillonetto and De Nicolao (2010) $λ$ and $α$ are estimated by optimizing the marginal likelihood (ML), i.e. the marginal density of the measurements obtained after integrating out the dependence on the impulse response (MacKay, 1992). This operation is also known as Empirical Bayes (Maritz & Lwin, 1989). Several merits of ML are documented in the literature, e.g. the fact that it automatically embodies Occam's razor (MacKay, 1992). Recent studies have also clarified why ML may work well even in the presence of deviations from the stochastic model, i.e. when undermodeling affects the kernel-based impulse response description (Pillonetto and Chiuso, 2014, Pillonetto and Chiuso, 2015).
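As an illustration of hyperparameter estimation via ML in the linear case, the sketch below evaluates the negative log marginal likelihood of an FIR model under a Gaussian prior on the impulse response. It assumes the first-order stable spline kernel $K_{ij} = λ α^{\max(i,j)}$; the excerpt does not state which stable spline kernel the paper uses, so this choice, like the simulated data and the coarse grid search over $α$ with $λ$ and $σ$ held fixed, is only an assumption.

```python
import numpy as np

def stable_spline_kernel(m, lam, alpha):
    """First-order stable spline kernel: K[i, j] = lam * alpha**max(i, j), i, j = 1..m."""
    idx = np.arange(1, m + 1)
    return lam * alpha ** np.maximum.outer(idx, idx)

def neg_log_marginal_likelihood(y, Phi, lam, alpha, sigma):
    """-log p(y) for y = Phi g + e, g ~ N(0, K), e ~ N(0, sigma^2 I)."""
    N, m = Phi.shape
    Sigma = Phi @ stable_spline_kernel(m, lam, alpha) @ Phi.T + sigma**2 * np.eye(N)
    L = np.linalg.cholesky(Sigma)
    z = np.linalg.solve(L, y)                  # z @ z equals y^T Sigma^{-1} y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return 0.5 * (N * np.log(2 * np.pi) + log_det + z @ z)

# Hypothetical data: a 30-tap FIR system with an exponentially decaying response.
rng = np.random.default_rng(1)
N, m = 150, 30
u = rng.standard_normal(N + m)
Phi = np.column_stack([u[m - k:N + m - k] for k in range(1, m + 1)])  # column k holds u_{t-k}
g_true = 0.8 ** np.arange(1, m + 1)
y = Phi @ g_true + 0.1 * rng.standard_normal(N)

# Coarse grid search over alpha, standing in for a continuous optimizer.
alphas = np.linspace(0.5, 0.95, 10)
scores = [neg_log_marginal_likelihood(y, Phi, 1.0, a, 0.1) for a in alphas]
alpha_hat = alphas[int(np.argmin(scores))]
```

Note that `m` only needs to be large enough to capture the response; the decay rate `alpha`, not the order, controls the effective complexity.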

However, unlike the linear scenario, in hybrid system identification $N$ additional unknown variables have to be considered: the state variables $x_{t}$, which indicate which submodel is active at every instant $t$. The main idea explored in this paper is to consider these classification variables as further hyperparameters which can be estimated via ML optimization. Due to its combinatorial nature, this problem would seem infeasible. We will instead show how an approximate optimization can be performed efficiently through a Markov chain Monte Carlo (MCMC) approach (Gilks, Richardson, & Spiegelhalter, 1996). Our scheme is completely automatic: it generates a Markov chain exploring the ML without the need to specify any proposal density or tuning parameter. Experimental results show that running a few short Markov chains can already lead to very accurate classifications. Then, once the $x_{t}$ are determined via ML optimization, $s$ stable spline estimators are used to reconstruct the submodels.
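The idea of treating the $x_{t}$ as hyperparameters can be sketched with a plain Gibbs sampler, which likewise needs no proposal density. This is not claimed to be the paper's MCMC scheme: it is a naive stand-in that assumes independent Gaussian priors on the submodels, a uniform prior on the labels, deterministic (FIR-type) regressors, and kernel and noise parameters held fixed. All names and data are hypothetical, and each conditional is recomputed from scratch, so it is only practical for small $N$.

```python
import numpy as np

def log_ml_submodel(y, Phi, K, sigma):
    """log N(y; 0, Phi K Phi^T + sigma^2 I); zero for a submodel with no data."""
    n = len(y)
    if n == 0:
        return 0.0
    L = np.linalg.cholesky(Phi @ K @ Phi.T + sigma**2 * np.eye(n))
    z = np.linalg.solve(L, y)
    return -0.5 * (n * np.log(2 * np.pi) + z @ z) - np.sum(np.log(np.diag(L)))

def log_ml(x, y, Phi, K, sigma, s):
    """Log marginal likelihood of all data given the labels x (independent submodels)."""
    return sum(log_ml_submodel(y[x == k], Phi[x == k], K, sigma) for k in range(s))

def gibbs_sweep(x, y, Phi, K, sigma, s, rng):
    """Resample each label from its conditional given all the other labels."""
    x = x.copy()
    for t in range(len(y)):
        logp = np.empty(s)
        for k in range(s):
            x[t] = k
            logp[k] = log_ml(x, y, Phi, K, sigma, s)
        p = np.exp(logp - logp.max())          # subtract the max for numerical stability
        x[t] = rng.choice(s, p=p / p.sum())
    return x

# Hypothetical data from two submodels with random switching.
rng = np.random.default_rng(2)
N, m, s, sigma = 60, 2, 2, 0.1
Phi = rng.standard_normal((N, m))
thetas = np.array([[1.0, 0.5], [-1.0, 0.3]])
x_true = rng.integers(0, s, size=N)
y = np.einsum("ij,ij->i", Phi, thetas[x_true]) + sigma * rng.standard_normal(N)
idx = np.arange(1, m + 1)
K = 0.8 ** np.maximum.outer(idx, idx)          # assumed kernel, lam = 1, alpha = 0.8

x0 = rng.integers(0, s, size=N)                # random initial classification
x, best_x, best_ll = x0, x0, log_ml(x0, y, Phi, K, sigma, s)
for _ in range(20):                            # keep the best labels visited by the chain
    x = gibbs_sweep(x, y, Phi, K, sigma, s, rng)
    ll = log_ml(x, y, Phi, K, sigma, s)
    if ll > best_ll:
        best_x, best_ll = x, ll
```

Since the labels are identifiable only up to a permutation of the submodels, `best_x` should be compared with `x_true` modulo relabelling.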

The paper is organized as follows. In Section 2, we introduce the stable spline model for hybrid systems adopted to classify and distribute data to the submodels. Section 3 then describes how the classification problem is solved by HSS via ML optimization. In particular, an MCMC scheme to efficiently explore the support of ML is designed. In Section 4 the description of the algorithm is completed by showing how the submodels are reconstructed by HSS once the estimates of $x_{t}$ are available. Section 5 introduces some indices related to classification and impulse response reconstruction. We also present two oracle-based procedures, and related indices, which define useful performance references for assessing the effectiveness of a hybrid system identification procedure. Section 6 reports some numerical experiments. First, HSS is used to identify the six systems in Table 1 without precise information on the orders of the ARX submodels. Next, we set up another Monte Carlo study where HSS is employed to reconstruct more complex (randomly generated) PWFIR models of 30th order. Finally, HSS is tested using real data coming from a pick-and-place machine whose aim is to place electronic components on a printed circuit board (Juloski, Heemels, & Ferrari-Trecate, 2004). Conclusions end the paper.

Section snippets

Modeling hybrid systems using stable spline kernels

In this section, we introduce the stable spline stochastic model adopted for output classification (the first phase of HSS). The model is also graphically described in Fig. 1 through a Bayesian network (Jensen, 2001, Magni et al., 1998). Note that the system inputs $u_{t}$ are not reported in the model since they are assumed deterministic and known. Equivalently, one can assume that the inputs and the noises $e_{t}$ are independent and think of all the probability density functions reported in the

Stable spline classifier

We now describe a new classifier which forms the first step of HSS. Even if the aim of the classifier is only to return the estimates of the $x_{t}$ (and of $σ^{2}$ if it is unknown), our strategy is to optimize ML w.r.t. the whole hyperparameter vector $ξ = [λ, α, σ, \{x_{t}\}_{t = 1}^{N}], \quad ξ \in Ω.$ Above, the set $Ω$ embeds the constraints $λ, σ \geq 0$, $α \in [0, 1)$, and $x_{t} \in \{1, \dots, s\}$. We also use $ξ ∖ \cdot$ to indicate the hyperparameter vector obtained after removing some of its components, e.g. $ξ ∖ λ = [α, σ, \{x_{t}\}_{t = 1}^{N}].$

HSS

The Hybrid Stable Spline algorithm is now described. The first step is the stable spline classifier already discussed, while the second and final step is the reconstruction of the $θ_{k}$. This is obtained using the estimates of $x_{t}$ (and $σ$) coming from the classifier. As for the kernel parameters, new estimates of $λ$ and $α$ are derived as follows.

Consider the new Bayesian network in Fig. 2, where the $Y_{k}$ are functions of the estimates of $x_{t}$. As in Fig. 1, each $θ_{k}$ is a zero-mean Gaussian vector.

However, to
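The reconstruction step can be sketched with the standard Gaussian posterior mean, applied separately to the data assigned to each submodel. The paper's own estimator, its equations (4.2), (4.3), is not part of this excerpt, so the formula below is the textbook one under the assumptions $θ_{k} \sim \mathcal{N}(0, K)$ and noise variance $σ^{2}$. The kernel, the data and the use of the true labels in place of the classifier output are illustrative assumptions.

```python
import numpy as np

def posterior_mean(y_k, Phi_k, K, sigma):
    """E[theta_k | Y_k] = K Phi_k^T (Phi_k K Phi_k^T + sigma^2 I)^{-1} Y_k."""
    n = len(y_k)
    if n == 0:
        return np.zeros(K.shape[0])            # no data: the estimate is the prior mean
    S = Phi_k @ K @ Phi_k.T + sigma**2 * np.eye(n)
    return K @ Phi_k.T @ np.linalg.solve(S, y_k)

def reconstruct_submodels(x_hat, y, Phi, K, sigma, s):
    """Estimate each theta_k from the outputs classified as belonging to submodel k."""
    return [posterior_mean(y[x_hat == k], Phi[x_hat == k], K, sigma) for k in range(s)]

# Hypothetical data; the true labels stand in for the classifier output.
rng = np.random.default_rng(3)
N, m, s, sigma = 100, 3, 2, 0.1
Phi = rng.standard_normal((N, m))
thetas = np.array([[1.0, 0.5, 0.2], [-1.0, 0.3, 0.1]])
x_hat = rng.integers(0, s, size=N)
y = np.einsum("ij,ij->i", Phi, thetas[x_hat]) + sigma * rng.standard_normal(N)
idx = np.arange(1, m + 1)
K = 0.8 ** np.maximum.outer(idx, idx)          # assumed kernel, lam = 1, alpha = 0.8
theta_hat = reconstruct_submodels(x_hat, y, Phi, K, sigma, s)
```

When `K` is invertible, the same estimate can be written as the regularized least squares solution $(Φ_{k}^{⊤} Φ_{k} + σ^{2} K^{-1})^{-1} Φ_{k}^{⊤} Y_{k}$, which makes the role of the kernel as a regularizer explicit.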

Classification and impulse responses fits

We introduce two indices useful to measure the performance of a hybrid system identification algorithm.

Without loss of generality, hereafter we assume that the submodel estimates are ordered so that ${\hat{θ}}_{k}$ is the vector closest to $θ_{k}$ according to the Euclidean norm $‖ \cdot ‖$. The identification data are then partitioned into $s$ pieces by the classifier in such a way that the outputs used to compute ${\hat{θ}}_{k}$ via (4.2), (4.3) are associated with the $k$th submodel. Then, the first index, related to data

Identification of the six benchmark hybrid systems via HSS

HSS is now used to identify the hybrid systems reported in Table 1. For each of the six models, a Monte Carlo study of 100 runs is performed.

At every run, HSS has to reconstruct the submodels from 500 input–output pairs, with the noise variance known and using a 30th-order PWARX.⁴ Thus, for the

Conclusions

We have proposed a new algorithm, called HSS, which extends the stable spline estimator to the hybrid system identification scenario. The difficult segmentation step is performed by interpreting the classification variables as hyperparameters. These are then estimated by optimizing ML via a stochastic simulation scheme. Once the estimates become available, stable spline estimators are employed to reconstruct the submodels composing the hybrid structure.

One of the key features of HSS is the use

Gianluigi Pillonetto was born on January 21, 1975 in Montebelluna (TV), Italy. He received the Doctoral degree in Computer Science Engineering cum laude from the University of Padova in 1998 and the Ph.D. degree in Bioengineering from the Polytechnic of Milan in 2002. In 2000 and 2002 he was visiting scholar and visiting scientist, respectively, at the Applied Physics Laboratory, University of Washington, Seattle. In 2005, he became Assistant Professor of Control and Dynamic Systems at the

References (44)

L. Bako
Identification of switched linear systems via sparse optimization
Automatica
(2011)
A. Bemporad et al.
Control of systems integrating logic, dynamics, and constraints
Automatica
(1999)
T. Chen et al.
On the estimation of transfer functions, regularizations and Gaussian processes–revisited
Automatica
(2012)
G. Ferrari-Trecate et al.
A clustering technique for the identification of piecewise affine systems
Automatica
(2003)
W.P.M.H. Heemels et al.
Equivalence of hybrid dynamical models
Automatica
(2001)
A. Juloski et al.
Data-based hybrid modelling of the component placement process in pick-and-place machines
Control Engineering Practice
(2004)
M. Milanese et al.
Optimal estimation theory for dynamic systems with set membership uncertainty: An overview
Automatica
(1991)
B. Ninness et al.
Bayesian system identification via MCMC techniques
Automatica
(2010)
H. Ohlsson et al.
Identification of switched linear regression models using sum-of-norms regularization
Automatica
(2013)
S. Paoletti et al.
Identification of hybrid systems: a tutorial
European Journal of Control
(2007)

Carli, F.P. (2014). On the maximum entropy property of the first-order stable spline kernel and its implications. In...

T. Evgeniou et al.
Regularization networks and support vector machines
Advances in Computational Mathematics
(2000)



His research interests are in the field of system identification and machine learning.

Dr. Pillonetto is an Associate Editor of Automatica and Systems and Control Letters.

^☆: This research has been partially supported by the MIUR FIRB project RBFR12M3AC-Learning meets time: a new computational approach to learning in dynamic systems and by the Progetto di Ateneo CPDA147754/14-New statistical learning approach for multi-agents adaptive estimation and coverage control. The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Er-Wei Bai under the direction of Editor Torsten Söderström. The author would like to thank Prof. Henrik Ohlsson for providing the real data coming from a pick-and-place machine.

¹: Tel.: +390498277607.



C. Andrieu et al.
Particle Markov chain Monte Carlo methods
Journal of the Royal Statistical Society. Series B (Statistical Methodology)
(2010)
A. Bemporad et al.
Observability and controllability of piecewise affine and hybrid systems
IEEE Transactions on Automatic Control
(2000)
A. Bemporad et al.
A bounded-error approach to piecewise affine system identification
IEEE Transactions on Automatic Control
(2005)
L. Breiman
Hinging hyperplanes for regression, classification, and function approximation
IEEE Transactions on Information Theory
(1993)