Multiple Model-Based Control Using Finite Controlled Markov Chains

Ikonen, Enso; Najim, Kaddour

doi:10.1007/s12559-009-9020-0

Published: 20 June 2009

Cognitive Computation, Volume 1, pages 234–243 (2009)

Enso Ikonen¹ &
Kaddour Najim²

Abstract

Cognition and control processes share many characteristics, and decision making and learning under the paradigm of multiple models have gained increasing attention in both fields. The controlled finite Markov chain (CFMC) approach makes it possible to deal with a large variety of signals and systems of a multivariable, nonlinear, and stochastic nature. In this article, adaptive control based on multiple models is considered. For a set of candidate plant models, CFMC models (and controllers) are constructed off-line. The outcomes of the CFMC models are compared with frequentist information obtained from on-line data. The best model (and controller) is chosen based on the Kullback–Leibler information. This approach to adaptive control emphasizes the use of physical models as the basis of reliable plant identification. Three series of simulations are conducted: to examine the performance of the developed Matlab tools; to illustrate the approach in the control of a nonlinear, nonminimum-phase van der Vusse CSTR plant; and to examine the suggested model selection method for adaptive control.

References

1. Berthiaux H, Mizonov V. Applications of Markov chains in particulate process engineering: a review. Can J Chem Eng. 2004;82:1143–68.
2. Bertsekas DP. Dynamic programming and optimal control. Belmont, Massachusetts: Athena Scientific; 2007.
3. Chen H, Kremling A, Allgöwer F. Nonlinear predictive control of a benchmark CSTR. In: Proceedings of the 3rd European control conference, Rome, Italy; 1995. p. 3247–58.
4. Filev D. Model bank based intelligent control. In: Proceedings of the NAFIPS, New Orleans, USA; 2002. p. 583–6.
5. Ghahramani Z, Jordan MI. Learning from incomplete data. Lab Memo No. 1509, Center for Biological and Computational Learning, Department of Brain and Cognitive Sciences, Paper No. 108, MIT Artificial Intelligence Laboratory; 1994.
6. Gosavi A. Reinforcement learning: a tutorial survey and recent advances. INFORMS J Comput. 2009;21(2):178–92.
7. Häggström O. Finite Markov chains and algorithmic applications. Cambridge: Cambridge University Press; 2002.
8. Hsu CS. Cell-to-cell mapping: a method of global analysis for nonlinear systems. New York: Springer-Verlag; 1987.
9. Hussain A, Abdullah R, Chambers J, Gurney K, Redgrave P. Emergent common functional principles in control theory and the vertebrate brain: a case study with autonomous vehicle control. In: Artificial Neural Networks (ICANN), Prague, Czech Republic; 2008. p. 949–58.
10. Hyötyniemi H. Neocybernetics in biology. Helsinki University of Technology, Control Engineering Laboratory, Report 151, Finland; 2006. 267 p.
11. Ikonen E, Najim K. Process identification and control. New York: Marcel Dekker; 2002. p. 310.
12. Jacobs R, Jordan M, Nowlan S, Hinton G. Adaptive mixtures of local experts. Neural Comput. 1991;3:79–87.
13. Johansen TA, Murray-Smith R. The operating regime approach to nonlinear modelling and control. In: Murray-Smith R, Johansen TA, editors. Multiple model approaches to modelling and control. London: Taylor and Francis; 1997. p. 3–72.
14. Kaelbling LP, Littman ML, Moore AW. Reinforcement learning: a survey. J Artif Intell Res. 1996;4:237–85.
15. Kárný M, Kracík J, Guy TV. Cooperative decision making without facilitator. St. Petersburg, Russia: IFAC ALCOSP; 2007.
16. Kemeny JG, Snell JL. Finite Markov chains. Princeton: van Nostrand; 1960.
17. Kulhavy R. Kullback–Leibler distance approach to system identification. In: IFAC symposium on adaptive systems in control and signal processing, Budapest; 1995. p. 55–66.
18. Lee JM, Lee JH. Approximate dynamic programming strategies and their applicability for process control: a review and future directions. Int J Control Autom Syst. 2004;2(3):263–78.
19. Li XR, Zhao Z, Li X-B. General model-set design methods for multiple-model approach. IEEE Trans Autom Control. 2005;50(9):1260–76.
20. Lunze J. On the Markov property of quantised state measurement sequences. Automatica. 1998;32(11):1439–44.
21. Lunze J, Nixdorf B, Richter H. Process supervision by means of a hybrid model. J Proc Cont. 2001;11:89–104.
22. Motter M, Principe JC. Predictive multiple model switching control with the self-organizing map. Int J Robust Nonlinear Control. 2002;12(11):1029–51.
23. Murray-Smith R, Johansen TA. Multiple model approaches to modelling and control. London: Taylor & Francis; 1997.
24. Narendra KS, Balakrishnan J, Ciliz MK. Adaptation and learning using multiple models, switching and tuning. IEEE Control Syst. 1995;15(3):37–51.
25. Poznyak AS, Najim K, Gómez-Ramírez E. Self-learning control of finite Markov chains. New York: Marcel Dekker; 2000.
26. Puterman ML. Markov decision processes: discrete stochastic dynamic programming. New York: Wiley & Sons; 1994.
27. Riordon JS. An adaptive automation controller for discrete-time Markov processes. Automatica. 1969;5:721–30.
28. Sanz R. Cognition and control. A preparatory workshop for EU seventh research framework programme, 9 March 2006, Luxembourg; 2006.
29. Shah S, Cluett W. Recursive least squares based estimation schemes for self-tuning control. Can J Chem Eng. 1991;69:89–96.
30. Shorten R, Murray-Smith R, Bjorgan R, Gollee H. On the interpretation of local models in blended multiple model structures. Int J Control. 1999;72(7/8):620–8.
31. White DJ. Real applications of Markov decision processes. Interfaces. 1985;15(6):73–83.
32. White DJ. Further real applications of Markov decision processes. Interfaces. 1989;18(5):55–61.
33. White DJ. A survey of applications of Markov decision processes. J Oper Res Soc. 1993;44(11):1073–96.

Acknowledgements

The authors thank the anonymous reviewers for their helpful comments and suggestions.

Author information

Authors and Affiliations

Systems Engineering Laboratory, University of Oulu, P.O. Box 4300, Oulu, 90014, Finland
Enso Ikonen
E.N.S.I.A.C.E.T., Toulouse, France
Kaddour Najim

Corresponding author

Correspondence to Enso Ikonen.

Appendix A: Kullback–Leibler Distance

Following Kulhavy [17], suppose that $X_{1}, X_{2}, \ldots, X_{k+1}$ form a controlled Markov chain with a conditional probability mass function $S(y|z,u) = \Pr\{X_{k+1}=y {\vert} X_{k}=z, U_{k}=u\}.$ The transition probability distribution is only partially known: it is assumed to belong to a family $S_{\theta}.$ The task is to estimate $\theta.$

For a sequence of observations ${\bf x}=\left( x_{1},x_{2},...,x_{k+1}\right) $ and control actions ${\bf u}=\left( u_{1},u_{2},...,u_{k+1}\right), $ we have

$$ \begin{aligned} S_{\theta }^{k}\left( {\bf x}|x_{1},{\bf u}\right) &=\prod_{i=1}^{k}S_{\theta }\left( x_{i+1}|x_{i},u_{i}\right)\\ &=\exp \left\{\sum_{i=1}^{k}\log S_{\theta }\left( x_{i+1}|x_{i},u_{i}\right) \right\}\\ &=\exp\left\{ k\sum_{\left( y,z,v\right) \in {\mathcal{X}}^{2}\times {\mathcal{U}}}R_{{\bf x,u}}\left( y,z,v\right) \log S_{\theta }\left( y|z,v\right) \right\}\\ &=\exp \left\{ -k\left[\overline{H}\left( R_{{\bf x,u}}\right) +\overline{D} \left( R_{{\bf x,u}}||S_{\theta }\right) \right] \right\} \end{aligned} $$

where

$R_{{\bf x,u}}(a,b,c)$ is the empirical distribution $R_{{\bf x,u}}(a,b,c) =\frac{N_{{\bf x,u}}(a,b,c)}{k},$ where $N_{{\bf x,u}}(a,b,c)$ counts the number of occurrences of the triplet $(a,b,c)$ in the sequence formed by ${\bf x}$ and ${\bf u},$ and $R_{{\bf x,u}}(z,v)=\sum_{y}R_{{\bf x,u}}(y,z,v)$ is its marginal;

$\overline{H}(R_{{\bf x,u}})$ is the conditional Shannon entropy

$$ \overline{H}\left(R_{{\bf x,u}}\right) =-\sum_{\left( y,z,v\right) }R_{{\bf x,u}}\left( y,z,v\right) \log R_{{\bf x,u}}\left( y,z,v\right) +\sum_{\left( z,v\right) }R_{{\bf x,u}}\left( z,v\right) \log R_{{\bf x,u}}\left( z,v\right) $$

of a random variable $Y$ given another random variable $Z$ and a control action $V,$ described jointly by a probability distribution $R_{{\bf x,u}};$ and

$\overline{D}\left( R_{{\bf x,u}}||S_{\theta }\right)$ is the conditional Kullback–Leibler distance

$$ \overline{D}\left( R_{{\bf x,u}}||S_{\theta }\right) =\sum_{\left( y,z,v\right) }R_{{\bf x,u}}\left( y,z,v\right) \log \frac{R_{{\bf x,u}}(y,z,v)}{S_{\theta}(y|z,v)\, R_{{\bf x,u}}(z,v)} $$

of a joint probability distribution $R_{{\bf x,u}}$ and a conditional distribution $S_{\theta}.$
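
The factorization of the likelihood into an entropy term and a Kullback–Leibler term can be checked numerically. The following is a minimal Python sketch, not the authors' Matlab tools: the two-state, two-action transition table `S` and the sequences `x` and `u` are made-up illustration values, and the entropy is computed with the standard sign convention $\overline{H}=-\sum R(y,z,v)\log R(y|z,v).$

```python
import math
from collections import Counter, defaultdict


def entropy_and_kl(x, u, S):
    """Return (H_bar, D_bar, k) for a model S[(z, v)][y] = S_theta(y | z, v).

    Assumes every observed transition has nonzero probability under S.
    """
    k = len(x) - 1
    # N(y, z, v): occurrences of the triplet (x[i+1], x[i], u[i])
    counts = Counter((x[i + 1], x[i], u[i]) for i in range(k))
    R = {t: n / k for t, n in counts.items()}      # empirical R(y, z, v)
    R_zv = defaultdict(float)                      # marginal R(z, v)
    for (y, z, v), p in R.items():
        R_zv[(z, v)] += p
    # Conditional Shannon entropy: -sum R(y,z,v) log R(y|z,v)
    H = -sum(p * math.log(p / R_zv[(z, v)]) for (y, z, v), p in R.items())
    # Conditional KL distance: sum R(y,z,v) log [R(y,z,v) / (S(y|z,v) R(z,v))]
    D = sum(p * math.log(p / (S[(z, v)][y] * R_zv[(z, v)]))
            for (y, z, v), p in R.items())
    return H, D, k


def log_likelihood(x, u, S):
    """log S_theta^k(x | x_1, u) = sum_i log S_theta(x[i+1] | x[i], u[i])."""
    return sum(math.log(S[(x[i], u[i])][x[i + 1]]) for i in range(len(x) - 1))


# Hypothetical toy chain: states {0, 1}, actions {0, 1}.
S = {
    (0, 0): {0: 0.9, 1: 0.1},
    (0, 1): {0: 0.3, 1: 0.7},
    (1, 0): {0: 0.6, 1: 0.4},
    (1, 1): {0: 0.2, 1: 0.8},
}
x = [0, 0, 1, 1, 0, 1, 1, 0, 0]
u = [0, 1, 1, 0, 1, 1, 0, 0, 0]   # the last action is not used

H, D, k = entropy_and_kl(x, u, S)
ll = log_likelihood(x, u, S)
print(ll, -k * (H + D))           # the two numbers should agree
```

Only the counts of observed triplets are needed, so the data enter the likelihood through the empirical distribution alone.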

Further, we can regard the conditional distribution $S_{\theta}(y|z,v)$ as a set of distributions $S_{\theta}^{z,v}(y),$ and the conditional empirical distribution $R_{{\bf x,u}}\left( y|z,v\right)$ as a set of points $R_{{\bf x,u}}^{z,v}\left( y\right),$ $z\in {\mathcal{X}},$ $v\in {\mathcal{U}}.$ We can then write

$$ \overline{D}(R_{{\bf x,u}}||S_{\theta }) =\sum_{( z,v) \in {\mathcal{X}}\times {\mathcal{U}}}R_{{\bf x,u}}( z,v) D( R_{{\bf x,u}}^{z,v}||S_{\theta }^{z,v}). $$
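
The intermediate step can be spelled out as follows. This is a sketch that assumes $D(P||Q)=\sum_{y}P(y)\log \frac{P(y)}{Q(y)}$ denotes the ordinary (unconditional) Kullback–Leibler distance, which the text uses without stating it, and that uses the factorization $R_{{\bf x,u}}(y,z,v)=R_{{\bf x,u}}(y|z,v)\,R_{{\bf x,u}}(z,v)$:

$$ \begin{aligned} \overline{D}(R_{{\bf x,u}}||S_{\theta }) &=\sum_{(y,z,v)}R_{{\bf x,u}}(y|z,v)\,R_{{\bf x,u}}(z,v)\log \frac{R_{{\bf x,u}}(y|z,v)}{S_{\theta }(y|z,v)}\\ &=\sum_{(z,v)}R_{{\bf x,u}}(z,v)\sum_{y}R_{{\bf x,u}}^{z,v}(y)\log \frac{R_{{\bf x,u}}^{z,v}(y)}{S_{\theta }^{z,v}(y)}\\ &=\sum_{(z,v)}R_{{\bf x,u}}(z,v)\,D(R_{{\bf x,u}}^{z,v}||S_{\theta }^{z,v}). \end{aligned} $$

Each state-action pair thus contributes its own Kullback–Leibler distance, weighted by how often that pair was visited.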

The posterior distribution of the unknown parameter $\theta $ conditional on ${\bf x}$ and ${\bf u}$ is

$$ P(\theta \,|\, {\bf x},{\bf u}) \propto P(\theta)\, S_{\theta }^{k}({\bf x}|x_{1},{\bf u}) \propto P(\theta) \exp \{ -k \overline{D}(R_{{\bf x,u}}||S_{\theta })\} $$

since the conditional entropy $\overline{H}(R_{{\bf x,u}})$ does not depend on $\theta.$
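
For a finite set of candidate models, the posterior can be evaluated directly from the conditional Kullback–Leibler distance of each candidate. The sketch below is illustrative only and is not the authors' implementation: the candidate tables `model_a` and `model_b`, the uniform prior, and the data sequences are hypothetical values.

```python
import math
from collections import Counter, defaultdict


def conditional_kl(x, u, S):
    """D_bar(R || S) for a model S[(z, v)][y] = S_theta(y | z, v).

    Assumes every observed transition has nonzero probability under S.
    """
    k = len(x) - 1
    counts = Counter((x[i + 1], x[i], u[i]) for i in range(k))
    n_zv = defaultdict(int)                      # visits to each pair (z, v)
    for (y, z, v), n in counts.items():
        n_zv[(z, v)] += n
    # sum over triplets of R(y,z,v) * log [R(y|z,v) / S(y|z,v)]
    return sum((n / k) * math.log((n / n_zv[(z, v)]) / S[(z, v)][y])
               for (y, z, v), n in counts.items())


def posterior(x, u, models, prior):
    """P(theta | x, u) proportional to P(theta) * exp(-k * D_bar)."""
    k = len(x) - 1
    dist = {name: conditional_kl(x, u, S) for name, S in models.items()}
    weight = {name: prior[name] * math.exp(-k * dist[name]) for name in models}
    total = sum(weight.values())
    return {name: w / total for name, w in weight.items()}, dist


# Hypothetical candidates: states {0, 1}, actions {0, 1}.
model_a = {
    (0, 0): {0: 0.9, 1: 0.1},
    (0, 1): {0: 0.3, 1: 0.7},
    (1, 0): {0: 0.6, 1: 0.4},
    (1, 1): {0: 0.2, 1: 0.8},
}
model_b = {zv: {0: 0.5, 1: 0.5} for zv in model_a}   # uninformative model
models = {"A": model_a, "B": model_b}
prior = {"A": 0.5, "B": 0.5}

x = [0, 0, 1, 1, 0, 1, 1, 0, 0]
u = [0, 1, 1, 0, 1, 1, 0, 0, 0]   # the last action is not used

post, dist = posterior(x, u, models, prior)
best = min(dist, key=dist.get)    # smallest KL distance
print(post, best)
```

With a uniform prior, picking the model with the smallest distance coincides with picking the posterior mode; a non-uniform prior can shift the choice when few data are available.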

About this article

Cite this article

Ikonen, E., Najim, K. Multiple Model-Based Control Using Finite Controlled Markov Chains. Cogn Comput 1, 234–243 (2009). https://doi.org/10.1007/s12559-009-9020-0

Received: 09 February 2009
Accepted: 04 June 2009
Published: 20 June 2009
Issue Date: September 2009
DOI: https://doi.org/10.1007/s12559-009-9020-0
