Abstract
Neural networks are widely used to model nonlinear systems that are difficult to formulate analytically. However, because neural networks are a radically different approach to mathematical modeling, control theory has not been applied to them, even when they approximate the nonlinear state equation of a control object. In this research, we propose a new approach, neural model extraction, which enables model-based control of a feed-forward neural network trained on a nonlinear state equation. Specifically, we propose a method for extracting the linear state equation that is equivalent to the neural network at a given input vector. We conducted simulations of a two-degrees-of-freedom planar manipulator to verify that the proposed method enables model-based control on neural network forward models. Through simulations assuming different settings for observing the manipulator's state, we confirm the validity of the proposed method.
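The key observation behind such an extraction is that a feed-forward network with piecewise-linear activations (e.g., ReLU) computes an exactly affine map within each activation region, so for any given input one can read off an equivalent linear state equation. The following is a minimal sketch of this idea, not the authors' exact algorithm: it collapses a plain ReLU multilayer perceptron into a single affine map `A x + b` valid in the linear region containing `x`.

```python
import numpy as np

def forward(weights, biases, x):
    """Plain ReLU MLP forward pass (linear output layer)."""
    h = x
    for W, c in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + c, 0.0)
    return weights[-1] @ h + biases[-1]

def extract_linear_model(weights, biases, x):
    """Return (A, b) of the local affine model of the network at input x.

    Each ReLU layer acts as a fixed 0/1 diagonal mask D on its
    pre-activation within x's linear region, so the layer composition
    collapses into a single affine map A x + b.
    """
    n = x.shape[0]
    A = np.eye(n)
    b = np.zeros(n)
    h = x
    for W, c in zip(weights[:-1], biases[:-1]):
        z = W @ h + c
        D = np.diag((z > 0).astype(float))  # active-unit mask at x
        A = D @ W @ A                       # propagate the linear part
        b = D @ (W @ b + c)                 # propagate the offset
        h = np.maximum(z, 0.0)
    W, c = weights[-1], biases[-1]          # linear output layer
    return W @ A, W @ b + c

# Example with random (hypothetical) weights: the extracted affine map
# reproduces the network output at x exactly.
rng = np.random.default_rng(0)
ws = [rng.standard_normal((8, 4)), rng.standard_normal((4, 8))]
bs = [rng.standard_normal(8), rng.standard_normal(4)]
x = rng.standard_normal(4)
A, b = extract_linear_model(ws, bs, x)
```

Once `A` and `b` are in hand at the current state, standard model-based techniques (e.g., linearization-based feedback design) can in principle be applied to the neural forward model, which is the premise of the approach described above.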
Funding
This work was supported by JSPS KAKENHI Grant Numbers 18H01410, 19K22875, and 19H01122.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Ikemoto, S., Takahara, K., Kumi, T. et al. Neural Model Extraction for Model-Based Control of a Neural Network Forward Model. SN COMPUT. SCI. 2, 54 (2021). https://doi.org/10.1007/s42979-021-00456-4