Abstract
We propose a methodology based on Learning from Observation to teach a virtual robot to perform its tasks. Our technique assumes only that the behaviors to be cloned can be observed and represented using a finite alphabet of symbols. A virtual agent generates training material according to a range of strategies of gradually increasing complexity. We use machine learning techniques to learn new strategies by observing, and subsequently imitating, the actions performed by the agent. We perform several experiments to test our proposal; their analysis suggests that probabilistic finite state machines are a suitable tool for the problem of behavioral cloning. We believe that the given methodology is easy to integrate into the learning module of any Ubiquitous Robot Architecture.
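The abstract names probabilistic finite state machines as a suitable tool for behavioral cloning over a finite alphabet of action symbols. As a minimal illustrative sketch (not the paper's implementation), the snippet below estimates first-order transition probabilities from observed action traces and predicts the most likely next action; the class name, alphabet, and traces are hypothetical.

```python
from collections import defaultdict


class PFSM:
    """Minimal first-order probabilistic finite state machine.

    States are the previously observed action symbol; transition
    probabilities are estimated from counts over observed traces.
    """

    def __init__(self):
        # counts[prev][nxt] = number of times `nxt` followed `prev`
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, trace):
        """Update transition counts from one observed action sequence."""
        for prev, nxt in zip(trace, trace[1:]):
            self.counts[prev][nxt] += 1

    def predict(self, state):
        """Return the most probable next action from `state`, or None."""
        nxt = self.counts.get(state)
        if not nxt:
            return None
        return max(nxt, key=nxt.get)


# Hypothetical trace over a finite alphabet: N = north, E = east
pfsm = PFSM()
pfsm.observe(list("NNNENNNE"))
print(pfsm.predict("N"))  # prints N (the most frequent successor of "N")
```

A cloned behavior would then be reproduced by repeatedly feeding the current state to `predict`; richer variants condition on longer histories or sample from the estimated distribution instead of taking the argmax.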
Notes
- 1.
- 2.
The model predicts the next action, but the next state is given by the actual configuration of the map; in the case that it is impossible to perform a certain action because of an obstacle, the agent does not change its location.
- 3.
According to Wikipedia, "iRobot Corporation is an American advanced technology company founded in 1990 and incorporated in Delaware in 2000. Roomba was introduced in 2002. As of Feb 2014, over 10 million units have been sold worldwide".
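Note 2's transition rule — the learned model proposes the next action, but the actual map configuration determines the next state, and a move into an obstacle leaves the agent where it is — can be sketched as follows. The grid encoding, action set, and function name are assumptions for illustration, not the paper's code.

```python
# Displacement for each action symbol: (row delta, column delta)
MOVES = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}


def apply_action(grid, pos, action):
    """Return the agent's next position; blocked moves are no-ops."""
    dr, dc = MOVES[action]
    r, c = pos[0] + dr, pos[1] + dc
    in_bounds = 0 <= r < len(grid) and 0 <= c < len(grid[0])
    if in_bounds and grid[r][c] != "#":  # "#" marks an obstacle
        return (r, c)
    return pos  # obstacle or map edge: the agent does not move


grid = ["..#",
        "...",
        ".#."]
print(apply_action(grid, (0, 1), "E"))  # blocked by "#": stays at (0, 1)
print(apply_action(grid, (0, 1), "S"))  # free cell: moves to (1, 1)
```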
Acknowledgments
The authors gratefully acknowledge the financial support of project BASMATI (TIN2011-27479-C04-04) of Programa Nacional de Investigación and project PAC::LFO (MTM2014-55262-P) of Programa Estatal de Fomento de la Investigación Científica y Técnica de Excelencia, Ministerio de Ciencia e Innovación (MICINN), Spain.
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Tîrnăucă, C., Montaña, J.L., Ortiz-Sobremazas, C., Ontañón, S., González, A.J. (2015). Teaching a Virtual Robot to Perform Tasks by Learning from Observation. In: García-Chamizo, J., Fortino, G., Ochoa, S. (eds) Ubiquitous Computing and Ambient Intelligence. Sensing, Processing, and Using Environmental Information. UCAmI 2015. Lecture Notes in Computer Science, vol 9454. Springer, Cham. https://doi.org/10.1007/978-3-319-26401-1_10
DOI: https://doi.org/10.1007/978-3-319-26401-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26400-4
Online ISBN: 978-3-319-26401-1
eBook Packages: Computer Science, Computer Science (R0)