Abstract
Imitation Learning (IL) algorithms mimic expert behavior in order to perform specific tasks, but it remains unclear how well these strategies fare when learning from sub-optimal data (faulty experts). Studying how IL approaches learn under different degrees of observation quality can benefit tasks such as optimizing data collection, producing interpretable models, reducing bias when using sub-optimal experts, and more. In this work, we therefore provide extensive experiments to verify how different IL agents perform under various degrees of expert optimality. We experiment with four IL algorithms, three that learn in a self-supervised manner and one, Behavioral Cloning (BC), that uses ground-truth action labels, in four different environments (tasks), and we compare them using optimal and sub-optimal experts. To assess each agent, we compute two metrics: Performance and Average Episodic Reward. Our experiments show that the self-supervised IL approaches are relatively resilient to sub-optimal experts, which is not the case for the supervised approach. We also observe that sub-optimal experts are sometimes beneficial, since they seem to act as a form of regularization that prevents the models from overfitting the data. You can replicate our experiments with the code in our GitHub repository (https://github.com/NathanGavenski/How-resilient-IL-methods-are).
N. Gavenski and J. Monteiro contributed equally to this work.
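The abstract does not spell out how the two evaluation metrics are computed. The sketch below shows one plausible implementation, assuming the classic Gym rollout API and the normalization commonly used in related work, where Performance scores a random policy at 0 and the expert at 1; the function names and the `policy(obs)` interface are illustrative, not taken from the paper's repository.

```python
import numpy as np


def average_episodic_reward(env, policy, episodes=100):
    """Average Episodic Reward (AER): mean undiscounted return over rollouts.

    Assumes the classic Gym API, where env.step returns
    (obs, reward, done, info) and env.reset returns obs.
    """
    returns = []
    for _ in range(episodes):
        obs, done, episode_return = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = env.step(policy(obs))
            episode_return += reward
        returns.append(episode_return)
    return float(np.mean(returns))


def performance(agent_aer, random_aer, expert_aer):
    """Performance: AER normalized so a random policy scores 0, the expert 1."""
    return (agent_aer - random_aer) / (expert_aer - random_aer)
```

For example, on a task such as CartPole-v1 one would estimate `random_aer` with a uniformly random policy and `expert_aer` with the expert that produced the demonstrations, then report `performance(agent_aer, random_aer, expert_aer)` for each degree of expert optimality.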
Acknowledgment
This work was carried out in cooperation with Banco Cooperativo Sicredi.