
Learning human actions from complex manipulation tasks and their transfer to robots in the circular factory

Erlernen menschlicher Handlungen aus komplexen Manipulationsaufgaben und deren Übertragung auf Roboter in einer Kreislauffabrik
Manuel Zaremski, Blanca Handwerker, Christian R. G. Dreher, Fabian Leven, David Schneider, Alina Roitberg, Rainer Stiefelhagen, Gerhard Neumann, Michael Heizmann, Tamim Asfour and Barbara Deml

Abstract

Process automation is essential to establish an economically viable circular factory in high-wage locations. This involves using autonomous production technologies, such as robots, to disassemble, reprocess, and reassemble used products in unknown condition into the original or a new generation of products. This is a complex and highly dynamic task that involves a high degree of uncertainty. To adapt robots to these conditions, learning from humans is necessary. Humans are the most flexible resource in the circular factory, as they can adapt their knowledge and skills to new tasks and changing conditions. This paper presents an interdisciplinary research framework for learning human action knowledge from complex manipulation tasks through human observation and demonstration. The acquired knowledge will be described in a machine-executable form and transferred to robots for automated industrial execution in a circular factory. There are two primary research objectives. First, we investigate the multi-modal capture of human behavior and the description of human action knowledge. Second, we study the reproduction and generalization of learned actions, in particular disassembly and assembly actions, on robots.

Zusammenfassung

Die Prozessautomatisierung spielt eine wesentliche Rolle bei der wirtschaftlichen Tragfähigkeit einer Kreislauffabrik an Hochlohnstandorten. Dies impliziert den Einsatz autonomer Produktionstechnologien wie Roboter, um gebrauchte Produkte mit unbekannten Zuständen zu demontieren, aufzuarbeiten und in die ursprüngliche oder eine neue Generation von Produkten wieder zusammenzubauen. Dieser Prozess ist von hoher Komplexität und einer hohen Dynamik geprägt, wodurch ein hohes Maß an Unsicherheit entsteht. Um Roboter an diese Bedingungen anzupassen, ist es notwendig, vom Menschen zu lernen. Der Mensch stellt in einer Kreislauffabrik die flexibelste Ressource dar, da er in der Lage ist, sein Wissen und seine Fähigkeiten an neue Aufgaben und sich ändernde Bedingungen anzupassen. In diesem Artikel wird ein interdisziplinärer Forschungsansatz vorgestellt, um menschliches Handlungswissen aus komplexen Manipulationsaufgaben durch Beobachtung und Demonstration zu erlernen. Das erlangte Wissen wird in einer für Maschinen ausführbaren Form beschrieben und auf Roboter übertragen, sodass es in einer industriellen Automatisierung in einer Kreislauffabrik zur Anwendung kommt. Dazu gibt es zwei primäre Forschungsziele. Erstens wird die multimodale Erfassung des menschlichen Verhaltens und die Beschreibung des menschlichen Handlungswissens untersucht. Zweitens wird die Reproduktion und Generalisierung von erlernten Handlungen, insbesondere von Demontage- und Montagehandlungen, auf Roboter evaluiert.


Corresponding author: Manuel Zaremski, Karlsruhe Institute of Technology, Institute of Human and Industrial Engineering, Engler-Bunte-Ring 4, 76131 Karlsruhe, Germany, E-mail:

About the authors

Manuel Zaremski

Manuel Zaremski received his M.Sc. in Human Movement Sciences at the Justus Liebig University of Gießen in 2017 and is currently a research assistant at the Institute of Human and Industrial Engineering (ifab) at KIT. His research interests lie in the field of human-machine interaction, and he is particularly interested in the analysis of human eye and gaze movements.

Blanca Handwerker

Blanca Handwerker received her M.Sc. in Psychology at the University of Bonn in 2023 and is currently a research assistant at the Institute of Human and Industrial Engineering (ifab) at KIT. Her research interests include human-machine interaction and the analysis of human eye and gaze movements.

Christian R. G. Dreher

Christian R. G. Dreher studied computer science at the Karlsruhe Institute of Technology (KIT) and graduated in 2019. He is currently working as a research assistant at the Chair of High-Performance Humanoid Technologies (H2T), KIT. His research interests include robot programming by demonstration.

Fabian Leven

Fabian Leven received his M.Sc. degree in Physics at the Karlsruhe Institute of Technology in 2019. His current research interests lie in the field of machine vision, where he is particularly interested in estimating the direction of human gaze.

David Schneider

David Schneider is currently a research assistant at the Computer Vision for Human-Computer Interaction Lab (CV:HCI) at KIT. He works on human activity recognition as a component of assistive technologies that facilitate daily activities in the later stages of life. His research interests focus on human action recognition as well as multimodal, self-supervised, and cross-domain learning.

Alina Roitberg

Alina Roitberg is a tenure-track junior professor at the University of Stuttgart, where she leads the Intelligent Sensing and Perception Group at the Institute for AI. Her research interests include computer vision, human activity recognition, domain adaptation, open set recognition, and resource- and data-efficient learning.

Rainer Stiefelhagen

Rainer Stiefelhagen is a professor for Information Technology Systems for Visually Impaired Students at the Karlsruhe Institute of Technology (KIT), where he directs the Computer Vision for Human-Computer Interaction Lab at the Institute for Anthropomatics and Robotics. His research interests include computer vision methods for the visual perception of humans and their activities, with applications in perceptive multimodal interfaces, humanoid robots, smart environments, multimedia analysis, and assistive technology for persons with visual impairments.

Gerhard Neumann

Gerhard Neumann is a professor for Autonomous Learning Robots at the Institute for Anthropomatics and Robotics at KIT. His research focuses on the intersection of machine learning, robotics, and human-robot interaction, including the creation of data-efficient machine learning algorithms that are suitable for complex robot domains.

Michael Heizmann

Michael Heizmann is a professor for Mechatronic Measurement Systems and head of the Institute of Industrial Information Technology (IIIT) at the Karlsruhe Institute of Technology (KIT). His research areas include automatic visual inspection, signal and image processing, image and information fusion, measurement technology, machine learning, and artificial intelligence, as well as their applications.

Tamim Asfour

Tamim Asfour is a professor of Humanoid Robotic Systems at the Karlsruhe Institute of Technology (KIT). He heads the Chair of High-Performance Humanoid Technologies (H2T), and his research interests focus on humanoid robots that can learn from observation and experience, and can act and interact in real environments.

Barbara Deml

Barbara Deml is a professor of Human Factors and the head of the Institute for Human and Industrial Engineering (ifab) at the Karlsruhe Institute of Technology (KIT). Her research interests include the empirical analysis of human behavior and related cognitive processes, human-machine interaction, as well as designing work systems that are human-centered and incorporate learning automated systems.

  1. Research ethics: The ethics committee at the Karlsruhe Institute of Technology has unanimously voted that there are no ethical concerns regarding the admissibility of the Circular Factory for the Perpetual Product research project.

  2. Author contributions: The authors have accepted responsibility for the entire content of this manuscript and approved its submission.

  3. Competing interests: The authors state no conflict of interest.

  4. Research funding: German Research Foundation (DFG), SFB 1574 Circular Factory for the Perpetual Product (project ID: 471687386).

  5. Data availability: Not applicable.


Received: 2024-01-09
Accepted: 2024-07-02
Published Online: 2024-09-10
Published in Print: 2024-09-25

© 2024 Walter de Gruyter GmbH, Berlin/Boston
