ABSTRACT
Intent prediction has widespread applications: in user interface (UI/UX) design to predict target icons, in the automotive industry to anticipate driver intent, and in understanding human motion during human-robot interaction (HRI). Predicting human intent involves analyzing cues such as hand motion, eye gaze, and gestures. This paper introduces a multimodal intent prediction algorithm that combines hand motion and eye gaze through Bayesian fusion. Inverse reinforcement learning is leveraged to learn human preferences in a human-robot handover task. Results demonstrate that the proposed approach achieves a prediction accuracy of 99.9% at 60% task completion, the highest among the state-of-the-art (SOTA) methods compared.
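To make the fusion step concrete, the following is a minimal sketch of Bayesian fusion over a discrete set of candidate targets, assuming the hand and gaze observations are conditionally independent given the intended target. The function name and the likelihood values are illustrative placeholders, not the paper's actual model outputs.

```python
import numpy as np

def fuse_bayesian(prior, hand_likelihood, gaze_likelihood):
    """Posterior over candidate targets via Bayes' rule, assuming the
    hand and gaze observations are conditionally independent given
    the intended target."""
    posterior = prior * hand_likelihood * gaze_likelihood
    return posterior / posterior.sum()

prior = np.full(3, 1.0 / 3.0)        # uniform prior over 3 targets
hand = np.array([0.5, 0.3, 0.2])     # P(hand trajectory | target), illustrative
gaze = np.array([0.6, 0.3, 0.1])     # P(gaze fixation | target), illustrative

posterior = fuse_bayesian(prior, hand, gaze)
print(posterior.argmax())            # index of the most likely target
```

In practice the per-modality likelihoods would come from the learned models (e.g. a reward function recovered by inverse reinforcement learning for the hand trajectory), and the posterior would be updated as the reach unfolds, which is what allows a confident prediction well before task completion.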
Index Terms
- Multimodal Target Prediction for Rapid Human-Robot Interaction