ABSTRACT
Intent prediction has widespread applications: in user interface (UI/UX) design to predict target icons, in the automotive industry to anticipate driver intent, and in understanding human motion during human-robot interaction (HRI). Predicting human intent involves analyzing cues such as hand motion, eye gaze, and gestures. This paper introduces a multimodal intent prediction algorithm that combines hand motion and eye gaze through Bayesian fusion. Inverse reinforcement learning is leveraged to learn human preferences in a human-robot handover task. Results demonstrate that the proposed approach achieves a prediction accuracy of 99.9% at 60% task completion, the highest among the state-of-the-art (SOTA) methods compared.
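To make the fusion step concrete, the following is a minimal sketch of Bayesian fusion over a discrete set of candidate targets, assuming the hand and gaze observations are conditionally independent given the intended target. The function name and the likelihood values are illustrative placeholders, not the paper's actual model outputs.

```python
import numpy as np

def fuse_bayesian(prior, hand_likelihood, gaze_likelihood):
    """Posterior over candidate targets via Bayes' rule, assuming the
    hand and gaze observations are conditionally independent given
    the intended target."""
    posterior = prior * hand_likelihood * gaze_likelihood
    return posterior / posterior.sum()

prior = np.full(3, 1.0 / 3.0)        # uniform prior over 3 targets
hand = np.array([0.5, 0.3, 0.2])     # P(hand trajectory | target), illustrative
gaze = np.array([0.6, 0.3, 0.1])     # P(gaze fixation | target), illustrative

posterior = fuse_bayesian(prior, hand, gaze)
print(posterior.argmax())            # index of the most likely target
```

In practice the per-modality likelihoods would come from the learned models (e.g. a reward function recovered by inverse reinforcement learning for the hand trajectory), and the posterior would be updated as the reach unfolds, which is what allows a confident prediction well before task completion.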
Index Terms
- Multimodal Target Prediction for Rapid Human-Robot Interaction