DOI: 10.1145/2818346.2820738
Research Article

Multimodal Human Activity Recognition for Industrial Manufacturing Processes in Robotic Workcells

Published: 9 November 2015

ABSTRACT

We present an approach for monitoring and interpreting human activities based on a novel multimodal vision-based interface, aiming to improve the efficiency of human-robot interaction (HRI) in industrial environments. Multimodality is a central concept in this design: we combine inputs from several state-of-the-art sensors to provide a variety of information, e.g., skeleton and fingertip poses. Based on typical industrial workflows, we derived multiple levels of human activity labels, including large-scale activities (e.g., assembly) and simpler sub-activities (e.g., hand gestures), creating a duration- and complexity-based hierarchy. We train supervised generative classifiers for each activity level and combine the output of this stage with a trained Hierarchical Hidden Markov Model (HHMM), which models not only the temporal relations between activities on the same level, but also the hierarchical relations between the levels.
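The fusion step described above can be illustrated with a toy sketch: a forward pass of a flat discrete HMM that combines per-frame classifier likelihoods with a learned transition matrix to produce a filtered posterior over activity labels (the paper's HHMM additionally models transitions between hierarchy levels, which is omitted here). All label names, matrices, and scores below are illustrative assumptions, not values from the paper.

```python
# Toy forward algorithm for a discrete HMM over activity labels.
# Illustrates fusing per-frame classifier scores with temporal transitions;
# the activities, transition matrix A, and prior pi are made-up examples.

ACTIVITIES = ["assembly", "hand_gesture", "idle"]

# A[i][j] = P(next activity = j | current activity = i), assumed values.
A = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.2, 0.5],
]
# Initial distribution over activities, assumed values.
pi = [0.5, 0.3, 0.2]

def forward(frame_likelihoods):
    """frame_likelihoods[t][i] ~ classifier score P(observation_t | activity i).
    Returns the normalized filtered posterior over activities at the last frame."""
    n = len(ACTIVITIES)
    alpha = [pi[i] * frame_likelihoods[0][i] for i in range(n)]
    for t in range(1, len(frame_likelihoods)):
        alpha = [
            frame_likelihoods[t][j] * sum(alpha[i] * A[i][j] for i in range(n))
            for j in range(n)
        ]
    z = sum(alpha)
    return [a / z for a in alpha]

# Example: three frames whose classifier scores increasingly favor "assembly".
obs = [
    [0.4, 0.4, 0.2],
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
]
posterior = forward(obs)
print(max(zip(posterior, ACTIVITIES)))  # most likely current activity
```

The temporal smoothing is what distinguishes this from per-frame classification alone: a single noisy frame favoring another label is down-weighted by the transition probabilities.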


Published in: ICMI '15: Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, November 2015, 678 pages. ISBN: 9781450339124. DOI: 10.1145/2818346. Copyright © 2015 ACM.


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance rates: ICMI '15 paper acceptance rate: 52 of 127 submissions (41%). Overall acceptance rate: 453 of 1,080 submissions (42%).
