Authors: Mo'taz Al-Hami and Rolf Lakaemper
Affiliation: Temple University, United States
Keyword(s): Human-pose, 2D Human-pose Estimation, 3D Human-pose Reconstruction, Silhouette Value, Hierarchical Clustering.
Related Ontology Subjects/Areas/Topics: Applications; Computer Vision, Visualization and Computer Graphics; Features Extraction; Geometry and Modeling; Image and Video Analysis; Image-Based Modeling; Pattern Recognition; Robotics; Software Engineering
Abstract:
The work presented in this paper is part of a project to enable humanoid robots to build a semantic understanding of their environment using unsupervised self-learning techniques. Here, we propose an approach to learn 3-dimensional human-pose conformations, i.e., structural arrangements of a (simplified) human skeleton model, given only a minimal verbal description of a human posture (e.g., "sitting", "standing", "tree pose"). The only tools given to the robot are knowledge of the skeleton model and a connection to the labeled image database Google Images. Hence, the main contribution of this work is to filter relevant results from an image database, given human-pose-specific query words, and to transform the information in these (2D) images into the 3D pose most likely to fit the human understanding of the keywords. The steps toward this goal integrate available 2D human-pose estimators for still images, clustering techniques to extract representative 2D human-skeleton poses, and 3D-pose estimation from the 2D poses. We evaluate the approach using different query keywords representing different postures.
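To make the clustering step concrete, the following Python sketch illustrates one plausible reading of it; it is not the authors' implementation. It assumes the estimated 2D poses arrive as (n_joints, 2) coordinate arrays already normalized for scale and translation, applies Ward-linkage hierarchical clustering, picks the number of clusters by the best silhouette value, and returns one medoid pose per cluster as the representative 2D skeletons. The function name, linkage choice, and cluster-count range are illustrative assumptions.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics import silhouette_score

def representative_poses(poses_2d, max_clusters=10):
    """Cluster flattened 2D skeleton poses hierarchically, select the
    cluster count with the best silhouette value, and return one medoid
    (the member closest to each cluster mean) per cluster."""
    X = poses_2d.reshape(len(poses_2d), -1)   # (n_poses, n_joints * 2)
    Z = linkage(X, method="ward")             # agglomerative clustering

    best_k, best_score, best_labels = 2, -1.0, None
    # silhouette_score requires 2 <= k <= n_samples - 1
    for k in range(2, min(max_clusters, len(X) - 1) + 1):
        labels = fcluster(Z, t=k, criterion="maxclust")
        score = silhouette_score(X, labels)   # cluster-quality criterion
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels

    reps = []
    for c in range(1, best_k + 1):
        members = X[best_labels == c]
        centroid = members.mean(axis=0)
        medoid = members[np.argmin(np.linalg.norm(members - centroid, axis=1))]
        reps.append(medoid.reshape(-1, 2))    # back to (n_joints, 2)
    return reps, best_score

The silhouette value rewards compact, well-separated clusters, so it can double as a relevance filter: a low best score suggests the retrieved images do not share a coherent posture for the given query words.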
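The abstract does not specify how the 3D pose is recovered from a representative 2D pose. One classic technique compatible with a known skeleton model is Taylor-style lifting under a scaled-orthographic camera, where each bone's out-of-plane depth follows from its known length and its foreshortened 2D projection. The sketch below illustrates only that idea; the joint topology, unit bone lengths, function name, and per-bone sign handling are hypothetical.

import numpy as np

# Simplified stick-figure topology: parent index per joint (root = -1).
# This topology and the unit bone lengths are illustrative assumptions.
PARENTS = [-1, 0, 1, 2, 1, 4, 1, 6, 0, 8, 0, 10]
BONE_LEN = np.ones(len(PARENTS))  # known relative limb lengths of the model

def lift_pose(pose_2d, scale, signs=None):
    """Recover relative joint depths from a single 2D pose under a
    scaled-orthographic camera: for each bone,
    dz = sqrt(L^2 - ||du, dv||^2 / s^2), with a per-bone sign ambiguity."""
    n = len(PARENTS)
    z = np.zeros(n)
    signs = np.ones(n) if signs is None else signs  # +1/-1 per bone
    for j in range(n):
        p = PARENTS[j]
        if p < 0:
            continue  # root joint: depth fixed to 0
        d2 = np.sum((pose_2d[j] - pose_2d[p]) ** 2) / scale ** 2
        dz = np.sqrt(max(BONE_LEN[j] ** 2 - d2, 0.0))
        z[j] = z[p] + signs[j] * dz
    return np.column_stack([pose_2d / scale, z])  # (n_joints, 3)

Each bone carries a two-fold depth ambiguity (toward or away from the camera), so a full reconstruction must still choose among the sign combinations, for example by enforcing anthropometric joint-angle limits.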