poster

Human action recognition and retrieval using sole depth information

Published: 29 October 2012

Abstract

Motivated by the widespread use of Kinect-like depth cameras, we investigate the problem of human action recognition and retrieval in videos using depth data alone. We propose simple depth descriptors that require no learning optimization yet achieve promising performance, comparable to that of leading methods based on color images and videos, and that can be applied effectively in real-time applications. Because depth cameras rely on infrared sensing, the proposed approach is especially useful under poor lighting conditions, e.g., surveillance environments without sufficient lighting. We also present a large Depth-included Human Action video dataset, named DHA, which contains 357 videos of performed human actions belonging to 17 categories. To the best of our knowledge, DHA is one of the largest depth-included video datasets of human actions.

Published In

MM '12: Proceedings of the 20th ACM international conference on Multimedia
October 2012
1584 pages
ISBN: 9781450310895
DOI: 10.1145/2393347

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. depth information
  2. human action recognition
  3. human action video retrieval

Qualifiers

  • Poster

Conference

MM '12: ACM Multimedia Conference
October 29 - November 2, 2012
Nara, Japan

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
