DOI: 10.1145/2396761.2398694
poster

Topic based pose relevance learning in dance archives

Published: 29 October 2012

Abstract

This paper improves the spatial pyramid kernel (SPK) and proposes a relevance learning approach for comparing performers' poses in a large dance archive, the NRCD collection. Domain knowledge from Choreutics is exploited to define pose topics, and a selection operator is developed for pose topic matching. The self-similarity visual structure descriptor (SSF) is extended to a hierarchical self-similarity descriptor (HSSF) to preserve shape context. The Bag-of-Visual-Words (BOVW) framework is applied both for encoding and to speed up matching on pose topics and topic combinations. This alleviates the complexity of limb allocation, which is infeasible on our data. Extensive experiments show that the new approach outperforms the original SPK in both precision and robustness.
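
For orientation, the baseline that the abstract calls the original SPK can be sketched as a weighted sum of histogram intersections over BOVW histograms computed on progressively finer grids. The sketch below follows the standard spatial pyramid match weighting and is not the improved kernel or the topic selection operator proposed in the paper. The function names, the normalised coordinates, and the toy inputs are assumptions made for illustration.

```python
import numpy as np

def pyramid_histograms(xy, words, vocab_size, levels=2):
    """Per-level BOVW histograms over a spatial grid (illustrative helper).

    xy: (n, 2) keypoint coordinates, assumed normalised to [0, 1).
    words: (n,) integer visual-word index of each keypoint.
    Returns one flattened (cells * cells * vocab_size) histogram per level.
    """
    hists = []
    for level in range(levels + 1):
        cells = 2 ** level
        # Grid cell of each keypoint at this resolution.
        cx = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        cy = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        flat = (cy * cells + cx) * vocab_size + words
        size = cells * cells * vocab_size
        hists.append(np.bincount(flat, minlength=size).astype(float))
    return hists

def spatial_pyramid_kernel(h_a, h_b):
    """Weighted sum of histogram intersections, coarse to fine."""
    top = len(h_a) - 1
    score = 0.0
    for level, (a, b) in enumerate(zip(h_a, h_b)):
        # Level 0 is weighted 1/2^L; level l >= 1 is weighted 1/2^(L - l + 1).
        if level == 0:
            weight = 1.0 / 2 ** top
        else:
            weight = 1.0 / 2 ** (top - level + 1)
        score += weight * np.minimum(a, b).sum()
    return score

# Toy example: two keypoints, vocabulary of two visual words.
xy_a = np.array([[0.1, 0.1], [0.9, 0.9]])
xy_b = np.array([[0.9, 0.9], [0.1, 0.1]])  # same words, swapped positions
words = np.array([0, 1])

h_a = pyramid_histograms(xy_a, words, vocab_size=2)
h_b = pyramid_histograms(xy_b, words, vocab_size=2)
same = spatial_pyramid_kernel(h_a, h_a)
swapped = spatial_pyramid_kernel(h_a, h_b)
```

In the toy example the two images contain the same visual words, so they agree only at the coarsest level; the finer levels penalise the swapped layout, which is the spatial sensitivity a plain BOVW histogram lacks.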

Published In

CIKM '12: Proceedings of the 21st ACM international conference on Information and knowledge management
October 2012
2840 pages
ISBN: 9781450311564
DOI: 10.1145/2396761

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. dance retrieval
  2. pose relevance learning
  3. spatial pyramid kernel

Qualifiers

  • Poster

Conference

CIKM'12
Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
