Spatio-Temporal Tube data representation and Kernel design for SVM-based video object retrieval system

Zhao, Shuji; Precioso, Frédéric; Cord, Matthieu

doi:10.1007/s11042-010-0602-3

Published: 18 September 2010

Volume 55, pages 105–125, (2011)

Multimedia Tools and Applications

Shuji Zhao¹,
Frédéric Precioso¹ &
Matthieu Cord²

Abstract

In this article, we propose a new video object retrieval system. Our approach is based on a spatio-temporal data representation, a dedicated kernel design, and a statistical learning toolbox for video object recognition and retrieval. Using state-of-the-art video object detection algorithms (for faces or cars, for example), we segment video object tracks from shots of real movies. From these tracks we then extract sets of spatio-temporally coherent features, which we call Spatio-Temporal Tubes. To compare these complex tube objects, we design a Spatio-Temporal Tube Kernel (STTK) function. Based on this kernel similarity, we present both supervised and active learning strategies embedded in a Support Vector Machine framework. Additionally, we propose a multi-class classification framework that deals with unbalanced data. Our approach is successfully evaluated on two real-movie databases: the French movie “L’esquive” and episodes from the TV series “Buffy, the Vampire Slayer”. Our method is also tested on a car database (from real movies) and shows promising results on the car identification task.
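
The abstract does not give the STTK formula, so the following is only a rough illustration of the general idea of comparing tracks through a kernel on sets of tubes. It assumes a track is a list of tubes and a tube is a list of feature vectors, and it uses a plain average of Gaussian kernels over all pairs, which is a generic set kernel and not the authors' STTK. All names (`rbf`, `tube_kernel`, `track_kernel`, `rank_tracks`), the `gamma` value, and the toy data are hypothetical.

```python
import math

def rbf(x, y, gamma=0.5):
    """Gaussian kernel between two feature vectors (lists of floats)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def tube_kernel(tube_a, tube_b, gamma=0.5):
    """Average pairwise Gaussian kernel between the features of two tubes.

    Assumption: this averaging stands in for the tube-level similarity;
    the STTK of the article is not specified in the abstract.
    """
    total = sum(rbf(x, y, gamma) for x in tube_a for y in tube_b)
    return total / (len(tube_a) * len(tube_b))

def track_kernel(track_a, track_b, gamma=0.5):
    """Average tube-level kernel over all pairs of tubes from two tracks."""
    total = sum(tube_kernel(s, t, gamma) for s in track_a for t in track_b)
    return total / (len(track_a) * len(track_b))

def rank_tracks(query, database, gamma=0.5):
    """Return database track names sorted by decreasing similarity to the query."""
    scores = {name: track_kernel(query, track, gamma)
              for name, track in database.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy data: each track holds two tubes of 2-D features (purely illustrative).
query = [[[0.0, 0.0], [0.1, 0.0]], [[1.0, 1.0]]]
database = {
    "near": [[[0.0, 0.1], [0.1, 0.1]], [[1.0, 0.9]]],
    "far": [[[5.0, 5.0], [5.1, 5.0]], [[6.0, 6.0]]],
}
ranking = rank_tracks(query, database)
print(ranking)
```

Because the track-level function is an average of Gaussian kernels, the matrix of its values over a set of tracks is symmetric and positive semi-definite, so in principle it could be passed to an SVM solver that accepts a precomputed kernel matrix. Whether that matches the system described in the article is not something the abstract confirms.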

Acknowledgements

We warmly thank Andrew Zisserman and Josef Sivic for providing the data used to compare our results with theirs, and for the very interesting exchanges we had on this work. We also thank Philippe-Henri Gosselin for providing the code for kernel-based SVM with active learning within the RETIN retrieval system.

Author information

Authors and Affiliations

ETIS Lab, CNRS/ENSEA/Univ Cergy-Pontoise, 6, av. du Ponceau, 95000, Cergy-Pontoise, France
Shuji Zhao & Frédéric Precioso
UPMC-Sorbonne Universités – LIP6, 4, place Jussieu, 75005, Paris, France
Matthieu Cord

Corresponding author

Correspondence to Shuji Zhao.

Additional information

This work is funded by Région Île-de-France, project k-VideoScan 2007-34HD Digiteo.

About this article

Cite this article

Zhao, S., Precioso, F. & Cord, M. Spatio-Temporal Tube data representation and Kernel design for SVM-based video object retrieval system. Multimed Tools Appl 55, 105–125 (2011). https://doi.org/10.1007/s11042-010-0602-3

Published: 18 September 2010
Issue Date: October 2011
DOI: https://doi.org/10.1007/s11042-010-0602-3
