DOI: 10.1145/3123266.3123273

Temporal Binary Coding for Large-Scale Video Search

Published: 19 October 2017

Abstract

Recent years have witnessed the success of hash-based approximate nearest neighbor search techniques in large-scale image retrieval. For large-scale video search, however, most existing hashing methods focus on the visual content of still frames and do not consider the temporal relations among them. As a result, they capture the intrinsic similarities between videos poorly, in both the visual and the temporal aspects. To address this problem, we propose an unsupervised temporal binary coding solution that simultaneously considers the intrinsic relations among the visual content and the temporal consistency among successive frames. To capture the inherent data similarities among videos, we adopt a sparse, nonnegative feature to characterize the common local visual content and approximate the intrinsic similarities using a low-rank matrix. A standard graph-based loss is then adopted to guarantee that the learnt hash codes preserve these similarities well. Furthermore, we introduce a subspace rotation to model the small variation among successive frames, and thus preserve the temporal consistency in Hamming space. Finally, we formulate the video hashing problem as a joint learning of the binary codes, the hash functions and the temporal variation, and devise an alternating optimization algorithm that enjoys fast training and discriminative hash functions. Extensive experiments on three large video datasets demonstrate that the proposed method significantly outperforms a number of state-of-the-art hashing methods.
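
As a rough illustration of the ideas named in the abstract (binary codes for frames, comparison in Hamming space, and consistency between the codes of successive frames), the sketch below uses plain sign-random-projection hashing. It is not the method proposed in the paper: there is no learned hash function, graph-based loss, low-rank similarity matrix, or subspace rotation, and every function name here is a hypothetical choice made for illustration.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes (assumed stand-in for learned hash functions)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_frame(feature, hyperplanes):
    """Encode one frame feature as an int; bit i is set if the i-th projection is >= 0."""
    code = 0
    for i, w in enumerate(hyperplanes):
        if sum(wi * xi for wi, xi in zip(w, feature)) >= 0.0:
            code |= 1 << i
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def temporal_inconsistency(frame_codes):
    """Mean Hamming distance between codes of successive frames (0.0 if fewer than 2 frames)."""
    pairs = list(zip(frame_codes, frame_codes[1:]))
    if not pairs:
        return 0.0
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def video_distance(codes_a, codes_b):
    """Mean frame-wise Hamming distance over the frames both videos share, in order."""
    n = min(len(codes_a), len(codes_b))
    if n == 0:
        raise ValueError("both videos need at least one frame")
    return sum(hamming(a, b) for a, b in zip(codes_a, codes_b)) / n


if __name__ == "__main__":
    planes = make_hyperplanes(dim=4, n_bits=16, seed=1)
    # Two toy "videos": lists of per-frame feature vectors that drift slowly over time.
    video_a = [[1.0, 0.5 + 0.01 * t, -0.3, 0.2] for t in range(5)]
    video_b = [[-1.0, 0.4, 0.9 - 0.01 * t, 0.1] for t in range(5)]
    codes_a = [hash_frame(f, planes) for f in video_a]
    codes_b = [hash_frame(f, planes) for f in video_b]
    print("temporal inconsistency of A:", temporal_inconsistency(codes_a))
    print("distance A-A:", video_distance(codes_a, codes_a))
    print("distance A-B:", video_distance(codes_a, codes_b))
```

Codes are stored as Python integers so that the Hamming distance reduces to an XOR followed by a bit count. A learned approach such as the one the abstract describes would replace the random hyperplanes with trained hash functions.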




Published In

MM '17: Proceedings of the 25th ACM international conference on Multimedia
October 2017
2028 pages
ISBN: 9781450349062
DOI: 10.1145/3123266


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. binary code learning
  2. large-scale video search
  3. locality sensitive hashing
  4. temporal consistency

Qualifiers

  • Research-article

Funding Sources

  • the Foundation of State Key Lab of Software Development Environment
  • Beijing Municipal Science and Technology Commission
  • the National Natural Science Foundation of China

Conference

MM '17: ACM Multimedia Conference
October 23 - 27, 2017
Mountain View, California, USA

Acceptance Rates

MM '17 Paper Acceptance Rate 189 of 684 submissions, 28%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Article Metrics

  • Downloads (last 12 months): 4
  • Downloads (last 6 weeks): 0
Reflects downloads up to 15 Feb 2025

Cited By

  • (2023) Contrastive Transformer Hashing for Compact Video Representation. IEEE Transactions on Image Processing 32, 5992-6003. DOI: 10.1109/TIP.2023.3326994. Online publication date: 2023
  • (2021) Boosting Temporal Binary Coding for Large-Scale Video Search. IEEE Transactions on Multimedia 23, 353-364. DOI: 10.1109/TMM.2020.2978593. Online publication date: 2021
  • (2021) Semantics-Aware Spatial-Temporal Binaries for Cross-Modal Video Retrieval. IEEE Transactions on Image Processing 30, 2989-3004. DOI: 10.1109/TIP.2020.3048680. Online publication date: 2021
  • (2021) Learning a maximized shared latent factor for cross-modal hashing. Knowledge-Based Systems 228, 107252. DOI: 10.1016/j.knosys.2021.107252. Online publication date: Sep-2021
  • (2019) Estimation of gait normality index based on point clouds through deep auto-encoder. EURASIP Journal on Image and Video Processing 2019:1. DOI: 10.1186/s13640-019-0466-z. Online publication date: 28-May-2019
  • (2019) Fast distributed video deduplication via locality-sensitive hashing with similarity ranking. EURASIP Journal on Image and Video Processing 2019:1. DOI: 10.1186/s13640-019-0442-7. Online publication date: 5-Mar-2019
  • (2017) VSCC'2017. Proceedings of the 25th ACM international conference on Multimedia, 1976-1977. DOI: 10.1145/3123266.3132053. Online publication date: 23-Oct-2017
