Personalized video similarity measure

Shen, Jialie; Cheng, Zhiyong

doi:10.1007/s00530-010-0223-8

Interactive Multimedia Computing
Published: 23 December 2010

Volume 17, pages 421–433, (2011)
Multimedia Systems

Jialie Shen¹ &
Zhiyong Cheng¹

Abstract

As an effective technique for managing and exploring large-scale video collections, personalized video search has received considerable attention in recent years. One of the key problems in developing such techniques is how to design and evaluate similarity measures. Most existing approaches simply adopt the traditional Euclidean distance or its variants. Consequently, they generally suffer from two main disadvantages: (1) low effectiveness: retrieval accuracy is poor, and one of the main reasons is that very little research has been carried out on designing an effective fusion scheme for integrating multimodal information (e.g., text, audio and visual) from video sequences; and (2) poor scalability: the development of video similarity metrics is largely disconnected from that of the relevant database access methods (indexing structures). This article reports a new distance metric, called personalized video distance, that fuses information about individual preference and multimodal properties into a compact signature. Moreover, a novel hashing-based indexing structure has been designed to facilitate fast retrieval and better scalability. A set of comprehensive empirical studies has been carried out on two large video test collections with carefully designed queries of different complexities. We observe significant improvements over existing techniques in various aspects.
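The abstract does not specify how the personalized video distance is computed. As a generic illustration only of fusing multimodal features under individual preference, the sketch below takes a weighted sum of per-modality Euclidean distances. The function name `fused_distance`, the modality keys, the feature vectors and the weight values are all hypothetical, and this is not the authors' metric.

```python
import math


def fused_distance(a, b, weights):
    """Weighted sum of per-modality Euclidean distances.

    Illustrative assumption only: `a` and `b` map a modality name to a
    feature vector, and `weights` maps the same names to per-user
    preference weights. This is not the metric proposed in the article.
    """
    total = 0.0
    for modality, w in weights.items():
        # math.dist computes the Euclidean distance between two points.
        total += w * math.dist(a[modality], b[modality])
    return total


# Hypothetical feature vectors for a query video and a candidate clip.
query = {"text": [1.0, 0.0], "audio": [0.0, 0.0], "visual": [0.0, 0.0]}
clip = {"text": [1.0, 0.0], "audio": [3.0, 4.0], "visual": [0.0, 0.0]}

# Hypothetical preferences: one user cares mostly about audio, another barely.
audio_fan = {"text": 0.1, "audio": 0.8, "visual": 0.1}
text_fan = {"text": 0.7, "audio": 0.2, "visual": 0.1}

d_audio_fan = fused_distance(query, clip, audio_fan)
d_text_fan = fused_distance(query, clip, text_fan)
```

Because the two clips differ only in their audio features, the audio-focused user sees a larger distance than the text-focused user, which is the basic sense in which such a measure is personalized.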

Notes

Variance must be larger than 80%.

Acknowledgments

Jialie Shen was supported by the Lee Foundation Fellowship for Research Excellence (SMU Research Project Fund No: C220/T050024), Singapore. We would like to thank Professor Ramayya Krishnan of the School of Information Systems and Management, Heinz College, Carnegie Mellon University, as well as the associate editors and referees, for their valuable comments.

Author information

Authors and Affiliations

School of Information Systems, Singapore Management University, Singapore, 178902, Singapore
Jialie Shen & Zhiyong Cheng

Corresponding author

Correspondence to Jialie Shen.

About this article

Cite this article

Shen, J., Cheng, Z. Personalized video similarity measure. Multimedia Systems 17, 421–433 (2011). https://doi.org/10.1007/s00530-010-0223-8

Issue Date: October 2011