Perceptual video hashing based on Tucker decomposition with application to indexing and retrieval of near-identical videos

Multimedia Tools and Applications

Abstract

A perceptual video hash function maps a video to a feature vector that characterizes its perceptual content. Such a function must be robust to content-preserving manipulations and sensitive to content-changing manipulations. In the literature, subspace projection techniques such as reduced-rank PARAllel FACtor analysis (PARAFAC) have been successfully applied to extract perceptual hashes from videos. We propose a robust perceptual video hash function based on Tucker decomposition, a multilinear subspace projection method. We also propose a method to find the optimum number of components in the factor matrices of the Tucker decomposition. Receiver Operating Characteristic (ROC) curves are used to compare the performance of the proposed algorithm with that of other state-of-the-art projection techniques. The proposed algorithm outperforms these techniques under most of the image processing attacks considered. An application for indexing and retrieval of near-identical videos is developed using the proposed algorithm, and its performance is evaluated using average recall/precision curves. The experimental results show that the proposed algorithm is suitable for indexing and retrieval of near-identical videos.
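As a rough illustration of the idea in the abstract, the following minimal sketch computes a Tucker decomposition of a video tensor with a truncated higher-order SVD (HOSVD) and derives a binary hash from the core tensor. It is not the authors' algorithm: the abstract does not describe how the hash bits are formed or how the number of components is chosen, so the function names `truncated_hosvd`, `toy_hash`, and `hamming_distance`, the hand-picked `ranks`, the (height, width, frames) axis order, and the median-threshold binarization are all assumptions made for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def truncated_hosvd(tensor, ranks):
    """Tucker decomposition via truncated HOSVD.

    Returns the core tensor and one factor matrix per mode. `ranks` gives the
    number of components kept in each mode (chosen by hand here; the paper's
    selection method is not reproduced).
    """
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of the mode-n unfolding.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])

    core = tensor
    for mode, u in enumerate(factors):
        # Mode-n product with u.T: project axis `mode` onto the kept components.
        core = np.moveaxis(np.tensordot(u.T, core, axes=(1, mode)), 0, mode)
    return core, factors


def toy_hash(video, ranks):
    """Hypothetical hash: threshold core magnitudes at their median.

    Magnitudes are used because singular vectors are only defined up to sign.
    """
    core, _ = truncated_hosvd(np.asarray(video, dtype=float), ranks)
    magnitudes = np.abs(core).ravel()
    return magnitudes > np.median(magnitudes)


def hamming_distance(hash_a, hash_b):
    """Normalized Hamming distance between two equal-length binary hashes."""
    return float(np.mean(hash_a != hash_b))
```

Under these assumptions, `toy_hash(video, (4, 4, 4))` yields a 64-bit vector for a grayscale video stored as a (height, width, frames) array, and two videos can be compared by thresholding `hamming_distance` on their hashes. Sweeping that threshold over pairs of known matches and non-matches is one way to obtain the kind of ROC curve the abstract refers to.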

Author information

Corresponding author

Correspondence to Sandeep R.

About this article

Cite this article

R., S., Sharma, S., Thakur, M. et al. Perceptual video hashing based on Tucker decomposition with application to indexing and retrieval of near-identical videos. Multimed Tools Appl 75, 7779–7797 (2016). https://doi.org/10.1007/s11042-015-2695-1

  • DOI: https://doi.org/10.1007/s11042-015-2695-1
