Abstract
A perceptual video hash function maps a video to a feature vector that characterizes its perceptual content. Such a function must be robust to content-preserving manipulations and sensitive to content-changing manipulations. In the literature, subspace projection techniques such as reduced-rank PARAllel FACtor analysis (PARAFAC) have been applied successfully to extract perceptual hashes from videos. We propose a robust perceptual video hash function based on Tucker decomposition, a multilinear subspace projection method. We also propose a method to find the optimum number of components in the factor matrices of the Tucker decomposition. Receiver operating characteristic (ROC) curves are used to compare the performance of the proposed algorithm with that of other state-of-the-art projection techniques. The proposed algorithm shows superior performance for most of the image processing attacks considered. An application for indexing and retrieval of near-identical videos is developed using the proposed algorithm, and its performance is evaluated using average recall/precision curves. The experimental results show that the proposed algorithm is suitable for indexing and retrieval of near-identical videos.
Cite this article
R., S., Sharma, S., Thakur, M. et al. Perceptual video hashing based on Tucker decomposition with application to indexing and retrieval of near-identical videos. Multimed Tools Appl 75, 7779–7797 (2016). https://doi.org/10.1007/s11042-015-2695-1
DOI: https://doi.org/10.1007/s11042-015-2695-1