Video similarity detection using fixed-length Statistical Dominant Colour Profile (SDCP) signatures

Bekhet, Saddam; Ahmed, Amr

doi:10.1007/s11554-017-0700-9

Video similarity detection using fixed-length Statistical Dominant Colour Profile (SDCP) signatures

Original Research Paper
Published: 28 June 2017

Volume 16, pages 1999–2014, (2019)
Cite this article

Journal of Real-Time Image Processing Aims and scope Submit manuscript

304 Accesses
2 Citations
Explore all metrics

Abstract

This paper presents a fast and effective technique for videos’ visual similarity detection and measurement using compact fixed-length signatures. The proposed technique facilitates for building real-time and scalable video matching/retrieval systems through generating a representative signature for a given video shot. The generated signature (Statistical Dominant Colour Profile, SDCP) effectively encodes the colours’ spatio-temporal patterns in a given shot, towards a robust real-time matching. Furthermore, the SDCP signature is engineered to better address the visual similarity problem, through its relaxed representation of shot contents. The compact fixed-length aspect of the proposed signature is the key to its high matching speed (>1000 fps) compared to the current techniques that relies on exhaustive processing, such as dense trajectories. The SDCP signature encodes a given video shot with only 294 values, regardless of the shot length, which facilitates for speedy signature extraction and matching. To maximize the benefit of the proposed technique, compressed-domain videos are utilized as a case study following their wide availability. However, the proposed technique avoids full video decompression and operates on tiny frames, rather than full-size decompressed frames. This is achievable through using the tiny DC-images sequence of the MPEG compressed stream. The experiments on various standard and challenging datasets (e.g. UCF101 13k videos) shows the technique’s robust performance, in terms of both, retrieval ability and computational performances.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Graph-based video sequence matching using dominant colour graph profile (DCGP)

Article 08 August 2017

Video block and FABEMD features for an effective and fast method of reporting near-duplicate and mirroring videos

Article Open access 23 October 2021

Content-Based Video Retrieval Using Dominant Color and Shape Feature

Notes

Percent change: $\mathbf{[~|Difference|/Reference Value] \times 100}$.
Average of the direct difference between both values across the first 10 ranks.
Per cent change: $\mathbf{[~|Difference|/Reference Value] \times 100}$.

References

YouTube statistics (2015). http://www.youtube.com/yt/press/statistics.html
Abbass, A., Youssif, A., Ghalwash, A.: Compressed domain video fingerprinting technique using the singular value decomposition. In: Proceedings of Applied Informatics and Computing Theory (2012)
Abbass, A.S., Youssif, A.A., Ghalwash, A.Z.: Hybrid-based compressed domain video fingerprinting technique. Comput. Inf. Sci. 5(5), 25 (2012)
Google Scholar
Aihara, K., Aoki, T.: Motion dense sampling and component clustering for action recognition. Multimed. Tools Appl. 74(16), 6303–6321 (2015)
Google Scholar
Almeida, J., Leite, N.J., Torres, R.da.S.: Comparison of video sequences with histograms of motion patterns. In: IEEE International Conference on Image Processing, pp. 3673–3676 (2011)
Altadmri, A., Ahmed, A.: A framework for automatic semantic video annotation. Multimed. Tools Appl. 72(2), 1167–1191 (2014)
Google Scholar
Arlinghaus, S.: Practical handbook of curve fitting. CRC Press, Boca Raton (1994)
MATH Google Scholar
Attneave, F.: Dimensions of similarity. Am. J. Psychol. 53, 516–556 (1950)
Google Scholar
Avula, S.K., Deshmukh, S.C.: Frame based video retrieval using video signatures. Int. J. Comput. Appl. 59(10), 35–40 (2012)
Google Scholar
Ballas, N., Delezoide, B., Prêteux, F.: Trajectory signature for action recognition in video. In: Proceedings of the 20th ACM international conference on Multimedia, pp. 1429–1432. ACM (2012)
Basharat, A., Zhai, Y., Shah, M.: Content based video matching using spatiotemporal volumes. Comput. Vis. Image Underst. 110(3), 360–377 (2008)
Google Scholar
Bekhet, S., Ahmed, A.: Compact signature-based compressed video matching using dominant color profiles (dcp). In: International Conference on Pattern Recognition ICPR, pp. 3933–3938 (2014)
Bekhet, S., Ahmed, A., Altadmri, A., Hunter, A.: Compressed video matching: Frame-to-frame revisited. Multimed. Tools Appl. (2015). doi:10.1007/s11042-015-2887-8
Article Google Scholar
Bekhet, S., Ahmed, A., Hunter, A.: Video matching using dc-image and local features. Lect. Notes Eng. Comput. Sci. 3, 2209–2214 (2013)
Google Scholar
Chattopadhyay, C., Das, S.: Use of trajectory and spatiotemporal features for retrieval of videos with a prominent moving foreground object. Signal Image Video Process. 10(2), 319–326 (2016)
Google Scholar
Chen, L.H., Chin, K.H., Liao, H.Y.M.: Integration of color and motion features for video retrieval. Int. J. Pattern Recognit. Artif. Intell. 23(02), 313–329 (2009)
Google Scholar
Cheung, S.C.S., Zakhor, A.: Efficient video similarity measurement with video signature. IEEE Trans. Circuits Syst. Video Technol. 13(1), 59–74 (2003)
Google Scholar
DeMenthon, D., Doermann, D.: Video retrieval using spatio-temporal descriptors. In: Proceedings of the eleventh ACM international conference on Multimedia, pp. 508–517. ACM (2003)
Deng, Y., Manjunath, B., Kenney, C., Moore, M.S., Shin, H.: An efficient color representation for image retrieval. IEEE Trans. Image Process. 10(1), 140–147 (2001)
MATH Google Scholar
Dimitrova, N., Golshani, F.: Motion recovery for video content classification. ACM Trans. Inf. Syst. (TOIS) 13(4), 408–439 (1995)
Google Scholar
Donahue, J., Anne Hendricks, L., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., Darrell, T.: Long-term recurrent convolutional networks for visual recognition and description. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2625–2634 (2015)
Droueche, Z., Lamard, M., Cazuguel, G., Quellec, G., Roux, C., Cochener, B.: Content-based medical video retrieval based on region motion trajectories. In: 5th European Conference of the International Federation for Medical and Biological Engineering, pp. 622–625. Springer (2012)
Faloutsos, C., Barber, R., Flickner, M., Hafner, J., Niblack, W., Petkovic, D., Equitz, W.: Efficient and effective querying by image content. J. Intell. Inf. Syst. 3(3–4), 231–262 (1994)
Google Scholar
Fang, Y., Lin, W., Chen, Z., Tsai, C.M., Lin, C.W.: A video saliency detection model in compressed domain. IEEE Trans. Circuits Syst. Video Technol. 24(1), 27–38 (2014)
Google Scholar
Farag, W.E., Abdel-Wahab, H.: A human-based technique for measuring video data similarity. In: Proceedings Eighth IEEE International Symposium on Computers and Communication, 2003.(ISCC 2003), pp. 769–774. IEEE (2003)
Flickner, M., Sawhney, H., Niblack, W., Ashley, J., Huang, Q., Dom, B., Gorkani, M., Hafner, J., Lee, D., Petkovic, D., et al.: Query by image and video content: The QBIC system. Computer 28(9), 23–32 (1995)
Google Scholar
Gao, H.P., Yang, Z.Q.: Content based video retrieval using spatiotemporal salient objects. In: International Symposium on Intelligence Information Processing and Trusted Computing (IPTC), pp. 689–692 (2010)
Guest, P.G., Guest, P.G.: Numerical methods of curve fitting. Cambridge University Press, Cambridge (2012)
MATH Google Scholar
Harris, C., Stephens, M.: A combined corner and edge detector. In: Alvey vision conference, vol. 15, p. 50. Citeseer (1988)
Jiang, Y.G., Bhattacharya, S., Chang, S.F., Shah, M.: High-level event recognition in unconstrained videos. Int. J. Multimed. Inf. Retr. 2(2), 73–101 (2013)
Google Scholar
Kamila, N.K.: Handbook of Research on Emerging Perspectives in Intelligent Pattern Recognition, Analysis, and Image Processing. IGI Global (2015)
Kanade, S.S., Patil, P.: Dominant color based extraction of key frames for sports video summarization. Int. J. Adv. Eng. Technol. 6(1), 504–512 (2013)
Google Scholar
Kantorov, V., Laptev, I.: Efficient feature extraction, encoding, and classification for action recognition. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2593–2600. IEEE (2014)
Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R., Fei-Fei, L.: Large-scale video classification with convolutional neural networks. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1725–1732. IEEE (2014)
Kekre, H., Mishra, D., Rege, M.P.: Survey on recent techniques in content based video retrieval. Int. J. Eng. Tech. Res. (IJETR) 3(5), 69–73 (2015)
Google Scholar
Kiranyaz, S., Uhlmann, S., Gabbouj, M.: Dominant color extraction based on dynamic clustering by multi-dimensional particle swarm optimization. In: Seventh International Workshop on Content-Based Multimedia Indexing, 2009. CBMI’09, pp. 181–188. IEEE (2009)
Kliper-Gross, O., Gurovich, Y., Hassner, T., Wolf, L.: Motion interchange patterns for action recognition in unconstrained videos. In: Computer Vision–ECCV 2012, pp. 256–269. Springer (2012)
Kuehne, H., Jhuang, H., Garrote, E., Poggio, T., Serre, T.: Hmdb: a large video database for human motion recognition. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 2556–2563. IEEE (2011)
Li, L., Huang, W., Gu, I.H., Luo, R., Tian, Q.: An efficient sequential approach to tracking multiple objects through crowds for real-time intelligent CCTV systems. IEEE Trans. Syst. Man Cybern. Part B Cybern. 38(5), 1254–1269 (2008)
Google Scholar
Li, N., Cheng, X., Zhang, S., Wu, Z.: Realistic human action recognition by fast hog3d and self-organization feature map. Mach. Vis. Appl. 25(7), 1793–1812 (2014)
Google Scholar
Lichtsteiner, P., Posch, C., Delbruck, T.: A 128$\times$ 128 120 db 15 $\mu$s latency asynchronous temporal contrast vision sensor. IEEE J. Solid-State Circuits 43(2), 566–576 (2008)
Google Scholar
Lienhart, R.W., Effelsberg, W., Jain, R.C.: Visualgrep: A systematic method to compare and retrieve video sequences. In: Photonics West’98 Electronic Imaging, pp. 271–282. International Society for Optics and Photonics (1997)
Lin, T., Ngo, C.W., Zhang, H.J., Shi, Q.Y.: Integrating color and spatial features for content-based video retrieval. In: Image Processing, 2001. Proceedings. 2001 International Conference on, vol. 3, pp. 592–595. IEEE (2001)
Lin, T., Zhang, H.J.: Automatic video scene extraction by shot grouping. In: Pattern Recognition, 2000. Proceedings. 15th International Conference on, vol. 4, pp. 39–42. IEEE (2000)
Liu, H., Sun, M.T., Wu, R.C., Yu, S.S.: Automatic video activity detection using compressed domain motion trajectories for H. 264 videos. J. Vis. Commun. Image Represent. 22(5), 432–439 (2011)
Google Scholar
Liu, J., Luo, J., Shah, M.: Recognizing realistic actions from videos in the wild. In: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pp. 1996–2003. IEEE (2009)
Liu, T.J., Han, H.J., Xin, X., Li, Z., Katsaggelos, A.K.: A robust and lightweight feature system for video fingerprinting. In: 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO), pp. 160–164. IEEE (2012)
Liu, X., Zhuang, Y., Pan, Y.: A new approach to retrieve video by example video clip. In: Proceedings of the seventh ACM international conference on Multimedia (Part 2), pp. 41–44. ACM (1999)
Lu, B., Cao, H., Cao, Z.: An efficient method for video similarity search with video signature. In: 2010 International Conference on Computational and Information Sciences (ICCIS), pp. 713–716. IEEE (2010)
Manjunath, B.S., Salembier, P., Sikora, T.: Introduction to MPEG-7: multimedia content description interface, vol. 1. Wiley, New York (2002)
Google Scholar
Manning, C.D., Raghavan, P., Schütze, H., et al.: Introduction to information retrieval, vol. 1. Cambridge University Press, Cambridge (2008)
MATH Google Scholar
Mohan, R.: Video sequence matching. In: Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on, vol. 6, pp. 3697–3700. IEEE (1998)
Mojsilovic, A., Hu, J., Soljanin, E.: Extraction of perceptually important colors and similarity measurement for image matching, retrieval and analysis. IEEE Trans. Image Process. 11(11), 1238–1248 (2002)
MathSciNet Google Scholar
Naphade, M.R., Yeung, M.M., Yeo, B.L.: Novel scheme for fast and efficient video sequence matching using compact signatures. In: Electronic Imaging, pp. 564–572. International Society for Optics and Photonics (1999)
Pacharaney, U.S., Salankar, P.S., Mandalapu, S.: Dimensionality reduction for fast and accurate video search and retrieval in a large scale database. In: 2013 Nirma University International Conference on Engineering (NUiCONE), pp. 1–9. IEEE (2013)
Panchal, P., Merchant, S.: Performance evaluation of fade and dissolve transition shot boundary detection in presence of motion in video. In: Emerging Technology Trends in Electronics, Communication and Networking (ET2ECN), 2012 1st International Conference on, pp. 1–6. IEEE (2012)
Patel, B., Meshram, B.: Content based video retrieval systems. Int. J. UbiComp (IJU) 3(2) (2012)
Google Scholar
Peng, X., Qiao, Y., Peng, Q., Qi, X.: Exploring motion boundary based sampling and spatial-temporal context descriptors for action recognition. In: British Machine Vision Conference (BMVC) (2013)
Poppe, C., De Bruyne, S., Paridaens, T., Lambert, P., Van de Walle, R.: Moving object detection in the H. 264/AVC compressed domain for video surveillance applications. J. Vis. Commun. Image Represent. 20(6), 428–437 (2009)
Google Scholar
Reddy, K.K., Shah, M.: Recognizing 50 human action categories of web videos. Mach. Vis. Appl. 24(5), 971–981 (2013)
Google Scholar
Rodriguez, M.D., Ahmed, J., Shah, M.: Action mach a spatio-temporal maximum average correlation height filter for action recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, 2008. CVPR 2008., pp. 1–8 (2008)
Rogowitz, B.E., Frese, T., Smith, J.R., Bouman, C.A., Kalin, E.B.: Perceptual image similarity experiments. In: Photonics West’98 Electronic Imaging, pp. 576–590. International Society for Optics and Photonics (1998)
Sabitha, M., Hariharan, R.: Hybrid approach for image search reranking. Int. J. Sci. Res. (IJSR) 2, 123–128 (2013)
Google Scholar
Sadanand, S., Corso, J.J.: Action bank: A high-level representation of activity in video. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1234–1241. IEEE (2012)
Serre, T., Wolf, L., Bileschi, S., Riesenhuber, M., Poggio, T.: Robust object recognition with cortex-like mechanisms. IEEE Trans. Pattern Anal. Mach. Intell. 29(3), 411–426 (2007)
Google Scholar
Shao, H., Wu, Y., Cui, W., Zhang, J.: Image retrieval based on mpeg-7 dominant color descriptor. In: Young Computer Scientists, 2008. ICYCS 2008. The 9th International Conference for, pp. 753–757. IEEE (2008)
Shao, J., Shen, H.T., Zhou, X.: Challenges and techniques for effective and efficient similarity search in large video databases. Proceedings of the VLDB Endowment 1(2), 1598–1603 (2008)
Google Scholar
Shechtman, E., Irani, M.: Matching local self-similarities across images and videos. In: IEEE Conference on Computer Vision and Pattern Recognition, 2007. CVPR’07. pp. 1–8. IEEE (2007)
Shi, F., Petriu, E., Laganiere, R.: Sampling strategies for real-time action recognition. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2595–2602. IEEE (2013)
Snoek, C.G., Huurnink, B., Hollink, L., De Rijke, M., Schreiber, G., Worring, M.: Adding semantics to detectors for video retrieval. IEEE Trans. Multimed. 9(5), 975–986 (2007)
Google Scholar
Solmaz, B., Assari, S.M., Shah, M.: Classifying web videos using a global video descriptor. Mach. Vis. Appl. 24(7), 1473–1485 (2013)
Google Scholar
Soomro, K., Zamir, A.R., Shah, M.: Ucf101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402 (2012)
Su, C.W., Liao, H.Y., Tyan, H.R., Lin, C.W., Chen, D.Y., Fan, K.C.: Motion flow-based video retrieval. IEEE Trans. Multimed. 9(6), 1193–1201 (2007)
Google Scholar
Sun, J., Mu, Y., Yan, S., Cheong, L.F.: Activity recognition using dense long-duration trajectories. In: 2010 IEEE International Conference on Multimedia and Expo (ICME), pp. 322–327. IEEE (2010)
Thepade, S.D., Yadav, N.B.: Assessment of similarity measurement criteria in thepade’s sorted ternary block truncation coding (tstbtc) for content based video retrieval. In: 2015 International Conference on Communication, Information and Computing Technology (ICCICT), pp. 1–6. IEEE (2015)
Thorpe, S., Fize, D., Marlot, C., et al.: Speed of processing in the human visual system. Nature 381(6582), 520–522 (1996)
Google Scholar
TrecVid(2011): Trec video retrival task, bbc ruch (1-02-2011). www.nplpir.nist.gov/projects/trecvid
UCF: Ucf sports action dataset (2016). http://crcv.ucf.edu/data/UCF_Sports_Action.php. Retrieved (11-02-2016)
Uijlings, J., Duta, I., Sangineto, E., Sebe, N.: Video classification with densely extracted hog/hof/mbh features: an evaluation of the accuracy/computational efficiency trade-off. Int. J. Multimed. Inf. Retr. 4(1), 33–44 (2015)
Google Scholar
Wang, H., Kläser, A., Schmid, C., Liu, C.L.: Dense trajectories and motion boundary descriptors for action recognition. Int. J Comput. Vis. 103(1), 60–79 (2013)
MathSciNet Google Scholar
Wang, H., Schmid, C.: Action recognition with improved trajectories. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3551–3558 (2013)
Wang, L., Qiao, Y., Tang, X.: Motionlets: Mid-level 3d parts for human motion recognition. In: 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2674–2681. IEEE (2013)
Watson, A.B.: Image compression using the discrete cosine transform. Math. J. 4(1), 81 (1994)
MathSciNet Google Scholar
Wu, Y., Zhuang, Y., Pan, Y.: Content-based video similarity model. In: Proceedings of the eighth ACM international conference on multimedia, pp. 465–467. ACM (2000)
Xu, P., Xie, L., Chang, S.F., Divakaran, A., Vetro, A., Sun, H.: Algorithms and system for segmentation and structure analysis in soccer video. In: Proceedings of ICME, vol. 1, pp. 928–931. Citeseer (2001)
Zhang, H., Smoliar, S.W.: Developing power tools for video indexing and retrieval. In: IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology, pp. 140–149. International Society for Optics and Photonics (1994)
Zhang, Z., Yuan, F.: Compressed video copy detection based on texture analysis. In: 2010 IEEE International Conference on Wireless Communications, Networking and Information Security (WCNIS), pp. 612–615. IEEE (2010)
Zhang, Z., Zou, J.: Compressed video copy detection based on edge analysis. In: The 2010 IEEE International Conference on Information and Automation, pp. 2497–2501 (2010)
Zhao, Z., Cui, B., Cong, G., Huang, Z., Shen, H.T.: Extracting representative motion flows for effective video retrieval. Multimed. Tools Appl. 58(3), 687–711 (2012)
Google Scholar
Zhu, X., Elmagarmid, A.K., Xue, X., Wu, L., Catlin, A.C.: Insightvideo: toward hierarchical video content organization for efficient browsing, summarization and retrieval. IEEE Trans. Multimed. 7(4), 648–666 (2005)
Google Scholar

Download references

Author information

Authors and Affiliations

South Valley University, Qena, Egypt
Saddam Bekhet
University of Lincoln, Lincoln, UK
Amr Ahmed

Authors

Saddam Bekhet
View author publications
You can also search for this author inPubMed Google Scholar
Amr Ahmed
View author publications
You can also search for this author inPubMed Google Scholar

Corresponding author

Correspondence to Saddam Bekhet.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Bekhet, S., Ahmed, A. Video similarity detection using fixed-length Statistical Dominant Colour Profile (SDCP) signatures. J Real-Time Image Proc 16, 1999–2014 (2019). https://doi.org/10.1007/s11554-017-0700-9

Download citation

Received: 06 December 2016
Accepted: 16 June 2017
Published: 28 June 2017
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11554-017-0700-9

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Video similarity detection using fixed-length Statistical Dominant Colour Profile (SDCP) signatures

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Graph-based video sequence matching using dominant colour graph profile (DCGP)

Video block and FABEMD features for an effective and fast method of reporting near-duplicate and mirroring videos

Content-Based Video Retrieval Using Dominant Color and Shape Feature

Notes

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now