Abstract
This paper introduces a novel content-based video copy detection method based on deep CNN features. An efficient deep CNN feature encodes the image content while retaining its discriminative capability. Because Euclidean distance between deep CNN features is extremely fast to compute, we propose a keyframe-based copy retrieval method that exhaustively searches a large keyframe database for copy candidates without building an index. A graph-based sequence matching algorithm then assembles the candidates into copy clips and accurately locates the copied video segments. Experimental evaluation shows the efficacy of the proposed deep CNN features and the effectiveness of the overall approach.
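The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature dimension, the distance threshold, or how the matching graph is built, so the function names `retrieve_candidates` and `longest_consistent_path` and the parameters `max_dist` and `max_gap` are hypothetical. The second function uses a generic longest-path search over temporally consistent matches as a stand-in for the graph-based sequence matching step.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def retrieve_candidates(query_feats, db_feats, max_dist):
    """Exhaustively compare every query keyframe with every database keyframe.

    No index is built. Returns (query_idx, db_idx, distance) for every pair
    whose Euclidean distance is at most max_dist (an assumed threshold).
    """
    matches = []
    for qi, q in enumerate(query_feats):
        for di, d in enumerate(db_feats):
            gap = dist(q, d)
            if gap <= max_dist:
                matches.append((qi, di, gap))
    return matches


def longest_consistent_path(matches, max_gap):
    """Return the longest chain of matches that advances in both videos.

    Each match is a graph node. An edge joins two matches when the query index
    and the database index both increase by at most max_gap (assumed rule).
    The graph is acyclic, so dynamic programming finds the longest path.
    """
    nodes = sorted(matches)  # ordered by query index, then database index
    if not nodes:
        return []
    best_len = [1] * len(nodes)
    prev = [-1] * len(nodes)
    for j, (qj, dj, _) in enumerate(nodes):
        for i in range(j):
            qi, di, _ = nodes[i]
            if 0 < qj - qi <= max_gap and 0 < dj - di <= max_gap:
                if best_len[i] + 1 > best_len[j]:
                    best_len[j] = best_len[i] + 1
                    prev[j] = i
    # Walk back from the node that ends the longest chain.
    end = max(range(len(nodes)), key=best_len.__getitem__)
    path = []
    while end != -1:
        path.append(nodes[end][:2])
        end = prev[end]
    return path[::-1]


# Toy 2-D "features"; real CNN features would have far more dimensions.
db = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (30.0, 0.0), (40.0, 0.0), (20.0, 0.0)]
query = [(20.1, 0.0), (30.1, 0.0), (39.9, 0.0)]

candidates = retrieve_candidates(query, db, max_dist=0.5)
segment = longest_consistent_path(candidates, max_gap=2)
```

Here `segment` lists the aligned (query keyframe, database keyframe) index pairs, so its first and last entries bound the located copy segment. The pure-Python loops keep the sketch dependency-free; on a large keyframe database the distance computation would normally be vectorized.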
Acknowledgements
I would like to thank Jun Lei for helpful discussions and encouragement. This work was supported by the National Natural Science Foundation of China under Contract Nos. 61571453 and 61202336, and by the Natural Science Foundation of Hunan Province under Contract No. 14JJ3010.
Cite this article
Zhang, X., Xie, Y., Luan, X. et al. Video Copy Detection Based on Deep CNN Features and Graph-Based Sequence Matching. Wireless Pers Commun 103, 401–416 (2018). https://doi.org/10.1007/s11277-018-5450-x
DOI: https://doi.org/10.1007/s11277-018-5450-x