Abstract
Popular content on video sharing websites (e.g., YouTube) is usually replicated via identical copies or near-duplicates. These duplicates are typically studied because they pose a threat to site owners in terms of wasted disk space or privacy infringements, and because this content may degrade the user experience on these websites. The research presented in this article centers on the argument that there is no agreement on the technical definition of what these near-duplicates are, and, more importantly, that there is no strong evidence that users of video sharing websites want this content removed. Most scholars define near-duplicate video clips (NDVC) by means of non-semantic features (e.g., different image/audio quality), while a few also include semantic features (i.e., different videos of similar content). However, it is unclear which features contribute to the human perception of near-duplicate videos. The findings of four large-scale online surveys carried out in the context of our research confirm the relevance of both types of features. Some of our findings confirm the adopted definitions of NDVC, whereas others are surprising: near-duplicate videos with different image quality, different audio quality, or with/without overlays were perceived as NDVC, but the same could not be verified when videos differed by more than one of these features at a time. With respect to semantics, the exact role it plays in relation to the features that make videos alike remains unclear. From a user's perspective, participants in most cases preferred to see only one of the NDVC in the results of a video search query, and they were more tolerant of changes in the audio track than in the video track. Based on all these findings, we propose a new user-centric NDVC definition and present implications for how duplicate content should be dealt with by video sharing websites.
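To make the distinction concrete, the non-semantic features discussed in the abstract are the kind that automated NDVC detectors typically target: a re-encoded or quality-degraded copy should still match the original. The sketch below (purely illustrative, not the method of this article or of any cited system) compares videos via a simple per-frame brightness signature, which is unaffected by uniform quality shifts but differs for unrelated content.

```python
# Illustrative sketch of non-semantic near-duplicate matching.
# All names and thresholds here are hypothetical, chosen for the example.

def frame_signature(frame):
    """Reduce a frame (list of grayscale pixel values, 0-255) to a bit
    vector: each bit records whether a pixel exceeds the frame's mean."""
    mean = sum(frame) / len(frame)
    return tuple(1 if p > mean else 0 for p in frame)

def hamming(a, b):
    """Number of differing bits between two signatures."""
    return sum(x != y for x, y in zip(a, b))

def near_duplicate(frames_a, frames_b, max_bit_diff=4):
    """Flag two equally sampled videos as near-duplicates if every
    aligned frame pair differs in at most max_bit_diff signature bits."""
    return all(
        hamming(frame_signature(fa), frame_signature(fb)) <= max_bit_diff
        for fa, fb in zip(frames_a, frames_b)
    )

# A lossy copy: same content, every pixel shifted by compression noise.
original = [[10, 200, 30, 220, 15, 210, 25, 230]]
lossy_copy = [[p + 5 for p in original[0]]]
unrelated = [[200, 10, 220, 30, 210, 15, 230, 25]]

print(near_duplicate(original, lossy_copy))  # True
print(near_duplicate(original, unrelated))   # False
```

Because the signature is computed relative to each frame's own mean, a uniform brightness or quality change leaves it intact, which is exactly the robustness to non-semantic variation that the abstract's NDVC definitions assume. Semantic similarity (different footage of the same event) would defeat this kind of matcher entirely, which is one reason the article argues the two feature types must be studied separately.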
Supplemental Material
Available for Download
Online appendix to "Looking at Near-Duplicate Videos from a Human-Centric Perspective" (Article 15).