DOI: 10.1145/3172944.3172966
research-article

TSCSet: A Crowdsourced Time-Sync Comment Dataset for Exploration of User Experience Improvement

Published: 05 March 2018

ABSTRACT

Time-Sync Comment (TSC) is a type of crowdsourced user review embedded in online video websites that provides better real-time user interaction than traditional user comments. Various TSC-related problems and approaches have been studied to improve user experience by taking advantage of special characteristics of TSCs, such as their strong time reliance. However, existing TSC research has three major drawbacks. First, it does not explicitly show the advantage of TSC features over traditional features in terms of user experience. Second, experiments were conducted on inconsistent TSC datasets crawled from different sources, which makes the effectiveness of the methods less convincing. Third, the methods were manually evaluated by a limited number of so-called "experts", so it is hard for other researchers to obtain the data labels and reproduce the results. To overcome these drawbacks, this paper explores the usefulness of TSC data for improving online user experience by exploiting the TSC patterns inside a new dataset. Specifically, we present a larger-scale TSC dataset with a four-level structure and rich self-labeled attributes, and we formally define a group of TSC-related research problems based on this dataset. The problems are solved with adapted state-of-the-art methods and evaluated using crowdsourced labels in the dataset. The results can be regarded as a baseline for further research.

          Published in

          IUI '18: Proceedings of the 23rd International Conference on Intelligent User Interfaces
          March 2018
          698 pages
          ISBN: 9781450349451
          DOI: 10.1145/3172944

          Copyright © 2018 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          IUI '18 Paper Acceptance Rate: 43 of 299 submissions, 14%
          Overall Acceptance Rate: 746 of 2,811 submissions, 27%
