research-article

TSCSet: A Crowdsourced Time-Sync Comment Dataset for Exploration of User Experience Improvement

Authors:
Zhenyu Liao, Tongji University, Shanghai, China
Yikun Xian, Rutgers University, New Brunswick, NJ, USA
Xiao Yang, Tongji University, Shanghai, China
Qinpei Zhao, Tongji University, Shanghai, China
Chenxi Zhang, Tongji University, Shanghai, China
Jiangfeng Li, Tongji University, Shanghai, China

IUI '18: Proceedings of the 23rd International Conference on Intelligent User Interfaces, March 2018, Pages 641–652, https://doi.org/10.1145/3172944.3172966

Published: 05 March 2018
ABSTRACT

A Time-Sync Comment (TSC) is a type of crowdsourced user review embedded in online video websites that provides better real-time user interaction than traditional user comments. Various TSC-related problems and approaches have been studied to improve user experience by taking advantage of special characteristics of TSCs, such as their strong time reliance. However, prior TSC research has three major drawbacks. First, it did not explicitly show the advantage of TSC features over traditional features in terms of user experience. Second, the experiments were conducted on inconsistent TSC datasets crawled from different sources, which makes the effectiveness of the methods less convincing. Third, the methods were manually evaluated by a limited number of so-called "experts", so it is hard for other researchers to obtain the data labels and reproduce the results. To overcome these drawbacks, this paper explores the usefulness of TSC data for improving online user experience by exploiting the TSC patterns inside a new dataset. Specifically, we present a larger-scale TSC dataset with a four-level structure and rich self-labeled attributes, and formally define a group of TSC-related research problems based on this dataset. The problems are solved by adapted state-of-the-art methods and evaluated through the crowdsourced labels in the dataset. The results can be regarded as a baseline for further research.

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
        Supervised learning by classification
    2. Machine learning approaches
      1. Learning latent representations
2. Information systems
  1. World Wide Web
    1. Web applications
      1. Crowdsourcing

Published in
IUI '18: Proceedings of the 23rd International Conference on Intelligent User Interfaces
March 2018
698 pages
ISBN: 9781450349451
DOI: 10.1145/3172944
General Chairs:
Shlomo Berkovsky, CSIRO, Australia
Yoshinori Hijikata, Kwansei Gakuin University, Japan
Jun Rekimoto, University of Tokyo, Japan
Program Chairs:
Margaret Burnett, Oregon State University, USA
Mark Billinghurst, University of South Australia, Australia
Aaron Quigley, University of St Andrews, UK
Copyright © 2018 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
crowdsourced time-sync comment
episode representation learning
hierarchical structured dataset
storyline prediction
Qualifiers
- research-article
Conference

Acceptance Rates
IUI '18 Paper Acceptance Rate: 43 of 299 submissions, 14%
Overall Acceptance Rate: 746 of 2,811 submissions, 27%