ABSTRACT
Live streaming has become a ubiquitous channel through which people follow new events. Although live streaming videos generally attract large audiences, they are long and contain relatively unexciting stretches of knowledge transmission. This observation has prompted artificial intelligence researchers to develop models that automatically extract highlights from live streaming videos. Most research on streaming highlight extraction has relied on visual analysis of video frames, and few studies have considered the messages posted by the audience. In this paper, we propose a deep learning model that examines the messages posted by streaming audiences. Video segments whose messages reveal audience excitement are extracted to compose the highlights of a streaming video. We evaluate our model on multiple Twitch streaming channels. Our highlight extraction model achieves a precision of 51.3%, outperforming several baseline methods.
Index Terms
- A Deep Learning Model for Extracting Live Streaming Video Highlights using Audience Messages