Learning from Audience Intelligence: Dynamic Labeled LDA Model for Time-Sync Commented Video Tagging

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11303)

Abstract

With the boom in online video uploading, video tagging has become an important means of video indexing. However, text-based video tagging methods ignore either the genre labels or the temporal differences of videos, which degrades their results. Fortunately, a new type of video, the time-sync commented video, carries a large amount of information contributed by users in their comments, and this information can help with video tagging. In this paper, we propose a supervised dynamic Latent Dirichlet Allocation model that uses the variational topics of time-sync comments to extract both genre labels and keywords as tags. We also conduct experiments on large-scale real-world datasets, and the results demonstrate the effectiveness of our model in both genre label classification and keyword extraction compared with baseline models.

Notes

  1. https://www.imdb.com/title/tt5311514/

  2. https://www.imdb.com/

  3. https://www.viki.com/

  4. http://www.bilibili.com/

  5. These words and topic names were manually translated into English by the authors.

  6. A Chinese internet slang term meaning "laughing".

Acknowledgments

This work is partially supported by the National Key Research and Development Program of China.

Author information

Corresponding author

Correspondence to Cong Xue.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Zeng, Z., Xue, C., Gao, N., Wang, L., Liu, Z. (2018). Learning from Audience Intelligence: Dynamic Labeled LDA Model for Time-Sync Commented Video Tagging. In: Cheng, L., Leung, A., Ozawa, S. (eds) Neural Information Processing. ICONIP 2018. Lecture Notes in Computer Science, vol 11303. Springer, Cham. https://doi.org/10.1007/978-3-030-04182-3_48

  • DOI: https://doi.org/10.1007/978-3-030-04182-3_48

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04181-6

  • Online ISBN: 978-3-030-04182-3

  • eBook Packages: Computer Science (R0)
