
Online web video topic detection and tracking with semi-supervised learning

  • Special Issue Paper
  • Multimedia Systems

Abstract

With the pervasiveness of online social media and the rapid growth of web data, a large amount of multimedia data is available online. However, how to organize these data to improve users’ experience and support government supervision remains a problem that has yet to be seriously investigated. Topic detection and tracking, which has been an active research topic for decades, can cluster web videos into topics according to their semantic content. However, how to discover and track topics online from web videos and images has not been fully discussed. In this paper, we formulate topic detection and tracking as an online tracking, detection and learning problem. First, by learning from historical data, including labeled data and plenty of unlabeled data, with a semi-supervised multi-class multi-feature method, we obtain a topic tracker that can also discover novel topics in new stream data. Second, when new data arrive, an online updating method makes the topic tracker adapt to the evolution of the stream data. We conduct experiments on a public dataset to evaluate the proposed method, and the results demonstrate its effectiveness for topic detection and tracking.
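The track, detect and update loop described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the model, so this is not the paper's method: a nearest-centroid tracker with cosine similarity stands in for the semi-supervised multi-class multi-feature learner, and it uses only labeled seed data. The class name `OnlineTopicTracker`, the parameters `novelty_threshold` and `learning_rate`, and the integer topic labels are all assumptions made for illustration.

```python
import numpy as np


class OnlineTopicTracker:
    """Toy nearest-centroid tracker (hypothetical stand-in, not the paper's model)."""

    def __init__(self, novelty_threshold=0.5, learning_rate=0.1):
        self.novelty_threshold = novelty_threshold  # assumed cosine cut-off
        self.learning_rate = learning_rate          # assumed update step size
        self.centroids = {}                         # topic id -> unit-norm centroid
        self._next_id = 0

    @staticmethod
    def _unit(v):
        # Scale a vector to unit length so dot products are cosine similarities.
        v = np.asarray(v, dtype=float)
        norm = np.linalg.norm(v)
        return v / norm if norm > 0 else v

    def fit_initial(self, features, labels):
        """Seed one centroid per labeled topic from historical data."""
        features = np.asarray(features, dtype=float)
        labels = np.asarray(labels)
        for label in np.unique(labels):
            mean = features[labels == label].mean(axis=0)
            self.centroids[int(label)] = self._unit(mean)
        self._next_id = max(self.centroids) + 1 if self.centroids else 0

    def update(self, x):
        """Process one stream item; return (topic_id, is_novel)."""
        x = self._unit(x)
        if self.centroids:
            best = max(self.centroids, key=lambda t: float(self.centroids[t] @ x))
            if float(self.centroids[best] @ x) >= self.novelty_threshold:
                # Tracking: move the matched centroid toward the new item.
                moved = (1 - self.learning_rate) * self.centroids[best] \
                    + self.learning_rate * x
                self.centroids[best] = self._unit(moved)
                return best, False
        # Detection: no existing topic is similar enough, so open a new one.
        new_id = self._next_id
        self.centroids[new_id] = x
        self._next_id += 1
        return new_id, True
```

In this sketch, an item that is close to a known topic is assigned to it and nudges that topic's centroid, while an item below the similarity threshold for every topic starts a new topic. How the threshold and step size should be chosen depends on the features and is left open here.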



Acknowledgments

This work was supported by the China Postdoctoral Science Foundation (2012M520436) and in part by the National Basic Research Program of China (973 Program, 2012CB316400), the National Natural Science Foundation of China (61303153, 61025011, 61332016, 61322212, 61202234 and 61202322), and the Present Foundation of UCAS.

Author information

Corresponding authors

Correspondence to Guorong Li, Weigang Zhang or Qingming Huang.

Cite this article

Li, G., Jiang, S., Zhang, W. et al. Online web video topic detection and tracking with semi-supervised learning. Multimedia Systems 22, 115–125 (2016). https://doi.org/10.1007/s00530-014-0402-0

  • DOI: https://doi.org/10.1007/s00530-014-0402-0
