Abstract
In this paper, we propose a novel multi-modal max-margin supervised topic model (MMSTM) for social event analysis that jointly learns the event representation and the classifier in a unified framework. Compared with existing methods, MMSTM has several advantages. (1) It uses the max-margin classifier as a regularization term, so that the parameters of the generative model and of the classifier are learned jointly; Gibbs sampling is used to learn both sets of parameters by minimizing the expected loss function. (2) It effectively mines the multi-modal property by jointly learning the latent topic relevance among multiple modalities for social event representation, and it exploits supervised information through a discriminative max-margin classifier to boost event classification performance. (3) To validate the model, we collect a large-scale real-world dataset for social event analysis; both qualitative and quantitative evaluation results demonstrate the effectiveness of the proposed MMSTM.
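The idea of using a max-margin classifier as a regularizer on topic representations can be illustrated with a minimal sketch. This is not the MMSTM formulation itself, which the abstract does not specify. It assumes a binary event label in {-1, +1}, a single set of topics shared by text and visual words, and a document feature vector given by the empirical topic frequencies. The function names `topic_proportions` and `hinge_loss`, the weight vector `eta`, and the sample topic assignments are all hypothetical.

```python
from collections import Counter


def topic_proportions(assignments, num_topics):
    """Empirical topic frequencies from a list of per-token topic indices."""
    total = len(assignments)
    if total == 0:
        return [0.0] * num_topics
    counts = Counter(assignments)
    return [counts.get(k, 0) / total for k in range(num_topics)]


def hinge_loss(label, weights, features, margin=1.0):
    """Max-margin loss max(0, margin - y * <w, x>) for a label in {-1, +1}."""
    score = sum(w * x for w, x in zip(weights, features))
    return max(0.0, margin - label * score)


# Hypothetical document: sampled topic assignments for its text words and
# visual words (assumption: both modalities index the same 3 topics).
text_topics = [0, 0, 1, 2]
visual_topics = [0, 1, 1, 1]

# Pool the two modalities into one topic-proportion vector for the document.
zbar = topic_proportions(text_topics + visual_topics, num_topics=3)

# Hypothetical classifier weights; in a joint model these would be learned
# together with the topic assignments rather than fixed in advance.
eta = [2.0, -1.0, 0.5]
loss = hinge_loss(+1, eta, zbar)
print(zbar, loss)
```

In a joint model of this kind, a loss term like this is added to the generative objective, so topic assignments that separate the event classes by a margin are favored during sampling.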
Acknowledgments
This work is supported by the National Key Research and Development Program of China (No. 2017YFB0803301) and by the National Natural Science Foundation of China (Nos. 61772170, 61472115, 61572498, 61532009, 61472379, 61572296).
Cite this article
Xue, F., Wang, J., Qian, S. et al. Multi-modal max-margin supervised topic model for social event analysis. Multimed Tools Appl 78, 141–160 (2019). https://doi.org/10.1007/s11042-017-5605-x
DOI: https://doi.org/10.1007/s11042-017-5605-x