Abstract
In this paper, we propose a concept-based multimodal learning model (CMLM) that generates document topics by jointly modeling textual and visual data. The model takes into account cross-modal concept similarity and the concepts of unlabeled images, and it can process documents in which one modality is missing. It extracts semantic concepts from unlabeled images and combines them with the text modality to generate document topics. Our comparison experiments on news document topic generation show that, in a multimodal scenario, CMLM generates topics that are more representative of a given document than those produced by latent Dirichlet allocation (LDA).
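The idea of combining text and image concepts while tolerating a missing modality can be illustrated with a minimal sketch. This is not the CMLM algorithm, which the abstract does not specify; it is a simple weighted late fusion chosen for illustration. The function names `normalize`, `fuse_concepts`, and `top_topics`, the weight `alpha`, and the example concept scores are all hypothetical.

```python
def normalize(scores):
    """Scale concept weights so they sum to 1 (empty input stays empty)."""
    total = sum(scores.values())
    return {c: w / total for c, w in scores.items()} if total else {}


def fuse_concepts(text_scores, image_scores, alpha=0.5):
    """Combine per-concept weights from two modalities.

    Hypothetical illustration only: a weighted average, not the paper's model.
    Either argument may be None or empty when that modality is missing, in
    which case the other modality is used on its own.
    """
    if not text_scores and not image_scores:
        return {}
    if not image_scores:
        return normalize(text_scores)
    if not text_scores:
        return normalize(image_scores)
    t, i = normalize(text_scores), normalize(image_scores)
    return {c: alpha * t.get(c, 0.0) + (1 - alpha) * i.get(c, 0.0)
            for c in set(t) | set(i)}


def top_topics(scores, k=3):
    """Return the k highest-weighted concepts, ties broken alphabetically."""
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]


# Made-up concept scores for a news document with and without an image.
text_only = fuse_concepts({"election": 2, "economy": 2}, None)
both = fuse_concepts({"election": 2, "economy": 2}, {"election": 1})
print(top_topics(text_only), top_topics(both))
```

In this sketch a document without images falls back to its text concepts unchanged, while a concept supported by both modalities is ranked higher than one supported by text alone.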
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, C., Yang, H., Che, X., Meinel, C. (2015). Concept-Based Multimodal Learning for Topic Generation. In: He, X., Luo, S., Tao, D., Xu, C., Yang, J., Hasan, M.A. (eds) MultiMedia Modeling. MMM 2015. Lecture Notes in Computer Science, vol 8935. Springer, Cham. https://doi.org/10.1007/978-3-319-14445-0_33
DOI: https://doi.org/10.1007/978-3-319-14445-0_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14444-3
Online ISBN: 978-3-319-14445-0
eBook Packages: Computer Science (R0)