Concept-Based Multimodal Learning for Topic Generation

Wang, Cheng; Yang, Haojin; Che, Xiaoyin; Meinel, Christoph

doi:10.1007/978-3-319-14445-0_33

Cheng Wang¹⁹,
Haojin Yang¹⁹,
Xiaoyin Che¹⁹ &
…
Christoph Meinel¹⁹

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 8935))

Included in the following conference series:

International Conference on Multimedia Modeling

3725 Accesses
1 Citations

Abstract

In this paper, we propose a concept-based multimodal learning model (CMLM) for generating document topic through modeling textual and visual data. Our model considers cross-modal concept similarity and unlabeled image concept, it is capable of processing document which has modality missing. The model can extract semantic concepts from unlabeled image and combine with text modality to generate document topics. Our comparison experiments on news document topic generation shows, in multimodal scenario, CMLM can generate more representative topics than latent dirichet allocation (LDA) based topic for representing given document.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Blei, D.M., Lafferty, J.D.: A correlated topic model of science. The Annals of Applied Statistics, 17–35 (2007)
Google Scholar
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. The Journal of Machine Learning Research 3, 993–1022 (2003)
MATH Google Scholar
Clinchant, S., Ah-Pine, J., Csurka, G.: Semantic combination of textual and visual information in multimedia retrieval. In: Proceedings of the 1st ACM International Conference on Multimedia Retrieval, p. 44. ACM (2011)
Google Scholar
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: Imagenet: A large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009, pp. 248–255. IEEE (2009)
Google Scholar
Fan, J., Elmagarmid, A.K., Zhu, X., Aref, W.G., Wu, L.: Classview: hierarchical video shot classification, indexing, and accessing. IEEE Transactions on Multimedia 6(1), 70–86 (2004)
Article Google Scholar
Feng, Y., Lapata, M.: Topic models for image annotation and text illustration. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 831–839. Association for Computational Linguistics (2010)
Google Scholar
He, X., Ma, W.-Y., Zhang, H.-J.: Learning an image manifold for retrieval. In: Proceedings of the 12th Annual ACM International Conference on Multimedia, pp. 17–23. ACM (2004)
Google Scholar
Huang, A.: Similarity measures for text document clustering. In: Proceedings of the Sixth New Zealand Computer Science Research Student Conference (NZCSRSC 2008), Christchurch, New Zealand, pp. 49–56 (2008)
Google Scholar
Jia, Y., Salzmann, M., Darrell, T.: Learning cross-modality similarity for multinomial data. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 2407–2414. IEEE (2011)
Google Scholar
Putthividhy, D., Attias, H.T., Nagarajan, S.S.: Topic regression multi-modal latent dirichlet allocation for image annotation. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3408–3415. IEEE (2010)
Google Scholar
Rasiwasia, N., Pereira, J.C., Coviello, E., Doyle, G., Lanckriet, G.R., Levy, R., Vasconcelos, N.: A new approach to cross-modal multimedia retrieval. In: Proceedings of the International Conference on Multimedia, pp. 251–260. ACM (2010)
Google Scholar
Vedaldi, A., Fulkerson, B.: Vlfeat: An open and portable library of computer vision algorithms. In: Proceedings of the International Conference on Multimedia, pp. 1469–1472. ACM (2010)
Google Scholar
Yu, J., Cong, Y., Qin, Z., Wan, T.: Cross-modal topic correlations for multimedia retrieval. In: 2012 21st International Conference on Pattern Recognition (ICPR), pp. 246–249. IEEE (2012)
Google Scholar
Zhai, X., Peng, Y., Xiao, J.: Cross-media retrieval by intra-media and inter-media correlation mining. Multimedia Systems 19(5), 395–406 (2013)
Article Google Scholar

Download references

Author information

Authors and Affiliations

Hasso Plattner Institute, University of Potsdam, Prof.-Dr.-Helmert-Str. 2-3, 14482, Potsdam, Germany
Cheng Wang, Haojin Yang, Xiaoyin Che & Christoph Meinel

Authors

Cheng Wang
View author publications
You can also search for this author in PubMed Google Scholar
Haojin Yang
View author publications
You can also search for this author in PubMed Google Scholar
Xiaoyin Che
View author publications
You can also search for this author in PubMed Google Scholar
Christoph Meinel
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

University of Technology, P.O. Box 123, 2007, Sydney, NSW, Australia
Xiangjian He , Dacheng Tao & Muhammad Abul Hasan , &
University of Newcastle, University Dr, Callaghan, 2308, NSW, Australia
Suhuai Luo
National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 95, Zhongguancun East Road, 100190, Beijing, P.R. China
Changsheng Xu
Shanghai Jitotong University, 800 Dong Chuan Rd, 200240, Shanghai, China
Jie Yang

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Wang, C., Yang, H., Che, X., Meinel, C. (2015). Concept-Based Multimodal Learning for Topic Generation. In: He, X., Luo, S., Tao, D., Xu, C., Yang, J., Hasan, M.A. (eds) MultiMedia Modeling. MMM 2015. Lecture Notes in Computer Science, vol 8935. Springer, Cham. https://doi.org/10.1007/978-3-319-14445-0_33

Download citation

DOI: https://doi.org/10.1007/978-3-319-14445-0_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14444-3
Online ISBN: 978-3-319-14445-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics