Tripartite-Replicated Softmax Model for Document Representations

  • Conference paper
Information Retrieval (CCIR 2017)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 10390))


Abstract

Text mining tasks based on machine learning require inputs to be represented as fixed-length vectors, and effective vectors of words, phrases, sentences and even documents may greatly improve the performance of these tasks. Recently, distributed word representations based on neural networks have been demonstrated powerful in many tasks by encoding abundant semantic and linguistic information. However, it remains a great challenge for document representations because of the complex semantic structures in different documents. To meet the challenge, we propose two novel tripartite graphical models for document representations by incorporating word representations into the Replicated Softmax model, and we name the models as Tripartite-Replicated Softmax model (TRPS) and directed Tripartite-Replicated Softmax model (d-TRPS), respectively. We also introduce some optimization strategies for training the proposed models to learn better document representations. The proposed models can capture linear relationships among words and latent semantic information within documents simultaneously, thus learning both linear and nonlinear document representations. We examine the learned document representations in a document classification task and a document retrieval task. Experimental results show that the learned representations by our models outperform the state-of-the-art models in improving the performance of these two tasks.
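The proposed models extend the Replicated Softmax model of Hinton and Salakhutdinov [9]: an RBM whose visible layer holds a document's word counts, whose hidden-bias term is scaled by document length, and whose hidden activation probabilities serve as the document representation. As a point of reference, a minimal sketch of that base model trained with one-step contrastive divergence (CD-1) [16] is shown below. This is our own illustrative implementation of the standard Replicated Softmax, not the paper's TRPS/d-TRPS code; the class and parameter names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class ReplicatedSoftmax:
    """Replicated Softmax RBM over bag-of-words count vectors."""

    def __init__(self, vocab_size, n_hidden):
        self.W = 0.01 * rng.standard_normal((vocab_size, n_hidden))
        self.a = np.zeros(vocab_size)  # visible (word) biases
        self.b = np.zeros(n_hidden)    # hidden biases

    def cd1_update(self, v, lr=0.01):
        """One CD-1 step on a word-count vector v; returns the hidden
        probabilities, which act as the document representation."""
        D = v.sum()  # document length scales the hidden bias
        ph = sigmoid(v @ self.W + D * self.b)
        h = (rng.random(ph.shape) < ph).astype(float)
        # Reconstruct the document: sample D words from the softmax.
        pv = softmax(self.W @ h + self.a)
        v_neg = rng.multinomial(int(D), pv).astype(float)
        ph_neg = sigmoid(v_neg @ self.W + D * self.b)
        # Gradient estimate: positive phase minus negative phase.
        self.W += lr * (np.outer(v, ph) - np.outer(v_neg, ph_neg))
        self.a += lr * (v - v_neg)
        self.b += lr * D * (ph - ph_neg)
        return ph
```

The paper's tripartite variants add a third layer that couples pretrained word representations into this structure, so both the linear word-level relationships and the nonlinear latent topics shape the learned document vector.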



Acknowledgements

This work is partially supported by grants from the Natural Science Foundation of China (Nos. 61632011, 61572102, 61402075, 61602078, 61562080), the State Education Ministry and the Research Fund for the Doctoral Program of Higher Education (No. 20090041110002), and the Fundamental Research Funds for the Central Universities.

Author information

Corresponding author

Correspondence to Hongfei Lin.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Xu, B. et al. (2017). Tripartite-Replicated Softmax Model for Document Representations. In: Wen, J., Nie, J., Ruan, T., Liu, Y., Qian, T. (eds) Information Retrieval. CCIR 2017. Lecture Notes in Computer Science, vol 10390. Springer, Cham. https://doi.org/10.1007/978-3-319-68699-8_9

  • DOI: https://doi.org/10.1007/978-3-319-68699-8_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68698-1

  • Online ISBN: 978-3-319-68699-8

  • eBook Packages: Computer Science; Computer Science (R0)
