
Mixture Variational Autoencoder of Boltzmann Machines for Text Processing

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12801)


Abstract

Variational autoencoders (VAEs) have been successfully used to learn good representations in unsupervised settings, especially for image data. More recently, mixture variational autoencoders (MVAEs) have been proposed to enhance the representation capabilities of VAEs by assuming that data can come from a mixture distribution. In this work, we adapt MVAEs for text processing by modeling each component's joint distribution of latent variables and the document's bag-of-words as a Boltzmann Machine, a graphical model that performs well on a number of natural language processing tasks. The proposed model, MVAE-BM, can learn text representations from unlabeled data without requiring pre-trained word embeddings. We evaluate the representations obtained by MVAE-BM on six corpora with respect to perplexity and to accuracy on binary and multi-class text classification. Despite its simplicity, our results show that MVAE-BM's performance is on par with or superior to that of modern deep learning techniques such as BERT and RoBERTa. Finally, we show that the mapping to mixture components learned by the model lends itself naturally to document clustering.
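
To make the architecture concrete, below is a minimal, hypothetical PyTorch sketch of the kind of model the abstract describes: a mixture VAE whose categorical component assignment is relaxed with the Gumbel-softmax trick [6, 10], and whose per-component decoders score the document's bag-of-words with a log-linear softmax layer, in the spirit of neural variational document models [12] and Boltzmann machine document models [20]. The class name, layer sizes, and the exact decoder form are illustrative assumptions, not the paper's architecture; the authors' actual implementation is linked in note 3 below.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MVAEBMSketch(nn.Module):
    """Hypothetical mixture-VAE sketch; NOT the authors' implementation."""

    def __init__(self, vocab_size, latent_dim=64, n_components=4, hidden=256):
        super().__init__()
        self.K, self.D = n_components, latent_dim
        self.encoder = nn.Sequential(nn.Linear(vocab_size, hidden), nn.ReLU())
        self.comp_logits = nn.Linear(hidden, n_components)       # q(c|x)
        self.mu = nn.Linear(hidden, n_components * latent_dim)   # q(z|x,c)
        self.logvar = nn.Linear(hidden, n_components * latent_dim)
        # One log-linear (softmax) decoder per component, in the spirit of
        # Boltzmann-machine document models; the paper's exact decoder may differ.
        self.decoders = nn.ModuleList(
            [nn.Linear(latent_dim, vocab_size) for _ in range(n_components)])

    def forward(self, bow, tau=0.5):
        h = self.encoder(bow)
        # Soft, differentiable component assignment via Gumbel-softmax [6, 10].
        c = F.gumbel_softmax(self.comp_logits(h), tau=tau)             # (B, K)
        mu = self.mu(h).view(-1, self.K, self.D)
        logvar = self.logvar(h).view(-1, self.K, self.D)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()           # reparam.
        # Multinomial log-likelihood of the bag-of-words under each component.
        log_px = torch.stack(
            [(bow * F.log_softmax(dec(z[:, k]), dim=-1)).sum(-1)
             for k, dec in enumerate(self.decoders)], dim=-1)          # (B, K)
        # KL of each component's Gaussian posterior to N(0, I).
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1)    # (B, K)
        # Assignment-weighted mixture ELBO; the KL term for q(c|x) against a
        # uniform prior over components is omitted here for brevity.
        elbo = (c * (log_px - kl)).sum(-1)
        return -elbo.mean(), c

# Toy usage: the inferred soft assignments double as a document clustering.
model = MVAEBMSketch(vocab_size=2000)
bow = torch.rand(8, 2000)              # stand-in bag-of-words vectors
loss, assignments = model(bow)
clusters = assignments.argmax(dim=-1)  # one cluster id per document
loss.backward()

Taking the argmax over the inferred assignments q(c|x), as in the last lines above, illustrates the component-based document clustering mentioned at the end of the abstract.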


Notes

  1. http://www.tweetstats.com/.

  2. https://en.wikipedia.org/wiki/Wikipedia:Statistics.

  3. https://github.com/brunoguilherme1/MVAE-BM/.

  4. https://github.com/brunoguilherme1/MVAE-BM/tree/main/hyperparameters.

  5. https://colab.research.google.com.

  6. www.sklearn.com.

  7. https://github.com/UKPLab/sentence-transformers#clustering.

References

  1. Conneau, A., et al.: Unsupervised cross-lingual representation learning at scale. In: ACL (2020)

  2. Dahl, G.E., Adams, R.P., Larochelle, H.: Training restricted Boltzmann machines on word observations. In: ICML (2012)

  3. Davies, D.L., Bouldin, D.W.: A cluster separation measure. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-1(2), 224–227 (1979). https://doi.org/10.1109/TPAMI.1979.4766909

  4. Devlin, J., Chang, M., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: NAACL-HLT (2019)

  5. Ding, R., Nallapati, R., Xiang, B.: Coherence-aware neural topic modeling. In: EMNLP (2018)

  6. Jang, E., Gu, S., Poole, B.: Categorical reparameterization with Gumbel-softmax. In: ICLR (2017)

  7. Jiang, S., Chen, Y., Yang, J., Zhang, C., Zhao, T.: Mixture variational autoencoders. Pattern Recognit. Lett. 128 (2019)

  8. Kingma, D.P., Welling, M.: Auto-encoding variational Bayes. In: ICLR (2014)

  9. Liu, Y., et al.: RoBERTa: a robustly optimized BERT pretraining approach. CoRR (2019). http://arxiv.org/abs/1907.11692

  10. Maddison, C.J., Mnih, A., Teh, Y.W.: The concrete distribution: a continuous relaxation of discrete random variables. In: ICLR (2017)

  11. Miao, Y., Grefenstette, E., Blunsom, P.: Discovering discrete latent topics with neural variational inference. In: ICML (2017)

  12. Miao, Y., Yu, L., Blunsom, P.: Neural variational inference for text processing. In: ICML (2016)

  13. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: NeurIPS (2013)

  14. Mnih, A., Gregor, K.: Neural variational inference and learning in belief networks. In: ICML (2014)

  15. Ning, X., et al.: Nonparametric topic modeling with neural inference. Neurocomputing 399, 296–306 (2020)

  16. Pennington, J., Socher, R., Manning, C.: GloVe: global vectors for word representation. In: EMNLP (2014)

  17. Reimers, N., Gurevych, I.: Making monolingual sentence embeddings multilingual using knowledge distillation. arXiv preprint arXiv:2004.09813 (2020)

  18. Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987)

  19. Srivastava, A., Sutton, C.: Neural variational inference for topic models. In: NeurIPS (2016)

  20. Srivastava, N., Salakhutdinov, R., Hinton, G.: Modeling documents with a deep Boltzmann machine. In: Conference on Uncertainty in Artificial Intelligence (2013)

  21. Sugar, C.A., James, G.M.: Finding the number of clusters in a dataset. J. Am. Stat. Assoc. 98(463), 750–763 (2003)

  22. Wu, J., et al.: Neural mixed counting models for dispersed topic discovery. In: ACL (2020)

  23. Xiao, Y., Zhao, T., Wang, W.Y.: Dirichlet variational autoencoder for text modeling. CoRR (2018)

  24. Xu, J., Durrett, G.: Spherical latent spaces for stable variational autoencoders. In: EMNLP (2018)

  25. Yang, Z., Hu, Z., Salakhutdinov, R., Berg-Kirkpatrick, T.: Improved variational autoencoders for text modeling using dilated convolutions. In: ICML (2017)


Author information


Correspondence to Bruno Guilherme Gomes.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Guilherme Gomes, B., Murai, F., Goussevskaia, O., Couto da Silva, A.P. (2021). Mixture Variational Autoencoder of Boltzmann Machines for Text Processing. In: Métais, E., Meziane, F., Horacek, H., Kapetanios, E. (eds) Natural Language Processing and Information Systems. NLDB 2021. Lecture Notes in Computer Science, vol. 12801. Springer, Cham. https://doi.org/10.1007/978-3-030-80599-9_5


  • DOI: https://doi.org/10.1007/978-3-030-80599-9_5


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-80598-2

  • Online ISBN: 978-3-030-80599-9

  • eBook Packages: Computer Science, Computer Science (R0)
