Learning Word and Sentence Embeddings Using a Generative Convolutional Network

Vargas-Ocampo, Edgar; Roman-Rangel, Edgar; Hermosillo-Valadez, Jorge

doi:10.1007/978-3-319-92198-3_14

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10880)

Included in the following conference series:

Mexican Conference on Pattern Recognition

Abstract

In recent years, sentence modeling using dense vector representations has been a central concern in Natural Language Processing research. While many efforts focus essentially on the quality of the embeddings in downstream classification tasks, our contribution focuses on understanding new ways of computing word representations using generative architectures based on 2D Convolutional Neural Networks. We treat a sentence as an \(n \times m\) input image, so that it can be processed using 2D convolutional operations. In contrast to similar current approaches, where the input image remains untouched throughout the learning process, we propose using the learned 2D convolutional filters to modify the input arrays, so that the corresponding word and sentence vector representations are computed at once. We also propose computing word dictionaries for local contexts, together with a global dictionary that fuses each word's local meanings into a single representation. We call the proposed model a Word Embedding Generative Convolutional Network (WEGCN). Our experiments show that our method is capable of jointly estimating consistent word and sentence embeddings, thus opening pathways for future research in this vein.
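As a rough illustration of the "sentence as an \(n \times m\) image" idea only, the following minimal sketch stacks one word vector per row and slides a single 2D filter over the result. It is not the WEGCN architecture. The toy vocabulary, the embedding size m = 8, the zero-padding to n = 5 rows, the 3 x 3 filter, and the helper names `sentence_to_image` and `conv2d_valid` are all assumptions made for this example.

```python
import numpy as np

def sentence_to_image(tokens, vocab, embeddings, n_rows):
    """Stack word vectors into an (n_rows x m) array, zero-padding or truncating."""
    m = embeddings.shape[1]
    image = np.zeros((n_rows, m), dtype=embeddings.dtype)
    for row, token in enumerate(tokens[:n_rows]):
        image[row] = embeddings[vocab[token]]
    return image

def conv2d_valid(image, kernel):
    """Plain 'valid' 2D cross-correlation, the operation CNN layers compute."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
vocab = {"the": 0, "cat": 1, "sat": 2}              # hypothetical toy vocabulary
embeddings = rng.standard_normal((len(vocab), 8))   # random word vectors, m = 8
image = sentence_to_image(["the", "cat", "sat"], vocab, embeddings, n_rows=5)
kernel = rng.standard_normal((3, 3))                # stands in for one learned filter
feature_map = conv2d_valid(image, kernel)
print(image.shape, feature_map.shape)               # (5, 8) (3, 6)
```

In this sketch the input array stays fixed, which corresponds to the "untouched input" setting the abstract contrasts against. The abstract states that WEGCN instead uses the learned filters to modify the input arrays themselves, but it does not give the update rule, so that step is not shown here.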

Notes

1. The number 100 is arbitrary. The reader must bear in mind that this number would correspond to the size of a sentence (number of words) or of the portion of the text to be represented.
2. www.cl.cam.ac.uk/~sht25/AZ_corpus.html

Author information

Authors and Affiliations

Centro de Investigación en Ciencias-(IICBA), UAEM, Cuernavaca, Morelos, México
Edgar Vargas-Ocampo & Jorge Hermosillo-Valadez
Viper Group - CVM Lab, University of Geneva, Geneva, Switzerland
Edgar Roman-Rangel

Corresponding author

Correspondence to Jorge Hermosillo-Valadez.

Editor information

Editors and Affiliations

National Institute of Astrophysics, Optics and Electronics, Sta. Maria Tonantzintla, Puebla, Mexico
José Francisco Martínez-Trinidad
National Institute of Astrophysics, Optics and Electronics, Sta. Maria Tonantzintla, Puebla, Mexico
Jesús Ariel Carrasco-Ochoa
Autonomous University of Puebla, Puebla, Puebla, Mexico
José Arturo Olvera-López
University of South Florida, Tampa, Florida, USA
Sudeep Sarkar

About this paper

Cite this paper

Vargas-Ocampo, E., Roman-Rangel, E., Hermosillo-Valadez, J. (2018). Learning Word and Sentence Embeddings Using a Generative Convolutional Network. In: Martínez-Trinidad, J., Carrasco-Ochoa, J., Olvera-López, J., Sarkar, S. (eds) Pattern Recognition. MCPR 2018. Lecture Notes in Computer Science, vol 10880. Springer, Cham. https://doi.org/10.1007/978-3-319-92198-3_14

DOI: https://doi.org/10.1007/978-3-319-92198-3_14
Published: 25 May 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92197-6
Online ISBN: 978-3-319-92198-3
eBook Packages: Computer Science; Computer Science (R0)
