research-article

Learning word embeddings via context grouping

Authors:

Antoni B. Chan

ACM TURC '17: Proceedings of the ACM Turing 50th Celebration Conference - China

Article No.: 24, Pages 1 - 10

https://doi.org/10.1145/3063955.3063979

Published: 12 May 2017

Abstract

Recently, neural-network-based word embedding models have been shown to produce high-quality distributional representations that capture both semantic and syntactic information. In this paper, we propose a grouping-based context predictive model that considers the interactions among context words and generalizes the widely used CBOW and Skip-Gram models. In particular, the words within a context window are split into several groups by a grouping function: words in the same group are combined, while different groups are treated as independent. To determine the grouping function, we propose a relatedness hypothesis that describes the relationship among context words, and we present several context grouping methods. Experimental results demonstrate that better representations can be learned with suitable context groups.
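The grouping idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the choice of averaging as the way words in a group are "combined", and the left/right split are all assumptions made for illustration. The sketch only shows how a single grouping function can recover a CBOW-like input (one group holding the whole window) or a Skip-Gram-like input (one group per context word).

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]
# A grouping function maps the context window to a list of word groups.
GroupingFn = Callable[[Sequence[str]], List[List[str]]]


def single_group(context: Sequence[str]) -> List[List[str]]:
    """CBOW-like grouping: every context word falls into one group."""
    return [list(context)]


def singleton_groups(context: Sequence[str]) -> List[List[str]]:
    """Skip-Gram-like grouping: each context word is its own group."""
    return [[word] for word in context]


def left_right_groups(context: Sequence[str]) -> List[List[str]]:
    """Hypothetical intermediate grouping: first and second half of the window."""
    mid = len(context) // 2
    halves = [list(context[:mid]), list(context[mid:])]
    return [group for group in halves if group]  # drop empty halves


def combine(group: Sequence[str], embeddings: Dict[str, Vector]) -> Vector:
    """Combine the words of one group by averaging (an assumed choice)."""
    dim = len(embeddings[group[0]])
    total = [0.0] * dim
    for word in group:
        for i, value in enumerate(embeddings[word]):
            total[i] += value
    return [value / len(group) for value in total]


def group_inputs(
    context: Sequence[str],
    grouping: GroupingFn,
    embeddings: Dict[str, Vector],
) -> List[Vector]:
    """Return one combined vector per group.

    Each returned vector would serve as an independent input for
    predicting the target word, since groups are treated as independent.
    """
    return [combine(group, embeddings) for group in grouping(context)]
```

Calling `group_inputs(context, single_group, embeddings)` yields one averaged vector for the whole window, while `singleton_groups` yields one vector per context word. Any other grouping function sits between these two extremes.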

Index Terms

Learning word embeddings via context grouping
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Neural networks
2. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
    1. Learning settings
      1. Online learning settings
    2. Machine learning approaches
      1. Learning in probabilistic graphical models
        1. Maximum likelihood modeling
      2. Learning latent representations

Published In

ACM TURC '17: Proceedings of the ACM Turing 50th Celebration Conference - China

May 2017

371 pages

ISBN: 9781450348737

DOI: 10.1145/3063955

Conference Chairs: John Lui (The Chinese University of Hong Kong), Xinbing Wang (Shanghai Jiao Tong University)

General Chairs: Alexander Wolf (University of California, Santa Cruz), Yunhao Liu (Tsinghua University, China), Chuanping Hu (The Third Research Institute of MPS, China)

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ACM TUR-C '17: ACM Turing 50th Celebration Conference - China

May 12 - 14, 2017

Shanghai, China
