ABSTRACT
Word representations are typically built from analogy and context-based statistics together with lexical relations between words, and serve as inputs to learning models in Natural Language Understanding (NLU) tasks. However, context alone is not sufficient to understand language: reading between the lines is a key component of NLU. Embedding deeper word relationships that are not captured by context enhances the word representation. This paper presents a word embedding that combines analogy and context-based statistics from Word2Vec with deeper word relationships from ConceptNet to create an expanded word representation, which is then fine-tuned using a Self-Organizing Map. The proposed representation is compared against existing semantic word representations on SimLex-999, and 3D visualizations show that it captures both similarity and association between words. The proposed representation achieves a Spearman correlation of 0.886, the best result among the compared state-of-the-art methods, and exceeds the human benchmark of 0.78.
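The SimLex-999 evaluation reported above scores an embedding by the Spearman rank correlation between model similarity scores and human similarity ratings for a fixed list of word pairs. A minimal sketch of that evaluation, using illustrative placeholder scores rather than actual SimLex-999 data or the paper's embedding:

```python
# Sketch of a SimLex-999-style evaluation: Spearman rank correlation
# between human similarity ratings and model similarity scores.
# The rating/score values below are placeholders, not real SimLex data.

def ranks(values):
    """Assign 1-based ranks; ties receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = avg_rank
        i = j + 1
    return r

def spearman(xs, ys):
    """Spearman rho = Pearson correlation computed on the rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Placeholder human ratings (0-10 scale) and model similarity scores
# for five hypothetical word pairs.
human = [9.2, 8.5, 1.3, 6.7, 0.5]
model = [0.91, 0.88, 0.12, 0.55, 0.20]
print(round(spearman(human, model), 3))  # -> 0.9
```

A higher rho means the embedding orders word pairs more like human judges do; the paper's 0.886 is this correlation computed over the full SimLex-999 pair list.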
- T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, "Distributed representations of words and phrases and their compositionality," in Advances in Neural Information Processing Systems, 2013, pp. 3111--3119.
- G. Recski, E. Iklódi, K. Pajkossy, and A. Kornai, "Measuring semantic similarity of words using concept networks," in Proceedings of the 1st Workshop on Representation Learning for NLP, 2016, pp. 193--200.
- B. Dhingra, H. Liu, R. Salakhutdinov, and W. W. Cohen, "A comparative study of word embeddings for reading comprehension," arXiv preprint arXiv:1703.00993, 2017.
- R. Collobert and J. Weston, "A unified architecture for natural language processing: Deep neural networks with multitask learning," in Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 160--167.
- A. Kumar et al., "Ask me anything: Dynamic memory networks for natural language processing," in International Conference on Machine Learning, 2016, pp. 1378--1387.
- A. Nugaliyadde, K. W. Wong, F. Sohel, and H. Xie, "Reinforced memory network for question answering," in International Conference on Neural Information Processing, 2017, pp. 482--490.
- D. Bahdanau, K. Cho, and Y. Bengio, "Neural machine translation by jointly learning to align and translate," arXiv preprint arXiv:1409.0473, 2014.
- T.-H. Wen, M. Gasic, N. Mrksic, P.-H. Su, D. Vandyke, and S. Young, "Semantically conditioned LSTM-based natural language generation for spoken dialogue systems," arXiv preprint arXiv:1508.01745, 2015.
- X. Zhang and Y. LeCun, "Text understanding from scratch," arXiv preprint arXiv:1502.01710, 2015.
- J. P. Chiu and E. Nichols, "Named entity recognition with bidirectional LSTM-CNNs," arXiv preprint arXiv:1511.08308, 2015.
- T. Mikolov, K. Chen, G. Corrado, and J. Dean, "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013.
- J. Pennington, R. Socher, and C. Manning, "GloVe: Global vectors for word representation," in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014, pp. 1532--1543.
- C. Potts, D. Lassiter, R. Levy, and M. C. Frank, "Embedded implicatures as pragmatic inferences under compositional lexical uncertainty," Journal of Semantics, vol. 33, no. 4, pp. 755--802, 2016.
- A. Nugaliyadde, K. W. Wong, F. Sohel, and H. Xie, "Multi-level search of a knowledgebase for semantic parsing," in International Workshop on Multi-disciplinary Trends in Artificial Intelligence, 2017, pp. 44--53.
- H. Liu and P. Singh, "ConceptNet---a practical commonsense reasoning tool-kit," BT Technology Journal, vol. 22, no. 4, pp. 211--226, 2004.
- F. Hill, R. Reichart, and A. Korhonen, "SimLex-999: Evaluating semantic models with (genuine) similarity estimation," Computational Linguistics, vol. 41, no. 4, pp. 665--695, 2015.
- J. A. Bullinaria and J. P. Levy, "Extracting semantic representations from word co-occurrence statistics: A computational study," Behavior Research Methods, vol. 39, no. 3, pp. 510--526, 2007.
- R. Lebret and R. Collobert, "Word embeddings through Hellinger PCA," arXiv preprint arXiv:1312.5542, 2013.
- T. Mikolov, W.-t. Yih, and G. Zweig, "Linguistic regularities in continuous space word representations," in Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2013, pp. 746--751.
- Y. Goldberg and O. Levy, "word2vec explained: Deriving Mikolov et al.'s negative-sampling word-embedding method," arXiv preprint arXiv:1402.3722, 2014.
- R. Speer, J. Chin, and C. Havasi, "ConceptNet 5.5: An open multilingual graph of general knowledge," in AAAI, 2017, pp. 4444--4451.
- R. Speer and J. Chin, "An ensemble method to produce high-quality word embeddings," arXiv preprint arXiv:1604.01692, 2016.
- G. N. Leech, Principles of Pragmatics. Routledge, 2016.
- T. Kohonen, "The self-organizing map," Neurocomputing, vol. 21, no. 1-3, pp. 1--6, 1998.
- L. Finkelstein et al., "Placing search in context: The concept revisited," in Proceedings of the 10th International Conference on World Wide Web, 2001, pp. 406--414.
- J. Weston et al., "Towards AI-complete question answering: A set of prerequisite toy tasks," arXiv preprint arXiv:1502.05698, 2015.