ABSTRACT
Representing words as embeddings in a continuous vector space has proven successful in improving performance on many natural language processing (NLP) tasks. Beyond traditional methods that learn embeddings from large text corpora, ensemble methods have been proposed to combine the merits of pre-trained word embeddings with external semantic sources. In this paper, we propose a knowledge-enhanced ensemble method that combines knowledge graphs with pre-trained word embedding models. Specifically, we interpret a relation in a knowledge graph as a linear translation from one word to another. We also propose a novel weighting scheme to further distinguish edges in the knowledge graph that share the same relation type. Extensive experiments demonstrate that our proposed method outperforms the state of the art by up to 20% on the word analogy task and by up to 16% on the word similarity task.
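To make the translation interpretation concrete, the sketch below illustrates the general idea under stated assumptions: a knowledge-graph relation r linking head word h to tail word t is treated as a linear translation in embedding space, h + r ≈ t, and each edge receives a weight so that edges with the same relation type can still be distinguished. This is a minimal illustration, not the authors' released code; the names (embed_dim, score_triple, edge_weight) and the cosine-based weighting are illustrative assumptions rather than the paper's exact scheme.

```python
# Minimal sketch (an assumption, not the paper's implementation) of
# translation-based scoring over pre-trained word embeddings.
import numpy as np

embed_dim = 100
rng = np.random.default_rng(0)

# Hypothetical pre-trained word vectors and learned relation vectors.
word_vecs = {w: rng.normal(size=embed_dim)
             for w in ["king", "queen", "man", "woman"]}
rel_vecs = {"female_counterpart_of": rng.normal(size=embed_dim)}

def score_triple(head: str, relation: str, tail: str) -> float:
    """Translation-based plausibility of (head, relation, tail):
    a smaller distance ||h + r - t|| means a more plausible edge."""
    diff = word_vecs[head] + rel_vecs[relation] - word_vecs[tail]
    return -float(np.linalg.norm(diff))

def edge_weight(head: str, tail: str) -> float:
    """One plausible per-edge weight (illustrative assumption): cosine
    similarity of the endpoint embeddings, so edges sharing a relation
    type can still carry different strengths."""
    h, t = word_vecs[head], word_vecs[tail]
    return float(np.dot(h, t) / (np.linalg.norm(h) * np.linalg.norm(t)))

if __name__ == "__main__":
    print(score_triple("queen", "female_counterpart_of", "king"))
    print(edge_weight("queen", "king"))
```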