Comparison of Deep-Neural-Network-Based Models for Estimating Distributed Representations of Compound Words

https://doi.org/10.1016/j.procs.2021.08.133
Open access under a Creative Commons license

Abstract

Word embeddings, or word vectors, have become fundamental in language processing techniques, especially deep learning approaches. Although many languages have compound words (e.g., “robot arm” and “maple leaf”), such words have not received much attention from researchers. Most research on compound word embeddings has considered only two-word compounds; there has been little detailed analysis of learned representations for compound words of arbitrary length. This paper discusses the necessity of learning-based approaches for estimating the distributed representations of compound words, as opposed to taking a simple average of the representations of their constituents. An evaluation on two downstream tasks confirms the effectiveness of compositional models in encoding useful information into vector spaces. The experimental results suggest that complex architectures such as long short-term memory, gated recurrent units, and transformers learn better representations for long entities, whereas simpler models such as recurrent neural networks are more suitable for downstream tasks involving only short compounds (two or three words in length), as in the noun compound interpretation task.
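To make the contrast between the averaging baseline and a learned compositional model concrete, the following is a minimal sketch (not the paper's implementation) assuming PyTorch and pre-trained constituent vectors; the class names, the 300-dimensional embeddings, and the random stand-in vectors are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AverageComposer(nn.Module):
    """Baseline: the compound vector is the mean of its constituent vectors."""
    def forward(self, constituent_vectors):          # (num_words, dim)
        return constituent_vectors.mean(dim=0)       # (dim,)

class LSTMComposer(nn.Module):
    """Learned composition: run an LSTM over the constituent vectors and
    use the final hidden state as the compound representation."""
    def __init__(self, dim):
        super().__init__()
        self.lstm = nn.LSTM(input_size=dim, hidden_size=dim, batch_first=True)

    def forward(self, constituent_vectors):          # (num_words, dim)
        _, (h_n, _) = self.lstm(constituent_vectors.unsqueeze(0))
        return h_n.squeeze(0).squeeze(0)             # (dim,)

# Example: compose vectors for the compound "maple leaf".
dim = 300
maple, leaf = torch.randn(dim), torch.randn(dim)     # stand-ins for real embeddings
constituents = torch.stack([maple, leaf])             # (2, dim)

avg_vec = AverageComposer()(constituents)
lstm_vec = LSTMComposer(dim)(constituents)
print(avg_vec.shape, lstm_vec.shape)                  # both torch.Size([300])
```

In this sketch, the averaging baseline requires no training, while the LSTM composer's parameters would be learned on a downstream task; GRU or transformer encoders could be substituted for the LSTM in the same way.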

Keywords

compound word
distributional models of semantics
multi-word expressions
