1 Introduction

Metaphor is a figure of speech that highlights similarities between one thing and another, or between actions. Metaphors are so abundant in any language that their identification and interpretation would benefit Natural Language Processing (NLP) tasks like paraphrasing, summarization, machine translation and language generation.

Before a metaphor can be analysed and interpreted, it has to be identified. Some of the existing computational methods for metaphor detection use a hierarchical organisation of conventional metaphors, conventional mappings of subject-verb, verb-object or subject-object pairs, selectional restrictions as provided in available lexical resources, or domain mapping of a word and its context [17].

We tackle the problem of detecting a metaphor in a given sentence irrespective of its type, and without using lexical resources like WordNet. For this we propose a novel method based on word embeddings, even though embeddings were not designed for this purpose. The method computes similarity metrics over the vector representations of words. With the similarities thus obtained as features, a Decision Tree Classifier decides whether a metaphor is present in the given sentence. Experiments on the VU Amsterdam Metaphor Corpus show better results compared to strong baselines.

2 Related Work

There has been much work on computational metaphor detection, both supervised and unsupervised. Shutova [17] provides a comprehensive review of computational metaphor detection. The following works are most closely related to our approach.

To determine whether a sentence contains a metaphor, Wilks et al. [21] extracted all verbs, along with the subject and direct object arguments of each verb, using the Stanford Parser. For each extracted verb, they checked for preference violations with the help of WordNet [6, 14] and VerbNet [16]. If there is a violation, they mark it as a ‘Preference Violation metaphor’. They also take conventional metaphors into consideration and determine them by the senses in WordNet. Klebanov et al. [9] used a logistic regression classifier to detect metaphor, with unigrams, part of speech, concreteness and topic models as features. To improve on this work, Klebanov et al. [8] tuned the weight parameter representing concreteness information, including the difference of concreteness. Su et al. [20], based on the theory of meaning, presented a metaphor detection technique that considers the difference between the source and target domains at the semantic level rather than the categories of the domains. They extract a subject-object pair with a dependency parser, which they refer to as a ‘concept-pair’. They then compute the cosine similarity of the concept-pair and check in WordNet whether the subject is a hypernym or hyponym of the object. When the cosine similarity is below a particular threshold and the concept-pair does not have a hypernym-hyponym relation, it is categorized as metaphorical, otherwise literal. However, they target only nominal metaphors (‘IS-A’ metaphors), also known as Type I metaphors [11], whereas our method is general and does not look for any particular type of metaphor.

3 Motivation

Many real-world NLP systems treat words as atomic units because of simplicity, robustness and the observation that simple models trained on huge amounts of data outperform complex systems trained on less data [12]. The motivation behind the proposed approach is that while such methods treat words as atomic units, words can in fact have multiple degrees of similarity [13], and many word embeddings capture that fact.

4 Proposed Metaphor Detection Approach

The flow diagram of our approach is shown in Fig. 1.

4.1 Vector Representation of Words

The method proposed in this paper uses vector representations of words that are made available to it. We have used the open-source Google Word2Vec system, and for training it we have used the text corpus from the latest English Wikipedia dump, preprocessed with Matt Mahoney's Perl script.

During training as well as testing, one might come across words for which embeddings are not available. We map such words to a constant vector of the same dimension as the provided word vectors.
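The fallback described above can be sketched as follows. The lookup interface and the particular constant (a small nonzero value, so that cosine similarity stays well defined) are illustrative assumptions, not details fixed by the paper.

```python
import numpy as np

def lookup(word, embeddings, dim=300):
    """Return the embedding for `word`, or a constant fallback vector
    of the same dimension when the word is out of vocabulary."""
    if word in embeddings:
        return embeddings[word]
    # Constant vector for unknown words; any fixed nonzero constant
    # works, and nonzero keeps cosine similarity defined.
    return np.full(dim, 0.01)

# Toy embedding table (dim=3 for brevity).
emb = {"sea": np.array([0.1, 0.2, 0.3])}
print(lookup("sea", emb, dim=3))  # known word: its own vector
print(lookup("zzz", emb, dim=3))  # OOV word: the constant vector
```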

Fig. 1. Flow diagram.

4.2 Feature Extraction

Replacing Named Entities: First we normalize the sentences to Normalization Form KD (NFKD) [3]. This is required because, in the presence of non-ASCII characters, the Stanford NLP software sometimes produces characters that are not originally in the input.
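The normalization step can be sketched with Python's standard library; the paper only specifies NFKD, so the follow-up step of dropping any remaining non-ASCII characters is our assumption about how the input is made safe for the downstream tools.

```python
import unicodedata

def normalize_nfkd(text):
    """Decompose characters with NFKD; then (an assumption beyond the
    paper's NFKD step) drop any codepoints that remain non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(normalize_nfkd("café"))  # -> "cafe"
```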

We replace the Named Entities because vector representations are unavailable for many proper nouns, especially unpopular ones. The replacement is also required for the unification of similar proper nouns under the same category: for example, different companies have different names, but unification is required for them to be treated similarly. We therefore use the Stanford Named Entity Recognizer (NER) [7]. Once the entities are recognized, the names are replaced by their entity labels. Thus “Montenegro’s sudden rehabilitation of Nicholas ’s memory is a popular move” (VU Amsterdam Metaphor Corpus) becomes “LOCATION’s sudden rehabilitation of PERSON ’s memory is a popular move”.
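Given the tagger's output, the replacement itself is a simple substitution. The sketch below assumes (token, tag) pairs with 'O' marking non-entities, which is the usual shape of NER output; a run of several tokens from one entity would produce repeated labels here, a simplification over a full implementation.

```python
def replace_named_entities(tagged_tokens):
    """Replace each token recognized as a named entity by its entity
    label (e.g. PERSON, LOCATION); 'O'-tagged tokens pass through."""
    return " ".join(tag if tag != "O" else tok for tok, tag in tagged_tokens)

tagged = [("Montenegro", "LOCATION"), ("'s", "O"), ("sudden", "O"),
          ("rehabilitation", "O"), ("of", "O"), ("Nicholas", "PERSON"),
          ("'s", "O"), ("memory", "O"), ("is", "O"), ("a", "O"),
          ("popular", "O"), ("move", "O")]
print(replace_named_entities(tagged))
# -> "LOCATION 's sudden rehabilitation of PERSON 's memory is a popular move"
```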

Getting Typed Dependencies: We parse the NER-replaced sentences with the Stanford PCFG Lexical Parser [10] to obtain the parse trees and the typed dependencies (Stanford Typed Dependencies) [4]. Of all the dependencies identified, we keep only a subset of types, chosen such that they may contain a metaphor. Wilks et al. [21] consider agent, nsubj, xsubj, dobj and nsubjpass, as they look for metaphors surrounding a verb. We choose a larger subset; for example, we also consider acomp (adjectival complement) [5], as it may give rise to metaphors as in ‘he looks green’.
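The filtering step can be sketched as follows. The exact relation set below is illustrative: it contains the verb-centred relations named above plus acomp, while the paper's full subset is larger and not enumerated.

```python
# Relations likely to host a metaphor; an illustrative subset only --
# the verb-centred relations of Wilks et al. plus acomp.
KEPT_RELATIONS = {"agent", "nsubj", "xsubj", "dobj", "nsubjpass", "acomp"}

def filter_dependencies(deps):
    """Keep only (relation, governor, dependent) triples whose
    relation type is in the chosen subset."""
    return [(rel, gov, dep) for rel, gov, dep in deps if rel in KEPT_RELATIONS]

deps = [("nsubj", "looks", "he"),
        ("acomp", "looks", "green"),
        ("det", "sky", "the")]
print(filter_dependencies(deps))
# -> [('nsubj', 'looks', 'he'), ('acomp', 'looks', 'green')]
```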

4.3 Training

For training, the system has to be provided with an annotated metaphor corpus: a corpus in which sentences containing metaphors are marked positive and the rest negative. It computes the cosine similarity of each dependent word pair and distributes the similarities according to the class of the sentence they come from, i.e., the cosine similarities of dependent word pairs from metaphor-containing sentences are put in the positive class, and those from sentences not containing any metaphor are put in the negative class.
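The feature computation can be sketched as below; the toy embeddings and the `lookup` callable are invented for illustration.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two word vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def pair_similarities(pairs, lookup):
    """One cosine-similarity feature per dependent (governor, dependent)
    word pair; `lookup` maps a word to its vector."""
    return [cosine(lookup(g), lookup(d)) for g, d in pairs]

# Toy 2-dimensional embeddings, invented for illustration.
emb = {"looks": np.array([1.0, 0.0]),
       "he":    np.array([1.0, 0.0]),
       "green": np.array([0.0, 1.0])}
pairs = [("looks", "he"), ("looks", "green")]
print(pair_similarities(pairs, emb.get))  # -> [1.0, 0.0]
```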

4.4 Classification

The default class of a sentence is negative. The cosine similarities are then classified one by one with CART [1], a Decision Tree Classifier. If at least one of them is classified as positive, the sentence is marked positive.
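The sentence-level decision rule can be sketched with scikit-learn, whose `DecisionTreeClassifier` is CART-based. The training values below are invented toy data (here, low similarity happens to indicate metaphor); in the paper the tree learns whatever boundary the corpus supports.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Toy training data: one cosine-similarity feature per dependent pair,
# labelled by the class of the sentence it came from (values invented).
X = np.array([[0.05], [0.10], [0.12], [0.80], [0.85], [0.90]])
y = np.array([1, 1, 1, 0, 0, 0])  # 1 = pair from a metaphorical sentence

clf = DecisionTreeClassifier(random_state=0)  # scikit-learn's CART tree
clf.fit(X, y)

def classify_sentence(similarities, clf):
    """Default class is negative; mark the sentence positive if at
    least one of its pair similarities is classified positive."""
    if not similarities:
        return 0
    preds = clf.predict(np.array(similarities).reshape(-1, 1))
    return int(preds.max())

print(classify_sentence([0.88, 0.07], clf))  # one positive pair -> 1
```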

5 Experiments

5.1 Dataset

The VU Amsterdam Metaphor Corpus (VUAMC)Footnote 5 [19] is one of the “largest available corpus hand-annotated for all metaphorical language use, regardless of lexical field or source domain”. It is based on “a systematic and explicit metaphor identification protocol” [18] with inter-annotator reliability of \(\kappa >0.8\).

5.2 Baselines

We compare our method with three baselines, two that do not use word embeddings and one that does, which are described as follows.

Baseline 1 (UPT+CUpDown+DCUpDown model)

As our first baseline, we use the results from Klebanov et al. [8]. Besides their ‘Essay Data’, they also report results on the VUAMC. We consider the average over the VUAMC (VUA in [8]) for comparison, choosing their best reported results, achieved with the UPT+CUpDown+DCUpDown model.

Baseline 2 (CRF (with SF+CF+AF+XF))

As our second baseline, we use the results from Rai et al. [15], who also report on the VUAMC. For comparison, we choose their best reported results, achieved with a CRF using the SF+CF+AF+XF feature set on the overall VUAMC dataset across every genre (Dataset2 in [15]).

Baseline 3 (SVM (with word embeddings))

For each sentence, after replacing named entities, we obtain the typed dependencies. For each (ordered) pair of dependent words, we append the word vector of the second word to that of the first, and thus obtain a feature vector. The feature vectors derived from the dependent pairs of a metaphor-containing sentence are placed in the positive class, and those from sentences not containing a metaphor in the negative class. By default a sentence is classified as negative, and if at least one of its feature vectors is classified as positive by a Support Vector Classifier [2], the sentence is marked positive for metaphor.
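The feature construction for this baseline is simple concatenation, sketched below with invented 2-dimensional toy vectors.

```python
import numpy as np

def pair_feature(vec_first, vec_second):
    """Feature vector for an ordered dependent pair: the second word's
    embedding appended to the first word's embedding."""
    return np.concatenate([vec_first, vec_second])

# Toy 2-dimensional vectors, invented for illustration.
v1 = np.array([0.1, 0.2])
v2 = np.array([0.3, 0.4])
print(pair_feature(v1, v2))  # -> [0.1 0.2 0.3 0.4]
```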

5.3 Evaluation

For training and testing, we consider the VU Amsterdam Metaphor Corpus and perform 10-fold cross validation on it.

We compare our method against the baselines on the basis of precision, recall and F\(_{1}\)-score. For their calculation, sentences containing metaphors are considered to constitute the positive class, irrespective of the number of metaphors in the sentence and sentences not having metaphors constitute the negative class.

Table 1. VU Amsterdam Metaphor Corpus.

6 Results and Discussions

As shown in Table 1, the proposed method outperforms the baselines on each of the criteria considered for comparison. On the VU Amsterdam Metaphor Corpus, Klebanov et al. [8] report an average F\(_{1}\)-score of 0.511 and Rai et al. [15] report an F-measure of 0.609, whereas the proposed approach achieves an F\(_{1}\)-score of 0.758.

Some of the typed dependencies are ignored so as to speed up the process and reduce the volume of data examined during detection. Considering all of them does not improve the results significantly, but increases the overhead.

Analysing the false positives, we found that over-fitting towards the positive class is due to the presence of common pairs in typed dependencies such as dobj (direct object) and nsubj (nominal subject). We observed in our experiments that if we do not consider those dependencies, the F\(_{1}\)-score falls drastically.

Our system gives a larger number of false positives than false negatives, which we believe to be the better trade-off. Metaphor interpretation comes after metaphor recognition: for false negatives, metaphors will be treated literally and interpreted in ways they were not intended, whereas for false positives we search for analogies, and if no analogy is found, we can always fall back to the literal meaning.

7 Conclusion

In this paper we proposed a novel approach for metaphor detection that uses cosine similarity as its main component. We compared our results against strong baselines on a standard dataset and showed superior performance. In future work, we intend to use the proposed method in downstream applications like paraphrasing and summarization.