DOI: 10.1145/3459637.3482369

Metric Sentiment Learning for Label Representation

Published: 30 October 2021

Abstract

Label representation aims to generate a so-called verbalizer for an input text and has broad applications in text classification, event detection, question answering, and other fields. Previous work on label representation, especially in few-shot settings, mostly defines verbalizers manually, which is accurate but time-consuming. Other models fail to produce correctly antonymous verbalizers for two semantically opposite classes. In this paper, we therefore propose a metric sentiment learning framework (MSeLF) that generates verbalizers automatically and accurately captures the sentiment differences between them. MSeLF consists of two major components: a contrastive mapping learning (CML) module and an equal-gradient verbalizer acquisition (EVA) module. CML learns a transformation matrix that projects the initial word embeddings into antonym-aware embeddings by enlarging the distance between antonyms. In the resulting antonym-aware embedding space, EVA first takes a pair of antonymous words as the verbalizers for the two opposite classes and then applies a sentiment transition vector to generate verbalizers for the intermediate classes. We use the generated verbalizers for downstream few-shot text classification on two publicly available fine-grained datasets. The results show that our proposal outperforms state-of-the-art baselines in terms of accuracy. In addition, we find that CML can serve as a flexible plug-in component in other verbalizer acquisition approaches.
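To make the two components concrete, the following is a minimal PyTorch sketch of the pipeline the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: the names (ContrastiveMapper, contrastive_loss, equal_gradient_verbalizers), the margin value, and the synonym-pulling term in the loss are all assumptions of this sketch; the abstract only specifies that CML enlarges the distance between antonyms and that EVA steps along a sentiment transition vector.

```python
# Hypothetical sketch of MSeLF's two modules as summarized in the abstract.
# Class/function names, the margin, and the synonym term are illustrative
# assumptions, not the paper's actual objective or hyperparameters.
import torch
import torch.nn.functional as F


class ContrastiveMapper(torch.nn.Module):
    """CML (sketch): a linear map W projecting initial word embeddings
    into an antonym-aware space."""

    def __init__(self, dim: int):
        super().__init__()
        self.W = torch.nn.Linear(dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.W(x)


def contrastive_loss(mapper: ContrastiveMapper,
                     synonyms: torch.Tensor,   # (n_pairs, 2, dim)
                     antonyms: torch.Tensor,   # (n_pairs, 2, dim)
                     margin: float = 1.0) -> torch.Tensor:
    """Push antonym pairs at least `margin` apart in the mapped space;
    the synonym-pulling term is an assumption of this sketch."""
    syn_a, syn_b = mapper(synonyms[:, 0]), mapper(synonyms[:, 1])
    ant_a, ant_b = mapper(antonyms[:, 0]), mapper(antonyms[:, 1])
    pull = (syn_a - syn_b).norm(dim=-1)                   # keep synonyms close
    push = F.relu(margin - (ant_a - ant_b).norm(dim=-1))  # enlarge antonym distance
    return (pull + push).mean()


def equal_gradient_verbalizers(neg: torch.Tensor,
                               pos: torch.Tensor,
                               n_classes: int) -> list:
    """EVA (sketch): given antonym-aware embeddings of an antonymous
    verbalizer pair for the two extreme classes (e.g. 'terrible'/'great'),
    take equal steps along the sentiment transition vector to obtain
    embeddings for the intermediate classes."""
    step = (pos - neg) / (n_classes - 1)
    return [neg + i * step for i in range(n_classes)]
```

In use, each interpolated embedding would still need to be decoded to a concrete vocabulary word, e.g. by nearest-neighbor search over the vocabulary in the antonym-aware space, before it can serve as the verbalizer for an intermediate class.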

Supplementary Material

MP4 File (CIKM-rgfp0947.mp4)
Presentation video

    Published In

    CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
    October 2021
    4966 pages
    ISBN:9781450384469
    DOI:10.1145/3459637

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. label representation
    2. metric learning
    3. pre-training

    Qualifiers

    • Research-article

    Conference

    CIKM '21

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
