DOI: 10.1145/3459637.3482369

Metric Sentiment Learning for Label Representation

Published: 30 October 2021

Abstract

Label representation aims to generate a so-called verbalizer for an input text and has broad applications in text classification, event detection, question answering, and other fields. Previous work on label representation, especially in few-shot settings, mostly defines verbalizers manually, which is accurate but time-consuming. Other models fail to produce correctly antonymous verbalizers for two semantically opposite classes. In this paper, we therefore propose a metric sentiment learning framework (MSeLF) that generates verbalizers automatically and accurately captures the sentiment differences between them. MSeLF consists of two major components: a contrastive mapping learning (CML) module and an equal-gradient verbalizer acquisition (EVA) module. CML learns a transformation matrix that projects the initial word embeddings into antonym-aware embeddings by enlarging the distance between antonyms. In the resulting antonym-aware embedding space, EVA first takes a pair of antonymous words as the verbalizers for the two opposite classes and then applies a sentiment transition vector to generate verbalizers for the intermediate classes. We use the generated verbalizers for downstream few-shot text classification on two publicly available fine-grained datasets. The results show that our proposal outperforms state-of-the-art baselines in terms of accuracy. In addition, we find that CML can serve as a flexible plug-in component in other verbalizer acquisition approaches.
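To make the two components concrete, the following is a minimal PyTorch sketch of the pipeline the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: the names (ContrastiveMapper, contrastive_loss, equal_gradient_verbalizers), the margin value, and the synonym-pulling term in the loss are all assumptions of this sketch; the abstract only specifies that CML enlarges the distance between antonyms and that EVA steps along a sentiment transition vector.

```python
# Hypothetical sketch of MSeLF's two modules as summarized in the abstract.
# Class/function names, the margin, and the synonym term are illustrative
# assumptions, not the paper's actual objective or hyperparameters.
import torch
import torch.nn.functional as F


class ContrastiveMapper(torch.nn.Module):
    """CML (sketch): a linear map W projecting initial word embeddings
    into an antonym-aware space."""

    def __init__(self, dim: int):
        super().__init__()
        self.W = torch.nn.Linear(dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.W(x)


def contrastive_loss(mapper: ContrastiveMapper,
                     synonyms: torch.Tensor,   # (n_pairs, 2, dim)
                     antonyms: torch.Tensor,   # (n_pairs, 2, dim)
                     margin: float = 1.0) -> torch.Tensor:
    """Push antonym pairs at least `margin` apart in the mapped space;
    the synonym-pulling term is an assumption of this sketch."""
    syn_a, syn_b = mapper(synonyms[:, 0]), mapper(synonyms[:, 1])
    ant_a, ant_b = mapper(antonyms[:, 0]), mapper(antonyms[:, 1])
    pull = (syn_a - syn_b).norm(dim=-1)                   # keep synonyms close
    push = F.relu(margin - (ant_a - ant_b).norm(dim=-1))  # enlarge antonym distance
    return (pull + push).mean()


def equal_gradient_verbalizers(neg: torch.Tensor,
                               pos: torch.Tensor,
                               n_classes: int) -> list:
    """EVA (sketch): given antonym-aware embeddings of an antonymous
    verbalizer pair for the two extreme classes (e.g. 'terrible'/'great'),
    take equal steps along the sentiment transition vector to obtain
    embeddings for the intermediate classes."""
    step = (pos - neg) / (n_classes - 1)
    return [neg + i * step for i in range(n_classes)]
```

In use, each interpolated embedding would still need to be decoded to a concrete vocabulary word, e.g. by nearest-neighbor search over the vocabulary in the antonym-aware space, before it can serve as the verbalizer for an intermediate class.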

Supplementary Material

MP4 File (CIKM-rgfp0947.mp4)
Presentation video

    Published In

    CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
    October 2021
    4966 pages
    ISBN:9781450384469
    DOI:10.1145/3459637

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. label representation
    2. metric learning
    3. pre-training

    Qualifiers

    • Research-article

    Conference

    CIKM '21

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
