DOI: 10.1145/3511808.3557632
Short paper

Long-tail Mixup for Extreme Multi-label Classification

Published: 17 October 2022

ABSTRACT

Extreme multi-label classification (XMC) aims to find multiple relevant labels for a given sample from a huge, industrial-scale label set. The XMC problem inherently poses two challenges: scalability and label sparsity; the number of labels is very large, and labels follow a long-tail distribution. To address these problems, we propose a novel Mixup-based augmentation method for long-tail labels, called TailMix. Building upon a partition-based model, TailMix utilizes the context vectors generated by the label attention layer. It first selects two context vectors using the inverse propensity score of labels and a label proximity graph that represents label co-occurrence. It then interpolates the two context vectors to augment new samples with the long-tail label, improving accuracy on long-tail labels. Despite its simplicity, experimental results show that TailMix consistently outperforms other augmentation methods on three benchmark datasets, especially on long-tail labels in terms of two propensity-scored metrics, PSP@k and PSN@k.
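The recipe the abstract describes, Mixup applied to hidden context vectors with tail labels favored via inverse propensity scores and evaluated by propensity-scored precision, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`inverse_propensity`, `tailmix`, `psp_at_k`) are hypothetical, and the propensity formula with default constants a = 0.55, b = 1.5 follows Jain et al. (2016).

```python
import numpy as np

def inverse_propensity(label_freq, n_samples, a=0.55, b=1.5):
    # Inverse propensity score (Jain et al., 2016): rare labels receive
    # large weights, so they are favored when choosing mixing partners.
    c = (np.log(n_samples) - 1.0) * (1.0 + b) ** a
    return 1.0 + c * (np.asarray(label_freq, dtype=float) + b) ** (-a)

def tailmix(ctx_head, ctx_tail, lam=0.7):
    # Manifold-Mixup-style interpolation of two label-attention context
    # vectors; the synthetic sample is attributed to the tail label.
    return lam * np.asarray(ctx_tail) + (1.0 - lam) * np.asarray(ctx_head)

def psp_at_k(scores, y_true, ipw, k=5):
    # Propensity-scored precision at k: correct predictions on rare
    # labels contribute more than hits on frequent (head) labels.
    topk = np.argsort(-scores)[:k]
    return float(np.sum(y_true[topk] * ipw[topk]) / k)
```

For instance, a label seen twice in 10,000 samples gets a much larger inverse propensity weight than one seen 5,000 times, so a classifier is rewarded disproportionately for ranking the rare label correctly, which is exactly what PSP@k and PSN@k measure.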


Published in

CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
October 2022, 5274 pages
ISBN: 9781450392365
DOI: 10.1145/3511808
General Chairs: Mohammad Al Hasan, Li Xiong

        Copyright © 2022 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

CIKM '22 paper acceptance rate: 621 of 2,257 submissions (28%). Overall CIKM acceptance rate: 1,861 of 8,427 submissions (22%).
