DOI: 10.1145/3587716.3587779
ICMLC Conference Proceedings · Research Article

Self-Attention-based Data Augmentation Method for Text Classification

Published: 07 September 2023

Abstract

Text classification, the analysis of textual data to extract meaningful information, has many applications in information extraction and data management. Deep-learning models have recently been applied to this problem with success; however, they require sufficient labeled training data to produce a robust model, and their performance suffers in low-resource domains, where such data is unavailable and collecting or creating it is costly in time, effort, and energy. To address this problem, we propose an effective data augmentation approach for text classification. Our method employs a self-attention mechanism to augment the text: depending on the scenario, it alters and substitutes the words with the highest attention scores, and in some cases the words with the lowest. Experimental results show that our method performs at least as well as current approaches in most scenarios and outperforms them in some cases by as much as seven percent.
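The core idea in the abstract can be sketched as follows. This is a minimal, illustrative reconstruction, not the authors' implementation: it computes single-head scaled dot-product self-attention over toy token embeddings (with identity projections for simplicity), ranks tokens by the total attention they receive, and substitutes the highest- or lowest-scoring tokens using a small, hypothetical synonym table.

```python
import numpy as np

def self_attention_scores(embeddings):
    # Scaled dot-product self-attention (single head, identity Q/K projections).
    d = embeddings.shape[1]
    scores = embeddings @ embeddings.T / np.sqrt(d)
    # Row-wise softmax over attention scores.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    # Importance of token j = total attention it receives from all tokens.
    return weights.sum(axis=0)

def augment(tokens, embeddings, synonyms, top_k=1, use_high=True):
    # Replace the top_k highest- (or lowest-) attention tokens with synonyms,
    # leaving tokens without a synonym entry unchanged.
    importance = self_attention_scores(embeddings)
    order = np.argsort(importance)              # ascending by importance
    picked = order[-top_k:] if use_high else order[:top_k]
    out = list(tokens)
    for i in picked:
        out[i] = synonyms.get(out[i], out[i])
    return out

# Toy example with random embeddings standing in for a real encoder's outputs.
rng = np.random.default_rng(0)
tokens = ["the", "movie", "was", "great"]
emb = rng.normal(size=(len(tokens), 8))
synonyms = {"great": "excellent", "movie": "film"}
print(augment(tokens, emb, synonyms, top_k=1, use_high=True))
```

In practice the attention weights would come from a pre-trained transformer encoder rather than random embeddings, and the substitution source would be a thesaurus or a masked-language-model prediction; the ranking-and-substitute loop is the part the abstract describes.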


Cited By

  • Advanced Explainable AI: Self Attention Deep Neural Network of Text Classification. Journal of Machine and Computing (2024), 586–593. Online publication date: 5 July 2024. https://doi.org/10.53759/7669/jmc202404056
  • Semi-supervised Named Entity Recognition for Low-Resource Languages Using Dual PLMs. In Natural Language Processing and Information Systems (2024), 166–180. Online publication date: 20 September 2024. https://doi.org/10.1007/978-3-031-70239-6_12


    Published In

    ICMLC '23: Proceedings of the 2023 15th International Conference on Machine Learning and Computing
    February 2023, 619 pages
    ISBN: 9781450398411
    DOI: 10.1145/3587716
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Data augmentation
    2. deep learning
    3. self-attention
    4. text classification
    5. transformers

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICMLC 2023

