Data Augmentation with Transformers for Text Classification

  • Conference paper
  • In: Advances in Computational Intelligence (MICAI 2020)

Abstract

The current deep learning revolution has established transformer-based architectures as the state of the art in several natural language processing tasks. However, it is not clear whether such models can also be used to enhance other aspects of the learning pipeline in the NLP context. This paper presents a study in that direction; in particular, we explore the suitability of transformer models as a data augmentation mechanism for text classification. We introduce four ways of using transformer models to augment training data for text classification. Each of these variants takes the outputs of a transformer model, fed with the training documents, and uses those outputs as additional training data. The proposed strategies are evaluated on benchmark data using CNN- and LSTM-based classifiers. Experimental results are promising: improvements over a model trained on the plain documents are consistent.
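The page carries no source code, but the mechanism the abstract describes lends itself to a short illustration. The sketch below shows, under stated assumptions, what one such variant could look like: a pre-trained masked language model (Hugging Face's "bert-base-uncased" is an assumed, illustrative choice, as is the one-mask-per-copy policy) is fed a training document with one word masked out, and its predictions become label-preserving synthetic documents appended to the training set.

```python
# Hedged sketch of transformer-based text augmentation; the model choice and
# masking policy are illustrative assumptions, not the authors' exact setup.
import random
from transformers import pipeline  # Hugging Face Transformers

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def augment(text, n_copies=2):
    """Generate up to `n_copies` label-preserving variants of `text`, each
    with one randomly chosen word replaced by a masked-LM prediction."""
    words = text.split()
    if len(words) < 2:
        return []
    variants = []
    for _ in range(n_copies):
        i = random.randrange(len(words))
        masked = words.copy()
        masked[i] = fill_mask.tokenizer.mask_token  # "[MASK]" for BERT
        for pred in fill_mask(" ".join(masked), top_k=5):
            # Keep the best-scoring substitute that actually changes the word.
            if pred["token_str"].strip().lower() != words[i].lower():
                variants.append(pred["sequence"])
                break
    return variants

# Augmented copies inherit the original document's label and are simply
# appended to the training set before fitting the CNN/LSTM classifier.
train_texts, train_labels = ["the plot was utterly predictable"], [0]
for text, label in list(zip(train_texts, train_labels)):
    for new_text in augment(text):
        train_texts.append(new_text)
        train_labels.append(label)
```

Other variants in the same spirit could instead sample continuations from a generative model such as GPT-2; in every case it is the enlarged training set, not the augmentation model, that the downstream classifier is trained on.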



Acknowledgements

This work was partially supported by CONACyT under project grant A1-S-26314, "Integración de Visión y Lenguaje mediante Representaciones Multimodales Aprendidas para Clasificación y Recuperación de Imágenes y Videos" (Integration of Vision and Language through Learned Multimodal Representations for Image and Video Classification and Retrieval).

Author information


Corresponding author

Correspondence to Hugo Jair Escalante.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Tapia-Téllez, J.M., Escalante, H.J. (2020). Data Augmentation with Transformers for Text Classification. In: Martínez-Villaseñor, L., Herrera-Alcántara, O., Ponce, H., Castro-Espinoza, F.A. (eds) Advances in Computational Intelligence. MICAI 2020. Lecture Notes in Computer Science, vol. 12469. Springer, Cham. https://doi.org/10.1007/978-3-030-60887-3_22

  • DOI: https://doi.org/10.1007/978-3-030-60887-3_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-60886-6

  • Online ISBN: 978-3-030-60887-3

  • eBook Packages: Computer Science, Computer Science (R0)
