Abstract
The current deep learning revolution has established transformer-based architectures as the state of the art in several natural language processing tasks. However, it is not clear whether such models can also be used to enhance other aspects of the learning pipeline in the NLP context. This paper presents a study in that direction: in particular, we explore the suitability of transformer models as a data augmentation mechanism for text classification. We introduce four ways of using transformer models to augment data in text classification. Each of these variants takes the outputs of a transformer model, fed with the training documents, and uses those outputs as additional training data. The proposed strategies are evaluated on benchmark data using CNN- and LSTM-based classifiers. Experimental results are promising: improvements over a model trained on the plain documents are consistent.
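The abstract only sketches the mechanism at a high level; the four concrete variants are detailed in the body of the paper. As a rough illustration of the general idea (a pre-trained transformer is fed the training documents and its outputs are kept as extra training examples), the snippet below rewrites words of a document in context with a masked language model. This is a minimal sketch, not the authors' implementation: the Hugging Face transformers library, the bert-base-uncased model, and the augment helper with its masking probability are assumptions chosen purely for illustration.

# Illustrative sketch only: one plausible way to turn the outputs of a
# pre-trained masked-language-model transformer into additional training
# documents (library, model and parameters are assumptions, not the paper's).
import random
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
MASK = fill_mask.tokenizer.mask_token  # "[MASK]" for BERT

def augment(document, n_copies=2, mask_prob=0.15):
    """Return augmented variants of `document`; each keeps the source label."""
    words = document.split()
    variants = []
    for _ in range(n_copies):
        new_words = list(words)
        for i in range(len(new_words)):
            if random.random() < mask_prob:
                masked = list(new_words)
                masked[i] = MASK
                # Replace the masked position with BERT's top prediction
                # given the surrounding context.
                prediction = fill_mask(" ".join(masked))[0]
                new_words[i] = prediction["token_str"].strip()
        variants.append(" ".join(new_words))
    return variants

print(augment("the movie was surprisingly good and well acted"))

In a setup like this, every generated variant would be appended to the training set with the label of its source document before fitting the CNN or LSTM classifier.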
Acknowledgements
This work was partially supported by CONACyT under project grant A1-S-26314, "Integración de Visión y Lenguaje mediante Representaciones Multimodales Aprendidas para Clasificación y Recuperación de Imágenes y Videos" (Integration of Vision and Language through Learned Multimodal Representations for Image and Video Classification and Retrieval).