Abstract
Neural networks have become a preferred tool for text classification, achieving state-of-the-art performance when trained on large sets of labeled data. However, in the early stages of an active learning setup, the scarcity of ground-truth labels severely penalizes the generalization capability of the network. To overcome this limitation, we introduce a new learning strategy that consists of injecting, in the early stages of the learning process, additional local and salient knowledge in the form of simulated, human-like rationales. We show how such knowledge can be automatically extracted from documents by analyzing the class activation maps of a convolutional neural network. The experimental results demonstrate that exploiting such rationales significantly speeds up the learning process, with a substantial increase in accuracy starting from a very small number of documents (10–20).
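To make the idea of reading rationales off class activation maps more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy architecture (one same-padded 1D convolution with ReLU, global average pooling over token positions, then a linear classifier), uses random weights in place of a trained model, and all function and variable names are hypothetical. Under these assumptions, the per-token activation for a class is the feature maps weighted by that class's classifier weights, and the highest-scoring tokens serve as a simulated rationale.

```python
import numpy as np

def conv1d_relu(embeddings, filters):
    """Same-padded 1D convolution over token positions, followed by ReLU.

    embeddings: (T, D) token embeddings; filters: (K, W, D) with odd width W.
    Returns feature maps of shape (T, K).
    """
    _, width, _ = filters.shape
    pad = width // 2
    padded = np.pad(embeddings, ((pad, pad), (0, 0)))  # zero padding
    windows = np.stack(
        [padded[t:t + width] for t in range(embeddings.shape[0])]
    )  # (T, W, D)
    return np.maximum(np.einsum("twd,kwd->tk", windows, filters), 0.0)

def class_activation_map(feature_maps, class_weights, class_index):
    """Per-token activation: feature maps weighted by one class's weights."""
    return feature_maps @ class_weights[:, class_index]  # (T,)

def simulated_rationale(tokens, cam, top_k=3):
    """Return the top_k most activated tokens, kept in document order."""
    top = np.sort(np.argsort(cam)[::-1][:top_k])
    return [tokens[i] for i in top]

# Toy document and random (untrained) parameters: assumptions for illustration.
rng = np.random.default_rng(0)
tokens = "the plot was dull but the acting was superb".split()
embeddings = rng.normal(size=(len(tokens), 8))   # (T, D)
filters = rng.normal(size=(4, 3, 8))             # (K, W, D)
class_weights = rng.normal(size=(4, 2))          # (K, num_classes)

feature_maps = conv1d_relu(embeddings, filters)
logits = feature_maps.mean(axis=0) @ class_weights  # global average pooling
predicted = int(np.argmax(logits))
cam = class_activation_map(feature_maps, class_weights, predicted)
rationale = simulated_rationale(tokens, cam, top_k=3)
print(predicted, rationale)
```

With global average pooling, the class logit equals the mean of the per-token activations, which is why the map can be read as each token's contribution to the prediction. With a trained network, the selected tokens would be the candidate rationale words; here they are arbitrary because the weights are random.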
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Guélorget, P., Grilheres, B., Zaharia, T. (2020). Deep Active Learning with Simulated Rationales for Text Classification. In: Lu, Y., Vincent, N., Yuen, P.C., Zheng, WS., Cheriet, F., Suen, C.Y. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2020. Lecture Notes in Computer Science, vol 12068. Springer, Cham. https://doi.org/10.1007/978-3-030-59830-3_32
DOI: https://doi.org/10.1007/978-3-030-59830-3_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59829-7
Online ISBN: 978-3-030-59830-3
eBook Packages: Computer Science, Computer Science (R0)