
Deep Active Learning with Simulated Rationales for Text Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12068)

Abstract

Neural networks have become a preferred tool for text classification tasks, demonstrating state-of-the-art performance when trained on large sets of labeled data. However, in an early active learning setup, the scarcity of available ground-truth labels severely penalizes the generalization capability of the neural network. To overcome this limitation, we introduce in this paper a new learning strategy that injects, in the early stages of the learning process, additional local and salient knowledge in the form of simulated, human-like rationales. We show how such knowledge can be extracted automatically from documents by analyzing the class activation maps of a convolutional neural network. The experimental results demonstrate that exploiting such rationales significantly speeds up the learning process, with a marked increase in accuracy starting from a very small number of documents (10–20).
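The sketch below illustrates, under stated assumptions, how word-level rationales might be derived from the class activation map (CAM) of a simple 1-D convolutional text classifier. It is not the authors' implementation: the model architecture, the top_k parameter, and the toy input are hypothetical, and the CAM is computed in the standard way for a global-average-pooling classifier.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CamTextCNN(nn.Module):
    """Toy 1-D CNN text classifier whose architecture admits a CAM."""
    def __init__(self, vocab_size, embed_dim=50, num_filters=32, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Kernel width 3 with padding 1 keeps one feature-map position per word.
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size=3, padding=1)
        # Global average pooling followed by a linear layer: the setting in
        # which class activation maps are classically defined.
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, token_ids):
        x = self.embedding(token_ids).transpose(1, 2)    # (batch, embed, seq)
        fmap = F.relu(self.conv(x))                      # (batch, filters, seq)
        logits = self.fc(fmap.mean(dim=2))               # (batch, classes)
        return logits, fmap

def simulated_rationale(model, token_ids, target_class, top_k=3):
    # CAM: weight each filter's feature map by its classifier weight for the
    # target class and sum over filters, yielding one saliency score per word.
    _, fmap = model(token_ids)
    class_weights = model.fc.weight[target_class]         # (filters,)
    cam = torch.einsum('f,bfs->bs', class_weights, fmap)  # (batch, seq)
    # The top-k most salient word positions stand in for a human rationale.
    return cam.topk(top_k, dim=1).indices

# Toy usage: one 12-token document, most salient positions for class 1.
model = CamTextCNN(vocab_size=100)
doc = torch.randint(0, 100, (1, 12))
print(simulated_rationale(model, doc, target_class=1))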



Author information

Correspondence to Paul Guélorget.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Guélorget, P., Grilheres, B., Zaharia, T. (2020). Deep Active Learning with Simulated Rationales for Text Classification. In: Lu, Y., Vincent, N., Yuen, P.C., Zheng, W.S., Cheriet, F., Suen, C.Y. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2020. Lecture Notes in Computer Science, vol 12068. Springer, Cham. https://doi.org/10.1007/978-3-030-59830-3_32

  • DOI: https://doi.org/10.1007/978-3-030-59830-3_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59829-7

  • Online ISBN: 978-3-030-59830-3

  • eBook Packages: Computer Science, Computer Science (R0)
