
Deep Active Learning with Simulated Rationales for Text Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12068)

Abstract

Neural networks have become a preferred tool for text classification tasks, demonstrating state-of-the-art performance when trained on large sets of labeled data. However, in an early active learning setup, the scarcity of available ground-truth labels severely penalizes the generalization capability of the neural network. To overcome this limitation, we introduce in this paper a new learning strategy that injects, in the early stages of the learning process, additional local and salient knowledge in the form of simulated, human-like rationales. We show how such knowledge can be extracted automatically from documents by analyzing the class activation maps of a convolutional neural network. The experimental results demonstrate that exploiting such rationales significantly speeds up the learning process, with a marked increase in accuracy starting from a very small number of documents (10–20).
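The sketch below illustrates, under stated assumptions, how word-level rationales might be derived from the class activation map (CAM) of a simple 1-D convolutional text classifier. It is not the authors' implementation: the model architecture, the top_k parameter, and the toy input are hypothetical, and the CAM is computed in the standard way for a global-average-pooling classifier.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CamTextCNN(nn.Module):
    """Toy 1-D CNN text classifier whose architecture admits a CAM."""
    def __init__(self, vocab_size, embed_dim=50, num_filters=32, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # Kernel width 3 with padding 1 keeps one feature-map position per word.
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size=3, padding=1)
        # Global average pooling followed by a linear layer: the setting in
        # which class activation maps are classically defined.
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, token_ids):
        x = self.embedding(token_ids).transpose(1, 2)    # (batch, embed, seq)
        fmap = F.relu(self.conv(x))                      # (batch, filters, seq)
        logits = self.fc(fmap.mean(dim=2))               # (batch, classes)
        return logits, fmap

def simulated_rationale(model, token_ids, target_class, top_k=3):
    # CAM: weight each filter's feature map by its classifier weight for the
    # target class and sum over filters, yielding one saliency score per word.
    _, fmap = model(token_ids)
    class_weights = model.fc.weight[target_class]         # (filters,)
    cam = torch.einsum('f,bfs->bs', class_weights, fmap)  # (batch, seq)
    # The top-k most salient word positions stand in for a human rationale.
    return cam.topk(top_k, dim=1).indices

# Toy usage: one 12-token document, most salient positions for class 1.
model = CamTextCNN(vocab_size=100)
doc = torch.randint(0, 100, (1, 12))
print(simulated_rationale(model, doc, target_class=1))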



Author information

Correspondence to Paul Guélorget.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Guélorget, P., Grilheres, B., Zaharia, T. (2020). Deep Active Learning with Simulated Rationales for Text Classification. In: Lu, Y., Vincent, N., Yuen, P.C., Zheng, W.S., Cheriet, F., Suen, C.Y. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2020. Lecture Notes in Computer Science, vol 12068. Springer, Cham. https://doi.org/10.1007/978-3-030-59830-3_32

  • DOI: https://doi.org/10.1007/978-3-030-59830-3_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59829-7

  • Online ISBN: 978-3-030-59830-3

  • eBook Packages: Computer Science, Computer Science (R0)
