Skip to main content

Toward an Incremental Classification Process of Document Stream Using a Cascade of Systems

  • Conference paper
  • First Online:
Document Analysis and Recognition – ICDAR 2021 Workshops (ICDAR 2021)

Abstract

In the context of imbalanced classification, deep neural networks suffer from the lack of samples provided by low represented classes. They can’t train enough their weights with a statistically reliable set. All solutions in the state of the art that could offer better performance for those classes, sacrifice in return a huge part of their precision on bigger classes. In this paper, we propose a solution to this problem by introducing a system cascade concept that could integrate deep neural network. This system is designed to keep as mush as possible the original network performance while it reinforces the classification of the minor classes by the addition of stages with more specialised systems. This cascade offers the possibility to integrate few-shot learning or incremental architecture following the deep neural network without major restrictions on system internal architecture. Our method keeps intact or slightly improves the performances of a deep neural network (used as first stage) in conventional cases and improves performances in strongly imbalanced cases by around +8% accuracy.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Aaron, B., Tamir, D.E., Rishe, N.D., Kandel, A.: Dynamic incremental k-means clustering. In: 2014 International Conference on Computational Science and Computational Intelligence, vol. 1, pp. 308–313. IEEE (2014)

    Google Scholar 

  2. Bakkali, S., Ming, Z., Coustaty, M., Rusinol, M.: Visual and textual deep feature fusion for document image classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 562–563 (2020)

    Google Scholar 

  3. Bouguelia, M.R., Belaïd, Y., Belaïd, A.: A stream-based semi-supervised active learning approach for document classification. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 611–615. IEEE (2013)

    Google Scholar 

  4. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014)

  5. d’Andecy, V.P., Joseph, A., Ogier, J.M.: IndUS: incremental document understanding system focus on document classification. In: 2018 13th IAPR International Workshop on Document Analysis Systems (DAS), pp. 239–244. IEEE (2018)

    Google Scholar 

  6. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

  7. Fei-Fei, L., Fergus, R., Perona, P.: One-shot learning of object categories. IEEE Trans. Pattern Anal. Mach. Intell. 28(4), 594–611 (2006)

    Article  Google Scholar 

  8. Górriz, M., Antony, J., McGuinness, K., Giró-i Nieto, X., O’Connor, N.E.: Assessing knee OA severity with CNN attention-based end-to-end architectures. arXiv preprint arXiv:1908.08856 (2019)

  9. Harley, A.W., Ufkes, A., Derpanis, K.G.: Evaluation of deep convolutional nets for document image classification and retrieval. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pp. 991–995. IEEE (2015)

    Google Scholar 

  10. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

    Article  Google Scholar 

  11. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)

    Google Scholar 

  12. Joulin, A., Grave, E., Bojanowski, P., Douze, M., Jégou, H., Mikolov, T.: Fasttext. zip: Compressing text classification models. arXiv preprint arXiv:1612.03651 (2016)

  13. Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014)

  14. Koch, G., Zemel, R., Salakhutdinov, R.: Siamese neural networks for one-shot image recognition. In: ICML Deep Learning Workshop, Lille, vol. 2 (2015)

    Google Scholar 

  15. Kochurov, M., Garipov, T., Podoprikhin, D., Molchanov, D., Ashukha, A., Vetrov, D.: Bayesian incremental learning for deep neural networks. arXiv preprint arXiv:1802.07329 (2018)

  16. Lai, S., Xu, L., Liu, K., Zhao, J.: Recurrent convolutional neural networks for text classification. In: Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)

    Google Scholar 

  17. Lake, B.M., Salakhutdinov, R., Tenenbaum, J.B.: Human-level concept learning through probabilistic program induction. Science 350(6266), 1332–1338 (2015)

    Article  MathSciNet  Google Scholar 

  18. Laskov, P., Gehl, C., Krüger, S., Müller, K.R.: Incremental support vector learning: analysis, implementation and applications. J. Mach. Learn. Res. 7, 1909–1936 (2006)

    Google Scholar 

  19. Martin, L., et al.: CamemBERT: a tasty French language model. arXiv preprint arXiv:1911.03894 (2019)

  20. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013)

  21. Palm, R.B., Winther, O., Laws, F.: CloudScan-a configuration-free invoice analysis system using recurrent neural networks. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 406–413. IEEE (2017)

    Google Scholar 

  22. Rosenfeld, A., Tsotsos, J.K.: Incremental learning through deep adaptation. IEEE Trans. Pattern Anal. Machine Intell. 42, 651–663 (2018)

    Google Scholar 

  23. Santoro, A., Bartunov, S., Botvinick, M., Wierstra, D., Lillicrap, T.: One-shot learning with memory-augmented neural networks. arXiv preprint arXiv:1605.06065 (2016)

  24. Sarwar, S.S., Ankit, A., Roy, K.: Incremental learning in deep convolutional neural networks using partial network sharing. IEEE Access 8, 4615–4628 (2019)

    Article  Google Scholar 

  25. Schuster, D., et al.: Intellix-end-user trained information extraction for document archiving. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 101–105. IEEE (2013)

    Google Scholar 

  26. Schuster, M., Paliwal, K.K.: Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 45(11), 2673–2681 (1997)

    Article  Google Scholar 

  27. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

  28. Snell, J., Swersky, K., Zemel, R.S.: Prototypical networks for few-shot learning. arXiv preprint arXiv:1703.05175 (2017)

  29. Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.A.: Inception-v4, inception-ResNet and the impact of residual connections on learning. In: Thirty-First AAAI Conference on Artificial Intelligence (2017)

    Google Scholar 

  30. Voerman, J., Joseph, A., Coustaty, M., Poulain d’Andecy, V., Ogier, J.-M.: Evaluation of neural network classification systems on document stream. In: Bai, X., Karatzas, D., Lopresti, D. (eds.) DAS 2020. LNCS, vol. 12116, pp. 262–276. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-57058-3_19

    Chapter  Google Scholar 

  31. Wang, Y., Yao, Q., Kwok, J.T., Ni, L.M.: Generalizing from a few examples: a survey on few-shot learning. ACM Comput. Surv. (CSUR) 53(3), 1–34 (2020)

    Article  Google Scholar 

  32. Xian, Y., Schiele, B., Akata, Z.: Zero-shot learning-the good, the bad and the ugly. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4582–4591 (2017)

    Google Scholar 

  33. Zhang, X., Zhao, J., LeCun, Y.: Character-level convolutional networks for text classification. In: Advances in Neural Information Processing Systems, pp. 649–657 (2015)

    Google Scholar 

  34. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8697–8710 (2018)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Joris Voerman .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Voerman, J., Mahamoud, I.S., Joseph, A., Coustaty, M., d’Andecy, V.P., Ogier, JM. (2021). Toward an Incremental Classification Process of Document Stream Using a Cascade of Systems. In: Barney Smith, E.H., Pal, U. (eds) Document Analysis and Recognition – ICDAR 2021 Workshops. ICDAR 2021. Lecture Notes in Computer Science(), vol 12917. Springer, Cham. https://doi.org/10.1007/978-3-030-86159-9_16

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-86159-9_16

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86158-2

  • Online ISBN: 978-3-030-86159-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics