Knowledge Integration Inside Multitask Network for Analysis of Unseen ID Types

Neitthoffer, Timothée; Lemaitre, Aurélie; Coüasnon, Bertrand; Soullard, Yann; Awal, Ahmad Montaser

doi:10.1007/978-3-031-41501-2_21

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14194)

Included in the following conference series:

International Conference on Document Analysis and Recognition

Abstract

Identity document recognition is a key step in Know Your Customer applications, where identity documents (IDs) are verified. IDs of the same type share the same field structure, called a template. Traditional ID pipelines leverage this template to guide field localisation and then text recognition. However, they must be tuned to each template to perform correctly on it, so they cannot be applied directly to new types of IDs. In this work, we address text localisation and recognition for new document types, where only the template is available and there are no labeled samples of the new ID type. To that end, we propose Context Blocks (CB), which perform template self-attention so that the template guides the features of the network. We propose three ways to leverage CB in a multitask architecture. To evaluate our approach, we design a new public task for the MIDV2020 database, built from rectified in-the-wild photos. Our method achieves the best results on two datasets, including an industrial one composed of real examples.

Notes

1. https://gitlab.inria.fr/tneittho/midv2020-rectified-photo

Author information

Authors and Affiliations

Univ Rennes, CNRS, IRISA, Rennes, France
Timothée Neitthoffer, Aurélie Lemaitre, Bertrand Coüasnon & Yann Soullard
Université Rennes 2, CNRS, LETG, IRISA, Rennes, France
Yann Soullard
IDNow, AI & ML Center of Excellence, Cesson-Sévigné, France
Timothée Neitthoffer & Ahmad Montaser Awal

Corresponding author

Correspondence to Timothée Neitthoffer.

Editor information

Editors and Affiliations

University of La Rochelle, La Rochelle, France
Mickael Coustaty
Autonomous University of Barcelona, Bellaterra, Spain
Alicia Fornés

Cite this paper

Neitthoffer, T., Lemaitre, A., Coüasnon, B., Soullard, Y., Awal, A.M. (2023). Knowledge Integration Inside Multitask Network for Analysis of Unseen ID Types. In: Coustaty, M., Fornés, A. (eds) Document Analysis and Recognition – ICDAR 2023 Workshops. ICDAR 2023. Lecture Notes in Computer Science, vol 14194. Springer, Cham. https://doi.org/10.1007/978-3-031-41501-2_21

DOI: https://doi.org/10.1007/978-3-031-41501-2_21
Published: 15 August 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41500-5
Online ISBN: 978-3-031-41501-2
eBook Packages: Computer Science, Computer Science (R0)
