Information Extraction System for Invoices and Receipts

Tan, QiuXing Michelle; Cao, Qi; Seow, Chee Kiat; Yau, Peter Chunyu

doi:10.1007/978-981-99-4752-2_7

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14089)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

Rapid growth in the digitization of documents, such as paper-based invoices or receipts, has increased the demand for methods to process information accurately and efficiently. Extracting the data manually has become impractical, as it is labor-intensive and time-consuming. Digital documents contain various components such as tables, key-value pairs and figures. Existing optical character recognition (OCR) methods can recognize text, but it remains challenging to extract the key-value pairs in unformatted digital invoices or receipts. Hence, developing an information extraction system with intelligent algorithms would be beneficial, as it can increase workflow efficiency for knowledge discovery and data recognition. In this paper, a pipeline for an information extraction system is proposed that uses intelligent computing and deep learning approaches to first classify keys and values and then link them into key-value pairs. Two key-value pairing rules are developed in the proposed pipeline. Various experiments with intelligent algorithms are conducted to evaluate the performance of the information extraction pipeline.
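
The classify-then-link idea can be illustrated with a minimal sketch. The abstract does not state the two pairing rules, so the rules below ("nearest value to the right on the same row" and "nearest value directly below") are assumptions chosen only for illustration, not the authors' method. The `Box` type, the `KEY`/`VALUE`/`OTHER` label set, and the helper names are likewise hypothetical, and the labels are assumed to come from an upstream classifier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    """A text segment with a predicted label and a pixel bounding box."""
    text: str
    label: str  # hypothetical label set: "KEY", "VALUE", "OTHER"
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def cy(self) -> float:
        """Vertical center of the box."""
        return (self.y0 + self.y1) / 2


def same_row(key: Box, value: Box) -> bool:
    # Treat two boxes as being on one row when their vertical centers
    # differ by at most half the key's height (assumed tolerance).
    return abs(key.cy - value.cy) <= (key.y1 - key.y0) / 2


def overlaps_horizontally(a: Box, b: Box) -> bool:
    return min(a.x1, b.x1) - max(a.x0, b.x0) > 0


def link_key(key: Box, values: list[Box]) -> Optional[Box]:
    # Illustrative rule A: nearest value to the right on the same row.
    right = [v for v in values if same_row(key, v) and v.x0 >= key.x1]
    if right:
        return min(right, key=lambda v: v.x0 - key.x1)
    # Illustrative rule B: nearest value below that overlaps the key horizontally.
    below = [v for v in values
             if v.y0 >= key.y1 and overlaps_horizontally(key, v)]
    if below:
        return min(below, key=lambda v: v.y0 - key.y1)
    return None


def link_pairs(boxes: list[Box]) -> dict[str, str]:
    """Link every box labelled KEY to at most one box labelled VALUE."""
    values = [b for b in boxes if b.label == "VALUE"]
    pairs: dict[str, str] = {}
    for key in (b for b in boxes if b.label == "KEY"):
        match = link_key(key, values)
        if match is not None:
            pairs[key.text] = match.text
    return pairs


if __name__ == "__main__":
    boxes = [
        Box("Invoice No:", "KEY", 10, 10, 90, 24),
        Box("INV-001", "VALUE", 100, 10, 160, 24),
        Box("Total", "KEY", 10, 60, 50, 74),
        Box("42.00", "VALUE", 10, 80, 55, 94),
        Box("Thank you", "OTHER", 10, 120, 80, 134),
    ]
    print(link_pairs(boxes))
```

In this sketch a value may be claimed by more than one key, and distances are raw pixel gaps; a real system would likely need conflict resolution and layout-aware thresholds.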

Acknowledgment

The first author would like to thank her internship supervisor, Mr. Eric Tan of the Infocomm Media Development Authority (IMDA), Singapore, for his guidance and dedicated support in the project.

Author information

Authors and Affiliations

Computing Science, Singapore Institute of Technology - University of Glasgow, Singapore, Singapore
QiuXing Michelle Tan
School of Computing Science, University of Glasgow, Glasgow, Scotland, UK
Qi Cao, Chee Kiat Seow & Peter Chunyu Yau

Corresponding author

Correspondence to Qi Cao.

Editor information

Editors and Affiliations

Department of Computer Science, Eastern Institute of Technology, Zhejiang, China
De-Shuang Huang
University of Wollongong, North Wollongong, NSW, Australia
Prashan Premaratne
Zhengzhou University of Light Industry, Zhengzhou, China
Baohua Jin
Zhong Yuan University of Technology, Zhengzhou, China
Boyang Qu
University of Ulsan, Ulsan, Korea (Republic of)
Kang-Hyun Jo
Department of Computer Science, Liverpool John Moores University, Liverpool, UK
Abir Hussain

About this paper

Cite this paper

Tan, Q.M., Cao, Q., Seow, C.K., Yau, P.C. (2023). Information Extraction System for Invoices and Receipts. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14089. Springer, Singapore. https://doi.org/10.1007/978-981-99-4752-2_7

DOI: https://doi.org/10.1007/978-981-99-4752-2_7
Published: 31 July 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4751-5
Online ISBN: 978-981-99-4752-2
eBook Packages: Computer Science, Computer Science (R0)
