An adaptive document recognition system for lettrines

Nguyen, Nhu-Van; Coustaty, Mickael; Ogier, Jean-Marc

doi:10.1007/s10032-019-00346-9

Original Paper
Published: 10 October 2019

Volume 23, pages 115–128, (2020)

International Journal on Document Analysis and Recognition (IJDAR)

Abstract

In this paper, we propose an approach for interactively propagating annotations that capture historians’ knowledge across a database of lettrine images that historians have manually populated and annotated. Based on a novel document indexing scheme that combines Zipf's law with a bag of patterns, our approach extends the bag-of-words model so that this knowledge is represented by visual features through relevance feedback. Annotations are then propagated automatically across the lettrine database. We present the approach together with preliminary experimental results and an illustrative example.
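
The abstract does not spell out the update rule or the propagation criterion, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each lettrine is already described by a list of pattern labels, that relevance feedback is a Rocchio-style update of a bag-of-patterns histogram, and that an annotation is propagated to every image whose cosine similarity to the refined histogram exceeds a threshold. All function names, the toy vocabulary, the weights and the threshold are hypothetical.

```python
from collections import Counter
from math import sqrt


def bag_of_patterns(patterns, vocabulary):
    """Histogram of how often each vocabulary pattern occurs in one image."""
    counts = Counter(patterns)
    return [float(counts[p]) for p in vocabulary]


def cosine(u, v):
    """Cosine similarity between two histograms (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean(vectors, dim):
    """Component-wise mean; a zero vector when no examples were given."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def rocchio(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio-style update (assumed here): move the query histogram toward
    images the user marked relevant and away from those marked irrelevant.
    Negative weights are clipped to zero."""
    pos = mean(relevant, len(query))
    neg = mean(irrelevant, len(query))
    return [max(0.0, alpha * q + beta * p - gamma * n)
            for q, p, n in zip(query, pos, neg)]


def propagate(query, database, label, threshold=0.8):
    """Attach `label` to every image whose histogram is close to the query."""
    return {name: label for name, vec in database.items()
            if cosine(query, vec) >= threshold}


# Toy data: pattern labels are invented for illustration only.
vocabulary = ["stroke", "leaf", "face", "hatch"]
database = {
    "img_a": bag_of_patterns(["leaf", "leaf", "stroke"], vocabulary),
    "img_b": bag_of_patterns(["leaf", "hatch", "leaf", "leaf"], vocabulary),
    "img_c": bag_of_patterns(["face", "face", "stroke"], vocabulary),
}

# The user starts from a rough query, then marks img_a and img_b as relevant
# and img_c as irrelevant; the refined histogram drives the propagation.
query = bag_of_patterns(["leaf", "stroke"], vocabulary)
refined = rocchio(query,
                  relevant=[database["img_a"], database["img_b"]],
                  irrelevant=[database["img_c"]])
annotations = propagate(refined, database, "foliage")
print(annotations)
```

In this toy run the feedback round pushes the weight of the hypothetical "face" pattern to zero, so the image dominated by that pattern falls below the similarity threshold and does not receive the annotation.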

Notes

http://gallica.bnf.fr/
The VHL is a team of the French historical center (CESR laboratory) working on documents from the Renaissance.

Author information

Authors and Affiliations

Laboratoire L3i, Université de La Rochelle, 17042, La Rochelle Cedex 1, France
Nhu-Van Nguyen, Mickael Coustaty & Jean-Marc Ogier

Corresponding author

Correspondence to Nhu-Van Nguyen.

About this article

Cite this article

Nguyen, NV., Coustaty, M. & Ogier, JM. An adaptive document recognition system for lettrines. IJDAR 23, 115–128 (2020). https://doi.org/10.1007/s10032-019-00346-9

Received: 13 November 2018
Revised: 16 August 2019
Accepted: 27 September 2019
Published: 10 October 2019
Issue Date: June 2020
DOI: https://doi.org/10.1007/s10032-019-00346-9
