Grammatical inference in document recognition

  • Conference paper
Grammatical Inference (ICGI 1998)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1433)

Abstract

In this paper, we consider pattern recognition applied to paper documents, based on grammatical inference (GI) for classes of structured documents such as summaries, dictionaries, bibliographic databases and encyclopaedias. In this task, the inference engine takes as input a set of individual examples of these documents and outputs a set of rules that recognise similar documents. We place GI in an algebraic framework in which rewrite rules define the process of generalisation. The implementation algorithm discussed here is used in a current document handling project in which paper documents are typographically tagged and then recognised. One current application in this project is to extract the physical and logical structures of a given set of paper documents and then reorganise them in a machine-readable form such as HTML.
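
As a rough illustration of the input/output contract described above (tagged examples in, recognising rules out), the following Python sketch generalises sequences of typographic tags. It is a toy under stated assumptions, not the algebraic algorithm of the paper: the tag names, the functions `collapse`, `infer_rules` and `recognises`, and the single generalisation step (a repeated tag is rewritten as "one or more of that tag") are all hypothetical choices made for this example.

```python
from itertools import groupby


def collapse(tags):
    """Collapse each run of a repeated tag into a (tag, run_length) pair."""
    return [(tag, len(list(run))) for tag, run in groupby(tags)]


def infer_rules(examples):
    """Generalise tagged examples into rules (toy generalisation, assumed here).

    A rule maps a skeleton (the tag sequence with runs collapsed) to a tuple
    of flags recording whether each tag was ever observed repeated.
    """
    rules = {}
    for tags in examples:
        runs = collapse(tags)
        skeleton = tuple(tag for tag, _ in runs)
        seen_repeated = tuple(n > 1 for _, n in runs)
        previous = rules.get(skeleton, (False,) * len(skeleton))
        # Merge with earlier examples sharing the same skeleton.
        rules[skeleton] = tuple(a or b for a, b in zip(previous, seen_repeated))
    return rules


def recognises(rules, tags):
    """Return True if the tag sequence is covered by one of the inferred rules."""
    runs = collapse(tags)
    skeleton = tuple(tag for tag, _ in runs)
    if skeleton not in rules:
        return False
    # A run longer than one is accepted only where repetition was observed.
    return all(n == 1 or repeatable
               for (_, n), repeatable in zip(runs, rules[skeleton]))


# Hypothetical dictionary entries, already typographically tagged.
examples = [
    ["headword", "pos", "definition"],
    ["headword", "pos", "definition", "definition"],
]
rules = infer_rules(examples)
```

With these two examples, `recognises(rules, ["headword", "pos", "definition", "definition", "definition"])` holds, because repetition of `definition` was observed and generalised, whereas an entry with no `pos` tag is rejected because no example had that shape.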

Editor information

Vasant Honavar Giora Slutzki

Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Saidi, A.S., Tayeb-bey, S. (1998). Grammatical inference in document recognition. In: Honavar, V., Slutzki, G. (eds) Grammatical Inference. ICGI 1998. Lecture Notes in Computer Science, vol 1433. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0054074

  • DOI: https://doi.org/10.1007/BFb0054074

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-64776-8

  • Online ISBN: 978-3-540-68707-8

  • eBook Packages: Springer Book Archive
