A Framework for Integrating Natural Language Tools

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3960)

Abstract

Natural language processing (NLP) systems are typically characterized by a pipeline architecture in which several independently developed NLP tools, connected as a chain of filters, apply successive transformations to the data that flows through the system. When integrating such tools, one may face problems that lead to information loss: (i) tools discard information from their input that other tools further along the pipeline require; (ii) each tool has its own input/output format.

This work proposes a framework for building NLP systems that addresses these problems. Systems built with the framework use a client-server architecture in which the server acts as a blackboard where all tools add and consult data. Data is kept on the server under a conceptual model that is independent of the client tools, which allows a broad range of linguistic information to be represented.
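The blackboard idea can be illustrated with a minimal in-memory sketch. It is not the paper's conceptual model: the names `Annotation`, `Blackboard`, `tokenizer` and `tagger`, and the choice of character offsets as the shared reference, are assumptions made only for illustration. The point it shows is that each tool adds a layer over the shared data and later tools read earlier layers, so nothing is discarded along the way.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    layer: str   # hypothetical layer name, e.g. "tokens" or "tags"
    start: int   # character offset into the shared text (inclusive)
    end: int     # character offset into the shared text (exclusive)
    label: str   # tool-specific value


@dataclass
class Blackboard:
    """Shared store: tools only add annotations, they never overwrite."""
    text: str
    annotations: list = field(default_factory=list)

    def add(self, layer, start, end, label=""):
        ann = Annotation(layer, start, end, label)
        self.annotations.append(ann)
        return ann

    def layer(self, name):
        return [a for a in self.annotations if a.layer == name]


def tokenizer(board):
    # Toy whitespace tokenizer that records offsets into the original text.
    pos = 0
    for word in board.text.split():
        start = board.text.index(word, pos)
        end = start + len(word)
        board.add("tokens", start, end, word)
        pos = end


def tagger(board):
    # Toy tagger: reads the "tokens" layer and adds a separate "tags" layer.
    for tok in board.layer("tokens"):
        tag = "CAP" if tok.label[0].isupper() else "LOW"
        board.add("tags", tok.start, tok.end, tag)


board = Blackboard("Lisbon is sunny")
tokenizer(board)
tagger(board)
```

After both tools have run, the token layer and the original text are still available next to the tag layer, which is the property a chain of filters with private formats does not guarantee.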

The tools interact with the server through a generic API that allows the creation of new data and navigation through all existing data. Moreover, we provide libraries, implemented in several programming languages, that abstract the details of the connection and communication protocol between the tools and the server, and that offer several levels of functionality to simplify server use.
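A rough sketch of what such a layered client library could look like follows. The abstract does not specify the real API or wire protocol, so everything here is hypothetical: the JSON message shape, the `create` and `list` operations, and the `InProcessServer` stand-in that replaces a network connection so the example runs on its own.

```python
import json


class InProcessServer:
    """Stand-in for a remote server: takes and returns JSON strings."""

    def __init__(self):
        self._items = []

    def handle(self, raw):
        msg = json.loads(raw)
        if msg["op"] == "create":
            self._items.append(msg["item"])
            return json.dumps({"ok": True, "id": len(self._items) - 1})
        if msg["op"] == "list":
            kind = msg.get("kind")
            found = [i for i in self._items
                     if kind is None or i["kind"] == kind]
            return json.dumps({"ok": True, "items": found})
        return json.dumps({"ok": False, "error": "unknown op"})


class Client:
    """Hypothetical client library with two levels of functionality."""

    def __init__(self, send):
        self._send = send  # callable str -> str; hides the transport

    # Low level: one raw request/response exchange.
    def request(self, **msg):
        reply = json.loads(self._send(json.dumps(msg)))
        if not reply["ok"]:
            raise RuntimeError(reply["error"])
        return reply

    # Higher level: create new data without touching the protocol.
    def create(self, kind, **attrs):
        return self.request(op="create", item={"kind": kind, **attrs})["id"]

    # Higher level: navigate existing data, optionally filtered by kind.
    def items(self, kind=None):
        return self.request(op="list", kind=kind)["items"]


server = InProcessServer()
client = Client(server.handle)
token_id = client.create("token", start=0, end=6, text="Lisbon")
client.create("tag", target=token_id, value="NOUN")
```

A tool written against `create` and `items` never sees the message format, so the transport could be swapped (for example, for a socket) by passing a different `send` callable.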

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Graça, J., Mamede, N.J., Pereira, J.D. (2006). A Framework for Integrating Natural Language Tools. In: Vieira, R., Quaresma, P., Nunes, M.d.G.V., Mamede, N.J., Oliveira, C., Dias, M.C. (eds) Computational Processing of the Portuguese Language. PROPOR 2006. Lecture Notes in Computer Science, vol 3960. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11751984_12

  • DOI: https://doi.org/10.1007/11751984_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34045-4

  • Online ISBN: 978-3-540-34046-1

  • eBook Packages: Computer Science, Computer Science (R0)
