Incremental and Adaptive Software Systems Development of Natural Language Applications

  • Conference paper
Information System Development

Abstract

Natural Language (NL) processing tools, such as tokenizers, part-of-speech taggers, or syntactic processors, obtain knowledge from a set of documents (e.g., tokens, syntactic patterns, etc.) and produce the different elements that will take part in the discourse universe of an NL text (e.g., noun phrases, verbs, sentences, etc.). In this paper, we present how NL software systems development can be performed incrementally by using a high-performance specification language like Maude. A generic algebraic specification for NL is defined, including sorts and sub-sorts as well as equational properties, such as associativity and commutativity for built-in lists and sets. Then, the full discourse universe available for NL processing is described in terms of the algebraic specification by providing a non-deterministic but terminating set of transformation rules. Finally, as a proof of concept, a set of documents for NL processing is given to Maude as an input term and successfully transformed into a proper document, exploring all the non-deterministic possibilities and resolving the ambiguity in language. The main advantages of implementing NL in this manner are generality, transparency, extensibility, reusability, and maintainability. To the best of our knowledge, this is the first attempt to represent and develop complex NL software systems with this formal notation and, based on the analysis conducted, this implementation constitutes the basis for the design and development of more specific NL processing applications, such as text summarization.
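
To make the idea of a non-deterministic but terminating set of transformation rules more concrete, the following is a minimal sketch in Python rather than Maude. It is not the authors' specification: the lexicon, the tag names, the phrase rules, and the functions `successors` and `search` are all hypothetical and chosen only for illustration. Each rule either replaces a word with one of its possible tags or shortens the sequence, so every rewrite sequence terminates, and the search explores every alternative.

```python
# Hypothetical toy lexicon: each word maps to its possible part-of-speech tags.
LEXICON = {
    "the": {"DT"},
    "old": {"JJ", "NN"},
    "man": {"NN", "VB"},
    "boats": {"NNS", "VBZ"},
}

# Hypothetical phrase rules: a sequence of categories rewrites to one category.
# Every rule shortens the sequence, which guarantees termination.
RULES = [
    (("DT", "NN"), "NP"),
    (("DT", "JJ", "NN"), "NP"),
    (("DT", "NNS"), "NP"),
    (("VB", "NP"), "VP"),
    (("NP", "VP"), "S"),
]


def successors(state):
    """Yield every state reachable from `state` by one rule application."""
    # Lexical step: replace a word by one of its possible tags.
    for i, symbol in enumerate(state):
        for tag in sorted(LEXICON.get(symbol, ())):
            yield state[:i] + (tag,) + state[i + 1:]
    # Phrase step: replace a matching run of categories by a single category.
    for lhs, rhs in RULES:
        n = len(lhs)
        for i in range(len(state) - n + 1):
            if state[i:i + n] == lhs:
                yield state[:i] + (rhs,) + state[i + n:]


def search(tokens):
    """Explore all rewrite sequences and return the set of normal forms."""
    start = tuple(tokens)
    seen = {start}
    stack = [start]
    normal_forms = set()
    while stack:
        state = stack.pop()
        next_states = list(successors(state))
        if not next_states:
            # No rule applies: this state is a normal form.
            normal_forms.add(state)
        for nxt in next_states:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return normal_forms


normal_forms = search("the old man the boats".split())
print(("S",) in normal_forms)
```

In this toy example the ambiguous sentence "the old man the boats" reaches the sentence category `S` only when "old" is read as a noun and "man" as a verb; the reading with "old" as an adjective ends in the normal form `("NP", "NP")`. Exploring all alternatives is what lets the ambiguity be resolved.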

Notes

  1. http://gate.ac.uk/

  2. uima.apache.org/

  3. http://www-nlpir.nist.gov/projects/duc/

  4. http://www.sketchengine.co.uk/tagsets/penn.html

Acknowledgements

E. Lloret and M. Palomar have been partially funded by the Spanish Government through the projects TEXT-MESS 2.0 (TIN2009-13391-C04) and Técnicas de Deconstrucción en la tecnologías del Lenguaje Humano (TIN2012-31224), and by the Generalitat Valenciana through project PROMETEO (PROMETEO/2009/199). Moreover, S. Escobar has been partially supported by the EU (FEDER) and the Spanish MEC/MICINN under grant TIN 2010-21062-C02-02, and by Generalitat Valenciana PROMETEO2011/052.

Author information

Corresponding author

Correspondence to Elena Lloret.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Lloret, E., Escobar, S., Palomar, M., Ramos, I. (2014). Incremental and Adaptive Software Systems Development of Natural Language Applications. In: José Escalona, M., Aragón, G., Linger, H., Lang, M., Barry, C., Schneider, C. (eds) Information System Development. Springer, Cham. https://doi.org/10.1007/978-3-319-07215-9_41

  • DOI: https://doi.org/10.1007/978-3-319-07215-9_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07214-2

  • Online ISBN: 978-3-319-07215-9

  • eBook Packages: Computer Science (R0)
