Re-expressing Business Processes Information from Corporate Documents into Controlled Language

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2016)

Abstract

In this paper, we propose a top-down approach for converting business process information from corporate documents into controlled language. The approach follows a multi-level methodology. First, we characterize document structure through rhetorical analysis to determine which sections are relevant for information extraction. Then, we perform a verb-centered event analysis to begin defining the typical patterns of business process information. Lastly, we carry out morpho-syntactic and dependency parsing to extract this information. This multi-level knowledge is used to define rules for converting the extracted sentences into a controlled language intended for use in software requirements elicitation.
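The last step of the pipeline described above, turning a dependency-parsed sentence into a controlled-language statement, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the rules defined in the paper: the `Token` structure, the dependency labels (`subj`, `obj`, `mod`), the part-of-speech tags, and the "subject verb object" output pattern are all hypothetical, and the parse is written by hand rather than produced by a parser.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    idx: int     # 1-based position in the sentence
    form: str    # surface form
    lemma: str   # base form
    pos: str     # coarse part of speech (hypothetical tag set)
    head: int    # index of the governing token, 0 for the root
    deprel: str  # dependency label (hypothetical label set)


def find_dependent(tokens: List[Token], head_idx: int, deprel: str) -> Optional[Token]:
    """Return the first token attached to head_idx with the given label."""
    for tok in tokens:
        if tok.head == head_idx and tok.deprel == deprel:
            return tok
    return None


def phrase(tokens: List[Token], head: Token) -> str:
    """Lemma of the head preceded by its pre-modifiers; determiners are dropped."""
    mods = [t for t in tokens
            if t.head == head.idx and t.deprel == "mod" and t.idx < head.idx]
    return " ".join([m.lemma for m in mods] + [head.lemma])


def to_controlled(tokens: List[Token]) -> Optional[str]:
    """Apply one assumed rule: verb root with subj and obj -> 'subject verb object'."""
    root = next((t for t in tokens if t.head == 0 and t.pos == "VERB"), None)
    if root is None:
        return None
    subj = find_dependent(tokens, root.idx, "subj")
    obj = find_dependent(tokens, root.idx, "obj")
    if subj is None or obj is None:
        return None  # the sentence does not match this pattern
    return f"{phrase(tokens, subj)} {root.form} {phrase(tokens, obj)}"


# Hand-written parse of "The accounting department approves the purchase orders."
SENTENCE = [
    Token(1, "The", "the", "DET", 3, "det"),
    Token(2, "accounting", "accounting", "NOUN", 3, "mod"),
    Token(3, "department", "department", "NOUN", 4, "subj"),
    Token(4, "approves", "approve", "VERB", 0, "root"),
    Token(5, "the", "the", "DET", 7, "det"),
    Token(6, "purchase", "purchase", "NOUN", 7, "mod"),
    Token(7, "orders", "order", "NOUN", 4, "obj"),
    Token(8, ".", ".", "PUNCT", 4, "punct"),
]

print(to_controlled(SENTENCE))  # accounting department approves purchase order
```

Sentences that do not match the pattern return `None`, which reflects the idea that only sentences fitting a known event pattern are re-expressed; a fuller rule set would need many more patterns than this single one.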

Notes

  1. This work has been partly funded by the University of Medellin’s Research Vice-provost’s Office, Wake Forest University, and National University of Colombia, under the project: “Defining a Specific-Domain Controlled Language: Linguistic and Transformational Bases from Corporate Documents in Natural Language”.

  2. A controlled natural language is a sub-language of the corresponding natural language [2].

  3. UN-Lencep is the Spanish acronym for ‘National University of Colombia: Controlled language for the specification of pre-conceptual models’.

  4. We used Freeling (http://nlp.lsi.upc.edu/freeling/) for dependency parsing.

References

  1. Manrique-Losada, B., Burgos, D.A., Zapata-Jaramillo, C.M.: Exploring MWEs for knowledge acquisition from corporate technical documents. In: 9th Workshop on Multiword Expressions -MWE 2013, NAACL 2013, Atlanta, July 2013

  2. Fuchs, N.E., Schwitter, R.: Specifying logic programs in controlled natural language. Technical report IFI 95.17, University of Zurich (1995)

  3. Cybulski, J.L., Reed, K.: Requirements classification and reuse: crossing domain boundaries. In: Frakes, W.B. (ed.) ICSR 2000. LNCS, vol. 1844, pp. 190–210. Springer, Heidelberg (2000)

  4. Cleland-Huang, J., Marrero, W., Berenbach, B.: Goal-centric traceability: using virtual plumblines to maintain critical systemic qualities. IEEE Trans. Softw. Eng. 34, 685–699 (2008)

  5. Bajwa, I.S., Lee, M., Bordbar, B.: SBVR business rules generation from natural language specification. In: AAAI Spring Symposium, pp. 2–8. AAAI, San Francisco (2011)

  6. Meth, H., Li, Y., Maedche, A., Mueller, B.: Advancing task elicitation systems–an experimental evaluation of design principles. In: Proceedings of 33rd International Conference on Information Systems, pp. 54–68. AISEL, Florida (2012)

  7. Young, J.D., Antón, A.I.: A method for identifying software requirements based on policy commitments. In: 18th International Requirements Engineering Conference, pp. 47–56. IEEE, Sydney (2010)

  8. Wang, F.H.: On acquiring classification knowledge from noisy data based on rough set. Expert Syst. Appl. 29(1), 49–64 (2005)

  9. Dinesh, N., Joshi, A., Lee, I., Webber, B.: Extracting formal specifications from natural language regulatory documents. In: ICoS-5, Buxton (2006)

  10. Vegega, C., Amatriain, H., Pytel, P., Pollo, F., Britos, P., García, R.: Formalización de Dominios de Negocio basada en Técnicas de Ingeniería del Conocimiento para Proyectos de Explotación de Información. In: Proceedings of IX JIISIC, pp. 79–86. PUCP, Lima (2012)

  11. Aysolmaz, B., Demirors, O.: Modeling business processes to generate artifacts for software development: a methodology. In: Proceedings of the 6th International Workshop on Modeling in Software Engineering, pp. 7–12. ACM, New York (2014)

  12. Hao, J., Yan, Y., Gong, L., Wang, G., Lin, J.: Knowledge map-based method for domain knowledge browsing. Decis. Support Syst. 61, 106–114 (2014)

  13. Tavares, V., Santoro, F.M., Borges, M.R.S.: A context-based model for knowledge management embodied in work processes. Inf. Sci. 179, 2538–2554 (2009)

  14. Azaustre, A., Casas, J.: Manual de retórica española. Ariel, Barcelona (1997)

  15. Burdiles, G.A.: Descripción de la organización retórica del género caso clínico de la medicina a partir del corpus CCM-2009. Ph.D. thesis in Applied Linguistics. Pontificia Universidad Católica de Valparaíso, Chile (2011)

  16. Swales, J.M.: Research Genres: Explorations and Applications. Cambridge University Press, Cambridge (2004)

  17. Parodi, G.: Lingüística de corpus: una introducción al ámbito. Revista de Lingüística Teórica y Aplicada 46(1), 93–119 (2008)

  18. Manrique-Losada, B.: A formalization for mapping discourses from business-based technical documents into controlled language texts for requirements elicitation. Ph.D. thesis, Universidad Nacional de Colombia (2014)

  19. Pivovarova, L., Huttunen, S., Yangarber, R.: Event representation across genre. In: Proceedings of 1st Workshop on EVENTS: Definition, Detection, Coreference, and Representation, pp. 29–37 (2013)

  20. Do, Q.X., Chan, Y.S., Roth, D.: Minimally supervised event causality identification. In: EMNLP 2011 (2011)

  21. Bejan, C.A., Harabagiu, S.: Unsupervised Event Coreference Resolution. Computational Linguistics 40(2) (2013)

  22. Vossen, P. (ed.): EuroWordNet General Document. Version 3. University of Amsterdam, Amsterdam (2002)

  23. Chaowicharat, E., Naruedomkul, K.: Co-occurrence-based error correction approach to word segmentation. In: Boonthum-Denecke, C., McCarthy, P.M., Lamkin, T. (eds.) Cross-Disciplinary Advances in Applied Natural Language Processing: Issues and Approaches, pp. 354–364. Information Sciences Reference Publishers (2012)

Author information

Corresponding author

Correspondence to Diego A. Burgos.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Manrique-Losada, B., Zapata-Jaramillo, C.M., Burgos, D.A. (2016). Re-expressing Business Processes Information from Corporate Documents into Controlled Language. In: Métais, E., Meziane, F., Saraee, M., Sugumaran, V., Vadera, S. (eds) Natural Language Processing and Information Systems. NLDB 2016. Lecture Notes in Computer Science, vol 9612. Springer, Cham. https://doi.org/10.1007/978-3-319-41754-7_37

  • DOI: https://doi.org/10.1007/978-3-319-41754-7_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41753-0

  • Online ISBN: 978-3-319-41754-7

  • eBook Packages: Computer Science, Computer Science (R0)
