Developing Document Analysis and Data Extraction Tools for Entity Modelling

Fulford, Heather

doi:10.1007/3-540-45399-7_22

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1959)

Included in the following conference series:

International Conference on Application of Natural Language to Information Systems

Abstract

The entity-relationship approach to conceptual modelling for database design conventionally begins with the analysis of natural language system specifications to identify entities, attributes, and relationships, in preparation for creating entity models represented as entity-relationship diagrams. This document-scanning task can be both time-consuming and complex, often requiring linguistic knowledge, subject domain knowledge, judgement, and intuition. To help alleviate the burden of this aspect of database design, we present some of our research into the development of tools for analysing natural language specifications and extracting candidate entities, attributes, and relationships. Drawing on research in corpus linguistics and terminology science, our approach relies on an examination of patterns of word co-occurrence and the use of ‘linguistic cues’. We indicate how we intend to integrate our tools into a CASE environment to support database designers during each stage of their work, from the analysis of system specifications through to code generation.
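
As a rough illustration of what cue-driven extraction of this kind might look like, the following Python sketch scans a short specification for candidate entities, candidate entity/attribute pairs, and left-hand collocates of a chosen word. The cue patterns ("each"/"every" before a noun, "X has a Y"), the function names, and the sample specification are all assumptions made here for illustration; the abstract does not state which cues or algorithms the tools actually use.

```python
import re
from collections import Counter

# Hypothetical cue patterns, not taken from the paper.
# A quantifier before a word is treated as a hint that the word names an entity.
ENTITY_CUE = re.compile(r"\b(?:each|every)\s+([a-z]+)", re.IGNORECASE)
# "X has a Y" is treated as a hint that Y is an attribute of entity X.
ATTRIBUTE_CUE = re.compile(
    r"\b([a-z]+)\s+(?:has|have)\s+(?:a|an)\s+([a-z]+)", re.IGNORECASE
)


def candidate_entities(text):
    """Count the words that directly follow an entity cue word."""
    return Counter(m.group(1).lower() for m in ENTITY_CUE.finditer(text))


def candidate_attributes(text):
    """Return (entity, attribute) pairs suggested by the 'has a' cue."""
    return [
        (m.group(1).lower(), m.group(2).lower())
        for m in ATTRIBUTE_CUE.finditer(text)
    ]


def left_collocates(text, node):
    """Count the words that occur immediately to the left of `node`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(
        tokens[i - 1] for i in range(1, len(tokens)) if tokens[i] == node
    )


# Illustrative specification text (invented for this example).
SPEC = (
    "Each customer places an order. "
    "Every order has a date. "
    "Each customer has a name."
)

print(candidate_entities(SPEC))
print(candidate_attributes(SPEC))
print(left_collocates(SPEC, "order"))
```

The output is only a list of candidates: a designer would still need to confirm or reject each one, which matches the supporting role the abstract describes for such tools.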

Author information

Authors and Affiliations

Business School, Loughborough University, Loughborough, Leicestershire, LE11 3TU, UK
Heather Fulford

Editor information

Editors and Affiliations

PRiSM Laboratory, University of Versailles, 45 av. des Etats-Unis, 78035, Paris, France
Mokrane Bouzeghoub & Zoubida Kedad

CNAM, CEDRIC Laboratory, 292 rue Saint-Martin, 75003, Paris
Elisabeth Métais

About this paper

Cite this paper

Fulford, H. (2001). Developing Document Analysis and Data Extraction Tools for Entity Modelling. In: Bouzeghoub, M., Kedad, Z., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2000. Lecture Notes in Computer Science, vol 1959. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45399-7_22

DOI: https://doi.org/10.1007/3-540-45399-7_22
Published: 11 May 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41943-3
Online ISBN: 978-3-540-45399-4
eBook Packages: Springer Book Archive
