Extracting Semistructured Data - Lessons Learnt

  • Conference paper
Natural Language Processing — NLP 2000 (NLP 2000)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1835)

Abstract

The Yellow Pages Assistant (YPA) is a natural language dialogue system that guides a user through a dialogue in order to retrieve addresses from the Yellow Pages. Part of the work in this project concerns the construction of a Backend, i.e. the database that is extracted from the raw input text and is needed for online access to the addresses. Here we discuss some aspects of this task and report on experiences that may be of interest to other projects.

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kruschwitz, U., De Roeck, A., Scott, P., Steel, S., Turner, R., Webb, N. (2000). Extracting Semistructured Data - Lessons Learnt. In: Christodoulakis, D.N. (eds) Natural Language Processing — NLP 2000. NLP 2000. Lecture Notes in Computer Science, vol 1835. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45154-4_37

  • DOI: https://doi.org/10.1007/3-540-45154-4_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67605-8

  • Online ISBN: 978-3-540-45154-9

  • eBook Packages: Springer Book Archive
