The PHASAR Search Engine

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3999)

Abstract

This article describes the rationale behind the PHASAR system (Phrase-based Accurate Search And Retrieval), a professional Information Retrieval and Text Mining system under development for the collection of information about metabolites from the biological literature. The system is generic in nature and applicable (given suitable linguistic resources and thesauri) to many other forms of professional search. Instead of keywords, the PHASAR search engine uses Dependency Triples as terms. Both the documents and the queries are parsed, transduced to Dependency Triples and lemmatized. Queries consist of a set of Dependency Triples, whose elements may be generalized or specialized in order to achieve the desired precision and recall. In order to help in interactive exploration, the search process is supported by document frequency information from the index, both for terms from the query and for terms from the thesaurus.
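
The idea of using Dependency Triples as index terms, and of generalizing or specializing their elements, can be illustrated with a minimal sketch. It is not the PHASAR implementation: the `Triple` layout (head, relation, modifier), the `TripleIndex` class, the relation labels and the toy `THESAURUS` are all assumptions made for illustration, and the parsing and lemmatization steps are skipped by writing the triples by hand.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Optional, Set


class Triple(NamedTuple):
    """Hypothetical triple layout; the abstract does not fix a format."""
    head: Optional[str]
    relation: Optional[str]
    modifier: Optional[str]


# Toy thesaurus (an assumption): a broader term maps to its narrower terms.
THESAURUS = {"metabolite": {"glucose", "lactate"}}


def _slot_matches(pattern: Optional[str], value: str) -> bool:
    # None acts as a wildcard; a broader term matches its narrower terms.
    return (
        pattern is None
        or pattern == value
        or value in THESAURUS.get(pattern, ())
    )


class TripleIndex:
    """Inverted index whose terms are dependency triples, not keywords."""

    def __init__(self) -> None:
        self.postings = defaultdict(set)  # Triple -> set of document ids

    def add(self, doc_id: str, triples: Iterable[Triple]) -> None:
        for triple in triples:
            self.postings[triple].add(doc_id)

    def match(self, pattern: Triple) -> Set[str]:
        """Documents containing at least one triple that fits the pattern."""
        docs: Set[str] = set()
        for triple, ids in self.postings.items():
            if all(_slot_matches(p, v) for p, v in zip(pattern, triple)):
                docs |= ids
        return docs

    def document_frequency(self, pattern: Triple) -> int:
        """Number of documents matching the pattern, shown to guide the user."""
        return len(self.match(pattern))

    def search(self, query: Iterable[Triple]) -> Set[str]:
        """Documents that match every pattern in the query."""
        result = None
        for pattern in query:
            docs = self.match(pattern)
            result = docs if result is None else result & docs
        return result or set()


# Hand-written, already lemmatized triples standing in for parser output.
index = TripleIndex()
index.add("d1", [Triple("produce", "OBJ", "glucose"),
                 Triple("produce", "SUBJ", "liver")])
index.add("d2", [Triple("produce", "OBJ", "lactate"),
                 Triple("produce", "SUBJ", "muscle")])
index.add("d3", [Triple("transport", "OBJ", "glucose")])

specific = Triple("produce", "OBJ", "glucose")
generalized = Triple("produce", "OBJ", "metabolite")  # broader modifier
print(index.document_frequency(specific))     # 1
print(index.document_frequency(generalized))  # 2
```

In this sketch, replacing `glucose` with the broader term `metabolite` raises the document frequency from 1 to 2, which trades precision for recall; adding a second triple such as `Triple("produce", "SUBJ", "liver")` to the query narrows the result again. The linear scan in `match` is chosen for brevity and would not scale to a real collection.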

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Koster, C.H.A., Seibert, O., Seutter, M. (2006). The PHASAR Search Engine. In: Kop, C., Fliedl, G., Mayr, H.C., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2006. Lecture Notes in Computer Science, vol 3999. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11765448_13

  • DOI: https://doi.org/10.1007/11765448_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34616-6

  • Online ISBN: 978-3-540-34617-3

  • eBook Packages: Computer Science, Computer Science (R0)
