PIRE: An Extensible IR Engine Based on Probabilistic Datalog

  • Conference paper
Advances in Information Retrieval (ECIR 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3408)

Abstract

This paper introduces PIRE, a probabilistic IR engine. For both document indexing and retrieval, PIRE makes heavy use of probabilistic Datalog, a probabilistic extension of predicate Horn logic. Combining such a logical framework with probability theory makes it possible to define and use data types (e.g. text, names, numbers), different weighting schemes (e.g. normalised tf, tf.idf or BM25) and retrieval functions (e.g. uncertain inference, language models). Extending the system is thus reduced to adding new rules. Furthermore, this logical framework provides a powerful tool for including additional background knowledge in the retrieval process.
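
To make the idea of "weighting schemes and retrieval functions as rules" more concrete, the following is a minimal Python sketch, not PIRE's actual rule set or implementation. It assumes a hypothetical rule of the form `retrieve(D) :- qterm(T) & term(T, D)`, with normalised tf as the indexing weight, query terms treated as disjoint events (so derivation probabilities add) and each query weight treated as independent of the document weight (so they multiply). The names `normalised_tf`, `retrieve`, `qterm` and `term` are illustrative assumptions.

```python
from collections import Counter


def normalised_tf(docs):
    """Build probabilistic facts term(T, D) from raw text.

    Assumed weighting scheme: tf divided by the largest tf in the
    same document, so every weight lies in (0, 1].
    """
    facts = {}
    for doc_id, text in docs.items():
        counts = Counter(text.lower().split())
        if not counts:
            continue  # an empty document contributes no facts
        max_tf = max(counts.values())
        for term, tf in counts.items():
            facts[(term, doc_id)] = tf / max_tf
    return facts


def retrieve(query_weights, term_facts):
    """Evaluate the hypothetical rule  retrieve(D) :- qterm(T) & term(T, D).

    Within one derivation the two facts are assumed independent, so
    their probabilities are multiplied. Query terms are assumed to be
    disjoint events, so the derivations for one document are summed.
    """
    scores = {}
    for (term, doc_id), p_term in term_facts.items():
        p_query = query_weights.get(term)
        if p_query is not None:
            scores[doc_id] = scores.get(doc_id, 0.0) + p_query * p_term
    return scores


# Toy collection and query; the weights are made up for illustration.
docs = {
    "d1": "datalog datalog retrieval",
    "d2": "retrieval engine",
}
query = {"datalog": 0.5, "retrieval": 0.5}

scores = retrieve(query, normalised_tf(docs))
for doc_id, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(doc_id, score)
```

Under these assumptions, switching to a different weighting scheme only means replacing `normalised_tf` with another function that produces `term(T, D)` facts, while the retrieval rule stays unchanged. This mirrors, in spirit, the abstract's point that extending the system reduces to adding new rules.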

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nottelmann, H. (2005). PIRE: An Extensible IR Engine Based on Probabilistic Datalog. In: Losada, D.E., Fernández-Luna, J.M. (eds) Advances in Information Retrieval. ECIR 2005. Lecture Notes in Computer Science, vol 3408. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31865-1_19

  • DOI: https://doi.org/10.1007/978-3-540-31865-1_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25295-5

  • Online ISBN: 978-3-540-31865-1

  • eBook Packages: Computer Science, Computer Science (R0)
