PIRE: An Extensible IR Engine Based on Probabilistic Datalog

Nottelmann, Henrik

doi:10.1007/978-3-540-31865-1_19

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3408)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

This paper introduces PIRE, a probabilistic IR engine. For both document indexing and retrieval, PIRE makes heavy use of probabilistic Datalog, a probabilistic extension of predicate Horn logics. Using such a logical framework together with probability theory allows for defining and using data types (e.g. text, names, numbers), different weighting schemes (e.g. normalised tf, tf.idf or BM25) and retrieval functions (e.g. uncertain inference, language models). Extending the system is thus reduced to adding new rules. Furthermore, this logical framework provides a powerful tool for including additional background knowledge in the retrieval process.
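
The abstract does not show PIRE's actual rules, so the following is only a minimal sketch of the general idea: a weighting scheme produces probabilistic facts, and a single Datalog-style rule turns them into a retrieval score. The predicate names (about, qterm, term), the example documents, the query weights, and the choice to combine derivations as independent events are all assumptions made for illustration, not taken from the paper.

```python
from collections import Counter


def normalised_tf(tokens):
    """Weighting scheme: map each term to tf / max tf, a value in (0, 1].

    The result plays the role of probabilistic facts term(T, D).
    """
    counts = Counter(tokens)
    max_tf = max(counts.values())
    return {term: tf / max_tf for term, tf in counts.items()}


def score(query_weights, doc_weights):
    """Evaluate the hypothetical rule  about(D) :- qterm(T) & term(T, D).

    Each query term T gives one derivation with probability
    P(qterm(T)) * P(term(T, D)). Derivations are combined as
    independent events (an assumption of this sketch):
    P(about(D)) = 1 - prod(1 - p_derivation).
    """
    p_none = 1.0
    for term, q_weight in query_weights.items():
        p_none *= 1.0 - q_weight * doc_weights.get(term, 0.0)
    return 1.0 - p_none


# Toy collection (hypothetical data).
docs = {
    "d1": "probabilistic datalog for retrieval retrieval".split(),
    "d2": "datalog rules".split(),
}
index = {doc_id: normalised_tf(tokens) for doc_id, tokens in docs.items()}

# Probabilistic query facts qterm(T) (hypothetical weights).
query = {"retrieval": 0.5, "datalog": 0.5}

ranking = sorted(index, key=lambda d: score(query, index[d]), reverse=True)
print(ranking)
```

In this sketch, switching to another weighting scheme means replacing normalised_tf, and switching to another retrieval function means replacing the rule evaluated in score, which mirrors the abstract's claim that extending the system amounts to adding rules.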

Author information

Authors and Affiliations

Institute of Informatics and Interactive Systems, University of Duisburg-Essen, 47048, Duisburg, Germany
Henrik Nottelmann

Editor information

Editors and Affiliations

Departamento de Electrónica y Computación, Universidad de Santiago de Compostela, Spain
David E. Losada
Departamento de Ciencias de la Computación e Inteligencia Artificial E.T.S.I. Informática y de Telecomunicación, Universidad de Granada, 18071, Granada, Spain
Juan M. Fernández-Luna

About this paper

Cite this paper

Nottelmann, H. (2005). PIRE: An Extensible IR Engine Based on Probabilistic Datalog. In: Losada, D.E., Fernández-Luna, J.M. (eds) Advances in Information Retrieval. ECIR 2005. Lecture Notes in Computer Science, vol 3408. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31865-1_19

DOI: https://doi.org/10.1007/978-3-540-31865-1_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25295-5
Online ISBN: 978-3-540-31865-1
eBook Packages: Computer Science (R0)
