Abstract
We present a coherent framework for XQuery processing that incorporates IR-style approximate matching and allows results to be ordered by relevance score. Our relevance ranking algorithm is based on both stem matching and term proximity. Our XQuery processor is stream-based and consists of iterators connected into pipelines. In our framework, every value produced by an XQuery expression is assigned a score, and these scores are propagated and combined as values pass through the iterators. The most important feature of our evaluation engine is its use of structural and content-based inverted indexes that deliver data in document order, which permits efficient merge joins for evaluating path expressions and search predicates. We present rules for translating a large part of XQuery into iterator pipelines. Our modular approach to building pipelines scales to queries of any complexity because pipes can be connected in the same way that complex queries are composed from simpler ones.
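The pipeline idea described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Scored` record, the `scan`, `merge_join`, and `rank` functions, the integer position encoding, and the choice of multiplying scores are all assumptions made for illustration. The sketch only shows how two index streams that arrive in document order can be merge-joined in one pass while scores travel with the values.

```python
from typing import Iterator, NamedTuple


class Scored(NamedTuple):
    pos: int      # node position in document order (hypothetical encoding)
    score: float  # relevance score carried through the pipeline


def scan(postings: list[tuple[int, float]]) -> Iterator[Scored]:
    """Leaf iterator: stream one posting list that is already sorted by position."""
    for pos, score in postings:
        yield Scored(pos, score)


def merge_join(left: Iterator[Scored], right: Iterator[Scored]) -> Iterator[Scored]:
    """Emit positions present in both sorted streams, combining their scores."""
    l = next(left, None)
    r = next(right, None)
    while l is not None and r is not None:
        if l.pos < r.pos:
            l = next(left, None)      # left is behind, advance it
        elif l.pos > r.pos:
            r = next(right, None)     # right is behind, advance it
        else:
            # Assumed combination rule (not from the paper): multiply scores.
            yield Scored(l.pos, l.score * r.score)
            l = next(left, None)
            r = next(right, None)


def rank(stream: Iterator[Scored]) -> list[Scored]:
    """Final stage: materialize the stream and order it by descending score."""
    return sorted(stream, key=lambda s: s.score, reverse=True)


# Two hypothetical posting lists, each sorted by position.
structural_postings = [(3, 0.5), (7, 1.0), (12, 0.25)]
content_postings = [(7, 0.5), (9, 1.0), (12, 1.0)]

results = rank(merge_join(scan(structural_postings), scan(content_postings)))
```

Because both inputs are sorted, the join reads each stream once and never buffers more than one item per side; only the final ranking step needs to materialize the results. Larger pipelines would be built by feeding the output of one `merge_join` into another, mirroring how nested query expressions are composed.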
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fegaras, L. (2004). XQuery Processing with Relevance Ranking. In: Bellahsène, Z., Milo, T., Rys, M., Suciu, D., Unland, R. (eds) Database and XML Technologies. XSym 2004. Lecture Notes in Computer Science, vol 3186. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30081-6_5
DOI: https://doi.org/10.1007/978-3-540-30081-6_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22969-8
Online ISBN: 978-3-540-30081-6
eBook Packages: Springer Book Archive