Abstract
XML query languages proposed so far are limited to Boolean retrieval in the sense that query results are sets of qualifying XML elements or subgraphs. This search paradigm is intriguing for “closed” collections of XML documents such as e-commerce catalogs, but we argue that it is inadequate for searching the Web where we would prefer ranked lists of results based on relevance estimation. IR-style Web search engines, on the other hand, are incapable of exploiting the additional information made explicit in the structure, element names, and attributes of XML documents. In this paper we present a compact query language, coined XXL for “flexible XML search language”, that reconciles both search paradigms by combining XML graph pattern matching with relevance estimations and producing ranked lists of XML subgraphs as search results. The paper describes the language design, sketches implementation issues, and presents preliminary experimental results.
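The combination described in the abstract, structural matching followed by relevance ranking, can be illustrated with a minimal Python sketch. It is not XXL and not the authors' implementation: the sample catalog, the function names `relevance` and `ranked_search`, and the term-overlap score are all assumptions made for illustration, standing in for whatever relevance estimation the paper actually uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; not taken from the paper.
CATALOG = """
<catalog>
  <book><title>XML query languages</title><abstract>graph pattern matching for XML data</abstract></book>
  <book><title>Web search</title><abstract>ranking web pages by relevance</abstract></book>
  <book><title>Cooking basics</title><abstract>soups and stews</abstract></book>
</catalog>
"""

def relevance(text, terms):
    """Fraction of query terms that occur in the text (an assumed stand-in score)."""
    words = set(text.lower().split())
    return sum(t.lower() in words for t in terms) / len(terms)

def ranked_search(xml_text, path, terms):
    """Match elements structurally by path, then rank the matches by content score."""
    root = ET.fromstring(xml_text)
    scored = []
    for elem in root.findall(path):        # structural (Boolean) match
        text = " ".join(elem.itertext())   # all text below the matched element
        score = relevance(text, terms)
        if score > 0:                      # drop matches with no term overlap
            scored.append((score, elem.findtext("title")))
    # Highest score first; the sort is stable, so ties keep document order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

results = ranked_search(CATALOG, "./book", ["xml", "query", "relevance"])
for score, title in results:
    print(f"{score:.2f}  {title}")
```

A purely Boolean query would return all three `book` elements as an unordered set. Here the structural match only selects candidates, and the score decides which appear and in what order.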
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Theobald, A., Weikum, G. (2001). Adding Relevance to XML. In: Goos, G., Hartmanis, J., van Leeuwen, J., Suciu, D., Vossen, G. (eds) The World Wide Web and Databases. WebDB 2000. Lecture Notes in Computer Science, vol 1997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45271-0_7
DOI: https://doi.org/10.1007/3-540-45271-0_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41826-9
Online ISBN: 978-3-540-45271-3
eBook Packages: Springer Book Archive