Abstract
Despite the great advances in XML data management and querying, the currently prevalent XPath- or XQuery-centric approaches face severe limitations when applied to XML documents in large intranets, digital libraries, federations of scientific data repositories, and ultimately the Web. In such environments, data has much more diverse structure and annotations than in a business-data setting, and there is virtually no hope for a common schema or DTD that all the data complies with. Without a schema, however, database-style querying would often produce either empty result sets, when queries are overly specific, or far too many results, when search predicates are overly broad. The latter happens when the user does not know enough about the structure and annotations of the data.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Theobald, M., Schenkel, R., Weikum, G. (2003). Classification and Focused Crawling for Semistructured Data. In: Blanken, H., Grabs, T., Schek, HJ., Schenkel, R., Weikum, G. (eds) Intelligent Search on XML Data. Lecture Notes in Computer Science, vol 2818. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45194-5_10
DOI: https://doi.org/10.1007/978-3-540-45194-5_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40768-3
Online ISBN: 978-3-540-45194-5
eBook Packages: Springer Book Archive