Vector Retrieval Model for XML Document Based on Dynamic Partition of Information Units

Cui, Li-zhen; Wang, Hai-yang

doi:10.1007/11495772_19


Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3528)

Abstract

XML documents are used more and more widely in Web applications. Because users want to find what they need among numerous XML documents, XML-based information retrieval has become a hot topic in the information retrieval field. Traditional XML retrieval techniques must define the retrieval unit and the result unit in advance, and the resulting granularity is either too coarse or too fine. In this paper we propose a retrieval method that dynamically partitions information units within the vector space model, using the structural and semantic information of the XML documents. This reduces the computational workload and improves the running efficiency of the whole retrieval system. We show that, at the same accuracy, the method is more efficient than the traditional approach, and we verify the results experimentally.
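
The abstract does not give the partitioning algorithm itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes raw term-frequency vectors, cosine similarity, and a hypothetical rule (`partition_units`) that splits an element into smaller units only when some descendant matches the query better than the element as a whole. All function names and the sample document are illustrative assumptions.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def term_vector(text):
    """Raw term-frequency vector (assumption: whitespace tokens, lowercased)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def partition_units(elem, query_vec):
    """Hypothetical rule: return a list of (score, element) retrieval units.

    An element is kept as a single unit unless one of the units found
    below it scores strictly higher, in which case the element is
    replaced by the units of its children.
    """
    own = cosine(term_vector(" ".join(elem.itertext())), query_vec)
    child_units = [u for child in elem for u in partition_units(child, query_vec)]
    if any(score > own for score, _ in child_units):
        return child_units
    return [(own, elem)] if own > 0 else []


# Illustrative document: chapter "a" is relevant as a whole,
# chapter "b" contains one relevant paragraph and one irrelevant one.
doc = ET.fromstring(
    '<book>'
    '<chapter id="a"><para>xml index</para><para>retrieval model</para></chapter>'
    '<chapter id="b"><para>xml retrieval</para>'
    '<para>cooking pasta recipes tonight</para></chapter>'
    '</book>'
)
units = partition_units(doc, term_vector("xml retrieval"))
for score, elem in units:
    print(round(score, 3), elem.tag, elem.get("id"))
```

Under this rule the unit size is chosen per query rather than fixed beforehand: chapter "a" is returned whole, while chapter "b" is narrowed to its single matching paragraph.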

Author information

Authors and Affiliations

School of Computer Science and Technology, Shandong University, Jinan, P.R. China
Li-zhen Cui & Hai-yang Wang

Editor information

Editors and Affiliations

Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland
Piotr S. Szczepaniak
Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland
Janusz Kacprzyk
Institute of Computer Science, Technical University of Łódź, Poland
Adam Niewiadomski

About this paper

Cite this paper

Cui, Lz., Wang, Hy. (2005). Vector Retrieval Model for XML Document Based on Dynamic Partition of Information Units. In: Szczepaniak, P.S., Kacprzyk, J., Niewiadomski, A. (eds) Advances in Web Intelligence. AWIC 2005. Lecture Notes in Computer Science, vol 3528. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11495772_19

DOI: https://doi.org/10.1007/11495772_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26219-0
Online ISBN: 978-3-540-31900-9
eBook Packages: Computer Science, Computer Science (R0)
