Abstract
Path-based indexing methods, such as three-dimensional bitmap indexing, have been used to collect and retrieve similar XML documents. In these methods the path is the fundamental unit from which the index is built. When a document's structure changes, the path extracted before the change and the path extracted after it are treated as entirely different, regardless of how small the change is. As a result, path-based indexing methods usually perform poorly when retrieving and clustering similar documents. Effective collection and retrieval of XML documents therefore requires a method that can detect similar paths. This paper defines a new path construction similarity, which measures the similarity between paths, and proposes a path bitmap indexing method that effectively loads and extracts similar paths. The proposed method extracts a representative path from each group of similar paths. Paths are clustered around these representative paths, and XML documents are in turn clustered using the clustered paths. This addresses the problem with existing three-dimensional bitmap indexing. In the performance evaluation, the proposed method showed better clustering accuracy than existing methods.
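The idea in the abstract above can be illustrated with a small sketch. The abstract does not give the paper's path construction similarity, so the code below substitutes a simple stand-in: the longest common subsequence of path elements divided by the longer path's length. The function names, the greedy clustering, the 0.6 threshold, and the flat document-by-path bitmap are all assumptions made for illustration and are not the authors' method.

```python
from typing import Dict, List, Tuple


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two element lists."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


def path_similarity(p: str, q: str) -> float:
    """Stand-in similarity (an assumption, not the paper's definition):
    shared elements in order, divided by the longer path's length."""
    a = p.strip("/").split("/")
    b = q.strip("/").split("/")
    return lcs_length(a, b) / max(len(a), len(b))


def cluster_paths(paths: List[str], threshold: float = 0.6) -> Dict[str, str]:
    """Greedy clustering: map each path to the first representative whose
    similarity reaches the threshold; otherwise the path becomes a new
    representative itself."""
    representatives: List[str] = []
    assignment: Dict[str, str] = {}
    for path in paths:
        for rep in representatives:
            if path_similarity(path, rep) >= threshold:
                assignment[path] = rep
                break
        else:
            representatives.append(path)
            assignment[path] = path
    return assignment


def build_bitmap(
    docs: Dict[str, List[str]], assignment: Dict[str, str]
) -> Tuple[List[str], Dict[str, List[int]]]:
    """One bit per (document, representative path): 1 if the document
    contains any path mapped to that representative."""
    reps = sorted(set(assignment.values()))
    bitmap: Dict[str, List[int]] = {}
    for doc_id, paths in docs.items():
        present = {assignment[p] for p in paths}
        bitmap[doc_id] = [1 if r in present else 0 for r in reps]
    return reps, bitmap


# Hypothetical documents: d2 is d1 after a small structural change.
docs = {
    "d1": ["/book/title", "/book/author/name"],
    "d2": ["/book/title", "/book/authors/author/name"],
    "d3": ["/movie/director"],
}
all_paths = list(dict.fromkeys(p for ps in docs.values() for p in ps))
assignment = cluster_paths(all_paths)
reps, bitmap = build_bitmap(docs, assignment)
```

With exact path matching, `d1` and `d2` would share only `/book/title`. In this sketch `/book/authors/author/name` is mapped to the representative `/book/author/name`, so the two documents receive identical bit rows and can be clustered together.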
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, JM., Hwang, BY. (2006). Path Bitmap Indexing for Retrieval of XML Documents. In: Torra, V., Narukawa, Y., Valls, A., Domingo-Ferrer, J. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2006. Lecture Notes in Computer Science, vol 3885. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11681960_32
DOI: https://doi.org/10.1007/11681960_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32780-6
Online ISBN: 978-3-540-32781-3
eBook Packages: Computer Science (R0)