Path Bitmap Indexing for Retrieval of XML Documents

  • Conference paper
Modeling Decisions for Artificial Intelligence (MDAI 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3885)

Abstract

Path-based indexing methods, such as three-dimensional bitmap indexing, have been used to collect and retrieve similar XML documents. In these methods, the path is the fundamental unit for constructing the index. When a document's structure changes, the path extracted before the change and the path extracted after it are treated as entirely different, regardless of how small the change is. As a result, path-based indexing methods usually perform poorly when retrieving and clustering similar documents. A method that can detect similar paths is therefore needed for effective collection and retrieval of XML documents. This paper defines a new measure, path construction similarity, which calculates the similarity between paths, and proposes a path bitmap indexing method that effectively loads and extracts similar paths. The proposed method extracts a representative path from each group of similar paths. The paths are clustered using this representative path, and the XML documents are in turn clustered using the clustered paths. This solves the problem of existing three-dimensional bitmap indexing. In the performance evaluation, the proposed method showed better clustering accuracy than existing methods.
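The general idea can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not give the definition of path construction similarity, so the sketch substitutes a simple longest-common-subsequence ratio over path elements. The function names, the 0.6 threshold, the greedy choice of the first path in a cluster as its representative, and the sample documents are all assumptions made for illustration.

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def path_similarity(p, q):
    """Stand-in similarity (an assumption, not the paper's measure):
    LCS of the element names divided by the longer path's length."""
    a, b = p.strip("/").split("/"), q.strip("/").split("/")
    return lcs_len(a, b) / max(len(a), len(b))


def cluster_paths(paths, threshold=0.6):
    """Greedy clustering: a path joins the first cluster whose
    representative is similar enough, else it starts a new cluster."""
    reps, assignment = [], {}
    for path in paths:
        for i, rep in enumerate(reps):
            if path_similarity(path, rep) >= threshold:
                assignment[path] = i
                break
        else:
            reps.append(path)  # this path becomes a new representative
            assignment[path] = len(reps) - 1
    return reps, assignment


def build_bitmap(docs, assignment, n_clusters):
    """One bit row per document: bit i is set if the document
    contains any path assigned to path cluster i."""
    bitmap = {}
    for doc_id, paths in docs.items():
        row = [0] * n_clusters
        for p in paths:
            row[assignment[p]] = 1
        bitmap[doc_id] = row
    return bitmap


# Hypothetical documents, each reduced to its set of root-to-leaf paths.
docs = {
    "d1": ["/book/title", "/book/author/name"],
    "d2": ["/book/info/title", "/book/author/name"],
    "d3": ["/movie/director"],
}
all_paths = sorted({p for paths in docs.values() for p in paths})
reps, assignment = cluster_paths(all_paths)
bitmap = build_bitmap(docs, assignment, len(reps))
for doc_id, row in bitmap.items():
    print(doc_id, row)
```

In this sketch, `/book/title` and `/book/info/title` fall into the same path cluster, so `d1` and `d2` receive identical bit rows even though one path changed. An index keyed on exact paths would instead treat the two paths as unrelated.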

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, JM., Hwang, BY. (2006). Path Bitmap Indexing for Retrieval of XML Documents. In: Torra, V., Narukawa, Y., Valls, A., Domingo-Ferrer, J. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2006. Lecture Notes in Computer Science, vol 3885. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11681960_32

  • DOI: https://doi.org/10.1007/11681960_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32780-6

  • Online ISBN: 978-3-540-32781-3

  • eBook Packages: Computer Science, Computer Science (R0)
