Efficiently Maintaining Structural Associations of Semistructured Data

  • Conference paper
Advances in Informatics (PCI 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2563)

Abstract

Semistructured data arise frequently on the Web or in data integration systems. Semistructured objects that describe the same type of information have similar but not identical structure. Finding the common schema of a collection of semistructured objects is an important task, and because of the huge volume of such data, data mining techniques have been employed for it. Maintaining the discovered schema under updates, i.e., the addition of new objects, is equally important. In this paper, we study this maintenance problem. We use the notion of “negative borders”, introduced in the context of mining association rules, to efficiently find the new schema when objects are added to the database. We present experimental results that show the improved efficiency achieved by the proposed algorithm.
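To make the negative-border idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes plain itemsets in place of the structural (tree-shaped) patterns the paper mines, and the function names `support_counts`, `mine_with_border`, and `add_objects` are hypothetical. The negative border holds the candidates that are infrequent although all of their immediate subsets are frequent. As long as no border element becomes frequent after an addition, the new result can be derived from stored counts plus a scan of the added objects only. When a border element does become frequent, this sketch simply reports that a full re-mining is needed, a simplification of what border-based algorithms do at that point.

```python
from itertools import combinations


def support_counts(candidates, transactions):
    """Count, for each candidate itemset, the transactions that contain it."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def mine_with_border(transactions, min_frac):
    """Levelwise search; returns counts of frequent sets and of the negative border."""
    threshold = min_frac * len(transactions)
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    frequent, border = {}, {}
    while level:
        k = len(level[0])
        current = []
        for c, n in support_counts(level, transactions).items():
            if n >= threshold:
                frequent[c] = n
                current.append(c)
            else:
                # Counted candidates have only frequent subsets, so an
                # infrequent one belongs to the negative border.
                border[c] = n
        # Next level: (k+1)-sets whose k-subsets are all frequent.
        unions = {a | b for a, b in combinations(current, 2) if len(a | b) == k + 1}
        level = [c for c in unions
                 if all(frozenset(s) in frequent for s in combinations(c, k))]
    return frequent, border


def add_objects(frequent, border, old_size, increment, min_frac):
    """Update after new objects arrive, scanning only the increment.

    Returns (frequent, border, rescan_needed). If a border set becomes
    frequent, (None, None, True) is returned and the caller must re-mine.
    """
    threshold = min_frac * (old_size + len(increment))
    tracked = {**frequent, **border}
    # Items never seen before had a count of 0 in the old data.
    for t in increment:
        for i in t:
            tracked.setdefault(frozenset([i]), 0)
    delta = support_counts(tracked, increment)
    totals = {c: n + delta[c] for c, n in tracked.items()}
    if any(n >= threshold for c, n in totals.items() if c not in frequent):
        return None, None, True
    new_frequent = {c: n for c, n in totals.items()
                    if c in frequent and n >= threshold}
    # Formerly frequent sets may drop into the border; old border sets stay
    # only if all their immediate subsets are still frequent.
    new_border = {
        c: n for c, n in totals.items()
        if c not in new_frequent
        and all(frozenset(s) in new_frequent
                for s in combinations(c, len(c) - 1) if s)
    }
    return new_frequent, new_border, False
```

A caller would run `mine_with_border` once on the initial collection, keep the two dictionaries, and call `add_objects` for each batch of new objects, falling back to `mine_with_border` on the full data only when the third return value is true.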

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Katsaros, D. (2003). Efficiently Maintaining Structural Associations of Semistructured Data. In: Manolopoulos, Y., Evripidou, S., Kakas, A.C. (eds) Advances in Informatics. PCI 2001. Lecture Notes in Computer Science, vol 2563. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-38076-0_8

  • DOI: https://doi.org/10.1007/3-540-38076-0_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-07544-8

  • Online ISBN: 978-3-540-38076-4

  • eBook Packages: Springer Book Archive
