Efficiently Maintaining Structural Associations of Semistructured Data

Katsaros, Dimitrios

doi:10.1007/3-540-38076-0_8


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2563)

Included in the following conference series:

Panhellenic Conference on Informatics


Abstract

Semistructured data arise frequently on the Web and in data integration systems. Semistructured objects describing the same type of information have similar but not identical structure. Finding the common schema of a collection of semistructured objects is an important task, and because of the huge volume of such data, data mining techniques have been employed for it. Maintaining the discovered schema under updates, i.e., the addition of new objects, is equally important. In this paper, we study this maintenance problem. We use the notion of “negative borders”, introduced in the context of mining association rules, to efficiently find the new schema when objects are added to the database. We present experimental results that show the improved efficiency achieved by the proposed algorithm.
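To make the negative-border idea concrete, the following is a minimal sketch in the plain itemset setting of association rule mining, where the notion was introduced. It is not the algorithm of this paper: items stand in for the structural components of semistructured objects, and the function names `mine` and `update_with_increment` are hypothetical. The negative border is taken here to be the set of candidates whose proper subsets are all frequent but which are themselves infrequent. If their counts are stored, an increment can often be absorbed by scanning only the new objects.

```python
from itertools import combinations


def support_counts(transactions, candidates):
    """Count how many transactions contain each candidate itemset."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def mine(transactions, min_support):
    """Levelwise (Apriori-style) search.

    Returns (frequent, border): two dicts mapping itemset -> count.
    `border` is the negative border: counted candidates that turned
    out to be infrequent.
    """
    min_count = min_support * len(transactions)
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items]
    frequent, border = {}, {}
    while level:
        current = []
        for c, n in support_counts(transactions, level).items():
            if n >= min_count:
                frequent[c] = n
                current.append(c)
            else:
                border[c] = n
        # Next-level candidates: every subset one item smaller must be frequent.
        k = len(level[0]) + 1
        next_level = set()
        for a, b in combinations(current, 2):
            u = a | b
            if len(u) == k and all(
                frozenset(s) in frequent for s in combinations(u, k - 1)
            ):
                next_level.add(u)
        level = list(next_level)
    return frequent, border


def update_with_increment(frequent, border, old_n, increment, min_support):
    """Absorb `increment` by scanning only the new transactions.

    Returns the new frequent itemsets with counts, or None when some
    previously infrequent itemset becomes frequent, in which case new
    candidates may exist and the old data must be rescanned.
    """
    tracked = {**frequent, **border}
    # Items never seen before have an old count of zero.
    for item in {i for t in increment for i in t}:
        tracked.setdefault(frozenset([item]), 0)
    delta = support_counts(increment, tracked)
    min_count = min_support * (old_n + len(increment))
    new_counts = {c: tracked[c] + delta[c] for c in tracked}
    if any(n >= min_count for c, n in new_counts.items() if c not in frequent):
        return None
    return {c: n for c, n in new_counts.items() if c in frequent and n >= min_count}


old_db = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("bd")]
frequent, border = mine(old_db, 0.5)
added = [frozenset("ab"), frozenset("ac")]
updated = update_with_increment(frequent, border, len(old_db), added, 0.5)
```

In this toy run no border itemset crosses the threshold, so `updated` is obtained without touching `old_db` again and matches what a full re-mining of the combined data would give. When the function returns `None`, this sketch simply signals that a rescan is needed; how to limit the cost of that rescan is a separate design question.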



Author information

Authors and Affiliations

Department of Informatics, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece
Dimitrios Katsaros


Editor information

Editors and Affiliations

Dept. of Informatics, Aristotle University, 54006, Thessaloniki, Greece
Yannis Manolopoulos
Dept. of Computer Science, University of Cyprus, P.O. Box 20537, 1678, Nicosia, Cyprus
Skevos Evripidou & Antonis C. Kakas


About this paper

Cite this paper

Katsaros, D. (2003). Efficiently Maintaining Structural Associations of Semistructured Data. In: Manolopoulos, Y., Evripidou, S., Kakas, A.C. (eds) Advances in Informatics. PCI 2001. Lecture Notes in Computer Science, vol 2563. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-38076-0_8


DOI: https://doi.org/10.1007/3-540-38076-0_8
Published: 25 June 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-07544-8
Online ISBN: 978-3-540-38076-4
eBook Packages: Springer Book Archive
