A New Method for Mining Association Rules from a Collection of XML Documents

Paik, Juryon; Youn, Hee Yong; Kim, Ungmo

doi:10.1007/11424826_101


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3481)

Included in the following conference series:

International Conference on Computational Science and Its Applications


Abstract

With the sheer amount of data now stored, presented, and exchanged as XML, the ability to extract interesting knowledge from XML data sources is becoming increasingly important and desirable. In support of this trend, several encouraging methods for mining XML data have been proposed. However, efficiency and simplicity remain barriers to further development. In this paper, we show that any XML document can be mined for association rules without multiple scans of the XML data, using only a specially devised hierarchical data structure called HoPS. HoPS is flexible and powerful enough to represent both simple and complex structured association relationships inherent in XML data.
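
To make the notion of an association rule over XML documents concrete, the following is a minimal sketch, not the HoPS structure itself (the abstract does not describe its internals). It assumes each document is a flat "transaction" whose items are the tags of the root's children, parses every document exactly once, and then derives support and confidence for a rule from the stored counts. The sample documents and the names `DOCS`, `items_of`, `count_itemsets`, and `rule_stats` are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import combinations

# Hypothetical sample collection; each string stands in for one XML document.
DOCS = [
    "<order><book/><pen/></order>",
    "<order><book/><pen/><ink/></order>",
    "<order><book/></order>",
    "<order><pen/><ink/></order>",
]


def items_of(xml_text):
    """Parse one document and return the set of tags of the root's children."""
    root = ET.fromstring(xml_text)
    return frozenset(child.tag for child in root)


def count_itemsets(docs):
    """Single pass over the collection: count every 1- and 2-item set."""
    counts = Counter()
    for text in docs:
        items = sorted(items_of(text))
        for size in (1, 2):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return counts


def rule_stats(counts, n_docs, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    both = counts[frozenset({antecedent, consequent})]
    base = counts[frozenset({antecedent})]
    support = both / n_docs
    confidence = both / base if base else 0.0
    return support, confidence


counts = count_itemsets(DOCS)
support, confidence = rule_stats(counts, len(DOCS), "book", "pen")
print(f"book -> pen: support={support:.2f}, confidence={confidence:.2f}")
```

In this sketch the documents are read only while the counts are built; every rule is then evaluated from the counts alone, which loosely mirrors the goal of avoiding repeated scans of the XML data.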

This work was supported in part by Ubiquitous computing Technology Research Institute and by the University IT Research Center Project, funded by the Korean Ministry of Information and Communication.



Author information

Authors and Affiliations

Department of Computer Engineering, Sungkyunkwan University, 300 Chunchun-dong, Jangan-gu, Suwon, Gyeonggi-do, 440-746, Republic of Korea
Juryon Paik, Hee Yong Youn & Ungmo Kim

Editor information

Editors and Affiliations

Department of Mathematics and Computer Science, University of Perugia, via Vanvitelli, 1, I-06123, Perugia, Italy
Osvaldo Gervasi
Department of Computer Science, University of Calgary, 2500 University Drive N.W., T2N 1N4, Calgary, AB, Canada
Marina L. Gavrilova
William Norris Professor, Head of the Computer Science and Engineering Department, University of Minnesota, USA
Vipin Kumar
Department of Chemistry, University of Perugia, Via Elce di Sotto, 8, P.O. Box, I-06123, Perugia, Italy
Antonio Laganà
Institute of High Performance Computing, IHCP, 1 Science Park Road, 01-01 The Capricorn, Singapore Science Park II, 117528, Singapore
Heow Pueh Lee
School of Computing, Soongsil University, Seoul, Korea
Youngsong Mun
Clayton School of IT, Monash University, 3800, Clayton, Australia
David Taniar
OptimaNumerics Ltd, P.O. Box, Belfast, United Kingdom
Chih Jeng Kenneth Tan

About this paper

Cite this paper

Paik, J., Youn, H.Y., Kim, U. (2005). A New Method for Mining Association Rules from a Collection of XML Documents. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2005. ICCSA 2005. Lecture Notes in Computer Science, vol 3481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11424826_101

DOI: https://doi.org/10.1007/11424826_101
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25861-2
Online ISBN: 978-3-540-32044-9
eBook Packages: Computer Science, Computer Science (R0)
