Efficient Mining of Frequent Itemsets in Distorted Databases

Wang, Jinlong; Xu, Congfu

doi:10.1007/11941439_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4304)

Included in the following conference series:

Australasian Joint Conference on Artificial Intelligence

Abstract

Recently, the data perturbation approach has been applied to data mining: original data values are modified so that reconstructing the values of any individual transaction is difficult. However, mining distorted databases incurs enormous overhead compared with mining normal data sets. This paper presents GrC-FIM, an algorithm that introduces granular computing (GrC) to address the efficiency problem of frequent itemset mining in distorted databases. Using the key granule concept and granule inference, the support counts of candidate non-key frequent itemsets can be inferred from the counts of their frequent sub-itemsets obtained earlier in the mining process. This eliminates the tedious support reconstruction for these itemsets. Accuracy improves on dense data sets and is unchanged on sparse ones.
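
The inference step described in the abstract can be illustrated with a minimal sketch. It is not the GrC-FIM algorithm itself, whose details are not given here. It shows the general counting-inference idea for key patterns (known from the PASCAL algorithm), which the abstract's "key granule" notion appears to parallel: a candidate with a non-key sub-itemset takes the minimum support of its immediate sub-itemsets, with no database pass. The names `support`, `mine_with_inference`, and `db` are hypothetical, and plain counting on an undistorted toy database stands in for the support reconstruction a distorted database would require.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count transactions containing every item of `itemset` (the costly step)."""
    return sum(1 for t in transactions if itemset <= t)


def mine_with_inference(transactions, min_count):
    """Level-wise frequent itemset mining with counting inference.

    Returns (supports, is_key, counted), where `counted` is the number of
    itemsets whose support had to be computed from the data.
    Assumption: in a distorted database, each such computation would be a
    support reconstruction rather than a plain count.
    """
    n = len(transactions)
    supports = {frozenset(): n}
    is_key = {frozenset(): True}
    counted = 0

    # Level 1: every single item must be counted.
    level = []
    for item in sorted({i for t in transactions for i in t}):
        s = frozenset([item])
        c = support(s, transactions)
        counted += 1
        if c >= min_count:
            supports[s] = c
            is_key[s] = c != n  # non-key if it occurs in every transaction
            level.append(s)

    k = 1
    while level:
        # Join frequent k-itemsets into (k+1)-candidates.
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == k + 1}
        next_level = []
        for cand in candidates:
            subs = [cand - {i} for i in cand]
            if any(s not in supports for s in subs):
                continue  # Apriori pruning: some sub-itemset is infrequent
            min_sub = min(supports[s] for s in subs)
            if all(is_key[s] for s in subs):
                # All sub-itemsets are key: the support must be computed.
                c = support(cand, transactions)
                counted += 1
                key = c != min_sub
            else:
                # A non-key sub-itemset exists: infer the support directly.
                c, key = min_sub, False
            if c >= min_count:
                supports[cand] = c
                is_key[cand] = key
                next_level.append(cand)
        level = next_level
        k += 1
    return supports, is_key, counted


# Toy database: items a and b always occur together, so {a, b} is non-key.
db = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("c")]
supports, is_key, counted = mine_with_inference(db, min_count=2)
print(supports[frozenset("abc")], counted)
```

In this toy run the support of {a, b, c} is inferred from its sub-itemsets rather than counted, because {a, b} has the same support as {a} and is therefore non-key. The denser the data, the more such non-key itemsets tend to appear, which is consistent with the abstract's remark about dense data sets.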

Author information

Authors and Affiliations

Institute of Artificial Intelligence, Zhejiang University, Hangzhou, 310027, China
Jinlong Wang & Congfu Xu

Editor information

Editors and Affiliations

DisPRR, National ICT Australia Ltd, QLD, Australia
Abdul Sattar
School of Computing, University of Tasmania, Sandy Bay, 7005, Tasmania, Australia
Byeong-ho Kang

About this paper

Cite this paper

Wang, J., Xu, C. (2006). Efficient Mining of Frequent Itemsets in Distorted Databases. In: Sattar, A., Kang, Bh. (eds) AI 2006: Advances in Artificial Intelligence. AI 2006. Lecture Notes in Computer Science, vol 4304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11941439_39

DOI: https://doi.org/10.1007/11941439_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49787-5
Online ISBN: 978-3-540-49788-2
eBook Packages: Computer Science, Computer Science (R0)
