Abstract
Recently, the data perturbation approach has been applied to data mining: original data values are modified so that reconstructing the values of any individual transaction is difficult. However, mining distorted databases incurs enormous overhead compared with mining normal data sets. This paper presents an algorithm, GrC-FIM, which introduces granular computing (GrC) to address the efficiency problem of frequent itemset mining in distorted databases. Using the key granule concept and granule inference, the support counts of candidate non-key frequent itemsets can be inferred from the counts of their frequent sub-itemsets obtained earlier in the mining process. This eliminates the tedious support reconstruction for these itemsets. Accuracy improves on dense data sets and is unchanged on sparse ones.
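The inference idea in the abstract can be illustrated with a minimal sketch. It is not the GrC-FIM algorithm itself; it assumes the standard key-pattern property from counting inference: an itemset with at least one non-key immediate subset is itself non-key, and its support equals the minimum support of its immediate subsets. The toy transactions and the names `count`, `immediate_subsets`, and `infer_support` are hypothetical, and plain counting stands in for the support reconstruction that a distorted database would require.

```python
from itertools import combinations

# Toy, undistorted transactions (an assumption for illustration only).
transactions = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("c")]

def count(itemset):
    """Direct support count; in a distorted database this is the costly
    reconstruction step that inference tries to avoid."""
    return sum(itemset <= t for t in transactions)

def immediate_subsets(itemset):
    """All subsets with exactly one item removed."""
    return [frozenset(s) for s in combinations(sorted(itemset), len(itemset) - 1)]

# Supports of 1- and 2-itemsets, obtained in earlier passes.
items = sorted(set().union(*transactions))
support = {frozenset(c): count(frozenset(c))
           for k in (1, 2) for c in combinations(items, k)}

# An itemset is treated as key when its support is strictly smaller than
# the support of every immediate subset (the empty set has support |D|).
is_key = {}
for itemset, supp in support.items():
    if len(itemset) == 1:
        bound = len(transactions)
    else:
        bound = min(support[s] for s in immediate_subsets(itemset))
    is_key[itemset] = supp < bound

def infer_support(candidate):
    """Return the inferred support of a non-key candidate, or None when
    every immediate subset is key and the support must be counted."""
    subsets = immediate_subsets(candidate)
    if any(not is_key[s] for s in subsets):
        return min(support[s] for s in subsets)
    return None

abc = frozenset("abc")
inferred = infer_support(abc)   # {a, b} is non-key, so no counting is needed
```

Here `{a, b}` has the same support as `{a}` and `{b}`, so it is non-key, and the support of `{a, b, c}` follows from its subsets without another pass over the data. A candidate whose immediate subsets are all key, such as `{a, c}`, still has to be counted.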
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, J., Xu, C. (2006). Efficient Mining of Frequent Itemsets in Distorted Databases. In: Sattar, A., Kang, Bh. (eds) AI 2006: Advances in Artificial Intelligence. AI 2006. Lecture Notes in Computer Science, vol 4304. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11941439_39
DOI: https://doi.org/10.1007/11941439_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49787-5
Online ISBN: 978-3-540-49788-2
eBook Packages: Computer Science, Computer Science (R0)