Abstract
In recent years, frequent itemset mining in uncertain data has drawn increasing attention from the data mining community. Current frequent itemset mining algorithms for uncertain data mainly define frequent itemsets by the expected support rather than the probabilistic support, because computing the probabilistic support is prohibitively expensive. To address this issue, various approximation algorithms for mining probabilistic frequent itemsets have been proposed. However, the existing approximation algorithms are not sufficiently effective when the uncertain data is very large, or extremely dense or sparse. In this paper, we propose a parallelized approximation algorithm based on the MapReduce platform that is capable of mining probabilistic frequent itemsets on large-scale, dense, or sparse uncertain data. Experimental results are presented and analyzed to demonstrate the computational effectiveness of our algorithm.
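To make the distinction between the two support definitions concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes each transaction contains a given itemset independently with a known probability, so the support follows a Poisson binomial distribution. The function names are hypothetical, and the Poisson approximation is only one commonly used choice; the abstract does not state which approximation the paper adopts.

```python
import math


def expected_support(probs):
    """Expected support: the sum of per-transaction containment probabilities."""
    return sum(probs)


def frequentness_probability(probs, min_sup):
    """Exact P(support >= min_sup) by dynamic programming, O(n * min_sup) time.

    dist[k] is the probability that exactly k of the transactions seen so far
    contain the itemset; the last slot, dist[min_sup], accumulates the mass
    for "min_sup or more".
    """
    dist = [1.0] + [0.0] * min_sup
    for p in probs:
        new = [0.0] * (min_sup + 1)
        for k, mass in enumerate(dist):
            if k == min_sup:
                new[k] += mass              # threshold already reached
            else:
                new[k] += mass * (1.0 - p)  # itemset absent from this transaction
                new[k + 1] += mass * p      # itemset present in this transaction
        dist = new
    return dist[min_sup]


def poisson_frequentness(probs, min_sup):
    """Approximate P(support >= min_sup) with a Poisson distribution whose
    rate equals the expected support (an assumed approximation)."""
    lam = sum(probs)
    below = sum(math.exp(-lam) * lam ** k / math.factorial(k)
                for k in range(min_sup))
    return 1.0 - below


if __name__ == "__main__":
    probs = [0.01] * 100  # hypothetical sparse data: many low probabilities
    print(expected_support(probs))
    print(frequentness_probability(probs, 2))
    print(poisson_frequentness(probs, 2))
```

The expected support needs a single pass over the probabilities, while the exact frequentness probability needs a table update per transaction for every candidate itemset, which is the cost that motivates approximation. A Poisson approximation of this kind tends to be accurate when the individual probabilities are small and can be poor when they are large, so it should be read as an illustration rather than a recommendation.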
Acknowledgement
This work is supported by the Program for New Century Excellent Talents of MOE China (Grant No. NCET-11-0213), the Natural Science Foundation of China (Grant Nos. 61273257, 61321491), and the Program for Distinguished Talents of Jiangsu (Grant No. 2013-XXRJ-018).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Xu, J., Mao, XJ., Lu, WY., Zhu, QH., Li, N., Yang, YB. (2015). MapReduce-based Parallelized Approximation of Frequent Itemsets Mining in Uncertain Data. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9492. Springer, Cham. https://doi.org/10.1007/978-3-319-26561-2_17
DOI: https://doi.org/10.1007/978-3-319-26561-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26560-5
Online ISBN: 978-3-319-26561-2
eBook Packages: Computer Science, Computer Science (R0)