Abstract
Frequent itemset mining (FIM) is a fundamental set of techniques for discovering useful and meaningful relationships between items in transaction databases. Extensions of FIM such as weighted frequent itemset mining (WFIM) and frequent itemset mining in uncertain databases (UFIM) have since been proposed. WFIM takes into account that items may differ in weight (importance), and UFIM takes into account that data collected in real-life environments is often inaccurate, imprecise, or incomplete. Recently, a two-phase Apriori-based approach called HEWI-Uapriori was proposed that considers both item weight and uncertainty to mine high expected weighted itemsets (HEWIs), but it generates a large number of candidates and is too time-consuming. In this paper, a more efficient algorithm named HEWI-Utree is developed to mine HEWIs without performing multiple database scans and without generating an enormous number of candidates. It relies on three novel structures, named the element (E)-table, the weighted-probability (WP)-table, and the WP-tree, to maintain the information required for identifying and pruning unpromising itemsets early. Experimental results show that the proposed algorithm is more efficient than traditional WFIM and UFIM methods, as well as the HEWI-Uapriori algorithm, in terms of runtime, memory usage, and scalability.
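The abstract does not define the measure being mined. As a minimal sketch, assuming the definitions commonly used in this line of work (the weight of an itemset is the average of its item weights, the probability of an itemset in a transaction is the product of its item probabilities, and an itemset is a HEWI when its expected weighted support reaches a user-given ratio times the number of transactions), the Python below enumerates HEWIs by brute force. It is not HEWI-Utree and uses none of the E-table, WP-table, or WP-tree structures; all function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import prod


def itemset_weight(itemset, weights):
    # Assumption: the weight of an itemset is the mean of its item weights.
    return sum(weights[item] for item in itemset) / len(itemset)


def expected_support(itemset, database):
    # Assumption: an itemset's probability in a transaction is the product
    # of the existence probabilities of its items (items are independent).
    return sum(
        prod(transaction[item] for item in itemset)
        for transaction in database
        if all(item in transaction for item in itemset)
    )


def expected_weighted_support(itemset, database, weights):
    return itemset_weight(itemset, weights) * expected_support(itemset, database)


def mine_hewis_bruteforce(database, weights, min_ratio):
    # Exhaustive enumeration of every itemset: exponential in the number of
    # items, so suitable only for illustrating the measure on toy data.
    threshold = min_ratio * len(database)
    items = sorted({item for transaction in database for item in transaction})
    hewis = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            score = expected_weighted_support(candidate, database, weights)
            if score >= threshold:
                hewis[candidate] = score
    return hewis


# Toy uncertain database: each transaction maps an item to its probability.
database = [
    {"A": 0.5, "B": 1.0},
    {"A": 1.0, "C": 0.5},
    {"B": 0.5, "C": 1.0},
]
weights = {"A": 1.0, "B": 0.5, "C": 0.25}

hewis = mine_hewis_bruteforce(database, weights, min_ratio=0.2)
print(hewis)  # {('A',): 1.5, ('B',): 0.75}
```

The exhaustive loop is exactly the kind of candidate explosion that early pruning of unpromising itemsets is meant to avoid; it serves only to make the quantity being thresholded concrete.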
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Lin, J.CW., Gan, W., Fournier-Viger, P., Hong, TP. (2016). Efficient Mining of Weighted Frequent Itemsets in Uncertain Databases. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_18
DOI: https://doi.org/10.1007/978-3-319-41920-6_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6
eBook Packages: Computer Science, Computer Science (R0)