Abstract
Frequent itemset mining is an important task in data mining. Fuzzy data mining can describe the results of frequent itemset mining more accurately. Nevertheless, the full set of frequent itemsets contains much redundancy for users; a better approach is to present only the top-k results. In this paper, we define the score of a fuzzy frequent itemset and propose the problem of top-k fuzzy frequent itemset mining, which, to the best of our knowledge, has not been studied before. To address this problem, we employ a data structure named TopKFFITree to store a superset of the mining results that is significantly smaller than the set of all fuzzy frequent itemsets. We then present an algorithm named TopK-FFI to build and maintain this data structure. The algorithm prunes most fuzzy frequent itemsets immediately, based on the monotonicity of the itemset score. Theoretical analysis and experimental studies over 4 datasets demonstrate that the proposed algorithm reduces runtime and memory cost and significantly outperforms the naive algorithm Top-k-FFI-Miner.
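The pruning idea can be illustrated with a minimal sketch. It is not the TopK-FFI algorithm and does not use a TopKFFITree; it assumes that the score of an itemset is its fuzzy support, computed as the sum over transactions of the minimum membership degree of its items, which is one common choice and may differ from the score defined in the paper. Under that assumption the score never increases when an item is added, so any candidate that cannot enter the current top-k can be discarded together with all of its supersets. The names `fuzzy_support` and `top_k_fuzzy_itemsets` and the toy data are hypothetical.

```python
import heapq

def fuzzy_support(itemset, transactions):
    """Assumed score: sum over transactions of the minimum membership degree."""
    total = 0.0
    for t in transactions:
        # An item missing from a transaction has membership degree 0.
        total += min(t.get(item, 0.0) for item in itemset)
    return total

def top_k_fuzzy_itemsets(transactions, k):
    """Depth-first search with a min-heap that holds the best k itemsets so far."""
    items = sorted({item for t in transactions for item in t})
    heap = []  # entries are (score, itemset); heap[0] is the weakest kept entry

    def extend(prefix, start):
        for i in range(start, len(items)):
            candidate = prefix + (items[i],)
            score = fuzzy_support(candidate, transactions)
            # Supersets cannot score higher than the candidate, so a candidate
            # that fails to beat the current k-th score is pruned with them.
            if score <= 0.0 or (len(heap) == k and score <= heap[0][0]):
                continue
            if len(heap) < k:
                heapq.heappush(heap, (score, candidate))
            else:
                heapq.heapreplace(heap, (score, candidate))
            extend(candidate, i + 1)

    extend((), 0)
    # Highest score first; ties broken by itemset for a stable order.
    return sorted(heap, key=lambda entry: (-entry[0], entry[1]))

# Toy fuzzy transactions: item -> membership degree in [0, 1] (made-up values).
transactions = [
    {"A": 0.8, "B": 0.6, "C": 0.4},
    {"A": 0.5, "B": 0.9},
    {"A": 0.7, "C": 0.2},
]
result = top_k_fuzzy_itemsets(transactions, k=3)
for score, itemset in result:
    print(itemset, round(score, 3))
```

On this toy data the three highest-scoring itemsets are (A), (B), and (A, B); the candidates (B, C) and (C) are pruned without exploring their extensions because they cannot beat the third-best score already found.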
This research is supported by the National Natural Science Foundation of China (61100112, 61309030), the Beijing Higher Education Young Elite Teacher Project (YETP0987), the State Key Program of the National Social Science Foundation of China (13AXW010), and the Discipline Construction Foundation of Central University of Finance and Economics (2016XX05).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Li, H., Wang, Y., Zhang, N., Zhang, Y. (2017). Finding Top-k Fuzzy Frequent Itemsets from Databases. In: Tan, Y., Takagi, H., Shi, Y. (eds) Data Mining and Big Data. DMBD 2017. Lecture Notes in Computer Science, vol 10387. Springer, Cham. https://doi.org/10.1007/978-3-319-61845-6_3
DOI: https://doi.org/10.1007/978-3-319-61845-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61844-9
Online ISBN: 978-3-319-61845-6
eBook Packages: Computer Science, Computer Science (R0)