Abstract
Association rule mining (ARM) is an important research topic in data mining and knowledge discovery. Existing ARM methods cannot discover nonlinear association rules, even though nonlinearity is common and significant in engineering practice. In addition, negative association rules have received less attention, although they can effectively reflect the negative associations that are widespread in complex practical systems. We therefore propose MICAR, a nonlinear ARM method based on the maximal information coefficient (MIC). MICAR can extract nonlinear association rules in both positive and negative forms from transactional or continuous databases. MICAR proceeds in three steps: data preprocessing, candidate itemset mining, and association rule generation. MIC is used to identify the type of each association rule and to find potential nonlinear correlations. MICAR can also control redundancy in itemsets and association rules by restricting their quantity and forms. Experiments on real and simulated datasets show that MICAR extracts high-quality positive and negative association rules more effectively and efficiently than existing methods and, in particular, has the unique ability to extract nonlinear association rules.
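To illustrate why a MIC-style score can expose nonlinear correlations that a linear measure misses, the following is a minimal sketch, not the MICAR algorithm itself. It approximates MIC with equal-frequency grids only, whereas the exact statistic also optimizes where the grid lines are placed, so the value it returns is a lower-bound approximation. The function names (`equal_frequency_bins`, `mutual_information`, `approx_mic`, `pearson`) and the grid budget exponent `alpha=0.6` are assumptions made for this example.

```python
import math
from collections import Counter


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal numbers of points."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * k // n
    return bins


def mutual_information(bx, by):
    """Mutual information in bits between two sequences of bin labels."""
    n = len(bx)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum((c / n) * math.log2(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


def approx_mic(x, y, alpha=0.6):
    """Simplified MIC: best normalized mutual information over a-by-b
    equal-frequency grids with a * b <= n ** alpha (assumed budget)."""
    n = len(x)
    budget = max(4, int(n ** alpha))
    best = 0.0
    for a in range(2, budget // 2 + 1):
        bx = equal_frequency_bins(x, a)
        for b in range(2, budget // a + 1):
            by = equal_frequency_bins(y, b)
            score = mutual_information(bx, by) / math.log2(min(a, b))
            best = max(best, score)
    return best


def pearson(x, y):
    """Pearson linear correlation coefficient, for comparison."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sx = math.sqrt(sum((u - mx) ** 2 for u in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    return cov / (sx * sy)


# A symmetric parabola: strongly dependent, yet linearly uncorrelated.
xs = [i / 100 - 1 for i in range(201)]
ys = [v * v for v in xs]
print("pearson:", pearson(xs, ys))
print("approx MIC:", approx_mic(xs, ys))
```

On the parabola, the linear coefficient is close to zero while the grid-based score is close to one, which is the kind of relationship a linear measure would overlook.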







Acknowledgements
This work is supported in part by the National Natural Science Foundation of China under Grant No. 72071206 and by the Science and Technology Innovation Program of Hunan Province under Grant No. 2020RC4046. The authors thank the anonymous reviewers for their detailed, valuable comments and constructive suggestions.
About this article
Cite this article
Liu, M., Yang, Z., Guo, Y. et al. MICAR: nonlinear association rule mining based on maximal information coefficient. Knowl Inf Syst 64, 3017–3042 (2022). https://doi.org/10.1007/s10115-022-01730-4
DOI: https://doi.org/10.1007/s10115-022-01730-4