Abstract
Association rules are a popular knowledge discovery technique for warehouse basket analysis. They indicate which items of the warehouse are frequently bought together. The problem of association rule mining was first stated in 1993. Five years later, several research groups discovered that this problem has a strong connection to Formal Concept Analysis (FCA). In this survey, we first introduce some basic ideas of this connection, using a specific algorithm, Titanic, as an example, and show how FCA helps to reduce the number of resulting rules without loss of information. We then give a general overview of the history and state of the art of applying FCA to association rule mining.
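To make the connection concrete, here is a minimal Python sketch of the standard notions involved: support, confidence, and the FCA closure of an itemset. It is a brute-force illustration only, not the Titanic algorithm, and the toy transactions and all function names are assumptions made for this example rather than material from the chapter.

```python
from itertools import combinations

# Hypothetical toy transactions (illustrative only, not from the chapter).
TRANSACTIONS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"milk"},
]


def support(itemset, transactions=TRANSACTIONS):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(premise, conclusion, transactions=TRANSACTIONS):
    """support(premise | conclusion) / support(premise); premise must occur."""
    joint = support(set(premise) | set(conclusion), transactions)
    return joint / support(premise, transactions)


def closure(itemset, transactions=TRANSACTIONS):
    """Items shared by all transactions containing `itemset` (FCA closure)."""
    itemset = set(itemset)
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        # No transaction contains the itemset: its closure is the full item set.
        return frozenset().union(*transactions)
    return frozenset(set.intersection(*covering))


def frequent_closed_itemsets(min_support, transactions=TRANSACTIONS):
    """Brute-force enumeration: close every frequent itemset, keep distinct results."""
    items = sorted(set().union(*transactions))
    closed = set()
    for size in range(len(items) + 1):
        for candidate in combinations(items, size):
            if support(candidate, transactions) >= min_support:
                closed.add(closure(candidate, transactions))
    return closed


if __name__ == "__main__":
    print(support({"bread"}))                # 0.75
    print(confidence({"bread"}, {"butter"}))  # 1.0
    print(sorted(closure({"bread"})))        # ['bread', 'butter']
    print(len(frequent_closed_itemsets(0.5)))  # 3
```

In this toy data, five itemsets reach a support of 0.5 (including the empty set), but they collapse to three closed itemsets. An itemset and its closure are contained in exactly the same transactions and therefore have the same support, which is why keeping only the closed itemsets loses no information.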
© 2005 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Lakhal, L., Stumme, G. (2005). Efficient Mining of Association Rules Based on Formal Concept Analysis. In: Ganter, B., Stumme, G., Wille, R. (eds) Formal Concept Analysis. Lecture Notes in Computer Science, vol 3626. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11528784_10
DOI: https://doi.org/10.1007/11528784_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27891-7
Online ISBN: 978-3-540-31881-1
eBook Packages: Computer Science, Computer Science (R0)