Abstract
Formal concept analysis has been successfully applied as a data mining framework whereby target patterns come in the form of intent families and implication bases. Since their extraction is a challenging task, especially for large datasets, parallel techniques should be helpful in reducing the computational effort and increasing the scalability of the approach. In this paper we describe a way to parallelize a recent divide-and-conquer method computing both the intents and the Duquenne-Guiges implication basis of dataset. Wile intents admit a straightforward computation, adding the basis — whose definition is recursive — poses harder problems, in particular, for parallel design. A first, and by no means final, solution relies on a partition of the basis that allows the crucial and inherently sequential step of redundancy removal to be nevertheless split into parallel subtasks. A prototype implementation of our method, called ParCIM, shows a nearly linear acceleration w.r.t. its sequential counter-part.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Han, J., Kamber, M.: Data Mining: Concepts and Techniques, 1st edn. Morgan Kaufmann Publishers, San Francisco, California, USA (2001)
Wang, J., Han, J., Pei, J.: Closet+: searching for the best strategies for mining frequent closed itemsets. In: KDD 2003. Proceedings of the ninth ACM SIGKDD, ACM Press, New York (2003)
Zaki, M.J., Hsiao, C.J.: Charm: An efficient algorithm for closed itemset mining. In: Proceedings of the Second SIAM International Conference on Data Mining, SIAM (2002)
Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations. Springer, Heidelberg (1997)
Duquenne, V., Guigues, J.L.: Famille minimale d’implications informatives résultant d’un tableau de données binaires. Mathématiques et Sciences Sociales 95, 5–18 (1986)
Carpineto, C., Romano, G.: A lattice conceptual clustering system and its application to browsing retrieval. Mach. Learn. 24, 95–122 (1996)
Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Efficient mining of association rules using closed itemset lattices. Inf. Syst. 24, 25–46 (1999)
Zaki, M.J.: Parallel and distributed association mining: A survey. IEEE Concurrency 7, 14–25 (1999)
Ganter, B.: Two basic algorithms in concept analysis. Technical Report Preprint 831, Technische Hochschule, Darmstadt, Germany (1984)
Valtchev, P., Missaoui, R., Lebrun, P.: A partition-based approach towards constructing galois (concept) lattices. Discrete Math. 256, 801–829 (2002)
Valtchev, P., Duquenne, V.: Towards scalable divide-and-conquer methods for computing concepts and implications. Preprint accepted to Discrete Applied Mathematics (2006)
Djoufak, J.F.K., Valtchev, P., Djamegni, C.T.: A parallel algorithm for lattice construction. In: Ganter, B., Godin, R. (eds.) ICFCA 2005. LNCS (LNAI), vol. 3403, pp. 249–264. Springer, Heidelberg (2005)
Davey, B.A., Priestley, H.A.: Introduction to Lattices and Order. Cambridge University Press, Cambridge (1990)
Maier, D.: The Theory of Relational Databases. Computer Science Press (1983)
Armstrong, W.W.: Dependency structures of data base relationships. In: IFIP Congress, pp. 580–583 (1974)
Goetz, S.: Algorithms in cgm, bsp and bsp* model: A survey. Technical report, Carleton Unviversity, Ottawa (1997)
Dehne, F., Fabri, A., Rau-Chaplin, A.: Scalable parallel computational geometry for coarse grained multicomputers. Journal of Computational Geometry and Applications 6, 379–400 (1996)
Robson, R.: Using the STL: the C++ standard template library, 2nd edn. Springer, Heidelberg (2000)
Gropp, W., Lusk, E., Skjellum, A.: Using MPI: Portable Parallel Programming with the Message Passing Interface, 2nd edn. MIT Press, Cambridge, MA (1999)
Parka, H.K., Chi, D.H., Lee, D.K., Ryu, K.W.: An efficient parallel algorithm for merging in the postal model. ETRI Journal 21, 31–39 (1999)
Djamegni, C.T.: Mapping rectangular mesh algorithms onto asymptotically space-optimal arrays. J. Parallel Distrib. Comput. 64, 345–359 (2004)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Djoufak Kengue, J.F., Valtchev, P., Tayou Djamegni, C. (2007). Parallel Computation of Closed Itemsets and Implication Rule Bases. In: Stojmenovic, I., Thulasiram, R.K., Yang, L.T., Jia, W., Guo, M., de Mello, R.F. (eds) Parallel and Distributed Processing and Applications. ISPA 2007. Lecture Notes in Computer Science, vol 4742. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74742-0_34
Download citation
DOI: https://doi.org/10.1007/978-3-540-74742-0_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74741-3
Online ISBN: 978-3-540-74742-0
eBook Packages: Computer ScienceComputer Science (R0)