HyPar-FCA: a distributed framework based on hybrid partitioning for FCA

The Journal of Supercomputing

Abstract

Formal concept analysis (FCA) is a tool for extracting natural clusters from objects and attributes represented as a binary table. Several parallel and distributed algorithms have been proposed to speed up concept discovery. Replication-based approaches suffer from memory bottlenecks when dealing with large contexts, whereas partitioning-based approaches incur high communication overhead. We propose HyPar-FCA, a distributed framework that uses horizontal partitioning for low-support attributes and vertical partitioning for high-support attributes. Our hybrid partitioning strategy can be tuned according to the machines’ memory constraints. It eliminates inter-machine communication for the horizontal partitions and minimizes it for the vertical partition by using auxiliary structures. We show that HyPar-FCA scales to large contexts and can work on commodity hardware with memory constraints. Compared with state-of-the-art distributed FCA frameworks, HyPar-FCA improves execution time by 22% and reduces memory usage by 27%.
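
To make the terms in the abstract concrete, the following minimal Python sketch enumerates the formal concepts of a toy binary context by brute force and classifies attributes by their support. The context data, the function names, and the support threshold are all hypothetical and chosen for illustration. The sketch does not reproduce the partitioning or the auxiliary structures of HyPar-FCA, which are not described in this abstract.

```python
from itertools import combinations

# Toy binary context (hypothetical data): object -> set of attributes it has.
CONTEXT = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "d"},
    "o4": {"a"},
}

def support(context):
    """Count, for each attribute, how many objects have it."""
    counts = {}
    for attrs in context.values():
        for m in attrs:
            counts[m] = counts.get(m, 0) + 1
    return counts

def split_by_support(context, threshold):
    """Return (low, high) attribute lists; the threshold is an assumed tuning knob."""
    counts = support(context)
    low = sorted(m for m, c in counts.items() if c < threshold)
    high = sorted(m for m, c in counts.items() if c >= threshold)
    return low, high

def extent(context, attrs):
    """Objects that have every attribute in attrs."""
    return {g for g, has in context.items() if attrs <= has}

def intent(context, objs):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    all_attrs = set().union(*context.values())
    for g in objs:
        all_attrs &= context[g]
    return all_attrs

def concepts(context):
    """Brute force: close every attribute subset and collect distinct (extent, intent) pairs."""
    attrs = sorted(set().union(*context.values()))
    found = set()
    for k in range(len(attrs) + 1):
        for subset in combinations(attrs, k):
            objs = extent(context, set(subset))
            found.add((frozenset(objs), frozenset(intent(context, objs))))
    return found

if __name__ == "__main__":
    low, high = split_by_support(CONTEXT, threshold=3)
    print("low-support attributes:", low)
    print("high-support attributes:", high)
    for objs, attrs in sorted(concepts(CONTEXT), key=lambda c: -len(c[0])):
        print(sorted(objs), sorted(attrs))
```

The brute-force enumeration is exponential in the number of attributes and is only meant to show what a concept is; practical algorithms such as those cited in the references avoid visiting every subset.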

Notes

  1. The frequency of the attributes in the dataset is not uniform.

  2. Worst-case input: only the diagonal elements in the context are zero.

  3. https://github.com/TSSG/MRGanterPlus.

  4. https://cloud.iitmandi.ac.in/f/c238347ad2/?raw=1.

  5. An initial setup cost is required in all the approaches, either for partitioning or replicating the context among the workers. Being a one-time cost, we do not consider that in presenting the results.
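
The worst-case input in Note 2 can be checked on small sizes. The sketch below, with hypothetical function names, builds an n × n context whose only zeros lie on the diagonal and counts its concepts by brute force. For such a context every attribute subset is closed, so the count is 2^n.

```python
from itertools import combinations

def diagonal_zero_context(n):
    """Row g holds every attribute except g, so only the diagonal is zero."""
    return [frozenset(m for m in range(n) if m != g) for g in range(n)]

def count_concepts(rows, n):
    """Count distinct intents by closing every attribute subset (brute force)."""
    intents = set()
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            b = frozenset(subset)
            # Extent: objects whose row contains all attributes of b.
            ext = [g for g in range(len(rows)) if b <= rows[g]]
            # Intent of that extent: attributes common to all its rows.
            closed = frozenset(range(n))
            for g in ext:
                closed &= rows[g]
            intents.add(closed)
    return len(intents)

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_concepts(diagonal_zero_context(n), n))
```

Each concept corresponds to exactly one intent, so counting distinct intents counts concepts.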

Author information

Corresponding author

Correspondence to Muneeswaran Packiaraj.

Additional information

This work was supported by SPARC, a Government of India initiative, under Grant no. SPARC/2018-2019/P682/SL.

About this article

Cite this article

Packiaraj, M., Kailasam, S. HyPar-FCA: a distributed framework based on hybrid partitioning for FCA. J Supercomput 78, 12589–12620 (2022). https://doi.org/10.1007/s11227-022-04366-x

  • DOI: https://doi.org/10.1007/s11227-022-04366-x
