On pruning techniques in map-reduce style CbO algorithms

  • S706 Conceptual Structures
Annals of Mathematics and Artificial Intelligence

Abstract

A fundamental task in formal concept analysis is the enumeration of formal concepts. Among the fastest algorithms for this task are those based on Close-by-One (CbO), a tree-recursive algorithm that uses the lexicographical order of formal concepts to ensure that each formal concept is enumerated exactly once. State-of-the-art CbO-based algorithms, e.g. FCbO, In-Close4, and In-Close5, employ several techniques, which we call pruning, to avoid some unnecessary computations. However, the number of formal concepts can be exponential w.r.t. the dimension of the input data. Consequently, these algorithms do not scale well, and large datasets become intractable. To address this weakness, several parallel and distributed algorithms have been proposed. We propose four new CbO-based algorithms intended for Apache Spark or a similar programming model and show how pruning can be incorporated into them. We experimentally evaluate the impact of pruning and demonstrate the scalability of the new algorithms.
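To make the role of the lexicographical order concrete, the following is a minimal sketch of plain sequential CbO in Python. It is not the distributed algorithms proposed in the article and includes none of the pruning techniques of FCbO or In-Close; the function names (`extent`, `intent`, `cbo`) and the representation of the context as a list of attribute sets are assumptions made for illustration only.

```python
from typing import FrozenSet, Iterator, List, Tuple

# Assumed representation: rows[i] is the set of attributes of object i.
Context = List[FrozenSet[int]]
Concept = Tuple[FrozenSet[int], FrozenSet[int]]


def extent(rows: Context, attrs: FrozenSet[int]) -> FrozenSet[int]:
    """Objects that have every attribute in attrs."""
    return frozenset(i for i, row in enumerate(rows) if attrs <= row)


def intent(rows: Context, objs: FrozenSet[int], n_attrs: int) -> FrozenSet[int]:
    """Attributes shared by every object in objs."""
    common = set(range(n_attrs))
    for i in objs:
        common &= rows[i]
    return frozenset(common)


def cbo(rows: Context, n_attrs: int) -> Iterator[Concept]:
    """Enumerate all formal concepts (extent, intent) of the context."""

    def generate(A: FrozenSet[int], B: FrozenSet[int], y: int) -> Iterator[Concept]:
        yield A, B
        for j in range(y, n_attrs):
            if j in B:
                continue
            # Restrict the extent to objects having attribute j, then close.
            C = frozenset(i for i in A if j in rows[i])
            D = intent(rows, C, n_attrs)
            # Canonicity test: the new intent must agree with B on all
            # attributes smaller than j; otherwise this concept is reached
            # from another branch and is skipped here.
            if all((k in D) == (k in B) for k in range(j)):
                yield from generate(C, D, j + 1)

    all_objects = frozenset(range(len(rows)))
    yield from generate(all_objects, intent(rows, all_objects, n_attrs), 0)


# Toy context with 3 objects and 3 attributes (made-up data).
rows = [frozenset({0, 1}), frozenset({1, 2}), frozenset({0, 2})]
concepts = list(cbo(rows, 3))
print(len(concepts))  # 8
```

The canonicity test is what guarantees that each concept is emitted exactly once: a branch is followed only when closing the intent adds no attribute smaller than the one just inserted.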

Author information

Corresponding author

Correspondence to Petr Krajča.

Additional information

Supported by the grant JG 2019 of Palacký University Olomouc, No. JG_2019_008.

About this article

Cite this article

Konecny, J., Krajča, P. On pruning techniques in map-reduce style CbO algorithms. Ann Math Artif Intell 90, 1107–1124 (2022). https://doi.org/10.1007/s10472-022-09787-1

  • DOI: https://doi.org/10.1007/s10472-022-09787-1
