Pruning in Map-Reduce Style CbO Algorithms

  • Conference paper
Ontologies and Concepts in Mind and Machine (ICCS 2020)

Abstract

Enumeration of formal concepts is a crucial task in formal concept analysis. Algorithms from the Close-by-One family (CbO-based algorithms, for short) are particularly efficient for this task. State-of-the-art CbO-based algorithms, e.g., FCbO, In-Close4, and In-Close5, employ several techniques, which we call pruning, to avoid some unnecessary computations. However, the number of formal concepts can be exponential with respect to the dimension of the input data. Therefore, the algorithms do not scale well, and large datasets become intractable. To address this weakness, several parallel and distributed algorithms have been proposed. We propose new CbO-based algorithms intended for Apache Spark or a similar programming model and show how pruning can be incorporated into them. We experimentally evaluate the impact of the pruning and demonstrate the scalability of the new algorithm.
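
For readers unfamiliar with the Close-by-One scheme the abstract refers to, the following is a minimal sequential sketch in Python. It is not the map-reduce algorithm proposed in the paper, and it implements none of the pruning techniques of FCbO or In-Close; it shows only the basic recursion with the canonicity test that ensures each formal concept is listed exactly once. The function name `close_by_one` and the representation of the context as a list of attribute sets are assumptions made for this illustration.

```python
def close_by_one(context, n_attrs):
    """Enumerate all formal concepts of a small binary context.

    context  -- list of sets; context[g] holds the attributes of object g
                (hypothetical representation chosen for this sketch)
    n_attrs  -- number of attributes, numbered 0 .. n_attrs - 1
    Returns a list of (extent, intent) pairs of frozensets.
    """
    objects = range(len(context))

    def intent(objs):
        # Attributes shared by all objects in objs (all attributes if objs is empty).
        shared = set(range(n_attrs))
        for g in objs:
            shared &= context[g]
        return frozenset(shared)

    concepts = []

    def generate(extent, cur_intent, start):
        concepts.append((extent, cur_intent))
        for j in range(start, n_attrs):
            if j in cur_intent:
                continue
            # Restrict the extent to objects having attribute j, then close.
            new_extent = frozenset(g for g in extent if j in context[g])
            new_intent = intent(new_extent)
            # Canonicity test: the closure must not add any attribute below j.
            if all((i in new_intent) == (i in cur_intent) for i in range(j)):
                generate(new_extent, new_intent, j + 1)

    top = frozenset(objects)
    generate(top, intent(top), 0)
    return concepts


# Toy context: three objects, three attributes.
context = [{0, 1}, {1, 2}, {0, 2}]
concepts = close_by_one(context, 3)
for extent, cur_intent in concepts:
    print(sorted(extent), sorted(cur_intent))
```

Each recursive call extends the current intent by one attribute and closes it; a branch is abandoned when the closure introduces an attribute smaller than the one just added, since that concept is reached along another branch.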

Supported by the grant JG 2019 of Palacký University Olomouc, No. JG_2019_008.

Notes

  1. https://hadoop.apache.org/

  2. https://spark.apache.org/

  3. IBM Quest Synthetic Data Generator was used.

Corresponding author

Correspondence to Petr Krajča.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Konecny, J., Krajča, P. (2020). Pruning in Map-Reduce Style CbO Algorithms. In: Alam, M., Braun, T., Yun, B. (eds) Ontologies and Concepts in Mind and Machine. ICCS 2020. Lecture Notes in Computer Science, vol 12277. Springer, Cham. https://doi.org/10.1007/978-3-030-57855-8_8

  • DOI: https://doi.org/10.1007/978-3-030-57855-8_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-57854-1

  • Online ISBN: 978-3-030-57855-8

  • eBook Packages: Computer Science (R0)