Information Sciences

Volume 572, September 2021, Pages 126-146

Triadic concept approximation

https://doi.org/10.1016/j.ins.2021.04.064

Abstract

Formal Concept Analysis (FCA) is a mathematical theory for knowledge representation as well as data analysis and visualization. It provides a mechanism for understanding and mining data through concept lattice construction and exploration. This theory assists in data processing by providing a framework for applying different analysis techniques. One of its strengths lies in its mathematical foundation, which enables the generation, ordering, and visualization of knowledge in the form of formal concepts arranged in a hierarchy known as a concept lattice. However, as the input dataset grows, the search and especially the visualization and exploration of concepts become prohibitive. With the extension of the classical FCA approach to Triadic Concept Analysis, this problem becomes even more evident due to the complexity of the inherent structures in triadic concepts and relationships. In this work, we propose an approach to find triadic concepts when a query in the form of a triple is given, allowing the visualization and exploration of a Hasse diagram of triadic concepts.

Introduction

Due to the large volume of data currently being generated by different applications, knowledge exploration and extraction from data have become increasingly important paths for understanding human behaviour and assisting in decision making. In this context, Formal Concept Analysis (FCA) provides a mathematical framework for representation and extraction of knowledge from data. Originally proposed by Wille and Ganter in the 1980s [2], FCA is based on the binary relationship between two sets of elements, called objects and attributes. The great interest in the use of FCA in different fields of application can be attributed to its formalism and the possibility of identifying formal concepts, which can be represented through concept lattices [19]. The lattices are used to understand the data through a hierarchical visualization of ordered formal concepts, and to extract implications and association rules.
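To make the notions of binary relation and formal concept concrete, the following Java sketch encodes a tiny object-attribute context and checks whether a pair (X, Y) is a formal concept, i.e., whether the attributes shared by X are exactly Y and the objects sharing Y are exactly X. The class name `DyadicContextSketch`, its method names, and the toy data are assumptions made for illustration only and do not come from the paper. The sketch assumes Java 9 or later for `Set.of` and `Map.of`.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only: class, method, and data names are hypothetical.
final class DyadicContextSketch {
    // Incidence relation stored as: object -> set of attributes it has.
    private final Map<String, Set<String>> incidence;
    private final Set<String> allAttributes = new TreeSet<>();

    DyadicContextSketch(Map<String, Set<String>> incidence) {
        this.incidence = incidence;
        incidence.values().forEach(allAttributes::addAll);
    }

    // Derivation X': attributes shared by every object in X
    // (all attributes when X is empty).
    Set<String> commonAttributes(Set<String> objects) {
        Set<String> result = new TreeSet<>(allAttributes);
        for (String g : objects) {
            result.retainAll(incidence.getOrDefault(g, Set.of()));
        }
        return result;
    }

    // Derivation Y': objects that have every attribute in Y.
    Set<String> commonObjects(Set<String> attributes) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : incidence.entrySet()) {
            if (e.getValue().containsAll(attributes)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // (X, Y) is a formal concept iff X' = Y and Y' = X.
    boolean isConcept(Set<String> objects, Set<String> attributes) {
        return commonAttributes(objects).equals(attributes)
            && commonObjects(attributes).equals(objects);
    }

    public static void main(String[] args) {
        DyadicContextSketch ctx = new DyadicContextSketch(Map.of(
            "duck", Set.of("flies", "swims"),
            "eagle", Set.of("flies", "hunts"),
            "shark", Set.of("swims", "hunts")));
        System.out.println(ctx.commonAttributes(Set.of("duck", "eagle"))); // prints [flies]
        System.out.println(ctx.isConcept(Set.of("duck", "eagle"), Set.of("flies"))); // prints true
    }
}
```

In this toy context, ({duck, eagle}, {flies}) is a concept, whereas ({duck}, {flies}) is not, because the duck also has the attribute "swims".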

Although successful in many applications in the literature, some situations require an extension to the classical approach by adding a third dimension, in order to have a better characterization and representation of data. FCA was indeed extended by Rudolf Wille and Fritz Lehmann in [11] to describe a ternary relation between object, attribute, and condition sets. This new approach is known as Triadic Concept Analysis (TCA or 3FCA). An example of such a scenario is the social resource sharing system represented by a folksonomy [9], which expresses a ternary relation between users, resources, and keywords (tags) used to annotate such resources by users. In [11] the authors proposed a graphical representation of a trilattice as a three-dimensional diagram. Although TCA derives from FCA, whose graphical representation is intuitive, the three-dimensional complexity of the data in TCA makes these structures difficult to use. The graphical representation is not intuitive even for small datasets due to the complexity of the triadic settings, resulting in a loss of TCA's power of interpretation and visualization. For example, Fig. 1 shows a trilattice from [7], represented as a 3-net that displays all the triconcepts and their respective equivalence classes. As can be seen, the diagram becomes very complex and difficult to read even for a small context.
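As an illustration of the ternary relation described above, the following Java sketch stores a small folksonomy-like context as a set of (object, attribute, condition) triples and tests whether a triple of sets (A1, A2, A3) is a triadic concept under the standard definition: the box A1 × A2 × A3 lies inside the relation and no component can be enlarged without leaving it. The class `TriadicContextSketch`, its methods, and the users, resources, and tags below are hypothetical and are not taken from the paper. The check is a naive one and is not the Data-Peeler algorithm. Java 9 or later is assumed.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: names and data are hypothetical.
final class TriadicContextSketch {
    private final Set<String> objects = new HashSet<>();
    private final Set<String> attributes = new HashSet<>();
    private final Set<String> conditions = new HashSet<>();
    // Ternary relation stored as immutable 3-element lists (g, m, b).
    private final Set<List<String>> relation = new HashSet<>();

    void add(String object, String attribute, String condition) {
        objects.add(object);
        attributes.add(attribute);
        conditions.add(condition);
        relation.add(List.of(object, attribute, condition));
    }

    // True when every (g, m, b) in A1 x A2 x A3 belongs to the relation.
    boolean isBox(Set<String> a1, Set<String> a2, Set<String> a3) {
        for (String g : a1)
            for (String m : a2)
                for (String b : a3)
                    if (!relation.contains(List.of(g, m, b))) return false;
        return true;
    }

    // A box is a triadic concept when it is maximal: adding any single
    // missing element to one component would leave the relation.
    boolean isTriadicConcept(Set<String> a1, Set<String> a2, Set<String> a3) {
        if (!isBox(a1, a2, a3)) return false;
        for (String g : objects)
            if (!a1.contains(g) && isBox(Set.of(g), a2, a3)) return false;
        for (String m : attributes)
            if (!a2.contains(m) && isBox(a1, Set.of(m), a3)) return false;
        for (String b : conditions)
            if (!a3.contains(b) && isBox(a1, a2, Set.of(b))) return false;
        return true;
    }

    public static void main(String[] args) {
        TriadicContextSketch ctx = new TriadicContextSketch();
        // (user, resource, tag) assignments of a toy folksonomy.
        ctx.add("alice", "paper1", "fca");
        ctx.add("alice", "paper1", "lattice");
        ctx.add("bob", "paper1", "fca");
        ctx.add("bob", "paper2", "fca");
        System.out.println(ctx.isTriadicConcept(
            Set.of("alice", "bob"), Set.of("paper1"), Set.of("fca"))); // prints true
        System.out.println(ctx.isTriadicConcept(
            Set.of("alice"), Set.of("paper1"), Set.of("fca"))); // prints false
    }
}
```

The second call returns false because the box can still be extended with the user "bob", so it is not maximal.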

Some studies try to reduce the complexity of visualization and navigation of the triadic lattice [15], [17] by proposing navigation strategies and graphical representations for triadic settings, based on the classical Hasse diagram. In [15], the authors defined the T-iPred procedure as an adaptation of the iPred algorithm [1] (used to compute concept lattices) to represent the Hasse diagram of triadic concepts.

The graphical representation proposed in [15] can be used to explore and search for knowledge hidden in triadic contexts. For example, it would be possible to identify concepts in the diagram that are exact or approximate answers to a given query. In case of an approximation, the closest concepts to the query are sought in the Hasse diagram.

In this work we propose a method to uncover patterns from triadic contexts using T-iPred as a graphical basis to display and retrieve the answer to a user query, exploiting the clarity and expressiveness of this diagram. Patterns are represented by formal concepts that match or approximate the query, which is expressed as a triple of object, attribute, and condition sets. Our proposed lattice search strategy uses the diagram generated by T-iPred to display the result of the query as either the exact answer or the upper and lower covers for concepts that approximate the triple (query). Our research work was mostly implemented in Java and its software architecture can be seen in Fig. 2. Our contributions are in the third module, where we propose three types of queries that can be addressed to the output of the second module using all the triadic dimensions or any combination of them. The three modules are:

  1. The Data-Peeler algorithm [5] computes the triadic concepts.

  2. The T-iPred algorithm builds the precedence links and displays the Hasse diagram of triadic concepts.

  3. The concept approximation is performed using three types of queries.
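The idea of answering a query either exactly or through upper and lower covers can be sketched as follows. This is a deliberately simplified illustration, not the authors' procedure: it considers only one dimension (the extent), works on a plain list of distinct extents rather than on the T-iPred diagram, and uses a naive quadratic scan. The class `ExtentQuerySketch`, its method names, and the sample extents are assumptions. Java 9 or later is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: not the paper's algorithm; names are hypothetical.
final class ExtentQuerySketch {
    private static boolean strictSubset(Set<String> small, Set<String> big) {
        return big.containsAll(small) && !big.equals(small);
    }

    // Extents equal to the query: the exact answer, if any.
    static List<Set<String>> exact(List<Set<String>> extents, Set<String> query) {
        List<Set<String>> result = new ArrayList<>();
        for (Set<String> e : extents) {
            if (e.equals(query)) result.add(e);
        }
        return result;
    }

    // Minimal extents that strictly contain the query.
    static List<Set<String>> upperCovers(List<Set<String>> extents, Set<String> query) {
        List<Set<String>> result = new ArrayList<>();
        for (Set<String> e : extents) {
            if (!strictSubset(query, e)) continue;
            boolean minimal = extents.stream()
                .noneMatch(o -> strictSubset(query, o) && strictSubset(o, e));
            if (minimal) result.add(e);
        }
        return result;
    }

    // Maximal extents that are strictly contained in the query.
    static List<Set<String>> lowerCovers(List<Set<String>> extents, Set<String> query) {
        List<Set<String>> result = new ArrayList<>();
        for (Set<String> e : extents) {
            if (!strictSubset(e, query)) continue;
            boolean maximal = extents.stream()
                .noneMatch(o -> strictSubset(e, o) && strictSubset(o, query));
            if (maximal) result.add(e);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Set<String>> extents = List.of(
            Set.of("u1"), Set.of("u1", "u2"), Set.of("u1", "u2", "u3"), Set.of("u3"));
        Set<String> query = Set.of("u1", "u3");
        System.out.println("exact: " + exact(extents, query));
        System.out.println("upper: " + upperCovers(extents, query));
        System.out.println("lower: " + lowerCovers(extents, query));
    }
}
```

For the query {u1, u3} there is no exact match in this toy list; the sketch returns the single upper cover {u1, u2, u3} and the two lower covers {u1} and {u3}. A full triadic query would also have to account for the intent and modus components, which this sketch omits.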

This article is organized as follows: Section 2 presents the theoretical basis of Triadic Concept Analysis, while Section 3 gives a brief overview of concept approximation and lattice visualization in FCA. Section 4 presents the concept approximation and the three types of queries, which can be based on object, attribute, and condition sets, or any combination of these. Section 5 presents the empirical study and, finally, Section 6 gives our conclusions and possible future work.

Background

In this section, we present some of the basic definitions of Triadic Concept Analysis such as triadic contexts, concepts and quasi-orders between concepts. We focus on the triadic approach, but further reading about FCA can be found in [2], [8], [19].

Related work

While there are many studies on drawing, visualizing, and navigating through Hasse diagrams in the form of flat or nested line diagrams, there has been a limited amount of work on concept approximation in FCA, i.e., in the dyadic framework [6], [14], [21]. Most of the studies are based either on the notion of preconcept [3], [21] or on Pawlak’s approximation in Rough Set Theory (RST) [22]. A pair c = (X, Y), where X and Y are object and attribute subsets respectively, is a preconcept if X ⊆ Y′ or

Concept approximation

Even with the Hasse diagram of triadic concepts ordered according to the extent, intent, or modus, as proposed in [15], the need for techniques to extract knowledge from the triadic context remains a challenge in the TCA community. Working with large datasets can produce huge Hasse diagrams, even in the dyadic case. In TCA this problem is more complex due to the number of concepts generated from the triadic data. There have been efforts in the literature to propose paradigms for

Empirical study

This section presents some experimental results obtained through the implementation of the proposed algorithms. The main objective is to empirically evaluate the performance of the algorithms in a real scenario.

All the tests were performed on an Intel Core i5-9400F 2.90 GHz with 16 GB of RAM using Java 8 with JDK 1.8. We adapted the Mushroom dataset to the TCA framework to run our experiments. We created two contexts to test the

Conclusion

In this article, we propose an approach for the extraction and visualization of knowledge in triadic contexts using queries and the Hasse diagram of triadic concepts proposed in [15]. These queries, defined as triples (A1,A2,A3) with the instantiation of zero to three components, aim to find the closest concepts to a given triple. Using the queries, the user can retrieve concepts either as the exact answer or as an approximation. The queries are useful to discover patterns in triadic contexts,

CRediT authorship contribution statement

Kaio H.A. Ananias: Conceptualization, Methodology, Software, Writing - original draft, Data curation. Rokia Missaoui: Conceptualization, Supervision, Project administration, Writing - review & editing. Pedro H.B. Ruas: Visualization, Validation, Writing - review & editing. Luis E. Zarate: Visualization, Validation, Writing - review & editing. Mark A.J. Song: Conceptualization, Supervision, Project administration, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

The second author acknowledges the financial support of the Natural Sciences and Engineering Research Council of Canada (NSERC), while the rest of the authors thank FAPEMIG (CEX-APQ-00997-15), CNPq (404431/2016-0) and CAPES (1196329) for the financial support. All the authors would like to express their gratitude to the anonymous reviewers for their questions and suggestions.

References (22)

  • R. Godin et al., Lattice model of browsable data spaces, Information Sciences (1986)
  • R. Jäschke et al., Discovering shared conceptualizations in folksonomies, Journal of Web Semantics (2008)
  • J. Baixeries et al., Yet a faster algorithm for building the Hasse diagram of a concept lattice
  • B. Ganter et al., Formal Concept Analysis (1999)
  • C. Burgmann, R. Wille, The basic theorem on preconcept lattices, in: Missaoui, R., Schmid, J. (Eds.), Formal Concept...
  • C. Carpineto et al., Concept Data Analysis: Theory and Applications (2004)
  • L. Cerf et al., Data-Peeler: Constraint-based closed pattern mining in n-ary relations
  • V. Cross, W. Yi, Approximation and similarity in concept lattices, in: NAFIPS 2009–2009 Annual Meeting of the North...
  • C.V. Glodeanu, Tri-ordinal factor analysis, International Conference on Formal Concept Analysis, Springer (2013)
  • L.L. Kis, C. Sacarea, D. Troanca, FCA tools bundle - a tool that enables dyadic and triadic conceptual navigation, in:...
  • F. Lehmann et al., A triadic approach to formal concept analysis, Conceptual Structures: Applications, Implementation and Theory (1995)
Kaio H. A. Ananias received his B.Sc. and M.Sc. degrees in Computer Science, both from the Pontifical Catholic University of Minas Gerais (2018 and 2020, respectively). He is currently a Software Engineer at SBTUR and his research interests include Triadic Concept Analysis, Formal Concept Analysis, Programming Languages, and Compiler Technologies.

Rokia Missaoui obtained her PhD in Computer Science from University of Montreal in 1988 and has been a university professor in Canada for more than thirty-two years. She is currently a full professor in the Department of Computer Science and Engineering at the University of Quebec in Outaouais (UQO). Before joining UQO in 2002, she was a professor at the University of Quebec in Montreal (UQAM) for fifteen years. She leads the LARIM laboratory at UQO and is a member of the LATECE laboratory at UQAM. Since the beginning of her career, she has been involved in several research projects funded by granting agencies and industrial partners. Her research interests and teaching activities are currently focused on advanced databases, data mining and warehousing, machine learning, social network analysis and big data analytics.

Pedro Henrique B. Ruas received his B.Sc. degree in Computer Information Systems and his master’s degree in Computer Science, both from the Pontifical Catholic University of Minas Gerais (2016), and he is presently a Ph.D. student in Computer Science at PUC Minas. He is currently a Data Scientist at Rocketmat AI and works mainly in the following areas: Triadic Concept Analysis, Formal Concept Analysis and Data Mining.

Luis E. Zarate received his M.Sc. and PhD from the Federal University of Minas Gerais (UFMG) (1992 and 1998, respectively). He graduated in Electrical Engineering from URP (Peru) in 1981. Since 1992 he has been working in research and teaching various disciplines at the Pontifical Catholic University of Minas Gerais. Professor Zárate conducts research in Data Mining and Big Data, Formal Concept Analysis and Soft Computing. He works as coordinator of the Laboratory of Computational Intelligence at PUC-MG (LICAP) and is responsible for several research projects that are supported by governmental organizations in Brazil.

Mark A. J. Song received his B.Sc., M.Sc. and PhD from the Federal University of Minas Gerais, in 1991, 1996 and 2004, respectively. Since 1993 he has been a researcher and professor at the Pontifical Catholic University of Minas Gerais. He is the coordinator of the Laboratory of Formal Methods at PUC-MG (LabMF) and is responsible for several research projects funded by granting agencies and by governmental organizations in Brazil. His research areas include formal verification, model verification, software testing and formal concept analysis.

    1

    ORCID: 0000-0002-1115-086X.
