Elsevier

Information Sciences

Volume 268, 1 June 2014, Pages 381-396

Derivation digraphs for dependencies in ordinal and similarity-based data

https://doi.org/10.1016/j.ins.2013.12.046

Abstract

We present a graph-based method of reasoning with if-then rules describing dependencies between attributes in ordinal and similarity-based data. The rules we consider have two basic interpretations: as attribute implications in object-attribute incidence data where objects are allowed to have attributes (features) to degrees, and as similarity-based functional dependencies in an extension of the Codd model of data. The main results of this paper show that degrees to which if-then rules are semantically entailed from sets (or graded sets) of other if-then rules can be characterized by the existence of particular directed acyclic graphs with vertices labeled by attributes and degrees coming from complete residuated lattices. In addition, we show that the construction of directed acyclic graphs can be used to compute closures of sets of attributes and to obtain normalized proofs.

Introduction

Reasoning with if-then rules is an important branch of computer science and relational data analysis. In its simplest form, the rules are considered as implications between conjunctions of attributes (features) and are intended to describe if-then dependencies between presence/absence of attributes. Since conjunction is idempotent, associative, and commutative, the rules are usually written as expressions A ⇒ B, where A and B are finite sets of attributes. The basic meaning of A ⇒ B is that if an object has all the attributes from A, then it has all the attributes from B. Rules of this form have been extensively studied in relational data analysis and in particular in formal concept analysis (FCA, see [18]), where they are known as attribute implications and serve as an indirect description of object-attribute clusters called formal concepts that can be found in object-attribute incidence data. The seminal contribution to FCA regarding attribute implications is [22], where the authors described minimal sets of attribute implications that convey information about all if-then dependencies valid in input data. Note that attribute implications can also be seen as a particular case of association rules where the support is disregarded and one is only interested in mining rules with confidence equal to 1, see [1], [26], [30], [37].

The rules also play a central role in dependency theory, most notably in relational database systems [15], [28], where they are used to specify constraints on relation schemes and are called functional dependencies. From the point of view of the syntax, functional dependencies are the same formulas as attribute implications but their interpretation is different. The rules are interpreted in relations on relation schemes (formal counterparts to database tables) and the basic meaning of A ⇒ B is that any two tuples which agree on all values of the attributes from A should also agree on all values of the attributes from B.
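The database reading of a rule can be illustrated by a small check over a table. The following is only a minimal sketch: the function name `satisfies_fd`, the toy relation `clients`, and its column names are hypothetical and do not come from the paper.

```python
from itertools import combinations


def satisfies_fd(rows, lhs, rhs):
    """Return True if every two rows that agree on all attributes in lhs
    also agree on all attributes in rhs (classical functional dependency)."""
    for r1, r2 in combinations(rows, 2):
        agree_lhs = all(r1[a] == r2[a] for a in lhs)
        agree_rhs = all(r1[b] == r2[b] for b in rhs)
        if agree_lhs and not agree_rhs:
            return False
    return True


# Hypothetical toy relation; each dict is one tuple of the relation.
clients = [
    {"city": "Olomouc", "zip": "77900", "age": 31},
    {"city": "Olomouc", "zip": "77900", "age": 45},
    {"city": "Brno", "zip": "60200", "age": 31},
]

print(satisfies_fd(clients, {"zip"}, {"city"}))  # True
print(satisfies_fd(clients, {"age"}, {"city"}))  # False
```

The second call fails because the first and third tuples agree on `age` but differ on `city`.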

An interesting property is that the two interpretations of the if-then rules yield the same notion of semantic entailment. As a result, one can use a single inference system for reasoning with both attribute implications and functional dependencies. The best known inference system has been proposed by Armstrong [3] and can be seen as a system consisting of two rules [25]:

  • (Ax)

    infer A∪B ⇒ B,

  • (Cut)

    from A ⇒ B and B∪C ⇒ D infer A∪C ⇒ D,

where A, B, C, D are subsets of attributes. Various other systems have been proposed in order to normalize proofs and enable automated deduction techniques [2]. An interesting alternative graph-based approach that is also aimed at possible automated proving has been proposed by Maier in [27], see also [28] for an extensive description and its application for theorem proving. Note that in [6] the authors use derivation trees which can be seen as an analogous approach. Later related works include [34] on hypergraph formalisms for representation of database schemes and [4] introducing FD-graphs analogous to the approach by Maier.
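For the classical (non-graded) case, entailment by (Ax) and (Cut) is commonly decided by computing the closure of the antecedent under the given rules. The sketch below illustrates that standard technique; it is not the graph-based procedure of this paper, and the names `closure`, `entails`, and the sample `theory` are hypothetical.

```python
def closure(attrs, theory):
    """Least superset of attrs closed under all rules (lhs, rhs) in theory."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in theory:
            # A rule fires when its antecedent is already contained in the set.
            if lhs <= closed and not rhs <= closed:
                closed |= rhs
                changed = True
    return closed


def entails(theory, lhs, rhs):
    """lhs => rhs follows from theory iff rhs lies in the closure of lhs."""
    return set(rhs) <= closure(lhs, theory)


# Hypothetical theory: {a} => {b} and {b, c} => {d}.
theory = [
    (frozenset("a"), frozenset("b")),
    (frozenset("bc"), frozenset("d")),
]

print(sorted(closure({"a", "c"}, theory)))   # ['a', 'b', 'c', 'd']
print(entails(theory, {"a", "c"}, {"d"}))    # True
print(entails(theory, {"a"}, {"d"}))         # False
```

The first entailment mirrors one application of (Cut): from {a} ⇒ {b} and {b, c} ⇒ {d} one obtains {a, c} ⇒ {d}.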

The present paper is devoted to graph-based inference for if-then rules where the attributes are graded, meaning that each attribute in the antecedent and consequent of A ⇒ B is equipped with a threshold degree saying that an object must have the attribute at least to a specified degree, or in the database interpretation, two tuples must have similar values of the attribute at least to a specified degree. Technically, the rules are considered as implications between graded sets (fuzzy sets), i.e., rules of the form A ⇒ B, where A and B are maps from the set of all attributes to a suitable scale of degrees (typically a real unit interval or its finite subset). Rules of this form naturally appear in dependency analysis of object-attribute relational data with grades [8] and similarity-based relational databases [13] and have been proposed and studied in the past, see [8] for an overview of results. If-then formulas of a similar form with antecedents containing constants for truth degrees have been studied before in [10] and proved to be useful for characterization of classes of algebras with fuzzy equalities closed under various class operators (varieties, quasivarieties, semivarieties, and sur-reflective classes, see [11], [36]). We therefore consider exploration of inference systems for rules in this broad family an important task from both the theoretical and practical points of view.
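To make the graded reading concrete, the sketch below evaluates a graded rule in one graded set of attributes. Several assumptions are made that are not stated in the text above: the scale is the unit interval with the Łukasiewicz operations, the degree of validity is taken as S(A, M)* → S(B, M) (the definition usually given in the fuzzy attribute logic literature, with S the subsethood degree and * a hedge), and all names and sample data are hypothetical.

```python
from fractions import Fraction as F


def tnorm(a, b):
    """Lukasiewicz multiplication (assumed choice of structure)."""
    return max(F(0), a + b - 1)


def residuum(a, b):
    """Lukasiewicz residuum, adjoint to tnorm."""
    return min(F(1), 1 - a + b)


def subsethood(A, M):
    """Degree S(A, M) to which graded set A is included in graded set M."""
    attrs = set(A) | set(M)
    return min((residuum(A.get(y, F(0)), M.get(y, F(0))) for y in attrs),
               default=F(1))


def globalization(a):
    """Hedge mapping 1 to 1 and every other degree to 0."""
    return F(1) if a == 1 else F(0)


def validity(A, B, M, hedge=lambda a: a):
    """Assumed definition: degree to which A => B is true in M."""
    return residuum(hedge(subsethood(A, M)), subsethood(B, M))


# Hypothetical rule and data: "salary at least 1/2 => loan at least 3/4".
A = {"salary": F(1, 2)}
B = {"loan": F(3, 4)}
M = {"salary": F(3, 4), "loan": F(1, 2)}
M2 = {"salary": F(1, 4)}

print(validity(A, B, M))                         # 3/4
print(validity(A, B, M2))                        # 1/2
print(validity(A, B, M2, hedge=globalization))   # 1
```

Exact fractions are used instead of floats so that comparisons of degrees are not affected by rounding. With the globalization hedge the rule is fully true in `M2` because the antecedent is not satisfied to degree 1 there.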

Looking for a graph-based inference system for graded if-then rules is interesting from several viewpoints. First, the notion of semantic entailment of the rules we consider is graded, i.e., the entailment expresses a degree to which a rule follows from other rules. It is therefore interesting to find a graph-based inference system that is able to infer rules from other ones including the entailment degrees. Second, there is an Armstrong-like axiomatization of the semantic entailment [8] for the graded rules, i.e., one might be interested in finding a corresponding graph-based inference method. Third, the Armstrong-like proofs can be formalized to form particular sequences (so-called MRAP-sequences, see [12]). It is therefore interesting to observe whether the graph-based proofs can be constructed according to the normalized proofs and vice versa. We address all the issues in this paper.

The paper is organized as follows. In Section 2, we present preliminaries from residuated structures and fuzzy attribute implications (FAIs). Section 3 introduces derivation digraphs as particular labeled acyclic digraphs constructed from collections of FAIs. In Section 4, we prove that the construction of digraphs can be used to describe degrees of semantic entailment, i.e., we show a completeness result using the graph-based inference. Sections 5 and 6 elaborate on the related issues of computing closures and obtaining normalized proofs. We conclude the paper with an illustrative example in Section 7 and conclusions in Section 8.

Section snippets

Preliminaries

We recall basic notions of directed graphs, residuated lattices, and fuzzy attribute implications. Details can be found in [5], [7], [17], [21], [23].

A directed graph (a digraph) is a pair D = ⟨V, A⟩, where V is a nonempty finite set of elements called vertices and A is a binary relation A ⊆ V×V; each ⟨v, w⟩ ∈ A is called an arc. V and A are called the vertex set and the arc set of D, respectively. If ⟨v, w⟩ ∈ A, we say that the arc ⟨v, w⟩ leaves v and enters w. A digraph D = ⟨V, A⟩ is acyclic (in short, D is

Derivation acyclic digraphs for FAIs

We now introduce derivation digraphs as particular acyclic digraphs where vertices are labeled by attributes from Y and degrees from L. The arcs of the digraphs will correspond to fuzzy attribute implications from an input theory and indicate which formulas from the theory are used in the process of inference. In what follows, L is a complete residuated lattice. In order to denote that ∗ is a hedge on L, we write L∗.

Definition 1

T-based L-derivation DAG

Let T be a set of FAIs over Y.

  • 1.

    Any D = ⟨V, ∅⟩ such that V ⊆ Y×L and for every y ∈ Y

Completeness

We now turn our attention to completeness, by which we mean a characterization of the semantic entailment by the existence of L-derivation DAGs. We prove the claim by showing that an FAI is provable from a theory T iff it has a T-based L-derivation DAG.

In order to simplify the proofs in this section, we use the following derived inference rules [8, Lemma 3.1]:

  • (Add)

    from A ⇒ B and A ⇒ C infer A ⇒ B∪C,

  • (Pro)

    from A ⇒ B∪C infer A ⇒ B.
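As a sketch of why these two rules are derivable, the following derivations use only (Ax) and (Cut) in the form stated in the Introduction, together with idempotency and commutativity of ∪. This is an illustration under the assumption that the same two rules are available for graded sets; the inference system of [8] may contain further rules, which are not needed here.

```latex
% (Add): from A => B and A => C infer A => B u C
\begin{align*}
1.\;& A \Rightarrow B && \text{assumption} \\
2.\;& A \Rightarrow C && \text{assumption} \\
3.\;& B \cup C \Rightarrow B \cup C && \text{(Ax)} \\
4.\;& A \cup B \Rightarrow B \cup C && \text{(Cut) on 2 and 3} \\
5.\;& A \Rightarrow B \cup C && \text{(Cut) on 1 and 4, since } A \cup A = A
\end{align*}

% (Pro): from A => B u C infer A => B
\begin{align*}
1.\;& A \Rightarrow B \cup C && \text{assumption} \\
2.\;& (A \cup C) \cup B \Rightarrow B && \text{(Ax)} \\
3.\;& A \Rightarrow B && \text{(Cut) on 1 and 2, since } A \cup A = A
\end{align*}
```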

By a derivable rule we mean that for all A, B, C, D, E ∈ L^Y, from the part preceding “infer”,

Computing closures

Treating the construction of T-based L-derivation DAGs as an alternative proof technique not only helps visualize inference from if-then rules; the construction of such DAGs also yields algorithms for checking whether (and to what degree) A ⇒ B semantically follows from a theory. Indeed, in order to check whether ||A ⇒ B||_T = 1, we may proceed as follows:

Procedure 1

[Checking of full entailment] For any T and A ⇒ B:

  • 1.

    Construct a T-based L-derivation DAG D = ⟨V, ∅⟩ with V = {⟨y, A(y)⟩ | A(y) > 0}; if V = ∅,

Proofs in normalized forms

We now show that T-based L-derivation DAGs are in a correspondence with normalized proofs called MRAP-sequences [12]. A nonconstructive proof of this observation already follows from our observations since we have already established a completeness result in Theorem 5, and the inference system for the MRAP-sequences is also known to be complete, i.e., there is a correspondence between T-based L-derivation DAGs and MRAP-sequences representing the same proofs. Nevertheless, we focus in this

Illustrative example

For further illustration of the procedures discussed in the paper, we present here an extended example in which we use the similarity-based database semantics of formulas. Assume that a bank is keeping the following information about clients: city of residence (attribute c), age (attribute a), education (attribute e), job position (attribute j), salary (attribute s), loan amount (attribute l), account balance (attribute b), number of children (attribute ch), and insurance products (attribute i

Conclusions

The paper presents an initial study of graph-based inference methods for graded if-then rules. We have introduced a notion of a T-based L-derivation directed acyclic graph (DAG) which generalizes the ordinary notion of a T-based derivation DAG from [27]. The main results show that degrees of semantic entailment of if-then rules from collections of other if-then rules can be characterized by the existence of such directed acyclic graphs. We have proved the result by showing correspondences between

Acknowledgements

L. Urbanova is supported by Grant No. P103/11/1456 of the Czech Science Foundation. V. Vychodil is supported by project reg. No. CZ.1.07/2.3.00/20.0059 of the European Social Fund in the Czech Republic.

References (37)

  • Giorgio Ausiello et al., Graph algorithms for functional dependency manipulation, J. ACM (1983)
  • Jorgen Bang-Jensen et al., Digraphs: Theory, Algorithms and Applications, Springer Monographs in Mathematics (2010)
  • Catriel Beeri et al., Computational problems related to the design of normal form relational schemas, ACM Trans. Database Syst. (1979)
  • Radim Belohlavek, Fuzzy Relational Systems: Foundations and Principles (2002)
  • Radim Belohlavek et al., Attribute implications in a fuzzy setting
  • Radim Belohlavek, Vilem Vychodil, Axiomatizations of fuzzy attribute logic, in: B. Prasad (Ed.), Proc. 2nd Indian...
  • Radim Belohlavek et al., Fuzzy Horn logic I: proof theory, Arch. Math. Logic (2006)
  • Radim Belohlavek et al., Fuzzy Horn logic II: implicationally defined classes, Arch. Math. Logic (2006)