Matrix computation for information systems

doi:10.1016/S0020-0255(00)00072-4

Information Sciences

Volume 131, Issues 1–4, January 2001, Pages 129-156

https://doi.org/10.1016/S0020-0255(00)00072-4

Abstract

Rough set theory is a new mathematical tool to deal with vagueness and uncertainty. We need a practical approach to apply the theory. Some problems (for example, the general problem of finding all reducts) are NP-hard. Thus, it is important to investigate computational methods for the theory. In rough set theory, a table called an information system or database is used as a special kind of formal language to represent knowledge. We discuss matrix computation for information systems in this paper. Matrices can be computed intuitively and efficiently. A matrix can be seen as an internal representation of an equivalence relation. Furthermore, the matrix computation “min” implements the operation “AND” on equivalence relations, and the matrix relation “⩽” corresponds to the relation “⊆”. Finally, this operation is carried out on the matrix elements individually, so the “AND” matrix can be computed in parallel.

Introduction

This work has two objectives: first, to introduce rough set theory, developed by Pawlak, to a wider audience; second, to present computational methods for the theory, allowing it to be implemented in many more systems. Rough set theory is a new mathematical tool to deal with vagueness and uncertainty. This approach seems to be of fundamental importance to artificial intelligence and cognitive sciences. Although the burgeoning methodology has been successful in many real-life applications, there are still several theoretical problems to be solved. We need a practical approach to apply the theory. Some problems (for example, the general problem of finding all reducts) are NP-hard. Thus, it is important to investigate computational methods for the theory. This paper is part of that effort. In rough set theory, a table called an information system or database is used as a special kind of formal language to represent knowledge. The issue of knowledge representation is of primary importance. Semantically, knowledge is defined as partitions, and syntactically, as information systems.

Furthermore, representing knowledge in the form of a matrix has many advantages. For example, [1], [2], [3], [4], [5], [6], [7] proposed representing knowledge in the form of a discernibility matrix to enable simple computation of the core, reducts, etc.

We discuss matrix computation for information systems in this paper. Equivalence relations are expressed in terms of matrices which can be computed intuitively and efficiently. We can introduce “AND” operations on the family of matrices and the family of equivalence relations. These two families are isomorphic. Then we have partial ordering relations “⊆” on these families.

A matrix can be seen as an internal representation of an equivalence relation. Furthermore, the matrix computation “min” implements the operation “AND” on equivalence relations, and the matrix relation “⩽” corresponds to the relation “⊆”. Finally, this operation is carried out on the matrix elements individually, so the “AND” matrix can be computed in parallel.
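The correspondence between “min” and “AND”, and between “⩽” and “⊆”, can be illustrated with a minimal Python sketch. The helper names (`equivalence_matrix`, `mat_and`, `mat_leq`) and the four-object universe are assumptions made for illustration; they do not come from the paper.

```python
def equivalence_matrix(labels):
    """r_ij = 1 iff objects i and j carry the same class label, else 0."""
    n = len(labels)
    return [[1 if labels[i] == labels[j] else 0 for j in range(n)]
            for i in range(n)]

def mat_and(m1, m2):
    """Element-wise min: the matrix of the intersection of the two relations."""
    return [[min(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def mat_leq(m1, m2):
    """Element-wise <=: the relation behind m1 is contained in the one behind m2."""
    return all(a <= b for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))

# Hypothetical universe of four objects with two partitions of it.
m_theta = equivalence_matrix(["p", "p", "q", "q"])  # classes {u1,u2}, {u3,u4}
m_phi = equivalence_matrix(["s", "t", "t", "t"])    # classes {u1}, {u2,u3,u4}
m_both = mat_and(m_theta, m_phi)                    # classes {u1}, {u2}, {u3,u4}

print(m_both)
print(mat_leq(m_both, m_theta), mat_leq(m_theta, m_phi))
```

Each entry of `m_both` depends only on the two entries at the same position, which is why the element-wise work could be distributed across parallel workers.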

In another paper, we have discussed matrix computation for knowledge bases. In this paper, we discuss how to apply these methods to databases and provide new insight into properties of data and dependencies of attributes in databases. In Section 2, we discuss equivalence relations and matrices on a universe U and their “AND” intersections. The time complexity for computing the intersection matrix of two matrices is O(|U|²), where |U| is the cardinality of U. In Section 3, we introduce information systems and attribute matrices. An information system consists of two finite sets: a universe U and an attribute set A. We give Algorithm M, which finds the matrix corresponding to an attribute with time complexity O(|U|²). The algorithm can be run in parallel mode to compute concurrently the matrices corresponding to many attributes. In Section 4, we discuss matrices for many attributes. Section 5 discusses functional dependencies. The time complexity to check the functional dependency of two attribute subsets is O(|U|²). Section 6 discusses identity dependencies. The time complexity to check the identity dependency of two attribute subsets is O(|U|²). Section 7 introduces the significance of an attribute. Let the attribute set be A. The time complexity for computing a significance is O(|A|×|U|²). Algorithm C, with time complexity O(|A|²|U|²), is presented to find the core, that is, the set of significant attributes. Section 8 introduces attribute dependencies on A. Section 9 discusses keys, that is, the minimal identity dependent subsets of A. We present Algorithm A, with time complexity O(|A|³|U|²), to find one key. The algorithm can be run in parallel mode to compute concurrently all keys. Algorithm K, with time complexity O(2^|A|×|A|×|U|²), to find all keys is presented in Section 10. Algorithm H, with time complexity O(|A|×|U|²), and Algorithm H^′, with time complexity O(|A|²×|U|²), to find at most one key are presented in Section 11.

Section snippets

Equivalence matrices

Let U={u₁,u₂,…,u_n} be a non-empty finite set (n>0). We denote the number n of elements (called objects) in U by |U|. We call U the universe of objects.

A binary relation θ on U is called an equivalence (relation) if it is

(i) reflexive: $u θ u$ for all u∈U;
(ii) symmetric: if $u θ v$ , then $v θ u$ for all u,v∈U;
(iii) transitive: if $u θ v$ and $v θ w$ , then $u θ w$ for all u,v,w∈U.

An equivalence relation θ gives a partition π_θ={X₁,X₂,…,X_{|π_θ|}} such that u and v in U are in the same subset X if and only if $u θ v$, and vice

Information systems and attribute matrices

An information system $I$ is a system 〈U,A〉, where (1) U={u₁,u₂,…,u_i,…,u_|U|} is a finite non-empty set, called universe or object space; elements of U are called objects; (2) A={a₁,a₂,…,a_l,…,a_|A|} is also a finite non-empty set; elements of A are called attributes; (3) for every a∈A there is a mapping a from U into some space $a : U→a(U)$ , and $a(U)={a(u) | u∈U}$ is called the domain of attribute a. An information system is also called a knowledge representation system, an attribute-value system, or a

Matrices for many attributes

Now, consider a subset X⊆A. We have a corresponding equivalence matrix M_X=[r_ij] as follows: r_ij=1 if and only if a(u_i)=a(u_j) for every a∈X. We know that $M_{X} =∩_{a∈X} M_{a}$.

For the empty set, we take M_∅=M_δ, where $M_{δ} =[r_{ij}], r_{ij} =1$ for all i,j=1,2,…,|U|. That is, M_δ is the identity: M_δ∩M=M for every equivalence matrix M over U.

We know that M_δ is the greatest element (with respect to ⩽) in the semigroup (M,∩) formed by the set $M$ of equivalence matrices under the intersection operation ∩.
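As a rough illustration of these definitions, the sketch below builds M_a for single attributes and M_X as the element-wise min of the M_a, with the all-ones matrix M_δ as the result for X=∅. The table, attribute names, and function names are hypothetical, and this is not the paper's Algorithm M.

```python
from functools import reduce

def attribute_matrix(column):
    """M_a: r_ij = 1 iff a(u_i) == a(u_j). Costs O(|U|^2) comparisons."""
    n = len(column)
    return [[1 if column[i] == column[j] else 0 for j in range(n)]
            for i in range(n)]

def intersect(m1, m2):
    """Element-wise min of two 0/1 matrices."""
    return [[min(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def subset_matrix(table, attrs):
    """M_X as the intersection of M_a over a in X; all ones when X is empty."""
    n = len(next(iter(table.values())))
    ones = [[1] * n for _ in range(n)]  # M_delta, the identity for intersection
    return reduce(intersect, (attribute_matrix(table[a]) for a in attrs), ones)

# Hypothetical information system: three objects, two attributes.
table = {"color": ["red", "red", "blue"], "size": ["S", "L", "L"]}

m_color = subset_matrix(table, ["color"])
m_both = subset_matrix(table, ["color", "size"])
m_empty = subset_matrix(table, [])
print(m_color, m_both, m_empty, sep="\n")
```

Starting the reduction from the all-ones matrix makes the empty-set case fall out without special handling, mirroring M_δ∩M=M.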

Theorem 4.1

Let 〈U,A〉 be an information system. Then,

Functional dependencies

Definition 5.1

A functional dependency (FD) between two subsets X,Y⊆A of attributes in an information system $I =〈U,A〉$ , is a statement, denoted by X→Y, which holds in the information system $I$ , if and only if, for every pair u,v∈U we have that a(u)=a(v) for all a∈X implies a(u)=a(v) for all a∈Y.

Theorem 5.1

A functional dependency X→Y holds if and only if M_X⩽M_Y; i.e., $∩_{a∈X} M_{a} ⩽ ∩_{a∈Y} M_{a}$.
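The matrix test in Theorem 5.1 can be sketched in a few lines of Python. The table and the names `subset_matrix` and `holds_fd` are assumptions for illustration. Here M_X is built directly from its definition, so building the matrices costs O(|X|×|U|²), while the comparison itself is O(|U|²).

```python
def subset_matrix(table, attrs):
    """M_X: r_ij = 1 iff a(u_i) == a(u_j) for every a in X (all ones if X is empty)."""
    n = len(next(iter(table.values())))
    return [[1 if all(table[a][i] == table[a][j] for a in attrs) else 0
             for j in range(n)] for i in range(n)]

def holds_fd(table, xs, ys):
    """X -> Y holds iff M_X <= M_Y element-wise."""
    mx, my = subset_matrix(table, xs), subset_matrix(table, ys)
    return all(a <= b for rx, ry in zip(mx, my) for a, b in zip(rx, ry))

# Hypothetical table: the code determines the city, but not the other way round.
table = {
    "code": ["A", "A", "B", "C"],
    "city": ["X", "X", "Y", "Y"],
}

print(holds_fd(table, ["code"], ["city"]))
print(holds_fd(table, ["city"], ["code"]))
```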

Proof

By Definition 5.1, we know that X→Y, if and only if, for every pair u,v∈U we have that a(u)=a(v) for a∈X implies a(u)=a(v) for a∈Y; i.e., uθ_Xv implies

Identity dependencies

Definition 6.1

An identity dependency (ID) between two subsets X,Y⊆A of attributes in an information system $I =〈U,A〉$ , is a statement, denoted by X↔Y, which holds in the information system $I$ , if and only if, both X→Y and Y→X hold.

From Theorem 5.1, we have the following.

Theorem 6.1

An identity dependency X↔Y holds if and only if M_X=M_Y; i.e., $∩_{a∈X} M_{a} = ∩_{a∈Y} M_{a}$.

Thus, we can easily check whether or not X↔Y for X,Y⊆A by checking M_X=M_Y, and the time complexity for checking it is O(|U|²).
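A minimal sketch of this equality check follows, assuming a small hypothetical table (not the Skull database) and illustrative function names.

```python
def subset_matrix(table, attrs):
    """M_X: r_ij = 1 iff a(u_i) == a(u_j) for every a in X."""
    n = len(next(iter(table.values())))
    return [[1 if all(table[a][i] == table[a][j] for a in attrs) else 0
             for j in range(n)] for i in range(n)]

def holds_id(table, xs, ys):
    """X <-> Y holds iff M_X == M_Y; the comparison is O(|U|^2)."""
    return subset_matrix(table, xs) == subset_matrix(table, ys)

# Hypothetical table: "code" and "name" induce the same partition, "flag" does not.
table = {
    "code": [1, 1, 2],
    "name": ["a", "a", "b"],
    "flag": [0, 1, 1],
}

print(holds_id(table, ["code"], ["name"]))
print(holds_id(table, ["code"], ["flag"]))
```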

Example 6.1

For the Skull database, we have the following e.g.: x

Significance and core

Let $I =〈U,A〉$ be an information system.

Definition 7.1

Let X be a non-empty subset of A: ∅⊂X⊆A. Given an attribute x∈X, we say that x is significant in X if $M_{X} < M_{X−\{x\}}$; and that x is not significant or nonsignificant in X if $M_{X} = M_{X−\{x\}}$.

That is, x∈X is significant in X if and only if $X \not\leftrightarrow X−\{x\}$; x∈X is not significant in X if and only if X↔X−{x}.
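The significance test can be sketched as follows. Because removing an attribute can only enlarge the matrix, testing for inequality is enough to detect the strict “<”. The sketch also collects the attributes that are significant in A, which is how the introduction describes the core. The table and function names are hypothetical, and this is not the paper's Algorithm C.

```python
def subset_matrix(table, attrs):
    """M_X: r_ij = 1 iff a(u_i) == a(u_j) for every a in X (all ones if X is empty)."""
    n = len(next(iter(table.values())))
    return [[1 if all(table[a][i] == table[a][j] for a in attrs) else 0
             for j in range(n)] for i in range(n)]

def is_significant(table, xs, x):
    """x is significant in X iff M_X differs from M_{X - {x}}."""
    rest = [a for a in xs if a != x]
    return subset_matrix(table, xs) != subset_matrix(table, rest)

def core(table):
    """Attributes that are significant in the full attribute set A."""
    attrs = list(table)
    return [a for a in attrs if is_significant(table, attrs, a)]

# Hypothetical table in which attribute "c" duplicates attribute "a".
table = {
    "a": [0, 0, 1, 1],
    "b": [0, 1, 0, 1],
    "c": [0, 0, 1, 1],
}

print(core(table))
```

In this table either of "a" or "c" can be dropped without changing M_A, so neither is significant in A, while dropping "b" merges objects and changes the matrix.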

Example 7.1

Let X={x}, a singleton in 2^A.

Notice that $M_{X} =∩_{x∈X} M_{x} =M_{x},M_{X−{x}} =M_{∅} =M_{δ} =[r_{ij}], r_{ij} =1$ for all i,j. We have the following.

1. x is significant in X if M_x≠M_δ.
2. x is not significant in X if M_x=M

Attribute dependencies

Let 〈U,A〉 be an information system, where U is the universe, and A is the set of attributes.

Definition 8.1

Let X be a non-empty subset of A: $∅⊂X⊆A$. The non-empty subset X is said to be independent if each x∈X is significant in X; otherwise X is dependent.

The empty set ∅ is said to be independent.

Example 8.1

Let X={x}, a singleton in 2^A.

We have the following.

Case A. M_x≠M_δ. X is independent since x is significant in X.
Case B. M_x=M_δ. X is dependent since x is not significant.
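Definition 8.1 and the two singleton cases can be checked with a short sketch. The table and the function names are hypothetical; the constant attribute "d" plays the role of Case B, since its matrix equals M_δ.

```python
def subset_matrix(table, attrs):
    """M_X: r_ij = 1 iff a(u_i) == a(u_j) for every a in X (all ones if X is empty)."""
    n = len(next(iter(table.values())))
    return [[1 if all(table[a][i] == table[a][j] for a in attrs) else 0
             for j in range(n)] for i in range(n)]

def is_independent(table, xs):
    """X is independent iff every x in X is significant in X.

    The empty set is independent because all() of an empty sequence is True.
    """
    full = subset_matrix(table, xs)
    return all(
        subset_matrix(table, [a for a in xs if a != x]) != full for x in xs
    )

# Hypothetical table: "c" duplicates "a", and "d" is constant (M_d = M_delta).
table = {
    "a": [0, 0, 1, 1],
    "b": [0, 1, 0, 1],
    "c": [0, 0, 1, 1],
    "d": [7, 7, 7, 7],
}

print(is_independent(table, ["a"]))       # Case A: M_a differs from M_delta
print(is_independent(table, ["d"]))       # Case B: M_d equals M_delta
print(is_independent(table, ["a", "c"]))  # one of the two is redundant
```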

Example 8.2

From Example 7.3, for the Skull database we know

Keys

Let 〈U,A〉 be an information system.

Definition 9.1

Let X⊆A be a (possibly empty) subset of A. A subset X₀ of X is said to be a key of X if X₀ satisfies

1. X₀↔X; i.e., M_X₀=M_X;
2. if X′⊂X₀ then $X′ \not\leftrightarrow X$; i.e., if X′⊂X₀ then M_X<M_X′.

The empty subset ∅ has key ∅ (see Example 9.1 below).

That is, X₀ is a minimal identity dependent subset of X, i.e., we have X↔⋯↔X₀→X′→⋯ for X⊇⋯⊇X₀⊃X′⊇⋯.
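One simple way to obtain a single key without enumerating subsets is to drop attributes one at a time whenever the matrix stays equal to M_X. The sketch below uses that elimination idea; it is not the paper's Algorithm A or Algorithm H, and the table and function names are hypothetical. A single pass suffices because an attribute that cannot be dropped from a larger set cannot be dropped from any smaller set containing it.

```python
def subset_matrix(table, attrs):
    """M_X: r_ij = 1 iff a(u_i) == a(u_j) for every a in X (all ones if X is empty)."""
    n = len(next(iter(table.values())))
    return [[1 if all(table[a][i] == table[a][j] for a in attrs) else 0
             for j in range(n)] for i in range(n)]

def one_key(table, xs):
    """Return one key of X by removing attributes that leave M_X unchanged."""
    target = subset_matrix(table, xs)
    current = list(xs)
    for x in xs:
        trial = [a for a in current if a != x]
        if subset_matrix(table, trial) == target:
            current = trial  # x is redundant given the remaining attributes
    return current

# Hypothetical table in which attribute "c" duplicates attribute "a".
table = {
    "a": [0, 0, 1, 1],
    "b": [0, 1, 0, 1],
    "c": [0, 0, 1, 1],
}

print(one_key(table, ["a", "b", "c"]))
print(one_key(table, ["c", "b", "a"]))
```

Which key is returned depends on the order in which attributes are tried, so different orders may yield different keys of the same set.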

From this definition, the time complexity to find keys is exponential. First, we need to consider all |2^X|=2^|X| subsets of X. And for every

Finding all keys for a database

Let $I =〈U,A〉$ be an information system, where U is the universe and A is the set of attributes. We want to find all its keys. That is, we want to find all subsets $A_{0}$ of A, say $A_{01},A_{02},…,A_{0s}$, such that

(1) M_A₀=M_A; i.e., A₀↔A; and
(2) if A′⊂A₀ then M_A<M_A′; i.e., if A′⊂A₀ then $A′ \not\leftrightarrow A$.
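Conditions (1) and (2) can be checked by brute force, visiting subsets in order of increasing size and skipping any candidate that contains an already found key. The sketch below is an illustrative exhaustive search under those assumptions, with a hypothetical table and function names; it is exponential in |A| and is not a transcription of the paper's Algorithm K.

```python
from itertools import combinations

def subset_matrix(table, attrs):
    """M_X: r_ij = 1 iff a(u_i) == a(u_j) for every a in X (all ones if X is empty)."""
    n = len(next(iter(table.values())))
    return [[1 if all(table[a][i] == table[a][j] for a in attrs) else 0
             for j in range(n)] for i in range(n)]

def all_keys(table):
    """Enumerate every minimal subset A0 of A with M_A0 == M_A."""
    attrs = list(table)
    target = subset_matrix(table, attrs)
    keys = []
    for k in range(len(attrs) + 1):           # smallest subsets first
        for cand in combinations(attrs, k):
            if any(set(key) <= set(cand) for key in keys):
                continue                      # contains a smaller key: not minimal
            if subset_matrix(table, cand) == target:
                keys.append(cand)
    return keys

# Hypothetical table in which attribute "c" duplicates attribute "a".
table = {
    "a": [0, 0, 1, 1],
    "b": [0, 1, 0, 1],
    "c": [0, 0, 1, 1],
}

print(all_keys(table))
```

Visiting subsets by increasing size guarantees minimality: any subset that matches M_A and contains no earlier key has no proper subset that matches M_A.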

Algorithm K

This algorithm finds all keys of A by searching from singletons to A. Let A={a₁,a₂,…,a_j,…,a_|A|}. We should check all the subsets of A. Let us denote the binomial coefficients by $C_{k}^{l} = \frac{l!}{k!(l−k)!}.$

(1) Let us denote C₁^|A|=|A| singletons,

Finding one key

Algorithm H

Let 〈U,A〉 be an information system. Let X={x₁,x₂,…,x_j,…,x_|X|} be a subset of A. This algorithm finds at most one key of X.

Step 1. If |X|=1,X={x₁},M_x₁=M_δ, then ∅ is the unique key of X and the algorithm is completed. If |X|⩾1 and there exists an x∈X such that M_x≠M_δ then compute sig(x_j)=M_δ−M_{x_j} for j=1,2,…,|X|. Suppose that |sig(x_j₁)|⩾|sig(x_j₂)|⩾|sig(x_j₃)|⩾⋯⩾|sig(x_{j_|X|−1})|⩾|sig(x_{j_|X|})|, where |sig(x_j₁)|>0.

Step 2. If |sig(x_j₂)|=0 then (M_{x_j₂}=M_{x_j₃}=⋯=M_{x_{j_|X|}}=M_δ so we can take X={x_j₁,x_j₂} and) {x_j₁} is
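The ranking in Step 1 can be sketched as follows, under the assumption that |sig(x)| counts the entries equal to 1 in M_δ−M_x, that is, the ordered pairs of objects on which x takes different values. That reading, the table, and the function names are assumptions for illustration, and the sketch covers only the ordering step, not the rest of Algorithm H.

```python
def sig_size(column):
    """Number of ordered pairs (i, j) with x(u_i) != x(u_j).

    Assumed to equal |sig(x)| = |M_delta - M_x|, counting nonzero entries.
    """
    n = len(column)
    return sum(1 for i in range(n) for j in range(n) if column[i] != column[j])

def rank_by_sig(table, xs):
    """Order the attributes of X by decreasing |sig(x)|."""
    return sorted(xs, key=lambda x: sig_size(table[x]), reverse=True)

# Hypothetical table: "b" separates every pair, "d" is constant (M_d = M_delta).
table = {
    "a": [0, 0, 1, 1],
    "b": [0, 1, 2, 3],
    "d": [7, 7, 7, 7],
}

sizes = {x: sig_size(table[x]) for x in table}
ranking = rank_by_sig(table, ["a", "b", "d"])
print(sizes, ranking)
```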

Summary

Rough set theory is a new mathematical tool to deal with vagueness and uncertainty. This approach seems to be of fundamental importance to artificial intelligence and cognitive sciences. It is important to investigate computational methods for the theory. In this paper, we suggest a series of algorithms for computation in information systems. In particular, we suggest the use of matrices. By using matrices, we can design algorithms with lower computational cost.

References (7)

Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data (1991)
Z. Pawlak et al., Rough sets, CACM (1995)
L. Polkowski et al., Rough Sets in Knowledge Discovery, vol. 1, Methodology and Applications (1998)

There are more references available in the full text version of this article.


¹: Sidler Clarke Inc., 6465 Millcreek Drive, Unit 205, Mississauga, Ont., Canada L5N 5R3.
