
1 Introduction

As the importance of data analysis increases, clustering techniques have attracted more attention [3] and many clustering algorithms have been proposed.

Conventional clustering algorithms partition a set of objects into clusters with clear boundaries; in other words, each object must belong to exactly one cluster. k-means (KM) [4], also called hard c-means (HCM), is a representative example.

However, the boundaries may not be clear in practice, and quite a few objects should belong to more than one cluster. Fuzzy set representation of clusters makes it possible for each object to belong to more than one cluster, with the degree of belongingness of an object to each cluster represented as a value in the unit interval [0, 1]. Fuzzy c-means (FCM) [5, 6] achieves this representation by introducing a fuzzification parameter into KM.

On the other hand, it has been pointed out that fuzzy degrees are sometimes too descriptive for interpreting clustering results [1]. In such cases, rough set representation is a useful and powerful tool [7, 8]. The basic concept of the representation rests on two definitions, the lower and upper approximations of a set: the lower approximation contains objects that surely belong to the set, and the upper approximation contains objects that possibly belong to the set. Clustering based on rough sets can provide solutions that are less restrictive than conventional clustering and less descriptive than fuzzy clustering [1, 9]; therefore, rough set based clustering has attracted increasing interest among researchers [1, 2, 10,11,12,13,14].

Rough k-means (RKM), proposed by Lingras et al. [1, 2], is one of the earliest rough set based clustering algorithms. In RKM, the degrees of belongingness and the cluster centers are calculated through an iterative process, as in KM or FCM. However, RKM has the problem that the algorithm is not constructed from the optimization of an objective function. Here, we call clustering based on the optimization of an objective function "objective-based clustering"; in other words, the outputs of objective-based clustering minimize the objective function.

Many non-hierarchical clustering algorithms such as KM and FCM are objective-based. The outputs of such algorithms strongly depend on the initial values, so we need some indicator when choosing the "better" output among the many outputs obtained from different initial values. The objective function plays a very important role as this indicator: we can choose the "better" output by comparing the objective function values of the outputs with each other.

RKM is one of the most representative algorithms inspired by KM, and some of its assumptions are very natural; however, because it is not based on any objective function, we have no indicator for choosing "better" outputs. Some rough set based clustering algorithms that are based on an objective function have been proposed [12]; however, they can be complicated, and it is not easy to extend their theoretical discussion.

We have proposed several objective-based rough clustering methods. This paper shows the objective functions and algorithms of these methods: type-I rough c-means, type-II rough c-means, and the rough non metric model. For each method, we show both a rough hard clustering version and a rough fuzzy one.

2 Rough Sets

2.1 Concept of Rough Sets

Let U be the universe and \(R \subseteq U \times U\) be an equivalence relation on U. R is also called indiscernibility relation. The pair \(X=(U,R)\) is called an approximation space. If \(x, y \in U\) and \((x,y) \in R\), we say that x and y are indistinguishable in X.

Equivalence classes of the relation R are called elementary sets in X. The set of all elementary sets is denoted by U / R. The empty set is also elementary in every X.

Every finite union of elementary sets in X is called a composed set in X.

Since it is impossible to distinguish the elements in the same equivalence class, we may not be able to get a precise representation for an arbitrary subset \(A \subset U\). Instead, any A can be represented by its lower and upper bounds. The upper bound \(\overline{A}\) is the least composed set in X that contains A, called the best upper approximation or, in short, the upper approximation. The lower bound \(\underline{A}\) is the greatest composed set in X that is included in A, called the best lower approximation or, briefly, the lower approximation. The set \(\mathrm{Bnd}(A) = \overline{A} - \underline{A}\) is called the boundary of A in X.

The pair \((\underline{A},\overline{A})\) is the representation of an ordinary set A in the approximation space X, or simply the rough set of A. The elements in the lower approximation of A definitely belong to A, while elements in the upper bound of A may or may not belong to A.
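To make the definitions concrete, the following is a minimal sketch (our illustration, not part of the original formulation; all names are ours) that computes the lower approximation, upper approximation, and boundary of a set from a given family of elementary sets:

```python
# A small illustration of rough approximations (our example).
# `partition` is U/R given as a list of elementary sets (equivalence classes).
def approximations(partition, A):
    A = set(A)
    lower, upper = set(), set()
    for cls in partition:
        cls = set(cls)
        if cls <= A:      # class entirely inside A -> part of the lower bound
            lower |= cls
        if cls & A:       # class intersecting A -> part of the upper bound
            upper |= cls
    return lower, upper, upper - lower   # (lower, upper, boundary)

# U = {1,...,6}, U/R = {{1,2},{3,4},{5,6}}, A = {1,2,3}:
low, up, bnd = approximations([{1, 2}, {3, 4}, {5, 6}], {1, 2, 3})
print(low, up, bnd)   # {1, 2} {1, 2, 3, 4} {3, 4}
```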

2.2 Conditions of Rough Clustering

Let a set of objects and a set of equivalence classes given by an equivalence relation R be \(U=\{x_k \mid x_k=(x_{k1},\dots ,x_{kp})^T \in \mathfrak {R}^p, \ k=1,\dots ,n\}\) and \(U/R=\{A_i \mid i=1,\dots ,c\}\), respectively. \(v_i=(v_{i1},\dots ,v_{ip})^T \in \mathfrak {R}^p\) \((i=1,\dots ,c)\) denotes the cluster center of a cluster \(A_i\). We note that \(A_i \ne \emptyset \) for any i; that is, \(\underline{A}_i = \emptyset \) implies \(\mathrm{Bnd}(A_i) \ne \emptyset \), and similarly, \(\mathrm{Bnd}(A_i) = \emptyset \) implies \(\underline{A}_i \ne \emptyset \).

Lingras et al., who proposed rough k-means (RKM) [1, 2], imposed the following conditions. These conditions are very natural from the viewpoint of the definition of rough sets.

(C1):

An object x can be part of at most one lower bound.

(C2):

If \(x \in \underline{A}_i\), then \(x \in \overline{A}_i\).

(C3):

An object x is not part of any lower bound if and only if x belongs to two or more upper bounds.

Note that the above conditions are not necessarily independent or complete.

3 Type-I Rough c-Means

3.1 Type-I Rough Hard c-Means

In this section, we describe type-I rough c-means (RCM-I). To distinguish it from the rough fuzzy c-means described later, we also write it as type-I rough hard c-means (RHCM-I).

3.1.1 Objective Function

For objects \(x_k=(x_{k1},\dots ,x_{kp})^T\in \mathfrak {R}^p\) \((k=1,\dots ,n)\), \(\nu _{ki}\) and \(u_{ki}\) \((i=1,\dots ,c)\) denote the belongingness of an object \(x_k\) to the lower approximation of \(A_i\) and to the boundary of \(A_i\), respectively. The partition matrices of \(\nu _{ki}\) and \(u_{ki}\) are denoted by \(N=\{\nu _{ki}\}\) and \({U}=\{u_{ki}\}\), respectively. We define the objective function of RCM-I as follows:

$$\begin{aligned} J_\text {RCM-I}(N,{U},V)&= \sum _{k=1}^{n} \sum _{l=1}^{n} \sum _{i=1}^{c} \Big ( \nu _{ki}u_{li}(\underline{w}d_{ki} + \overline{w}d_{li}) +(\nu _{ki}\nu _{li}+u_{ki}u_{li})D_{kl} \Big ), \end{aligned}$$
(1)

where

$$\begin{aligned}\begin{gathered} d_{ki} = \Vert x_k - v_i \Vert ^2, \quad D_{kl} = \Vert x_k - x_l\Vert ^2. \end{gathered}\end{aligned}$$

For any k, constraints are as follows:

$$\begin{aligned}\begin{gathered} \underline{w}+\overline{w}=1, \\ \nu _{ki} \in \{0,1\} , \quad u_{ki} \in \{0,1\}, \\ \sum _{i=1}^c \nu _{ki} \in \{0,1\}, \quad \sum _{i=1}^c u_{ki} \ne 1, \\ \sum _{i=1}^c \nu _{ki} = 1 \Longleftrightarrow \sum _{i=1}^c u_{ki} = 0. \end{gathered}\end{aligned}$$

The last term of (1) is a regularization term. Without it, the trivial solution \(\nu _{ki}=0\) and \(u_{ki}=0\) would result. From the above constraints, we can derive the following relation for any k:

$$\begin{aligned} \sum _{i=1}^c \nu _{ki} = 0 \Longleftrightarrow \sum _{i=1}^c u_{ki} \ge 2 \end{aligned}$$

It is obvious that these relations are equivalent to (C1)–(C3) in Sect. 2.2.

3.1.2 Derivation of Optimal Solutions and Algorithm

We first derive the optimal solution for \(v_i\) with \(\nu _{ki}\) and \(u_{ki}\) fixed. Here, we introduce the following function:

$$\begin{aligned} J^i_\text {RCM-I}(V)&= \sum _{k=1}^{n} \sum _{l=1}^{n} \Big ( \nu _{ki}u_{li}(\underline{w}d_{ki} + \overline{w}d_{li})\Big ). \end{aligned}$$
(2)

Since \(\nu _{ki}\) and \(u_{ki}\) are fixed, the \(v_i\) that minimizes \(J^i_\text {RCM-I}\) is an optimal solution that also minimizes \(J_\text {RCM-I}\). We have to consider the following two cases:

  1. \(\underline{A}_i \ne \emptyset \) and \(\mathrm{Bnd}(A_i) \ne \emptyset \), that is, \(|\underline{A}_i| \cdot |\mathrm{Bnd}(A_i)| \ne 0\).

  2. \(\underline{A}_i = \emptyset \) or \(\mathrm{Bnd}(A_i) = \emptyset \), that is, \(\nu _{ki} = 0\) or \(u_{ki} = 0\) for any k.

If \(\underline{A}_i \ne \emptyset \) and \(\mathrm{Bnd}(A_i) \ne \emptyset \), from partially differentiating (2) by \(v_i\),

$$\begin{aligned} \frac{\partial J^i_\text {RCM-I}}{\partial v_i} = -2 \sum _{k=1}^{n} \sum _{l=1}^{n} \nu _{ki} u_{li} \left( \underline{w} (x_k-v_i)+\overline{w} (x_l-v_i)\right) . \nonumber \end{aligned}$$

From \(\frac{\partial J^i_\text {RCM-I}}{\partial v_i} = 0\) and \(\underline{w}+\overline{w}=1\),

$$\begin{aligned} \sum _{k=1}^{n} \sum _{l=1}^{n} \nu _{ki} u_{li} v_i&= \sum _{k=1}^{n} \sum _{l=1}^{n} \nu _{ki} u_{li} (\underline{w}x_k + \overline{w}x_l), \nonumber \end{aligned}$$

then, we get

$$\begin{aligned} \sum _{k=1}^{n} \nu _{ki} \sum _{l=1}^{n} u_{li} v_i = \underline{w} \sum _{l=1}^{n} u_{li} \sum _{k=1}^{n} \nu _{ki} x_k + \overline{w}\sum _{k=1}^{n} \nu _{ki} \sum _{l=1}^{n} u_{li} x_l. \end{aligned}$$
(3)

We here notice the following relations:

$$\begin{aligned}\begin{gathered} |\underline{A}_i| = \sum _{k=1}^{n} \nu _{ki}, \quad |\mathrm{Bnd}(A_i)| = \sum _{l=1}^{n} u_{li}. \end{gathered}\end{aligned}$$

Then, (3) can be rewritten as follows:

$$\begin{aligned} |\underline{A}_i| \cdot |\mathrm{Bnd}(A_i)| v_i&=\underline{w} |\mathrm{Bnd}(A_i)| \sum _{x_k\in \underline{A}_i}x_k +\overline{w} |\underline{A}_i|\sum _{x_k\in \mathrm{Bnd}(A_i)}x_k. \end{aligned}$$

Since \(|\underline{A}_i| \cdot |\mathrm{Bnd}(A_i)|\ne 0\),

$$\begin{aligned} v_i&=\underline{w} \displaystyle {\frac{\sum _{x_k\in \underline{A}_i}x_k}{|\underline{A}_i|}} + \overline{w} \displaystyle {\frac{\sum _{x_k\in \mathrm{Bnd}(A_i)}x_k}{|\mathrm{Bnd}(A_i)|}}. \end{aligned}$$

On the other hand, if \(\underline{A}_i = \emptyset \) or \(\mathrm{Bnd}(A_i) = \emptyset \), then \(\nu _{ki} = 0\) or \(u_{ki} = 0\) for any k. In both cases, \(J^i_\text {RCM-I}\) attains its minimum value 0 regardless of \(v_i\). Therefore, we can determine \(v_i\) as follows:

$$\begin{aligned} v_i = \frac{\sum _{x \in \overline{A}_i} x}{|\overline{A}_i|}. \end{aligned}$$

From the above discussion, the optimal solution for \(v_i\) is given by (4).

$$\begin{aligned} v_i ={\left\{ \begin{array}{ll} \displaystyle \underline{w} \frac{\sum _{x \in \underline{A}_i}x}{|\underline{A}_i|} + \overline{w} \frac{\sum _{x \in \mathrm{Bnd}(A_i)} x}{|\mathrm{Bnd}(A_i)|}, \quad (\underline{A}_i \ne \emptyset \wedge \mathrm{Bnd}(A_i) \ne \emptyset ) \\ \displaystyle \frac{\sum _{x \in \overline{A}_i} x}{|\overline{A}_i|}. \quad (\text {otherwise}) \end{array}\right. } \end{aligned}$$
(4)

Optimal solutions to \(\nu _{ki}\) and \(u_{ki}\) can be obtained by comparing the following two cases:

  1. \(x_k\) belongs to the lower approximation \(\underline{A}_{p_k}\) whose cluster center \(v_{p_k}\) is nearest to \(x_k\). Here,

    $$\begin{aligned} p_k=\arg \min _i d_{ki}. \end{aligned}$$

    In this case, the value of the term for \(x_k\) of the objective function can be calculated as follows:

    $$\begin{aligned} J^\nu _k&=\sum _{l=1,l\ne k}^n\Big (\nu _{kp_k}u_{lp_k}(\underline{w} d_{kp_k} + \overline{w}d_{lp_k}) +(\nu _{kp_k}\nu _{lp_k} + u_{kp_k}u_{lp_k}) D_{kl} \Big ) \nonumber \\&=\sum _{l=1,l\ne k}^n \Big ( \nu _{kp_k} u_{lp_k}(\underline{w}d_{kp_k} + \overline{w}d_{lp_k}) + \nu _{kp_k} \nu _{lp_k}D_{kl} \Big ). \nonumber \end{aligned}$$
  2. \(x_k\) belongs to the upper approximations of the two clusters \(\overline{A}_{p_k}\) and \(\overline{A}_{q_k}\) whose cluster centers \(v_{p_k}\) and \(v_{q_k}\) are the first and second nearest to \(x_k\). Here,

    $$\begin{aligned} q_k=\arg \min _{i\ne p_k}d_{ki}. \end{aligned}$$

    In this case, the value of the terms for \(x_k\) of the objective function can be calculated as follows:

    $$\begin{aligned} J^u_k&= \sum _{l=1,l \ne k}^n \sum _{i=p_k,q_k} \Big (\nu _{li}u_{ki}(\underline{w}d_{li} + \overline{w}d_{ki}) + (\nu _{ki} \nu _{li} +u_{ki} u_{li}) D_{kl} \Big )\nonumber \\&= \sum _{l=1,l \ne k}^n \sum _{i=p_k,q_k} \Big ( \nu _{li} u_{ki} (\underline{w}d_{li} + \overline{w}d_{ki}) + u_{ki} u_{li} D_{kl} \Big ). \nonumber \end{aligned}$$

Comparing \(J^\nu _k\) with \(J^u_k\), we determine \(\nu _{ki}\) and \(u_{ki}\) as follows:

$$\begin{aligned} \nu _{ki}&= {\left\{ \begin{array}{ll} 1 ,&{} (J^\nu _k < J^u_k \wedge i = p_k) \\ 0. &{} (\text {otherwise}) \end{array}\right. } \\ u_{ki}&= {\left\{ \begin{array}{ll} 1, &{} \Big (J^\nu _k \ge J^u_k \wedge (i = p_k \vee i = q_k)\Big ) \\ 0. &{} (\text {otherwise}) \end{array}\right. } \end{aligned}$$

Here, we construct the RCM-I algorithm using the optimal solutions to N, V, and U derived above. In practice, the optimal solutions are calculated through iterative optimization. We show the RCM-I algorithm as Algorithm 1.

[Algorithm 1: RCM-I]
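As a concrete illustration of Algorithm 1, the following is a minimal sketch of the RCM-I iteration, assuming squared Euclidean dissimilarities; the function and variable names, the initialization, and the convergence test are ours, not part of the original algorithm:

```python
import numpy as np

def rcm1(X, c, w_lower=0.5, max_iter=100, seed=0):
    """A minimal sketch of RCM-I (Algorithm 1); all names are ours."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    w_upper = 1.0 - w_lower
    V = X[rng.choice(n, c, replace=False)].astype(float)     # initial centers
    D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)       # D_kl
    d0 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(-1)
    nu = np.zeros((n, c)); u = np.zeros((n, c))
    nu[np.arange(n), d0.argmin(1)] = 1                       # crisp initial lower sets
    for _ in range(max_iter):
        d = ((X[:, None, :] - V[None, :, :]) ** 2).sum(-1)   # d_ki
        nu_new = np.zeros((n, c)); u_new = np.zeros((n, c))
        for k in range(n):
            pk, qk = np.argsort(d[k])[:2]
            mask = np.arange(n) != k
            # J^nu_k: cost if x_k joins the lower approximation of cluster pk
            J_nu = (u[mask, pk] * (w_lower * d[k, pk] + w_upper * d[mask, pk])
                    + nu[mask, pk] * D[k, mask]).sum()
            # J^u_k: cost if x_k joins the boundaries of clusters pk and qk
            J_u = sum((nu[mask, i] * (w_lower * d[mask, i] + w_upper * d[k, i])
                       + u[mask, i] * D[k, mask]).sum() for i in (pk, qk))
            if J_nu < J_u:
                nu_new[k, pk] = 1
            else:
                u_new[k, pk] = u_new[k, qk] = 1
        for i in range(c):                                   # center update (4)
            low, bnd = nu_new[:, i] == 1, u_new[:, i] == 1
            if low.any() and bnd.any():
                V[i] = w_lower * X[low].mean(0) + w_upper * X[bnd].mean(0)
            elif low.any() or bnd.any():
                V[i] = X[low | bnd].mean(0)                  # mean of upper approx.
        if np.array_equal(nu, nu_new) and np.array_equal(u, u_new):
            break
        nu, u = nu_new, u_new
    return nu, u, V
```

This batch version updates all memberships using the values from the previous sweep at once.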

We can also consider a sequential RCM-I (SRCM-I), in which the cluster centers are recalculated every time the cluster partition changes, shown as Algorithm 2.

[Algorithm 2: SRCM-I]

3.2 Type-I Rough Fuzzy c-Means

We have two ways to fuzzify RCM-I by introducing fuzzy-set representation.

The first is to introduce the fuzzification parameter m, and the second is to introduce an entropy term. Both ways are known to be very useful. We call the method using the first way type-I rough fuzzy c-means (RFCM-I) and that using the second way entropy-regularized type-I rough fuzzy c-means (ERFCM-I). In this paper, we describe RFCM-I.

In RFCM-I, only the degrees of belongingness to \(\mathrm{Bnd}(A_i)\) are fuzzified.

3.2.1 Objective Function

The objective function of RFCM-I is defined as follows:

$$\begin{aligned} J_{\text {RFCM-I}}(N,U,V)&=\sum ^n_{k=1}\sum ^n_{l=1}\sum ^c_{i=1}\left( u^m_{ki}\nu _{li}(\underline{\omega }d_{li}+\overline{\omega }d_{ki}) +(\nu _{ki}\nu _{li}+u^m_{ki}u^m_{li}) D_{kl}\right) . \end{aligned}$$
(5)

Constraints are as follows:

$$\begin{aligned}\begin{gathered} \underline{\omega }+\overline{\omega }=1, \\ \nu _{ki}\in \{0,1\}, \quad u_{ki}\in [0,1], \quad \forall k,i \\ \sum _{i=1}^c (\nu _{ki}+u_{ki}) =1. \quad \forall k \end{gathered}\end{aligned}$$

3.2.2 Derivation of Optimal Solutions and Algorithm

To get the optimal solution for \(v_i\), we partially differentiate (5) with respect to \(v_i\), obtaining

$$\begin{aligned} v_i&= {\left\{ \begin{array}{ll} \displaystyle \frac{\sum _{x_k\in \underline{A}_i}x_k}{|\underline{A}_i|}, &{}(\mathrm{Bnd}(A_i)=\emptyset )\\ \displaystyle \frac{\sum ^n_{k=1}u^m_{ki}x_k}{\sum ^n_{k=1}u^m_{ki}}, &{}(\underline{A}_i=\emptyset )\\ \underline{\omega }\times \displaystyle \frac{\sum _{x_k\in \underline{A}_i}x_k}{|\underline{A}_i|}+\overline{\omega }\times \displaystyle \frac{\sum ^n_{k=1}u^m_{ki}x_k}{\sum ^n_{k=1}u^m_{ki}}. &{} (\text {otherwise}) \end{array}\right. } \end{aligned}$$
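For instance, the three cases of this center update can be written as follows (a sketch under our own naming; the case where both parts are empty is left out, since the derivation assumes \(A_i \ne \emptyset \)):

```python
import numpy as np

def rfcm1_centers(X, nu, u, m, w_lower, w_upper):
    """Center update for RFCM-I following the three cases above (a sketch).
    X: (n, p) data, nu: (n, c) crisp lower memberships, u: (n, c) fuzzy
    boundary memberships; all names are our own."""
    c = nu.shape[1]
    V = np.zeros((c, X.shape[1]))
    for i in range(c):
        in_lower = nu[:, i] == 1
        um = u[:, i] ** m
        if um.sum() == 0:                      # Bnd(A_i) empty
            V[i] = X[in_lower].mean(axis=0)
        elif not in_lower.any():               # lower approximation empty
            V[i] = (um[:, None] * X).sum(axis=0) / um.sum()
        else:                                  # both nonempty: weighted mix
            V[i] = (w_lower * X[in_lower].mean(axis=0)
                    + w_upper * (um[:, None] * X).sum(axis=0) / um.sum())
    return V
```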

We must consider the following two cases to derive optimal solutions to N and U:

  1. \(x_k\) belongs to \(\underline{A}_{p_k}\).

  2. \(x_k\) belongs to \(\mathrm{Bnd}(A_i)\) for all i.

If \(x_k\) belongs to \(\underline{A}_{p_k}\), optimal solutions and the objective function are represented as follows:

$$\begin{aligned}&\nu _{ki} = 1, \quad u_{ki} = 0, \nonumber \\ \underline{J}^k_{\text {RFCM-I}}=\sum ^n_{l=1,l \ne k}&(u^m_{lp_k}(\underline{\omega }d_{kp_k}+\overline{\omega }d_{lp_k})+2\nu _{lp_k}D_{kl}). \end{aligned}$$
(6)

If \(x_k\) belongs to \(\mathrm{Bnd}(A_i)\), optimal solutions and the objective function are represented as follows:

$$\begin{aligned}&\nu _{ki} = 0, \quad u_{ki} = \displaystyle \frac{\left( \frac{1}{\alpha _i}\right) ^{\frac{1}{m-1}}}{\sum ^c_{j=1}\left( \frac{1}{\alpha _j}\right) ^{\frac{1}{m-1}}}, \nonumber \\ \overline{J}^k_{\text {RFCM-I}}=&\sum _{i=1}^c \sum ^n_{l=1, l \ne k} (u^m_{ki}\nu _{li}(\underline{\omega }d_{li}+\overline{\omega }d_{ki})+2u^m_{ki}u^m_{li}D_{kl}). \end{aligned}$$
(7)

Here,

$$\begin{aligned} \alpha _i = \sum ^n_{l=1,l \ne k}(\nu _{li}(\underline{\omega }d_{li}+\overline{\omega }d_{ki})+2u^m_{li}D_{kl}). \end{aligned}$$

We calculate the optimal solution to \(u_{ki}\) by using the Lagrange multiplier method. From (5), the Lagrange function of RFCM-I is defined as follows:

$$\begin{aligned}&L_{\text {RFCM-I}} = \sum ^n_{k=1}\sum ^n_{l=1}\sum ^c_{i=1}\left( u^m_{ki}\nu _{li}(\underline{\omega }d_{li}+\overline{\omega }d_{ki}) \right. \\&\left. \quad +(\nu _{ki}\nu _{li}+u^m_{ki}u^m_{li}) D_{kl}\right) - \sum _{k=1}^n \lambda _k \left( \sum _{i=1}^c u_{ki} -1 \right) . \end{aligned}$$

Comparing (6) and (7), the optimal solutions to N and U are as follows:

$$\begin{aligned} \nu _{ki}&= {\left\{ \begin{array}{ll} 1, &{} (\underline{J}^k_{\text {RFCM-I}}<\overline{J}^k_{\text {RFCM-I}}\wedge i=p_k) \\ 0, &{} (\text {otherwise})\\ \end{array}\right. }\\ u_{ki}&= {\left\{ \begin{array}{ll} \displaystyle \frac{\left( \frac{1}{\alpha _i}\right) ^{\frac{1}{m-1}}}{\sum ^c_{j=1}\left( \frac{1}{\alpha _j}\right) ^{\frac{1}{m-1}}}, &{} (\underline{J}^k_{\text {RFCM-I}} \ge \overline{J}^k_{\text {RFCM-I}}) \\ 0. &{} (\text {otherwise}) \end{array}\right. } \end{aligned}$$

Finally, we describe the RFCM-I algorithm.

[Algorithm 3: RFCM-I]
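As an illustration of the boundary-membership update inside Algorithm 3, the following sketch computes \(\alpha _i\) and the resulting \(u_{ki}\) for a single object \(x_k\) (the names are ours; d and D are the matrices of \(d_{ki}\) and \(D_{kl}\)):

```python
import numpy as np

def rfcm1_boundary_memberships(k, nu, u, d, D, m, w_lower, w_upper):
    """Sketch of u_ki = (1/alpha_i)^{1/(m-1)} / sum_j (1/alpha_j)^{1/(m-1)}
    for object x_k; all names are our own."""
    n, c = u.shape
    mask = np.arange(n) != k
    alpha = np.array([
        (nu[mask, i] * (w_lower * d[mask, i] + w_upper * d[k, i])
         + 2 * (u[mask, i] ** m) * D[k, mask]).sum()
        for i in range(c)
    ])
    inv = (1.0 / alpha) ** (1.0 / (m - 1))
    return inv / inv.sum()
```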

4 Type-II Rough c-Means

We propose another method, type-II rough c-means (RCM-II), also written type-II rough hard c-means (RHCM-II), to solve the problems of Lingras's method. The objective function of RCM-II is simpler than that of RCM-I.

4.1 Type-II Rough Hard c-Means

4.1.1 Objective Function

Let \(N=(\nu _{ki})_{1\le k \le n , \ 1\le i \le c}\) and \(U=(u_{ki})_{1\le k \le n, \ 1\le i \le c}\) be degrees of belongingness of \(x_k\) to \(\underline{A}_i\) and \(\mathrm{Bnd}(A_i)\). Let V be a set of cluster centers. The objective function of RCM-II is defined as follows:

$$\begin{aligned} J_{\text {RCM-II}}(N,U,V) = \sum _{k=1}^n \sum _{i=1}^c (\nu _{ki} \underline{w} + u_{ki} \overline{w}) \Vert x_k -v_i \Vert ^2. \end{aligned}$$
(8)
$$\begin{aligned}\begin{gathered} \underline{w} + \overline{w}=1 , \quad \underline{w}>0, \quad \overline{w}>0. \end{gathered}\end{aligned}$$

Constraints are as follows:

$$\begin{aligned}\begin{gathered} \nu _{ki}, u_{ki} \in \{0,1\} , \quad \forall k,i \\ \sum _{i=1}^c \nu _{ki} \in \{0,1\}, \quad \sum _{i=1}^c u_{ki} \ne 1, \quad \forall k \\ \sum _{i=1}^c \nu _{ki} = 1 \iff \sum _{i=1}^c u_{ki} = 0. \quad \forall k \\ \end{gathered}\end{aligned}$$

From these constraints, the following restriction holds true:

$$\begin{aligned} \sum _{i=1}^c \nu _{ki} = 0 \iff \sum _{i=1}^c u_{ki} > 1. \quad \forall k \end{aligned}$$

These constraints are clearly equivalent to (C1)–(C3). \(J_{\text {RCM-II}}\) is minimized under these constraints.

4.1.2 Derivation of Optimal Solutions and Algorithm

Partially differentiating (8) with respect to \(v_i\) and setting the derivative to zero, we get

$$\begin{aligned} v_i = \frac{\sum _{k=1}^n (\underline{w} \nu _{ki}+ \overline{w} u_{ki})x_k}{\sum _{k=1}^n (\underline{w} \nu _{ki}+ \overline{w} u_{ki})}. \end{aligned}$$
(9)

We must consider the following two cases to derive optimal solutions to N and U:

  1. \(x_k\) belongs to \(\underline{A}_{p_k}\).

  2. \(x_k\) belongs to \(\mathrm{Bnd}(A_{p_k})\) and \(\mathrm{Bnd}(A_{q_k})\).

Here,

$$\begin{aligned} p_k = \arg \min \limits _i \Vert x_k -v_i \Vert ^2, \\ q_k = \arg \min \limits _{i \ne p_k}\Vert x_k - v_i \Vert ^2. \end{aligned}$$

If \(x_k\) belongs to \(\underline{A}_{p_k}\), we get the value of the objective function as follows:

$$\begin{aligned} \underline{J}_{\text {RCM-II}}^k = \underline{w} \Vert x_k - v_{p_k}\Vert ^2. \end{aligned}$$
(10)

If \(x_k\) belongs to \(\mathrm{Bnd}(A_{p_k})\) and \(\mathrm{Bnd}(A_{q_k})\), we get the value of the objective function as follows:

$$\begin{aligned} \overline{J}_{\text {RCM-II}}^k = \sum _{i=p_k,q_k}\overline{w} \Vert x_k - v_{i} \Vert ^2. \end{aligned}$$
(11)

Comparing (10) and (11), we derive the optimal solutions to N and U as follows:

$$\begin{aligned} \nu _{ki}&= {\left\{ \begin{array}{ll} 1, &{} (\underline{J}_{\text {RCM-II}}^k < \overline{J}_{\text {RCM-II}}^k \wedge i=p_k) \\ 0. &{} ({\text {otherwise}}) \end{array}\right. }\\ u_{ki}&= {\left\{ \begin{array}{ll} 1, &{} (\underline{J}_{\text {RCM-II}}^k \ge \overline{J}_{\text {RCM-II}}^k \wedge (i=p_k \vee i=q_k))\\ 0. &{} ({\text {otherwise}}) \end{array}\right. } \end{aligned}$$

We describe the RCM-II algorithm as follows:

[Algorithm 4: RCM-II]
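Since the RCM-II objective contains no pairwise term, its iteration is much simpler than that of RCM-I; the following is a minimal sketch (our names and initialization, squared Euclidean distances assumed):

```python
import numpy as np

def rcm2(X, c, w_lower=0.7, max_iter=100, seed=0):
    """A minimal sketch of RCM-II (Algorithm 4); all names are ours."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    w_upper = 1.0 - w_lower
    V = X[rng.choice(n, c, replace=False)].astype(float)
    for _ in range(max_iter):
        d = ((X[:, None, :] - V[None, :, :]) ** 2).sum(-1)
        nu = np.zeros((n, c)); u = np.zeros((n, c))
        for k in range(n):
            pk, qk = np.argsort(d[k])[:2]
            J_low = w_lower * d[k, pk]                  # cf. (10)
            J_bnd = w_upper * (d[k, pk] + d[k, qk])     # cf. (11)
            if J_low < J_bnd:
                nu[k, pk] = 1
            else:
                u[k, pk] = u[k, qk] = 1
        weights = w_lower * nu + w_upper * u            # cf. (9)
        # assumes every cluster receives some membership
        V_new = (weights.T @ X) / weights.sum(0)[:, None]
        if np.allclose(V_new, V):
            break
        V = V_new
    return nu, u, V
```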

4.2 Type-II Rough Fuzzy c-Means

4.2.1 Objective Function

Here, we propose type-II rough fuzzy c-means (RFCM-II), an extension of RCM-II using fuzzy set representation. In RFCM-II, only the degrees of belongingness to \(\mathrm{Bnd}(A_i)\) are fuzzified. The objective function of RFCM-II is defined as follows:

$$\begin{aligned} J_{\text {RFCM-II}}(N,U,V) = \sum _{k=1}^n \sum _{i=1}^c (\underline{w} \nu _{ki}+ \overline{w} u_{ki}^m)\Vert x_k -v_i\Vert ^2. \end{aligned}$$
(12)
$$\begin{aligned}\begin{gathered} \underline{w} + \overline{w}=1 , \quad \underline{w}>0, \quad \overline{w}>0. \end{gathered}\end{aligned}$$

Constraints are as follows:

$$\begin{aligned}\begin{gathered} \nu _{ki}, u_{ki} \ge 0 , \quad \forall k,i \\ \sum _{i=1}^c (\nu _{ki}+ u_{ki}) = 1, \quad \forall k \end{gathered}\end{aligned}$$

\(J_{\text {RFCM-II}}\) is minimized under these constraints.

4.2.2 Derivation of Optimal Solutions and Algorithm

First, we derive the optimal solution for the cluster center. As in Sect. 4.1.2, we get

$$\begin{aligned} v_i = \frac{\sum _{k=1}^n (\underline{w} \nu _{ki}+ \overline{w} u_{ki}^m)x_k}{\sum _{k=1}^n (\underline{w} \nu _{ki} + \overline{w}u_{ki}^m)} . \end{aligned}$$
(13)

Next, we derive the optimal solutions for the lower approximations and boundaries. As in Sect. 4.1.2, we must consider the following two cases:

  1. \(x_k\) belongs to \(\underline{A}_{p_k}\).

  2. \(x_k\) belongs to \(\mathrm{Bnd}(A_{i})\) for all i.

If \(x_k\) belongs to \(\underline{A}_{p_k}\), we get the value of the objective function as follows:

$$\begin{aligned} \underline{J}_{\text {RFCM-II}}^k = \underline{w} \Vert x_k - v_{p_k}\Vert ^2. \end{aligned}$$
(14)

If \(x_k\) belongs to \(\mathrm{Bnd}(A_i)\), we get the value of the objective function as follows:

$$\begin{aligned} \overline{J}_{\text {RFCM-II}}^k = \sum _{i=1}^c \overline{w} u_{ki}^m \Vert x_k -v_i\Vert ^2. \end{aligned}$$
(15)

Comparing (14) and (15), we derive the optimal solutions to N and U as follows:

$$\begin{aligned} \nu _{ki}&= {\left\{ \begin{array}{ll} 1, &{} (\underline{J}_{\text {RFCM-II}}^k < \overline{J}_{\text {RFCM-II}}^k \wedge i=p_k) \\ 0. &{} ({\text {otherwise}}) \end{array}\right. }\\ u_{ki}&= {\left\{ \begin{array}{ll} \displaystyle \frac{\left( \frac{1}{\Vert x_k -v_i \Vert ^2}\right) ^{\frac{1}{m-1}}}{\sum _{j=1}^c \left( \frac{1}{\Vert x_k - v_j\Vert ^2}\right) ^{\frac{1}{m-1}}}, &{} (\underline{J}_{\text {RFCM-II}}^k \ge \overline{J}_{\text {RFCM-II}}^k)\\ 0. &{} ({\text {otherwise}}) \end{array}\right. } \end{aligned}$$

The optimal solution for \(u_{ki}\) above is obtained by the Lagrange multiplier method. The Lagrange function of RFCM-II is defined as follows:

$$\begin{aligned} L_{\text {RFCM-II}}&= \sum _{k=1}^n \sum _{i=1}^c \left( \underline{w} \nu _{ki}+ \overline{w} u_{ki}^m\right) \Vert x_k -v_i\Vert ^2 - \sum _{k=1}^n \lambda _k \left( \sum _{i=1}^c u_{ki} -1\right) . \end{aligned}$$

From the above discussion, we describe the RFCM-II algorithm.

[Algorithm 5: RFCM-II]
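A single RFCM-II step, combining the assignment rule (14)–(15) with the center update (13), could look as follows (a sketch with our own names; it assumes all distances are positive so that the fuzzy formula is well defined):

```python
import numpy as np

def rfcm2_update(X, V, m, w_lower):
    """One RFCM-II assignment-and-center step (a sketch; names are ours)."""
    w_upper = 1.0 - w_lower
    n, c = X.shape[0], V.shape[0]
    d = ((X[:, None, :] - V[None, :, :]) ** 2).sum(-1)
    inv = (1.0 / d) ** (1.0 / (m - 1))              # assumes d > 0
    u_fuzzy = inv / inv.sum(1, keepdims=True)       # fuzzy boundary memberships
    nu = np.zeros((n, c)); u = np.zeros((n, c))
    pk = d.argmin(1)
    J_low = w_lower * d[np.arange(n), pk]           # cf. (14)
    J_bnd = w_upper * ((u_fuzzy ** m) * d).sum(1)   # cf. (15)
    lower = J_low < J_bnd
    nu[lower, pk[lower]] = 1
    u[~lower] = u_fuzzy[~lower]
    weights = w_lower * nu + w_upper * u ** m       # cf. (13)
    # assumes every cluster receives some membership
    V_new = (weights.T @ X) / weights.sum(0)[:, None]
    return nu, u, V_new
```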

5 Rough Non Metric Model

5.1 Rough Hard Non Metric Model

5.1.1 Objective Function

To construct a new relational clustering algorithm based on rough sets, the rough non metric model (RNM), also written rough hard non metric model (RHNM), we define the following objective function based on the non metric model by Roubens [15]:

$$\begin{aligned} J_\text {RNM}(N,U) = \underline{w} \sum _{i=1}^c \sum _{k=1}^n \sum _{t=1}^n \nu _{ki} \nu _{ti} D_{kt} + \overline{w} \sum _{i=1}^c \sum _{k=1}^n \sum _{t=1}^n u_{ki} u_{ti} D_{kt}. \end{aligned}$$
(16)

Here \(\underline{w}+\overline{w}=1\) and \(\underline{w} \in (0,1)\). If \(\underline{w}\) is close to 0, almost all objects belong to lower approximations; if \(\underline{w}\) is close to 1, almost all objects belong to upper approximations. \(\underline{w}\) (or \(\overline{w}\)) therefore controls the belongingness and plays a very important role in our proposed methods. \(D_{kt}\) denotes the dissimilarity between \(x_k\) and \(x_t\). One example is the squared Euclidean norm:

$$\begin{aligned} D_{kt} = \Vert x_k - x_t\Vert ^2. \end{aligned}$$

We consider the following conditions for \(\nu _{ki}\) and \(u_{ki}\):

$$\begin{aligned} \nu _{ki} \in \{0,1\} , \qquad u_{ki} \in \{0,1\}. \end{aligned}$$

From (C1)–(C3) in Sect. 2.2, we derive the following constraints:

$$\begin{aligned}\begin{gathered} \sum _{i=1}^c \nu _{ki} \in \{0,1\}, \qquad \sum _{i=1}^c u_{ki} \ne 1, \\ \sum _{i=1}^c \nu _{ki} = 1 \Longleftrightarrow \sum _{i=1}^c u_{ki} = 0. \end{gathered}\end{aligned}$$

From the above constraints, we derive the following relation for any k:

$$\begin{aligned} \sum _{i=1}^c \nu _{ki} = 0 \Longleftrightarrow \sum _{i=1}^c u_{ki} \ge 2. \end{aligned}$$

It is obvious that these relations are equivalent to (C1)–(C3) in Sect. 2.2.

5.1.2 Derivation of Optimal Solutions and Algorithm

Optimal solutions to \(\nu _{ki}\) and \(u_{ki}\) are obtained by comparing the following two cases for each \(x_k\):

  1. \(x_k\) belongs to the lower approximation \(\underline{A}_{p_k}\).

  2. \(x_k\) belongs to the boundaries of two clusters \(\overline{A}_{q^1_k}\) and \(\overline{A}_{q^2_k}\).

We describe the details of each case as follows.

In the first case, let us assume that \(x_k\) belongs to the lower approximation \(\underline{A}_{p_k}\). To derive \(p_k\), the objective function is rewritten as follows:

$$\begin{aligned} J_\text {RNM}(N,U) = \underline{w} \sum _{i=1}^c\underline{J}_{i} + \overline{w} \sum _{i=1}^c \sum _{l=1}^n \sum _{t=1}^n u_{li} u_{ti} D_{lt}. \end{aligned}$$

Here

$$\begin{aligned} \underline{J}_{i} = \left( 2 \nu _{ki} \sum _{t=1}^n \nu _{ti} D_{kt} + \sum _{l=1, l\ne k}^n \sum _{t=1, t\ne k}^n \nu _{li} \nu _{ti} D_{lt} \right) . \end{aligned}$$

Note that \(D_{kk}=0\) and \(D_{kt}=D_{tk}\), therefore

$$\begin{aligned} p_k=\arg \min _i \sum _{t=1}^n \nu _{ti} D_{kt}. \end{aligned}$$
(17)

This means the following relations:

$$\begin{aligned} \nu _{ki}&= {\left\{ \begin{array}{ll} 1, &{} (i = p_k) \\ 0, &{} (\text {otherwise}) \end{array}\right. } \\ u_{ki}&= 0. \quad (\forall i) \end{aligned}$$

In this case, the value of the objective function is calculated as follows:

$$\begin{aligned} J_\text {RNM}(N,U)&= \underline{w} \left( 2 \sum _{t=1}^n \nu _{tp_k}D_{kt} + \sum _{i=1}^c \sum _{l=1, l\ne k}^n \sum _{t=1, t\ne k}^n \nu _{li} \nu _{ti} D_{lt} \right) + \overline{w} \sum _{i=1}^c \sum _{l=1}^n \sum _{t=1}^n u_{li} u_{ti} D_{lt} \\&= 2 J^\nu _k + J_c. \end{aligned}$$

Here

$$\begin{aligned} J^\nu _k&= \sum _{t=1}^n \left( \underline{w} \nu _{tp_k} + \sum _{i=1}^c \overline{w} u_{ki} u_{ti} \right) D_{kt} = \underline{w} \sum _{t=1}^n \nu _{tp_k} D_{kt}, \\ J_c&= \underline{w} \sum _{i=1}^c \sum _{l=1, l\ne k}^n \sum _{t=1, t\ne k}^n \nu _{li} \nu _{ti} D_{lt} + \overline{w} \sum _{i=1}^c \sum _{l=1, l\ne k}^n \sum _{t=1, t\ne k}^n u_{li} u_{ti} D_{lt}. \nonumber \end{aligned}$$
(18)

In the second case, let us assume that \(x_k\) belongs to the boundaries of the two clusters \(\overline{A}_{q^1_k}\) and \(\overline{A}_{q^2_k}\). To derive \(q^1_k\) and \(q^2_k\), the objective function is rewritten as follows:

$$\begin{aligned} J_\text {RNM}(N,U) = \overline{w} \sum _{i=1}^c \overline{J}_{i} + \underline{w} \sum _{i=1}^c \sum _{l=1}^n \sum _{t=1}^n \nu _{li} \nu _{ti} D_{lt}. \end{aligned}$$

Here

$$\begin{aligned} \overline{J}_{i} = \left( 2 u_{ki} \sum _{t=1}^n u_{ti} D_{kt} + \sum _{l=1, l\ne k}^n \sum _{t=1, t\ne k}^n u_{li} u_{ti} D_{lt} \right) . \end{aligned}$$

Therefore

$$\begin{aligned} q^1_k&= \arg \min _i \sum _{t=1}^n u_{ti} D_{kt}, \end{aligned}$$
(19)
$$\begin{aligned} q^2_k&= \arg \min _{i,i\ne q^1_k} \sum _{t=1}^n u_{ti} D_{kt}. \end{aligned}$$
(20)

This means the following relations:

$$\begin{aligned} \nu _{ki}&= 0, \quad (\forall i) \\ u_{ki}&= {\left\{ \begin{array}{ll} 1, &{} ( i = q^1_k \vee i = q^2_k) \\ 0. &{} (\text {otherwise}) \end{array}\right. } \end{aligned}$$

In this case, the value of the objective function is calculated as follows:

$$\begin{aligned} J_\text {RNM}(N,U)&= \underline{w} \sum _{i=1}^c \sum _{l=1}^n \sum _{t=1}^n \nu _{li} \nu _{ti} D_{lt} + \overline{w} \left( 2 \sum _{t=1}^n (u_{tq^1_k}+u_{tq^2_k})D_{kt} \right. \\&\qquad \left. + \sum _{i=1}^c \sum _{l=1, l\ne k}^n \sum _{t=1, t\ne k}^n u_{li} u_{ti} D_{lt} \right) \\&= 2 J^u_k + J_c. \end{aligned}$$

Here

$$\begin{aligned} J^u_k&= \sum _{t=1}^n \left( \sum _{i=1}^c \underline{w} \nu _{ki} \nu _{ti} + \overline{w} (u_{tq^1_k}+u_{tq^2_k}) \right) D_{kt} \nonumber \\&= \overline{w} \sum _{t=1}^n (u_{tq^1_k}+u_{tq^2_k}) D_{kt}. \end{aligned}$$
(21)

Comparing \(J^\nu _k\) with \(J^u_k\), we determine \(\nu _{ki}\) and \(u_{ki}\) as follows:

$$\begin{aligned} \nu _{ki}&= {\left\{ \begin{array}{ll} 1, &{} (J^\nu _k < J^u_k \wedge i = p_k) \\ 0, &{} (\text {otherwise}) \end{array}\right. } \\ u_{ki}&= {\left\{ \begin{array}{ll} 1, &{} \Big (J^\nu _k \ge J^u_k \wedge (i = q^1_k \vee i = q^2_k)\Big ) \\ 0. &{} (\text {otherwise}) \end{array}\right. } \end{aligned}$$

From the above discussion, we show the RNM algorithm as Algorithm 6. The proposed algorithm is constructed based on iterative optimization.

[Algorithm 6: RNM]
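The following is a minimal sketch of the RNM iteration on a dissimilarity matrix (Algorithm 6); the random initialization, the names, and the stopping rule are ours:

```python
import numpy as np

def rnm(D, c, w_lower=0.5, max_iter=100, seed=0):
    """A minimal sketch of RNM on an (n x n) dissimilarity matrix D;
    the random initialization and all names are ours."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    w_upper = 1.0 - w_lower
    nu = np.zeros((n, c)); u = np.zeros((n, c))
    for k in range(n):                     # random start satisfying (C1)-(C3)
        if rng.random() < 0.5:
            nu[k, rng.integers(c)] = 1
        else:
            i, j = rng.choice(c, 2, replace=False)
            u[k, i] = u[k, j] = 1
    for _ in range(max_iter):
        nu_new = np.zeros((n, c)); u_new = np.zeros((n, c))
        for k in range(n):
            s_nu = D[k] @ nu               # sum_t nu_ti D_kt, cf. (17)
            s_u = D[k] @ u                 # sum_t u_ti D_kt, cf. (19)-(20)
            pk = s_nu.argmin()
            q1, q2 = np.argsort(s_u)[:2]
            J_nu = w_lower * s_nu[pk]                # cf. (18)
            J_u = w_upper * (s_u[q1] + s_u[q2])      # cf. (21)
            if J_nu < J_u:
                nu_new[k, pk] = 1
            else:
                u_new[k, q1] = u_new[k, q2] = 1
        if np.array_equal(nu, nu_new) and np.array_equal(u, u_new):
            break
        nu, u = nu_new, u_new
    return nu, u
```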

5.2 Rough Fuzzy Non Metric Model

In the previous section, we proposed the RNM algorithm, in which an object \(x_k\) belongs to exactly two boundaries whenever it does not belong to any lower approximation, since \(u_{ki} \in \{0,1\}\) and the objective function (16) is linear in \(u_{ki}\). In this section, we therefore propose the RFNM algorithm, in which \(x_k\) can belong to the boundaries of all clusters with fuzzy degrees when it does not belong to any lower approximation.

We have two ways to fuzzify RNM. The first is to introduce the fuzzification parameter m, and the second is to introduce an entropy term. Both ways are known to be very useful. We call the method using the first way rough fuzzy non metric model (RFNM) and that using the second way entropy-regularized rough fuzzy non metric model (ERFNM). In this paper, we describe RFNM.

5.2.1 Objective Function

We consider the following objective function of RFNM:

$$\begin{aligned} J_\text {RFNM}(N,U) = \underline{\omega } \sum _{i=1}^{c}\sum _{k=1}^{n}\sum _{t=1}^{n}\nu _{ki}\nu _{ti}D_{kt} + \overline{\omega } \sum _{i=1}^{c}\sum _{k=1}^{n}\sum _{t=1}^{n}u_{ki}^{m}u_{ti}^{m}D_{kt}. \end{aligned}$$
(22)

Here \(\underline{\omega }+\overline{\omega }=1\). \(D_{kt}\) denotes the dissimilarity between \(x_k\) and \(x_t\). The fuzzification exponent m in the last term fuzzifies \(u_{ki}\) and makes the objective function nonlinear in \(u_{ki}\). Hence, the optimal solution for \(u_{ki}\) that minimizes the objective function (22) takes values in [0, 1).

We assume the following conditions for \(\nu _{ki}\) and \(u_{ki}\):

$$\begin{aligned} \nu _{ki} \in \{0,1\} , \qquad u_{ki} \in [0,1). \end{aligned}$$

From (C1)–(C3) in Sect. 2.2, we derive the following constraints:

$$\begin{aligned} \sum _{i=1}^c \nu _{ki} \in \{0,1\}, \end{aligned}$$
(23)
$$\begin{aligned} \sum _{i=1}^c u_{ki} \in \{0,1\}, \end{aligned}$$
(24)
$$\begin{aligned} \sum _{i=1}^c \nu _{ki} = 1 \Longleftrightarrow \sum _{i=1}^c u_{ki} = 0. \end{aligned}$$
(25)

From the above constraints, we derive the following relation for any k:

$$\begin{aligned} \sum _{i=1}^c \nu _{ki} = 0 \Longleftrightarrow \sum _{i=1}^c u_{ki} = 1. \end{aligned}$$
(26)

It is obvious that these relations are equivalent to (C1)–(C3) in Sect. 2.2.

5.2.2 Derivation of Optimal Solutions and Algorithm

As in RNM, the optimal solutions to \(\nu _{ki}\) and \(u_{ki}\) are obtained by comparing the following two cases for each \(x_k\):

  1. \(x_k\) belongs to the lower approximation \(\underline{A}_{p_k}\).

  2. \(x_k\) belongs to the boundaries of more than one cluster.

In the first case, let us assume that \(x_k\) belongs to the lower approximation \(\underline{A}_{p_k}\). To derive \(p_k\), the objective function is rewritten as follows:

$$\begin{aligned} J_\text {RFNM}(N,U) = \underline{\omega } \sum _{i=1}^{c} \underline{J}_{i} + \overline{\omega } \sum _{i=1}^{c}\sum _{l=1}^{n}\sum _{t=1}^{n}u_{li}^{m}u_{ti}^{m}D_{lt}. \end{aligned}$$

Here

$$\begin{aligned} \underline{J}_{i} = 2\nu _{ki}\sum _{t=1}^{n}\nu _{ti}D_{kt} + \sum _{l=1,l \ne k}^{n}\sum _{t=1,t \ne k}^{n}\nu _{li}\nu _{ti}D_{lt}. \end{aligned}$$

Note that \(D_{kk}=0\) and \(D_{kt}=D_{tk}\). Therefore

$$\begin{aligned} p_{k} = \arg \min _{i} \sum _{t=1}^{n}\nu _{ti}D_{kt}. \end{aligned}$$
(27)

This means the following relations:

$$\begin{aligned} \nu _{ki}&= \left\{ \begin{array}{ll} 1, &{} (i=p_{k}) \\ 0, &{} (\text {otherwise}) \\ \end{array} \right. \\ u_{ki}&= 0. \quad (\forall i) \end{aligned}$$

In this case, the value of the objective function is calculated as follows:

$$\begin{aligned} J_\text {RFNM}(N,U)&= \underline{\omega } \left( 2\sum _{t=1}^{n}\nu _{tp_{k}}D_{kt} + \sum _{i=1}^{c}\sum _{l=1, l \ne k}^{n}\sum _{t=1, t \ne k}^{n}\nu _{li}\nu _{ti}D_{lt} \right) + \overline{\omega } \sum _{i=1}^{c}\sum _{l=1}^{n}\sum _{t=1}^{n}u_{li}^{m}u_{ti}^{m}D_{lt}\\&= 2J_{k}^{\nu } + J_{c}. \end{aligned}$$

Here

$$\begin{aligned} J_{k}^{\nu }&= \sum _{t=1}^{n} \left( \underline{\omega } \nu _{tp_{k}} + \sum _{i=1}^{c}\overline{\omega } u_{ki}^{m}u_{ti}^{m} \right) D_{kt} = \underline{\omega }\sum _{t=1}^{n}\nu _{tp_{k}}D_{kt}, \\ J_{c}&= \underline{\omega } \sum _{i=1}^{c}\sum _{l=1, l \ne k}^{n}\sum _{t=1, t \ne k}^{n}\nu _{li}\nu _{ti}D_{lt} + \overline{\omega } \sum _{i=1}^{c}\sum _{l=1, l \ne k}^{n}\sum _{t=1, t \ne k}^{n}u_{li}^{m}u_{ti}^{m}D_{lt}. \nonumber \end{aligned}$$
(28)

In the second case, let us assume that \(x_k\) belongs to the boundaries of more than one cluster. The objective function J is convex in \(u_{ki}\); hence we derive the optimal solution for \(u_{ki}\) using a Lagrange multiplier.

Here we introduce the following Lagrange function with the constraint (26):

$$\begin{aligned} L = J + \sum _{k=1}^{n}\eta _{k}\left( \sum _{i=1}^{c}u_{ki}-1\right) . \end{aligned}$$

We partially differentiate L by \(u_{ki}\) and get the following equation:

$$\begin{aligned} \frac{\partial L}{\partial u_{ki}}&= 2m\overline{\omega }u_{ki}^{m-1}\left( \sum _{t=1, t\ne {k}}^{n}u_{ti}^{m}D_{kt} + u_{ki}^{m}D_{kk}\right) + \eta _{k}\\&= 2m\overline{\omega }u_{ki}^{m-1}\sum _{t=1, t \ne k}^{n}u_{ti}^{m}D_{kt} + \eta _{k}. \end{aligned}$$

Setting \(\frac{\partial L}{\partial u_{ki}} = 0\), we obtain the following relation:

$$\begin{aligned} u_{ki} = \frac{(-\eta _{k})^{\frac{1}{m-1}}}{D_{ki}^{\frac{1}{m-1}}}, \end{aligned}$$
(29)

where

$$\begin{aligned} D_{ki} = 2m\overline{\omega }\sum _{t=1, t \ne k}^{n}u_{ti}^{m}D_{kt} = 2m\overline{\omega }\sum _{t=1}^{n}u_{ti}^{m}D_{kt}. \end{aligned}$$
(30)

From the constraint (26) and the above Eq. (29), we get the following equation:

$$\begin{aligned}\begin{gathered} \sum _{i=1}^{c}u_{ki} = \sum _{i=1}^{c}\frac{(-\eta _{k})^{\frac{1}{m-1}}}{D_{ki}^{\frac{1}{m-1}}} = 1, \\ (-\eta _{k})^{\frac{1}{m-1}} = 1/\sum _{j=1}^{c}\frac{1}{D_{kj}^{\frac{1}{m-1}}}. \end{gathered}\end{aligned}$$

We then obtain the following optimal solution:

$$\begin{aligned} u_{ki} = \frac{\left( \frac{1}{D_{ki}}\right) ^{\frac{1}{m-1}}}{\sum _{j=1}^{c}\left( \frac{1}{D_{kj}}\right) ^{\frac{1}{m-1}}}. \end{aligned}$$

This means the following relations:

$$\begin{aligned} \nu _{ki}&= 0, (\forall i) \\ u_{ki}&= \frac{\left( \frac{1}{D_{ki}}\right) ^{\frac{1}{m-1}}}{\sum _{j=1}^{c}\left( \frac{1}{D_{kj}}\right) ^{\frac{1}{m-1}}}. (\forall i) \end{aligned}$$

In this case, the value of the objective function is calculated as follows:

$$\begin{aligned} J_\text {RFNM}(N, U)&= \underline{\omega }\sum _{i=1}^{c}\sum _{l=1}^{n}\sum _{t=1}^{n}\nu _{li}\nu _{ti}D_{lt} + \overline{\omega }\sum _{i=1}^{c}\left( 2mu_{ki}^{m-1}\sum _{t=1}^{n}u_{ti}^{m}D_{kt} \right. \\&\quad \left. + \sum _{l=1,l \ne k}^{n}\sum _{t=1,t \ne k}^{n}u_{li}^{m}u_{ti}^{m}D_{lt} \right) \\&= 2J_{k}^{u} + J_{c}. \end{aligned}$$

Here

$$\begin{aligned} J_{k}^{u}&= \sum _{t=1}^{n} \left( \sum _{i=1}^{c}\underline{\omega }\nu _{ki}\nu _{ti} + \sum _{i=1}^{c}\overline{\omega }mu_{ki}^{m-1}u_{ti}^{m} \right) D_{kt} \nonumber \\&= \sum _{i=1}^{c}mu_{ki}^{m-1}\sum _{t=1}^{n}\overline{\omega }u_{ti}^{m}D_{kt}. \end{aligned}$$
(31)

Comparing \(J_{k}^{\nu }\) with \(J_{k}^{u}\), we determine \(\nu _{ki}\) and \(u_{ki}\) as follows:

$$\begin{aligned} \nu _{ki}&= \left\{ \begin{array}{ll} 1 &{} (J_{k}^{\nu } < J_{k}^{u} \wedge i=p_{k}) \\ 0 &{} (\text {otherwise}) \\ \end{array} \right. \\ u_{ki}&= \left\{ \begin{array}{ll}\frac{\left( \frac{1}{D_{ki}}\right) ^{\frac{1}{m-1}}}{\sum _{j=1}^{c}\left( \frac{1}{D_{kj}}\right) ^{\frac{1}{m-1}}} &{} \left( J_{k}^{\nu } \ge J_{k}^{u} \right) \\ 0 &{} (\text {otherwise}) \\ \end{array} \right. \end{aligned}$$

From the above discussion, we show the RFNM algorithm as Algorithm 7. The proposed algorithm is also constructed based on iterative optimization.

[Algorithm 7: RFNM]
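The fuzzy boundary update inside Algorithm 7 can be sketched as follows (our names; D is the matrix of \(D_{kt}\) and u holds the current boundary memberships):

```python
import numpy as np

def rfnm_boundary_memberships(k, u, D, m, w_upper):
    """Sketch of the RFNM boundary update for object x_k: D_ki per (30),
    then the closed-form u_ki; all names are ours."""
    D_k = 2 * m * w_upper * (D[k] @ (u ** m))   # D_ki for i = 1,...,c, cf. (30)
    inv = (1.0 / D_k) ** (1.0 / (m - 1))        # assumes D_ki > 0
    return inv / inv.sum()
```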

6 Conclusion

This paper showed various types of objective functions for objective-based rough clustering and the corresponding algorithms.

As mentioned above, many non-hierarchical clustering algorithms are based on the optimization of an objective function. The reason is that we can choose the "better" output among many outputs from different initial values by comparing their objective function values. Lingras's algorithm is almost the only rough set based algorithm inspired by KM; however, it is less useful in the sense that it is not based on any objective function. Therefore, our proposed algorithms can be expected to be more useful in the field of rough clustering.

In objective-based clustering methods, the concept of a classification function is very important. The classification function gives the belongingness of an unknown datum to each cluster. It is impossible to derive the classification functions of our algorithms analytically, so we cannot show the functions explicitly. However, as we have seen, the belongingness can be computed numerically. In future work, we will develop this discussion further.