The document describes the Rough K-Means clustering algorithm. It takes a dataset as input and outputs lower and upper approximations of K clusters. It works as follows:
1. Objects are randomly assigned to initial clusters. Cluster centroids are then computed.
2. Objects are assigned to clusters based on the ratio of their distances to the second-closest and closest centroids. Boundary objects may belong to the upper approximations of multiple clusters.
3. Cluster centroids are recomputed based on the new cluster assignments. The process repeats until cluster centroids converge.
An example is provided to illustrate the algorithm on a sample dataset with 6 objects and 2 features.
Rough K-Means - Numerical Example
Rough K-Means
Dr. E. N. Sathishkumar, Department of Computer Science, Periyar University, Salem - 11
ROUGH K-MEANS CLUSTERING ALGORITHM
Algorithm: Rough K-Means
Input: Dataset of n objects with d features, the number of clusters K, and values of the parameters Wlower, Wupper, and epsilon.
Output: Lower approximation A(Kj) and upper approximation Ā(Kj) of each of the K clusters.
Step 1: Randomly assign each data object to exactly one lower approximation A(Kj). By definition (property 2), the data object also belongs to the upper approximation Ā(Kj) of the same cluster.
Step 2: Compute the cluster centroids Cj:
If A(Kj) ≠ ∅ and Ā(Kj) − A(Kj) = ∅
    Cj = Σ_{x ∈ A(Kj)} x / |A(Kj)|
Else if A(Kj) = ∅ and Ā(Kj) − A(Kj) ≠ ∅
    Cj = Σ_{x ∈ Ā(Kj) − A(Kj)} x / |Ā(Kj) − A(Kj)|
Else
    Cj = Wlower × Σ_{x ∈ A(Kj)} x / |A(Kj)| + Wupper × Σ_{x ∈ Ā(Kj) − A(Kj)} x / |Ā(Kj) − A(Kj)|
Step 3: Assign each object to the lower approximation A(Ki) and/or the upper approximation Ā(Ki) of the appropriate cluster(s). For each object vector X, let d(X, Cj) be the distance between X and the centroid of cluster Kj, and let d(X, Ci) = min over 1 ≤ j ≤ K of d(X, Cj). The ratios d(X, Cj) / d(X, Ci), 1 ≤ i, j ≤ K, determine the membership of X as follows: if d(X, Cj) / d(X, Ci) < epsilon for some pair (i, j) with j ≠ i, then X ∈ Ā(Ki) and X ∈ Ā(Kj), and X will not be a part of any lower approximation. Otherwise, X ∈ A(Ki), where d(X, Ci) is the minimum over 1 ≤ i ≤ K; in addition, X ∈ Ā(Ki).
Step 4: Repeat Steps 2 and 3 until convergence.
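
For concreteness, the four steps above can be sketched in Python. This is a minimal illustration rather than the author's implementation: the function name rough_kmeans, its signature, and the use of NumPy are assumptions, and since Step 1 assigns objects at random, different runs may yield different approximations.

```python
import numpy as np

def rough_kmeans(X, k, w_lower=0.7, w_upper=0.3, epsilon=2.0,
                 max_iter=100, tol=1e-6, seed=None):
    """Sketch of Rough K-Means; returns lower/upper approximations and centroids."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = len(X)

    # Step 1: randomly place each object in exactly one lower approximation;
    # by property 2 it also belongs to the upper approximation of that cluster.
    labels = rng.integers(0, k, size=n)
    lower = [set(np.where(labels == j)[0].tolist()) for j in range(k)]
    upper = [set(s) for s in lower]

    centroids = np.zeros((k, X.shape[1]))
    prev = None
    for _ in range(max_iter):
        # Step 2: centroid from the lower approximation and/or the boundary region.
        for j in range(k):
            boundary = upper[j] - lower[j]
            if lower[j] and not boundary:
                centroids[j] = X[list(lower[j])].mean(axis=0)
            elif boundary and not lower[j]:
                centroids[j] = X[list(boundary)].mean(axis=0)
            elif lower[j] and boundary:
                centroids[j] = (w_lower * X[list(lower[j])].mean(axis=0)
                                + w_upper * X[list(boundary)].mean(axis=0))
            # an empty cluster keeps its previous centroid

        # Step 4: stop once the centroids no longer move.
        if prev is not None and np.allclose(centroids, prev, atol=tol):
            break
        prev = centroids.copy()

        # Step 3: reassign objects using the distance-ratio test.
        lower = [set() for _ in range(k)]
        upper = [set() for _ in range(k)]
        for idx, x in enumerate(X):
            d = np.linalg.norm(x - centroids, axis=1)
            i = int(np.argmin(d))                    # nearest cluster
            ties = [j for j in range(k)
                    if j != i and d[j] / max(d[i], 1e-12) < epsilon]
            if ties:                                 # boundary object: upper approximations only
                upper[i].add(idx)
                for j in ties:
                    upper[j].add(idx)
            else:                                    # crisp object: lower and upper of nearest
                lower[i].add(idx)
                upper[i].add(idx)

    return lower, upper, centroids
```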
Illustrative Example
Table 1 shows an example information system with real-valued conditional attributes. It consists of six objects/genes and two features/samples. The number of clusters is K = 2, the weight of the lower approximation is Wlower = 0.7, the weight of the upper approximation is Wupper = 0.3, and the relative threshold is epsilon = 2.
Table 1 Example dataset for Rough K-Means
U X Y
1 0 3
2 1 3
3 3 1
4 3 0.5
5 5 0
6 6 0
Step 1: Randomly assign each data object to exactly one lower approximation:
K1 = {(0, 3), (1, 3), (3, 1)}
K2 = {(3, 0.5), (5, 0), (6, 0)}
Step 2: In this case A(Kj) ≠ ∅ and Ā(Kj) − A(Kj) = ∅, so we compute the centroids using
Cj = Σ_{x ∈ A(Kj)} x / |A(Kj)|:
C1 = ((0 + 1 + 3)/3, (3 + 3 + 1)/3) = (1.33, 2.33)
C2 = ((3 + 5 + 6)/3, (0.5 + 0 + 0)/3) = (4.67, 0.17)
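
As a quick check on the arithmetic, the two averages can be reproduced with NumPy (an illustrative snippet, not part of the original slides):

```python
import numpy as np

K1 = np.array([[0, 3], [1, 3], [3, 1]])     # initial lower approximation of cluster 1
K2 = np.array([[3, 0.5], [5, 0], [6, 0]])   # initial lower approximation of cluster 2

print(K1.mean(axis=0))   # [1.333... 2.333...]  -> C1 = (1.33, 2.33)
print(K2.mean(axis=0))   # [4.666... 0.166...]  -> C2 = (4.67, 0.17)
```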
Find the distance from each centroid to every point using the Euclidean distance,
d(X, C) = √((x2 − x1)² + (y2 − y1)²)
d1(X, C1):
d((0, 3), (1.33, 2.33)) = √((1.33 − 0)² + (2.33 − 3)²) = 1.49
d((1, 3), (1.33, 2.33)) = √((1.33 − 1)² + (2.33 − 3)²) = 0.75
d((3, 1), (1.33, 2.33)) = √((1.33 − 3)² + (2.33 − 1)²) = 2.13
d((3, 0.5), (1.33, 2.33)) = √((1.33 − 3)² + (2.33 − 0.5)²) = 2.48
d((5, 0), (1.33, 2.33)) = √((1.33 − 5)² + (2.33 − 0)²) = 4.35
d((6, 0), (1.33, 2.33)) = √((1.33 − 6)² + (2.33 − 0)²) = 5.22
d2(X, C2):
d((0, 3), (4.67, 0.17)) = √((4.67 − 0)² + (0.17 − 3)²) = 5.46
d((1, 3), (4.67, 0.17)) = √((4.67 − 1)² + (0.17 − 3)²) = 4.63
d((3, 1), (4.67, 0.17)) = √((4.67 − 3)² + (0.17 − 1)²) = 1.86
d((3, 0.5), (4.67, 0.17)) = √((4.67 − 3)² + (0.17 − 0.5)²) = 1.70
d((5, 0), (4.67, 0.17)) = √((4.67 − 5)² + (0.17 − 0)²) = 0.37
d((6, 0), (4.67, 0.17)) = √((4.67 − 6)² + (0.17 − 0)²) = 1.34
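
Both distance tables can be reproduced in a few lines; this sketch assumes NumPy and uses the rounded centroids (1.33, 2.33) and (4.67, 0.17) from above:

```python
import numpy as np

points = np.array([[0, 3], [1, 3], [3, 1], [3, 0.5], [5, 0], [6, 0]])
C1 = np.array([1.33, 2.33])
C2 = np.array([4.67, 0.17])

d1 = np.linalg.norm(points - C1, axis=1)   # distance of each object to C1
d2 = np.linalg.norm(points - C2, axis=1)   # distance of each object to C2

print(np.round(d1, 2))   # [1.49 0.75 2.13 2.48 4.35 5.22]
print(np.round(d2, 2))   # [5.46 4.63 1.86 1.7  0.37 1.34]
```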
Step 3: Assign each object to the lower approximation A(Ki) or upper approximation Ā(Ki) of the appropriate cluster. For each object, check whether d(X, Cj) / d(X, Ci) ≥ epsilon, where Ci is the nearest centroid:
1. (0, 3): d2/d1 = 5.46/1.49 = 3.664 ≥ 2, so x1 is a part of A(K1).
2. (1, 3): d2/d1 = 4.63/0.75 = 6.173 ≥ 2, so x2 is a part of A(K1).
3. (3, 1): d1/d2 = 2.13/1.86 = 1.145 < 2, so x3 is not a part of A(K1) or A(K2); it belongs to both Ā(K1) and Ā(K2).
4. (3, 0.5): d1/d2 = 2.48/1.70 = 1.458 < 2, so x4 is not a part of A(K1) or A(K2); it belongs to both Ā(K1) and Ā(K2).
5. (5, 0): d1/d2 = 4.35/0.37 = 11.756 ≥ 2, so x5 is a part of A(K2).
6. (6, 0): d1/d2 = 5.22/1.34 = 3.895 ≥ 2, so x6 is a part of A(K2).
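
The same ratio test can be written directly for the two-cluster case. This continues the previous snippet (points, d1, d2) and is only a sketch of the membership rule for K = 2:

```python
epsilon = 2.0
lower = {1: [], 2: []}   # lower approximations A(K1), A(K2)
upper = {1: [], 2: []}   # upper approximations

for p, a, b in zip(points, d1, d2):
    near = 1 if a <= b else 2                 # nearest cluster
    ratio = max(a, b) / min(a, b)             # d(X, Cj) / d(X, Ci), other over nearest
    if ratio >= epsilon:                      # crisp: lower and upper of the nearest cluster
        lower[near].append(tuple(p.tolist()))
        upper[near].append(tuple(p.tolist()))
    else:                                     # boundary: both upper approximations only
        upper[1].append(tuple(p.tolist()))
        upper[2].append(tuple(p.tolist()))

print(lower)   # {1: [(0.0, 3.0), (1.0, 3.0)], 2: [(5.0, 0.0), (6.0, 0.0)]}
print(upper)   # each upper approximation also contains (3.0, 1.0) and (3.0, 0.5)
```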
Now, we have the clusters
A(K1) = {(0, 3), (1, 3)}    Ā(K1) = {(0, 3), (1, 3), (3, 1), (3, 0.5)}
A(K2) = {(5, 0), (6, 0)}    Ā(K2) = {(5, 0), (6, 0), (3, 1), (3, 0.5)}
Here, A(Kj) ≠ ∅ and Ā(Kj) − A(Kj) ≠ ∅, so we find the new centroids using the equation
Cj = Wlower × Σ_{x ∈ A(Kj)} x / |A(Kj)| + Wupper × Σ_{x ∈ Ā(Kj) − A(Kj)} x / |Ā(Kj) − A(Kj)|:
C1 = 0.7 × ((0 + 1)/2, (3 + 3)/2) + 0.3 × ((3 + 3)/2, (1 + 0.5)/2) = (1.25, 2.325)
C2 = 0.7 × ((5 + 6)/2, (0 + 0)/2) + 0.3 × ((3 + 3)/2, (1 + 0.5)/2) = (4.75, 0.225)
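
Again as a sanity check, the weighted centroid update can be verified with NumPy (illustrative only):

```python
import numpy as np

w_lower, w_upper = 0.7, 0.3
lower1 = np.array([[0, 3], [1, 3]])
lower2 = np.array([[5, 0], [6, 0]])
boundary = np.array([[3, 1], [3, 0.5]])   # objects in both upper approximations

C1 = w_lower * lower1.mean(axis=0) + w_upper * boundary.mean(axis=0)
C2 = w_lower * lower2.mean(axis=0) + w_upper * boundary.mean(axis=0)
print(C1)   # [1.25  2.325]
print(C2)   # [4.75  0.225]
```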
Step 4: Repeat Steps 2 and 3 until convergence (Old Centroid = New Centroid).
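
Putting the pieces together, the whole iteration can be run with the rough_kmeans sketch given after the algorithm description (a hypothetical function; because the initial assignment is random, intermediate centroids may differ from the hand-worked pass above, even with the same data and parameters):

```python
import numpy as np

data = np.array([[0, 3], [1, 3], [3, 1], [3, 0.5], [5, 0], [6, 0]])   # Table 1
lower, upper, centroids = rough_kmeans(data, k=2, w_lower=0.7, w_upper=0.3,
                                       epsilon=2.0, seed=1)

for j in range(2):
    print(f"cluster {j + 1}: lower = {sorted(lower[j])}, upper = {sorted(upper[j])}")
print("centroids:\n", centroids)
```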