Rough K-Means DR. E. N. SATHISHKUMAR
Department of Computer Science, Periyar University, Salem - 11
ROUGH K-MEANS CLUSTERING ALGORITHM
Algorithm: Rough K-Means
Input: Dataset of n objects with d features, the number of clusters K, and values
of the parameters Wlower, Wupper, and epsilon.
Output: Lower approximation A(Kj) and upper approximation Ā(Kj) of each of the K
clusters.
Step 1: Randomly assign each data object to exactly one lower approximation
A(Kj). By definition (property 2) the object then also belongs to the upper
approximation Ā(Kj) of the same cluster.
Step 2: Compute the cluster centroids Cj.
If A(Kj) ≠ ∅ and Ā(Kj) − A(Kj) = ∅:
    Cj = ( Σx∈A(Kj) x ) / |A(Kj)|
Else if A(Kj) = ∅ and Ā(Kj) − A(Kj) ≠ ∅:
    Cj = ( Σx∈Ā(Kj)−A(Kj) x ) / |Ā(Kj) − A(Kj)|
Else:
    Cj = Wlower · ( Σx∈A(Kj) x ) / |A(Kj)| + Wupper · ( Σx∈Ā(Kj)−A(Kj) x ) / |Ā(Kj) − A(Kj)|
Step 3: Assign each object to the lower approximation A(Ki) or the upper
approximation Ā(Ki) of cluster i as follows. For each object vector x, let
d(x, Cj) be the distance between x and the centroid Cj, and let Ci be the
nearest centroid, i.e. d(x, Ci) = min 1≤j≤K d(x, Cj). The ratios
d(x, Cj) / d(x, Ci), 1 ≤ i, j ≤ K, determine the membership of x as follows:
if d(x, Cj) / d(x, Ci) ≤ epsilon for some j ≠ i, then x ∈ Ā(Ki) and x ∈ Ā(Kj),
and x is not part of any lower approximation. Otherwise x ∈ A(Ki), and by
property 2 also x ∈ Ā(Ki).
Step 4: Repeat Steps 2 and 3 until convergence.
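The four steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `rough_kmeans` and the `init` argument (which fixes the Step 1 assignment so runs are reproducible) are my own naming, and the sketch assumes every cluster keeps a non-empty lower approximation or boundary.

```python
import numpy as np

def rough_kmeans(X, k, w_lower=0.7, w_upper=0.3, epsilon=2.0,
                 max_iter=100, init=None, seed=0):
    """Sketch of rough k-means; returns index sets for the lower and
    upper approximation of each of the k clusters."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # Step 1: each object starts in exactly one lower approximation
    labels = np.asarray(init) if init is not None else rng.integers(0, k, size=n)
    lower = [set(np.where(labels == j)[0]) for j in range(k)]
    upper = [set(s) for s in lower]          # property 2: lower ⊆ upper

    for _ in range(max_iter):
        # Step 2: centroids from lower approximation and boundary region
        centroids = np.empty((k, X.shape[1]))
        for j in range(k):
            boundary = upper[j] - lower[j]
            if lower[j] and not boundary:
                centroids[j] = X[list(lower[j])].mean(axis=0)
            elif boundary and not lower[j]:
                centroids[j] = X[list(boundary)].mean(axis=0)
            else:
                centroids[j] = (w_lower * X[list(lower[j])].mean(axis=0)
                                + w_upper * X[list(boundary)].mean(axis=0))
        # Step 3: reassign by the distance-ratio test
        new_lower = [set() for _ in range(k)]
        new_upper = [set() for _ in range(k)]
        for i in range(n):
            d = np.linalg.norm(X[i] - centroids, axis=1)
            nearest = int(np.argmin(d))
            close = [j for j in range(k)
                     if j != nearest and d[j] / d[nearest] <= epsilon]
            if close:    # ambiguous: upper approximations of all close clusters
                for j in [nearest] + close:
                    new_upper[j].add(i)
            else:        # clear winner: lower (and upper) of the nearest cluster
                new_lower[nearest].add(i)
                new_upper[nearest].add(i)
        # Step 4: stop when the approximations no longer change
        if new_lower == lower and new_upper == upper:
            break
        lower, upper = new_lower, new_upper
    return lower, upper
```

Called on the six-point dataset of the worked example below with `init=[0, 0, 0, 1, 1, 1]`, it converges to the same lower and upper approximations derived by hand.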
Illustrative Example
Table 1 shows an example information system with real-valued conditional
attributes. It consists of six objects/genes and two features/samples. The
number of clusters is k = 2, the weight of the lower approximation is
Wlower = 0.7, the weight of the upper approximation is Wupper = 0.3, and the
relative threshold is epsilon = 2.
Table 1 Example dataset for Rough K-Means
U X Y
1 0 3
2 1 3
3 3 1
4 3 0.5
5 5 0
6 6 0
Step 1: Randomly assign each data object to exactly one lower approximation:
A(K1) = {(0, 3), (1, 3), (3, 1)}
A(K2) = {(3, 0.5), (5, 0), (6, 0)}
Step 2: In this case A(Kj) ≠ ∅ and Ā(Kj) − A(Kj) = ∅, so we compute each
centroid using Cj = ( Σx∈A(Kj) x ) / |A(Kj)|:
C1 = ( (0+1+3)/3, (3+3+1)/3 ) = (1.33, 2.33)
C2 = ( (3+5+6)/3, (0.5+0+0)/3 ) = (4.67, 0.17)
Find the distance from each centroid to each point using the Euclidean distance,
d = √((x2 − x1)² + (y2 − y1)²)
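As a sanity check on the arithmetic, the centroids and both distance columns can be recomputed with NumPy. This is a sketch using the exact (unrounded) centroids, so intermediate values differ slightly from the hand-rounded 1.33/2.33 above, but all distances agree to two decimal places.

```python
import numpy as np

# Data from Table 1 and the initial split K1 = first three rows, K2 = last three
X = np.array([[0, 3], [1, 3], [3, 1], [3, 0.5], [5, 0], [6, 0]])
C1 = X[:3].mean(axis=0)   # ≈ (1.33, 2.33)
C2 = X[3:].mean(axis=0)   # ≈ (4.67, 0.17)

# Row-wise Euclidean distances from every point to each centroid
d1 = np.linalg.norm(X - C1, axis=1)   # ≈ [1.49, 0.75, 2.13, 2.48, 4.35, 5.22]
d2 = np.linalg.norm(X - C2, axis=1)   # ≈ [5.46, 4.63, 1.86, 1.70, 0.37, 1.34]
```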
d1(X, C1):
(0, 3)–(1.33, 2.33) → √((1.33 − 0)² + (2.33 − 3)²) = 1.49
(1, 3)–(1.33, 2.33) → √((1.33 − 1)² + (2.33 − 3)²) = 0.75
(3, 1)–(1.33, 2.33) → √((1.33 − 3)² + (2.33 − 1)²) = 2.13
(3, 0.5)–(1.33, 2.33) → √((1.33 − 3)² + (2.33 − 0.5)²) = 2.48
(5, 0)–(1.33, 2.33) → √((1.33 − 5)² + (2.33 − 0)²) = 4.35
(6, 0)–(1.33, 2.33) → √((1.33 − 6)² + (2.33 − 0)²) = 5.22
d2(X, C2):
(0, 3)–(4.67, 0.17) → √((4.67 − 0)² + (0.17 − 3)²) = 5.46
(1, 3)–(4.67, 0.17) → √((4.67 − 1)² + (0.17 − 3)²) = 4.63
(3, 1)–(4.67, 0.17) → √((4.67 − 3)² + (0.17 − 1)²) = 1.86
(3, 0.5)–(4.67, 0.17) → √((4.67 − 3)² + (0.17 − 0.5)²) = 1.70
(5, 0)–(4.67, 0.17) → √((4.67 − 5)² + (0.17 − 0)²) = 0.37
(6, 0)–(4.67, 0.17) → √((4.67 − 6)² + (0.17 − 0)²) = 1.34
Step 3: Assign each object to the lower approximation A(Ki) or the upper
approximation Ā(Ki) of cluster i. For each object, check whether the ratio of
the larger distance to the smaller distance reaches epsilon = 2.
1. (0, 3) → d2/d1 = 5.46/1.49 = 3.664 ≥ 2, so x1 is part of A(K1)
2. (1, 3) → d2/d1 = 4.63/0.75 = 6.173 ≥ 2, so x2 is part of A(K1)
3. (3, 1) → d1/d2 = 2.13/1.86 = 1.145 < 2, so x3 is not part of A(K1) or A(K2): it belongs to both Ā(K1) and Ā(K2)
4. (3, 0.5) → d1/d2 = 2.48/1.70 = 1.459 < 2, so x4 is not part of A(K1) or A(K2): it belongs to both Ā(K1) and Ā(K2)
5. (5, 0) → d1/d2 = 4.35/0.37 = 11.757 ≥ 2, so x5 is part of A(K2)
6. (6, 0) → d1/d2 = 5.22/1.34 = 3.896 ≥ 2, so x6 is part of A(K2)
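The six membership decisions above can be reproduced directly from the two distance columns; a minimal sketch, where `assignments` is an illustrative name:

```python
import numpy as np

d1 = np.array([1.49, 0.75, 2.13, 2.48, 4.35, 5.22])  # distances to C1
d2 = np.array([5.46, 4.63, 1.86, 1.70, 0.37, 1.34])  # distances to C2
epsilon = 2.0

assignments = []
for a, b in zip(d1, d2):
    ratio = max(a, b) / min(a, b)      # larger distance over smaller
    if ratio >= epsilon:
        assignments.append("K1" if a < b else "K2")  # clear member of nearest
    else:
        assignments.append("boundary")               # in both upper approximations

print(assignments)   # ['K1', 'K1', 'boundary', 'boundary', 'K2', 'K2']
```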
Now we have the clusters
A(K1) = {(0, 3), (1, 3)}     Ā(K1) = {(0, 3), (1, 3), (3, 1), (3, 0.5)}
A(K2) = {(5, 0), (6, 0)}     Ā(K2) = {(5, 0), (6, 0), (3, 1), (3, 0.5)}
Here A(Kj) ≠ ∅ and Ā(Kj) − A(Kj) ≠ ∅, so we find the new centroids using the
weighted equation
Cj = Wlower · ( Σx∈A(Kj) x ) / |A(Kj)| + Wupper · ( Σx∈Ā(Kj)−A(Kj) x ) / |Ā(Kj) − A(Kj)|
C1 = 0.7 · ( (0+1)/2, (3+3)/2 ) + 0.3 · ( (3+3)/2, (1+0.5)/2 ) = (1.25, 2.325)
C2 = 0.7 · ( (5+6)/2, (0+0)/2 ) + 0.3 · ( (3+3)/2, (1+0.5)/2 ) = (4.75, 0.225)
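The weighted centroid update can also be checked numerically; a small sketch using the lower approximations and the shared boundary region found above:

```python
import numpy as np

w_lower, w_upper = 0.7, 0.3
lower1 = np.array([[0, 3], [1, 3]])       # A(K1)
lower2 = np.array([[5, 0], [6, 0]])       # A(K2)
boundary = np.array([[3, 1], [3, 0.5]])   # boundary region shared by both clusters

# Weighted combination of the lower-approximation mean and the boundary mean
C1 = w_lower * lower1.mean(axis=0) + w_upper * boundary.mean(axis=0)
C2 = w_lower * lower2.mean(axis=0) + w_upper * boundary.mean(axis=0)
print(C1)   # ≈ [1.25, 2.325]
print(C2)   # ≈ [4.75, 0.225]
```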
Step 4: Repeat Steps 2 and 3 until convergence (Old Centroid = New Centroid).