Clustering is the task of dividing the data points into groups such that points in the same group are more similar to one another than to points in the other groups. In essence, it gathers objects with similar characteristics and assigns them to clusters. Broadly, clustering can be divided into two subgroups (Aggarwal & Reddy, 2016):
- Hard clustering: each data point either belongs to a cluster completely or not at all. For example, in the customer example above, each customer is placed into exactly one of the 10 groups.
- Soft clustering: instead of putting each data point into a single cluster, a probability or likelihood of that data point belonging to each cluster is assigned.
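To make the distinction concrete, the short sketch below (my own illustration, not part of the provided assignment) clusters four synthetic points: K-Means returns one hard label per point, while a Gaussian mixture model returns a soft membership probability for each cluster.

```python
# Illustrative sketch: hard vs. soft clustering on a tiny synthetic dataset.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])

# Hard clustering: one cluster label per point.
hard = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("Hard labels:", hard)

# Soft clustering: one membership probability per point per cluster.
soft = GaussianMixture(n_components=2, random_state=0).fit(X).predict_proba(X)
print("Soft memberships:\n", soft)
```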
This task performs clustering on the provided data set, the BBC Sport data set, downloaded from the cloud. The data set contains 737 documents from the BBC Sport website, corresponding to sports news articles. It consists of three files: the BBC Sport classes, the BBC Sport matrix, and the BBC Sport terms. We open these files as shown below.
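A loading step along the following lines could be used. This is a hypothetical sketch that assumes the standard BBC Sport distribution (a Matrix Market term-document matrix `bbcsport.mtx` plus plain-text `bbcsport.terms` and `bbcsport.classes` files); the file names and formats should be checked against the files actually provided.

```python
# Hypothetical loader for the three BBC Sport files (names/formats assumed).
from scipy.io import mmread

# Term-document matrix in Matrix Market format; assumed rows = terms,
# columns = documents, so transpose to get documents x terms.
X = mmread("bbcsport.mtx").tocsc().T

with open("bbcsport.terms") as f:
    terms = [line.strip() for line in f]

with open("bbcsport.classes") as f:
    # Skip possible comment lines; keep "doc_id class_id" pairs.
    classes = [line.split() for line in f if line.strip() and not line.startswith("%")]

print(X.shape, len(terms), len(classes))
```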
K-Means is probably the best-known clustering algorithm. It is taught in many introductory data science and machine learning classes, it is easy to understand and to implement in code, and it lends itself to graphical illustration (Kaushik, 2016).
- To start, we first select the number of classes/groups to use and randomly initialize their respective centre points. To figure out how many classes to use, it helps to look at the data and try to identify any distinct groupings. The centre points are vectors of the same length as each data point vector (the "X"s in the accompanying graphic).
- Each data point is classified by computing the distance between that point and each group centre, and then assigning the point to the group whose centre is closest to it.
- Based on these classified points, we recompute each group centre as the mean of all the vectors in the group.
- Repeat these steps for a set number of iterations, or until the group centres change little between iterations. You can also randomly initialize the group centres a few times and then select the run that appears to have given the best results (Celebi, 2016). A minimal sketch of these steps appears after this list.
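The sketch below is an assumed from-scratch NumPy implementation of the steps above, not the XLSTAT routine that produced the results further down.

```python
# Minimal K-Means sketch: random centres, assign-to-nearest, recompute means,
# repeat until the centres stabilize.
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Randomly pick k data points as the initial centres.
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # 2. Assign each point to its nearest centre (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Recompute each centre as the mean of its assigned points.
        new_centres = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        # 4. Stop when the centres no longer move.
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres
```

Calling `kmeans(X, k=5)` on a dense documents-by-terms array would return a hard label per document together with the final centres.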
K-Means has the advantage of being very fast: all we are really doing is computing distances between points and group centres, which involves very few calculations. It therefore has linear complexity, O(n).
On the other hand, K-Means has a few disadvantages. First, you have to decide how many groups/classes there are. This is not always trivial, and ideally we would want a clustering algorithm to figure it out for us, since the point of the exercise is to gain insight from the data. K-Means also starts from a random choice of cluster centres, so it may yield different clustering results on different runs of the algorithm. The results may therefore not be repeatable and may lack consistency. Other clustering methods are more consistent.
K-Medians is another clustering algorithm related to K-Means, except that instead of recomputing the group centres using the mean, we use the median vector of the group. This method is less sensitive to outliers (because of the use of the median), but it is much slower for larger datasets, since sorting is required on every iteration when computing the median vector. Only the centre-update step of the earlier sketch changes, as shown below.
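Under the same assumptions as the K-Means sketch above, a K-Medians variant would swap in the following update step:

```python
# K-Medians differs from the K-Means sketch only in the centre update,
# which uses the coordinate-wise median instead of the mean.
import numpy as np

def update_centres_medians(X, labels, centres):
    """Recompute each centre as the coordinate-wise median of its points."""
    k = len(centres)
    return np.array([
        np.median(X[labels == j], axis=0) if np.any(labels == j) else centres[j]
        for j in range(k)
    ])
```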
The k-means method is used to partition observations into homogeneous clusters based on their description by a set of quantitative variables. K-means clustering has the following advantages:
- An object may be assigned to a class during one iteration and then change class in the following iteration, which is not possible with Agglomerative Hierarchical Clustering, where an assignment cannot be reversed.
- By multiplying the starting points and repetitions, several solutions may be explored.
Clustering criteria for k-means clustering
Several clustering criteria may be used to reach a solution. XLSTAT offers four criteria to be minimized (a sketch of how such criteria can be computed follows this list):
- Trace(W) / Median
- Determinant(W)
- Trace(W)
- Wilks lambda
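As a rough sketch of what these criteria measure (my own computation, not XLSTAT's internal code), they can all be derived from the pooled within-class scatter matrix W and the total scatter matrix T:

```python
# Within-class and total scatter matrices, and the criteria built from them.
import numpy as np

def scatter(X):
    """Scatter matrix: sum of outer products of the mean-centred rows."""
    D = X - X.mean(axis=0)
    return D.T @ D

def clustering_criteria(X, labels):
    T = scatter(X)                                            # total scatter
    W = sum(scatter(X[labels == j]) for j in np.unique(labels))  # within-class
    sign, logdet = np.linalg.slogdet(W)
    return {
        "Trace(W)": np.trace(W),
        "ln(Determinant(W))": logdet if sign > 0 else -np.inf,
        "Wilks' lambda": np.linalg.det(W) / np.linalg.det(T),  # assumes T nonsingular
    }
```

Note that ln(Determinant(W)) is -Inf whenever W is singular, which is exactly what the result tables below report for this data.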
Results of k-means clustering in XLSTAT
- Optimization summary: a table showing the evolution of the within-class variance. If several repetitions have been requested, the results for each repetition are displayed.
- Statistics for each iteration: activate this option to see the evolution of various statistics computed as the iterations proceed, together with the optimal result for the chosen criterion. If the corresponding option is activated in the Charts tab, a chart showing the evolution of the chosen criterion over the iterations is displayed.
- Variance decomposition for the optimal classification: a table showing the within-class variance, the between-classes variance and the total variance.
- Class centroids: a table showing the class centroids for the various descriptors.
- Distances between the class centroids: a table showing the Euclidean distances between the class centroids for the various descriptors.
- Central objects: a table showing the coordinates of the object nearest to the centroid of each class.
- Distances between the central objects: a table showing the Euclidean distances between the class central objects for the various descriptors.
- Results by class: the descriptive statistics for the classes (number of objects, sum of weights, within-class variance, minimum distance to the centroid, maximum distance to the centroid, mean distance to the centroid) are displayed in the first part of the table; the second part lists the objects in each class.
- Results by object: a table showing, for every object in order, the class it is assigned to. A scikit-learn sketch that reproduces comparable outputs follows this list.
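For comparison, here is a hedged scikit-learn sketch that produces the same kinds of outputs. It assumes `X` is a dense documents-by-terms array, and it takes within-class variance as inertia divided by the number of observations, which may differ slightly from XLSTAT's weighting.

```python
# Reproducing the main XLSTAT outputs with scikit-learn (assumptions above).
import numpy as np
from sklearn.cluster import KMeans

km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)

centroids = km.cluster_centers_                   # "Class centroids" table
labels = km.labels_                               # "Results by object" table
within_class_variance = km.inertia_ / X.shape[0]  # overall within-class variance
dist_to_centroid = np.linalg.norm(X - centroids[labels], axis=1)
```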
Statistics’ Summary:

| Variable | Observations | Obs. with missing data | Obs. without missing data | Minimum | Maximum | Mean | Std. deviation |
|---|---|---|---|---|---|---|---|
| 7 | 9 | 0 | 9 | 0.000 | 0.000 | 0.000 | 0.000 |
| 1 | 9 | 0 | 9 | 0.000 | 0.000 | 0.000 | 0.000 |
| 3 | 9 | 0 | 9 | 0.000 | 1.000 | 0.222 | 0.441 |
| 2 | 9 | 0 | 9 | 0.000 | 0.000 | 0.000 | 0.000 |
| 4 | 9 | 0 | 9 | 0.000 | 1.000 | 0.333 | 0.500 |
| 2 | 9 | 0 | 9 | 0.000 | 1.000 | 0.333 | 0.500 |
Optimization summary:

| Repetition | Iteration | Initial within-class variance | Final within-class variance | ln(Determinant(W)) |
|---|---|---|---|---|
| 1 | 1 | 0.750 | 0.583 | -Inf |
| 2 | 1 | 0.938 | 0.375 | -Inf |
| 3 | 1 | 0.708 | 0.250 | -Inf |
| 4 | 1 | 1.000 | 0.333 | -Inf |
| 5 | 1 | 0.458 | 0.333 | -Inf |
| 6 | 1 | 0.708 | 0.375 | -Inf |
| 7 | 1 | 0.667 | 0.250 | -Inf |
| 8 | 1 | 0.750 | 0.375 | -Inf |
| 9 | 1 | 1.000 | 0.250 | -Inf |
| 10 | 1 | 0.875 | 0.250 | -Inf |
Statistics for each iteration:

| Iteration | Within-class variance | Trace(W) | ln(Determinant(W)) | Wilks’ Lambda |
|---|---|---|---|---|
| 0 | 0.750 | 3.000 | -Inf | 0.000 |
| 1 | 0.583 | 2.333 | -Inf | 0.000 |
Variance decomposition for the optimal classification:

| | Absolute | Percent |
|---|---|---|
| Within-class | 0.583 | 84.00% |
| Between-classes | 0.111 | 16.00% |
| Total | 0.694 | 100.00% |
Initial class centroids:

| Class | 7 | 1 | 3 | 2 | 4 | 2 |
|---|---|---|---|---|---|---|
| 1 | 0.000 | 0.000 | 1.000 | 0.000 | 0.500 | 0.500 |
| 2 | 0.000 | 0.000 | 0.000 | 0.000 | 0.500 | 0.500 |
| 3 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 4 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
Class centroids:

| Class | 7 | 1 | 3 | 2 | 4 | 2 | Sum of weights | Within-class variance |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.000 | 0.000 | 1.000 | 0.000 | 0.500 | 0.500 | 2.000 | 1.000 |
| 2 | 0.000 | 0.000 | 0.000 | 0.000 | 0.667 | 0.667 | 3.000 | 0.667 |
| 3 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 2.000 | 0.000 |
| 4 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 |
| 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 |
Distances between the class centroids:

| | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 0 | 1.027 | 1.225 | 1.225 | 1.225 |
| 2 | 1.027 | 0 | 0.943 | 0.943 | 0.943 |
| 3 | 1.225 | 0.943 | 0 | 0.000 | 0.000 |
| 4 | 1.225 | 0.943 | 0.000 | 0 | 0.000 |
| 5 | 1.225 | 0.943 | 0.000 | 0.000 | 0 |
Central objects:

| Class | 7 | 1 | 3 | 2 | 4 | 2 |
|---|---|---|---|---|---|---|
| 1 (0) | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 |
| 2 (0) | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 |
| 3 (0) | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 4 (0) | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 5 (0) | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
Distances between the central objects:

| | 1 (0) | 2 (0) | 3 (0) | 4 (0) | 5 (0) |
|---|---|---|---|---|---|
| 1 (0) | 0 | 1.732 | 1.000 | 1.000 | 1.000 |
| 2 (0) | 1.732 | 0 | 1.414 | 1.414 | 1.414 |
| 3 (0) | 1.000 | 1.414 | 0 | 0.000 | 0.000 |
| 4 (0) | 1.000 | 1.414 | 0.000 | 0 | 0.000 |
| 5 (0) | 1.000 | 1.414 | 0.000 | 0.000 | 0 |
Results by class:

| Class | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Objects | 2 | 3 | 2 | 1 | 1 |
| Sum of weights | 2 | 3 | 2 | 1 | 1 |
| Within-class variance | 1.000 | 0.667 | 0.000 | 0.000 | 0.000 |
| Minimum distance to centroid | 0.707 | 0.471 | 0.000 | 0.000 | 0.000 |
| Average distance to centroid | 0.707 | 0.654 | 0.000 | 0.000 | 0.000 |
| Maximum distance to centroid | 0.707 | 0.745 | 0.000 | 0.000 | 0.000 |
| Objects in class | 0 | 0 | 0 | 0 | 0 |
| | 1 | 0 | 0 | | |
| | | 0 | | | |
Results by object:

| Observation | Class | Distance to centroid |
|---|---|---|
| 0 | 1 | 0.707 |
| 0 | 2 | 0.745 |
| 0 | 3 | 0.000 |
| 0 | 4 | 0.000 |
| 0 | 5 | 0.000 |
| 0 | 3 | 0.000 |
| 0 | 2 | 0.745 |
| 1 | 1 | 0.707 |
| 0 | 2 | 0.471 |
Statistics’ Summary:

| Variable | Observations | Obs. with missing data | Obs. without missing data | Minimum | Maximum | Mean | Std. deviation |
|---|---|---|---|---|---|---|---|
| 0 | 5 | 0 | 5 | 0.000 | 0.000 | 0.000 | 0.000 |
Optimization summary:

| Repetition | Iteration | Initial within-class variance | Final within-class variance | ln(Determinant(W)) |
|---|---|---|---|---|
| 1 | 1 | 0.000 | 0.000 | -Inf |
| 2 | 1 | 0.000 | 0.000 | -Inf |
| 3 | 1 | 0.000 | 0.000 | -Inf |
| 4 | 1 | 0.000 | 0.000 | -Inf |
| 5 | 1 | 0.000 | 0.000 | -Inf |
| 6 | 1 | 0.000 | 0.000 | -Inf |
| 7 | 1 | 0.000 | 0.000 | -Inf |
| 8 | 1 | 0.000 | 0.000 | -Inf |
| 9 | 1 | 0.000 | 0.000 | -Inf |
| 10 | 1 | 0.000 | 0.000 | -Inf |
Statistics for each iteration:

| Iteration | Within-class variance | Trace(W) | ln(Determinant(W)) | Wilks’ Lambda |
|---|---|---|---|---|
| 0 | 0.000 | 0.000 | -Inf | 0.000 |
| 1 | 0.000 | 0.000 | -Inf | 0.000 |
Variance decomposition for the optimal classification:

| | Absolute | Percent |
|---|---|---|
| Within-class | 0.000 | 0.00% |
| Between-classes | 0.000 | 0.00% |
| Total | 0.000 | 100.00% |
Initial class centroids:

| Class | 0 |
|---|---|
| 1 | 0.000 |
| 2 | 0.000 |
| 3 | 0.000 |
| 4 | 0.000 |
| 5 | 0.000 |
Class centroids:

| Class | 0 | Sum of weights | Within-class variance |
|---|---|---|---|
| 1 | 0.000 | 1.000 | 0.000 |
| 2 | 0.000 | 1.000 | 0.000 |
| 3 | 0.000 | 1.000 | 0.000 |
| 4 | 0.000 | 1.000 | 0.000 |
| 5 | 0.000 | 1.000 | 0.000 |
Distances between the class centroids:

| | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 0 | 0.000 | 0.000 | 0.000 | 0.000 |
| 2 | 0.000 | 0 | 0.000 | 0.000 | 0.000 |
| 3 | 0.000 | 0.000 | 0 | 0.000 | 0.000 |
| 4 | 0.000 | 0.000 | 0.000 | 0 | 0.000 |
| 5 | 0.000 | 0.000 | 0.000 | 0.000 | 0 |
Central objects:

| Class | 0 |
|---|---|
| 1 (0) | 0.000 |
| 2 (0) | 0.000 |
| 3 (0) | 0.000 |
| 4 (0) | 0.000 |
| 5 (0) | 0.000 |
Distances between the central objects:

| | 1 (0) | 2 (0) | 3 (0) | 4 (0) | 5 (0) |
|---|---|---|---|---|---|
| 1 (0) | 0 | 0.000 | 0.000 | 0.000 | 0.000 |
| 2 (0) | 0.000 | 0 | 0.000 | 0.000 | 0.000 |
| 3 (0) | 0.000 | 0.000 | 0 | 0.000 | 0.000 |
| 4 (0) | 0.000 | 0.000 | 0.000 | 0 | 0.000 |
| 5 (0) | 0.000 | 0.000 | 0.000 | 0.000 | 0 |
Results by class:

| Class | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Objects | 1 | 1 | 1 | 1 | 1 |
| Sum of weights | 1 | 1 | 1 | 1 | 1 |
| Within-class variance | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Minimum distance to centroid | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Average distance to centroid | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Maximum distance to centroid | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Objects in class | 0 | 0 | 0 | 0 | 0 |
Results by object:

| Observation | Class | Distance to centroid |
|---|---|---|
| 0 | 1 | 0.000 |
| 0 | 2 | 0.000 |
| 0 | 3 | 0.000 |
| 0 | 4 | 0.000 |
| 0 | 5 | 0.000 |
The results of repeated K-means runs are provided below.
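A sketch of this repetition logic (assuming the `kmeans` function from the earlier sketch): run the algorithm from several random starts and keep the run with the smallest final within-class variance, mirroring the Repetition column of the optimization summaries above.

```python
# Repeated K-Means: keep the best of several randomly initialized runs.
import numpy as np

def best_of_n_runs(X, k, n_repetitions=10):
    best_labels, best_centres, best_var = None, None, np.inf
    for seed in range(n_repetitions):
        labels, centres = kmeans(X, k, seed=seed)  # from the earlier sketch
        var = np.mean(np.linalg.norm(X - centres[labels], axis=1) ** 2)
        if var < best_var:
            best_labels, best_centres, best_var = labels, centres, var
    return best_labels, best_centres, best_var
```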
For the BBC Sport matrix:
Statistics’ Summary:

| Variable | Observations | Obs. with missing data | Obs. without missing data | Minimum | Maximum | Mean | Std. deviation |
|---|---|---|---|---|---|---|---|
| 0 | 9 | 0 | 9 | 0.000 | 1.000 | 0.333 | 0.500 |
| 0 | 9 | 0 | 9 | 0.000 | 1.000 | 0.111 | 0.333 |
| 0 | 9 | 0 | 9 | 0.000 | 2.000 | 0.556 | 0.882 |
| 0 | 9 | 0 | 9 | 0.000 | 2.000 | 0.667 | 0.866 |
| 0 | 9 | 0 | 9 | 0.000 | 0.000 | 0.000 | 0.000 |
| 0 | 9 | 0 | 9 | 0.000 | 0.000 | 0.000 | 0.000 |
Optimization summary:

| Repetition | Iteration | Initial within-class variance | Final within-class variance | ln(Determinant(W)) |
|---|---|---|---|---|
| 1 | 1 | 1.958 | 0.375 | -Inf |
| 2 | 1 | 2.875 | 0.300 | -Inf |
| 3 | 1 | 2.583 | 0.125 | -Inf |
| 4 | 1 | 2.500 | 0.833 | -Inf |
| 5 | 1 | 1.688 | 0.125 | -Inf |
| 6 | 1 | 2.438 | 0.500 | -Inf |
| 7 | 1 | 2.792 | 0.750 | -Inf |
| 8 | 1 | 2.833 | 0.125 | -Inf |
| 9 | 1 | 2.500 | 0.500 | -Inf |
| 10 | 1 | 2.875 | 0.125 | -Inf |
Statistics for each iteration:

| Iteration | Within-class variance | Trace(W) | ln(Determinant(W)) | Wilks’ Lambda |
|---|---|---|---|---|
| 0 | 1.958 | 7.833 | -Inf | 0.000 |
| 1 | 0.375 | 1.500 | -Inf | 0.000 |
Variance decomposition for the optimal classification:

| | Absolute | Percent |
|---|---|---|
| Within-class | 0.375 | 19.85% |
| Between-classes | 1.514 | 80.15% |
| Total | 1.889 | 100.00% |
Initial class centroids:

| Class | 0 | 0 | 0 | 0 | 0 | 0 |
|---|---|---|---|---|---|---|
| 1 | 0.333 | 0.000 | 0.000 | 0.667 | 0.000 | 0.000 |
| 2 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 3 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 4 | 0.000 | 0.500 | 1.500 | 1.500 | 0.000 | 0.000 |
| 5 | 0.500 | 0.000 | 1.000 | 0.500 | 0.000 | 0.000 |
Class centroids:

| Class | 0 | 0 | 0 | 0 | 0 | 0 | Sum of weights | Within-class variance |
|---|---|---|---|---|---|---|---|---|
| 1 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 3.000 | 0.000 |
| 2 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 2.000 | 0.000 |
| 3 | 0.000 | 0.000 | 0.000 | 2.000 | 0.000 | 0.000 | 1.000 | 0.000 |
| 4 | 0.000 | 0.000 | 2.000 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 |
| 5 | 0.000 | 0.500 | 1.500 | 1.500 | 0.000 | 0.000 | 2.000 | 1.500 |
Distances between the class centroids:

| | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1 | 0 | 1.000 | 2.236 | 2.449 | 2.398 |
| 2 | 1.000 | 0 | 2.000 | 2.236 | 2.179 |
| 3 | 2.236 | 2.000 | 0 | 2.236 | 1.658 |
| 4 | 2.449 | 2.236 | 2.236 | 0 | 0.866 |
| 5 | 2.398 | 2.179 | 1.658 | 0.866 | 0 |
Central objects:

| Class | 0 | 0 | 0 | 0 | 0 | 0 |
|---|---|---|---|---|---|---|
| 1 (0) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 2 (1) | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 3 (0) | 0.000 | 0.000 | 0.000 | 2.000 | 0.000 | 0.000 |
| 4 (0) | 0.000 | 0.000 | 2.000 | 1.000 | 0.000 | 0.000 |
| 5 (0) | 0.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 |
Distances between the central objects:

| | 1 (0) | 2 (1) | 3 (0) | 4 (0) | 5 (0) |
|---|---|---|---|---|---|
| 1 (0) | 0 | 1.000 | 2.236 | 2.449 | 1.732 |
| 2 (1) | 1.000 | 0 | 2.000 | 2.236 | 1.414 |
| 3 (0) | 2.236 | 2.000 | 0 | 2.236 | 1.414 |
| 4 (0) | 2.449 | 2.236 | 2.236 | 0 | 1.000 |
| 5 (0) | 1.732 | 1.414 | 1.414 | 1.000 | 0 |
Results by class:

| Class | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Objects | 3 | 2 | 1 | 1 | 2 |
| Sum of weights | 3 | 2 | 1 | 1 | 2 |
| Within-class variance | 0.000 | 0.000 | 0.000 | 0.000 | 1.500 |
| Minimum distance to centroid | 0.000 | 0.000 | 0.000 | 0.000 | 0.866 |
| Average distance to centroid | 0.000 | 0.000 | 0.000 | 0.000 | 0.866 |
| Maximum distance to centroid | 0.000 | 0.000 | 0.000 | 0.000 | 0.866 |
| Objects in class | 0 | 1 | 0 | 0 | 0 |
| | 0 | 0 | | | 1 |
| | 0 | | | | |
Results by object:

| Observation | Class | Distance to centroid |
|---|---|---|
| 0 | 1 | 0.000 |
| 1 | 2 | 0.000 |
| 0 | 2 | 0.000 |
| 0 | 1 | 0.000 |
| 0 | 1 | 0.000 |
| 0 | 3 | 0.000 |
| 0 | 4 | 0.000 |
| 0 | 5 | 0.866 |
| 1 | 5 | 0.866 |
References:
Aggarwal, C. C. and Reddy, C. K. (2016). Data Clustering: Algorithms and Applications. Boca Raton: CRC Press.
Celebi, M. E. (2016). Partitional Clustering Algorithms. Cham: Springer International Publishing.
Kaushik, S. (2016). An Introduction to Clustering & different methods of clustering. [online] Analytics Vidhya. Available at: https://www.analyticsvidhya.com/blog/2016/11/an-introduction-to-clustering-and-different-methods-of-clustering/ [Accessed 24 Aug. 2018].