K-medoids method


This process demonstrates how the K-medoids method can be used, taking the Maximum Variance (R15) dataset as an example.


Maximum Variance (R15) [SIPU Datasets] [Maximum Variance]

The dataset contains 600 two-dimensional vectors grouped into 15 clusters. The clusters are arranged around a center at the coordinates (10, 10), and the distance between neighboring clusters grows with their distance from the center. This is what makes the task difficult: the clusters near the center almost blend into each other.
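To make this layout easier to picture, the following minimal sketch generates a synthetic stand-in with a similar shape. It is not the actual R15 file: the function name `make_ring_clusters`, the arrangement of one central cluster plus two rings of seven, the ring radii, and the spread are all assumptions chosen for illustration only.

```python
import math
import random


def make_ring_clusters(center=(10.0, 10.0), ring_radii=(1.5, 5.0), per_ring=7,
                       points_per_cluster=40, spread=0.3, seed=0):
    """Synthetic stand-in (NOT the real R15 data): one central cluster
    plus rings of Gaussian clusters around the given center."""
    rng = random.Random(seed)
    cx, cy = center
    centers = [(cx, cy)]
    for radius in ring_radii:
        for j in range(per_ring):
            angle = 2.0 * math.pi * j / per_ring
            centers.append((cx + radius * math.cos(angle),
                            cy + radius * math.sin(angle)))
    points, labels = [], []
    for label, (mx, my) in enumerate(centers):
        for _ in range(points_per_cluster):
            # Each point is the cluster center plus Gaussian noise.
            points.append((rng.gauss(mx, spread), rng.gauss(my, spread)))
            labels.append(label)
    return points, labels, centers


points, labels, centers = make_ring_clusters()
# Neighboring clusters on the inner ring are closer than on the outer ring.
inner_gap = math.dist(centers[1], centers[2])
outer_gap = math.dist(centers[8], centers[9])
print(len(points), len(centers))  # 600 15
print(inner_gap < outer_gap)      # True
```

With the assumed radii, neighboring inner-ring centers are only about 1.3 units apart, which is why the central clusters are the ones most likely to be confused.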

Figure 11.5. The dataset



Figure 11.6. Setting the parameters of the clustering


The K-medoids method differs from the K-means method in that the center of each cluster (the medoid) has to be an existing point of the dataset. After setting the distance function and the number of clusters k and running the process, it can be seen that, even though a more sophisticated distance function was chosen, the arrangement of the data did not allow the central clusters to be separated precisely.
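The idea that cluster centers must be existing points can be illustrated with a minimal sketch. It uses the simple alternating scheme (assign each point to its nearest medoid, then re-pick each medoid from the members of its cluster), which is not necessarily the algorithm the process itself runs. The function name `k_medoids`, its parameters, and the toy points are hypothetical, and the sketch assumes that all points are distinct.

```python
import math
import random


def k_medoids(points, k, distance=math.dist, init=None, max_iter=100, seed=0):
    """Alternating k-medoids sketch. Returns (medoid indices, assignment).
    Assumes distinct points; may stop in a local optimum."""
    rng = random.Random(seed)
    # Medoids are stored as indices into `points`, so every center is
    # guaranteed to be an existing data point.
    medoids = list(init) if init is not None else rng.sample(range(len(points)), k)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: distance(p, points[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid is the member with the smallest
        # total distance to all members of its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(distance(points[c], points[j])
                                           for j in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    assignment = [min(range(k), key=lambda c: distance(p, points[medoids[c]]))
                  for p in points]
    return medoids, assignment


# Toy data: two well-separated groups of three points each.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, assignment = k_medoids(toy, k=2, init=[0, 1])
print(medoids, assignment)  # [0, 3] [0, 0, 0, 1, 1, 1]
```

The `distance` argument accepts any function of two points, so a different metric can be substituted without touching the rest of the code. As in the process described above, a different metric does not help if the clusters themselves nearly overlap, and a poor starting choice can leave this scheme in a local optimum.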

Figure 11.7. The clusters produced by the analysis


Interpretation of the results

The process has shown that not every dataset can be analyzed successfully with an arbitrarily chosen clustering method.





K-medoids method
dataset properties
cluster analysis


Read CSV