## Self-organizing maps (SOM) and vector quantization (VQ)

### Description

This process demonstrates Kohonen's vector quantization (VQ) and the self-organizing map (SOM) algorithm on the Maximum Variance (R15) dataset. Both algorithms can be fitted with the `SOM/Kohonen` operator.

### Input

Maximum Variance (R15) [SIPU Datasets] [Maximum Variance]

The dataset consists of `600` two-dimensional records arranged in `15` groups. The points are scattered around the point with coordinates (10, 10), and the groups lie farther apart from one another the farther they are from this center. The difficulty of the task is that the groups closest to the center almost merge. In the figure below, the groups are distinguished by color.

Figure 23.10. The scatterplot of the Maximum Variance (R15) dataset

### Output

First, Kohonen's vector quantization is applied. This method yielded `10` clusters. The results are shown in the figure below.
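The idea behind Kohonen's vector quantization can be illustrated with a minimal sketch: each record pulls its nearest codebook vector a little closer (winner-take-all update). This is not the implementation of the `SOM/Kohonen` operator. The function names, the linearly decaying learning rate, and the random data standing in for R15 are all assumptions made for illustration.

```python
import numpy as np

def kohonen_vq(X, n_clusters=10, n_epochs=20, lr0=0.5, seed=0):
    """Online winner-take-all vector quantization (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Initialize the codebook with randomly chosen distinct records.
    codebook = X[rng.choice(len(X), size=n_clusters, replace=False)].astype(float)
    n_steps = n_epochs * len(X)
    step = 0
    for _ in range(n_epochs):
        for i in rng.permutation(len(X)):
            # Assumed schedule: learning rate decays linearly towards zero.
            lr = lr0 * (1.0 - step / n_steps)
            # The winner is the codebook vector closest to the record.
            winner = np.argmin(((codebook - X[i]) ** 2).sum(axis=1))
            # Only the winner moves towards the record.
            codebook[winner] += lr * (X[i] - codebook[winner])
            step += 1
    return codebook

def assign(X, codebook):
    """Return the index of the nearest codebook vector for every record."""
    d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Synthetic stand-in data (NOT the R15 dataset): 600 points around (10, 10).
rng = np.random.default_rng(42)
X = rng.normal(loc=10.0, scale=3.0, size=(600, 2))

codebook = kohonen_vq(X, n_clusters=10)
labels = assign(X, codebook)
sizes = np.bincount(labels, minlength=10)  # cluster sizes, e.g. for a pie chart
```

With the real R15 records in `X`, `codebook` would hold the ten cluster centers and `sizes` the cluster frequencies.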

Figure 23.11. The result of Kohonen's vector quantization

The size of clusters can be depicted by a simple pie chart.

Figure 23.12. The pie chart of cluster size

A table displays the statistics that characterize the clusters, including the frequency of each cluster, its standard deviation, the maximum distance from the cluster center, and the index of the nearest cluster together with the distance to it.
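Statistics of this kind can be computed directly from the cluster assignments. The sketch below is one possible reading of the listed quantities. The text does not give the exact definitions used by the tool, so the function name, the field names, and the use of root-mean-square distance as the spread measure are assumptions.

```python
import numpy as np

def cluster_statistics(X, labels, centers):
    """Per-cluster summary: size, spread, radius, and nearest other cluster."""
    # Pairwise distances between cluster centers; a cluster is not its own neighbor.
    gaps = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(gaps, np.inf)
    rows = []
    for c in range(len(centers)):
        members = X[labels == c]
        dist = np.linalg.norm(members - centers[c], axis=1)
        nearest = int(gaps[c].argmin())
        rows.append({
            "cluster": c,
            "frequency": len(members),
            # Assumed spread measure: root-mean-square distance from the center.
            "rms_distance": float(np.sqrt((dist ** 2).mean())) if len(members) else float("nan"),
            "max_distance": float(dist.max()) if len(members) else float("nan"),
            "nearest_cluster": nearest,
            "distance_to_nearest": float(gaps[c, nearest]),
        })
    return rows

# Tiny hand-made example (not taken from the dataset).
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0], [20.0, 1.0]])
labels = np.array([0, 0, 1, 1, 2])
centers = np.array([X[labels == c].mean(axis=0) for c in range(3)])

stats = cluster_statistics(X, labels, centers)
for row in stats:
    print(row)
```

In the small example, cluster 0 has two members at distance 1 from its center, and its nearest neighbor is cluster 1 at distance 10.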

Figure 23.13. Statistics of clusters

Next, the batch SOM algorithm is applied to the same dataset. In this case, the number of rows and the number of columns of the map must be specified; `6` was chosen. The results are shown in the following two figures. The first is the schematic plot that the SOM/Kohonen operator produces for the resulting map, where the coloring shows the frequency of each cell.
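A batch SOM on a `6` by `6` map can be sketched as follows. In every iteration each record is assigned to its best matching unit, and then every unit is recomputed as a neighborhood-weighted mean of all records. This is a minimal sketch, not the `SOM/Kohonen` operator itself. The Gaussian neighborhood, the shrinking radius schedule, the parameter names, and the random stand-in data are assumptions.

```python
import numpy as np

def best_matching_units(X, weights):
    """Index of the closest map unit for every record."""
    d2 = ((X[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def batch_som(X, n_rows=6, n_cols=6, n_iter=30, sigma0=3.0, sigma_end=0.5, seed=0):
    """Batch SOM with a Gaussian neighborhood on a rectangular grid (sketch)."""
    rng = np.random.default_rng(seed)
    n_units = n_rows * n_cols
    # (row, column) position of every unit on the map.
    grid = np.array([(r, c) for r in range(n_rows) for c in range(n_cols)], dtype=float)
    grid_d2 = ((grid[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    # Initialize the units with randomly chosen distinct records.
    weights = X[rng.choice(len(X), size=n_units, replace=False)].astype(float)
    for t in range(n_iter):
        # Assumed schedule: the neighborhood radius shrinks geometrically.
        sigma = sigma0 * (sigma_end / sigma0) ** (t / max(n_iter - 1, 1))
        bmu = best_matching_units(X, weights)
        # h[j, i]: influence of record i on unit j, based on the map distance
        # between unit j and the best matching unit of record i.
        h = np.exp(-grid_d2[:, bmu] / (2.0 * sigma ** 2))
        # Batch update: every unit becomes a weighted mean of all records.
        weights = (h @ X) / h.sum(axis=1, keepdims=True)
    return weights

# Synthetic stand-in data (NOT the R15 dataset): 600 points around (10, 10).
rng = np.random.default_rng(42)
X = rng.normal(loc=10.0, scale=3.0, size=(600, 2))

weights = batch_som(X)
bmu = best_matching_units(X, weights)
# Number of records per map cell, the quantity shown by the cell coloring.
hits = np.bincount(bmu, minlength=36).reshape(6, 6)
```

The `hits` array corresponds to the cell frequencies, while `bmu` gives a cluster label per record that can be plotted in the original coordinate system.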

Figure 23.14. Graphical representation of the SOM

The second figure is a scatterplot that displays the resulting clusters in the coordinate system of the original input attributes.

Figure 23.15. Scatterplot of the result of SOM

### Interpretation of the results

The experiment shows how to use two unsupervised data mining techniques, vector quantization and self-organizing maps. Both methods are particularly effective for examining `2`-dimensional data. However, as important prototype-based methods, they can also greatly simplify further analysis in higher dimensions.

### Workflow

`sas_clust2_exp2.xml`

### Keywords

vector quantization (VQ), self-organizing map (SOM), clustering

### Operators

Data Source, Graph Explore, Self-organizing Map