Example of Cluster K-Means

You live-trap, anesthetize, and measure one hundred forty-three black bears. The measurements are total length and head length (Length, Head.L), total weight and head weight (Weight, Head.W), and neck girth and chest girth (Neck.G, Chest.G). You wish to classify these 143 bears as small, medium-sized, or large bears. You know that the second, seventy-eighth, and fifteenth bears in the sample are typical of the three respective categories. First, you create an initial partition column with the three seed bears designated as 1 = small, 2 = medium-sized, 3 = large, and with the remaining bears as 0 (unknown) to indicate initial cluster membership. Then you perform K-means clustering and store the cluster membership in a column named BearSize.

1 Open the worksheet BEARS.MTW.

2 To create the initial partition column, choose Calc > Make Patterned Data > Simple Set of Numbers.

3 In Store patterned data in, enter Initial for the storage column name.

4 In both From first value and To last value, enter 0.

5 In List each value, enter 143. Click OK.

6 Go to the Data window and enter 1, 2, and 3 in the second, seventy-eighth, and fifteenth rows, respectively, of the column named Initial.

7 Choose Stat > Multivariate > Cluster K-Means.

8 In Variables, enter 'Head.L'-Weight.

9 Under Specify Partition by, choose Initial partition column and enter Initial.

10 Check Standardize variables.

11 Click Storage. In Cluster membership column, enter BearSize.

12 Click OK in each dialog box.
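For readers who want to see the idea behind steps 2 through 12 outside Minitab, here is a minimal Python sketch. It assumes a plain nearest-centroid iteration on standardized variables, started from the centroids of the seed rows; Minitab's own algorithm may differ in its details. The function names (standardize, centroid, kmeans_from_partition) and the small data set are made up for illustration and are not the BEARS.MTW measurements.

```python
import math

def standardize(rows):
    """Rescale each column to mean 0 and sample standard deviation 1."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    sds = [math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
           for col, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def centroid(points):
    """Vector of column means for a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans_from_partition(rows, initial, max_iter=100):
    """Cluster rows from a partition column (0 = unknown, 1..k = seed label).

    Assumes every label 1..k appears at least once in `initial`.
    """
    k = max(initial)
    # Starting centroids are the means of the seed rows carrying each label.
    centroids = [centroid([r for r, g in zip(rows, initial) if g == c])
                 for c in range(1, k + 1)]
    labels = list(initial)
    for _ in range(max_iter):
        # Assign every row to its nearest centroid (Euclidean distance).
        new_labels = [1 + min(range(k), key=lambda j: math.dist(r, centroids[j]))
                      for r in rows]
        if new_labels == labels:
            break  # memberships stopped changing
        labels = new_labels
        for c in range(1, k + 1):
            members = [r for r, g in zip(rows, labels) if g == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c - 1] = centroid(members)
    return labels, centroids

# Steps 2-6: a column of 143 zeros, then mark the three seed bears
# (worksheet rows are 1-based, Python indexes are 0-based).
initial = [0] * 143
for row, label in [(2, 1), (78, 2), (15, 3)]:
    initial[row - 1] = label

# Made-up two-variable data, seeded at rows 1, 5, and 8.
toy_rows = [[1.0, 10], [1.2, 11], [0.9, 9],
            [5.0, 50], [5.2, 52], [4.8, 49],
            [9.0, 90], [9.1, 92], [8.9, 88]]
toy_initial = [1, 0, 0, 0, 2, 0, 0, 3, 0]
bear_size, centroids = kmeans_from_partition(standardize(toy_rows), toy_initial)
print(bear_size)
```

In the toy data the three groups are well separated, so each unlabeled row joins the cluster of the seed closest to it. Standardizing first keeps a variable measured on a large scale (such as Weight) from dominating the distance calculation.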

Session window output

K-means Cluster Analysis: Head.L, Head.W, Neck.G, Length, Chest.G, Weight

Standardized Variables

Final Partition

Number of clusters: 3

                         Within   Average   Maximum
                        cluster  distance  distance
             Number of   sum of      from      from
          observations  squares  centroid  centroid
Cluster1            41   63.075     1.125     2.488
Cluster2            67   78.947     0.997     2.048
Cluster3            35   65.149     1.311     2.449

Cluster Centroids

                                           Grand
Variable  Cluster1  Cluster2  Cluster3  centroid
Head.L     -1.0673    0.0126    1.2261   -0.0000
Head.W     -0.9943   -0.0155    1.1943    0.0000
Neck.G     -1.0244   -0.1293    1.4476   -0.0000
Length     -1.1399    0.0614    1.2177    0.0000
Chest.G    -1.0570   -0.0810    1.3932   -0.0000
Weight     -0.9460   -0.2033    1.4974   -0.0000

Distances Between Cluster Centroids

          Cluster1  Cluster2  Cluster3
Cluster1    0.0000    2.4233    5.8045
Cluster2    2.4233    0.0000    3.4388
Cluster3    5.8045    3.4388    0.0000

Interpreting the results

K-means clustering classified the 143 bears as 41 small bears, 67 medium-sized bears, and 35 large bears. The first table displays, for each cluster, the number of observations, the within-cluster sum of squares, the average distance from an observation to the cluster centroid, and the maximum distance from an observation to the cluster centroid. In general, a cluster with a small sum of squares is more compact than one with a large sum of squares. Because the sum of squares also grows with the number of observations, read it alongside the average distance: Cluster2 has the largest sum of squares (78.947) but also the most observations (67) and the smallest average distance (0.997). The centroid is the vector of variable means for the observations in a cluster and serves as the cluster midpoint.
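To make the four quantities in the first table concrete, here is a minimal Python sketch that computes them for one cluster. It assumes Euclidean distance and uses a hypothetical helper named cluster_summary with made-up points; it is not Minitab's code.

```python
import math

def cluster_summary(points):
    """Return (n, within-cluster sum of squares, average distance, maximum distance)."""
    n = len(points)
    center = [sum(col) / n for col in zip(*points)]  # centroid: vector of variable means
    dists = [math.dist(p, center) for p in points]   # distance of each point to the centroid
    wss = sum(d ** 2 for d in dists)                 # within-cluster sum of squares
    return n, wss, sum(dists) / n, max(dists)

# Made-up points: the corners of a 2 x 2 square, so the centroid is (1, 1).
n, wss, avg_dist, max_dist = cluster_summary([(0, 0), (2, 0), (0, 2), (2, 2)])
print(n, round(wss, 3), round(avg_dist, 3), round(max_dist, 3))
```

Every corner lies the same distance (the square root of 2) from the centroid, so the average and maximum distances coincide and the sum of squares is 4 x 2 = 8. In a real cluster the maximum exceeds the average, as it does for all three bear clusters.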

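The second table displays the centroids for the individual clusters, and the third table gives the distances between cluster centroids. Because the variables were standardized, the centroid values are in standard deviation units and the grand centroid is zero for every variable.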
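You can check the third table against the second by hand. The following Python sketch, which assumes the distances are Euclidean, recomputes the distances between centroids from the printed centroid values and also confirms that the size-weighted mean of the three centroids is close to the grand centroid of zero. Because the printed centroids are rounded to four decimals, expect agreement with the table to about three decimal places rather than exactly.

```python
import math

variables = ["Head.L", "Head.W", "Neck.G", "Length", "Chest.G", "Weight"]
# Centroid values and cluster sizes copied from the session window output.
centroids = {
    "Cluster1": [-1.0673, -0.9943, -1.0244, -1.1399, -1.0570, -0.9460],
    "Cluster2": [0.0126, -0.0155, -0.1293, 0.0614, -0.0810, -0.2033],
    "Cluster3": [1.2261, 1.1943, 1.4476, 1.2177, 1.3932, 1.4974],
}
sizes = {"Cluster1": 41, "Cluster2": 67, "Cluster3": 35}

# Euclidean distance between each pair of cluster centroids.
names = list(centroids)
distances = {(a, b): math.dist(centroids[a], centroids[b])
             for i, a in enumerate(names) for b in names[i + 1:]}
for pair, d in distances.items():
    print(pair, round(d, 3))

# Size-weighted mean of the centroids; it should be near 0 for every variable.
total = sum(sizes.values())
grand = [sum(sizes[c] * centroids[c][j] for c in names) / total
         for j in range(len(variables))]
for name, g in zip(variables, grand):
    print(name, round(g, 3))
```

The small and large clusters are the farthest apart, and the medium cluster sits closer to the small cluster than to the large one, which matches the ordering you would expect from the three size categories.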

The column BearSize contains the cluster designations.
