Example of Discriminant Analysis

To regulate catches of salmon stocks, it is desirable to identify fish as being of Alaskan or Canadian origin. Fifty fish from each place of origin were caught, and the growth ring diameters of their scales were measured for the time when they lived in freshwater and for the subsequent time when they lived in saltwater. The goal is to be able to identify newly caught fish as being from Alaskan or Canadian stocks. The example and data are from [6], pages 519-520.

1    Open the worksheet EXH_MVAR.MTW.

2    Choose Stat > Multivariate > Discriminant Analysis.

3    In Groups, enter SalmonOrigin.

4    In Predictors, enter Freshwater Marine. Click OK.

Session window output

Discriminant Analysis: SalmonOrigin versus Freshwater, Marine

Linear Method for Response: SalmonOrigin

Predictors: Freshwater, Marine

Group    Alaska    Canada
Count        50        50

 

 

Summary of classification

                  True Group
Put into Group  Alaska  Canada
Alaska              44       1
Canada               6      49
Total N             50      50
N correct           44      49
Proportion       0.880   0.980

N = 100           N Correct = 93           Proportion Correct = 0.930

 

 

Squared Distance Between Groups

         Alaska   Canada
Alaska  0.00000  8.29187
Canada  8.29187  0.00000

 

 

Linear Discriminant Function for Groups

             Alaska  Canada
Constant    -100.68  -95.14
Freshwater     0.37    0.50
Marine         0.38    0.33

 

 

Summary of Misclassified Observations

                                                Squared
Observation    True Group  Pred Group   Group  Distance  Probability
          1**      Alaska      Canada  Alaska     3.544        0.428
                                       Canada     2.960        0.572
          2**      Alaska      Canada  Alaska    8.1131        0.019
                                       Canada    0.2729        0.981
         12**      Alaska      Canada  Alaska    4.7470        0.118
                                       Canada    0.7270        0.882
         13**      Alaska      Canada  Alaska    4.7470        0.118
                                       Canada    0.7270        0.882
         30**      Alaska      Canada  Alaska     3.230        0.289
                                       Canada     1.429        0.711
         32**      Alaska      Canada  Alaska     2.271        0.464
                                       Canada     1.985        0.536
         71**      Canada      Alaska  Alaska     2.045        0.948
                                       Canada     7.849        0.052

Interpreting the results

As shown in the Summary of classification table, the discriminant analysis correctly classified 93 of the 100 fish (93%). Alaskan fish were classified correctly less often (44/50, or 88%) than Canadian fish (49/50, or 98%). To identify a newly caught fish, compute the linear discriminant function for each group from the fish's freshwater and marine measurements, and assign the fish to the group whose function value is higher. You can do this either with Calc > Calculator, using stored or output values, or by performing the discriminant analysis again and predicting group membership for the new observations.
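
The classification rule described above can be sketched outside Minitab as follows. This is a minimal illustration in Python, not part of the Minitab procedure. The coefficients are copied from the Linear Discriminant Function for Groups table, where they are rounded to two decimals, so the scores are approximate. The function names and the two sets of measurements are hypothetical and are not taken from the worksheet.

```python
# Coefficients copied from the "Linear Discriminant Function for Groups" table.
# They are rounded in the session output, so the scores below are approximate.
COEFFS = {
    "Alaska": {"constant": -100.68, "freshwater": 0.37, "marine": 0.38},
    "Canada": {"constant": -95.14, "freshwater": 0.50, "marine": 0.33},
}


def discriminant_scores(freshwater, marine):
    """Return the linear discriminant function value for each group."""
    return {
        group: c["constant"] + c["freshwater"] * freshwater + c["marine"] * marine
        for group, c in COEFFS.items()
    }


def classify(freshwater, marine):
    """Assign a fish to the group with the higher discriminant function value."""
    scores = discriminant_scores(freshwater, marine)
    return max(scores, key=scores.get)


# Hypothetical measurements for two newly caught fish (illustration only).
print(classify(freshwater=100, marine=430))  # Alaska scores higher here
print(classify(freshwater=140, marine=370))  # Canada scores higher here
```

Because the printed coefficients are rounded, a fish whose two scores are nearly equal could be assigned differently than Minitab would assign it using full-precision stored values.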

The Summary of Misclassified Observations table shows, for each misclassified observation, the squared distance from the observation to each group centroid (mean vector) and the posterior probability of membership in each group. Each observation is assigned to the group with the highest posterior probability. For example, observation 1 is a true Alaskan fish, but it is closer to the Canadian centroid (2.960 versus 3.544) and has a higher posterior probability for Canada (0.572 versus 0.428), so it is classified as Canadian.
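
The link between the squared distances and the posterior probabilities in the table can be illustrated with a short Python sketch. It assumes equal prior probabilities for the two groups and a linear discriminant function, in which case each posterior probability is proportional to exp(-d²/2), where d² is the squared distance to that group's centroid. Under that assumption, the sketch reproduces the probabilities printed for the observations checked below. The function name is hypothetical.

```python
import math


def posterior_from_squared_distances(sq_dist):
    """Convert squared distances to posterior probabilities.

    Assumes equal priors and a common covariance matrix (linear method),
    so each group's weight is exp(-0.5 * squared distance).
    """
    weights = {group: math.exp(-0.5 * d) for group, d in sq_dist.items()}
    total = sum(weights.values())
    return {group: w / total for group, w in weights.items()}


# Squared distances for observation 1, copied from the session output.
obs1 = posterior_from_squared_distances({"Alaska": 3.544, "Canada": 2.960})
print({group: round(p, 3) for group, p in obs1.items()})
# Agrees with the table: about 0.428 for Alaska and 0.572 for Canada.

# Squared distances for observation 71, copied from the session output.
obs71 = posterior_from_squared_distances({"Alaska": 2.045, "Canada": 7.849})
print(max(obs71, key=obs71.get))  # Alaska, the predicted group in the table
```

If the groups were given unequal prior probabilities, each weight would also be multiplied by its group's prior before normalizing.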