Using Cross-Validation

Cross-validation estimates the predictive ability of candidate models to help you determine how many components to retain in your model. Use cross-validation if you do not know the optimal number of components. When the data contain multiple response variables, Minitab validates the components for all responses simultaneously. For more information, see [18].
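The idea of scoring each candidate number of components by its predictive ability can be sketched in Python. This is a minimal illustration under stated assumptions, not Minitab's implementation: the functions fit_pls1 and loo_press are hypothetical names, the model is a plain single-response NIPALS partial least squares fit used as a stand-in, the score is assumed to be the prediction sum of squares (PRESS) and the predicted R-squared derived from it, and the data values are made up for demonstration.

```python
import math

def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model (NIPALS) and return a predict function.

    Hypothetical helper for illustration only.
    """
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered predictors
    f = [v - y_mean for v in y]                                 # centered response
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in predictor space most covariant with y
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break  # nothing left to explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate predictors and response before extracting the next component
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))

    def predict(x):
        e = [x[j] - x_mean[j] for j in range(m)]
        y_hat = y_mean
        for w, p, q in comps:
            t = sum(e[j] * w[j] for j in range(m))
            y_hat += q * t
            e = [e[j] - t * p[j] for j in range(m)]
        return y_hat

    return predict

def loo_press(X, y, max_components):
    """Leave-one-out PRESS for models with 1..max_components components."""
    press = []
    for k in range(1, max_components + 1):
        total = 0.0
        for i in range(len(X)):
            # Refit without observation i, then predict the omitted observation
            predict = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:], k)
            total += (y[i] - predict(X[i])) ** 2
        press.append(total)
    return press

# Made-up illustrative data: 8 observations, 3 correlated predictors
X = [[1.0, 2.1, 0.5], [2.0, 3.9, 1.1], [3.0, 6.2, 1.4], [4.0, 8.1, 2.2],
     [5.0, 9.8, 2.4], [6.0, 12.1, 3.1], [7.0, 13.9, 3.4], [8.0, 16.2, 4.1]]
y = [3.1, 5.0, 7.2, 8.9, 11.1, 12.8, 15.2, 16.9]

press = loo_press(X, y, max_components=3)
y_bar = sum(y) / len(y)
ss_total = sum((v - y_bar) ** 2 for v in y)
r2_pred = [1.0 - p / ss_total for p in press]  # predicted R-squared per model
best_k = press.index(min(press)) + 1           # smallest PRESS = highest predicted R-squared
print(best_k, press, r2_pred)
```

In this sketch the model with the smallest PRESS is retained. Because the whole fit is repeated once per omitted observation and once per candidate number of components, the cost grows quickly with the size of the data set.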

Minitab provides the following cross-validation methods:

·    Leave-one-out: Calculates the candidate models leaving out one observation at a time. For large data sets, this method can be time-consuming because the models are recalculated as many times as there are observations.

·    Leave-group-out of size: Calculates the models leaving out a group of observations at a time, where you specify the group size. This reduces the number of times the models must be recalculated. This method is most appropriate when you have a large data set.

·    Leave-out-as-specified-in-column: Calculates the models, leaving out together all observations that have the same number in the group identifier column, which you create in the worksheet. This method allows you to specify which observations are omitted together. For example, if the group identifier column contains the numbers 1, 2, and 3, all observations with 1 are omitted together and the model is recalculated. Next, all observations with 2 are omitted and the model is recalculated, and so on. In this case, the model is recalculated a total of 3 times. The group identifier column must be the same length as your response and predictor columns and cannot contain missing values.
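The three methods differ only in which observations are omitted together. The following Python sketch builds the sets of omitted row indices for each method. The function names are hypothetical, and the assignment of consecutive rows to groups in leave_group_out is an assumption made for illustration; it is not taken from Minitab, which may assign groups differently.

```python
def leave_one_out(n_obs):
    """One omitted set per observation, so n_obs refits."""
    return [[i] for i in range(n_obs)]

def leave_group_out(n_obs, size):
    """Omit consecutive blocks of `size` rows (block assignment is an assumption)."""
    if size < 1:
        raise ValueError("group size must be at least 1")
    return [list(range(start, min(start + size, n_obs)))
            for start in range(0, n_obs, size)]

def leave_out_by_column(group_ids, n_obs):
    """Omit together all rows that share the same group identifier."""
    if len(group_ids) != n_obs:
        raise ValueError("group column must match the length of the data columns")
    if any(g is None for g in group_ids):
        raise ValueError("group column cannot contain missing values")
    groups = {}
    for row, g in enumerate(group_ids):
        groups.setdefault(g, []).append(row)
    return [groups[g] for g in sorted(groups)]

n = 6
loo = leave_one_out(n)                                    # 6 refits
blocks = leave_group_out(n, size=2)                       # 3 refits
by_col = leave_out_by_column([1, 2, 3, 1, 2, 3], n)       # 3 refits
print(len(loo), blocks, by_col)
# prints: 6 [[0, 1], [2, 3], [4, 5]] [[0, 3], [1, 4], [2, 5]]
```

The number of omitted sets equals the number of times the models are recalculated, which is why the grouped methods are cheaper than leave-one-out on large data sets.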