Method of obtaining probability plot points

Probability Plot creates an estimated cumulative distribution function (cdf) from your sample by plotting the value of each observation (including repeated values) against its estimated cumulative probability.

Estimated cumulative probability is calculated by one of the following formulas, according to what is selected in Tools > Options > Individual Graphs > Probability Plots (the default is median rank). For each formula, let n equal the number of observations and i equal the rank-order of each observation such that i = 1 for the smallest value and i = n for the largest.

Method                          Formula

Median Rank (Benard)            (i - 0.3) / (n + 0.4)
Mean Rank (Herd-Johnson)        i / (n + 1)
Modified Kaplan-Meier (Hazen)   (i - 1/2) / n
Kaplan-Meier                    i / n
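The four formulas can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Minitab's implementation; the function name plotting_positions and the method keys are hypothetical, and tied values are assumed to receive consecutive ranks in sorted order.

```python
def plotting_positions(n, method="median_rank"):
    """Return the estimated cumulative probability for each rank i = 1..n.

    The method keys are illustrative names for the four formulas in the
    table above; they are not Minitab option names.
    """
    formulas = {
        "median_rank": lambda i: (i - 0.3) / (n + 0.4),      # Benard
        "mean_rank": lambda i: i / (n + 1),                  # Herd-Johnson
        "modified_kaplan_meier": lambda i: (i - 0.5) / n,    # Hazen
        "kaplan_meier": lambda i: i / n,
    }
    formula = formulas[method]
    return [formula(i) for i in range(1, n + 1)]


# Made-up sample with a repeated value; each observation keeps its own rank.
sample = [12.1, 9.8, 15.3, 9.8, 11.0]
ordered = sorted(sample)
for value, p in zip(ordered, plotting_positions(len(ordered))):
    print(f"{value:6.2f}  {p:.4f}")
```

Note that the Kaplan-Meier formula assigns probability 1 to the largest observation, which most of the y-axis transformations cannot represent as a finite value; the other three formulas keep every point strictly between 0 and 1.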

 

The fitted distribution line represents the cdf for the chosen theoretical distribution with the indicated parameters (either estimated or historical). If you do not provide historical parameters, Minitab will estimate the parameters using least squares estimation (normal or lognormal distribution) or maximum likelihood estimation (other distributions).

The y-values (and in some cases the x-values) are transformed so that the fitted line is linear. Tick labels, however, remain consistent with the untransformed values. Thus, to the extent that the chosen distribution fits your data, the plotted points form a straight line.
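As a worked illustration of why such a transformation straightens the fit, consider the Weibull case. The symbols \alpha (scale) and \beta (shape) are notation assumed here for the sketch; they do not appear elsewhere in this topic.

```latex
F(x) = 1 - \exp\!\left[-\left(\frac{x}{\alpha}\right)^{\beta}\right]
\quad\Longrightarrow\quad
\ln\bigl(-\ln(1 - F(x))\bigr) = \beta \ln x - \beta \ln \alpha
```

Under that assumption, plotting ln(-ln(1 - p)) against ln(data) gives a straight line with slope \beta and intercept -\beta ln \alpha whenever the data follow the Weibull distribution.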

The table below shows the transformations used for each distribution.

Distribution               X-coordinate           Y-coordinate (score)

Normal                     data                   Φ⁻¹(p)
Lognormal                  ln(data)               Φ⁻¹(p)
3-parameter lognormal      ln(data - threshold)   Φ⁻¹(p)
Gamma                      ln(data)               G⁻¹(p), k
3-parameter gamma          ln(data - threshold)   G⁻¹(p), k
Exponential                ln(data)               ln(-ln(1 - p))
2-parameter exponential    ln(data - threshold)   ln(-ln(1 - p))
Smallest extreme value     data                   ln(-ln(1 - p))
Weibull                    ln(data)               ln(-ln(1 - p))
3-parameter Weibull        ln(data - threshold)   ln(-ln(1 - p))
Largest extreme value      data                   -ln(-ln(p))
Logistic                   data                   ln(p / (1 - p))
Loglogistic                ln(data)               ln(p / (1 - p))
3-parameter loglogistic    ln(data - threshold)   ln(p / (1 - p))

where:

data        = data value for the observation
ln(x)       = natural log of x
Φ⁻¹(p)      = value returned for p by the inverse cdf for the standard normal distribution
G⁻¹(p), k   = value returned for p by the inverse cdf for a gamma distribution with shape = k and scale = 1. Minitab uses the estimated shape parameter unless you enter a historical value.
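The transformations in the table can be sketched with the Python standard library. The function names x_coord and y_score and the score labels are hypothetical, chosen only for this illustration. The gamma score is left out because the standard library has no inverse gamma cdf; a numerical library would be needed for it.

```python
import math
from statistics import NormalDist


def x_coord(value, log_scale, threshold=0.0):
    """X-coordinate: the raw value, or ln(value - threshold) for log-scale rows."""
    return math.log(value - threshold) if log_scale else value


def y_score(p, score):
    """Y-coordinate for an estimated cumulative probability p with 0 < p < 1.

    The score labels are illustrative groupings of the table rows:
      "normal"         normal and lognormal rows
      "smallest_ev"    exponential, smallest extreme value, and Weibull rows
      "largest_ev"     largest extreme value row
      "logistic"       logistic and loglogistic rows
    """
    if score == "normal":
        return NormalDist().inv_cdf(p)        # inverse standard normal cdf
    if score == "smallest_ev":
        return math.log(-math.log(1.0 - p))
    if score == "largest_ev":
        return -math.log(-math.log(p))
    if score == "logistic":
        return math.log(p / (1.0 - p))
    raise ValueError(f"unknown score type: {score}")


# Example: one point on a 3-parameter Weibull plot, using made-up numbers.
print(x_coord(7.5, log_scale=True, threshold=2.0), y_score(0.25, "smallest_ev"))
```

Every score function requires p strictly between 0 and 1, so a plotting position of exactly 1 has no finite y-coordinate.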

If you plot data without subtracting the threshold, the points do not form a straight line even when the distribution fits, so a straight line cannot be used to judge the fit.