In theory there is no difference between Theory and Practice.
... In practice there is.
There is a profound difference between the mathematical behavior of a function whose parameter values are
defined (e.g. the FORM/SORM paradigm) and the same function whose parameter values must be estimated from data.
In the first instance it is not unreasonable to ask "What is the probability that a point is some number of
(rotated and normalized) standard deviations from the given joint mean?"
In the second instance the question
becomes "Given these experimental observations, what is the probability of some future observation being at
least as large (or small) as some point of interest?" These are not synonymous interrogatives, and even some
very famous statisticians had difficulty seeing the distinction.
Karl Pearson (1857-1936), who founded the prestigious statistical journal Biometrika, for whom the Pearson
correlation coefficient is named, and who invented the chi-square test, failed to appreciate that he was
incorrectly using the sample means as though they were the population means, treating the means as known when
they had in fact been estimated from the data. This led to a famous row with another luminary, R. A. Fisher
(1890-1962), who wrote the first book on the design of experiments and who revolutionized statistics with the
concept of likelihood and the estimation of parameters by maximizing it.
Fisher pointed out that Pearson had misunderstood his own chi-square test and was therefore calculating the
probability of failure incorrectly. The resulting acrimonious and vitriolic row lasted for years and was
finally resolved in 1926 in Fisher's favor, based on data collected, ironically, by Egon Pearson, Karl
Pearson's son, who published the results of 11,688 2x2 contingency tables observed under random sampling.
Since the mean of a chi-square distribution equals its number of degrees of freedom, if Karl Pearson had been
correct the observed mean value of chi-square would have been three; Fisher said it would be one, and so it
was (1.00001). As a result,
the elder Pearson's calculations would accept as having a 5% probability of occurrence an outcome whose true
probability was 0.5% - an error of 10x, effectively increasing his Type II error rate (failing to reject a
false null hypothesis) by the same factor.
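The tenfold discrepancy can be checked directly. A minimal sketch, not from the source: 7.815 is the tabulated 5% critical value for a chi-square distribution with three degrees of freedom (Pearson's choice), and the closed-form survival functions below hold for chi-square with one and three degrees of freedom (Fisher's df for a 2x2 table is one).

```python
import math

def chi2_sf_df1(x):
    # P(X > x) for chi-square with 1 degree of freedom
    return math.erfc(math.sqrt(x / 2.0))

def chi2_sf_df3(x):
    # P(X > x) for chi-square with 3 degrees of freedom
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

crit = 7.815                    # 5% critical value from the df = 3 table Pearson used
p_nominal = chi2_sf_df3(crit)   # ~0.05: the tail probability Pearson believed he had
p_true = chi2_sf_df1(crit)      # ~0.005: the actual tail probability under df = 1

print(f"nominal p = {p_nominal:.4f}, true p = {p_true:.4f}, ratio = {p_nominal / p_true:.1f}")
```

An observation that just reaches Pearson's 5% threshold is in fact roughly ten times rarer than he supposed, which is the source of the 10x error quoted above.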
The lesson here
is that really smart people can make this mistake and the consequences can be severe.
- Alan Agresti, Categorical Data Analysis, 2nd ed., Wiley, 2002, sec. 16.2.
- Joan Fisher Box, R. A. Fisher: The Life of a Scientist, Wiley, 1978.