There is a profound difference between the mathematical behavior of a function whose parameter values are defined (e.g. the FORM/SORM paradigm) and the same function whose parameter values must be estimated from data. In the first instance it is not unreasonable to ask "What is the probability that a point is some number of (rotated and normalized) standard deviations from the given joint mean?"
In the second instance the question becomes "Given these experimental observations, what is the probability of some future observation being at least as large (or small) as some point of interest?" These are not synonymous interrogatives, and even some very famous statisticians had difficulty seeing the distinction.
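The gap between the two questions can be made concrete with a small simulation. What follows is a minimal sketch under assumptions that are not in the text above: standard normal data, a sample of only five observations, a "three standard deviations above the mean" point of interest, and the hypothetical helper names normal_tail and exceedance_rate.

```python
import random
from math import erfc, sqrt

def normal_tail(z):
    """P(Z > z) for a standard normal Z (parameters treated as known)."""
    return 0.5 * erfc(z / sqrt(2))

def exceedance_rate(n, k=3.0, trials=50_000, seed=1):
    """Fraction of trials in which a future observation exceeds xbar + k*s,
    where xbar and s are estimated from n earlier observations."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        xbar = sum(sample) / n
        s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        future = rng.gauss(0.0, 1.0)  # the "future observation"
        if future > xbar + k * s:
            hits += 1
    return hits / trials

# Known parameters: chance of landing more than 3 sigma above the mean.
known = normal_tail(3.0)
# Estimated parameters: chance of landing above the *estimated* 3-sigma point.
estimated = exceedance_rate(n=5)

print(f"known parameters:     {known:.5f}")
print(f"estimated parameters: {estimated:.5f}")
```

With so few observations the simulated rate comes out well above the known-parameter value, because the estimated mean and standard deviation are themselves uncertain. Treating them as exact understates the tail probability.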
Karl Pearson (1857-1936), who founded the prestigious statistical journal Biometrika, invented the chi-square test, and for whom the Pearson correlation coefficient is named, failed to appreciate that he was using sample means as though they were population means, treating the means as known when they had in fact been estimated from the data. This led to a famous row with another luminary, R. A. Fisher (1890-1962), who wrote the first book on the design of experiments and who revolutionized statistics with the concept of likelihood and the estimation of parameters by maximizing their likelihood.
Fisher pointed out that Pearson had misunderstood his own chi-square test and was therefore calculating probabilities incorrectly. The resulting acrimonious row lasted years and was finally resolved in 1926 in Fisher's favor, based on data collected, ironically, by Egon Pearson, Karl Pearson's son, who published the results of 11,688 2x2 contingency tables observed under random sampling. The mean of a chi-square distribution equals its degrees of freedom, so the dispute came down to three degrees of freedom versus one. If Karl Pearson had been correct, the observed mean value of chi-square would have been three. Fisher said it would be one, and it was (1.00001). As a result, the elder Pearson's calculations would assign a 5% probability of occurrence to something with a true probability of about 0.5%, an error of a factor of 10 that makes the test far more likely to fail to reject a false null hypothesis (a Type II error).
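Both numerical claims above can be checked with a short sketch. It is an illustration only, not a reconstruction of Egon Pearson's data: the tables are simulated under independence with assumed cell probabilities and table sizes, the function names are hypothetical, and the chi-square tail probabilities use the closed forms that exist for one and three degrees of freedom.

```python
import random
from math import erfc, exp, pi, sqrt

def chi2_sf_df1(x):
    """P(chi-square > x) with 1 degree of freedom."""
    return erfc(sqrt(x / 2))

def chi2_sf_df3(x):
    """P(chi-square > x) with 3 degrees of freedom."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]].
    Returns None if a row or column total is zero."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return None
    return n * (a * d - b * c) ** 2 / denom

def mean_chi2(n_tables=4000, n_obs=100, p_row=0.5, p_col=0.5, seed=0):
    """Average chi-square over simulated tables with independent rows and columns."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_tables):
        counts = [0, 0, 0, 0]
        for _ in range(n_obs):
            r = rng.random() < p_row   # row membership
            c = rng.random() < p_col   # column membership, independent of row
            counts[2 * r + c] += 1
        stat = pearson_chi2_2x2(*counts)
        if stat is not None:
            values.append(stat)
    return sum(values) / len(values)

x = 7.815  # approximate 5% critical value for 3 degrees of freedom
print(f"P(chi2 > {x}) with 3 df: {chi2_sf_df3(x):.4f}")
print(f"P(chi2 > {x}) with 1 df: {chi2_sf_df1(x):.4f}")
print(f"mean chi-square over simulated 2x2 tables: {mean_chi2():.3f}")
```

The same statistic that looks like a 5% event under three degrees of freedom is roughly a 0.5% event under one, and the simulated mean sits near one rather than three.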
The lesson here is that really smart people can make this mistake and the consequences can be severe.