Gaining Experience

Gaining Experience:

While it may be true that only real data can provide real experience, even fighter pilots use simulators. The spreadsheet provides an easy mechanism for acquiring experience with things such as the influence of sample size, the most effective experimental design, the influence of data from less than optimal experiments, dealing with convergence difficulties, and as important as anything, the influence of happenstance when repeated, nominally identical, experiments produce sometimes surprising differences. Simulated responses can be generated by creating a cell (C5) containing the probability of a single observation, based on current values of L and S, as shown in Table 4.

Table 4

 column C EXCEL IF(RAND() > \$G5, 0, 1) Quattro Pro @IF(@RAND > \$G5, 0, 1) Lotus 1-2-3 @IF(@RAND > \$G5, 0, 1)

RAND() generates a uniform random variate on (0,1) and the IF compares it with p (in column G), which is linked to the explanatory variable, and returns a 1 if true and 0 if false. Copy this cell into the column to contain the "response" data (column C), and the first realization of outputs is generated. Since this column will be automatically recalculated any time any cell is updated, its values need to be fixed before they can be analyzed. To do this, highlight the column, choose COPY, then PASTE VALUES. (This is usually done with icons and a few mouse clicks.) The column will no longer contain the generating functions, only zeros and ones.

Leaving a generator cell unchanged at the top of the column (unused in row 5) makes it easy to copy it into the entire column and generate another "realization" of data. Other realizations can be generated by pressing the RECALCULATE NOW key before changing the column from functions to numbers. A cell (C64) containing the sum of the responses (Sy) provides a qualitative feel for the data. (Were half the "observations" ones? Did you remember to redefine this sum for any change in sample size?) The spreadsheet is a great learning tool because of it immediacy. Results are plotted (almost) instantly and its ease of use encourages experimentation.

Summary:

The modern P/C spreadsheet can provide far greater access to Generalized Linear Models than is currently afforded by the relative paucity and high cost of sophisticated statistical software. The techniques described here allow the quality practitioner to compute maximum likelihood parameter estimates and their likelihood ratio confidence contours and plot the resulting model with its 95% confidence bounds, or a specific point of interest like a90/95. The methods are easy to understand and implement and encourage the user to learn by experimenting.

Acknowledgments:

I wish to thank my engineering colleagues for their support and my statistical brethren, especially Bill Meeker, for their tutelage and patience.

[First page] [back] [next] [Spreadsheets] [References]

 [ HOME ] [ Feedback ] Mail to Charles.Annis@StatisticalEngineering.com Copyright � 1998-2008 Charles Annis, P.E. Last modified: June 08, 2014