Generalized Linear Models on a P/C Spreadsheet

GLM Index
1.  GLM
2.  Why GLM?
3.  Model Interpretation
5.  Confidence
6.  Building Confidence
8.  Gaining Experience
10.  References
11.  the data
12.  notes

Logistic Regression and Generalized Linear Models ... The appropriate analysis of pass/fail data (from non-destructive evaluation (NDE), for example) is often beyond the reach of quality practitioners because specialized statistical software is expensive and not widely available. Here is a simple implementation of generalized linear models (GLM) that uses an ordinary P/C spreadsheet, such as Microsoft Excel, Borland Quattro Pro, or Lotus 1-2-3. It produces maximum likelihood parameter estimates and the corresponding likelihood ratio confidence contours (the parameter confidence region), and plots the resulting model with its 95% confidence bounds. Steps for building a sample spreadsheet are outlined, discussed in the NDE context, and illustrated with experimental data. The method is easy and fast to implement, and makes GLM as accessible as the ubiquitous spreadsheet.

Regression and GLM

Ordinary least squares linear regression assumes that the model response varies continuously and is unbounded, so it is inappropriate for binary data, where the observed outcome is bounded and discrete, with 0 and 1 as the only possible values. The resulting error is decidedly non-normal and so produces unreliable parameter estimates, even when the model is restricted to realistic values (0 < y < 1). Generalized Linear Models (GLM) overcome this difficulty by "linking" the binary response to the explanatory covariates through the probability of either outcome, which does vary continuously from 0 to 1. The transformed probability is then modeled with an ordinary polynomial function that is linear in its parameters, hence the name generalized linear model.
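The linking idea can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet implementation: it assumes the logit link (the one used in logistic regression) and a single covariate, and the names logit, inv_logit, b0 and b1 are hypothetical.

```python
import math

def logit(p):
    """Link function: map a probability in (0, 1) to the whole real line."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse link: map any real number back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def probability(x, b0, b1):
    """Probability of outcome 1 at covariate x.

    The transformed probability is modeled as b0 + b1*x, which is linear
    in the parameters b0 and b1 (both hypothetical names).
    """
    return inv_logit(b0 + b1 * x)

def log_likelihood(xs, ys, b0, b1):
    """Bernoulli log-likelihood of binary outcomes ys (0 or 1) at covariates xs."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = probability(x, b0, b1)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total
```

Maximum likelihood estimation then amounts to searching for the b0 and b1 that make log_likelihood as large as possible for the observed data.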

Non-destructive evaluation (NDE) to detect cracking in a structure provides an example. A crack is either detected or it is not (a binary response), but the probability of detection (POD) usually varies continuously from nearly zero for small cracks to nearly one for large ones. A perfect inspection, by contrast, would be a step function of crack size a, with POD = 1 for a > a_crit and POD = 0 for a < a_crit. (Notice that an inspection that finds everything cannot discriminate between a pernicious crack and a benign microstructural artifact, and is therefore useless.)
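The contrast between a perfect inspection and a realistic one can be sketched as follows. The logistic curve is an assumed form for the smooth POD, and the parameters a50 (the crack size detected half the time) and slope are hypothetical, chosen only for illustration. The text leaves POD undefined at exactly the critical size, so the value returned there is an arbitrary choice.

```python
import math

def pod_perfect(a, a_crit):
    """Ideal inspection: a step function of crack size a.

    Returns 1 above the critical size and 0 below it. The value at exactly
    a == a_crit is not specified in the text; 0 is an arbitrary choice here.
    """
    return 1.0 if a > a_crit else 0.0

def pod_logistic(a, a50, slope):
    """Smooth POD curve (assumed logistic form).

    a50 is the crack size with POD = 0.5 and slope controls how sharply
    detection improves with size; both names are hypothetical.
    """
    return 1.0 / (1.0 + math.exp(-slope * (a - a50)))

# A realistic inspection rises gradually with crack size ...
gradual = [pod_logistic(a, a50=1.0, slope=5.0) for a in (0.5, 1.0, 1.5)]

# ... while a very steep slope approaches the perfect step function.
steep = [pod_logistic(a, a50=1.0, slope=50.0) for a in (0.5, 1.0, 1.5)]
```

As the slope grows, the smooth curve approaches the step function, so the perfect inspection can be viewed as a limiting case of the continuous POD model.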

aside ...
This is an oversimplification, since cracks of similar size exhibit large differences in detectability due to fixed effects like surface preparation and part geometry, random effects like crack orientation, inspector-to-inspector differences and residual stresses, and other factors. While a thorough discussion of NDE isn't intended here, modeling POD provides a good example of implementing GLM on a P/C.  An excellent survey of applications of statistics to NDE can be found in Olin and Meeker (1996).

Reference:
Olin, B.D. and Meeker, W.Q. (1996), "Applications of Statistics in Nondestructive Evaluation" (with discussion), Technometrics 38, 95-112.