Generalized linear models link the independent variable with the probability of
observing the dependent variable.
"Linear models" describe how the dependent variable (the response) is
related, through some function, to the independent variable(s), the variables that control the response.
Ordinary least-squares (OLS) regression is the most common example of a linear model.1
A generalized linear model (GLM) extends this idea by linking the probability of a binary outcome,
through some function, to the variables that control it. So it is the probability of the outcome,
rather than the outcome itself, that is modeled.
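Written out for a single independent variable, the idea can be sketched as follows. The symbols are assumptions introduced here for illustration: p is the probability of the outcome, g is the link function, x is the independent variable, and β0 and β1 are the model parameters. The logit is used as the example link.

```latex
\[
g(p) = \beta_0 + \beta_1 x,
\qquad
g(p) = \operatorname{logit}(p) = \ln\frac{p}{1-p}
\quad\Longrightarrow\quad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}
\]
```

The right-hand side of the first expression is still linear in the parameters; only the link between that linear predictor and p is nonlinear.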
When the data are binary, OLS is simply unworkable because two of its tenets are violated. OLS requires
that the response, y, be unbounded, and that the error variance be constant everywhere in that range.
Binary and proportion data are constrained to lie on the unit interval, 0 < p < 1,
and the variance is not constant but depends on p: Var(p) = p(1-p).
That doesn’t mean you can’t coerce a computer program to give you an answer using OLS with a binary or proportion response.
It does mean, however, that the answer will be wrong.
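Both violations are easy to see numerically. The sketch below is illustrative only: the ten hit/miss observations are made up, and ols_fit is a hypothetical helper written for this example. It fits a straight line to 0/1 data and shows that the fitted "probabilities" escape the unit interval, while p(1-p) changes with p.

```python
# Hypothetical data: x is a made-up predictor, y is the 0/1 outcome.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]


def ols_fit(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = ols_fit(x, y)
fitted = [intercept + slope * xi for xi in x]

# Fitted values that are impossible as probabilities.
outside = [round(f, 3) for f in fitted if f < 0 or f > 1]

# The variance of a binary outcome depends on p, so it cannot be constant.
bernoulli_var = {p: p * (1 - p) for p in (0.1, 0.5, 0.9)}

print("fitted values outside [0, 1]:", outside)
print("p(1-p) at several p:", bernoulli_var)
```

The program runs without complaint, which is the point: nothing in the arithmetic warns that the answer is meaningless.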
The most common method to describe a binary outcome uses logistic regression, a special case of GLM having the
logit, which is related to the logistic density, as the link function.
Another common link function is the probit, related to the normal (Gaussian) density. Both the
logit and the probit are symmetrical. Asymmetrical data, however, would
require an asymmetrical link, and data that do not reach zero on the left
or one on the right require
special link functions.
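The symmetry of the logit and probit can be checked directly: a symmetrical link satisfies g(1-p) = -g(p). The sketch below uses only the Python standard library. The complementary log-log link is not mentioned above; it is brought in here as an assumed example of an asymmetrical link, and the function names are chosen for illustration.

```python
import math
from statistics import NormalDist


def logit(p):
    """Logit link: log-odds of p."""
    return math.log(p / (1 - p))


def probit(p):
    """Probit link: inverse of the standard normal CDF."""
    return NormalDist().inv_cdf(p)


def cloglog(p):
    """Complementary log-log link (assumed example of an asymmetrical link)."""
    return math.log(-math.log(1 - p))


for name, g in (("logit", logit), ("probit", probit), ("cloglog", cloglog)):
    # For a symmetrical link, g(p) + g(1 - p) is zero for every p.
    residual = g(0.2) + g(0.8)
    print(f"{name}: g(0.2) + g(0.8) = {residual:.6f}")
```

For the logit and probit the printed sum is zero to within rounding; for the complementary log-log it is not.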
MIL-HDBK-1823A makes extensive use of
GLMs to describe the probability of detection for hit/miss inspections.
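To make the hit/miss case concrete, here is a minimal sketch of fitting a logistic model of detection probability against a single predictor by maximum likelihood, using Newton-Raphson iterations. It is not taken from MIL-HDBK-1823A: the data are made up, the names fit_logistic and pod are hypothetical, and Newton-Raphson is simply one common way to maximize the likelihood.

```python
import math

# Hypothetical hit/miss data: size is a made-up predictor, hit is 1 (found) or 0 (missed).
size = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
hit = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]


def pod(x, b0, b1):
    """Probability of detection under a logit link."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def fit_logistic(x, y, iterations=25):
    """Newton-Raphson maximum likelihood for p = logistic(b0 + b1 * x)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [pod(xi, b0, b1) for xi in x]
        # Score vector: gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix, built from the weights p(1-p).
        w = [pi * (1 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


b0, b1 = fit_logistic(size, hit)
curve = [pod(xi, b0, b1) for xi in size]
print("intercept, slope:", b0, b1)
print("fitted POD:", [round(p, 3) for p in curve])
```

Unlike a straight-line fit, every fitted value stays strictly between 0 and 1. The sketch assumes the hits and misses overlap along the predictor; with perfectly separated data the maximum-likelihood estimates do not exist and the iterations would fail.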
GLM on a P/C spreadsheet
Nearly two decades ago, when using generalized linear models meant writing
your own software, I implemented a simple binary-response GLM on a P/C spreadsheet.
There are easier methods available now.
Still, working through the exercise was very illuminating to me and I
recommend it to anyone who would like to get a visceral feel for the
machinations of GLMs. You can find step-by-step instructions
1 Linear models are linear in the model parameters, not
necessarily linear in the independent variable. So a model like
is linear (in the model parameters,
), while a model like
is not a linear model.
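As an illustration of the distinction drawn in this footnote, consider the two forms below. They are assumed examples chosen here, not necessarily the author's own equations, and the symbols y, x, β0, β1 and β2 are likewise assumptions.

```latex
\[
y = \beta_0 + \beta_1 x + \beta_2 x^2
\quad\text{(linear in } \beta_0, \beta_1, \beta_2 \text{, although quadratic in } x\text{)}
\]
\[
y = \beta_0 \, e^{\beta_1 x}
\quad\text{(not linear in } \beta_1\text{)}
\]
```

In the first form each parameter enters only as a multiplier of a known quantity, so the model is linear however curved its graph may be. In the second, β1 sits inside the exponential.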