Defines the fixed model for a hierarchical or double hierarchical generalized linear model (R.W. Payne, Y. Lee, J.A. Nelder & M. Noh).
Options
| Option | Description |
|---|---|
| DISTRIBUTION = string token | Distribution of the data (binomial, poisson, normal, gamma); default norm |
| LINK = string token | Link for the fixed model (identity, logarithm, logit, reciprocal, probit, complementaryloglog); default iden |
| DISPERSION = scalar | Value of dispersion parameter in calculation of s.e.s etc; default * for DIST=norm or gamm, and 1 for DIST=pois or bino |
| DLINK = string token | Link for the dispersion model (logarithm, reciprocal); default loga |
| DTERMS = formula | Dispersion model; default * i.e. none |
| CONSTANT = string token | How to treat the constant (estimate, omit); default esti |
| FACTORIAL = scalar | Limit on number of variates and/or factors in a fixed model term; default 3 |
| WEIGHTS = variate | Prior weights; default * i.e. 1 |
| OFFSET = variate | Offset variate; default * i.e. none |
| DOFFSET = variate | Offset variate for dispersion model; default * i.e. none |
| DDISPERSION = scalar | Dispersion parameter to use in a dispersion model for the residual dispersion parameter phi; default 1 |
| IDISPERSION = scalar | Initial value for the residual dispersion parameter phi; default * i.e. formed automatically |
Parameter
| Parameter | Description |
|---|---|
| TERMS = formula | Fixed model |
Description
HGFIXEDMODEL is one of several procedures with the prefix HG, which provide tools for fitting the hierarchical generalized linear models (HGLMs) defined by Lee & Nelder (1996, 2001, 2006) and described by Lee, Nelder & Pawitan (2006). These models extend generalized linear models (GLMs) to include additional random terms in the linear predictor. They include generalized linear mixed models (GLMMs) as a special case, but do not constrain the additional terms to follow a Normal distribution and to have an identity link (as in the GLMM). For example, if the basic generalized linear model is a log-linear model (Poisson distribution and log link), a more appropriate assumption for the additional random terms might be a gamma distribution and a log link.
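As a brief illustration of the Poisson-gamma case just mentioned, the following sketch derives the marginal variance of the response. It assumes, for illustration only, a single random effect u that multiplies the mean and is scaled so that E(u) = 1 and var(u) = alpha; the symbols mu, u, v and alpha are notation introduced here rather than taken from the procedure.

```latex
\begin{aligned}
y \mid u &\sim \mathrm{Poisson}(\mu u), \qquad \log(\mu u) = \log\mu + v, \quad v = \log u, \\
\mathrm{E}(y) &= \mu\,\mathrm{E}(u) = \mu, \\
\mathrm{var}(y) &= \mathrm{E}\{\mathrm{var}(y \mid u)\} + \mathrm{var}\{\mathrm{E}(y \mid u)\}
               = \mu\,\mathrm{E}(u) + \mu^{2}\,\mathrm{var}(u)
               = \mu + \alpha\mu^{2}.
\end{aligned}
```

Under these assumptions the marginal variance has the negative binomial form, which is the form quoted in the caption of the Example section.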
The role of HGFIXEDMODEL is to specify the fixed model terms in the HGLM, and to define the distribution of the data (this corresponds to the error distribution of a GLM). The fixed model is given by the TERMS parameter. Most of the options operate similarly to those occurring in the directives FIT and MODEL. The link function for the fixed model is defined by the LINK option, and the FACTORIAL option sets a limit on the number of variates and/or factors for a term to be included in the fixed model (default 3). The CONSTANT option indicates whether or not to include a constant term or intercept (by default this is included), and the OFFSET option allows an offset variate to be included. The DISTRIBUTION option defines the distribution of the data, the WEIGHTS option allows you to specify a variate of prior weights, and the DISPERSION option governs how the dispersion parameter is obtained.
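The options described above might be combined as in the following minimal sketch. The identifiers count, dose, treatment, block and logexposure are hypothetical (a response variate, a covariate, two factors and an offset variate); only the option names and settings come from this page, and the HGRANDOMMODEL and HGANALYSE calls simply follow the pattern of the Example section.

```
" Sketch only: count, dose, treatment, block and logexposure are "
" hypothetical identifiers, assumed to have been declared and read already. "
HGFIXEDMODEL  [DISTRIBUTION=poisson; LINK=logarithm; CONSTANT=estimate;\
              FACTORIAL=2; OFFSET=logexposure] dose*treatment
HGRANDOMMODEL [DISTRIBUTION=gamma; LINK=logarithm] block
HGANALYSE     count
```

Here FACTORIAL=2 is enough to retain the dose.treatment interaction, since that term involves two identifiers.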
The HGLM methodology also caters for structured dispersion models, in which fixed terms are included in the generalized linear models that are used to estimate the dispersion parameters. Currently these GLMs must have a gamma distribution. The DTERMS option allows you to specify fixed terms for the GLM that estimates the residual dispersion parameter phi. The DLINK option specifies the link to use with the dispersion model, the DOFFSET option allows you to specify an offset variate, and the DDISPERSION option defines the dispersion parameter for the dispersion GLM (default 1). You can also extend this dispersion GLM to become an HGLM (thus making the full model a double hierarchical generalized linear model or DHGLM), by using the HGDRANDOMMODEL procedure to add some random terms.
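A structured dispersion model of the kind described above could be requested along the following lines. This is a sketch under assumptions: yield, variety, site and block are hypothetical identifiers, and the choice of a Normal response with a site-dependent residual dispersion is purely illustrative.

```
" Sketch only: yield, variety, site and block are hypothetical identifiers. "
" DTERMS=site lets the residual dispersion phi differ between sites;        "
" DLINK=logarithm is the documented default, written out here for clarity.  "
HGFIXEDMODEL  [DISTRIBUTION=normal; LINK=identity;\
              DTERMS=site; DLINK=logarithm] variety
HGRANDOMMODEL [DISTRIBUTION=normal; LINK=identity] block
HGANALYSE     yield
```

With DISTRIBUTION=normal the DISPERSION option defaults to *, so phi is estimated rather than fixed, which is what makes a dispersion model meaningful in this sketch.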
The IDISPERSION option allows you to define an initial value for the residual dispersion parameter phi. Initial values for the dispersion parameters of the additional random terms of the HGLM can be defined using the IDISPERSION parameter of the HGRANDOMMODEL procedure. If you set both of these, the HGANALYSE procedure will then use them to initialize the weights that are involved in the fitting of the augmented mean model; for details see Chapter 6 of Lee, Nelder & Pawitan (2006). The default weights that are formed automatically if either of these is unset are satisfactory in most circumstances, but you may want to try your own initial values if you encounter convergence problems.
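If convergence problems do arise, initial values might be supplied as in the sketch below. The identifiers y, x and group and the values 2 and 0.5 are hypothetical, chosen only to show where the settings go. Placing IDISPERSION after the terms of HGRANDOMMODEL assumes the usual Genstat convention for a second parameter, so check it against the HGRANDOMMODEL documentation.

```
" Sketch only: y, x and group are hypothetical identifiers, and the initial "
" values 2 and 0.5 are arbitrary illustrations, not recommendations.        "
HGFIXEDMODEL  [DISTRIBUTION=normal; LINK=identity; IDISPERSION=2] x
HGRANDOMMODEL [DISTRIBUTION=normal; LINK=identity] group; IDISPERSION=0.5
HGANALYSE     y
```

Both initial values are set because, as noted above, the automatically formed weights are used if either is left unset.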
Options: DISTRIBUTION, LINK, DISPERSION, DLINK, DTERMS, CONSTANT, FACTORIAL, WEIGHTS, OFFSET, DOFFSET, DDISPERSION, IDISPERSION.
Parameter: TERMS.
Method
The information is stored in a workspace G5PL_HG (accessed using the WORKSPACE directive) for later use by HGANALYSE.
References
Lee, Y. & Nelder, J.A. (1996). Hierarchical generalized linear models (with discussion). Journal of the Royal Statistical Society, Series B, 58, 619-678.
Lee, Y. & Nelder, J.A. (2001). Hierarchical generalized linear models: a synthesis of generalised linear models, random-effect models and structured dispersions. Biometrika, 88, 987-1006.
Lee, Y. & Nelder, J.A. (2006). Double hierarchical generalized linear models (with discussion). Applied Statistics, 55, 139-185.
Lee, Y., Nelder, J.A. & Pawitan, Y. (2006). Generalized Linear Models with Random Effects: Unified Analysis via H-likelihood. Chapman and Hall, Boca Raton.
See also
Procedures: HGANALYSE, HGDISPLAY, HGDRANDOMMODEL, HGFTEST, HGGRAPH, HGKEEP, HGNONLINEAR, HGPLOT, HGPREDICT, HGRANDOMMODEL, HGRTEST, HGSTATUS, HGWALD.
Commands for: Regression analysis.
Example
CAPTION 'HGFIXEDMODEL example',!t(\
'Number of faults in rolls of fabric of various lengths',\
'(data from Bissell (1972) Biometrika, 59, 435-441).'),\
'Fit negative binomial: var(y) = mu + alpha * mu * mu',\
'(equivalent to Poisson gamma HGLM with saturated random effect).';\
STYLE=meta,3(plain)
VARIATE [NVALUES=32] length,faults
READ length,faults
551 6 651 4 832 17 375 9 715 14 868 8 271 5 630 7
491 7 372 7 645 6 441 8 895 28 458 4 642 10 492 4
543 8 842 9 905 23 542 9 522 6 122 1 657 9 170 4
738 9 371 14 735 17 749 10 495 7 716 3 952 9 417 2 :
CALCULATE loglength = log(length)
& loglength = loglength - mean(loglength)
FACTOR [LEVELS=32; VALUES=1...32] saturated
HGFIXEDMODEL [DISTRIBUTION=poisson; LINK=log] loglength
HGRANDOMMODEL [DISTRIBUTION=gamma; LINK=log] saturated
HGANALYSE faults