Fits probit models allowing for natural mortality and immunity (R.W. Payne).
Options
PRINT = string tokens | Printed output required (model, summary, estimates, correlations, fittedvalues, monitoring, effectivedoses); default mode, summ, esti, fitt, effe |
---|---|
TRANSFORMATION = string token | Transformation to be used (probit, logit, complementaryloglog); default prob |
MORTALITY = string token | Whether to estimate natural mortality (omit, estimate); default omit |
IMMUNITY = string token | Whether to estimate natural immunity (omit, estimate); default omit |
GROUPS = factor | Defines groups for an analysis of parallelism; default * i.e. no groups |
SEPARATE = string tokens | Which parameters (apart from intercept) should be estimated separately for different groups (slope, mortality, immunity, notintercept); default * i.e. none |
LD = scalar or variate | Effective, or lethal, doses to be estimated, other than 50 |
CIPROBABILITY = scalar | Probability level for the confidence interval of effective doses; default 0.95, i.e. a 95% confidence interval |
LOGBASE = string token | Base of antilog transformation to be applied to LDs (ten, e); default * i.e. none |
DISPERSION = scalar | Controls the use of a heterogeneity factor in the calculation of standard errors etc.; with the default of 1 no factor is used; a missing value * estimates the heterogeneity from the residual deviance |
FITMETHOD = string token | Method to use to fit the model (generalizednonlinear, nonlinear); default nonl for Wadley’s problem, otherwise gene |
MAXCYCLE = scalar | Maximum number of iterations for fitting the model; default 30 |
Parameters
Y = variates | Number of subjects responding in each batch |
---|---|
DOSE = variates | Dose received by each batch of subjects |
NBINOMIAL = variates, scalars or factors | Variate specifying the number of subjects in each batch, or factor specifying groupings of the observations assumed to have equal expected total numbers of subjects in Wadley’s problem; if omitted, assumes Wadley’s problem with all observations having the same expected total number of subjects |
INITIAL = variates | Initial values for parameters |
STEPLENGTHS = variates | Step lengths for parameters |
LDESTIMATES = variates | Saves estimates of the effective, or lethal, doses |
LDLOWER = variates | Saves lower values of the confidence intervals for the estimates of the effective, or lethal, doses (for FITMETHOD=gene only) |
LDUPPER = variates | Saves upper values of the confidence intervals for the estimates of the effective, or lethal, doses (for FITMETHOD=gene only) |
Description
Probit analysis is a way of modelling the relationship between a stimulus, like a drug, and a quantal response (success/failure). It is assumed that, for each subject, there is a certain level of dose of the stimulus below which it will be unaffected, but above which it will respond. This level of dose, known as its tolerance, will vary from subject to subject within the population.
For example, it is often assumed that the tolerance of houseflies to the logarithm of the dose of an insecticide will follow a Normal distribution; so, if we were to plot the proportion of the population with each tolerance against log dose, we would obtain the familiar bell-shaped curve. Likewise, if we plot the probability that a randomly-selected individual will respond, against the logarithm of dose, we would obtain a sigmoid (S-shaped) curve limited below by zero and above by one. To make the relationship linear, it is usual to transform the y-axis either to probits or to Normal equivalent deviates. In Genstat:
Probit(P%) = NED(P%/100)
The Normal equivalent deviate may be familiar as the transformation that is used to produce “probability” graph paper.
In probit analysis, we are interested in estimating the equation of that line. This can be done by performing an experiment in which there are several batches of subjects, each of which is given a different dose of the stimulus. The data then consist of a variate indicating the number of subjects that responded out of each batch, a variate to show the dose given to each batch, and a final variate for the total numbers of subjects in the batches; these are specified by parameters Y, DOSE and NBINOMIAL, respectively.
The NBINOMIAL parameter can be omitted if the total numbers cannot be measured, as in some fumigation experiments (“Wadley’s problem”; see for example Finney 1971, pages 202-8). The assumption is that the total numbers receiving the doses will come from the same Poisson distribution, and the mean of this distribution is then estimated in the analysis. Alternatively, NBINOMIAL can specify a factor to indicate groupings of the doses whose total numbers are expected to come from the same distributions.
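The two forms of Wadley’s problem described above might be requested as in the sketch below. The identifiers Affected, Logconc and Chamber are hypothetical names, assumed to have been declared and given values beforehand (two variates and a factor, respectively); only the parameter names come from this page.

```genstat
"Hypothetical structures: Affected (variate of numbers responding),"
"Logconc (variate of doses) and Chamber (factor grouping the batches)."

"NBINOMIAL omitted: a single expected total is estimated for all batches."
PROBITANALYSIS Affected; DOSE=Logconc

"NBINOMIAL set to a factor: one expected total per level of Chamber."
PROBITANALYSIS Affected; DOSE=Logconc; NBINOMIAL=Chamber
```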
The PRINT option controls printed output:
model | details of the model that has been fitted, |
---|---|
summary | summary analysis-of-variance table, |
estimates | parameter estimates and standard errors, |
correlations | correlations between parameter estimates, |
fittedvalues | fitted values and residuals, |
monitoring | information about the fitting process, and |
effectivedoses | effective, or lethal, doses (see option LD below). |
By default, PRINT=mode,summ,esti,fitt,effe.
The TRANSFORMATION option allows other transformations to be selected. Putting TRANSFORMATION=logit requests a logit transformation:
logit(P%) = log( P% / (100 – P%) )
This is very like the probit but approaches zero (to the left) and one (to the right) rather more slowly. The other possibility is the complementary log-log transformation, log( -log( (100 - P%)/100 ) ), which is relevant to the “one-hit” model (that is, infection processes where just one infected particle is sufficient to cause the response).
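To compare the three transformations numerically, the short sketch below evaluates each of them at a few percentages. The variate names are illustrative, and the formulae are written out with the standard NED and LOG functions rather than with any dedicated transformation function; the complementary log-log is computed on the proportion scale, as log(-log(1 - p)).

```genstat
"Percentages at which to evaluate the three transformations."
VARIATE [VALUES=5,25,50,75,95] Pct

"Probit as defined in Genstat: the Normal equivalent deviate of the proportion."
CALCULATE Probit = NED(Pct/100)

"Logit: log odds of responding."
CALCULATE Logit = LOG(Pct/(100-Pct))

"Complementary log-log of the proportion responding."
CALCULATE Cloglog = LOG(-LOG((100-Pct)/100))

PRINT Pct,Probit,Logit,Cloglog; DECIMALS=0,3,3,3
```

The probit and logit columns are symmetric about 50%, whereas the complementary log-log is not.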
Sometimes, subjects may respond even in the absence of any dose. For example, with some short-lived insects, some would have died simply from natural causes during the period of the experiment. By setting option MORTALITY=estimate this natural mortality can be included in the model and estimated. Similarly, there may be subjects that will not respond, no matter how high the dose. Setting option IMMUNITY=estimate will include and estimate a parameter for natural immunity.
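As a minimal sketch of these two options, the call below asks for both parameters at once. It assumes that the variates Kill, Logdose and Total from the Example section have already been set up, and it is shown for its syntax only: whether an immunity parameter is estimable from those particular data is not claimed here.

```genstat
"Estimate natural mortality and natural immunity along with the probit line."
PROBITANALYSIS [TRANSFORMATION=probit; MORTALITY=estimate; IMMUNITY=estimate]\
   Kill; DOSE=Logdose; NBINOMIAL=Total
```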
It is also often of interest to study the way in which the model varies for different groups of subjects. For example, there may be groups of batches of subjects, each of which is given a different drug. The GROUPS option should then specify the group to which each batch of subjects belongs, and option SEPARATE indicates which parameters of the model (slope, mortality, and/or immunity) should have separate estimates. Separate parameters are always fitted for the intercept unless you include the setting notintercept. So, if SEPARATE is left at its default value, parallel lines will be fitted with identical values for any estimates of mortality and immunity.
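The difference between the default and a non-default setting of SEPARATE can be sketched as follows, again assuming the structures Kill, Logdose, Total and Derris from the Example section are available.

```genstat
"Default SEPARATE: parallel lines, i.e. separate intercepts and a common slope."
PROBITANALYSIS [GROUPS=Derris] Kill; DOSE=Logdose; NBINOMIAL=Total

"Separate slopes as well as separate intercepts for the two roots."
PROBITANALYSIS [GROUPS=Derris; SEPARATE=slope] Kill; DOSE=Logdose; NBINOMIAL=Total
```

Comparing the residual deviances of the two fits gives an informal check on whether the assumption of parallelism is reasonable.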
The LD option can request the estimation of one or more effective (or lethal) doses, specifying a scalar if there is just one, or a variate if there are several. The LOGBASE option is useful if the doses have been transformed to logarithms before calling PROBITANALYSIS. If you use LOGBASE to specify the base of the logarithms (ten or e), the back-transformed lethal doses will be printed as well.
The estimates of the effective (or lethal) doses can be saved, in a variate, by the LDESTIMATES parameter. Also, when the model is fitted as a generalized nonlinear model (see the FITMETHOD option, below), the lower and upper values of the confidence intervals for the estimates can be saved by the LDLOWER and LDUPPER parameters, respectively. If LOGBASE is set, these are all back-transformed. The CIPROBABILITY option specifies the probability level for the confidence intervals; the default is 0.95, i.e. 95% confidence intervals.
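A possible call combining these options and parameters is sketched below. It assumes the structures from the Example section, and it also assumes that Logdose holds base-10 logarithms, which this page does not state; LDest, LDlow and LDupp are illustrative names for the saved variates.

```genstat
"Request LD10 and LD90 in addition to LD50, with 90% confidence intervals."
"LOGBASE=ten assumes that Logdose contains logarithms to base 10."
PROBITANALYSIS [GROUPS=Derris; LD=!(10,90); CIPROBABILITY=0.90; LOGBASE=ten]\
   Kill; DOSE=Logdose; NBINOMIAL=Total;\
   LDESTIMATES=LDest; LDLOWER=LDlow; LDUPPER=LDupp

"Saved estimates and limits (back-transformed because LOGBASE is set)."
PRINT LDest,LDlow,LDupp
```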
The DISPERSION option can be used to request use of a heterogeneity factor in the calculation of the standard errors of the slopes and lethal doses (see Finney 1971, pages 70-74). The standard assumptions for probit analysis are that the observations have binomial distributions in probit lines and planes, or Poisson distributions in Wadley’s problem. Under these circumstances, the residual deviance will follow a Chi-square distribution. The residual deviance should on average be equal to its number of degrees of freedom. A significantly large value may indicate that there are other (possibly unknown) factors affecting the subjects, for example that the conditions were not uniform during the experiment. Alternatively it may occur because the subjects did not react independently, for example because there were sub-populations of genetically related individuals. If the large Chi-square seems to arise because the residuals are larger in general than expected (overdispersion) and not because of systematic deviations from the fitted relationship, it is sensible to increase the standard errors by a heterogeneity factor equal to the residual mean deviance. This can be requested by setting option DISPERSION=*. Alternatively DISPERSION can be set to a known value if one is available.
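For instance, a fit that estimates the heterogeneity factor from the data might look like the sketch below, which assumes the structures from the Example section.

```genstat
"Scale the standard errors by the residual mean deviance."
PROBITANALYSIS [GROUPS=Derris; DISPERSION=*] Kill; DOSE=Logdose; NBINOMIAL=Total
```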
When the FITMETHOD option is set to generalizednonlinear, the model is fitted as a generalized nonlinear model, using the FIT directive. The alternative setting, nonlinear, fits it as a nonlinear model using FITNONLINEAR. Apart from minor numerical differences, the two methods should generate the same results. Generalized nonlinear models allow a confidence region to be generated for lethal doses, so this method is the default for all situations except Wadley’s problem. The nonlinear method is more accurate, and is thus used as the default for the more difficult situation presented by Wadley’s problem. However, there is the limitation that you cannot use the notintercept setting of the SEPARATE option with the nonlinear method.
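To override the default fitting method, FITMETHOD can be set explicitly, as in this sketch (structures from the Example section assumed). Monitoring output is requested here so that the progress of the fit can be inspected.

```genstat
"Fit by FITNONLINEAR instead of the default generalized nonlinear method."
PROBITANALYSIS [PRINT=model,summary,estimates,monitoring; MORTALITY=estimate;\
   GROUPS=Derris; FITMETHOD=nonlinear] Kill; DOSE=Logdose; NBINOMIAL=Total
```

With this setting, confidence limits cannot be saved by LDLOWER and LDUPPER, which are available for FITMETHOD=gene only.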
The final two parameters, INITIAL and STEPLENGTHS, allow initial values and step lengths to be specified for the optimization. For a generalized nonlinear model, the order of parameters is: total(s) for Wadley’s problem (if appropriate), mortality parameters (if any) and immunity parameters (if any); the slopes and intercepts are fitted as regression parameters. For a nonlinear model, the order of parameters is: LD50(s), slope(s), mortality parameters (if any) and immunity parameters (if any); the totals for Wadley’s problem, if required, are fitted as linear parameters. The MAXCYCLE option sets a limit on the number of iterations used during fitting (default 30). Parameter estimates, fitted values, residuals, and so on, can be saved after running the procedure, by using the RKEEP directive in the usual way.
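The sketch below raises the iteration limit and then saves some results with RKEEP. It assumes the structures from the Example section; Est, Se and Fit are illustrative names, and the limit of 50 is an arbitrary choice.

```genstat
"Allow up to 50 iterations instead of the default 30."
PROBITANALYSIS [MORTALITY=estimate; GROUPS=Derris; MAXCYCLE=50]\
   Kill; DOSE=Logdose; NBINOMIAL=Total

"Save parameter estimates, their standard errors and the fitted values."
RKEEP ESTIMATES=Est; SE=Se; FITTEDVALUES=Fit
PRINT Est,Se
PRINT Fit
```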
Options: PRINT, TRANSFORMATION, MORTALITY, IMMUNITY, GROUPS, SEPARATE, LD, CIPROBABILITY, LOGBASE, DISPERSION, FITMETHOD, MAXCYCLE.
Parameters: Y, DOSE, NBINOMIAL, INITIAL, STEPLENGTHS, LDESTIMATES, LDLOWER, LDUPPER.
Method
For FITMETHOD=generalizednonlinear a calculated link is used to take account of any mortality or immunity parameters, and a calculated distribution to allow estimation of totals for Wadley’s problem. The fitting is carried out by FIT (with the CALCULATION option set if any totals, mortality or immunity parameters are to be estimated), and procedure FIELLER is used to obtain LD values.
For FITMETHOD=nonlinear initial values are obtained, if necessary, using the Genstat facilities for generalized linear models, ignoring any mortality or immunity. Expressions specifying the model are defined in sets of nested IF-blocks, taking account of the settings, for example, of TRANSFORMATION and GROUPS. The fitting is carried out by the FITNONLINEAR directive, and any extra LD values are estimated using RFUNCTION.
Action with RESTRICT
The Y variate, the DOSE variate, or the GROUPS factor can be restricted to indicate that the model is to be fitted only to a subset of the units.
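A restricted fit might be set up as in the sketch below, which assumes the structures from the Example section. The cut-off of 1 on the log-dose scale is an arbitrary value chosen only to illustrate the syntax.

```genstat
"Fit only to the batches whose log dose exceeds 1 (arbitrary cut-off)."
RESTRICT Kill; CONDITION=Logdose.GT.1
PROBITANALYSIS [GROUPS=Derris] Kill; DOSE=Logdose; NBINOMIAL=Total

"Remove the restriction so that later analyses use all the units."
RESTRICT Kill
```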
Reference
Finney, D.J. (1971). Probit Analysis (third edition). Cambridge University Press, Cambridge.
See also
Commands for: Regression analysis.
Example
CAPTION 'PROBITANALYSIS.',\
  !t('Data from Finney, Probit Analysis, 3rd Edition, pages 132-133.',\
  'Parallel lines are fitted to data from 2 different derris roots.',\
  'The insects (grain beetles) are subject to natural mortality;',\
  'a single (common) parameter is fitted for both roots.',\
  'The results differ slightly from those of Finney due to',\
  'the use here of maximum likelihood for the fitting,',\
  'rather than iterative weighted linear regression.'); STYLE=meta,plain
VARIATE [VALUES=2.17,2.00,1.68,1.08,1.79,1.66,1.49,1.17,0.57, *] Logdose
& [VALUES= 142, 127, 128, 126, 125, 117, 127, 51, 132,129] Total
& [VALUES= 142, 126, 115, 58, 125, 115, 114, 40, 37, 21] Kill
FACTOR [LEVELS=2; LABELS=!t(W213,W214); VALUES=4(1),5(2),*] Derris
PROBITANALYSIS [TRANS=probit; MORTALITY=estimate; GROUPS=Derris; LD=!(50,90)]\
  Kill; DOSE=Logdose; NBINOMIAL=Total