PROBITANALYSIS procedure

Fits probit models allowing for natural mortality and immunity (R.W. Payne).

Options

`PRINT` = string tokens	Printed output required (`model`, `summary`, `estimates`, `correlations`, `fittedvalues`, `monitoring`, `effectivedoses`); default `mode`, `summ`, `esti`, `fitt`, `effe`
`TRANSFORMATION` = string token	Transformation to be used (`probit`, `logit`, `complementaryloglog`); default `prob`
`MORTALITY` = string token	Whether to estimate natural mortality (`omit`, `estimate`); default `omit`
`IMMUNITY` = string token	Whether to estimate natural immunity (`omit`, `estimate`); default `omit`
`GROUPS` = factor	Defines groups for an analysis of parallelism; default `*` i.e. no groups
`SEPARATE` = string tokens	Which parameters (apart from intercept) should be estimated separately for different groups (`slope`, `mortality`, `immunity`, `notintercept`); default `*` i.e. none
`LD` = scalar or variate	Effective, or lethal, doses to be estimated, other than 50
`CIPROBABILITY` = scalar	Probability level for the confidence interval of effective doses; default 0.95, i.e. a 95% confidence interval
`LOGBASE` = string token	Base of antilog transformation to be applied to LDs (`ten`, `e`); default `*` i.e. none
`DISPERSION` = scalar	Controls the use of a heterogeneity factor in the calculation of s.e.s etc; with the default of 1 no factor is used, a missing value `*` estimates the heterogeneity from the residual deviance
`FITMETHOD` = string token	Method to use to fit the model (`generalizednonlinear`, `nonlinear`); default `nonl` for Wadley’s problem, otherwise `gene`
`MAXCYCLE` = scalar	Maximum number of iterations for fitting the model; default 30

Parameters

`Y` = variates	Number of subjects responding in each batch
`DOSE` = variates	Dose received by each batch of subjects
`NBINOMIAL` = variates, scalars or factors	Variate specifying the number of subjects in each batch, or factor specifying groupings of the observations assumed to have equal expected total numbers of subjects in Wadley’s problem; if omitted, assumes Wadley’s problem with all observations having the same expected total number of subjects
`INITIAL` = variates	Initial values for parameters
`STEPLENGTHS` = variates	Step lengths for parameters
`LDESTIMATES` = variates	Saves estimates of the effective, or lethal, doses
`LDLOWER` = variates	Saves lower values of the confidence intervals for the estimates of the effective, or lethal, doses (for `FITMETHOD=gene` only)
`LDUPPER` = variates	Saves upper values of the confidence intervals for the estimates of the effective, or lethal, doses (for `FITMETHOD=gene` only)

Description

Probit analysis is a way of modelling the relationship between a stimulus, like a drug, and a quantal response (success/failure). It is assumed that, for each subject, there is a certain level of dose of the stimulus below which it will be unaffected, but above which it will respond. This level of dose, known as its tolerance, will vary from subject to subject within the population.

For example, it is often assumed that the tolerance of houseflies to the logarithm of the dose of an insecticide will follow a Normal distribution; so, if we were to plot the proportion of the population with each tolerance against log dose, we would obtain the familiar bell-shaped curve. Likewise, if we plot the probability that a randomly-selected individual will respond, against the logarithm of dose, we would obtain a sigmoid (S-shaped) curve limited below by zero and above by one. To make the relationship linear, it is usual to transform the y-axis either to probits or to Normal equivalent deviates. In Genstat the two are the same:

Probit(P%) = NED(P%/100)

The Normal equivalent deviate may be familiar as the transformation that is used to produce “probability” graph paper.
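As a sketch of what this transformation implies, the relationship can be written on the proportion scale. The symbols $p$, $x$, $\alpha$, $\beta$, $\mu$ and $\sigma$ are notation introduced here for illustration and do not appear in the procedure itself; $\Phi$ is the standard Normal cumulative distribution function.

$$
p = \frac{P\%}{100}, \qquad \mathrm{NED}(p) = \Phi^{-1}(p), \qquad \Phi^{-1}(p) = \alpha + \beta x
$$

If the tolerances on the log-dose scale $x$ are Normal with mean $\mu$ and standard deviation $\sigma$, then $p = \Phi\big((x - \mu)/\sigma\big)$, so that $\beta = 1/\sigma$ and $\alpha = -\mu/\sigma$. The dose at which half the subjects respond is therefore $x_{50} = \mu = -\alpha/\beta$.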

In probit analysis, we are interested in estimating the equation of that line. This can be done by performing an experiment in which there are several batches of subjects, each of which is given a different dose of the stimulus. The data then consist of a variate indicating the number of subjects that responded out of each batch, a variate to show the dose given to each batch, and a final variate for the total numbers of subjects in the batches; these are specified by parameters Y, DOSE and NBINOMIAL, respectively.

The NBINOMIAL parameter can be omitted if the total numbers cannot be measured, as in some fumigation experiments (“Wadley’s problem”; see for example Finney 1971, pages 202-8). The assumption is that the total numbers receiving the doses will come from the same Poisson distribution, and the mean of this distribution is then estimated in the analysis. Alternatively, NBINOMIAL can specify a factor to indicate groupings of the doses whose total numbers are expected to come from the same distributions.
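A minimal sketch of a call for Wadley’s problem might look like the following. The data values and the identifiers `Ldose` and `Resp` are hypothetical, chosen only to illustrate the syntax; the point is simply that NBINOMIAL is left unset.

```
" Hypothetical data: numbers responding at five log doses.
  The batch totals are unknown, so NBINOMIAL is omitted and
  the analysis is treated as the Wadley problem. "
VARIATE [VALUES=0.2,0.5,0.8,1.1,1.4] Ldose
VARIATE [VALUES=8,22,47,70,88] Resp
PROBITANALYSIS [PRINT=model,summary,estimates] Resp; DOSE=Ldose
```

To let different sets of batches have different expected totals, NBINOMIAL would instead be set to a factor defining those sets.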

The PRINT option controls printed output:

`model`	details of the model that has been fitted,
`summary`	summary analysis-of-variance table,
`estimates`	parameter estimates and standard errors,
`correlations`	correlations between parameter estimates,
`fittedvalues`	fitted values and residuals,
`monitoring`	information about the fitting process, and
`effectivedoses`	effective, or lethal, doses (see option `LD` below).

By default, PRINT=mode,summ,esti,fitt,effe.

The TRANSFORMATION option allows other transformations to be selected. Putting TRANSFORMATION=logit requests a logit transformation:

logit(P%) = log( P% / (100 – P%) )

This gives a response curve very like that of the probit, but one that approaches zero (to the left) and one (to the right) rather more slowly. The other possibility is the complementary log-log transformation, log( -log(1 - P%/100) ), which is relevant to the “one-hit” model (that is, infection processes where just one infected particle is sufficient to cause the response).
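For illustration, the alternative transformations could be requested as in the sketch below. The data and the identifiers `Ldose`, `Total` and `Resp` are hypothetical; only the option settings are taken from this page.

```
" Hypothetical data: 50 subjects per batch at five log doses "
VARIATE [VALUES=0.2,0.5,0.8,1.1,1.4] Ldose
VARIATE [VALUES=50,50,50,50,50] Total
VARIATE [VALUES=4,12,25,38,46] Resp
" Logit transformation "
PROBITANALYSIS [PRINT=estimates; TRANSFORMATION=logit] \
        Resp; DOSE=Ldose; NBINOMIAL=Total
" Complementary log-log transformation "
PROBITANALYSIS [PRINT=estimates; TRANSFORMATION=complementaryloglog] \
        Resp; DOSE=Ldose; NBINOMIAL=Total
```

Comparing the residual deviances from the fits is one informal way to judge which transformation suits the data.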

Sometimes, subjects may respond even in the absence of any dose. For example, with some short-lived insects, some would have died simply from natural causes during the period of the experiment. By setting option MORTALITY=estimate this natural mortality can be included in the model and estimated. Similarly, there may be subjects that will not respond, no matter how high the dose. Setting option IMMUNITY=estimate will include and estimate a parameter for natural immunity.
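One common way of writing such a model is shown below. This is a sketch under an assumed parametrization; the page does not state the exact form that the procedure uses internally, and the symbols $M$, $I$, $F$, $\alpha$, $\beta$ and $x$ are notation introduced here.

$$
p(x) = M + (1 - M - I)\,F(\alpha + \beta x)
$$

Here $p(x)$ is the probability of response at dose (or log dose) $x$, $M$ is the natural mortality, $I$ is the natural immunity, and $F$ is the inverse of the chosen transformation (the standard Normal distribution function for probits). The curve then rises from $M$ at very low doses to $1 - I$ at very high doses, and setting $M = I = 0$ recovers the ordinary model.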

It is also often of interest to study the way in which the model varies for different groups of subjects. For example, there may be groups of batches of subjects, each of which is given a different drug. The GROUPS option should then specify the group to which each batch of subjects belongs, and option SEPARATE indicates which parameters of the model (slope, mortality, and/or immunity) should have separate estimates. Separate parameters are always fitted for the intercept unless you include the setting notintercept. So, if SEPARATE is left at its default value, parallel lines will be fitted with identical values for any estimates of mortality and immunity.
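The sketch below contrasts the default parallel-lines analysis with one that also fits separate slopes. The data, the factor `Drug` and its labels are hypothetical and serve only to show how GROUPS and SEPARATE are set.

```
" Hypothetical data: two drugs, five log doses each, 50 subjects per batch "
VARIATE [VALUES=0.2,0.5,0.8,1.1,1.4,0.2,0.5,0.8,1.1,1.4] Ldose
VARIATE [VALUES=50,50,50,50,50,50,50,50,50,50] Total
VARIATE [VALUES=4,12,25,38,46,2,7,16,29,41] Resp
FACTOR  [LEVELS=2; LABELS=!t(DrugA,DrugB); VALUES=5(1),5(2)] Drug
" Default SEPARATE: separate intercepts, common slope (parallel lines) "
PROBITANALYSIS [GROUPS=Drug] Resp; DOSE=Ldose; NBINOMIAL=Total
" Separate slopes as well as separate intercepts "
PROBITANALYSIS [GROUPS=Drug; SEPARATE=slope] Resp; DOSE=Ldose; NBINOMIAL=Total
```

The change in residual deviance between the two fits gives an assessment of whether the lines can reasonably be taken as parallel.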

The LD option can request the estimation of one or more effective (or lethal) doses, specifying a scalar if there is just one, or a variate if there are several. The LOGBASE option is useful if the doses have been transformed to logarithms before calling PROBITANALYSIS. If you use LOGBASE to specify the base of the logarithms (ten or e), the back-transformed lethal doses will be printed as well.

The estimates of the effective (or lethal) doses can be saved, in a variate, by the LDESTIMATES parameter. Also, when the model is fitted as a generalized nonlinear model (see the FITMETHOD option, below), the lower and upper values of the confidence intervals for the estimates can be saved by the LDLOWER and LDUPPER parameters, respectively. If LOGBASE is set, these are all back-transformed. The CIPROBABILITY option specifies the probability level for the confidence intervals; the default is 0.95, i.e. 95% confidence intervals.
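As an illustration, the following sketch requests two extra effective doses with 90% confidence intervals and saves the results. The data and the identifiers `Ldose`, `Total`, `Resp`, `LDest`, `LDlow` and `LDupp` are hypothetical; the doses are assumed to be logarithms to base 10, so LOGBASE=ten is set.

```
" Hypothetical data: doses already transformed to log10 "
VARIATE [VALUES=0.2,0.5,0.8,1.1,1.4] Ldose
VARIATE [VALUES=50,50,50,50,50] Total
VARIATE [VALUES=4,12,25,38,46] Resp
PROBITANALYSIS [PRINT=effectivedoses; LD=!(10,90); CIPROBABILITY=0.9; \
        LOGBASE=ten] Resp; DOSE=Ldose; NBINOMIAL=Total; \
        LDESTIMATES=LDest; LDLOWER=LDlow; LDUPPER=LDupp
" Display the saved estimates and confidence limits "
PRINT   LDest,LDlow,LDupp
```

Because NBINOMIAL is supplied, the default generalized nonlinear fitting method applies, which is the case in which LDLOWER and LDUPPER are available.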

The DISPERSION option can be used to request use of a heterogeneity factor in the calculation of the standard errors of the slopes and lethal doses (see Finney 1971, pages 70-74). The standard assumptions for probit analysis are that the observations have binomial distributions in probit lines and planes, or Poisson distributions in Wadley’s problem. Under these circumstances, the residual deviance will follow a Chi-square distribution. The residual deviance should on average be equal to its number of degrees of freedom. A significantly large value may indicate that there are other (possibly unknown) factors affecting the subjects, for example that the conditions were not uniform during the experiment. Alternatively it may occur because the subjects did not react independently, for example because there were sub-populations of genetically related individuals. If the large Chi-square seems to arise because the residuals are larger in general than expected (overdispersion) and not because of systematic deviations from the fitted relationship, it is sensible to increase the standard errors by a heterogeneity factor equal to the residual mean deviance. This can be requested by setting option DISPERSION=*. Alternatively DISPERSION can be set to a known value if one is available.
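In symbols, the heterogeneity factor described above can be sketched as follows, where $D$ is the residual deviance and $\nu$ its degrees of freedom. The notation is introduced here, and the square-root scaling of the standard errors is the usual convention (variances multiplied by the factor) rather than something stated on this page.

$$
\hat{\phi} = \frac{D}{\nu}, \qquad \mathrm{s.e.}_{\text{adjusted}} = \sqrt{\hat{\phi}} \times \mathrm{s.e.}_{\text{unadjusted}}
$$

With no overdispersion $\hat{\phi}$ should be close to 1, which corresponds to the default setting DISPERSION=1.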

When the FITMETHOD option is set to generalizednonlinear, the model is fitted as a generalized nonlinear model, using the FIT directive. The alternative setting, nonlinear, fits it as a nonlinear model using FITNONLINEAR. Apart from minor numerical differences, the two methods should generate the same results. Generalized nonlinear models allow a confidence region to be generated for lethal doses, so this method is the default for all situations except Wadley’s problem. The nonlinear method is more accurate, and is thus used as the default for the more difficult situation presented by Wadley’s problem. However, there is the limitation that you cannot use the notintercept setting of the SEPARATE option with the nonlinear method.
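A simple way to see the two methods side by side is to fit the same data with each, as in this sketch. The data and identifiers are hypothetical; only the FITMETHOD settings come from this page.

```
" Hypothetical data: 50 subjects per batch at five log doses "
VARIATE [VALUES=0.2,0.5,0.8,1.1,1.4] Ldose
VARIATE [VALUES=50,50,50,50,50] Total
VARIATE [VALUES=4,12,25,38,46] Resp
" Generalized nonlinear model (the default when NBINOMIAL is supplied) "
PROBITANALYSIS [PRINT=estimates,effectivedoses; FITMETHOD=generalizednonlinear] \
        Resp; DOSE=Ldose; NBINOMIAL=Total
" Nonlinear model, fitted by FITNONLINEAR "
PROBITANALYSIS [PRINT=estimates,effectivedoses; FITMETHOD=nonlinear] \
        Resp; DOSE=Ldose; NBINOMIAL=Total
```

The nonlinear fit is parametrized in terms of LD50 and slope rather than intercept and slope, so the printed estimates are not expected to be labelled identically.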

The final two parameters, INITIAL and STEPLENGTHS, allow initial values and step lengths to be specified for the optimization. For a generalized nonlinear model, the order of parameters is: total(s) for Wadley’s problem (if appropriate), mortality parameters (if any) and immunity parameters (if any); the slopes and intercepts are fitted as regression parameters. For a nonlinear model, the order of parameters is: LD50(s), slope(s), mortality parameters (if any) and immunity parameters (if any); the totals for Wadley’s problem, if required, are fitted as linear parameters. The MAXCYCLE option sets a limit on the number of iterations used during fitting (default 30). Parameter estimates, fitted values, residuals, and so on, can be saved after running the procedure, by using the RKEEP directive in the usual way.
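The sketch below raises the iteration limit, monitors the fit, and then saves results with RKEEP. The data and the identifiers `Est`, `StdErr` and `Fitted` are hypothetical; the RKEEP parameters shown (ESTIMATES, SE, FITTEDVALUES) are the standard ones of that directive.

```
" Hypothetical data: 50 subjects per batch at five log doses "
VARIATE [VALUES=0.2,0.5,0.8,1.1,1.4] Ldose
VARIATE [VALUES=50,50,50,50,50] Total
VARIATE [VALUES=4,12,25,38,46] Resp
PROBITANALYSIS [PRINT=monitoring,estimates; MAXCYCLE=50] \
        Resp; DOSE=Ldose; NBINOMIAL=Total
" Save and display results from the fitted model "
RKEEP   ESTIMATES=Est; SE=StdErr; FITTEDVALUES=Fitted
PRINT   Est,StdErr
PRINT   Fitted
```

If initial values are supplied through INITIAL, they need to follow the parameter order described above for the fitting method in use.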

Options: PRINT, TRANSFORMATION, MORTALITY, IMMUNITY, GROUPS, SEPARATE, LD, CIPROBABILITY, LOGBASE, DISPERSION, FITMETHOD, MAXCYCLE.

Parameters: Y, DOSE, NBINOMIAL, INITIAL, STEPLENGTHS, LDESTIMATES, LDLOWER, LDUPPER.

Method

For FITMETHOD=generalizednonlinear a calculated link is used to take account of any mortality or immunity parameters, and a calculated distribution to allow estimation of totals for Wadley’s problem. The fitting is carried out by FIT (with the CALCULATION option set if any totals, mortality or immunity parameters are to be estimated), and procedure FIELLER is used to obtain LD values.

For FITMETHOD=nonlinear initial values are obtained, if necessary, using the Genstat facilities for generalized linear models, ignoring any mortality or immunity. Expressions specifying the model are defined in sets of nested IF-blocks, taking account of the settings for example of TRANSFORMATION and GROUPS. The fitting is carried out by the FITNONLINEAR directive, and any extra LD values are estimated using RFUNCTION.

Action with `RESTRICT`

The Y variate, the DOSE variate, or the GROUPS factor can be restricted to indicate that the model is to be fitted only to a subset of the units.
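For example, the fit could be limited to the higher doses by restricting the Y variate before the call, as in this sketch. The data, identifiers and the cut-off of 0.4 are hypothetical.

```
" Hypothetical data: 50 subjects per batch at five log doses "
VARIATE  [VALUES=0.2,0.5,0.8,1.1,1.4] Ldose
VARIATE  [VALUES=50,50,50,50,50] Total
VARIATE  [VALUES=4,12,25,38,46] Resp
" Fit only to batches with log dose above 0.4 "
RESTRICT Resp; CONDITION=Ldose>0.4
PROBITANALYSIS [PRINT=estimates] Resp; DOSE=Ldose; NBINOMIAL=Total
" Remove the restriction afterwards "
RESTRICT Resp
```

Cancelling the restriction at the end avoids it silently carrying over into later analyses of the same variate.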

Reference

Finney, D.J. (1971). Probit Analysis (third edition). Cambridge University Press, Cambridge.

Example

CAPTION 'PROBITANALYSIS.',\
        !t('Data from Finney, Probit Analysis, 3rd Edition, pages 132-133.',\
        'Parallel lines are fitted to data from 2 different derris roots.',\
        'The insects (grain beetles) are subject to natural mortality;',\
        'a single (common) parameter is fitted for both roots.',\
        'The results differ slightly from those of Finney due to',\
        'the use here of maximum likelihood for the fitting,',\
        'rather than iterative weighted linear regression.'); STYLE=meta,plain
VARIATE [VALUES=2.17,2.00,1.68,1.08,1.79,1.66,1.49,1.17,0.57,  *] Logdose
&       [VALUES= 142, 127, 128, 126, 125, 117, 127,  51, 132,129] Total
&       [VALUES= 142, 126, 115,  58, 125, 115, 114,  40,  37, 21] Kill
FACTOR  [LEVELS=2; LABELS=!t(W213,W214); VALUES=4(1),5(2),*] Derris
PROBITANALYSIS [TRANS=probit; MORTALITY=estimate; GROUPS=Derris; LD=!(50,90)]\
        Kill; DOSE=Logdose; NBINOMIAL=Total

Updated on June 19, 2019
