KCROSSVALIDATION procedure

Computes cross validation statistics for punctual kriging (D.A. Murray & R. Webster).

Options

PRINT = string tokens Controls printed output (statistics, correlation); default stat
PLOT = string token Whether to produce a scatter plot of the predicted against the true values (scatter); default * i.e. none
Y = variate or scalar Y positions or interval (not needed for 2D regular data i.e. when DATA is a matrix)
X = variate X positions (needed only for 2D irregular data)
YOUTER = variate Variate containing 2 values to define the Y-bounds of the region to be examined (bottom then top); by default the whole region is used
XOUTER = variate Variate containing 2 values to define the X-bounds of the region to be examined (bottom then top); by default the whole region is used
RADIUS = scalar Maximum distance between target point and usable data
SEARCH = string token Type of search (isotropic, anisotropic); default isot
MINPOINTS = scalar Minimum number of data points from which to compute elements; default 7
MAXPOINTS = scalar Maximum number of data points from which to compute elements; default 20
DRIFT = string token Amount of drift (constant, linear, quadratic); default cons
YXRATIO = scalar Ratio of Y interval to X interval
SAVE = pointer Pointer containing model estimates saved from MVARIOGRAM

Parameters

DATA = variates or matrices Observed measurements as a variate or, for data on a regular grid, as a matrix
ISOTROPY = string tokens Form of variogram (isotropic, Burgess, geometrical); default isot
MODELTYPE = string tokens Model fitted to the variogram (power, boundedlinear, circular, spherical, doublespherical, pentaspherical, exponential, besselk1, gaussian, cubic, stable, cardinalsine, matern); default *
NUGGET = scalars The nugget variance
SILLVARIANCES = scalars or variates Sill variances of the spatially dependent component
RANGES = scalars or variates Ranges of the spatially dependent component
GRADIENT = scalars or variates Slope of the unbounded component
EXPONENT = scalars or variates Power of the unbounded component or power for the stable model
SMOOTHNESS = scalars Value of ν parameter for the Matern model
PHI = scalars or variates Phi parameters in anisotropic model (ISOTROPY = burg or geom)
RMAX = scalars or variates Maximum gradient of an anisotropic model
RMIN = scalars or variates Minimum gradient of an anisotropic model
MEASUREMENTERROR = scalars Variance of measurement error
PREDICTIONS = variates or matrices Saves the kriged estimates in matrices for 2D Regular data, otherwise in variates
VARIANCES = variates or matrices Saves the estimation variances in matrices for 2D Regular data, otherwise in variates
STATISTICS = variates Saves the cross validation statistics

Description

In geostatistics one way of choosing between plausible models for variograms is to use them for kriging, and see how well the kriging predicts the true values. The observed value of z at each sampling point in the data is omitted in turn from the whole set and predicted from the others. The predictions are compared with the true values to give a mean deviation or error, and the kriging variances are compared with the squared deviations to give a mean squared deviation ratio. This process is known as “cross-validation”. The procedure KCROSSVALIDATION uses this principle of leave-one-out cross-validation.
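The leave-one-out principle described above can be sketched in Python. This is only an illustration under stated assumptions, not the code used by KCROSSVALIDATION: it uses ordinary punctual kriging with a spherical variogram, takes all remaining points for every prediction (no RADIUS, MINPOINTS or MAXPOINTS search), and the names spherical, solve, krige_point and leave_one_out are hypothetical helpers introduced here.

```python
import math


def spherical(h, nugget, sill, rng):
    """Spherical semivariance at lag h; zero at h = 0 (standard form, assumed)."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return nugget + sill
    r = h / rng
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def krige_point(coords, values, target, gamma):
    """Ordinary punctual kriging of one target; returns (prediction, variance)."""
    n = len(coords)
    # Semivariances between data points, bordered by the unbiasedness constraint.
    a = [[gamma(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(math.dist(p, target)) for p in coords] + [1.0]
    sol = solve(a, b)
    weights, lagrange = sol[:n], sol[n]
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + lagrange
    return prediction, variance


def leave_one_out(coords, values, gamma):
    """Omit each point in turn and predict it from all the others."""
    results = []
    for i in range(len(coords)):
        other_xy = coords[:i] + coords[i + 1:]
        other_z = values[:i] + values[i + 1:]
        results.append(krige_point(other_xy, other_z, coords[i], gamma))
    return results


def gamma(h):
    # Parameter values borrowed from the example at the end of this page.
    return spherical(h, nugget=0.00466, sill=0.01515, rng=10.8)


# Made-up illustrative data: four corners of a square plus its centre.
coords = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
values = [1.0, 2.0, 3.0, 4.0, 10.0]
loo = leave_one_out(coords, values, gamma)
```

Each entry of loo pairs a prediction with its kriging variance for the omitted point. In this toy layout the centre point is equidistant from the four corners, so its leave-one-out prediction is simply their mean.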

The data are supplied by the DATA parameter in one of two forms, as for the KRIGE directive: as a matrix for data on a regular grid, or as a variate for irregularly scattered data, with the X and Y options set to variates supplying the spatial coordinates.

By default all data are considered when forming the kriging system. However, you may select a subset of the data by limiting the area to a rectangle defined by the XOUTER and YOUTER options. Each of these should be set to a variate with two values to define the lower and upper limits in the x (East-West) and y (North-South) directions respectively.

The minimum and maximum number of points for the kriging system are set by the MINPOINTS and MAXPOINTS options. There is a minimum limit of 3 for MINPOINTS and a maximum of 40 for MAXPOINTS, and MINPOINTS must be less than or equal to MAXPOINTS. The defaults are 7 and 20 respectively. You may select data points around the point to be kriged by setting the RADIUS option to the radius within which they must lie. If the variogram is anisotropic, the search may be requested to be anisotropic by setting option SEARCH to anisotropic; by default SEARCH=isotropic.
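As a rough illustration of how the RADIUS, MINPOINTS and MAXPOINTS settings interact for an isotropic search, the Python sketch below selects the neighbours of one target point. It is an assumption-laden sketch rather than the procedure's actual search: the function select_neighbours and its omit argument are hypothetical, and returning None when fewer than MINPOINTS points qualify is an assumed way of signalling the problem, since this page does not say what the procedure does in that case.

```python
import math


def select_neighbours(coords, target, radius=None, minpoints=7,
                      maxpoints=20, omit=None):
    """Return indices of the data points used to krige `target`.

    Points farther than `radius` (when given) are dropped and the nearest
    `maxpoints` of the remainder are kept. `omit` is the index of the point
    left out for cross-validation. Returning None when fewer than
    `minpoints` remain is an assumption made for this sketch.
    """
    # Limits quoted in the documentation above.
    if not 3 <= minpoints <= maxpoints <= 40:
        raise ValueError("need 3 <= MINPOINTS <= MAXPOINTS <= 40")
    candidates = []
    for i, p in enumerate(coords):
        if i == omit:
            continue
        d = math.dist(p, target)
        if radius is None or d <= radius:
            candidates.append((d, i))
    candidates.sort()  # nearest first
    if len(candidates) < minpoints:
        return None
    return [i for _, i in candidates[:maxpoints]]


# Made-up 6 x 6 unit grid; the point (2, 2) has index 14.
grid = [(float(x), float(y)) for x in range(6) for y in range(6)]
near = select_neighbours(grid, grid[14], radius=1.5, omit=14)
```

With a radius of 1.5 the eight surrounding grid nodes qualify, which satisfies the default minimum of 7; a radius of 1.0 would leave only four.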

You can invoke universal kriging for two-dimensional data by setting the DRIFT option to linear or to quadratic, i.e. to be of order 1 or 2 respectively. The default is DRIFT=constant, to give ordinary kriging. For data in a regular grid that is not square, the ratio of the spacing in the y direction to that in the x direction should be given by the YXRATIO option. The default is 1.0 (i.e. square).

The variogram is specified by its type and parameters, as follows. The MODELTYPE parameter can be set to power, boundedlinear (one dimension only), circular, spherical, doublespherical, pentaspherical, exponential, besselk1 (Whittle’s function), gaussian, cubic, stable (i.e. powered exponential; see Webster & Oliver 2001), cardinalsine or matern. All models may have a nugget variance, supplied using the NUGGET parameter; this is the constant estimated by MVARIOGRAM. You can specify the variance of any measurement error using the MEASUREMENTERROR parameter. The parameters of the power function (the only unbounded model) are defined by the GRADIENT and EXPONENT parameters. The power of the stable model is also supplied using the EXPONENT parameter. The parameter ν for the Matern model is supplied using the SMOOTHNESS parameter. The simple bounded models (i.e. all other settings of MODELTYPE except doublespherical) require the SILLVARIANCES (the sill of the correlated variance) and RANGES parameters. The latter is strictly the correlation range of the boundedlinear, circular, spherical and pentaspherical models, while for the asymptotic models it is the distance parameter of the model. The doublespherical model requires SILLVARIANCES and RANGES to be set to variates of length two, to correspond to the two components of the model.
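As an illustration of how the nugget, sill and range enter a bounded model, the spherical variogram is usually written as follows (the standard textbook form, e.g. Webster & Oliver 2001). The symbols are notation assumed here rather than taken from this page: c_0 for the NUGGET, c for the SILLVARIANCES value, a for the RANGES value and h for the lag distance.

$$
\gamma(h) =
\begin{cases}
0, & h = 0, \\
c_0 + c \left\{ \dfrac{3h}{2a} - \dfrac{1}{2} \left( \dfrac{h}{a} \right)^{3} \right\}, & 0 < h \le a, \\
c_0 + c, & h > a.
\end{cases}
$$

With the values used in the example at the end of this page (c_0 = 0.00466, c = 0.01515, a = 10.8), the semivariance levels off at c_0 + c = 0.01981 for lags beyond 10.8 distance units.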

The ISOTROPY parameter allows the variation to be defined to be either isotropic or anisotropic in one of two ways: either Burgess anisotropy (Burgess & Webster 1980) or geometric anisotropy (Webster & Oliver 1990). The anisotropy is specified by three parameters, namely PHI the angle in radians of the direction of maximum variation, RMAX the maximum gradient of the model, and RMIN the minimum gradient. In the current release only the power function may be anisotropic.

The predictions (or estimates) and variances can be saved using the PREDICTIONS and VARIANCES parameters. The cross-validation statistics can be saved using the STATISTICS parameter.

The PRINT option can be set to statistics to print the cross-validation statistics, or to correlation to print the correlation between the predicted and true values. Setting the PLOT option to scatter produces a scatter plot of the predicted values against the true values.

Options: PRINT, PLOT, Y, X, YOUTER, XOUTER, RADIUS, SEARCH, MINPOINTS, MAXPOINTS, DRIFT, YXRATIO, SAVE.

Parameters: DATA, ISOTROPY, MODELTYPE, NUGGET, SILLVARIANCES, RANGES, GRADIENT, EXPONENT, SMOOTHNESS, PHI, RMAX, RMIN, MEASUREMENTERROR, PREDICTIONS, VARIANCES, STATISTICS.

Method

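The mean error is given by

$$\frac{1}{n} \sum_{i=1}^{n} \left\{ z(x_i) - \hat{z}(x_i) \right\}$$

the mean squared error is

$$\frac{1}{n} \sum_{i=1}^{n} \left\{ z(x_i) - \hat{z}(x_i) \right\}^2$$

and the mean squared deviation ratio is

$$\frac{1}{n} \sum_{i=1}^{n} \frac{\left\{ z(x_i) - \hat{z}(x_i) \right\}^2}{\sigma^2(x_i)}$$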
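The three statistics defined in this section can be computed directly from the observed values, the leave-one-out predictions and the kriging variances. The Python sketch below is a minimal illustration; the function name cross_validation_statistics is hypothetical, and skipping entries that contain None (standing in for missing values) is an assumption, not documented behaviour of the procedure.

```python
def cross_validation_statistics(observed, predicted, variances):
    """Return (mean error, mean squared error, mean squared deviation ratio)."""
    # Keep only complete cases; treating None as missing is an assumption.
    cases = [(z, p, v) for z, p, v in zip(observed, predicted, variances)
             if z is not None and p is not None and v is not None]
    n = len(cases)
    if n == 0:
        raise ValueError("no complete observations")
    mean_error = sum(z - p for z, p, _ in cases) / n
    mean_squared_error = sum((z - p) ** 2 for z, p, _ in cases) / n
    msdr = sum((z - p) ** 2 / v for z, p, v in cases) / n
    return mean_error, mean_squared_error, msdr


# Made-up numbers, purely to show the call.
me, mse, msdr = cross_validation_statistics(
    observed=[1.0, 2.0, 3.0],
    predicted=[1.5, 2.0, 2.5],
    variances=[0.25, 1.0, 0.25],
)
```

A mean error near zero suggests unbiased prediction, and a mean squared deviation ratio near 1 suggests that the kriging variances are consistent with the squared errors actually observed.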

Action with RESTRICT

The vectors involved in the analysis may be restricted as for KRIGE.

References

Burgess, T.M. & Webster, R. (1980). Optimal interpolation and isarithmic mapping of soil properties. I. The semi-variogram and punctual kriging. Journal of Soil Science, 31, 315-331.

Webster, R. & Oliver, M.A. (1990). Statistical Methods in Soil and Land Resource Survey. Oxford University Press, Oxford.

Webster, R. & Oliver, M.A. (2001). Geostatistics for Environmental Scientists. Wiley, Chichester.

See also

Directives: FVARIOGRAM, FCOVARIOGRAM, KRIGE, MCOVARIOGRAM, COKRIGE.

Procedures: MVARIOGRAM, DVARIOGRAM, DCOVARIOGRAM, DHSCATTERGRAM.

Commands for: Spatial statistics.

Example

CAPTION 'KCROSSVALIDATION example',!t(\
      'Data are levels of potassium at Broom''s Barn Experimental Station',\
      '(Webster, R and Oliver, M.A. 2001. Geostatistics for Environmental',\
      'Scientists, Wiley)'); STYLE=meta,plain
VARIATE [VALUES=8(1),10(2),25(3),26(4),24(5),23(6),21(7),21(8),\
  27(9),29(10),29(11),29(12),29(13),28(14),28(15),27(16),26(17),25(18)] x
VARIATE [VALUES=(24...31),19,(23...31),(1...11),18,19,(23...31),\
  12,13,14,(2...8),(10...19),(23...31),(4...7),(9...19),(23...31),(5...19),\
  (23...30),(7...19),(23...30),(7...19),(23...30),(4...19),21,20,(22...30),\
  (2...30),(2...30),(2...30),(2...30),(2...18),(20...30),(2...18),(20...30),\
  (3...18),(20...30),(3...18),(20...29),(3...16),18,(20...29)] y
VARIATE [VALUES=26,22,18,19,26,23,32,28,55,19,18,17,15,16,19,15,\
 24,14,28,26,23,21,22,22,24,41,30,20,22,22,26,16,18,15,16,15,16,14,20,15,70,\
 20,22,24,23,20,24,20,20,34,18,18,21,18,22,28,25,28,24,23,16,18,17,16,19,21,\
 13,15,15,24,23,22,25,19,20,19,20,16,16,20,28,21,28,24,18,15,14,15,19,20,15,\
 14,16,28,22,26,27,19,19,15,19,18,20,19,27,29,35,25,16,14,15,16,16,16,16,15,\
 24,24,24,23,22,16,19,16,20,18,27,58,24,18,14,17,17,14,18,15,28,24,23,24,21,\
 16,23,20,26,18,25,20,44,24,53,12,12,15,15,16,18,21,24,20,20,18,20,24,18,23,\
 32,33,27,32,27,26,54,38,*,58,96,23,17,18,20,18,17,21,23,33,24,20,18,19,21,\
 20,21,20,23,31,27,29,30,21,24,60,20,24,30,32,29,25,21,28,35,20,21,24,36,29,\
 24,26,22,20,20,23,24,26,30,42,38,42,38,40,38,25,24,34,25,20,21,22,25,20,27,\
 27,19,32,27,28,23,22,20,21,23,24,20,29,42,36,42,37,33,35,32,30,27,27,19,21,\
 32,30,27,27,28,20,38,26,24,28,28,27,29,27,33,36,27,24,27,33,40,41,36,24,25,\
 24,28,26,25,20,23,22,32,29,19,54,42,41,37,35,33,39,53,42,28,27,26,26,42,38,\
 36,31,20,26,26,23,28,20,19,24,34,29,18,41,30,30,35,33,26,27,41,33,36,27,28,\
 32,39,39,39,27,20,26,28,23,27,24,32,32,44,28,18,39,38,32,30,28,28,35,28,24,\
 29,26,31,31,36,34,31,24,25,31,26,25,35,31,28,25,24,19,38,41,30,28,39,33,29,\
 25,38,23,26,28,29,38,38,28,24,25,28,24,29,19,22,29,39,24,39,38,36,33,28,27,\
 26,28,31,29,24,29,30,35,38,29,30,23,23,29,29,23,20,38,36] k
CALCULATE        logk = LOG10(k)
KCROSSVALIDATION [PRINT=stat,corr; Y=y; X=x; RADIUS=5; MINP=7; MAXP=20]\
                 DATA=logk; MODELTYPE=spherical; NUGGET=0.00466;\
                 SILL=0.01515; RANGE=10.8; STAT=stats
PRINT            stats
Updated on March 7, 2019
