Chapter 12 Spatial interpolation of point data

Last updated: 2020-08-12 00:40:39

Aims

Our aims in this chapter are:

  • Calculate an empirical variogram
  • Fit a variogram model
  • Interpolate using three methods:
    • Inverse Distance Weighted (IDW) interpolation
    • Ordinary Kriging (OK)
    • Universal Kriging (UK)
  • Evaluate interpolation accuracy using Leave-One-Out Cross Validation

We will use the following R packages:

  • sf
  • stars
  • gstat
  • automap

12.1 What is spatial interpolation?

12.1.1 Interpolation models

Spatial interpolation is the prediction of a given phenomenon in unmeasured locations (Figures 12.112.2). For that, we need a spatial interpolation model—a set of procedures to calculate predicted values of the variable of interest, given calibration data.

Spatial interpolation (Input elevation point data, Interpolated elevation surface)^[http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/understanding-interpolation-analysis.htm]

Figure 12.1: Spatial interpolation (Input elevation point data, Interpolated elevation surface)61

Spatial interpolation (Point locations of ozone monitoring stations, Interpolated prediction surface)^[http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/understanding-interpolation-analysis.htm]

Figure 12.2: Spatial interpolation (Point locations of ozone monitoring stations, Interpolated prediction surface)62

Calibrarion data usually include:

  • Field measurements—available for a limited number of locations, for example: rainfall data from meteorological stations
  • Covariates—available for each and every location within the area of interest, for example: elevation from a DEM

Spatial interpolation models can be divided into two general categories:

  • Deterministic models—Models using arbitrary parameter values, for example: IDW
  • Statistical models—Models using parameters chosen objectively based on the data, for example: Kriging

Keep in mind that data structure does not imply meaning. It is technically possible to interpolate any numeric variable measured in a set of points, however it does not always make sense to do so. For example, it does not make sense to spatially interpolate point data when they refer to a localized phenomenon, such as amount of emissions per power plant (Figure 12.3).

$CO_2$ emissions from power plants: a localized phenomenon^[https://edzer.github.io/UseR2016/]

Figure 12.3: \(CO_2\) emissions from power plants: a localized phenomenon63

Converesly, it does not make sense to sum up point measurements of a continuous phenomenon (Figure 12.4).

$PM_{10}$ measurements: a continuous phenomenon^[https://edzer.github.io/UseR2016/]

Figure 12.4: \(PM_{10}\) measurements: a continuous phenomenon64

12.1.2 The weighted average principle

Many of the commonly used interpolation methods, including the ones we learn about in this Chapter (Nearest Neighbor, IDW, Kriging), are based on the same principle, where a predicted value is a weighted average of neighboring points. Weight are usually inveresely related to distance, i.e., as distance increases the weight (importance) of the point decreases. The predicted value for a particular point is calculated as a weighted average of measured values in other points:

\[\hat{Z}(s_{0})=\frac{\sum_{i=1}^{n}w(s_{i})Z(s_{i})}{\sum_{i=1}^{n}w(s_{i})}\]

where:

  • \(\hat{Z}(s_{0})\) is the predicted value at location \(s_{0}\)
  • \(w(s_{i})\) is the weight of measured point \(i\)
  • \(Z(s_{i})\) is the value of measured point \(i\)

The weight \(w(s_{i})\) of each measured point is a function of distance (Figure 12.5) from the predicted point.

Distances between predicted point and all measured points^[http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/how-kriging-works.htm]

Figure 12.5: Distances between predicted point and all measured points65

In IDW, the weight is the inverse of distance to the power of \(p\):

\[w(s_{i})=\frac{1}{d(s_{0}, s_{i})^p}\]

where:

  • \(w(s_{i})\) is the weight of measured point \(i\)
  • \(d(s_{0}, s_{i})\) is the distance between predicted point \(s_{0}\) and measured point \(s_{i}\)

The default value for \(p\) is usually \(p=2\):

\[w(s_{i})=\frac{1}{d(s_{0}, s_{i})^2}\]

The \(p\) parameter basically determines how steeply does weight increase with proximity. As a result, \(p\) determines whether weights are more or less equally distributed among neighbors (low \(p\)) or whether one point (the nearest) has overwhelmingly high weight and thus the predicted value will be strongly influenced by that point (high \(p\)). In other words, when \(p\) approaches zero, the predicted result will approach a uniform surface which is just an average of all measured points. When \(p\) approaches infinity, the predicted result will approach nearest neighbor interpolation, which is the simplest spatial interpolation method there is: every predicted point gets the value of the nearest measured point (Figures 12.612.7).

Spatial interpolation of annual rainfall using IDW with $p=0.25$, $p=2$ and $p=16$Spatial interpolation of annual rainfall using IDW with $p=0.25$, $p=2$ and $p=16$Spatial interpolation of annual rainfall using IDW with $p=0.25$, $p=2$ and $p=16$

Figure 12.6: Spatial interpolation of annual rainfall using IDW with \(p=0.25\), \(p=2\) and \(p=16\)

Nearest Neighbor interpolation (left) and Voronoi polygons (right)Nearest Neighbor interpolation (left) and Voronoi polygons (right)

Figure 12.7: Nearest Neighbor interpolation (left) and Voronoi polygons (right)

In Kriging, the weight is a particular function of distance known as the variogram model (Figure 12.8). The variogram model is fitted to characterize the autocorrelation structure in the measured data, based on the empirical variogram.

Variogram models: spherical (left) and exponential (right)^[http://desktop.arcgis.com/en/arcmap/10.3/tools/spatial-analyst-toolbox/how-kriging-works.htm]

Figure 12.8: Variogram models: spherical (left) and exponential (right)66

There are two frequently used kriging methods: Ordinary Kriging (OK) and Universal Kriging (UK). Adding up the Inverse Distance Weighted (IDW) interpolation, we now mentioned three interpolation methods. We are going to cover those three methods (Figure 12.9), mostly from the practical point of view, in the next three sections (Sections 12.212.4).

Spatial interpolation of annual rainfall using IDW, OK and UKSpatial interpolation of annual rainfall using IDW, OK and UKSpatial interpolation of annual rainfall using IDW, OK and UK

Figure 12.9: Spatial interpolation of annual rainfall using IDW, OK and UK

For the examples, we will load the rainfall.csv file (Section 4.4.3), calculate the annual column (Section 4.5) and convert it to a point layer (Section 7.4):

We will also use a \(1\times1\) \(km^2\) DEM raster of the area of interest (Sections 9.1, 9.3 and 10.1):

Finally, we subset the dem to include only the area to the north of 31 degrees latitude, where meteorological station density is relatively high:

Next, we extract the elevation values (Section 10.7.2):

and subset those stations that coincide with the raster:

Figure 12.10 shows the dem elevation raster and the rainfall point layer:

Rainfall data points and elevation raster

Figure 12.10: Rainfall data points and elevation raster

12.2 Inverse Distance Weighted interpolation

12.2.1 The gstat object

To interpolate, we first need to create an object of class gstat, using a function of the same name: gstat. A gstat object contains all necessary information to conduct spatial interpolation, namely:

  • The model definition
  • The calibration data

Based on its arguments, the gstat function “understands” what type of interpolation model we want to use:

  • No variogram model → IDW
  • Variogram model, no covariates → Ordinary Kriging
  • Variogram model, with covariates → Universal Kriging

The complete decision tree of gstat, including several additional methods which we are not going to use, is shown in Figure 12.11.

`gstat` predict methods^[Applied Spatial Data Analysis with R (2013)]

Figure 12.11: gstat predict methods67

We are going to use three parameters of the gstat function:

  • formula—The prediction “formula” specifying the dependent and the independent variables (covariates)
  • data—The calibration data
  • model—The variogram model

Keep in mind that we need to specify parameter names, because these three parameters are not the first three in the gstat function definition.

For example, to interpolate using the IDW method we create the following gstat object, specifying just the formula (Section 12.2.2 below) and data:

12.2.2 Working with formula objects

Im R, formula objects are used to specify relation between objects, in particular—the role of different data columns in statistical models. A formula object is created using the ~ operator, which separates names of dependent variables (to the left of the ~ symbol) and independent variables (to the right of the ~ symbol). Writing 1 to the right of the ~ symbol, as in ~ 1, means that there are no independent variables68.

For example, in the following expression we create a formula object named f:

Checking the class shows that f is indeed a formula object:

We can also convert character values to formula using the as.formula function. For example:

The as.formula function is particularly useful when we want to construct different formulas as part of a for loop (Section 12.3.3).

12.2.3 Making predictions

Now that our model is defined, we can use the predict function to actually interpolate, i.e., to calculate predicted values. The predict function accepts:

  • A rasterstars object, such as dem
  • A modelgstat object, such as g

The raster serves for two purposes:

  • Specifying the locations where we want to make predictions (in all methods)
  • Specifying covariate values (in Universal Kriging only)

For example, the following expression interpolates annual values according to the model defined in g and the raster template defined in dem:

The resulting stars object has two attributes:

  • var1.pred—the predicted values
  • var1.var—the variance (for Kriging only)

For example:

We can subset just the first attribute and rename it to "annual":

The interpolated annual rainfall raster, using IDW, is shown in Figure 12.12:

Predicted annual rainfall using Inverse Distance Weighted (IDW) interpolation

Figure 12.12: Predicted annual rainfall using Inverse Distance Weighted (IDW) interpolation

12.3 Ordinary Kriging

12.3.1 Annual rainfall example

Kriging methods require a variogram model. The variogram model is an objective way to quantify the autocorrelation pattern in the data, and assign weights accordingly when making predictions (Section 12.1.2).

As a first step, we can calculate and examine the empirical variogram using the variogram function. The function requires two arguments:

  • formula—Specifies the dependent variable and the covariates, just like in gstat
  • data—The point layer with the dependent variable and covariates as point attributes

For example, the following expression calculates the empirical variogram of annual, with no covariates:

Using plot to examine it we can examine the variogram (Figure 12.13):

Empirical variogram

Figure 12.13: Empirical variogram

There are several ways to fit a variogram model to an empirical variogram. We will use the simplest one—automatic fitting using function autofitVariogram from package automap:

The function chooses the best fitting type of model, and also fine tunes its parameters. (Use show.vgms() to display variogram model types.) Note that the autofitVariogram function does not work on sf objects, which is why we convert the object to a SpatialPointsDataFrame (package sp).

The fitted model can be plotted with plot (Figure 12.14):

Variogram model

Figure 12.14: Variogram model

The resulting object is actually a list with several components, including the empirical variogram and the fitted variogram model. The $var_model component of the resulting object contains the actual model:

The variogram model can then be passed to the gstat function, and we can carry on with the Ordinary Kriging interpolation:

Again, we will subset the predicted values attribute and rename it:

The Ordinary Kriging predictions are shown in Figure 12.15:

Predicted annual rainfall using Ordinary Kriging

Figure 12.15: Predicted annual rainfall using Ordinary Kriging

12.3.2 Elevation example

Another example: suppose that we did not have a DEM for Israel, but only the elevation measurements at the meteorological stations. How can we produce an elevation raster using Ordinary Kriging?

First, we prepare the gstat object:

Then, we interpolate:

The predicted elevation raster is shown in Figure 12.16:

Ordinary Kriging prediction of elevation

Figure 12.16: Ordinary Kriging prediction of elevation

12.3.3 Monthly rainfall example

In the next example we use kriging inside a for loop, to make a series of predictions for different variables. Specifically, we will use Ordinary Kriging to predict monthly rainfall, i.e., sep through may columns in the rainfall layer.

In each for loop “round”, the formula is going to be re-defined according to the current month i. For example:

First, we set up a vector with the column names of the variables we wish to interpolate, and a list where we “collect” the results:

Second, we specify the for loop, as follows:

Finally, we combine the list of results per month into a single multi-band raster, using do.call and c (Section 11.3.2):

The interpolated montly rainfall amounts are shown in Figure 12.17:

Monthly rainfall predictions using Ordinary Kriging

Figure 12.17: Monthly rainfall predictions using Ordinary Kriging

12.4 Universal Kriging

Universal Kriging interpolation uses a model with one or more independent variables, i.e., covariates. The covariates need to be known for both:

  • The point layer, as an attribute such as elev_1km in rainfall
  • The predicted locations, as raster values such as dem values

The formula now specifies the name(s) of the covariate(s) to the right of the ~ symbol, separated by + if there are more than one. Also, we are using a subset of rainfall where elev_1km values were present:

Comparing the Ordinary Kriging and Universal Kriging variogram models (Figure 12.18):

OK and UK variogram modelsOK and UK variogram models

Figure 12.18: OK and UK variogram models

Next we create a gstat object, where the formula contains the covariate and the corresponding variogram model:

Remember that all of the variables that appear in the formula need to be present in the data. In this case we have two variables: a dependent variable (annual) and an independent variable (elev_1km).

Now we can make predictions:

and then subset and rename:

Universal Kriging predictions are shown in Figure 12.19:

Predicted annual rainfall using Universal Kriging

Figure 12.19: Predicted annual rainfall using Universal Kriging

12.5 Cross-validation

In Sections 12.212.4, we have calculated annual rainfall surfaces using three different methods: IDW, Ordinary Kriging and Universal Kriging. Although it is useful to examine and compare the results graphically (Figures 12.12, 12.15, and 12.19), we also need an objective way to evaluate interolation accuracy. That way, we can objectively choose the most accurate method among the various interpolation methods there are.

Plainly speaking, to evaluate prediction accuracy we need to compare the predicted values with measured data in the same location. Since measured data are often sparse and expensive to produce, it makes little sense to collect more data merely for the sake of accuracy assessment. Instead, the available data are usually split to two parst, called training and test data. The training data are used to fit the model, while the test data are used to calculate prediction accuracy. The procedure is called cross validation. A specific, commonly used, type of cross validation is Leave-One-Out Cross Validation where all observations consecutively take the role of test data while the remaning observations take the role of training data. The separation of training and test data is important because evaluating a model based on the same data used to fit it gives the model an “unfair” advantage and therefore overestimates accuracy.

In Leave-One-Out Cross Validation we:

  • Take out one point out of the calibration data
  • Make a prediction for that point
  • Repeat for all points

In the end, what we get is a table with an observed value and a predicted value for all points.

We can run Leave-One-Out Cross Validation using the gstat.cv function, which accepts a gstat object:

The gstat.cv function returns an object of class SpatialPointsDataFrame (package sp), which we can convert to an sf object with st_as_sf:

The result of gstat.cv has the following attributes:

  • var1.pred—Predicted value
  • var1.var—Variance (only for Kriging)
  • observed—Observed value
  • residual—Observed-Predicted
  • zscore—Z-score (only for Kriging)
  • fold—Cross-validation ID

A bubble plot is convenient to examine the residuals, since it shows positive and negative values in different color (Figure 12.20):

Cross-validation residuals

Figure 12.20: Cross-validation residuals

Using the “predicted” and “observed” columns we can calculate prediction accuracy indices, such as the Root Mean Square Error (RMSE):

\[ RMSE=\sqrt{\frac{\sum_{i=1}^{n} (pred_i-obs_i)^2}{n}} \]

where \(pred_i\) and \(obs_i\) are predicted and observed values for point \(i\), respectively.

For example: