This assignment explored several concepts of spatial interpolation, focusing mainly on the Inverse Distance Weighting (IDW) method.
The web maps created for this assignment utilized the sample dataset from the 'Global Summary Of the Day' (GSOD), as well as the US Current Weather and Wind Station Data layer from NOAA.
To create a sample temperature map for a European country, I chose Poland mainly because of its fairly compact shape, which is less irregular than that of more elongated countries (e.g. Italy, Sweden, Norway, Finland). Its weather stations are also well distributed across the country, which makes it well suited to interpolation.
The Station point layer was symbolized using the Above and Below theme under the Counts and Amounts (Color) option in ArcGIS Online to emphasize values above and below the average temperature. Viewers can readily see that the red points are stations that recorded temperatures above the average of 9.87, while the blue points are stations that recorded temperatures below it.
The raster surface was generated by running the Interpolate Points tool. I used the Geometric Interval classification method because it works well with data that are not normally distributed, as is the case with my Poland sample data.
The Geometric Interval method aims to give each class approximately the same number of values while keeping the change between intervals fairly consistent. I classified the temperature values into only eight categories to make comparison easier, since the human eye can typically differentiate only about 8-10 colors efficiently.
Another part of this exercise examines the effect of different IDW power values on the resulting interpolated surface. I used the US Current Weather and Wind Station Data layer from NOAA to create a temperature map of Kansas using two different power values for IDW interpolation.
The Current Weather and Wind Station Data layer is created from hourly METAR station data provided by NOAA and contains approximately 11 weather variables for each location. The data is updated hourly (at the top of each hour) using the Aggregated Live Feeds methodology.
Using the Swipe tool on this Kansas Temperature web app, you can dynamically compare and inspect the results of the IDW interpolation.
Unlike geostatistical techniques such as Kriging, deterministic interpolation techniques such as IDW do not produce prediction standard errors. They do not fit a statistical model of the spatial autocorrelation among the measured points, so they cannot quantify the uncertainty of their predictions.
One way to assess and compare interpolation results is to divide the sample points into training and test data. For example, in leave-one-out cross-validation, one point is removed from the sample and its value is interpolated from the remaining points. The predicted value is then compared with the known value of the removed point. Repeating this for every point produces a measure of prediction accuracy. Alternatively, a hold-out split is often used, typically following the 80-20 rule: 80% of the points are used for interpolation and 20% for validation.
We can then calculate the Root Mean Squared Error (RMSE) by squaring the differences between the observed and interpolated values, averaging them, and taking the square root of the result. The technique with the lower RMSE can be deemed the better one.
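The validation described above can be sketched in a few lines of Python. This is only a minimal illustration, not the workflow used in ArcGIS Online: the function names `idw_predict` and `loocv_rmse` and the station coordinates and temperatures are made up for the example, and distances are treated as plain planar distances.

```python
import math

def idw_predict(known, target, power=2.0):
    """Estimate the value at `target` from (x, y, value) tuples in `known`."""
    num = 0.0
    den = 0.0
    for x, y, value in known:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return value  # target coincides with a sample point
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den

def loocv_rmse(points, power=2.0):
    """Leave each point out in turn, predict it from the rest, return the RMSE."""
    squared_errors = []
    for i, (x, y, observed) in enumerate(points):
        rest = points[:i] + points[i + 1:]
        predicted = idw_predict(rest, (x, y), power)
        squared_errors.append((observed - predicted) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical stations as (x, y, temperature); not the assignment data.
stations = [
    (0.0, 0.0, 10.0),
    (1.0, 0.0, 12.0),
    (0.0, 1.0, 11.0),
    (1.0, 1.0, 13.0),
    (0.5, 0.5, 11.5),
]

for p in (1.0, 2.0):
    print(f"power={p}: leave-one-out RMSE = {loocv_rmse(stations, p):.3f}")
```

Running the loop for two power values mirrors the comparison in this assignment: the setting with the lower RMSE would be preferred for that particular set of points.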
The weight applied to each feature in IDW is inversely proportional to its distance from the point being calculated, raised to the power p. The power value determines how quickly the weight falls off with distance: as p increases, the farther points have less influence on the interpolated value.
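To make the role of the power value concrete, here is a small sketch that computes normalized IDW weights for three stations. The helper name `idw_weights` and the distances are hypothetical and chosen only for illustration.

```python
def idw_weights(distances, power):
    """Return normalized IDW weights (summing to 1) for the given distances."""
    raw = [1.0 / d ** power for d in distances]
    total = sum(raw)
    return [w / total for w in raw]

# Hypothetical distances from the prediction location to three stations.
distances = [1.0, 2.0, 4.0]

for p in (1, 2, 3):
    weights = idw_weights(distances, p)
    print(f"p={p}:", [round(w, 3) for w in weights])
```

With these distances, the nearest station's share of the total weight rises from about 0.57 at p = 1 to about 0.76 at p = 2, while the farthest station's share shrinks, which is the behaviour described above.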
Below is an illustration that gives us an easier way to visualize the effect of different power values for IDW interpolation.
The chart below shows the histogram of the pixel values from the interpolated surface using power values 1 and 2. Statistics of the resulting rasters are also recorded for comparison.
Using a power value of 2 (the default IDW value in ArcGIS) slightly changed the interpolated surface, as shown in the histogram below. The mean temperature drops slightly to 31.34. Judging by the pixel counts, the number of pixels in the 30-32 temperature range increased slightly, while the distribution still roughly follows a normal curve. The kurtosis value, which describes the shape of the distribution's tails, also implies that a power value of 1 produces a distribution with fewer and less extreme outliers than a normal distribution (kurtosis = 3), compared with the power value of 2.
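For readers who want to reproduce such raster statistics outside ArcGIS, the sketch below computes the mean and the kurtosis of a list of pixel values. It assumes the population (moment-based) definition of kurtosis, under which a normal distribution scores 3; ArcGIS may apply a different estimator, and the pixel values are invented for the example.

```python
def mean_and_kurtosis(values):
    """Return the mean and the Pearson kurtosis (normal distribution = 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return mean, m4 / m2 ** 2

# Hypothetical pixel values; not read from the Kansas rasters.
pixels = [29.0, 30.0, 31.0, 31.0, 32.0, 33.0]

mean, kurtosis = mean_and_kurtosis(pixels)
print(f"mean = {mean:.2f}, kurtosis = {kurtosis:.2f}")  # mean = 31.00, kurtosis = 2.04
```

A value below 3, as in this toy example, indicates lighter tails than a normal distribution.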
To find the optimal power value for IDW interpolation, one can evaluate different power values and compare the Root Mean Squared Prediction Error (RMSPE) of the resulting interpolated surfaces. The power value that produces the lowest RMSPE is the optimal one.
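The search for an optimal power value can be sketched as a simple scan over candidate values. The example below assumes a hold-out split (separate training and test points) and uses invented coordinates and temperatures; `idw_predict`, `rmspe`, and `best_power` are illustrative names, not ArcGIS functions.

```python
import math

def idw_predict(known, target, power):
    """IDW estimate at `target` from (x, y, value) tuples in `known`."""
    num = den = 0.0
    for x, y, value in known:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            return value  # target coincides with a sample point
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den

def rmspe(train, test, power):
    """Root mean squared prediction error on the held-out test points."""
    squared = [(v - idw_predict(train, (x, y), power)) ** 2 for x, y, v in test]
    return math.sqrt(sum(squared) / len(squared))

def best_power(train, test, candidates):
    """Return the candidate power value with the lowest RMSPE."""
    return min(candidates, key=lambda p: rmspe(train, test, p))

# Hypothetical training and test stations as (x, y, temperature).
train = [(0.0, 0.0, 10.0), (2.0, 0.0, 14.0), (0.0, 2.0, 12.0), (2.0, 2.0, 16.0)]
test = [(0.5, 0.0, 11.0), (1.0, 1.5, 13.5)]
candidates = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

for p in candidates:
    print(f"power={p}: RMSPE = {rmspe(train, test, p):.3f}")
print("lowest RMSPE at power =", best_power(train, test, candidates))
```

The result depends entirely on the sample points and on how they are split, so the selected value should be read as optimal for that dataset only.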
Note: The web application used in this assignment will be available as long as the contents are accessible from the ZGIS ArcGIS Online organization account.