Map algebra#

Map algebra is often used as an umbrella term for conducting various mathematical, logical, and spatial analysis operations based on raster data. These techniques were first developed by Dana Tomlin in the 1970s (Tomlin, 1990), and they have since been a fundamental part of raster data analysis in GIS. Map algebra provides a set of operators that can be applied to one or more raster layers to produce a new raster layer. For instance, you can perform basic mathematical calculations (multiply, sum, divide, etc.) between multiple raster layers, which are central operations in map overlay analysis, or conduct mathematical operations on a single raster to compute values based on a given neighborhood. The latter can be used e.g. to detect hot spots, i.e. areas where high pixel values are surrounded by other high values. Map algebra is widely used in terrain analysis, land suitability modeling, hydrological modeling, and environmental assessments. By integrating spatial data with mathematical functions, it enables powerful spatial decision-making.

The operations of map algebra can be divided into different categories:

  • Focal operations compute values based on a specified neighborhood (e.g. 3x3 window) on a given raster layer.

  • Local operations apply functions on a cell-by-cell basis between multiple raster layers.

  • Global operations use all raster cells in computations to calculate e.g. statistical summaries.

  • Zonal operations analyze values within defined zones, such as calculating average elevation within a watershed.

  • Incremental operations apply iterative calculations or cumulative functions over space or time (e.g. cumulative cost surfaces).
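
The differences between these categories can be illustrated with a small NumPy sketch (a conceptual toy example, not the xarray-spatial API used later in this chapter):

```python
import numpy as np

# A toy "raster" plus a second layer for demonstrating a local operation
a = np.arange(16, dtype=float).reshape(4, 4)
b = np.full_like(a, 10.0)

# Local: cell-by-cell arithmetic between two layers
local_sum = a + b

# Global: every output value depends on the entire raster
global_mean = a.mean()

# Focal: a statistic over a neighborhood, here the 3x3 window
# centered on the cell at row 1, column 1
focal_mean = a[0:3, 0:3].mean()

# Zonal: a statistic per zone, here rows 0-1 versus rows 2-3
zonal_means = [a[:2].mean(), a[2:].mean()]
```

The same logic carries over to real raster layers, where libraries handle the bookkeeping (neighborhood windows, nodata values, zone boundaries) for us.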

There are various Python libraries that can be used for map algebra. Here, we focus on the xarray, xarray-spatial and rasterstats libraries, which provide numerous useful functionalities for conducting focal, local, global, zonal and incremental operations on raster data. In the following, we will apply map algebra to Digital Elevation Model (DEM) raster data from Eastern Finland to gain knowledge of the topography in this area. In addition, we will learn how to conduct a suitability analysis to find an optimal location for building a new summer house based on specific criteria, and how to conduct path finding on raster data to find the least-cost route between two locations across a raster cost surface.

Focal operations#

A focal function operates on a cell and its neighboring cells within a defined window (e.g., 3x3 or 5x5). The output value for each cell is derived by applying a mathematical or statistical operation to the values within that neighborhood.
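
Conceptually, a focal operation is a loop over cells, where each output value is computed from the window around that cell. Below is a minimal sketch of a focal mean, with edge windows simply clipped (libraries such as xarray-spatial implement this far more efficiently and flexibly):

```python
import numpy as np

def focal_mean(raster, size=3):
    """Naive focal mean: average each cell's size x size neighborhood.
    Windows are clipped at the raster edges."""
    half = size // 2
    rows, cols = raster.shape
    out = np.empty_like(raster, dtype=float)
    for i in range(rows):
        for j in range(cols):
            window = raster[max(i - half, 0):i + half + 1,
                            max(j - half, 0):j + half + 1]
            out[i, j] = window.mean()
    return out

dem = np.array([[10.0, 12.0, 11.0],
                [13.0, 30.0, 12.0],
                [11.0, 12.0, 10.0]])

# The spike in the center is averaged with its 8 neighbors
smoothed = focal_mean(dem)
```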

import xarray as xr
import xrspatial
import matplotlib.pyplot as plt

fp = "data/Tuupovaara_DEM.nc"
data = xr.open_dataset(fp, decode_coords="all")
data
<xarray.Dataset> Size: 1MB
Dimensions:      (x: 400, y: 400)
Coordinates:
  * x            (x) float64 3kB 3.69e+06 3.69e+06 3.69e+06 ... 3.7e+06 3.7e+06
  * y            (y) float64 3kB 6.95e+06 6.95e+06 ... 6.94e+06 6.94e+06
    spatial_ref  int64 8B ...
Data variables:
    elevation    (y, x) float64 1MB ...
Attributes:
    AREA_OR_POINT:  Area
# Plot the elevation values and contours
fig, ax = plt.subplots(figsize=(9, 7))
data["elevation"].plot(ax=ax)
cs = data["elevation"].plot.contour(ax=ax, colors="red", linewidths=0.5)

# Label contours
ax.clabel(cs, cs.levels, inline=True, fontsize=6)
plt.title("Elevation in Tuupovaara, Finland");

Figure 7.X. Elevation surface with contour lines.

Slope#

# Calculate slope
data["slope"] = xrspatial.slope(data["elevation"])
data["slope"].plot(cmap="Greens")
plt.title("Slope (degrees)");

Figure 7.X. Slope in degrees calculated from the elevation data.

Aspect#

# Calculate aspect
data["aspect"] = xrspatial.aspect(data["elevation"])
# Filter values that are below 0 (areas without aspect defined)
data["aspect"] = data["aspect"].where(data["aspect"] >= 0)
data["aspect"].plot(cmap="jet")
plt.title("Aspect (degree between 0-360)\n0 faces North");

Figure 7.X. Aspect surface shows the direction of the slope in degrees.

Curvature#

data["curvature"] = xrspatial.curvature(data["elevation"])
data["curvature"].plot()
plt.title("Curvature");

Figure 7.X. Curvature describes the rate of change in the slope.

Curvature describes how fast the slope is increasing or decreasing as we move along a surface. A positive curvature means the surface is curving up (upwardly convex) at that cell. A negative curvature means the surface is curving down (downwardly convex) at that cell. A curvature of 0 means the surface is straight and constant in whatever angle it is sloped towards.
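
The behavior described above is related to the second derivative of the surface, which can be checked with a one-dimensional analogy (a conceptual NumPy sketch; terrain-analysis tools differ in their exact formulas and sign conventions):

```python
import numpy as np

x = np.linspace(-1, 1, 21)
valley = x**2       # profile of a depression
crest = -(x**2)     # profile of a hilltop

# Approximate the second derivative by differentiating twice
d2_valley = np.gradient(np.gradient(valley, x), x)
d2_crest = np.gradient(np.gradient(crest, x), x)

# In the interior, the valley profile has a positive second derivative
# and the crest a negative one; a constant slope would give zero
print(d2_valley[10], d2_crest[10])
```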

Hot and cold spots#

Hot and cold spot analysis identifies statistically significant hot spots and cold spots in an input raster. To be a statistically significant hot spot, a cell must have a high value and be surrounded by other cells with high values as well. Thus, it is a similar measure to local indicators of spatial autocorrelation (LISA), although hot/cold spot analysis focuses on identifying only high-high and low-low areas, whereas LISA also identifies outliers (e.g. high values surrounded by low values).

# Generate a circular kernel that defines the neighborhood for the analysis
# (a radius of 5 cells is used here as an example)
kernel = xrspatial.convolution.circle_kernel(1, 1, 5)

data["hot_cold"] = xrspatial.focal.hotspots(data["elevation"], kernel)
data["hot_cold"].plot(cmap="RdYlBu_r", figsize=(6, 4))
plt.title("Identified hot and cold spots based on the elevation");

Figure 7.X. Hot spots are clusters with high values surrounded by other high values.

Hillshade#

data["hillshade"] = xrspatial.hillshade(data["elevation"])
data["hillshade"].plot(cmap="Greys")
plt.title("Hillshade");

Figure 7.X. Hillshade is a shaded relief based on the surface raster considering the illumination source angle and shadows.

# Calculate relative height
data["relative_height"] = data["elevation"] - data["elevation"].min().item()
from matplotlib.colors import LightSource, Normalize
import matplotlib.colorbar as cbar
import matplotlib.cm as cm
import numpy as np

fig, ax = plt.subplots()

# Specify the colormap to use
colormap = plt.cm.terrain

# Specify the light source
ls = LightSource(azdeg=225, altdeg=25)

# Convert DataArray into numpy array
array = data["relative_height"].to_numpy()

# Normalize elevation for color mapping
norm = Normalize(vmin=np.min(array), vmax=np.max(array))

# Create hillshade based on elevation
hillshade = ls.shade(array, cmap=colormap, vert_exag=1, blend_mode="overlay")
ax.imshow(hillshade)
ax.set_title("Hillshade with color blending")

# Create a ScalarMappable for colorbar
sm = cm.ScalarMappable(cmap=colormap, norm=norm)
sm.set_array([])  # Needed for colorbar creation

# Add colorbar
cbar = fig.colorbar(sm, ax=ax, orientation="vertical", label="Relative Height (m)")

Figure 7.X. Hillshade with color blending can give a more realistic appearance of the landscape.

Smoothing and focal statistics#

# Kernel radius (in cells)
k = 15

# Generate a circular kernel (a matrix of ones and zeros that defines the neighborhood)
kernel = xrspatial.convolution.circle_kernel(1, 1, k)
# Smooth the surface by calculating the focal mean
data["smoothed_elevation"] = xrspatial.focal.focal_stats(
    data["elevation"], kernel, stats_funcs=["mean"]
)

data["smoothed_elevation"].plot(cmap="RdYlBu_r", figsize=(6, 4))
plt.title("Kernel smoothing with kernel size 15");

Figure 7.X. Smoothed surface based on the average elevation of 15 neighboring cells at each pixel.

Reclassification#

The goal in the following section is to calculate and use different surface features to find a suitable place for building a new summer house. To do this, we will use information about, for example, the elevation, slope and aspect of the terrain. The criteria for finding a suitable place for a summer cottage are based on the following preferences:

  • The higher the elevation, the better

  • Some slope is good but not too steep

  • The ridge should be pointing South (who wouldn’t like more sun on their patio?)
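
The scoring below rests on reclassification: continuous raster values are binned into classes, and each class is assigned preference points. The idea can be sketched with plain NumPy (the bin edges and scores here are hypothetical; the actual analysis below uses xarray-spatial classifiers):

```python
import numpy as np

# Example slope values in degrees
slopes = np.array([0.5, 3.0, 8.0, 15.0, 35.0])

# Bin edges separating flat, gentle, moderate, steep and very steep slopes
bins = [2.0, 6.0, 12.0, 25.0]
classes = np.digitize(slopes, bins)  # class index 0-4 for each cell

# Assign preference points per class: some slope is good, steep is not
points = np.array([3, 5, 4, 2, 1])[classes]
print(classes, points)
```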

# Take 20 % sample to reduce the time it takes to classify
percentage = 0.2

# The sample size
n = int(round(int(data["elevation"].count()) * percentage, 0))

# Reclassify elevation into 5 classes and add number 1 to the result to make the scale from 1-5
data["elevation_points"] = (
    xrspatial.classify.natural_breaks(data["elevation"], k=5, num_sample=n) + 1
)

# Plot the result
fig, ax = plt.subplots(figsize=(8, 5))
data["elevation"].plot(ax=ax)
data["elevation_points"].plot(ax=ax)
plt.title("Elevation categories");

Figure 7.X. Elevation categories (k=5) based on natural breaks classification scheme.

# Map the five natural-breaks slope classes to preference points:
# some slope is good (class 2 scores highest) but steep slopes score lowest
bins = [1, 2, 3, 4, 5]
new_values = [4, 5, 3, 2, 1]

data["slope_nb"] = (
    xrspatial.classify.natural_breaks(data["slope"], k=5, num_sample=n) + 1
)
data["slope_points"] = xrspatial.classify.reclassify(
    data["slope_nb"], bins=bins, new_values=new_values
)

# Plot
fig, ax = plt.subplots(figsize=(6, 4))
data["slope_points"].plot(ax=ax, cmap="Greens")
plt.title("Slope categories");

Figure 7.X. Slope categories (k=5) based on natural breaks classification scheme.

# Score the aspect: south-facing slopes (150-210 degrees) get the highest points
bins = [90, 150, 210, 270, 360]
new_values = [1, 3, 5, 3, 1]

# Classify
data["aspect_points"] = xrspatial.classify.reclassify(
    data["aspect"], bins=bins, new_values=new_values
)

# Make a plot
fig, ax = plt.subplots(figsize=(6, 4))
data["aspect_points"].plot(ax=ax, cmap="RdYlBu_r", alpha=0.7)
plt.title("Aspect categories based on custom classifier");

Figure 7.X. Aspect categories based on a custom classification scheme.

Local operations#

A local function operates on a cell-by-cell basis, combining the values of one or more raster layers at each cell location. A typical example is a weighted overlay, in which input layers are multiplied by weights and summed together, as we do next when combining the suitability points. Chapter 7.6 includes many more examples of using local operations related to working with multiband satellite data and geospatial timeseries data spanning multiple years.

# Calculate the suitability index by weighting the "points" given for different layers
data["suitability_index"] = (
    data["elevation_points"] * 0.2
    + data["aspect_points"] * 0.6
    + data["slope_points"] * 0.2
)

# Plot the suitability index
data["suitability_index"].plot(cmap="RdYlBu_r", figsize=(6, 4))
plt.title("Suitability index");

Figure 7.X. Suitability index calculated based on elevation, aspect and slope.

Global operations#

In map algebra, global functions are operations where the output value of each cell depends on the entire dataset or a large spatial extent, not just local neighbors. These functions are used to analyze patterns, relationships, and spatial influences across the whole raster. They are essential for modeling cumulative effects, spatial dependencies, and large-scale patterns in fields like hydrology, transportation, and environmental science.
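
Beyond summary statistics, a typical global operation is standardization, where every output cell depends on statistics computed over the entire raster. A minimal sketch with NumPy (with xarray, the same expression works directly on a DataArray):

```python
import numpy as np

elevation = np.array([[100.0, 150.0],
                      [200.0, 250.0]])

# Each output cell uses the global mean and standard deviation
z_scores = (elevation - elevation.mean()) / elevation.std()

# The standardized raster has zero mean and unit standard deviation
print(z_scores.mean(), z_scores.std())
```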

Statistical summaries#

data["elevation"].min().item()
1342.0
data["elevation"].max().item()
2294.0
data["elevation"].mean().item()
1663.44648125
data["elevation"].median().item()
1649.0
data["elevation"].std().item()
147.3785525805685

Viewshed#

from shapely import box, Point
import geopandas as gpd

# Extract the center coordinates of the raster
bbox = box(*data.rio.bounds())
xcoord = bbox.centroid.x
ycoord = bbox.centroid.y

# Create a GeoDataFrame of the centroid
observer_location = gpd.GeoDataFrame(geometry=[Point(xcoord, ycoord)], crs=data.rio.crs)
# Elevation at a given point
elevation = data["elevation"].interp(x=xcoord, y=ycoord).item()
print("Elevation in the location of observer:", elevation, "meters.")
Elevation in the location of observer: 1509.25 meters.

Let’s imagine that there is a bird watching tower that rises 10 meters above the ground, and that a person standing on top of this tower is viewing the landscape. To calculate the viewshed from this observation point, we can use the viewshed() function from the xrspatial library as follows:

# Observer elevation
observer_elevation = 10

# Calculate viewshed
data["viewshed"] = xrspatial.viewshed(
    data["elevation"], x=xcoord, y=ycoord, observer_elev=observer_elevation
)
fig, ax = plt.subplots()

# Plot hillshade that was calculated earlier
data["hillshade"].plot(ax=ax, cmap="Greys")

# Plot viewshed
data["viewshed"].plot(ax=ax, cmap="RdYlBu_r", alpha=0.6)

# Observer location
observer_location.plot(
    ax=ax, color="black", marker="x", markersize=15, label="Observer location"
)

# Add legend and title
ax.legend(loc="upper left")
ax.set_title("Visible areas from the observer location");

Figure 7.X. Visible areas from the observer location based on the viewshed analysis.

Zonal operations#

In zonal operations, statistics are computed for groups of cells that fall within defined zones, such as the average elevation within a watershed. The zones can be defined either as a categorical raster or as vector polygons. In the following, we compare the elevation characteristics of two zones, a lake and a peak, that we fetch from OpenStreetMap.
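
When the zones are given as a categorical raster, a zonal statistic reduces to a group-by over the zone labels. A minimal NumPy sketch of a zonal mean (the workflow below uses vector polygons with the rasterstats library instead):

```python
import numpy as np

elevation = np.array([[10.0, 12.0],
                      [30.0, 34.0]])
zones = np.array([[0, 0],
                  [1, 1]])  # zone label for each cell

labels = zones.ravel()
values = elevation.ravel()

# Sum and count the values per zone label, then divide to get the mean
zonal_mean = np.bincount(labels, weights=values) / np.bincount(labels)
print(zonal_mean)  # zone 0: 11.0, zone 1: 32.0
```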

import osmnx as ox

# Fetch lake "Riuttanen" from OSM
lake = ox.geocode_to_gdf("Riuttanen, Joensuu")
lake = lake.to_crs(crs=data.rio.crs)

# Fetch peak Riuttavaara based on OSM Node ID
peak = ox.geocode_to_gdf("N11034739930", by_osmid=True)
peak = peak.to_crs(crs=data.rio.crs)

# Create a buffer around the peak
peak["geometry"] = peak.buffer(200)

# Plot
fig, ax = plt.subplots()

data["elevation"].plot(ax=ax, cmap="terrain")
lake.plot(ax=ax, facecolor="None")
peak.plot(ax=ax, edgecolor="red", facecolor="None")
ax.set_title("Lake and Peak polygons");

Figure 7.X. Two zones that are used for comparison and calculating zonal statistics.

import pandas as pd
import rasterstats

# Merge zones into a single GeoDataFrame
zones = pd.concat([peak, lake]).reset_index(drop=True)

# Extract the elevation values and the affine transform for rasterstats
elevation_array = data["elevation"].to_numpy()
affine = data.rio.transform()

# Run the zonal statistics
stats = rasterstats.zonal_stats(
    zones,
    elevation_array,
    affine=affine,
    stats=["mean", "min", "max", "std"],  # Statistics to calculate
    nodata=data["elevation"].rio.nodata,  # Handle nodata values
)

stats
[{'min': 1650.0,
  'max': 1800.0,
  'mean': 1727.7313432835822,
  'std': 44.88168484486737},
 {'min': 1408.0,
  'max': 1471.0,
  'mean': 1409.546827794562,
  'std': 3.8168113458094}]
stats = pd.DataFrame(stats)
stats
min max mean std
0 1650.0 1800.0 1727.731343 44.881685
1 1408.0 1471.0 1409.546828 3.816811
zones = zones.join(stats)
zones
geometry bbox_west bbox_south bbox_east bbox_north place_id osm_type osm_id lat lon ... type place_rank importance addresstype name display_name min max mean std
0 POLYGON ((3695077.073 6942292.381, 3695076.11 ... 30.783790 62.534025 30.783890 62.534125 155389184 node 11034739930 62.534075 30.783840 ... peak 18 0.160063 peak Riuttavaara Riuttavaara, Joensuu, Joensuu sub-region, Nort... 1650.0 1800.0 1727.731343 44.881685
1 POLYGON ((3693801.983 6941952.076, 3693802.445... 30.762616 62.521290 30.791158 62.538771 155368181 relation 8774094 62.530008 30.774021 ... water 22 0.106730 water Riuttanen Riuttanen, Joensuu, Joensuu sub-region, North ... 1408.0 1471.0 1409.546828 3.816811

2 rows × 21 columns

# What is the difference in mean elevation between the peak and the lake?
difference = zones.at[0, "mean"] - zones.at[1, "mean"]
print(f"Elevation difference between the peak and lake: {difference:.0f} m.")
Elevation difference between the peak and lake: 318 m.

Incremental operations#

Incremental operations apply iterative or cumulative calculations over space or time. A classic example is a cumulative cost surface, in which the cost of reaching each cell accumulates along the path travelled. In the following, we use this idea to find the least-cost path between two locations across the elevation surface.
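
In its simplest one-dimensional form, an incremental operation is just a running sum of per-cell costs (a deliberately simplified sketch; algorithms such as A*, used below, accumulate cost over many possible directions of movement):

```python
import numpy as np

# Traversal cost of each cell along a single row of a cost raster
costs = np.array([1.0, 2.0, 4.0, 1.0, 3.0])

# Cumulative cost of reaching each cell when moving left to right
cumulative = np.cumsum(costs)
print(cumulative)  # [ 1.  3.  7.  8. 11.]
```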

Least-cost path calculation based on raster surface#

origin = gpd.GeoDataFrame(geometry=[Point(3691000, 6942000)], crs=data.rio.crs)
destination = gpd.GeoDataFrame(geometry=[Point(3698500, 6948000)], crs=data.rio.crs)
fig, ax = plt.subplots()

data["hillshade"].plot(ax=ax, cmap="Greys")
origin.plot(ax=ax, color="red", markersize=58, label="Origin")
destination.plot(ax=ax, color="blue", markersize=58, label="Destination")
ax.legend(loc="upper left")
plt.title("Origin and destination");

Figure 7.X. Origin and destination points that are used to find the least-cost path across the surface.

# Define elevation ranges that act as barriers, i.e. cells the path cannot cross
barriers = list(range(1400, 1580))
barriers += list(range(2000, 2200))

# The a_star_search function expects (y, x) coordinate tuples
origin_coords = (origin.geometry.y.item(), origin.geometry.x.item())
destination_coords = (destination.geometry.y.item(), destination.geometry.x.item())
least_cost_path = xrspatial.a_star_search(
    data["elevation"], origin_coords, destination_coords, barriers
)
# Mark the cells that belong to the path with value 1
route = xr.where(~np.isnan(least_cost_path), 1, least_cost_path)
fig, ax = plt.subplots()

origin.plot(ax=ax, color="red", markersize=58, label="Origin")
destination.plot(ax=ax, color="blue", markersize=58, label="Destination")
route.plot(ax=ax, cmap="gist_heat")
ax.legend(loc="upper left")
plt.title("Least-cost path between origin and destination");

Figure 7.X. The calculated least-cost path from origin to destination based on the A* algorithm.

from shapely import LineString

# Convert (row, col) path to geographic coordinates
transform = data.rio.transform()

# Extract (row, col) indices where path is not NaN
path_indices = np.argwhere(~np.isnan(least_cost_path.values))

coords = [transform * (int(col), int(row)) for row, col in path_indices]

# Create a LineString from the path
line = LineString(coords)

# Convert to a GeoDataFrame
shortest_path = gpd.GeoDataFrame({"geometry": [line]}, crs=data.rio.crs)
fig, ax = plt.subplots()

data["elevation"].plot(ax=ax, cmap="terrain")
shortest_path.plot(ax=ax, color="red", label="Least cost path")
origin.plot(ax=ax, color="red", markersize=58, label="Origin")
destination.plot(ax=ax, color="blue", markersize=58, label="Destination")
ax.legend(loc="upper left")
plt.title("Least cost path between origin and destination");

Figure 7.X. Vectorized least-cost path (LineString) across the terrain.

Flow accumulation#

Flow accumulation, used for example in watershed analysis, is another example of an incremental operation: flow is accumulated downslope cell by cell across the elevation surface. Chapter 12 covers examples of how incremental operations are used to calculate watersheds based on a digital elevation model in New Zealand.