Common raster operations#

Creating a raster mosaic with rioxarray#

Quite often you need to merge multiple raster files together and create a raster mosaic. This can be done easily with the merge_datasets() -function in rioxarray. Here, we will create a mosaic based on DEM files (altogether 4 files) covering Kilimanjaro region in Tanzania. First we will read elevation data from a S3 bucket for Kilimanjaro region in Africa.

import xarray as xr
import os
import rioxarray as rxr
from rioxarray.merge import merge_datasets

# S3 bucket containing the data
bucket = ""

# Generate urls for the elevation files
urls = [
    os.path.join(bucket, "elevation/kilimanjaro/ASTGTMV003_S03E036_dem.tif"),
    os.path.join(bucket, "elevation/kilimanjaro/ASTGTMV003_S03E037_dem.tif"),
    os.path.join(bucket, "elevation/kilimanjaro/ASTGTMV003_S04E036_dem.tif"),
    os.path.join(bucket, "elevation/kilimanjaro/ASTGTMV003_S04E037_dem.tif"),

# Read the files
datasets = [
    xr.open_dataset(url, engine="rasterio").squeeze("band", drop=True) for url in urls

Investigate how our data looks like:

Dimensions:      (x: 3601, y: 3601)
  * x            (x) float64 36.0 36.0 36.0 36.0 36.0 ... 37.0 37.0 37.0 37.0
  * y            (y) float64 -2.0 -2.0 -2.001 -2.001 ... -2.999 -2.999 -3.0 -3.0
    spatial_ref  int64 0
Data variables:
    band_data    (y, x) float32 ...

Visualize the tiles to see how they look like separately:

import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, figsize=(16, 16))

# Plot the tiles to see how they look separately
datasets[0]["band_data"].plot(ax=axes[0][0], vmax=5900, add_colorbar=False)
datasets[1]["band_data"].plot(ax=axes[0][1], vmax=5900, add_colorbar=False)
datasets[2]["band_data"].plot(ax=axes[1][0], vmax=5900, add_colorbar=False)
datasets[3]["band_data"].plot(ax=axes[1][1], vmax=5900, add_colorbar=False)
<matplotlib.collections.QuadMesh at 0x7f9df494a5b0>

As we can see we have multiple separate raster files that are actually located next to each other. Hence, we want to put them together into a single raster file that can by done by creating a raster mosaic. Now we can create a raster mosaic by merging these datasets with merge_datasets() function:

# Create a mosaic out of the tiles
mosaic = merge_datasets(datasets)

Rename the data variable to more intuitive one:

# Add a more intuitive name for the data variable
mosaic = mosaic.rename({"band_data": "elevation"})

Plot the end result where the tiles have been merged:

# Plot the mosaic
mosaic["elevation"].plot(figsize=(12, 12))
<matplotlib.collections.QuadMesh at 0x7f9df47b3a30>

Clipping raster#

Create a GeoDataFrame with bounding box that we can use for clipping the raster:

import geopandas as gpd
from shapely.geometry import box

# Bounding box coordinates
minx = 37.1
miny = -3.3
maxx = 37.6
maxy = -2.85

# Create a GeoDataFrame that will be used to clip the raster
geom = box(minx, miny, maxx, maxy)
clipping_gdf = gpd.GeoDataFrame({"geometry": [geom]}, index=[0], crs="epsg:4326")

# Explore the extent on a map
Make this Notebook Trusted to load map: File -> Trust Notebook

Clip the mosaic with GeoDataFrame:

# Clip the Mosaic with GeoDataFrame and specify CRS
kilimanjaro =,
<matplotlib.collections.QuadMesh at 0x7f9df42ee040>
Dimensions:      (y: 1620, x: 1800)
  * y            (y) float64 -2.85 -2.85 -2.851 -2.851 ... -3.299 -3.299 -3.3
  * x            (x) float64 37.1 37.1 37.1 37.1 37.1 ... 37.6 37.6 37.6 37.6
    spatial_ref  int64 0
Data variables:
    elevation    (y, x) float32 1.581e+03 1.581e+03 ... 1.295e+03 1.294e+03

Now we have a ready raster mosaic and we can save the raster to a GeoTIFF file:

# Save file to GeoTIFF"data/kilimanjaro.tif", compress="LZMA", tiled=True)