# Coordinate Reference Systems#

Coordinate reference systems (CRS) are important because the geometric shapes in a GeoDataFrame are simply a collection of coordinates in an arbitrary space. A CRS tells Python how those coordinates are related to places on the Earth. A map projection (or a projected coordinate system) is a systematic transformation of the latitudes and longitudes into a plain surface where units are quite commonly represented as meters (instead of decimal degrees). This transformation is used to represent the three dimensional earth on a flat, two dimensional map.

As the CRS in different spatial datasets differ fairly often (i.e. one might have coordinates defined in decimal degrees while the other one has them in meters), it is a common procedure to reproject (transform) different layers into a common CRS. It is important that the layers are in the same coordinate reference system when analyzing the spatial relationships between the layers, for example, when making a Point in Polygon -query, or other type of overlay analysis.

Choosing an appropriate projection for your map is not always straightforward because it depends on what you actually want to represent with your map, and what is the spatial scale of your data. In fact, there is not a single “perfect projection” since each one of them has some strengths and weaknesses, and you should choose a projection that fits best for your needs. In fact, the projection you choose might even tell something about you! Figure 6.X. Map projections Source: XKCD https://xkcd.com/977/.

You can find a lot of information about available coordinate reference systems from:

## Managing coordinate reference systems in pyproj and geopandas#

Our main tool for managing coordinate reference systems (CRS) is the PROJ library 1 that is available in python and geopandas through the pyproj Python library 2. Pyproj can be used, for example, to get basic information of the CRS of a GeoDataFrame, and to reproject data from one projection to another.

Let’s demonstrate this using sample data from Europe. We will re-project the layer from the original CRS WGS84 (lat, lon coordinates) into a Lambert Azimuthal Equal Area projection which is the recommended projection for Europe 3.

In Shapefiles, information about the coordinate reference system is stored in the .prj -file. If this file is missing, you might be in trouble!. When reading the data into GeoDataFrame with Geopandas crs information is automatically stored into the .crs attribute of the GeoDataFrame.

Let’s start by reading the data from the Europe_borders.shp file and checking the crs:

# Import necessary packages
import geopandas as gpd

# Read the file
fp = "L2_data/Europe_borders.shp"

# Check the coordinate reference system
data.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich


What we see here is in fact a CRS object from the pyproj module.

The EPSG number (named after the European Petroleum Survey Group) is a code that tells about the coordinate system of the dataset. “EPSG Geodetic Parameter Dataset is a collection of definitions of coordinate reference systems and coordinate transformations which may be global, regional, national or local in application”.

The EPSG code of our geodataframe is4326, which refers to the WGS84 coordinate system (we can also figure this out by looking at the coordinates values which are longitude and latitudes decimal degrees).

Let’s continue by checking the values in our geometry -column to verify that the CRS of our GeoDataFrame seems correct:

data["geometry"].head()

0    POLYGON ((8.45778 54.56236, 8.44953 54.56269, ...
1    POLYGON ((8.71992 47.69664, 8.72092 47.69530, ...
2    POLYGON ((6.73317 53.57409, 6.73017 53.57542, ...
3    POLYGON ((6.85822 53.59411, 6.85592 53.59550, ...
4    POLYGON ((6.89894 53.62561, 6.88439 53.62814, ...
Name: geometry, dtype: geometry


As we can see, the coordinate values of the Polygons indeed look like latitude and longitude values, so everything seems to be in order.

WGS84 projection is not really a good one for representing European borders on a map (areas get distorted), so let’s convert those geometries into Lambert Azimuthal Equal Area projection (EPSG: 3035) which is the recommended projection by European Comission.

Changing the projection is simple to do in Geopandas with .to_crs() -function which is a built-in function of the GeoDataFrame. The function has two alternative parameters 1) crs and 2) epgs that can be used to make the coordinate transformation and re-project the data into the CRS that you want to use.

• Let’s re-project our data into EPSG 3035 using epsg -parameter:

# Let's make a backup copy of our data
data_wgs84 = data.copy()

# Reproject the data
data = data.to_crs(epsg=3035)

# Check the new geometry values

0    POLYGON ((4221214.558 3496203.404, 4220681.651...
1    POLYGON ((4224860.478 2732279.320, 4224932.819...
2    POLYGON ((4104652.176 3390034.953, 4104460.401...
3    POLYGON ((4113025.664 3391895.756, 4112879.943...
4    POLYGON ((4115871.228 3395282.099, 4114921.348...
Name: geometry, dtype: geometry


And here we go, the coordinate values in the geometries have changed! Now we have successfully changed the projection of our layer into a new one, i.e. to ETRS-LAEA projection.

To really understand what is going on, it is good to explore our data visually. Let’s compare the datasets by making maps out of them.

data_wgs84.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

import matplotlib.pyplot as plt

# Make subplots that are next to each other
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(12, 12))

# Plot the data in WGS84 CRS
data_wgs84.plot(ax=ax1, facecolor="gray")

ax1.set_title("WGS84")

# Plot the one with ETRS-LAEA projection
data.plot(ax=ax2, facecolor="blue")

ax2.set_title("ETRS Lambert Azimuthal Equal Area projection")

# Set aspect ratio as 1
ax1.set_aspect(aspect=1)
ax2.set_aspect(aspect=1)

# Remove empty white space around the plot
plt.tight_layout() Figure 6.X. Map of Europe plotted with two different coordinate reference systems.

Indeed, the maps look quite different, and the re-projected one looks much better in Europe as the areas especially in the north are more realistic and not so stretched as in WGS84.

Finally, let’s save our projected layer into a Shapefile so that we can use it later. Note, even if the crs information is stored in the .prj file, it might be a good idea also to include crs info in the filename:

# Ouput filepath
outfp = "L2_data/Europe_borders_epsg3035.shp"

# Save to disk
data.to_file(outfp)


## Dealing with different CRS formats#

There are various ways to present Coordinate Reference System information, such as PROJ strings, EPSG codes, Well-Known-Text (WKT), JSON. It is likely that you will encounter some of these when working with spatial data obtained from different sources. Being able to convert the CRS information from one format to another is needed every now and then, hence, it is useful to know a few tricks how to do this.

Luckily, dealing with CRS information is easy in Python using the pyproj library. In fact, pyproj is a Python wrapper around a software called PROJ (maintained by OSGeo community), which is widely used tool for conducting coordinate transformations in various GIS softwares. Pyproj is also used under the hood in Geopandas, and it handles all the CRS definitions and coordinate transformations (reprojecting from CRS to another as we did earlier).

### Overview#

The following code cell prints out a summary summary of different ways of representing crs information using pyproj CRS. Here, we use the crs of the original European borders layer as a starting point:

### Import CRS class from pyproj
from pyproj import CRS

# PROJ dictionary:
crs_dict = data_wgs84.crs

# pyproj CRS object:
crs_object = CRS(data_wgs84.crs)

# EPSG code (here, the input crs information is a bit vague so we need to lower the confidence threshold)
crs_epsg = CRS(data_wgs84.crs).to_epsg(min_confidence=25)

# PROJ string
crs_proj4 = CRS(data_wgs84.crs).to_proj4()

# Well-Known Text (WKT)
crs_wkt = CRS(data_wgs84.crs).to_wkt()

<ipython-input-10-b642b7efee1d>:11: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. See: https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems
crs_proj4 = CRS(data_wgs84.crs).to_proj4()

print("PROJ dictionary:\n", crs_dict)
print("\nCRS object:\n", crs_object)
print("\nEPSG code: \n", crs_epsg)
print("\nPROJ string: \n", crs_proj4)
print("\nWell-Known Text (WKT):\n", crs_wkt)

PROJ dictionary:
epsg:4326

CRS object:
GEOGCRS["WGS 84",DATUM["World Geodetic System 1984",ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],CS[ellipsoidal,2],AXIS["geodetic latitude (Lat)",north,ORDER,ANGLEUNIT["degree",0.0174532925199433]],AXIS["geodetic longitude (Lon)",east,ORDER,ANGLEUNIT["degree",0.0174532925199433]],USAGE[SCOPE["unknown"],AREA["World"],BBOX[-90,-180,90,180]],ID["EPSG",4326]]

EPSG code:
4326

PROJ string:
+proj=longlat +datum=WGS84 +no_defs +type=crs

Well-Known Text (WKT):
GEOGCRS["WGS 84",DATUM["World Geodetic System 1984",ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],CS[ellipsoidal,2],AXIS["geodetic latitude (Lat)",north,ORDER,ANGLEUNIT["degree",0.0174532925199433]],AXIS["geodetic longitude (Lon)",east,ORDER,ANGLEUNIT["degree",0.0174532925199433]],USAGE[SCOPE["unknown"],AREA["World"],BBOX[-90,-180,90,180]],ID["EPSG",4326]]


### Pyproj CRS object#

Next, let’s see how it is possible to easily extract useful information from CRS, and transform CRS information from format to another. pyproj -library has a class called CRS that provides many useful functionalities for dealing with CRS information.

# Let's see the current CRS of our data
print(data.crs)

epsg:3035


Printing the crs using the print() statement gives us the EPSG code.

However, let’s see how the same information looks like in other formats such as WKT or Proj4 text. For this we need to use the CRS class.

Note

The following examples have been tested to work with pyproj version 2.6.1 and geopandas version 0.8.1. You can check package versions by running the conda list -command.

# Initialize the CRS class for epsg code 3035:
crs_object = CRS.from_epsg(3035)
crs_object

<Projected CRS: EPSG:3035>
Name: ETRS89-extended / LAEA Europe
Axis Info [cartesian]:
- Y[north]: Northing (metre)
- X[east]: Easting (metre)
Area of Use:
- name: Europe - LCC & LAEA
- bounds: (-35.58, 24.6, 44.83, 84.17)
Coordinate Operation:
- name: Europe Equal Area 2001
- method: Lambert Azimuthal Equal Area
Datum: European Terrestrial Reference System 1989
- Ellipsoid: GRS 1980
- Prime Meridian: Greenwich


As we can see, the CRS object contains a of information about the coordinate reference system such as the Name of the CRS (ETRS89/LAEA Europe), the area where the CRS is in use (Europe with bounds (-16.1, 32.88, 40.18, 84.17)), and the Datum (European Terrestrial Reference System 1989).

We can also easily parse this information individually as follows:

# Name
print("Name:", crs_object.name)

# Coordinate system
print("Coordinate system:", crs_object.coordinate_system)

# Bounds of the area where CRS is used
print("Bounds:", crs_object.area_of_use.bounds)

Name: ETRS89-extended / LAEA Europe
Coordinate system: cartesian
Bounds: (-35.58, 24.6, 44.83, 84.17)


You can explore all the possible information that can be extracted from the CRS by typing crs_object. and pressing Tabulator.

Let’s see how we can convert the crs information from one format to another. Quite often it is useful to know the EPSG code of the CRS. Next, we will conduct a few transformations to demonstrate the capabilities of the CRS class.

# Retrive CRS information in WKT format
crs_wkt = crs_object.to_wkt()
print(crs_wkt)

PROJCRS["ETRS89-extended / LAEA Europe",BASEGEOGCRS["ETRS89",DATUM["European Terrestrial Reference System 1989",ELLIPSOID["GRS 1980",6378137,298.257222101,LENGTHUNIT["metre",1]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],ID["EPSG",4258]],CONVERSION["Europe Equal Area 2001",METHOD["Lambert Azimuthal Equal Area",ID["EPSG",9820]],PARAMETER["Latitude of natural origin",52,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8801]],PARAMETER["Longitude of natural origin",10,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8802]],PARAMETER["False easting",4321000,LENGTHUNIT["metre",1],ID["EPSG",8806]],PARAMETER["False northing",3210000,LENGTHUNIT["metre",1],ID["EPSG",8807]]],CS[Cartesian,2],AXIS["northing (Y)",north,ORDER,LENGTHUNIT["metre",1]],AXIS["easting (X)",east,ORDER,LENGTHUNIT["metre",1]],USAGE[SCOPE["unknown"],AREA["Europe - LCC & LAEA"],BBOX[24.6,-35.58,84.17,44.83]],ID["EPSG",3035]]


As we can see, the WKT format contains a lot of information. Typically, e.g. the .prj file of a Shapefile contains the information in this format. Let’s see how it is possible to extract EPSG code from this. For doing it, we need to re-initialize the CRS object, this time from the WKT text presentation.

# Retrieve EPSG code from WKT text
epsg = CRS(crs_wkt).to_epsg()
print(epsg)

3035


Not able to recognize epsg?

Sometimes to_epsg() isn’t able to recognize the EPSG code from the WKT representation. This can happen if the WKT information is missing some details. Luckily, we can easily adjust the minimum level of confidence for matching the CRS info and the EPSG code. We can do this by adjusting a parameter min_confidence when calling the function. By default, the confidence level is 70 %, but it is also possible to set a lower confidence threshold.

The coordinate information of our input shapefile is incomplete, and does not yield an epsg value with default setting: However, CRS is able to determine the EPSG value with a lower confidence treshold:

# Let's try to extract the EPSG code from the crs of our original data:
CRS(data.crs).to_epsg()
>>> None

# Let's try it again with a lower confidence requirement (25 %)
CRS(data.crs).to_epsg(min_confidence=25)
>>> 3035


However, be cautious when using this, as guessing the EPSG from “exotic” coordinate reference systems might also provide false results.

Let’s now save our data to disk using the WKT format as the crs of our GeoDataFrame. WKT is a preferred output format when storing crs information as text.

# Re-define the CRS of the input GeoDataFrame
data.crs = CRS.from_epsg(3035).to_wkt()

print(data.crs)

PROJCRS["ETRS89-extended / LAEA Europe",BASEGEOGCRS["ETRS89",DATUM["European Terrestrial Reference System 1989",ELLIPSOID["GRS 1980",6378137,298.257222101,LENGTHUNIT["metre",1]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],ID["EPSG",4258]],CONVERSION["Europe Equal Area 2001",METHOD["Lambert Azimuthal Equal Area",ID["EPSG",9820]],PARAMETER["Latitude of natural origin",52,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8801]],PARAMETER["Longitude of natural origin",10,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8802]],PARAMETER["False easting",4321000,LENGTHUNIT["metre",1],ID["EPSG",8806]],PARAMETER["False northing",3210000,LENGTHUNIT["metre",1],ID["EPSG",8807]]],CS[Cartesian,2],AXIS["northing (Y)",north,ORDER,LENGTHUNIT["metre",1]],AXIS["easting (X)",east,ORDER,LENGTHUNIT["metre",1]],USAGE[SCOPE["unknown"],AREA["Europe - LCC & LAEA"],BBOX[24.6,-35.58,84.17,44.83]],ID["EPSG",3035]]

# Ouput filepath
outfp = "L2_data/Europe_borders_epsg3035.shp"

# Save to disk
# data.to_file(outfp)


That’s it.

HINT: A module called PyCRS can also be useful library as it contains information and supports many different coordinate reference definitions, such as OGC WKT (v1), ESRI WKT, Proj4, and any EPSG, ESRI, or SR-ORG code available from spatialreference.org.

## Global map projections#

Finally, let’s play around with global map projections :) L2_data folder conaints a layer ne_110m_admin_0_countries.shp that represents the country borders of the world. The data was fownloaded from https://www.naturalearthdata.com/.

### Check your understanding#

Read in a global dataset and plot three maps with different projections! See hints and projection definitions from:

• http://geopandas.org/projections.html

• https://pyproj4.github.io/pyproj/dev/api/crs.html

• https://spatialreference.org/

When plotting the maps, think about the advantages and disadvantages of different world map projections.

# Read in data

# Check input crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

# Set fig size
plt.rcParams["figure.figsize"] = [12, 6]

# Plot in original crs
plt.title("WGS84")

Text(0.5, 1.0, 'WGS84') Figure 6.X. Global map plotted in WGS 84.

# Define projection as web mercator, 3785
web_mercator = CRS.from_epsg(3785)

# Re-project and plot

# Remove x and y axis
plt.axis("off")
plt.title("Web mercator")

Text(0.5, 1.0, 'Web mercator') Figure 6.X. Global map plotted in Web Mercator.

# Define projection Eckert IV from https://spatialreference.org/ref/esri/54012/
eckert_IV = CRS.from_proj4(
"+proj=eck4 +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs"
)

# Re-project and plot

# Remove x and y axis
plt.axis("off")
plt.title("Eckert IV")

Text(0.5, 1.0, 'Eckert IV') Figure 6.X. Global map plotted in Eckert IV.

# Define an orthographic projection, centered in Finland! from: http://www.statsmapsnpix.com/2019/09/globe-projections-and-insets-in-qgis.html
ortho = CRS.from_proj4(
"+proj=ortho +lat_0=60.00 +lon_0=23.0000 +x_0=0 +y_0=0 +a=6370997 +b=6370997 +units=m +no_defs"
)

# Re-project and plot

# Remove x and y axis
plt.axis("off")
plt.title("Orthographic")

Text(0.5, 1.0, 'Orthographic') Figure 6.X. Global map plotted in an orthographic projection.

## Summary#

That’s it! In this section we learned how to:

1. reproject (transform) the geometries from crs to another using the to_crs() -function in GeoPandas

2. Define the coordinate reference system in different formats using pyproj CRS

## Footnotes#

1

https://proj.org/

2

https://pyproj4.github.io/pyproj/stable/

3