{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Data classification" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Data classification is a common task in geospatial data analysis: it assigns values to distinct classes.\n", "Classifying original values into categories can simplify the data for further analysis or for communicating results. When visualizing geographic information, the choice of classification is central to representing the distribution of the data correctly.\n", "\n", "Here, we will get familiar with classification schemes from the [PySAL](https://pysal.org/) [^pysal] [`mapclassify` library](https://pysal.org/mapclassify/) [^mapclassify], which is intended for use when visualizing thematic maps. Geographic data visualization is covered in more detail in Chapter 8. We will also learn how to classify data values based on pre-defined threshold values and conditional statements directly in `geopandas`.\n", "\n", "Our sample data is an extract from the Helsinki Region Travel Time Matrix ({cite}`Tenkanen2020`) that represents travel times to the central railway station across 250 m x 250 m statistical grid squares covering the Helsinki region. Let's read in the data and check the first rows:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "| | car_m_d | car_m_t | car_r_d | car_r_t | from_id | pt_m_d | pt_m_t | pt_m_tt | pt_r_d | pt_r_t | pt_r_tt | to_id | walk_d | walk_t | geometry |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 32297 | 43 | 32260 | 48 | 5785640 | 32616 | 116 | 147 | 32616 | 108 | 139 | 5975375 | 32164 | 459 | POLYGON ((382000 6697750, 381750 6697750, 3817... |\n",
"| 1 | 32508 | 43 | 32471 | 49 | 5785641 | 32822 | 119 | 145 | 32822 | 111 | 133 | 5975375 | 29547 | 422 | POLYGON ((382250 6697750, 382000 6697750, 3820... |\n",
"| 2 | 30133 | 50 | 31872 | 56 | 5785642 | 32940 | 121 | 146 | 32940 | 113 | 133 | 5975375 | 29626 | 423 | POLYGON ((382500 6697750, 382250 6697750, 3822... |\n",
"| 3 | 32690 | 54 | 34429 | 60 | 5785643 | 33233 | 125 | 150 | 33233 | 117 | 144 | 5975375 | 29919 | 427 | POLYGON ((382750 6697750, 382500 6697750, 3825... |\n",
"| 4 | 31872 | 42 | 31834 | 48 | 5787544 | 32127 | 109 | 126 | 32127 | 101 | 121 | 5975375 | 31674 | 452 | POLYGON ((381250 6697500, 381000 6697500, 3810... |\n", "