LAB 02: CLASSIFY! Redefining “Urban” Watersheds > TUTORIAL
Mapping is more than a mere display of spatial information; it is a critical, analytical, and interpretive act. This module builds on the previous one and introduces basic geoprocessing operations and design
decisions you’ll encounter in working with two foundational types of
geospatial data: raster images and vector geometries.
Premise and Objectives
Our second exercise builds on and expands the tools and concepts introduced in the introductory module. After completing this exercise, you will have:
- Learned data selection methods using tabular and spatial queries
- Explored basic techniques for labeling features
- Learned the basics of adding, interpreting, and visualizing raster data in QGIS
- Learned the basics of merging and dissolving vector features in QGIS
You will then post your work-in-progress to Are.na.
Important Note: Completing {T01} is a prerequisite for following this exercise. It is assumed that you have mastered the workflow basics from the previous week.
Prep
If you’re working in the SSDLAB virtual environment, copy the data repository for this exercise from the shared drive (G:) to your working location (U:). If not, download the data using this link. In this exercise, we will use the following datasets:
- Populated Places (compiled by Natural Earth)
- Lakes and Reservoirs (derived from World Data Bank 2)
- The Great Lakes basin (derived from USGS Great Lakes Restoration Initiative)
- The Global Human Settlement Layer (GHSL) built-up surface (GHS-BUILT-S) and settlement models (GHS-SMOD)
Set-Up
Launch QGIS. Using the Browser Panel, add all LAB 02 vector data to your scene and save your map project.
Double-click the CRS code in the status bar to open the Project Properties – CRS dialogue box. Set the Project CRS to NAD83 / Great Lakes and St Lawrence Albers (EPSG:3175).
Click OK and close the Project Properties window.
Data Querying 1: Select by Attributes
In the previous exercise, we familiarized ourselves with the interactive selection methods using the Selection Tool from the Selection Toolbar or clicking on row numbers in the layer’s attribute table. Another selection technique is to directly query the Attribute Table based on the fields and values within the dataset — a method usually referred to as Select by Attributes.
Open the attribute table of the Populated Places (popPlaces) point shapefile and observe the information stored in various fields. Note that some fields contain nominal information – e.g., name, while others contain ranked (ordinal) information – e.g., population class (POPCLASS), which ranks cities from 1 (least populous) to 5 (most populous). By querying the Populated Places dataset, we can answer questions such as:
- How many cities in the dataset have the status of national or provincial capital? Or,
- What cities in the dataset have a population rank of 3 or higher?
To answer (and visualize) the latter, we will first select a subset of features from the Populated Places layer with the population class 3, 4, or 5. Then we will export them as a separate layer.
There are multiple routes to select features within a dataset: either open its Attribute Table and click the Select features using an expression button, or select the dataset in the Layers panel and navigate to Edit > Select > Select features by Expression... (The same icon should show up as a new shortcut button in your toolbar once you have performed this action once.) Either route will open the Select by Expression dialogue box.
We want to select just those cities with the population class 3 or higher. To do this, follow these steps:
- Expand Fields and Values
- Double-click on the POPCLASS field and it will appear in the expression box on the left (the Values sub-window will appear on the right)
- Click All Unique (this will load all five unique field values)
- Expand Operators
- Double-click the >= (greater than or equal to) operator
- Type 3 (or click POPCLASS and All Unique again, and double-click 3 on the right)
- Your query should look like this: "POPCLASS" >= 3
- Click Select features
Open the layer’s attribute table again. The header will tell you how many features were selected: we have selected 345 populated places.
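To make the logic of the attribute query concrete, here is a minimal plain-Python sketch of what an expression such as "POPCLASS" >= 3 does to a table. The rows and place names below are hypothetical stand-ins for the attribute table, not values from the popPlaces dataset, and the helper name is an assumption made for illustration.

```python
# Hypothetical rows standing in for the Populated Places attribute table.
places = [
    {"NAME": "Place A", "POPCLASS": 5},
    {"NAME": "Place B", "POPCLASS": 2},
    {"NAME": "Place C", "POPCLASS": 3},
    {"NAME": "Place D", "POPCLASS": 1},
    {"NAME": "Place E", "POPCLASS": 4},
]


def select_by_attribute(rows, field, minimum):
    """Return the rows whose `field` value is greater than or equal to `minimum`."""
    return [row for row in rows if row[field] >= minimum]


# Equivalent in spirit to the expression "POPCLASS" >= 3
selected = select_by_attribute(places, "POPCLASS", 3)
selected_names = [row["NAME"] for row in selected]
print(len(selected), "features selected:", selected_names)
```

The selection leaves the source table unchanged and only returns the matching subset, much like a selection in QGIS highlights features without editing the layer.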
We will now save those 345 features as a separate shapefile:
- Right-click on Populated Places in the Layers Panel and choose Export > Save Selected Features As...
- The Save Vector Layer as… dialogue box will open
- Set “Format” to ESRI Shapefile
- Navigate to your working folder and save the shapefile as popPlaces_NA_popClass_3-5.shp
- Toggle off the original Populated Places layer
On your own: Take a moment to inspect the values in the CAPITAL field, and then create a separate shapefile containing only populated places that are either national (2) or state/provincial (1) capitals.
Data Querying 2: Select by Location
The Select By Location tool allows you to select features based on their location relative to features in another layer. In this instance, we want to extract populated places that are within the Great Lakes Watershed.
On the main menu, navigate to Vector > Research Tools > Select By Location.
The Select by location dialogue box will open. Set the following parameters:
- “Select features from” → popPlaces_NA_popClass_3-5
- “Where the features…” → are within
- “By comparing the features from” → greatlakes_basin
- Click Run
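The "are within" predicate used above can be sketched in plain Python as a point-in-polygon test. This is only an illustration using the classic ray-casting approach, not necessarily how QGIS evaluates the predicate internally; the rectangular basin outline and the place coordinates are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside `polygon`, a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical basin outline and populated places (name -> coordinates).
basin = [(0, 0), (10, 0), (10, 6), (0, 6)]
places = {"Place A": (2, 3), "Place B": (12, 3), "Place C": (9, 5)}

within = [name for name, (x, y) in places.items() if point_in_polygon(x, y, basin)]
print(within)
```

Each point is tested against the outline independently, which is why the tool needs both an input layer (the points) and a comparison layer (the polygon).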
Export the selected features as popPlaces_GL_popClass_3-5.shp. If you wish, edit the basic symbology settings of your data layers. Save your map project.
Labels
Labels are textual information displayed on maps, adding details you could not necessarily represent using symbols or geometry. In a GIS software environment, a label refers to a piece of text on the map that is dynamically placed and whose text string is derived from one or more feature attributes. Technically, any information stored in the Attribute Table of a vector layer can be textually displayed as a label on a map.
To label cities on our map, follow these steps:
- Right-click on the Populated Places (popPlaces_GL_popClass_3-5) layer and choose Properties...
- In the Labels tab (fourth from the top), choose Single Labels
- Set “Value” to NAME
- Choose desired label font, style, size, and color (refer to the “Text Sample” appearance)
- Click OK
This will display the names of all populated places in our feature class:
To label only a subset of the populated places, we can create a filter querying the Populated Places feature class within the Labels tab using expressions (the same method we used to Select by Attributes):
- In the Labels tab, choose Rule-based Labeling
- Click Add rule (green “+” button) and a new window will pop up
- Open Expression String Builder by clicking the ε button by the “Filter” field
- Build a query, e.g. "POPCLASS" = 4 (use straight double quotes around the field name)
- Scroll down and set desired label font, style, size, and color
- Click OK and save your map project
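Conceptually, each labeling rule pairs a filter with a text style, and the label text itself is read from an attribute. The sketch below models that idea in plain Python; the features, the two rules, and the font sizes are hypothetical and are not taken from the tutorial data.

```python
# Hypothetical features standing in for the filtered Populated Places layer.
features = [
    {"NAME": "Place A", "POPCLASS": 5},
    {"NAME": "Place B", "POPCLASS": 4},
    {"NAME": "Place C", "POPCLASS": 3},
    {"NAME": "Place D", "POPCLASS": 4},
]

# Each rule combines a filter (like "POPCLASS" = 4) with a label style.
rules = [
    {"filter": lambda f: f["POPCLASS"] == 5, "font_size": 12},
    {"filter": lambda f: f["POPCLASS"] == 4, "font_size": 9},
]


def build_labels(rows, label_rules, label_field="NAME"):
    """Return (text, font_size) pairs for every feature and rule that match."""
    labels = []
    for row in rows:
        for rule in label_rules:
            # A feature matching several rules would receive one label per rule.
            if rule["filter"](row):
                labels.append((row[label_field], rule["font_size"]))
    return labels


labels = build_labels(features, rules)
print(labels)
```

Features that match no rule, such as the class 3 place here, simply receive no label.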
Adding Raster Data
Navigate to the ..Data/Raster/ folder in the Browser panel and drag GHS_BUILT_S_E2020_GLOBE_R2022A_54009_1000_V1_0.tif (GHS Built-up Surface Grid) and GHS_SMOD_E2020_GLOBE_R2022A_54009_1000_V1_0.tif (GHS Settlement Model Grid) to the scene. Toggle off the visibility of all other layers.
Raster data consists of a matrix of pixels (or cells) organized into a grid. Each pixel contains a value representing information:
- In the case of the Built-up Surface Grid, the value represents the number of built-up square meters in a 1 x 1 km grid cell
- In the case of the Settlement Model Grid, the value represents a specific settlement typology, e.g. a pixel holding the value of 30 denotes an “Urban Centre grid cell”, whereas the value of 11 stands for “Very low density rural grid cell.”
Examples of other types of raster data include multi-spectral satellite imagery, elevation (DEM), population (cell values corresponding to the number of people living in each grid cell), various land use/land cover datasets, etc.
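As a small worked example of reading raster values, the sketch below treats a raster as a list of rows and converts built-up square meters per 1 x 1 km cell into a built-up share between 0 and 1. The grid values are hypothetical and the function name is an assumption made for illustration.

```python
# A 1 x 1 km cell covers 1,000,000 square meters.
CELL_AREA_M2 = 1000 * 1000

# Hypothetical 2 x 3 excerpt of a built-up surface grid (square meters per cell).
built_up = [
    [0, 12500, 250000],
    [4000, 500000, 90000],
]


def built_up_share(grid):
    """Convert built-up square meters per cell into a share of the cell area."""
    return [[value / CELL_AREA_M2 for value in row] for row in grid]


shares = built_up_share(built_up)
for row in shares:
    print(row)
```

A cell value of 250000 therefore means that a quarter of that square kilometer is built up.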
Raster Clipping
If you zoom out from the Great Lakes watershed, you will notice that both rasters added to our scene are global datasets, meaning you can work with them on a planetary scale. Sometimes, however, you will want to perform spatial analysis on a particular region or clip and display just a portion of your raster. Clip raster by extent and clip raster by Mask Layer are tools that allow you to extract a part of a raster dataset based on a template extent. In the spirit of situating ourselves in the Great Lakes watershed, we will clip both Global Human Settlement layers by the Great Lakes basin polygon:
- In the main menu bar, click through Raster > Extraction > Clip Raster by Mask Layer…; a dialogue box will pop up
- Set the Input layer to GHS_SMOD_E2020….tif
- Set Mask layer to greatlakes_basin
- Make sure the Match the extent of the clipped raster to the extent of the mask layer… option is checked
- Click Run and close the dialogue box
Repeat the same raster extraction process with the Built-up layer as input. Save the clipped raster(s).
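The idea behind clipping by a mask can be sketched with plain Python lists. This simplified illustration assumes the mask polygon has already been turned into a grid of True/False cells aligned with the raster (the real tool works from the polygon itself); the grids and the NODATA value are hypothetical.

```python
NODATA = -1  # hypothetical placeholder for cells outside the mask


def clip_by_mask(grid, mask):
    """Set cells outside the mask to NODATA and crop to the mask's bounding box.

    Assumes `mask` has the same shape as `grid` and contains at least one True cell.
    """
    rows = [r for r, mask_row in enumerate(mask) if any(mask_row)]
    cols = [c for c in range(len(mask[0])) if any(mask_row[c] for mask_row in mask)]
    clipped = []
    for r in range(rows[0], rows[-1] + 1):
        clipped.append([
            grid[r][c] if mask[r][c] else NODATA
            for c in range(cols[0], cols[-1] + 1)
        ])
    return clipped


# Hypothetical settlement-model cells and a mask covering three of them.
grid = [
    [11, 11, 12, 13],
    [11, 21, 30, 13],
    [10, 22, 23, 12],
    [10, 10, 11, 11],
]
mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, False, False],
    [False, False, False, False],
]

clipped = clip_by_mask(grid, mask)
print(clipped)
```

Cropping to the bounding box corresponds to matching the output extent to the mask layer, while cells inside that box but outside the mask become no-data.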
Raster Symbology
Next, we will change the appearance of our Global Human Settlements grids.
Open the Properties menu for the Built-up (clipped) layer and navigate to the Symbology tab. You’ll notice that this looks different from the style menu we have worked with for our vector layers. Instead of Symbol type, we have a “Render type” field and many options for how to color the bands in our dataset. Since we are working with a single-band grayscale image, our symbology options are relatively limited. Still, we can experiment with color gradients. (Raster datasets can be made up of multiple bands, such as multispectral satellite imagery. We would better showcase the full power of raster symbology if we were working with such images, but that’s outside the scope of this tutorial.)
If you look at the expandable section labeled Min / Max Value Settings, you will notice that the minimum and maximum values for color gradients do not need to correspond to the absolute value range of the raster (in this case, the values range between 0 and 511,437); they can also be calculated based on a Cumulative count cut or Mean +/– standard deviation * X, both of which are used to get rid of outliers. With the Cumulative count cut, QGIS only takes into account the values between 2% and 98% by default. Select this option and click Apply to see how the image changes.
Now change the “Render type” to Singleband pseudocolor to get something more similar to the symbology we would use for a vector file:
- Set the Mode from Continuous to Equal Interval
- Change the “Color gradient” to a ramp with at least three distinct colors
- Click Classify to load the values and then hit Apply to see it on the map
- Save your map project
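The Cumulative count cut and the Equal Interval mode described above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: the percentile lookup uses a basic nearest-rank rule, which may differ from the estimate QGIS computes, and the sample values are hypothetical.

```python
def cumulative_count_cut(values, lower=0.02, upper=0.98):
    """Return (low, high) display limits that ignore the extreme tails of the data."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[round(lower * last)], ordered[round(upper * last)]


def equal_interval_breaks(minimum, maximum, classes):
    """Return the upper bound of each of `classes` equally wide intervals."""
    width = (maximum - minimum) / classes
    return [minimum + width * i for i in range(1, classes + 1)]


# Hypothetical cell values: 0 to 99 plus a single extreme outlier.
values = list(range(0, 100)) + [511437]

low, high = cumulative_count_cut(values)
breaks = equal_interval_breaks(low, high, 4)
print("display range:", low, "to", high)
print("class breaks:", breaks)
```

Without the cut, the single outlier would stretch the color ramp so far that nearly every other cell would fall into the lowest class.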
On your own: experiment with color gradients, switching between the Continuous and Equal Interval modes.
Finally, we will change the appearance of our Settlement Models grid. As mentioned previously, the numerical values of this raster represent specific settlement typologies. Here is the full breakdown (from the GHSL documentation):
- Class 30: “Urban Centre grid cell”, if the cell belongs to an Urban Centre spatial entity;
- Class 23: “Dense Urban Cluster grid cell”, if the cell belongs to a Dense Urban Cluster spatial entity;
- Class 22: “Semi-dense Urban Cluster grid cell”, if the cell belongs to a Semi-dense Urban Cluster spatial entity;
- Class 21: “Suburban or peri-urban grid cell”, if the cell belongs to an Urban Cluster cells at first hierarchical level but is not part of a Dense or Semi-dense Urban Cluster;
- Class 13: “Rural cluster grid cell”, if the cell belongs to a Rural Cluster spatial entity;
- Class 12: “Low Density Rural grid cell”, if the cell is classified as Rural grid cells at first hierarchical level, has more than 50 inhabitants and is not part of a Rural Cluster;
- Class 11: “Very low density rural grid cell”, if the cell is classified as Rural grid cells at first hierarchical level, has less than 50 inhabitants and is not part of a Rural Cluster;
- Class 10: “Water grid cell”, if the cell has 0.5 share covered by permanent surface water and is not populated nor built.
Rasters like this one typically don't offer as many possibilities to experiment with symbology, given their limited number of values/classes and the fact they usually come with documentation that specifies not just nomenclature but also their color definitions:
Although you should always familiarize yourself with such conventions, we encourage you to challenge that approach and suggest different modes of representing the above categories. For example, you could split them into two (built-up/non-built-up) classes by assigning one of just two colors to each class (e.g., 10-13: grey, 21-30: red) or apply an entirely different set of colors (or gradients!) than the one from the official documentation.
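As one possible reading of the two-class split suggested above, the sketch below maps each settlement class code to either grey (10-13) or red (21-30). The class names are shortened from the breakdown listed earlier, and the function and variable names are assumptions made for illustration.

```python
# Settlement class codes and shortened names from the breakdown above.
SMOD_CLASSES = {
    30: "Urban Centre",
    23: "Dense Urban Cluster",
    22: "Semi-dense Urban Cluster",
    21: "Suburban or peri-urban",
    13: "Rural cluster",
    12: "Low Density Rural",
    11: "Very low density rural",
    10: "Water",
}


def two_class_color(value):
    """Return 'red' for classes 21-30 and 'grey' for classes 10-13."""
    if value not in SMOD_CLASSES:
        raise ValueError(f"Unknown settlement class: {value}")
    return "red" if value >= 21 else "grey"


palette = {value: two_class_color(value) for value in sorted(SMOD_CLASSES)}
for value, color in palette.items():
    print(value, SMOD_CLASSES[value], "->", color)
```

Because the raster holds only eight distinct values, a lookup table like this is all a paletted renderer needs: one color per unique value.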
To do so, open the layer’s Symbology tab, set the Render type to Paletted/Unique values, and experiment with the settings.
Wrapping Up and Exporting
When satisfied with your symbology choices, make the vector layers visible again and produce a final composition. (Choose just one of the two raster layers for your final map.)
Important: If exporting a map for further editing in Illustrator, remember to export your vector layers separately from your raster layer(s): rasters as PNG or JPG and vectors as SVG.
>> TUTORIAL APPENDIX: REDEFINING WATERSHEDS <<
Dissolve Tool
The polygon shapefile representing the Great Lakes basin used in this exercise is technically a derivative of the greatlakes_subbasins.shp shapefile from Lab 01. The tool used to remove the common boundaries of the adjacent subbasins is called Dissolve. To access the tool (and thus create a Great Lakes Basin shapefile yourself), navigate to Vector > Geoprocessing Tools > Dissolve… in the main menu. Set the input layer to greatlakes_subbasins.shp and remember to save the dissolved layer to your GIS-Outputs directory (otherwise, the tool will generate a temporary layer on the scratch disk that you won’t be able to use outside this QGIS file).
Merge Tool
Sometimes you must combine multiple vector layers (of the same geometry type) into a new, single feature class. For example, to complete this week’s assignment, you may decide to merge the Great Lakes Watershed with (the subbasins of) the Mississippi River Watershed and create a unique 3/4 Coast Watershed encompassing the two interconnected hydrological systems with the city of Chicago in the middle. To do so, you will use the Merge tool. To access the tool, navigate to
Vector > Data Management Tools > Merge Vector Layers… in the main menu. Set the Input layers by clicking on the … button and save the merged layer to your GIS-Outputs directory.
Use the newly created shapefile as the basis for the Lab 02: Urban Watershed assignment.
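To illustrate how merging differs from dissolving, here is a minimal plain-Python sketch. Merging simply gathers the features of several layers into one layer, while dissolving removes the boundaries that neighboring polygons share. The two unit squares are hypothetical stand-ins for adjacent basins, and the edge-counting trick only works when shared boundaries match vertex for vertex; real GIS tools compute a proper geometric union.

```python
from collections import Counter


def merge_layers(*layers):
    """Combine the features of several layers into one new list of features."""
    return [feature for layer in layers for feature in layer]


def dissolve_outline(polygons):
    """Return the edges that remain once edges shared by two polygons are removed."""
    counts = Counter()
    for ring in polygons:
        for i in range(len(ring)):
            # An edge is an unordered pair of vertices, so direction does not matter.
            edge = frozenset((ring[i], ring[(i + 1) % len(ring)]))
            counts[edge] += 1
    return [edge for edge, count in counts.items() if count == 1]


# Two hypothetical layers, each holding one square polygon; the squares share an edge.
basin_a = [[(0, 0), (1, 0), (1, 1), (0, 1)]]
basin_b = [[(1, 0), (2, 0), (2, 1), (1, 1)]]

merged = merge_layers(basin_a, basin_b)  # still two separate polygons
outline = dissolve_outline(merged)       # one outer boundary, shared edge gone
print(len(merged), "features after merge;", len(outline), "edges after dissolve")
```

After merging, the layer still contains two features; only the dissolve step fuses them into a single outline.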
Note: The additional hydrological data to complete the exercise is uploaded to the G:/2024_WNTR_ICSM/LAB_02/1_Data/Additional Data folder and to the Dropbox repository.