LAB 04: GENERATE! > TUTORIAL

In this exercise, you will learn several key geoprocessing and analysis tools in QGIS: adding tabular data to the scene, computing a distance matrix to explore the spatial relationships between pairs of locations, and interpolating a distance raster from the results. The generated outputs are as much analytical as they are speculative.



Premise and Objectives

After completing this exercise, you should be able to:

  • add tabular data containing geographic coordinates to the scene
  • reproject a shapefile
  • compute a distance matrix between two sets of locations
  • create a distance raster using the interpolation method


You will then post your work-in-progress to Are.na.


Prep

If you’re working in the SSDLAB virtual environment, copy the data repository for this exercise from the shared drive (G:) to your working location (U:). If not, download the data using this link. In this exercise, we will use two new datasets:


Adding x and y coordinate data as a layer

In addition to geospatial datasets, such as shapefiles or GeoTIFFs, you can also add tabular data that contains the geographic location (in the form of x and y coordinates) to your map: 

  • Launch QGIS and open the Data Source Manager (located on the main toolbar)
  • Open the Delimited text tab
  • Click the Browse button, navigate to ../00_Data/NAS, and select NAS_Dreissena-polymorpha_1986-2023.csv

Several fields will automatically populate. Confirm the following:
  • File Format is set to CSV (comma-separated values)
  • Geometry Definition is set to Point Coordinates; the X field is set to Longitude, and the Y field is set to Latitude
  • Geometry CRS is set to EPSG:4326 - WGS 84

Click Add and close the dialogue box. Save your map project.



Important Note: The CRS units must correspond to the x and y units in the tabular data. If the table contains latitude and longitude information in degrees, always set the Geometry CRS to EPSG:4326 - WGS 84 (unless specified otherwise in the metadata). 
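
As a rough illustration of the unit check described in the note, the sketch below reads a delimited table with Python's standard csv module and sets aside rows whose coordinates cannot be degrees. It assumes the Longitude and Latitude column names from the dialogue box above; the function name and the sample rows are made up for illustration and are not taken from the NAS dataset.

```python
import csv
import io

def read_lonlat_rows(text, x_field="Longitude", y_field="Latitude"):
    """Parse CSV text and return (points, rejected_rows).

    A row is accepted only if both coordinates parse as numbers and fall
    inside the valid range for degrees (EPSG:4326).
    """
    points, rejected = [], []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            lon = float(row[x_field])
            lat = float(row[y_field])
        except (KeyError, TypeError, ValueError):
            rejected.append(row)
            continue
        if -180 <= lon <= 180 and -90 <= lat <= 90:
            points.append((lon, lat))
        else:
            rejected.append(row)
    return points, rejected

# Hypothetical sample: the third row looks like projected meters, not degrees.
SAMPLE = """Longitude,Latitude
-87.60,41.79
-83.05,42.33
512345.0,4645000.0
"""

points, rejected = read_lonlat_rows(SAMPLE)
print(points)
print(len(rejected), "row(s) rejected")
```

If many rows are rejected, the table is probably stored in a projected CRS, and the Geometry CRS should be set from the metadata rather than to EPSG:4326.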

Right-click on the point layer you just added and export its features (Export > Save Features As…) as NAS_zebra-mussel_wgs84.shp. Make sure the shapefile CRS is set to EPSG:4326 - WGS 84.




Reprojecting a vector layer

When using GIS tools or methods involving distance measurements, you should ensure that all your data layers are in the same projected coordinate system using linear units such as meters or feet. The best practice is to also use the same CRS as the Project CRS. In this exercise, we will use NAD 83 / Conus Albers (EPSG:5070).
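
To see why degrees are a poor unit for distance measurements, the sketch below computes the ground length of one degree of longitude at three latitudes. It uses the haversine formula on a sphere with a mean Earth radius, which is an approximation chosen for illustration and not the method QGIS uses to reproject layers; the function name and the sample coordinates are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# One degree of longitude (-88 to -87) measured at different latitudes.
for lat in (0, 42, 60):
    print(lat, round(haversine_km(-88, lat, -87, lat), 1), "km")
```

The same one-degree step shrinks as latitude increases, so a distance expressed in degrees has no fixed ground length. A projected CRS in meters avoids this problem.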

To reproject the zebra mussels layer to NAD 83 / Conus Albers, click through Vector > Data Management Tools > Reproject Layer in the main menu. Set the input layer as NAS_zebra-mussel_wgs84.shp, and the Target CRS as EPSG:5070 - NAD83 / Conus Albers.


Save the reprojected layer as NAS_zebra-mussel_nad83-conus-albers.shp

Add the EPA plastics facilities (EPA_TRI_Plastics_nad83-conus-albers.shp) and the Great Lakes shorelines (main_lakes.shp) shapefiles to your scene. Open the Layer Properties window for both layers and inspect their CRSs. You'll notice that the shorelines shapefile is unprojected. Use the Reproject Layer function again, and set its CRS to NAD 83 / Conus Albers. Save the new layer as GL_shorelines_nad83-conus-albers.shp

Set the project CRS to NAD 83 / Conus Albers and save your map project.

Your map project should look something like this:



Distance From Point to Point

A distance matrix is a two-dimensional array containing distance measurements between two sets of locations. In QGIS, the Distance to nearest hub (points) tool takes two point layers and computes the distance between point features taken as the origin and their closest point from the destination features (hubs). We can use it to answer questions such as:
  • Which plastics facilities are closest to the Great Lakes?
  • Which sites of zebra mussels sightings are nearest (or farthest) from the plastics facilities?
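
A minimal sketch of the idea, assuming both sets of points are already in the same projected CRS with coordinates in meters. The coordinates and the function name below are hypothetical; each row of the matrix corresponds to one origin and each column to one destination.

```python
import math

def distance_matrix(origins, destinations):
    """Return a 2D list where entry [i][j] is the planar distance
    from origins[i] to destinations[j], in the units of the coordinates."""
    return [
        [math.hypot(ox - dx, oy - dy) for (dx, dy) in destinations]
        for (ox, oy) in origins
    ]

# Hypothetical projected coordinates (meters).
origins = [(0, 0), (3, 4)]
destinations = [(0, 0), (6, 8), (3, 0)]

matrix = distance_matrix(origins, destinations)
for row in matrix:
    print(row)
```

The nearest-hub tool keeps only the smallest value in each row of such a matrix, together with the identifier of the destination that produced it.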

To answer the first question, we need to generate a series of points along the coastlines (the tool only takes point features as inputs). The points will serve as a proxy for coastlines:

  • Add the Processing Toolbox panel to your workspace (View > Panels > Processing Toolbox)
  • Find the Points along geometry tool under Vector geometry (start typing “Points along geometry” in the search box)
  • Set the Input layer to GL_shorelines_nad83-conus-albers.shp
  • Set the distance to 2 kilometers, or 2,000 if the tool expects the layer's units of meters (this will generate a point every 2 km along the coast)
  • Save the Interpolated points as GL_shorelines_points.shp.
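
Conceptually, the tool walks along each line and drops a point at a fixed spacing. The sketch below illustrates that idea for a single polyline in projected coordinates; it is a simplified stand-in, not the QGIS implementation, and the function name and sample coordinates are assumptions.

```python
import math

def points_along(line, spacing):
    """Return points placed every `spacing` units along a polyline.

    `line` is a list of (x, y) vertices; the first point sits on the
    first vertex, and distances are measured along the line itself.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    points = []
    target = 0.0      # distance along the line of the next point to place
    travelled = 0.0   # length of all segments processed so far
    for (x0, y0), (x1, y1) in zip(line, line[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue  # skip duplicate vertices
        while target <= travelled + seg:
            t = (target - travelled) / seg  # fraction along this segment
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            target += spacing
        travelled += seg
    return points

# Hypothetical L-shaped line, 8 units long, sampled every 2 units.
sample_line = [(0, 0), (4, 0), (4, 4)]
sample_points = points_along(sample_line, 2)
print(sample_points)
```

With coordinates in meters, a spacing of 2000 corresponds to the 2 km interval used in this step.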



You should see something like this:


Open and inspect the Attribute Tables of the input shorelines layer and the resulting points layer. You will notice that the attributes from the single polygon representing the shorelines are copied to each of the interpolated points we just generated, and there is no unique identifier that would distinguish the points from one another. Although not always required, it’s generally considered good practice to have, at a minimum, a unique Feature ID (FID) for each feature in a feature class.

To do that, we will create a new field in the shoreline points Attribute Table, and assign a row number to it as a unique identifier.

In the shoreline points Attribute Table:
  • Click on the Open Field Calculator button (fourth from right in the toolbar)
  • In the Field Calculator dialogue box, check the Create a new field box
  • Set the Output field name to ID or FID (for feature classes, it’s a convention to name identifiers as “FID”)
  • Double-click on “row_number” (you should see @row_number appear in the Expression Box)
  • Click OK (This will automatically activate edit mode)
  • Click Save edits and toggle editing mode off
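
The effect of this step can be sketched in a few lines: every point keeps the attributes it inherited from the shoreline polygon and gains a sequential number as its identifier. The attribute names, the sample records, and the starting value of 1 are assumptions for illustration; check the first few rows in QGIS to see where @row_number actually begins in your version.

```python
def add_row_ids(features, field="FID", start=1):
    """Return copies of the attribute records with a sequential ID added.

    `features` is a list of dicts (one per point); the originals are not modified.
    """
    return [{**attrs, field: i} for i, attrs in enumerate(features, start=start)]

# Hypothetical points that all inherited the same attributes from one polygon.
shoreline_points = [{"name": "Great Lakes", "distance": d} for d in (0, 2000, 4000)]

numbered = add_row_ids(shoreline_points)
for record in numbered:
    print(record)
```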




Now we’re able to compute distances between our sets of point features:

  • In the Processing Toolbox, search for Distance to nearest hub (points)
  • Set the Source points layer to EPA_TRI_Plastics_nad83-conus-albers
  • Set the Destination hubs layer to GL_shorelines_points.shp
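  • Set the Hub layer name attribute to FID (the unique identifier field you created in the previous step; use ID if that is the name you chose)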
  • Set the Measurement unit to Meters or Kilometers
  • Save the output Distance matrix as EPA_TRI_DistanceToLakes.shp
  • Run the process


Open the Attribute Table of the resulting layer. We have generated two new fields, HubName and HubDist, in which each EPA plastics facility is matched with its nearest coastline point (HubName holds the FID number of that point) and QGIS has computed the distance between them (HubDist) in the measurement unit you selected.
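
The logic behind these two fields can be sketched as follows, assuming planar coordinates in meters. This is an illustration of the idea rather than the QGIS implementation; the function name, the unit_divisor parameter, and the sample facilities and hubs are hypothetical.

```python
import math

def nearest_hub(sources, hubs, unit_divisor=1.0):
    """Attach HubName and HubDist to each source record.

    `sources` is a list of dicts with "x" and "y" keys; `hubs` maps a hub
    identifier to its (x, y) position. Distances are divided by
    `unit_divisor` (for example 1000 to report kilometers from meters).
    """
    results = []
    for src in sources:
        hub_id, dist = min(
            ((hid, math.hypot(src["x"] - hx, src["y"] - hy)) for hid, (hx, hy) in hubs.items()),
            key=lambda pair: pair[1],
        )
        results.append({**src, "HubName": hub_id, "HubDist": dist / unit_divisor})
    return results

# Hypothetical shoreline points keyed by FID, and two hypothetical facilities.
hubs = {1: (0, 0), 2: (2000, 0), 3: (4000, 0)}
facilities = [
    {"name": "A", "x": 2000, "y": 1500},
    {"name": "B", "x": 3900, "y": 0},
]

matched = nearest_hub(facilities, hubs, unit_divisor=1000)  # report kilometers
for record in matched:
    print(record["name"], record["HubName"], record["HubDist"])
```

Sorting the result by HubDist is one way to answer the first question above, since the facilities with the smallest values are the ones closest to the lakes.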



On your own:
  1. Create a "water bodies" point feature class that includes the Great Lakes shorelines and the nearby rivers. Then compute a new distance matrix of the EPA plastics facilities in relation to the water bodies, including rivers. (Hint: you will have to use selection methods to create a subset of the ne_rivers_north_america.shp features, create points along geometry, and then merge those points with the lakes' shorelines points that you already have. We covered multiple selection methods in the Lab 02 tutorial and the Merge Vector Layers tool in the appendix.)
  2. Compute a distance matrix between the NAS_zebra-mussel and the EPA_TRI_Plastics data layers.


Distance Raster (Interpolation)

Interpolation methods use points with known values to estimate values at other unknown points. We can use it to predict unknown values for any geographic point data, such as elevation, distance, chemical concentrations, noise levels, etc.

One method to do so is the Inverse Distance Weighting (IDW) interpolation technique, also known as the weighted average interpolator. It estimates the value at each grid cell as a weighted average of the known point values, giving nearer points more weight than distant ones, and outputs the result as a raster grid (here, a distance raster). Distance rasters are powerful analytical tools, but even more powerful visual tools.
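
The weighted average behind IDW can be sketched in a few lines. The example below is a simplified illustration, not the algorithm QGIS runs: it assumes planar coordinates, uses every sample for every cell, and takes a power of 2 as a common choice. The function names, grid layout, and sample values are hypothetical.

```python
import math

def idw(x, y, samples, power=2.0):
    """Estimate a value at (x, y) from samples given as (sx, sy, value)."""
    num = den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return value  # exactly on a sample point: use its value
        w = 1.0 / d ** power  # nearer samples receive larger weights
        num += w * value
        den += w
    return num / den

def idw_grid(samples, xmin, ymin, cell, ncols, nrows, power=2.0):
    """Evaluate IDW at cell centers; rows run from north (top) to south."""
    grid = []
    for r in range(nrows):
        yc = ymin + (nrows - r - 0.5) * cell
        grid.append([idw(xmin + (c + 0.5) * cell, yc, samples, power) for c in range(ncols)])
    return grid

# Two hypothetical points with known values (for example, HubDist in km).
samples = [(0, 0, 10.0), (10, 0, 30.0)]
print(idw(2.5, 0, samples))
grid = idw_grid(samples, xmin=0, ymin=-5, cell=5, ncols=2, nrows=2)
for row in grid:
    print(row)
```

Raising the power makes each estimate follow its nearest sample more closely, while lowering it produces a smoother surface.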

The distance matrices you just created are ideal inputs for IDW interpolation:

  • In the main menu, navigate to Raster > Analysis > Grid (Inverse Distance to a Power)
  • Set the Point layer to EPA_TRI_DistanceToLakes.shp
  • Set the Z value from field to HubDist
  • Run the operation


Finally, save the resulting raster as EPA_distToLakes_grid.tif and adjust the symbology:


On your own: Use this same workflow to generate new interpolation rasters for the distance between the EPA sites and the zebra mussel sightings (or any other point feature class of interest).







INTRODUCTION TO CRITICAL SPATIAL MEDIA / CEGU 23517 / ENST 23517 / ARCH 23517 / DIGS 23517 / ARTV 20665 / MAAD 13517 | WINTER 2024

INSTRUCTORS: Alexander Arroyo, Grga Bašić, Sol Kim

URBAN THEORY LAB   |   COMMITTEE ON ENVIRONMENT, GEOGRAPHY, AND URBANIZATION   |    UNIVERSITY OF CHICAGO