This lesson is in the early stages of development (Alpha version)

Introduction to Geospatial Raster and Vector Data with Python

Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain.

Getting Started

Data Carpentry’s teaching is hands-on, so participants are encouraged to use their own computers to ensure the proper setup of tools for an efficient workflow. To use these materials most effectively, please download the data and install the required software before working through this lesson.

This workshop assumes no prior experience with the tools it covers. However, learners with prior experience working with geospatial data may be able to skip episodes 1-4, which focus on geospatial concepts and tools. Similarly, learners who have prior experience with the Python programming language may wish to skip the Plotting and Programming in Python lesson.

To get started, follow the directions in the Setup tab to get access to the required software and data for this workshop.

Data

The data used in this lesson includes optical satellite images from the Copernicus Sentinel-2 mission and public geographical datasets from the dedicated distribution platform of the Dutch government.

These are real-world datasets that are complex enough to teach many aspects of data analysis and management. They have been selected to allow students to focus on the core ideas and skills being taught while offering the chance to encounter common challenges with geospatial data.

Follow the directions in the Setup tab to download the required files.

Workshop Overview

Lesson Starting Points Overview
Episode 1: Introduction to Raster Data
Understand data structures and common storage and transfer formats for spatial data. Start here if you want to understand fundamental geospatial concepts like coordinate reference systems, rasters, and vectors.
Plotting and Programming in Python
Import data into Python, calculate summary statistics, and create publication-quality graphics. Start here if you have an understanding of geospatial concepts but want to learn Python fundamentals.
Episode 5: Access satellite imagery using Python
Open, work with, and plot vector and raster-format spatial data in Python. Start here if you already have a good grasp of geospatial concepts and a working knowledge of Python.
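
As a rough illustration of the difference between rasters and vectors mentioned above, the sketch below uses plain Python structures as stand-ins for the two data models. It is not how the lesson's libraries represent data; the dictionary layout, the `cell_center` helper, and all coordinate values are hypothetical and chosen only to show that a raster stores a grid plus georeferencing, while a vector feature stores explicit coordinates.

```python
# Illustrative sketch only: plain-Python stand-ins for the two data models.
# All names and values are hypothetical, not taken from the lesson data.

# A raster is a regular grid of cell values plus georeferencing information.
raster = {
    "values": [
        [1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0],
    ],
    "origin": (100.0, 200.0),  # x, y of the upper-left corner, in assumed CRS units
    "cell_size": 10.0,         # width and height of each square cell
}


def cell_center(raster, row, col):
    """Return the x, y coordinates of the center of a raster cell."""
    x0, y0 = raster["origin"]
    size = raster["cell_size"]
    x = x0 + (col + 0.5) * size
    y = y0 - (row + 0.5) * size  # y decreases as the row index increases
    return x, y


# A vector feature stores explicit coordinates, optionally with attributes.
point = {"type": "Point", "coordinates": (105.0, 195.0), "name": "sample site"}
line = {"type": "LineString", "coordinates": [(100.0, 200.0), (130.0, 180.0)]}

print(cell_center(raster, 0, 0))
print(cell_center(raster, 1, 2))
```

In this sketch a cell's location is derived from its row and column, whereas the point and line carry their coordinates directly. Both sets of coordinates are only meaningful together with a coordinate reference system.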

Schedule

Setup
Download files required for the lesson
00:00 1. Introduction to Raster Data
What format should I use to represent my data?
What are the main data types used for representing geospatial data?
What are the main attributes of raster data?
00:20 2. Introduction to Vector Data
What are the main attributes of vector data?
00:35 3. Coordinate Reference Systems
What is a coordinate reference system and how do I interpret one?
01:00 4. The Geospatial Landscape
What programs and applications are available for working with geospatial data?
01:10 5. Access satellite imagery using Python
Where can I find open-access satellite data?
How do I search for satellite imagery with the STAC API?
How do I fetch remote raster datasets using Python?
01:55 6. Read and visualize raster data
How is a raster represented by rioxarray?
How do I read and plot raster data in Python?
How can I handle missing data?
03:15 7. Vector data in Python
How can I distinguish between and visualize point, line and polygon vector data?
04:05 8. Crop raster data with rioxarray and geopandas
How can I crop my raster data to the area of interest?
05:05 9. Raster Calculations in Python
How do I perform calculations on rasters and extract pixel values for defined locations?
06:20 10. Calculating Zonal Statistics on Rasters
How do I compute raster statistics on different zones delineated by vector data?
07:20 11. Parallel raster computations using Dask
How can I parallelize computations on rasters with Dask?
How can I determine if parallelization improves calculation speed?
What are good practices in applying parallelization to my raster calculations?
08:20 Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.