This lesson is in the early stages of development (Alpha version)

The Geospatial Landscape

Overview

Teaching: 10 min
Exercises: 0 min
Questions
  • What programs and applications are available for working with geospatial data?

Objectives
  • Describe the difference between various approaches to geospatial computing, and their relative strengths and weaknesses.

  • Name some commonly used GIS applications.

  • Name some commonly used Python packages that can access and process spatial data.

  • Describe pros and cons for working with geospatial data using a command-line versus a graphical user interface.

Standalone Software Packages

Most traditional GIS work is carried out in standalone applications that aim to provide end-to-end geospatial solutions. These applications are available under a wide range of licenses and price points. Some of the most common are listed below.

Open-source software

The Open Source Geospatial Foundation (OSGeo) supports several actively managed GIS platforms:

Commercial software

Online + Cloud computing

Private companies have released SDK platforms for large-scale GIS analysis:

Publicly funded open-source platforms are also available for large-scale GIS analysis:

GUI vs CLI

The earliest computer systems operated without a graphical user interface (GUI), relying only on the command-line interface (CLI). Since mapping and spatial analysis are strongly visual tasks, GIS applications benefited greatly from the emergence of GUIs and quickly came to rely heavily on them. Most modern GIS applications have very complex GUIs, with all common tools and procedures accessed via buttons and menus.

Benefits of using a GUI include:

Downsides of using a GUI include:

In scientific computing, the lack of reproducibility in point-and-click software has come to be viewed as a critical weakness. As such, scripted CLI-style workflows are again becoming popular, which leads us to another approach to doing GIS: via a programming language. This is the approach we will be using throughout this workshop.

GIS in programming languages

A number of powerful geospatial processing libraries exist for general-purpose programming languages like Java and C++. However, the learning curve for these languages is steep and the effort required is excessive for users who only need a subset of their functionality.

Higher-level scripting languages like Python and R are easier to learn and use. Both now have packages that wrap those geospatial processing libraries and make them easier to access and use safely. A key example is the Java Topology Suite (JTS), which has been ported to C++ as GEOS. GEOS is accessible in Python via the shapely package (and via geopandas, which builds on shapely) and in R via sf. R and Python also have interface packages for GDAL and for specific GIS applications.
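
As a minimal sketch of what calling GEOS through shapely looks like, the example below overlays two squares and tests whether a point falls inside each of them. The geometries and variable names are made up for illustration, and the sketch assumes the shapely package is installed.

```python
from shapely.geometry import Point, box

# Two overlapping 2 x 2 squares; box() takes (minx, miny, maxx, maxy)
square_a = box(0, 0, 2, 2)
square_b = box(1, 1, 3, 3)

# GEOS computes the overlay; shapely returns the result as a new geometry
overlap = square_a.intersection(square_b)
print(overlap.area)  # area of the shared 1 x 1 square

# Spatial predicates return plain Python booleans
site = Point(0.5, 0.5)
print(square_a.contains(site))  # True: the point lies inside square_a
print(square_b.contains(site))  # False: the point lies outside square_b
```

The same operations are available on whole columns of geometries in geopandas, which is usually more convenient once the data come from a file.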

This last point is a huge advantage for GIS-by-programming: these interface packages give you access to functions unique to particular programs while keeping your entire workflow recorded in a central document that can be re-run at will. Below are lists of some of the key spatial packages for Python, which we will be using in the remainder of this workshop.

These packages, along with the matplotlib package, are all we need for spatial data visualisation. Python also has many fundamental scientific packages that are relevant in the geospatial domain. Below is a list of particularly fundamental packages. numpy, scipy, and scikit-image are all excellent options for working with rasters as arrays.
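
To illustrate the idea of treating a raster as an array, here is a small sketch using numpy only. The elevation values and the -9999 nodata marker are invented for the example; a real raster would normally be read from a file by a dedicated package.

```python
import numpy as np

# A hypothetical 3 x 4 single-band elevation raster (metres).
# -9999 is an assumed nodata marker, not a universal convention.
NODATA = -9999
elevation = np.array([
    [120, 135, 150, NODATA],
    [110, 125, 140, 155],
    [NODATA, 115, 130, 145],
], dtype=float)

# Boolean mask of cells that hold real measurements
valid = elevation != NODATA

print(elevation.shape)               # (rows, columns)
mean_elev = elevation[valid].mean()  # mean over valid cells only
print(mean_elev)

# Count valid cells above 130 m using element-wise comparison
n_high = int(((elevation > 130) & valid).sum())
print(n_high)
```

Because every operation is applied to the whole array at once, the same few lines work unchanged on rasters with millions of cells.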

An overview of these and other Python spatial packages can be accessed here.

As a programming language, Python can be a CLI tool. However, using Python together with an Integrated Development Environment (IDE) application allows some GUI features to become part of your workflow. IDEs offer the best of both worlds. They provide a place to visually examine data and other software objects, interact with your file system, and draw plots and maps, but your activities are still command-driven: recordable and reproducible. There are several IDEs available for Python. JupyterLab is well developed and one of the most widely used options for data science in Python. VSCode and Spyder are other popular options.

Traditional GIS apps are also moving back towards providing a scripting environment for users, further blurring the CLI/GUI divide. Esri has adopted Python into its software, and QGIS is both Python- and R-friendly.

GIS File Types

A variety of file types are used in GIS analysis. Which of them you can read depends on the program you choose: some applications support only a subset of formats. The tables below briefly describe some of the most common vector and raster file types.

Vector

File Type | Extensions | Description
Esri Shapefile | .SHP .DBF .SHX | The most common geospatial file type and the industry standard. Three files are required: SHP holds the feature geometry, SHX holds the shape index positions, and DBF holds the attribute data.
Geographic JavaScript Object Notation (GeoJSON) | .GEOJSON .JSON | Used for web-based mapping. It uses JavaScript Object Notation to store the coordinates as text.
Google Keyhole Markup Language (KML) | .KML .KMZ | An XML-based GIS format that is primarily used for Google Earth.
OpenStreetMap | .OSM | OSM files are the native file format for OpenStreetMap, which has become the largest crowdsourced GIS data project in the world. These files are collections of vector features contributed by the open community.
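
Because GeoJSON stores coordinates as text, it can be written and read with nothing more than Python's built-in json module. The sketch below builds a one-feature collection and reads the coordinates back. The site name and coordinates are hypothetical.

```python
import json

# A hypothetical point feature; GeoJSON orders coordinates as [longitude, latitude]
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-105.27, 40.01]},
    "properties": {"name": "Example site"},
}
collection = {"type": "FeatureCollection", "features": [feature]}

# Serialise to text, as it would be stored in a .geojson file
text = json.dumps(collection, indent=2)
print(text)

# Parse the text back and pull out the coordinates
parsed = json.loads(text)
lon, lat = parsed["features"][0]["geometry"]["coordinates"]
print(lon, lat)
```

In practice a package such as geopandas would handle this step, but the example shows that the format itself is ordinary, human-readable JSON.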

Raster

File Type | Extensions | Description
ERDAS Imagine | .IMG | ERDAS Imagine IMG is a proprietary file format developed by Hexagon Geospatial. IMG files are commonly used to store single and multiple bands of satellite raster data. Each raster layer in an IMG file carries information about its data values, including projection, statistics, attributes, pyramids, and whether the raster is continuous or discrete.
GeoTIFF | .TIF .TIFF .OVR | The GeoTIFF has become an industry-standard image file for GIS and satellite remote sensing applications. GeoTIFFs may be accompanied by other files: TFW is the world file that gives your raster geolocation; XML files optionally accompany GeoTIFFs and hold your metadata; AUX auxiliary files store projections and other information; OVR pyramid files improve performance for raster display.
Cloud Optimized GeoTIFF (COG) | .TIF .TIFF | Based on the GeoTIFF standard, COGs incorporate tiling and overviews to support HTTP range requests, so users can query and load subsets of the image without having to transfer the entire file.
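
To make the role of the TFW world file more concrete, the sketch below parses one and converts pixel positions to map coordinates. A world file is six lines of plain text holding the coefficients A, D, B, E, C, F of an affine transform, where x = A*col + B*row + C and y = D*col + E*row + F, and (C, F) is the centre of the upper-left pixel. The numeric values and the function names parse_world_file and pixel_to_map are assumptions made up for this example.

```python
def parse_world_file(text):
    """Return the six world-file coefficients in file order: A, D, B, E, C, F."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six numbers")
    return tuple(values)


def pixel_to_map(col, row, coeffs):
    """Map a pixel centre at (col, row) to (x, y) using the affine transform."""
    a, d, b, e, c, f = coeffs
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical contents of a .tfw file: 30 m pixels, no rotation,
# and a negative y pixel size because rows count downwards from the top.
TFW_TEXT = """30.0
0.0
0.0
-30.0
500015.0
4100985.0
"""

coeffs = parse_world_file(TFW_TEXT)
print(pixel_to_map(0, 0, coeffs))    # centre of the upper-left pixel
print(pixel_to_map(10, 20, coeffs))  # 10 columns right, 20 rows down
```

The world file carries no information about the coordinate reference system itself, so the resulting x and y values are only meaningful once you know which projection the raster uses.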

Key Points

  • Many software packages exist for working with geospatial data.

  • Command-line programs allow you to automate and reproduce your work.

  • JupyterLab provides a user-friendly interface for working with Python.