Training Resources for Geospatial Computing

This is a curated list of training resources for popular packages that are used in geospatial computing. The current list primarily includes Python packages, but it will be updated to include packages from other languages, e.g. R. If you need resources for any particular package, please contact us so we can update the list accordingly.

Please feel free to contribute by adding resources you find useful for geospatial research, education, and training purposes.

Common Resources for Geospatial Computing and Earth Observation

These are links to resources that have useful information and tutorials on geospatial computing and Earth observation.

Dask

Dask is a flexible library for parallel computing in Python.

Dask is composed of two parts:

  1. Dynamic task scheduling optimized for computation. This is similar to Airflow, Luigi, Celery, or Make, but optimized for interactive computational workloads.

  2. “Big Data” collections like parallel arrays, dataframes, and lists that extend common interfaces like NumPy, Pandas, or Python iterators to larger-than-memory or distributed environments. These parallel collections run on top of dynamic task schedulers.
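The two parts described above can be sketched in a few lines. This is a minimal illustration, assuming Dask is installed with its array support; the function `square` and the array sizes are made up for the example.

```python
import dask.array as da
from dask import delayed


# Part 1: dynamic task scheduling.
# `delayed` records each call as a node in a task graph instead of running it.
@delayed
def square(x):  # hypothetical helper used only for this illustration
    return x * x


# Build a small graph: four independent squares feeding one sum.
total = delayed(sum)([square(i) for i in range(4)])
scheduled_result = total.compute()  # the scheduler runs the graph here

# Part 2: a "Big Data" collection with a NumPy-like interface.
# The array is split into 250 x 250 chunks that can be processed in parallel.
x = da.ones((1000, 1000), chunks=(250, 250))
array_result = float(x.sum().compute())

print(scheduled_result, array_result, x.numblocks)
```

Nothing is computed until `.compute()` is called, which is what lets the scheduler decide how to run the work.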

Xarray

xarray is an open source project and Python package that makes working with labelled multi-dimensional arrays simple, efficient, and fun!
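As a rough sketch of what "labelled" means in practice, the example below builds a small array with named dimensions and coordinates, then selects and reduces by name rather than by position. The variable names and values are invented for illustration.

```python
import numpy as np
import xarray as xr

# A 2 x 3 array with named dimensions and coordinate labels (toy values).
temps = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    dims=("time", "lat"),
    coords={"time": [2020, 2021], "lat": [10.0, 20.0, 30.0]},
    name="temperature",
)

# Select by coordinate label instead of by integer position.
by_label = temps.sel(time=2021, lat=20.0)

# Reduce over a dimension by name; the result keeps the "lat" labels.
time_mean = temps.mean(dim="time")

print(float(by_label), time_mean.values)
```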

GeoPandas

GeoPandas is an open source project to make working with geospatial data in Python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types.
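A minimal sketch of spatial operations on a pandas-style table follows. It assumes GeoPandas and Shapely are installed; the points are toy planar coordinates with no coordinate reference system set, so distances are unitless.

```python
import geopandas as gpd
from shapely.geometry import Point

# A GeoDataFrame is a pandas DataFrame with a geometry column (toy data).
gdf = gpd.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[Point(0, 0), Point(3, 4)],
)

# Spatial operations apply element-wise, like ordinary pandas columns.
buffers = gdf.buffer(1.0)                  # a disc of radius 1 around each point
distances = gdf.distance(Point(0, 0))      # distance from each point to the origin
inside = buffers.contains(Point(0.5, 0))   # which discs contain the test point?

print(distances.tolist(), inside.tolist())
```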

Rasterio

Geographic information systems use GeoTIFF and other formats to organize and store gridded raster datasets such as satellite imagery and terrain models. Rasterio reads and writes these formats and provides a Python API based on NumPy N-dimensional arrays and GeoJSON.

Plotly

Plotly’s Python graphing library makes interactive, publication-quality graphs.

NumPy

NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.
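A few of the operations listed above, shown on a small toy grid as an illustrative sketch:

```python
import numpy as np

# A 3 x 4 multidimensional array built by reshaping a range (toy values).
grid = np.arange(12).reshape(3, 4)

# Basic statistics along an axis: the mean of each column.
col_means = grid.mean(axis=0)

# A derived object: a masked array that hides the odd values.
masked = np.ma.masked_where(grid % 2 == 1, grid)
even_sum = masked.sum()  # only the unmasked (even) values are summed

# Sorting: flatten, sort ascending, then reverse for descending order.
sorted_desc = np.sort(grid, axis=None)[::-1]

print(col_means, even_sum, sorted_desc[:3])
```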

Matplotlib

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python.

Pandas

pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
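As a small illustration of the data structures and analysis tools mentioned above, the sketch below groups a toy table by a key column and aggregates it. The station names and rainfall values are invented.

```python
import pandas as pd

# A DataFrame is a labelled, column-oriented table (toy data).
df = pd.DataFrame(
    {
        "station": ["A", "A", "B", "B"],
        "rain_mm": [1.0, 3.0, 2.0, 6.0],
    }
)

# Split-apply-combine: mean rainfall per station.
mean_by_station = df.groupby("station")["rain_mm"].mean()

print(mean_by_station)
```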

Scikit-Learn

Seaborn

Seaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.

Cartopy

Cartopy is a Python package which provides a set of tools for creating projection-aware geospatial plots using Python’s standard plotting package, matplotlib. Cartopy also has a robust set of tools for defining projections and reprojecting data.

PySAL

PySAL is the Python Spatial Analysis Library for geospatial data science.

SpatioTemporal Asset Catalogs

The SpatioTemporal Asset Catalog (STAC) specification provides a common language to describe a range of geospatial information, so it can more easily be indexed and discovered. A ‘spatiotemporal asset’ is any file that represents information about the Earth captured in a certain space and time.
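To make the idea concrete, here is a hedged sketch of a STAC Item written as a plain Python dictionary, using the top-level fields of the STAC 1.0.0 Item specification. The `id`, geometry, and asset URL are placeholders, and `bbox_intersects` is a hypothetical helper, not part of any STAC library.

```python
import json

# A minimal STAC Item: a GeoJSON Feature plus STAC-specific fields.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-scene",  # placeholder identifier
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
    },
    "bbox": [0.0, 0.0, 1.0, 1.0],  # "where": west, south, east, north
    "properties": {"datetime": "2021-01-01T00:00:00Z"},  # "when"
    "links": [],
    "assets": {
        "image": {
            "href": "https://example.com/example-scene.tif",  # placeholder URL
            "type": "image/tiff; application=geotiff",
        }
    },
}


def bbox_intersects(a, b):
    """Return True if two [west, south, east, north] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Because every Item carries a bbox and a datetime, simple indexing is possible.
print(bbox_intersects(item["bbox"], [0.5, 0.5, 2.0, 2.0]))
print(json.dumps(item)[:40])
```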

geemap

geemap is a Python package for interactive mapping with Google Earth Engine (GEE), which is a cloud computing platform with a multi-petabyte catalog of satellite imagery and geospatial datasets.

PyTorch

PyTorch is an optimized tensor library for deep learning using GPUs and CPUs.

TensorFlow

TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML powered applications.

JAX

JAX is Autograd and XLA, brought together for high-performance numerical computing and machine learning research.
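The two ingredients can be seen together in a short sketch: `jax.grad` provides Autograd-style differentiation, and `jax.jit` compiles the result with XLA. The `loss` function below is a toy example chosen for illustration.

```python
import jax
import jax.numpy as jnp


def loss(w):
    # Toy objective: sum of squares, whose gradient is 2 * w.
    return jnp.sum(w ** 2)


# Differentiate the function, then compile the gradient with XLA.
grad_loss = jax.jit(jax.grad(loss))

g = grad_loss(jnp.array([1.0, 2.0, 3.0]))
print(g)
```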

Textbooks

These are textbooks related to deep learning and geospatial computing. You can also view all the textbooks in the Textbooks folder.

Miscellaneous Libraries

Fiona

Fiona reads and writes geographic data files and thereby helps Python programmers integrate geographic information systems with other computer systems. Fiona contains extension modules that link the Geospatial Data Abstraction Library (GDAL).

Xarray-spatial

Xarray-Spatial implements common raster analysis functions using Numba and provides an easy-to-install, easy-to-extend codebase for raster analysis.

rioxarray

rioxarray is a geospatial xarray extension powered by rasterio.

Regionmask

regionmask is a Python module that:

  1. Contains a number of defined regions, including countries (from Natural Earth), a landmask, and regions used in the scientific literature (the Giorgi regions and the SREX regions).

  2. Can plot figures of these regions with matplotlib and cartopy.

  3. Can be used to create masks of the regions for arbitrary longitude and latitude grids (2D integer masks and 3D boolean masks).

  4. Supports shapefiles via geopandas.

  5. Allows arbitrary regions to be defined easily.

Geocube

Geocube is a tool to convert GeoPandas vector data into rasterized xarray data.

Salem

Salem is a small library to do geoscientific data processing and plotting. It extends xarray to add geolocalised subsetting, masking, and plotting operations to xarray’s DataArray and Dataset structures.