This repository serves as a brief introduction to the field of geo-spatial data. The objective is to play around with various solutions proposed by the geo-spatial and GIS communities.
Specifically, this repository covers:
- Spatio-Temporal Asset Catalog (STAC) creation and querying using pystac and pystac_client.
- Creating monthly mosaics of satellite data using Dask.
- Cloud Optimized GeoTIFFs (COGs) and the use of tilers using titiler.
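As a rough illustration of what a STAC query does, the sketch below builds two minimal STAC-style Items as plain dictionaries and filters them by bounding box and time range, the same two filters that pystac_client exposes through `Client.search(bbox=..., datetime=...)`. It uses only the standard library rather than pystac itself. The item IDs, extents, asset URLs, and the helper names `make_item`, `bboxes_intersect`, and `search_items` are made up for illustration and are not taken from the notebooks.

```python
from datetime import datetime, timezone


def make_item(item_id, bbox, when, href):
    """Build a minimal STAC-style Item (a GeoJSON Feature) as a plain dict."""
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": list(bbox),
        "geometry": {
            "type": "Polygon",
            # Closed ring: the first and last coordinates are identical.
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],
            ]],
        },
        "properties": {"datetime": when.isoformat()},
        "assets": {
            "visual": {
                "href": href,
                "type": "image/tiff; application=geotiff; profile=cloud-optimized",
            }
        },
        "links": [],
    }


def bboxes_intersect(a, b):
    """True if two (west, south, east, north) boxes overlap.

    Simplification: boxes crossing the antimeridian are not handled.
    """
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search_items(items, bbox, start, end):
    """Return items whose bbox overlaps `bbox` and whose datetime lies in [start, end]."""
    hits = []
    for item in items:
        when = datetime.fromisoformat(item["properties"]["datetime"])
        if bboxes_intersect(item["bbox"], bbox) and start <= when <= end:
            hits.append(item)
    return hits


# Hypothetical items; the IDs, extents, and URLs are placeholders.
items = [
    make_item("scene-jan", (4.0, 52.0, 5.0, 53.0),
              datetime(2023, 1, 15, tzinfo=timezone.utc),
              "https://example.com/jan.tif"),
    make_item("scene-jun", (10.0, 45.0, 11.0, 46.0),
              datetime(2023, 6, 15, tzinfo=timezone.utc),
              "https://example.com/jun.tif"),
]

found = search_items(
    items,
    bbox=(4.5, 52.5, 6.0, 54.0),
    start=datetime(2023, 1, 1, tzinfo=timezone.utc),
    end=datetime(2023, 1, 31, tzinfo=timezone.utc),
)
print([item["id"] for item in found])  # ['scene-jan']
```

A real STAC API applies these filters server-side, so an empty result from a self-built catalog can come from the item metadata (for example a missing `bbox` or `datetime`) even when the catalog validates.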
git clone git@github.com:alexberndt/geospatial-data
cd geospatial-data
poetry install
Get started with a Jupyter notebook by running:
poetry run jupyter-lab
This repository consists of:
- STAC Registry and Querying
  src/ml_notebooks/stac/stac.ipynb
  src/ml_notebooks/stac/read_stac_api.ipynb
  with helper functions written in a separate file, src/ml_notebooks/stac/helper.py, to aid code readability.
  Although catalog.validate() passed all checks, I struggled to query the data using the pystac_client tool. Any idea as to what I was doing wrong?
  To avoid being blocked by this, I used a publicly available STAC catalog to test STAC queries (see the read_stac_api.ipynb notebook).
- Exploration of Provided Satellite Data
  src/ml_notebooks/stac/explore_data.ipynb
- Monthly Mosaic
  src/ml_notebooks/conda/monthly_mosaic.ipynb
  As mentioned above, I struggled to get the STAC queries to work without error, so I ended up testing the Coiled-based cloud cluster using the example STAC catalog provided by Planetary Computer. The results are documented in this notebook:
  src/ml_notebooks/conda/median_mosaic.ipynb
- COG Visualization
  src/ml_notebooks/cog/cog_visualizations.ipynb
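To give a feel for how a tiler serves a COG, the sketch below converts a longitude/latitude pair to XYZ (Web Mercator) tile indices and builds a tile request URL. The tile arithmetic is the standard slippy-map formula. The endpoint path `/cog/tiles/WebMercatorQuad/{z}/{x}/{y}.png` with a `url` query parameter follows the layout of recent titiler releases, but it differs between versions, so treat it as an assumption. The tiler address, the COG URL, and the function names are placeholders and are not taken from the notebook.

```python
import math
from urllib.parse import urlencode


def lonlat_to_tile(lon, lat, zoom):
    """Convert WGS84 lon/lat (degrees) to XYZ Web Mercator tile indices."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lat_rad = math.radians(lat)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or near-polar latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_url(tiler_base, cog_url, lon, lat, zoom):
    """Build a tile request URL for the tile containing (lon, lat).

    Assumption: the tiler exposes /cog/tiles/WebMercatorQuad/{z}/{x}/{y}.png
    and reads the COG location from the `url` query parameter.
    """
    x, y = lonlat_to_tile(lon, lat, zoom)
    query = urlencode({"url": cog_url})
    return f"{tiler_base}/cog/tiles/WebMercatorQuad/{zoom}/{x}/{y}.png?{query}"


# Placeholder tiler address and COG location.
print(lonlat_to_tile(0.0, 0.0, 1))  # (1, 1)
print(tile_url("http://localhost:8000", "https://example.com/scene.tif", 0.0, 0.0, 1))
```

Because a COG stores internal tiles and overviews, the tiler only needs to read the byte ranges covering the requested tile rather than the whole file.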
The assets are saved as follows
Here follows a list of technologies commonly used to solve ML-related problems in the geo-spatial industry: