New Xarray accessor for rasters through GeoUtils #8041
Description
Hi all,
(Co-opening in rioxarray due to the raster nature of the accessor: corteva/rioxarray#687)
As in #8040 for DEMs, I'm writing this issue to let you know that we intend to write an Xarray accessor to enable functions specific to raster analysis in our package GeoUtils. GeoUtils is built on top of rasterio and aims to facilitate raster/vector manipulation.
To answer the question you'll probably ask: why the need for another raster accessor when there are rioxarray and xarray-spatial?
In short:
- For rioxarray and georeferencing: we simplify the handling of georeferenced data of rasterio/rioxarray for end-users focusing on analysis, by allowing match operations between any raster/vector for all types of functions (reproject, crop, rasterize, polygonize, etc.) and by reading the metadata of each object so that they work implicitly in any CRS. Done separately, those can be a steep learning curve for beginners, and lead to inconsistent results for these "basic operations" in the community.
- For xarray-spatial and analysis tools: we simply want to add a lot more functionalities that (1) understand georeferencing, (2) are robust to nodata and (3) respect the pixel interpretation of rasters (corner or center?). In particular: local and zonal stats, variography, 2D registration, filters, grid interpolation, error propagation, etc. We'd wrap functionalities existing in the non-GIS xarray ecosystem whenever we can, and adapt them to georeferenced ops. Those can be tricky to adapt due to the above 3 points, so we really feel the need for them to be implemented & tested consistently somewhere. We'd build on top of xarray-spatial for what exists there, and try to coordinate! 😊
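To illustrate point (3), here is a minimal pure-Python sketch of why pixel interpretation matters: the same grid yields coordinates half a pixel apart depending on whether a value is attached to the pixel corner or the pixel center. The function name and its parameters are hypothetical and only for illustration; this is not how GeoUtils, rasterio or rioxarray implement it.

```python
def pixel_x_coords(x_origin, x_res, width, interpretation="center"):
    """Return the x coordinate of each column of a regular grid.

    x_origin is the left edge of the grid, x_res the pixel width.
    Hypothetical helper for illustration only.
    """
    if interpretation not in ("corner", "center"):
        raise ValueError("interpretation must be 'corner' or 'center'")
    # A center interpretation shifts every coordinate by half a pixel.
    offset = 0.5 if interpretation == "center" else 0.0
    return [x_origin + (col + offset) * x_res for col in range(width)]


corner_x = pixel_x_coords(100.0, 10.0, 3, "corner")
center_x = pixel_x_coords(100.0, 10.0, 3, "center")
print(corner_x)  # [100.0, 110.0, 120.0]
print(center_x)  # [105.0, 115.0, 125.0]
```

Mixing the two conventions silently offsets rasters by half a pixel, which is the kind of inconsistency we would like the analysis functions to handle for the user.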
The accessor would mirror all the functionalities we have (and future ones) and build them on top of rioxarray and geocube. Those are:
- Match-reference georeferencing manipulation (a reference = another xarray.Dataset or a geopandas.GeoDataFrame, when using reproject, crop, rasterize, polygonize, to allow implicit metadata handling and facilitate quick analysis),
- Support for spatial georeferenced operations with nodata values, such as proximity (I see a bit of overlap with Xarray-Spatial/Proximity: https://xarray-spatial.org/user_guide/surface.html),
- 2D registration for georeferenced data,
- Spatial statistics for georeferenced data,
- Error propagation for georeferenced data,
- Filters for georeferenced data,
- Parsing sensor metadata from filenames/auxiliary data for most common satellite data (might evolve in a different package in time!).
For the accessor name, I was thinking of "geo" or "gu", such as: ds.geo.polygonize(), ds.geo.proximity(), ds.geo.coregister(). I'm not sure if those are already in use. What do you think?
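To make the naming question concrete, here is a minimal sketch of how an accessor under the "geo" name could be registered with xarray's register_dataset_accessor. The method valid_mean and its nodata handling are hypothetical placeholders for illustration, not the GeoUtils API; the real accessor would expose methods such as polygonize(), proximity() or coregister().

```python
import numpy as np
import xarray as xr


@xr.register_dataset_accessor("geo")
class GeoAccessor:
    """Hypothetical accessor sketch; not the actual GeoUtils implementation."""

    def __init__(self, xarray_obj):
        self._obj = xarray_obj

    def valid_mean(self, var, nodata):
        """Mean of one variable, ignoring cells equal to the nodata value."""
        data = self._obj[var]
        # Mask nodata cells to NaN; DataArray.mean skips NaN for float data.
        return float(data.where(data != nodata).mean())


ds = xr.Dataset(
    {"elevation": (("y", "x"), np.array([[1.0, 2.0], [-9999.0, 3.0]]))},
    coords={"y": [1.5, 0.5], "x": [0.5, 1.5]},
)
mean_elevation = ds.geo.valid_mean("elevation", nodata=-9999.0)
print(mean_elevation)  # 2.0
```

As far as I understand, xarray emits a warning when an accessor name overrides an existing attribute, so a clash with another package using "geo" would at least be visible at import time.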
Thanks!