This project was created to track down RasterSources API regressions.
NOTE: at this point the project depends on GeoTrellis Contrib 3.14.0-SNAPSHOT, so it requires a local publish of GeoTrellis Contrib.
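Assuming a standard sbt layout in a local checkout of GeoTrellis Contrib (the repository URL below is the usual location; adjust if yours differs), the local publish is just:

```bash
git clone https://github.com/geotrellis/geotrellis-contrib.git
cd geotrellis-contrib
sbt publishLocal
```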
When working on a cluster, take into account that GDAL requires a different resource allocation strategy: it is not possible to use the `maximizeResourceAllocation` flag together with the JNI bindings. As a result of this work it also turned out that `maximizeResourceAllocation` is, in general, not the best option for GeoTrellis ingests.
Pay attention to proper GDAL configuration:
```conf
gdal.options {
  GDAL_DISABLE_READDIR_ON_OPEN = "TRUE"     # we don't usually want to read the entire directory with TIFF metadata
  CPL_VSIL_CURL_ALLOWED_EXTENSIONS = ".tif" # filter files read by extension to speed up reads
  GDAL_MAX_DATASET_POOL_SIZE = "256"        # number of allocated GDAL datasets
  GDAL_CACHEMAX = "1000"                    # number in megabytes to limit GDAL's appetite
  # CPL_DEBUG = "ON"                        # to enable GDAL logging on all nodes
}
```
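For reference, the same options can also be set programmatically. Below is a minimal sketch using the official GDAL Java bindings (assuming the `org.gdal` artifact is on the classpath); it mirrors the HOCON block above and is only an illustration, not this project's configuration mechanism:

```scala
// Sketch: setting the same GDAL options through the GDAL Java bindings.
import org.gdal.gdal.gdal

object GdalOptions {
  def configure(): Unit = {
    gdal.AllRegister() // register GDAL drivers before any reads
    gdal.SetConfigOption("GDAL_DISABLE_READDIR_ON_OPEN", "TRUE")
    gdal.SetConfigOption("CPL_VSIL_CURL_ALLOWED_EXTENSIONS", ".tif")
    gdal.SetConfigOption("GDAL_MAX_DATASET_POOL_SIZE", "256")
    gdal.SetConfigOption("GDAL_CACHEMAX", "1000")
    // gdal.SetConfigOption("CPL_DEBUG", "ON") // verbose GDAL logging
  }
}
```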
For 50 `i3.xlarge` nodes it turned out that `GDAL_CACHEMAX = 1000` and 200 single-core executors is a good option; for 25 `i3.xlarge` nodes, `GDAL_CACHEMAX = 500` and 70 single-core executors, and so on.
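As an illustration of what such a layout looks like on the Spark side, a `SparkConf` matching the 50-node setup might look like the sketch below. These are standard Spark settings, not something this project pins; tune the numbers to your cluster:

```scala
// Sketch: Spark settings matching the "200 single-core executors" layout above.
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("ingest-ned")
  .set("spark.executor.instances", "200")          // 200 executors on 50 i3.xlarge nodes
  .set("spark.executor.cores", "1")                // single-core executors
  .set("spark.executor.memory", "1500m")           // RAM per executor (see the benchmarks below)
  .set("spark.dynamicAllocation.enabled", "false") // keep the executor count fixed
```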
- Test dataset: `s3://azavea-datahub/raw/ned-13arcsec-geotiff`
- Test dataset size: 1115 objects, 210.7 GB
25 `i3.xlarge` nodes:

- Legacy GeoTrellis Ingest: 1 core per executor, 1500M RAM per executor
- GeoTiff RasterSources Ingest: 1 core per executor, 1500M RAM per executor
- GDAL RasterSources Ingest: 70 executors, 1 core per executor, 1500M RAM per executor, `GDAL_CACHEMAX = 500`
50 `i3.xlarge` nodes:

- Legacy GeoTrellis Ingest: 1 core per executor, 1500M RAM per executor
- GeoTiff RasterSources Ingest: 1 core per executor, 1500M RAM per executor
- GDAL RasterSources Ingest: 200 executors, 1 core per executor, 1500M RAM per executor, `GDAL_CACHEMAX = 1000`
Max resource allocation:

- Legacy GeoTrellis Ingest: 200 executors, 1 core per executor, 4200M RAM per executor. With less RAM the job fails; maximizing resource usage kills the job as well.
- GeoTiff RasterSources Ingest: 200 executors, 1 core per executor, 4200M RAM per executor. With less RAM the job fails; maximizing resource usage kills the job as well.
- GDAL RasterSources Ingest: 200 executors, 1 core per executor, 1500M RAM per executor, `GDAL_CACHEMAX = 1000`
(Old version, deprecated; it was written under a cluster misconfiguration, see the next section.) The new API completely replaces the old one. The two ingests differ a bit: the GDAL ingest requires more involved settings tuning; however, the new API is not slower and is sometimes even faster.

`GDALRasterSource`s are much harder to tune and give no significant performance improvement. However, this is probably because of the old GDAL 2.3.x version used on the EMR cluster, which does not take CGroups into account. The GDAL tests will be relaunched once we have GDAL 2.4 RPMs.
In the course of this benchmark we figured out that the `maximizeResourceAllocation` flag may not behave the way everybody expects. The main danger is that it sets `spark.default.parallelism` to 2x the number of CPU cores available to YARN containers. That is usually a pretty small number, and it forces Spark to use `spark.default.parallelism` in all `reduce` operations and to reshuffle the data into that particular number of partitions. By default Spark tries to preserve the partitioning scheme, but with this option enabled it will force a shuffle in every operation that can potentially cause one, unless a partitioner was explicitly passed in.
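To make the hazard concrete, here is a small self-contained sketch (hypothetical data, not this project's ingest code) showing how `spark.default.parallelism` leaks into a `reduceByKey` once an upstream `map` has dropped the partitioner, and how passing a partitioner explicitly keeps control of the partition count:

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

object DefaultParallelismDemo extends App {
  val sc = new SparkContext(
    new SparkConf()
      .setAppName("default-parallelism-demo")
      .setMaster("local[4]")
      // simulate what maximizeResourceAllocation sets on EMR:
      .set("spark.default.parallelism", "160")
  )

  // 1000 upstream partitions; the map drops any partitioner information
  val pairs = sc.parallelize(1 to 1000000, numSlices = 1000).map(i => (i % 1024, 1L))

  // no parent partitioner, so Spark falls back to spark.default.parallelism
  println(pairs.reduceByKey(_ + _).getNumPartitions) // 160

  // an explicit partitioner preserves the intended partition count
  println(pairs.reduceByKey(new HashPartitioner(1000), _ + _).getNumPartitions) // 1000

  sc.stop()
}
```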
`./sbt ingest-ned` on a cluster without the `maximizeResourceAllocation` flag (20 `i3.xlarge` nodes):

`./sbt ingest-ned` with the `maximizeResourceAllocation` flag:
Notice that the first picture shows the partitioning scheme being preserved. In the second picture, exactly the same application behaves differently after the `CutTiles` step: the data is repartitioned into 160 partitions (in this case `spark.default.parallelism` was set to 160).