Avoid updating partitions table when unnecessary #114
@@ -137,15 +148,18 @@ def read_json(file: Union[Path, str, Iterator[Any]] = "stdin") -> Iterable:
yield orjson.loads(line)


@dataclass
Remove the dataclass decorator; this class is more appropriate as a normal Python object. IMO dataclasses should carry little behavior and be kept to data containers with public members (sort of like a struct). Because dataclasses reject mutable default values, the private attribute had to be declared optional, which is what prompted the change from a dataclass to a standard object.
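To illustrate the trade-off described in this comment, here is a minimal sketch. All class and attribute names (`BadCache`, `OptionalCache`, `PlainCache`, `_seen`) are hypothetical and are not taken from the project; the sketch only shows the general Python behavior that a dataclass field cannot take a mutable default such as `{}`.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical names throughout; these are not the project's classes.
mutable_default_error = None
try:
    @dataclass
    class BadCache:
        _seen: Dict[str, int] = {}  # mutable default: rejected at class creation
except ValueError as err:
    mutable_default_error = str(err)


@dataclass
class OptionalCache:
    # The workaround the comment objects to: private state becomes Optional.
    _seen: Optional[Dict[str, int]] = None

    def record(self, key: str) -> None:
        if self._seen is None:
            self._seen = {}
        self._seen[key] = self._seen.get(key, 0) + 1


class PlainCache:
    """Standard class: private state is created per instance in __init__."""

    def __init__(self) -> None:
        self._seen: Dict[str, int] = {}

    def record(self, key: str) -> None:
        self._seen[key] = self._seen.get(key, 0) + 1
```

A `field(default_factory=dict)` declaration is another way to keep the dataclass, but it still exposes the private attribute as a constructor argument unless `init=False` is also set, so a plain class can be the simpler choice for stateful objects.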
787fc09 to ebd4b6d
geom = Geometry.from_geojson(geojson)
if geom is None:
    raise Exception(f"Invalid geometry encountered: {geojson}")
geometry = str(geom.wkb)
Check for None to appease type linter
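As a sketch of why the check satisfies a type checker: when a constructor is annotated as returning `Optional[...]`, accessing an attribute on the result is flagged until the `None` case is ruled out. `FakeGeometry` below is a hypothetical stand-in, not the real `Geometry` class, and its parsing rule is an assumption made only so the example runs on its own.

```python
from typing import Any, Dict, Optional


class FakeGeometry:
    """Hypothetical stand-in for a geometry type; not the real Geometry class."""

    def __init__(self, wkb: bytes) -> None:
        self.wkb = wkb

    @classmethod
    def from_geojson(cls, geojson: Dict[str, Any]) -> Optional["FakeGeometry"]:
        # Assumed behavior: return None when the input cannot be parsed.
        if "type" not in geojson or "coordinates" not in geojson:
            return None
        return cls(repr(geojson).encode())


def geometry_to_str(geojson: Dict[str, Any]) -> str:
    geom = FakeGeometry.from_geojson(geojson)
    if geom is None:
        # Without this branch, a type checker reports `geom.wkb` on an Optional.
        raise ValueError(f"Invalid geometry encountered: {geojson}")
    # After the check, `geom` is narrowed from Optional[FakeGeometry] to FakeGeometry.
    return str(geom.wkb)
```

The sketch raises `ValueError` rather than a bare `Exception` so callers can catch the failure more precisely; that is a stylistic choice in the example, not something the diff does.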
This commit refactors the loader code to skip partition updates that are not needed. Such updates can have a large performance impact on partitions containing many items.
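One way such a skip could work, shown purely as an illustration: keep the datetime range each partition already covers and trigger an update only when incoming items fall outside it. This is an assumption about the general idea, not the loader's actual logic; `Bounds` and `partitions_needing_update` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Bounds:
    """Hypothetical record of the datetime range a partition already covers."""

    start: datetime
    end: datetime

    def covers(self, other: "Bounds") -> bool:
        return self.start <= other.start and other.end <= self.end

    def merge(self, other: "Bounds") -> "Bounds":
        return Bounds(min(self.start, other.start), max(self.end, other.end))


def partitions_needing_update(
    known: Dict[str, Bounds], incoming: Iterable[Tuple[str, Bounds]]
) -> List[str]:
    """Return partitions whose recorded bounds must change; updates `known` in place."""
    changed: List[str] = []
    for name, bounds in incoming:
        current = known.get(name)
        if current is not None and current.covers(bounds):
            continue  # nothing new for this partition: skip the expensive update
        known[name] = bounds if current is None else current.merge(bounds)
        if name not in changed:
            changed.append(name)
    return changed
```

With this shape, a re-ingest of items that already sit inside the recorded ranges produces an empty list, so no partition update is issued at all.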
ebd4b6d to f4f1e82
Tested this on a re-ingest of Sentinel 1 GRD and saw roughly a 10x improvement in ingest time.
👍