Hive partitioning: Fix preprocessing of CreateDirectories #9535

Merged
merged 4 commits into from
Nov 1, 2023

Conversation

carlopi
Contributor

@carlopi carlopi commented Nov 1, 2023

Fixes https://github.com/duckdblabs/duckdb-internal/issues/588 improving on #9473.

The idea is to iterate over all global partitions instead of only the local ones.

Thanks @l1t1 for providing the test case.

@carlopi

This comment was marked as outdated.

@carlopi carlopi requested review from samansmink and Mytherin and removed request for samansmink November 1, 2023 09:13
@carlopi
Contributor Author

carlopi commented Nov 1, 2023

Actually, not yet fixed in a slightly more involved case.

@carlopi carlopi marked this pull request as draft November 1, 2023 09:22
@carlopi carlopi marked this pull request as ready for review November 1, 2023 12:50
The invariant would be that CreateDirectories is only called while holding the global lock,
and that the same global lock is held while iterating over which partitions need to be created.
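
The invariant described above can be illustrated with a small sketch. This is not the actual DuckDB code: `GlobalPartitionState`, `SyncPartitions`, and the `created_dirs` vector are hypothetical names introduced here for illustration, and the `CreateDirectories` stand-in only records a name instead of touching the file system. It assumes each worker thread reports the partitions it has seen locally, and that directories are created by walking the global partition list under a single lock.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the state shared by all writer threads.
struct GlobalPartitionState {
    std::mutex lock;
    std::vector<std::string> partitions;   // every partition seen by any thread
    std::size_t created_count = 0;         // partitions[0, created_count) already have directories
    std::vector<std::string> created_dirs; // records created directories (no disk access in this sketch)
};

// Illustrative stand-in: must only be called while holding state.lock.
void CreateDirectories(GlobalPartitionState &state, const std::string &partition) {
    assert(!partition.empty());
    state.created_dirs.push_back(partition);
}

// Called by a worker thread with the partitions it has seen locally.
void SyncPartitions(GlobalPartitionState &state, const std::vector<std::string> &local_partitions) {
    // One lock covers both the registration and the iteration below.
    std::lock_guard<std::mutex> guard(state.lock);

    // Register any locally seen partition that is not yet known globally.
    for (const auto &partition : local_partitions) {
        auto it = std::find(state.partitions.begin(), state.partitions.end(), partition);
        if (it == state.partitions.end()) {
            state.partitions.push_back(partition);
        }
    }

    // Iterate over the global partitions, not the local ones, so a directory
    // is created exactly once regardless of which thread registered it.
    for (; state.created_count < state.partitions.size(); state.created_count++) {
        CreateDirectories(state, state.partitions[state.created_count]);
    }
}
```

In this sketch, two threads that both see the same partition cannot create its directory twice, because the check and the creation happen under the same lock.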
@carlopi carlopi marked this pull request as draft November 1, 2023 12:55
@carlopi carlopi marked this pull request as ready for review November 1, 2023 12:56
@Mytherin Mytherin merged commit e807b41 into duckdb:main Nov 1, 2023
45 checks passed
@Mytherin
Collaborator

Mytherin commented Nov 1, 2023

Thanks!

@carlopi carlopi deleted the fixhivefolders branch November 1, 2023 19:08
@l1t1

This comment was marked as abuse.

@carlopi
Contributor Author

carlopi commented Nov 2, 2023

@l1t1: Thanks for double-checking this, but I think the link you mentioned is outdated. Could you try either https://github.com/duckdb/duckdb/actions/runs/6719943007?pr=9535 (note the link you cited ended with 6717806759) or a nightly build like https://github.com/duckdb/duckdb/actions/runs/6726710145 ?

Thanks a lot

@l1t1

This comment was marked as abuse.

@carlopi
Contributor Author

carlopi commented Nov 2, 2023

Thanks for checking!
Note that the test https://github.com/duckdb/duckdb/blob/main/test/sql/copy/parquet/parquet_hive2.test is the one you provided, so it will be checked going forward.

@bucweat
Contributor

bucweat commented Nov 2, 2023

Ran my test case and results look as expected. Thanks :-)

┌─────────────────┬────────────┐
│ library_version │ source_id  │
│     varchar     │  varchar   │
├─────────────────┼────────────┤
│ v0.9.2-dev242   │ e807b416e8 │
└─────────────────┴────────────┘

4 participants