-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
File System: Add optional_ptr<FileOpener>
to various calls, and add support for attaching DuckDB files over S3
#11343
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…om DatabaseInstance instead of ClientContext
…This disables HTTPFS caching as that is not thread safe
samansmink
approved these changes
Mar 26, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Good idea on the FILE_FLAGS_PARALLEL_ACCESS
github-actions bot
pushed a commit
to duckdb/duckdb-r
that referenced
this pull request
Mar 30, 2024
Merge pull request duckdb/duckdb#11343 from Mytherin/s3attach
1 task
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Follow-up from #11259 and #11297
This PR expands several
FileSystem
API calls to add anoptional_ptr<FileOpener>
to various methods that were missing them:The
FileOpener
is used in order to pass along information from theClientContext
or sourceDatabase
into the file system. In particular, it is used by the S3 and Azure file systems to pass along authentication information (e.g. secrets). Without aFileOpener
, these function calls do not have authentication information, which makes it impossible to support these functions for e.g. private S3 buckets. This PR resolves that issue.S3 Attach
After the secret manager was added - this actually solves the last roadblock for getting
ATTACH
to work for DuckDB databases over S3. This PR also addresses a few more minor problems that now allow for reading DuckDB databases over S3.FILE_FLAGS_PARALLEL_ACCESS
: This is a new flag added in this PR that signifies we need the file system to support parallel read/write access to the same file handle, as that is what happens when attaching a DuckDB database. For thehttpfs
extension, this flag disables the cache within the file, as that is not thread-safe, and instead directly submits GET requests to S3.DatabaseFileSystem
: we add anOpenerFileSystem
based on theDatabaseInstance
class, which is returned byFileSystem::GetFileSystem(db)
. This allows us to access aDatabaseInstance
from aFileOpener
, even when aClientContext
is not available. This can happen for example when loading the initial database.TryGetDatabase
: As a follow-up to this, we also add a new method to theFileOpener
-TryGetDatabase
. This method can be used to obtain aDatabaseInstance
, and works for both theDatabaseFileOpener
andClientContextFileOpener
. This method is used to obtain the secret manager in the HTTPFS file system, allowing (persistent) S3 secrets to be accessed when attaching to a DuckDB database.