-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate to dbt-common + dbt-adapters #342
Migrate to dbt-common + dbt-adapters #342
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what a week it has been
@@ -75,7 +76,7 @@ def as_dict(self) -> Dict[str, Any]: | |||
class TargetConfig: | |||
relation: BaseRelation | |||
column_list: Sequence[Column] | |||
config: RuntimeConfigObject | |||
config: Any # TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably the safest since our usage of this config object is pretty much limited to get
methods
dbt/adapters/duckdb/utils.py
Outdated
@@ -47,7 +48,7 @@ def as_dict(self) -> Dict[str, Any]: | |||
return base | |||
|
|||
@classmethod | |||
def create_from_source(cls, source: SourceDefinition) -> "SourceConfig": | |||
def create_from_source(cls, source: Any) -> "SourceConfig": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Losing the SourceDefinition
typing makes me a bit more nervous since we rely on its fields pretty extensively and the external_location
features are a pretty popular part of dbt-duckdb on top of simple data lakes or just CSV/Parquet files on a NAS
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heard - let me bug @MichelleArk & @colin-rogers-dbt about this and get back to you
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As part of the decoupling work, we removed the BaseRelation.create_from_source
and BaseRelation.create_from_node
methods in favor of a consolidated BaseRelation.create_from
method that accepts quoting: HasQuoting
and relation_config: RelationConfig
(dbt-labs/dbt-core#9210). I've updated this branch to reflect that typing + implemented DuckDBRelation.create_from
here: 5712711.
We should also extend the RelationConfig
to also provide meta
and tags
, as well as resource_type
so this kind of conditional logic can still be implemented reliably for sources in dbt-adapters: https://github.com/dbt-labs/dbt-adapters/pull/120/files.
Still need to figure out how we want to handle source_meta
, since its not a common property across sources + other node types. I see at a few options:
- Do nothing in dbt-adapters/core, and update the duckdb
create_from_source
method to usehasattr
or something similarly unsatisfying - Implement a
merged_meta
attribute on sources + nodes in dbt-core and extendRelationConfig
to expectmerged_meta
, which the duckdb adapter could use instead of accessingsource_meta
+ merging. This doesn't feel like the slickest interface design and would need changes to most of our node classes in core but it'd get the job done. - Potentially -- In dbt-core, update the
meta
field onSourceDefinition
to include fields fromsource_meta
by default, meaning the duckdb adapter wouldn't need to do theupdate
business at all. It might be tricky to do this in a backward-compatible way in dbt-core though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MichelleArk thank you for this, I really appreciate it. Will try to noodle on a good strategy here, but using the hasattr
stuff as an ugly-but-necessary fallback is okay with me if we don't find something better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead with option 3 above in dbt-core: dbt-labs/dbt-core#9766, since it was actually pretty straightforward to do in a backward-compatible way (meta merges table.meta and source.meta, with table.meta taking precedence), and updated the branch to remove the source_meta
reference in favor of just meta
which now includes both table + source meta attributes.
I also updated the reference to config._extra
to config.extra
, as those should be identical and extra
is already part of the RelationConfig protocol: https://github.com/dbt-labs/dbt-adapters/blob/35bd3629c390cf87a0e52d999679cc5e33f36c8f/dbt/adapters/contracts/relation.py#L41.
Tests were passing for me locally against dbt-core@main. @jwills -- could we kick off CI again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @MichelleArk , thank you!
@MichelleArk we don't need to worry about the MD one (it requires my secret token); the code quality stuff can be fixed by running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jwills I think this is just about ready!
Could you help me look into the Buena Vista failures? I see what appear to be connection errors:
no results to fetch
server closed the connection unexpectedly
# add dbt-core to ensure backwards compatibility of installation, this is not a functional dependency | ||
"dbt-core>=1.8.0b1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though dbt-core
is no longer a functional dependency of adapters, we're going to keep including dbt-core
in setup.py
to prevent surprising/breaking changes to the behavior of pip install dbt-duckdb
on the day of the v1.8 final release.
dbt-duck
shouldn't import anything from dbt-core
directly, only from the interfaces defined in dbt-core
+ dbt-adapters
.
@jtcohen6 thank you so much! Will take a look at the BV stuff which I think is broken on main as well; when does all of this need to be ready to go out the door? |
@jwills Anytime in the next few weeks is good! Folks are beta-testing v1.8 now, we're planning to put out the release candidate (dbt-core==1.8.0rc1) on May 2. |
just punting on this for now #379 |
…ommon-adapters-interfaces
from dbt.tests.adapter.unit_testing.test_invalid_input import BaseUnitTestInvalidInput | ||
|
||
|
||
class TestUnitTestingTypesDuckDB(BaseUnitTestingTypes): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jwills I pulled these in (selfishly), since it seemed like unit testing might "just work" on dbt-duckdb
with the v1.8 interface updates.
It's all well & good on the Real Duck, but it looks like the test is failing on BuenaVista for the last data type in the list (ARRAY[1,2,3]
):
File "/home/runner/work/dbt-duckdb/dbt-duckdb/.tox/buenavista/lib/python3.9/site-packages/buenavista/postgres.py", line 104, in <lambda>
BVType.INTEGERARRAY: (1007, lambda v: "{" + ",".join(v) + "}"),
TypeError: sequence item 0: expected str instance, int found
sequence item 0: expected str instance, int found
---------------------- CSV report: buenavista_results.csv ----------------------
=========================== short test summary info ============================
FAILED tests/functional/adapter/test_unit_testing.py::TestUnitTestingTypesDuckDB::test_unit_test_data_type - AssertionError: unit test failed when testing model with ARRAY[1,2,3]
How would you feel about me adding @pytest.mark.skip_profile("buenavista")
to this test? I know that adds a bit of tech debt / mystery for you later on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that’s just because I don’t properly support array types in the BV Postgres serialization, so it’s perfectly understandable and okay to mark it as skipped
@jwills ready to ship when you see fit! |
resolves #341
Starting a draft PR for discussion!
This mostly just works by updating
import
statements. Caveats:dbt.adapters.duckdb.utils
:SourceConfig
andTargetConfig
use type definitions that are not available from the common/adapter interfaces. We could either stub those types withinduckdb.utils
, or discuss if they need to be pulled into the shared interfaces.