Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to dbt-common + dbt-adapters #342

Merged
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
1e25859
Migrate to dbt-common + dbt-adapters
jtcohen6 Feb 19, 2024
86eface
Try different install reqs
jtcohen6 Feb 19, 2024
2d36968
Fix unit tests
jtcohen6 Feb 19, 2024
3ad787e
Bonus: functional tests for dbt unit testing
jtcohen6 Feb 19, 2024
d4d5939
bump dbt-common and dbt-adapters to 1.0.0b1
MichelleArk Feb 29, 2024
10034b3
Merge branch 'master' into jerco/migrate-dbt-common-adapters-interfaces
MichelleArk Feb 29, 2024
5712711
implement DuckDbRelation.create_from, fix TestExternalSources::test_e…
MichelleArk Feb 29, 2024
af5f989
Merge branch 'master' into jerco/migrate-dbt-common-adapters-interfaces
MichelleArk Mar 15, 2024
9c8113a
use RelationConfig attributes in create_from_source
MichelleArk Mar 15, 2024
80b709d
Merge branch 'master' into jerco/migrate-dbt-common-adapters-interfaces
MichelleArk Mar 19, 2024
b8fdca0
formatting + initial mypy fixes
MichelleArk Mar 19, 2024
6527752
Merge branch 'master' into jerco/migrate-dbt-common-adapters-interfaces
jtcohen6 Apr 9, 2024
9a0ba02
Revert dev-requirements
jtcohen6 Apr 10, 2024
6dcc39c
Readd dbt-tests-adapter
jtcohen6 Apr 10, 2024
5f594c4
Readd dbt-core to setup.py for install back-compat
jtcohen6 Apr 16, 2024
8511b94
Merge branch 'master' into jerco/migrate-dbt-common-adapters-interfaces
jtcohen6 Apr 16, 2024
0750e0c
Merge remote-tracking branch 'origin/master' into jerco/migrate-dbt-c…
jtcohen6 Apr 16, 2024
a85cef9
Skip BV for TestUnitTestingTypesDuckDB
jtcohen6 Apr 19, 2024
298f27c
Merge branch 'master' into jerco/migrate-dbt-common-adapters-interfaces
jwills May 3, 2024
08d5530
Merge branch 'master' into jerco/migrate-dbt-common-adapters-interfaces
jwills May 8, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Next Next commit
Migrate to dbt-common + dbt-adapters
  • Loading branch information
jtcohen6 committed Feb 19, 2024
commit 1e258597525a6b74338c65bd937174ab996c25fa
21 changes: 12 additions & 9 deletions dbt/adapters/duckdb/connections.py
Original file line number Diff line number Diff line change
@@ -1,29 +1,32 @@
import atexit
import threading
from contextlib import contextmanager
from multiprocessing.context import SpawnContext
from typing import Optional
from typing import Tuple

import agate

import dbt.exceptions
from . import environments
from dbt.adapters.contracts.connection import AdapterRequiredConfig
from dbt.adapters.contracts.connection import AdapterResponse
from dbt.adapters.contracts.connection import Connection
from dbt.adapters.contracts.connection import ConnectionState
from dbt.adapters.events.logging import AdapterLogger
from dbt.adapters.sql import SQLConnectionManager
from dbt.contracts.connection import AdapterRequiredConfig
from dbt.contracts.connection import AdapterResponse
from dbt.contracts.connection import Connection
from dbt.contracts.connection import ConnectionState
from dbt.logger import GLOBAL_LOGGER as logger

logger = AdapterLogger("DuckDB")


class DuckDBConnectionManager(SQLConnectionManager):
TYPE = "duckdb"
_LOCK = threading.RLock()
_ENV = None

def __init__(self, profile: AdapterRequiredConfig):
super().__init__(profile)
self.disable_transactions = profile.credentials.disable_transactions # type: ignore
def __init__(self, config: AdapterRequiredConfig, mp_context: SpawnContext) -> None:
super().__init__(config, mp_context)
self.disable_transactions = config.credentials.disable_transactions # type: ignore

@classmethod
def env(cls) -> environments.Environment:
Expand All @@ -50,7 +53,7 @@ def open(cls, connection: Connection) -> Connection:
logger.debug("Got an error when attempting to connect to DuckDB: '{}'".format(e))
connection.handle = None
connection.state = ConnectionState.FAIL
raise dbt.exceptions.FailedToConnectError(str(e))
raise dbt.adapters.exceptions.FailedToConnectError(str(e))

return connection

Expand Down
11 changes: 6 additions & 5 deletions dbt/adapters/duckdb/credentials.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,9 +10,10 @@
from typing import Tuple
from urllib.parse import urlparse

import dbt.exceptions
from dbt.adapters.base import Credentials
from dbt.dataclass_schema import dbtClassMixin
from dbt_common.dataclass_schema import dbtClassMixin
from dbt_common.exceptions import DbtRuntimeError

from dbt.adapters.contracts.connection import Credentials


@dataclass
Expand Down Expand Up @@ -178,12 +179,12 @@ def __pre_deserialize__(cls, data: Dict[Any, Any]) -> Dict[Any, Any]:
data["database"] = path_db
elif path_db and data["database"] != path_db:
if not data.get("remote"):
raise dbt.exceptions.DbtRuntimeError(
raise DbtRuntimeError(
"Inconsistency detected between 'path' and 'database' fields in profile; "
f"the 'database' property must be set to '{path_db}' to match the 'path'"
)
elif not path_db:
raise dbt.exceptions.DbtRuntimeError(
raise DbtRuntimeError(
"Unable to determine target database name from 'path' field in profile"
)
return data
Expand Down
4 changes: 2 additions & 2 deletions dbt/adapters/duckdb/environments/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,13 +9,13 @@
from typing import Optional

import duckdb
from dbt_common.exceptions import DbtRuntimeError

from ..credentials import DuckDBCredentials
from ..plugins import BasePlugin
from ..utils import SourceConfig
from ..utils import TargetConfig
from dbt.contracts.connection import AdapterResponse
from dbt.exceptions import DbtRuntimeError
from dbt.adapters.contracts.connection import AdapterResponse


def _ensure_event_loop():
Expand Down
2 changes: 1 addition & 1 deletion dbt/adapters/duckdb/environments/buenavista.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
from . import Environment
from .. import credentials
from .. import utils
from dbt.contracts.connection import AdapterResponse
from dbt.adapters.contracts.connection import AdapterResponse


class BVEnvironment(Environment):
Expand Down
5 changes: 3 additions & 2 deletions dbt/adapters/duckdb/environments/local.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,11 @@
import threading

from dbt_common.exceptions import DbtRuntimeError

from . import Environment
from .. import credentials
from .. import utils
from dbt.contracts.connection import AdapterResponse
from dbt.exceptions import DbtRuntimeError
from dbt.adapters.contracts.connection import AdapterResponse


class DuckDBCursorWrapper:
Expand Down
14 changes: 7 additions & 7 deletions dbt/adapters/duckdb/impl.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,25 +5,25 @@
from typing import Sequence

import agate
from dbt_common.contracts.constraints import ColumnLevelConstraint
from dbt_common.contracts.constraints import ConstraintType
from dbt_common.exceptions import DbtInternalError
from dbt_common.exceptions import DbtRuntimeError

from dbt.adapters.base import BaseRelation
from dbt.adapters.base.column import Column as BaseColumn
from dbt.adapters.base.impl import ConstraintSupport
from dbt.adapters.base.meta import available
from dbt.adapters.contracts.connection import AdapterResponse
from dbt.adapters.contracts.relation import Path
from dbt.adapters.contracts.relation import RelationType
from dbt.adapters.duckdb.column import DuckDBColumn
from dbt.adapters.duckdb.connections import DuckDBConnectionManager
from dbt.adapters.duckdb.relation import DuckDBRelation
from dbt.adapters.duckdb.utils import TargetConfig
from dbt.adapters.duckdb.utils import TargetLocation
from dbt.adapters.sql import SQLAdapter
from dbt.context.providers import RuntimeConfigObject
from dbt.contracts.connection import AdapterResponse
from dbt.contracts.graph.nodes import ColumnLevelConstraint
from dbt.contracts.graph.nodes import ConstraintType
from dbt.contracts.relation import Path
from dbt.contracts.relation import RelationType
from dbt.exceptions import DbtInternalError
from dbt.exceptions import DbtRuntimeError

TEMP_SCHEMA_NAME = "temp_schema_name"
DEFAULT_TEMP_SCHEMA_NAME = "dbt_temp"
Expand Down
2 changes: 1 addition & 1 deletion dbt/adapters/duckdb/plugins/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,12 +4,12 @@
from typing import Dict
from typing import Optional

from dbt_common.dataclass_schema import dbtClassMixin
from duckdb import DuckDBPyConnection

from ..credentials import DuckDBCredentials
from ..utils import SourceConfig
from ..utils import TargetConfig
from dbt.dataclass_schema import dbtClassMixin


class PluginConfig(dbtClassMixin):
Expand Down
9 changes: 5 additions & 4 deletions dbt/adapters/duckdb/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,9 @@

from dbt.adapters.base.column import Column
from dbt.adapters.base.relation import BaseRelation
from dbt.context.providers import RuntimeConfigObject
from dbt.contracts.graph.nodes import SourceDefinition
# TODO
# from dbt.context.providers import RuntimeConfigObject
# from dbt_common.contracts.graph.nodes import SourceDefinition


@dataclass
Expand Down Expand Up @@ -47,7 +48,7 @@ def as_dict(self) -> Dict[str, Any]:
return base

@classmethod
def create_from_source(cls, source: SourceDefinition) -> "SourceConfig":
def create_from_source(cls, source: Any) -> "SourceConfig":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Losing the SourceDefinition typing makes me a bit more nervous since we rely on its fields pretty extensively and the external_location features are a pretty popular part of dbt-duckdb on top of simple data lakes or just CSV/Parquet files on a NAS

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heard - let me bug @MichelleArk & @colin-rogers-dbt about this and get back to you

Copy link
Contributor

@MichelleArk MichelleArk Mar 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As part of the decoupling work, we removed the BaseRelation.create_from_source and BaseRelation.create_from_node methods in favor of a consolidated BaseRelation.create_from method that accepts quoting: HasQuoting and relation_config: RelationConfig (dbt-labs/dbt-core#9210). I've updated this branch to reflect that typing + implemented DuckDBRelation.create_from here: 5712711.

We should also extend the RelationConfig to also provide meta and tags, as well as resource_type so this kind of conditional logic can still be implemented reliably for sources in dbt-adapters: https://github.com/dbt-labs/dbt-adapters/pull/120/files.

Still need to figure out how we want to handle source_meta, since its not a common property across sources + other node types. I see at a few options:

  1. Do nothing in dbt-adapters/core, and update the duckdb create_from_source method to use hasattr or something similarly unsatisfying
  2. Implement a merged_meta attribute on sources + nodes in dbt-core and extend RelationConfig to expect merged_meta, which the duckdb adapter could use instead of accessing source_meta + merging. This doesn't feel like the slickest interface design and would need changes to most of our node classes in core but it'd get the job done.
  3. Potentially -- In dbt-core, update the meta field on SourceDefinition to include fields from source_meta by default, meaning the duckdb adapter wouldn't need to do the update business at all. It might be tricky to do this in a backward-compatible way in dbt-core though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MichelleArk thank you for this, I really appreciate it. Will try to noodle on a good strategy here, but using the hasattr stuff as an ugly-but-necessary fallback is okay with me if we don't find something better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead with option 3 above in dbt-core: dbt-labs/dbt-core#9766, since it was actually pretty straightforward to do in a backward-compatible way (meta merges table.meta and source.meta, with table.meta taking precedence), and updated the branch to remove the source_meta reference in favor of just meta which now includes both table + source meta attributes.

I also updated the reference to config._extra to config.extra, as those should be identical and extra is already part of the RelationConfig protocol: https://github.com/dbt-labs/dbt-adapters/blob/35bd3629c390cf87a0e52d999679cc5e33f36c8f/dbt/adapters/contracts/relation.py#L41.

Tests were passing for me locally against dbt-core@main. @jwills -- could we kick off CI again?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @MichelleArk , thank you!

meta = source.source_meta.copy()
meta.update(source.meta)
# Use the config properties as well if they are present
Expand Down Expand Up @@ -75,7 +76,7 @@ def as_dict(self) -> Dict[str, Any]:
class TargetConfig:
relation: BaseRelation
column_list: Sequence[Column]
config: RuntimeConfigObject
config: Any # TODO
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the safest since our usage of this config object is pretty much limited to get methods

location: Optional[TargetLocation] = None

def as_dict(self) -> Dict[str, Any]:
Expand Down
8 changes: 3 additions & 5 deletions dev-requirements.txt
Original file line number Diff line number Diff line change
@@ -1,8 +1,6 @@
# install latest changes in dbt-core
# git+https://github.com/dbt-labs/dbt-core.git#egg=dbt-core&subdirectory=core
# git+https://github.com/dbt-labs/dbt-core.git#egg=dbt-tests-adapter&subdirectory=tests/adapter

dbt-tests-adapter==1.7.7
# install latest changes in dbt-tests-adapter
git+https://github.com/dbt-labs/dbt-core.git#egg=dbt-core&subdirectory=core
git+https://github.com/dbt-labs/dbt-adapters.git#subdirectory=dbt-tests-adapter

boto3
mypy-boto3-glue
Expand Down
3 changes: 2 additions & 1 deletion setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,8 @@ def _dbt_duckdb_version():
packages=find_namespace_packages(include=["dbt", "dbt.*"]),
include_package_data=True,
install_requires=[
"dbt-core~=1.7.0",
"dbt-common<1.0",
"dbt-adapters<=0.2.0",
"duckdb>=0.7.0",
],
extras_require={"glue": ["boto3", "mypy-boto3-glue"], "md": ["duckdb>=0.7.0,<=0.9.2"]},
Expand Down
Loading