feat: request contextualisation - core functionality #65

Open
wants to merge 57 commits into
base: main
21506e6
context logic subpackage; type-hint context extraction
Jun 21, 2024
a87e8e2
reworked type hint info extraction; extended functionality to also re…
ds-jakub-cierocki Jun 24, 2024
3ad4ecd
hidden args handling enabled
ds-jakub-cierocki Jun 24, 2024
b0cc0ae
improved type hints parsing and compatibility using package
ds-jakub-cierocki Jun 28, 2024
4ff5f62
dedicated exceptions for context-related operations
ds-jakub-cierocki Jun 28, 2024
c479c50
useful classmethods for context-related operations
ds-jakub-cierocki Jun 28, 2024
e3bb127
make whole context utils module protected; added IQL parsing helper; …
ds-jakub-cierocki Jun 28, 2024
de72c7c
parsing type hints _extract_params_and_context() no longer excludes B…
ds-jakub-cierocki Jun 28, 2024
d3958c0
adjusted the existing code to be aware of contexts (prompts yet untouc…
ds-jakub-cierocki Jun 28, 2024
be338bf
adjusted _type_validators.validate_arg_type() to handle typing.Union[]
ds-jakub-cierocki Jul 2, 2024
78f1535
context._utils._does_arg_allow_context() fix
ds-jakub-cierocki Jul 2, 2024
308e2e1
context record is now based on pydantic.BaseModel rather than datacla…
ds-jakub-cierocki Jul 2, 2024
73741d9
type hint lifting
ds-jakub-cierocki Jul 2, 2024
902f5ff
IQL generating LLM prompt passes BaseCallerContext() as filter argume…
ds-jakub-cierocki Jul 2, 2024
6309070
comments cleanup
ds-jakub-cierocki Jul 2, 2024
d523bf7
type hint fixes
ds-jakub-cierocki Jul 3, 2024
efe212f
Merge branch 'main' (which includes a large refactor by Michal) into …
ds-jakub-cierocki Jul 3, 2024
9ba89e5
post-merge fixes + minor refactor
ds-jakub-cierocki Jul 3, 2024
5fd802f
added missing docstrings; fixed type hints; fixed issues detected by …
ds-jakub-cierocki Jul 4, 2024
09bac55
reworked parse_param_type() function to increase performance, general…
ds-jakub-cierocki Jul 4, 2024
d42a369
fix: removed duplicated line from the prompt template
ds-jakub-cierocki Jul 4, 2024
c0b0522
adjusted existing unit tests to work with new contextualization logic
ds-jakub-cierocki Jul 4, 2024
9b2e131
linter-recommended fixes
ds-jakub-cierocki Jul 4, 2024
2d0ef4b
contextualization mechanism - dedicated unit tests
ds-jakub-cierocki Jul 5, 2024
6466f61
cleaned up overengineered code remaining from the previous iteration…
ds-jakub-cierocki Jul 5, 2024
637f7fa
replaced pydantic.BaseModel by dataclasses.dataclass, pydantic no lon…
ds-jakub-cierocki Jul 8, 2024
f867e25
BaseCallerContext: dataclass w.o. fields -> interface (abstract class…
ds-jakub-cierocki Jul 8, 2024
3423033
LLM now pastes Context() instead of BaseCallerContext() to indicate t…
ds-jakub-cierocki Jul 8, 2024
0d8cd1e
docstring typo fixes; more precise return type hint
ds-jakub-cierocki Jul 9, 2024
c97ba15
renamed Context() -> AskerContext(); added more detailed exa…
ds-jakub-cierocki Jul 9, 2024
1294a9c
type hint parsing changes: SomeCustomContext -> AskerContext; Union[a…
ds-jakub-cierocki Jul 9, 2024
999759b
refactor: collection.results.[ViewExecutionResult, ExecutionResult]."…
ds-jakub-cierocki Jul 12, 2024
2e1005a
param type parsing: correctly handling builtins types with args (e.g.…
ds-jakub-cierocki Jul 12, 2024
820066d
type hint fix: explicitly marked BaseCallerContext.alias as typing.Cla…
ds-jakub-cierocki Jul 12, 2024
25fbfa6
docs + benchmarks adjusted to meet new naming [ExecutionResult, ViewE…
ds-jakub-cierocki Jul 15, 2024
a154577
redesigned context-not-available error to follow the same principles …
ds-jakub-cierocki Jul 15, 2024
623effd
EXPERIMENTAL: reworked context injection such that it is handled immediate…
ds-jakub-cierocki Jul 15, 2024
afacf5b
additional unit tests for the new contextualization mechanism
ds-jakub-cierocki Jul 19, 2024
dd8b339
context benchmark script and data
ds-jakub-cierocki Jul 22, 2024
6bb0816
refactored main prompt (too long lines), missing end-of-line characters
ds-jakub-cierocki Jul 22, 2024
f388f92
better error handling
ds-jakub-cierocki Jul 22, 2024
fbecc51
context benchmark dataset fix
ds-jakub-cierocki Jul 23, 2024
5d4ff64
added polars-based accuracy summary to the benchmark
ds-jakub-cierocki Jul 23, 2024
e7e8826
adjusted prompt to reduce hallucinations: nested filter/context calls …
ds-jakub-cierocki Jul 23, 2024
f8bf64e
merged main (inc. new benchmarks + large refactor) -> jc/issue-54-req…
ds-jakub-cierocki Aug 7, 2024
c1c871b
merge main
micpst Sep 23, 2024
8eefd9b
fix linters
micpst Sep 23, 2024
c28091f
fix tests
micpst Sep 23, 2024
69a8d58
fix tests
micpst Sep 23, 2024
d6c8fc6
fix tests
micpst Sep 23, 2024
d7026d4
rm old benchmarks
micpst Sep 23, 2024
e8271ac
some renames and stuff
micpst Sep 23, 2024
bdcc7b3
fix benchmarks
micpst Sep 23, 2024
71f53be
merge main
micpst Sep 25, 2024
c82e579
rm chroma file
micpst Sep 25, 2024
f5a40cb
add contexts to benchmarks + fix types
micpst Sep 30, 2024
fab9d3f
small refactor
micpst Oct 1, 2024
some renames and stuff
micpst committed Sep 23, 2024
commit e8271ac2afc8690aa968ea104ab8b3b400b5a73e
21 changes: 12 additions & 9 deletions src/dbally/collection/collection.py
@@ -4,15 +4,15 @@
import textwrap
import time
from collections import defaultdict
-from typing import Callable, Dict, Iterable, List, Optional, Type, TypeVar
+from typing import Callable, Dict, List, Optional, Type, TypeVar

import dbally
from dbally.audit.event_handlers.base import EventHandler
from dbally.audit.event_tracker import EventTracker
from dbally.audit.events import FallbackEvent, RequestEnd, RequestStart
from dbally.collection.exceptions import IndexUpdateError, NoViewFoundError
from dbally.collection.results import ExecutionResult, ViewExecutionResult
-from dbally.context.context import BaseCallerContext
+from dbally.context import Context
from dbally.iql_generator.prompt import UnsupportedQueryError
from dbally.llms.base import LLM
from dbally.llms.clients.base import LLMOptions
@@ -228,7 +228,7 @@ async def _ask_view(
event_tracker: EventTracker,
llm_options: Optional[LLMOptions],
dry_run: bool,
-contexts: Iterable[BaseCallerContext],
+contexts: List[Context],
) -> ViewExecutionResult:
"""
Ask the selected view to provide an answer to the question.
@@ -247,11 +247,11 @@ async def _ask_view(
view_result = await selected_view.ask(
query=question,
llm=self._llm,
contexts=contexts,
event_tracker=event_tracker,
n_retries=self.n_retries,
dry_run=dry_run,
llm_options=llm_options,
contexts=contexts,
)
return view_result

@@ -298,9 +298,11 @@ def get_all_event_handlers(self) -> List[EventHandler]:
return self._event_handlers
return list(set(self._event_handlers).union(self._fallback_collection.get_all_event_handlers()))

+# pylint: disable=too-many-arguments
async def _handle_fallback(
self,
question: str,
+contexts: Optional[List[Context]],
dry_run: bool,
return_natural_response: bool,
llm_options: Optional[LLMOptions],
@@ -322,7 +324,6 @@ async def _handle_fallback(

Returns:
The result from the fallback collection.
-
"""
if not self._fallback_collection:
raise caught_exception
@@ -337,6 +338,7 @@ async def _handle_fallback(
async with event_tracker.track_event(fallback_event) as span:
result = await self._fallback_collection.ask(
question=question,
+contexts=contexts,
dry_run=dry_run,
return_natural_response=return_natural_response,
llm_options=llm_options,
@@ -348,10 +350,10 @@ async def _handle_fallback(
async def ask(
self,
question: str,
+contexts: Optional[List[Context]] = None,
dry_run: bool = False,
return_natural_response: bool = False,
llm_options: Optional[LLMOptions] = None,
-contexts: Optional[Iterable[BaseCallerContext]] = None,
event_tracker: Optional[EventTracker] = None,
) -> ExecutionResult:
"""
@@ -366,14 +368,14 @@ async def ask(

Args:
question: question posed using natural language representation e.g\
-"What job offers for Data Scientists do we have?"
+"What job offers for Data Scientists do we have?"
+contexts: list of context objects, each being an instance of
+a subclass of Context. May contain contexts irrelevant for the currently processed query.
dry_run: if True, only generate the query without executing it
return_natural_response: if True (and dry_run is False as natural response requires query results),
the natural response will be included in the answer
llm_options: options to use for the LLM client. If provided, these options will be merged with the default
options provided to the LLM client, prioritizing option values other than NOT_GIVEN
-contexts: An iterable (typically a list) of context objects, each being an instance of
-a subclass of BaseCallerContext. May contain contexts irrelevant for the currently processed query.
event_tracker: Event tracker object for given ask.

Returns:
@@ -433,6 +435,7 @@ async def ask(
if self._fallback_collection:
result = await self._handle_fallback(
question=question,
+contexts=contexts,
dry_run=dry_run,
return_natural_response=return_natural_response,
llm_options=llm_options,
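
The collection.py hunks above thread `contexts` through `ask` and `_handle_fallback` so that a fallback collection receives the same context objects as the primary one. A minimal sketch of that forwarding pattern, assuming a heavily simplified stand-in class: `SketchCollection`, its `can_answer` flag, and the returned dict are hypothetical and are not part of dbally.

```python
import asyncio
from typing import Dict, List, Optional


class Context:
    """Local stand-in for dbally.context.Context (assumption for this sketch)."""


class SketchCollection:
    """Hypothetical, simplified collection; not the real dbally Collection."""

    def __init__(self, name: str, can_answer: bool, fallback: Optional["SketchCollection"] = None) -> None:
        self.name = name
        self.can_answer = can_answer
        self.fallback = fallback

    async def ask(self, question: str, contexts: Optional[List[Context]] = None) -> Dict[str, object]:
        if self.can_answer:
            # Report which collection answered and how many contexts it received.
            return {"collection": self.name, "n_contexts": len(contexts or [])}
        if self.fallback is None:
            raise RuntimeError(f"{self.name} cannot answer and has no fallback")
        # Forward the same contexts to the fallback, as the diff does in _handle_fallback.
        return await self.fallback.ask(question=question, contexts=contexts)


fallback = SketchCollection("fallback", can_answer=True)
primary = SketchCollection("primary", can_answer=False, fallback=fallback)
result = asyncio.run(primary.ask("What job offers do we have?", contexts=[Context()]))
```

Without the forwarding step, the fallback in this sketch would see `contexts=None` and any context-dependent filter would have nothing to resolve against.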
11 changes: 11 additions & 0 deletions src/dbally/context.py
@@ -0,0 +1,11 @@
+from abc import ABC
+from typing import ClassVar
+
+
+class Context(ABC):
+    """
+    Base class for all contexts that are used to pass additional knowledge about the caller environment to the view.
+    """
+
+    type_name: ClassVar[str] = "Context"
+    alias_name: ClassVar[str] = "CONTEXT"
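
The `ask` docstring in this commit notes that the `contexts` list may contain contexts irrelevant to the current query. A minimal sketch of how a caller-defined subclass and a by-type lookup could look, assuming a local copy of the `Context` base class from this diff: `UserContext`, `TenantContext`, and `find_context` are hypothetical names for illustration, not dbally APIs.

```python
from abc import ABC
from dataclasses import dataclass
from typing import ClassVar, List, Optional, Type, TypeVar


class Context(ABC):
    """Local copy mirroring the base class added in src/dbally/context.py."""

    type_name: ClassVar[str] = "Context"
    alias_name: ClassVar[str] = "CONTEXT"


@dataclass
class UserContext(Context):
    """Hypothetical context carrying the caller's name."""

    name: str


@dataclass
class TenantContext(Context):
    """Hypothetical context carrying the caller's tenant id."""

    tenant_id: int


C = TypeVar("C", bound=Context)


def find_context(contexts: Optional[List[Context]], wanted: Type[C]) -> Optional[C]:
    """Return the first context of the wanted type, skipping irrelevant ones."""
    for ctx in contexts or []:
        if isinstance(ctx, wanted):
            return ctx
    return None


contexts = [TenantContext(tenant_id=7), UserContext(name="alice")]
user = find_context(contexts, UserContext)
```

Because `type_name` and `alias_name` are `ClassVar`s, the dataclass decorator does not treat them as instance fields, so subclasses only declare the data they actually carry.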
3 changes: 0 additions & 3 deletions src/dbally/context/__init__.py

This file was deleted.

75 changes: 0 additions & 75 deletions src/dbally/context/_utils.py

This file was deleted.

17 changes: 0 additions & 17 deletions src/dbally/context/context.py

This file was deleted.

23 changes: 0 additions & 23 deletions src/dbally/context/exceptions.py
micpst marked this conversation as resolved.

This file was deleted.

4 changes: 2 additions & 2 deletions src/dbally/iql/_processor.py
@@ -4,7 +4,7 @@
from typing import Any, Generic, List, Optional, TypeVar, Union

from dbally.audit.event_tracker import EventTracker
-from dbally.context.context import BaseCallerContext
+from dbally.context import Context
from dbally.iql import syntax
from dbally.iql._exceptions import (
IQLArgumentParsingError,
@@ -34,7 +34,7 @@ def __init__(
self,
source: str,
allowed_functions: List[ExposedFunction],
-allowed_contexts: Optional[List[BaseCallerContext]] = None,
+allowed_contexts: Optional[List[Context]] = None,
event_tracker: Optional[EventTracker] = None,
) -> None:
self.source = source
4 changes: 2 additions & 2 deletions src/dbally/iql/_query.py
@@ -8,7 +8,7 @@
from ._processor import IQLAggregationProcessor, IQLFiltersProcessor, IQLProcessor, RootT

if TYPE_CHECKING:
-from dbally.context.context import BaseCallerContext
+from dbally.context import Context
from dbally.views.exposed_functions import ExposedFunction


@@ -33,7 +33,7 @@ async def parse(
cls,
source: str,
allowed_functions: List["ExposedFunction"],
-allowed_contexts: Optional[List["BaseCallerContext"]] = None,
+allowed_contexts: Optional[List["Context"]] = None,
event_tracker: Optional[EventTracker] = None,
) -> Self:
"""
8 changes: 4 additions & 4 deletions src/dbally/iql_generator/iql_generator.py
@@ -3,7 +3,7 @@
from typing import Generic, List, Optional, TypeVar, Union

from dbally.audit.event_tracker import EventTracker
-from dbally.context.context import BaseCallerContext
+from dbally.context import Context
from dbally.iql import IQLError, IQLQuery
from dbally.iql._query import IQLAggregationQuery, IQLFiltersQuery
from dbally.iql_generator.prompt import (
@@ -67,7 +67,7 @@ async def __call__(
question: str,
filters: List[ExposedFunction],
aggregations: List[ExposedFunction],
-contexts: List[BaseCallerContext],
+contexts: List[Context],
examples: List[FewShotExample],
llm: LLM,
event_tracker: Optional[EventTracker] = None,
@@ -146,7 +146,7 @@ async def __call__(
*,
question: str,
methods: List[ExposedFunction],
-contexts: List[BaseCallerContext],
+contexts: List[Context],
examples: List[FewShotExample],
llm: LLM,
event_tracker: Optional[EventTracker] = None,
@@ -265,7 +265,7 @@ async def __call__(
*,
question: str,
methods: List[ExposedFunction],
-contexts: List[BaseCallerContext],
+contexts: List[Context],
examples: List[FewShotExample],
llm: LLM,
llm_options: Optional[LLMOptions] = None,
8 changes: 4 additions & 4 deletions src/dbally/iql_generator/prompt.py
@@ -3,7 +3,7 @@
from typing import List, Optional

from dbally.audit.event_tracker import EventTracker
-from dbally.context.context import BaseCallerContext
+from dbally.context import Context
from dbally.exceptions import DbAllyError
from dbally.iql._query import IQLAggregationQuery, IQLFiltersQuery
from dbally.prompt.elements import FewShotExample
@@ -21,7 +21,7 @@ class UnsupportedQueryError(DbAllyError):
async def _iql_filters_parser(
response: str,
allowed_functions: List[ExposedFunction],
-allowed_contexts: List[BaseCallerContext],
+allowed_contexts: List[Context],
event_tracker: Optional[EventTracker] = None,
) -> IQLFiltersQuery:
"""
@@ -53,7 +53,7 @@ async def _iql_filters_parser(
async def _iql_aggregation_parser(
response: str,
allowed_functions: List[ExposedFunction],
-allowed_contexts: List[BaseCallerContext],
+allowed_contexts: List[Context],
event_tracker: Optional[EventTracker] = None,
) -> IQLAggregationQuery:
"""
@@ -127,7 +127,7 @@ def __init__(
*,
question: str,
methods: List[ExposedFunction],
-contexts: List[BaseCallerContext],
+contexts: List[Context],
examples: Optional[List[FewShotExample]] = None,
) -> None:
"""
6 changes: 3 additions & 3 deletions src/dbally/views/base.py
@@ -5,7 +5,7 @@

from dbally.audit.event_tracker import EventTracker
from dbally.collection.results import ViewExecutionResult
-from dbally.context.context import BaseCallerContext
+from dbally.context import Context
from dbally.llms.base import LLM
from dbally.llms.clients.base import LLMOptions
from dbally.prompt.elements import FewShotExample
@@ -25,7 +25,7 @@ async def ask(
self,
query: str,
llm: LLM,
-contexts: Optional[List[BaseCallerContext]] = None,
+contexts: Optional[List[Context]] = None,
event_tracker: Optional[EventTracker] = None,
n_retries: int = 3,
dry_run: bool = False,
@@ -38,7 +38,7 @@ async def ask(
query: The natural language query to execute.
llm: The LLM used to execute the query.
contexts: An iterable (typically a list) of context objects, each being
-an instance of a subclass of BaseCallerContext.
+an instance of a subclass of Context.
event_tracker: The event tracker used to audit the query execution.
n_retries: The number of retries to execute the query in case of errors.
dry_run: If True, the query will not be used to fetch data from the datasource.