Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(llmobs): refactor annotation context implementation #10976

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

lievan
Copy link
Contributor

@lievan lievan commented Oct 8, 2024

This PR fixes two issues with annotation contexts

  1. Nested annotations were applied in a random order
  2. Annotations polluted across all trace contexts because they were registered as a global on_span_start hook

I put it all in one PR since it's a single refactor of the underlying implementation for annotation contexts, but let me know if you would prefer this to be separated into two PR's -

Issue 1: Nesting
Currently, annotation contexts register a single annotate function to be called with specific arguments on span start when we the context manager is entered.

If multiple annotation contexts are nested, the order in which the annotations happen are non-deterministic. This is because span start function hooks are stored as a set.

Instead of registering each annotate function as a separate hook, we only register one annotation function on span start that iterates through a list of annotation arguments.

Entering/exiting the annotation context enqueues/dequeues arguments from the list of annotation arguments stored in the LLMObs instance.

This ensures that annotations are applied in the order in which annotation contexts are entered.

Issue 2: Cross trace context pollution
Annotations are applied as a global function hook on span start, meaning they apply to all spans across all trace contexts.

We update this so that we store each annotation invocation with an annotation_context_id representing the current trace context.

  1. If a trace has not started yet and there is no active context, we create and activate a "dummy" context containing annotation_context_id in the baggage
  2. If there is already a currently active context, we either extract the existing annotation_context_id from the baggage or add a new one

When we apply the annotations, we fetch the annotation_context_id from the current trace context and check that it matches the annotation_context_id stored with each annotation.

Checklist

  • PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

Copy link
Contributor

github-actions bot commented Oct 8, 2024

CODEOWNERS have been resolved as:

releasenotes/notes/annotation-context-fixes-9fead4cbc6ebfde6.yaml       @DataDog/apm-python
ddtrace/llmobs/_constants.py                                            @DataDog/ml-observability
ddtrace/llmobs/_llmobs.py                                               @DataDog/ml-observability
ddtrace/llmobs/_utils.py                                                @DataDog/ml-observability
tests/llmobs/test_llmobs_service.py                                     @DataDog/ml-observability

current_tags = span.get_tag(TAGS)
if current_tags:
span_tags.update(json.loads(current_tags))
current_tags_str = span.get_tag(TAGS)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes more sense for the current tag dictionary to be updated with the latest set of tags, not the other way around

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we need a deprecation notice for this? Since it technically changes the behavior of annotate ... and maybe some users relied on tags not being overriden

@pr-commenter
Copy link

pr-commenter bot commented Oct 8, 2024

Benchmarks

Benchmark execution time: 2024-10-09 21:30:14

Comparing candidate commit bc87235 in PR branch evan.li/annotation-context-order with baseline commit 5f25f36 in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 371 metrics, 53 unstable metrics.

@lievan lievan changed the title feat(llmobs): enforce ordering for annotation contexts fix(llmobs): annotation don't pollute other trace contexts Oct 9, 2024
@lievan lievan changed the title fix(llmobs): annotation don't pollute other trace contexts fix(llmobs): stop annotations from polluting to other trace contexts and enforce an ordering for annotations Oct 9, 2024
@lievan lievan changed the title fix(llmobs): stop annotations from polluting to other trace contexts and enforce an ordering for annotations fix(llmobs): refactor annotation context to enforce ordering and fix cross-context pollution of annotations Oct 9, 2024
@lievan lievan changed the title fix(llmobs): refactor annotation context to enforce ordering and fix cross-context pollution of annotations fix(llmobs): refactor annotation context implementation Oct 9, 2024
# default the context id to the annotation id
ctx_id = annotation_id
if current_ctx is None:
current_ctx = Context(is_remote=False)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any problem to creating a "dummy" context like this just to store a baggage item?

@lievan lievan marked this pull request as ready for review October 10, 2024 14:38
@lievan lievan requested a review from a team as a code owner October 10, 2024 14:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant