
Regular (non-partial) Iterable streaming does not support validation_context/context #1290

Open
3 of 8 tasks
dmastylo opened this issue Jan 2, 2025 · 0 comments
Labels
bug Something isn't working

dmastylo commented Jan 2, 2025

  • This is actually a bug report.
  • I am not getting good LLM Results
  • I have tried asking for help in the community on discord or discussions and have not received a response.
  • I have tried searching the documentation and have not found an answer.

What Model are you using?

  • gpt-3.5-turbo
  • gpt-4-turbo
  • gpt-4
  • gpt-4o

Describe the bug
When using iterable streaming (create_iterable), context / validation_context is not passed through to field or model validators on the Pydantic model.

To Reproduce

from pydantic import Field, ValidationInfo, model_validator

# LLMOutput: project-specific base model (not shown here)
class SegmentAnalysis(LLMOutput):
    chain_of_thought: str = Field(
        description="Let's think step by step to analyze what needs updates in this segment"
    )
    source_text: str = Field(
        description="some_desc"
    )
    updated_text: str = Field(
        description="some_desc"
    )

    @model_validator(mode="after")
    def source_text_exists(self, info: ValidationInfo):
        # info.context is None when no validation context reaches Pydantic
        context = info.context
        if context:
            input_text = context.get("input_text", "")
            if self.source_text not in input_text:
                raise ValueError(
                    f"source_text {self.source_text} not found in input text {input_text}"
                )
        return self


# client: instructor-patched client; text: the input text being analyzed
# (both defined elsewhere)
segments = client.chat.completions.create_iterable(
    model="gpt-4o",
    response_model=SegmentAnalysis,
    stream=True,
    max_retries=3,
    messages=[
        {"role": "system", "content": "some prompt"},
        {"role": "user", "content": "1. 2024, this date is old, update. 2. 2025, this date is new, don't update"},
    ],
    context={"input_text": text},  # never reaches source_text_exists
)

Expected behavior
Neither the Stream Iterable docs nor the Validation docs mention that this is unsupported, so I expected the validation context to be passed through.

I see that instructor does not pass the context into from_streaming_response:

def process_response(...):
    ...
    if (
        inspect.isclass(response_model)
        and issubclass(response_model, (IterableBase, PartialBase))
        and stream
    ):
        # Streaming branch: validation_context is not forwarded here
        model = response_model.from_streaming_response(
            response,
            mode=mode,
        )
        return model

    # Non-streaming branch: validation_context is forwarded
    model = response_model.from_response(
        response,
        validation_context=validation_context,
        strict=strict,
        mode=mode,
    )

I'm not well versed in the inner workings of either instructor or Pydantic, but I do see the following call stack:

instructor.process_response ->
  IterableBase.from_streaming_response -> 
  IterableBase.tasks_from_chunks ->
  cls.task_type.model_validate_json

which suggests to me that Pydantic validation is still happening in the streaming path. Given the screenshot below, there's clearly some technical limitation behind the lack of validation context support, but nothing makes it clear that it is also unsupported for non-partial Iterable streaming.
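
To illustrate why the validator appears to pass silently, here is a minimal sketch of the Pydantic side only. It is not instructor code: `Segment` is a hypothetical one-field stand-in for `SegmentAnalysis`, and `passes_validation` is a helper invented for this example. The only library behavior it relies on is that Pydantic v2's `model_validate_json` accepts a `context` argument and exposes it to validators as `info.context`.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, ValidationInfo, model_validator


class Segment(BaseModel):
    """Hypothetical stand-in for SegmentAnalysis, reduced to one field."""

    source_text: str

    @model_validator(mode="after")
    def source_text_exists(self, info: ValidationInfo) -> "Segment":
        # info.context is None unless the caller passes context=...
        context = info.context
        if context and self.source_text not in context.get("input_text", ""):
            raise ValueError(f"source_text {self.source_text!r} not found in input text")
        return self


def passes_validation(raw_json: str, context: Optional[dict] = None) -> bool:
    """Return True if raw_json validates as a Segment under the given context."""
    try:
        Segment.model_validate_json(raw_json, context=context)
    except ValidationError:
        return False
    return True


raw = '{"source_text": "2023"}'
input_text = "1. 2024, this date is old, update."

# Without context the validator has nothing to check against, so the payload passes.
print(passes_validation(raw))
# With context the same payload is rejected, because "2023" is not in input_text.
print(passes_validation(raw, {"input_text": input_text}))
```

If the streaming path calls `model_validate_json` without `context`, as the call stack above suggests, a validator written like this one would take the first route and never raise.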

Screenshots
image
There is a note in the Stream Partial docs that validator support is limited, but it was not immediately clear that this affects BOTH partial and non-partial streaming.

Next Steps
A. Support validation context for non-partial Iterable streaming
OR
B. Make the docs clear that validation context is not supported for non-partial Iterable streaming, in addition to partial Iterable streaming

Happy to do either (or at least attempt A) once we know the correct course of action.
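
As a rough idea of what option A might involve, the sketch below threads a context through a generator that validates streamed JSON objects one at a time. This is not instructor's actual implementation: `validate_streamed_tasks` and `Segment` are hypothetical names, the input is assumed to already be split into complete JSON objects, and retries are ignored. The only real API used is Pydantic v2's `model_validate_json(..., context=...)`.

```python
from typing import Iterable, Iterator, Optional, Type, TypeVar

from pydantic import BaseModel, ValidationInfo, field_validator

T = TypeVar("T", bound=BaseModel)


class Segment(BaseModel):
    """Hypothetical task model with a context-dependent field validator."""

    source_text: str

    @field_validator("source_text")
    @classmethod
    def source_text_exists(cls, value: str, info: ValidationInfo) -> str:
        context = info.context
        if context and value not in context.get("input_text", ""):
            raise ValueError(f"source_text {value!r} not found in input text")
        return value


def validate_streamed_tasks(
    json_objects: Iterable[str],
    task_type: Type[T],
    context: Optional[dict] = None,
) -> Iterator[T]:
    """Validate each complete JSON object lazily, forwarding the context."""
    for raw in json_objects:
        # The context is passed on every call so validators can read info.context.
        yield task_type.model_validate_json(raw, context=context)


stream = ['{"source_text": "2024"}', '{"source_text": "2025"}']
context = {"input_text": "1. 2024, this date is old. 2. 2025, this date is new."}

for segment in validate_streamed_tasks(stream, Segment, context):
    print(segment.source_text)
```

In instructor the equivalent change would presumably mean forwarding validation_context from process_response into from_streaming_response and on to the model_validate_json call, though there may be constraints in the streaming code that this sketch does not account for.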

@github-actions github-actions bot added the bug Something isn't working label Jan 2, 2025