
StreamObject doesn't work with Anthropic #3422

Open
Kitenite opened this issue Oct 30, 2024 · 15 comments

Comments

@Kitenite

Kitenite commented Oct 30, 2024

Description

The streamObject function does not actually stream for Anthropic; it resolves everything at the end instead. streamText works fine for Anthropic unless a tool is required, and it works fine for OpenAI as well. Tested with Haiku and Sonnet.

Related issue: #1980

Code example

import { AnthropicProvider, createAnthropic } from '@ai-sdk/anthropic';
import { createOpenAI, OpenAIProvider } from '@ai-sdk/openai';
import { CoreMessage, DeepPartial, streamObject } from 'ai';

//  Other code...

const model = this.anthropic(CLAUDE_MODELS.HAIKU);

// This streams fine:
// const model = this.openai(OPEN_AI_MODELS.GPT_4_TURBO);

const result = await streamObject({
    model,
    system: 'You are a seasoned React and Tailwind expert.',
    schema: StreamReponseObject,
    messages,
});

// This actually waits the whole time, resolving everything at the end
for await (const partialObject of result.partialObjectStream) {
    console.log(partialObject);
}

Additional context

Using main node.js process in an electron app

@Kitenite
Author

Kitenite commented Oct 31, 2024

Seems like a related issue to this: #3395

And related to this on Anthropic's side: anthropics/anthropic-sdk-typescript#529

Seems like this is a limitation on Anthropic's side. I suppose this can be closed, but I'll wait for confirmation before doing so.

@lgrammel
Collaborator

lgrammel commented Oct 31, 2024

This is a long-standing issue that I've explored several times. We use tool calls and tool call streaming because Anthropic does not support JSON output via options. The Anthropic API "fakes" tool call streaming, i.e. it streams all chunks at once after a significant delay, leading to this effect.

@Kitenite
Author

Thanks @lgrammel, is there a good workaround for this using the AI SDK? I was able to get streamText to adhere to a format and use that instead, but I really don't trust that.

Seems like the best course of action for me here is to use the anthropic SDK directly?

@lgrammel
Collaborator

How would this work directly with the Anthropic SDK?

@Kitenite
Author

Kitenite commented Oct 31, 2024

I just tested... hubris on my side, but it doesn't work, just like you mentioned: they stream the text deltas up to the tool call, and then there's a big delay until the entire call resolves.

Please feel free to close and thanks for the quick reply :)

@lgrammel
Collaborator

Want to leave this open since it comes up every week or so.

@Kitenite
Author

Kitenite commented Oct 31, 2024

FYI for folks who absolutely have to use Anthropic for streaming: this is my hacky solution, which passes the zod schema in a system prompt to streamText and then partially resolves the streamed object.

I'm surprised this works consistently with the latest Claude Sonnet. It will try to wrap the object in a code block, so I just strip that out.

import { createAnthropic } from '@ai-sdk/anthropic';
import { StreamReponseObject } from '@onlook/models/chat'; // Zod object
import { CoreMessage, DeepPartial, LanguageModelV1, streamText } from 'ai';
import { Allow, parse } from 'partial-json';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';

// ...

    public async stream(
        messages: CoreMessage[],
    ): Promise<z.infer<typeof StreamReponseObject> | null> {
        try {
            const result = await streamText({
                model: this.model,
                system: 'You are a seasoned React and Tailwind expert.' + this.getFormatString(),
                messages,
            });

            let fullText = '';
            for await (const partialText of result.textStream) {
                fullText += partialText;
                const partialObject: DeepPartial<z.infer<typeof StreamReponseObject>> = parse(fullText, Allow.ALL);
                // Yay, partial object!
            }

            const fullObject = parse(fullText, Allow.ALL) as z.infer<typeof StreamReponseObject>;
            return fullObject;
        } catch (error) {
            console.error('Error receiving stream', error);
            const errorMessage = this.getErrorMessage(error);
            this.emitErrorMessage('requestId', errorMessage);
            return null;
        }
    }

  getFormatString() {
        const jsonFormat = JSON.stringify(zodToJsonSchema(StreamReponseObject));
        return `\nReturn your response only in this JSON format: <format>${jsonFormat}</format>`;
    }

// ...
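For intuition, here is a toy, stdlib-only sketch of what `parse(fullText, Allow.ALL)` from `partial-json` is doing: it completes an unfinished JSON prefix so the regular parser can read it. This is an illustrative assumption about the technique, not the real library, which also handles string escapes, truncated keys, and partial literals.

```typescript
// Toy illustration of partial-JSON parsing (simplified model of what the
// `partial-json` package does; ignores escape sequences and only handles
// truncation inside objects, arrays, and string values).
function parsePartialJson(prefix: string): unknown {
  const closers: string[] = [];
  let inString = false;
  for (const ch of prefix) {
    if (inString) {
      if (ch === '"') inString = false; // no escape handling in this sketch
    } else if (ch === '"') {
      inString = true;
    } else if (ch === '{') {
      closers.push('}');
    } else if (ch === '[') {
      closers.push(']');
    } else if (ch === '}' || ch === ']') {
      closers.pop();
    }
  }
  let completed = prefix;
  if (inString) completed += '"'; // close a truncated string value
  while (closers.length) completed += closers.pop(); // close open containers
  return JSON.parse(completed);
}
```

Each streamed chunk extends the accumulated text, so re-running this on every chunk yields an ever-more-complete object.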

Edit: I should add that, to also get a text response along with the object, you can have it return an array whose items are either text blocks or your other object type:

import type { PartialDeep } from 'type-fest';
import { z } from 'zod';

const TextBlockSchema = z.object({
    type: z.literal('text').describe('The type of the block, should be text'),
    text: z
        .string()
        .describe('Text reply to the user, can be a message to describe the code change'),
});

const CodeBlockSchema = z.object({
    type: z.literal('code').describe('The type of the block, should be code'),
    fileName: z.string().describe('The name of the file to be changed'),
    value: z
        .string()
        .describe(
            'The new or modified code for the file. Always include the full content of the file.',
        ),
});

const ResponseBlockSchema = z.discriminatedUnion('type', [TextBlockSchema, CodeBlockSchema]);

export const StreamReponseSchema = z
    .object({
        blocks: z
            .array(ResponseBlockSchema)
            .describe('Array of responses that can be text or code type'),
    })
    .describe('Generate a stream of text and code responses');
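Once parsed, the streamed blocks can be handled by narrowing on the `type` field, the same discriminant the `z.discriminatedUnion` above switches on. A minimal sketch, with plain TypeScript types standing in for the zod-inferred ones:

```typescript
// Plain-TS stand-ins for the zod-inferred block types above.
type TextBlock = { type: "text"; text: string };
type CodeBlock = { type: "code"; fileName: string; value: string };
type ResponseBlock = TextBlock | CodeBlock;

// Narrowing on the discriminant lets each block kind render differently.
function renderBlock(block: ResponseBlock): string {
  return block.type === "text"
    ? block.text
    : `// ${block.fileName}\n${block.value}`;
}
```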

@sahanatvessel

@lgrammel maybe add this to the docs in the AI SDK Anthropic section? If I had seen this in the docs, I wouldn't have raised a bug last week 🙈

@automationghost

automationghost commented Nov 5, 2024

Can someone clarify whether this is supposed to be supported or not? The docs state:

https://sdk.vercel.ai/providers/ai-sdk-providers/anthropic

Anthropic language models can also be used in the streamText, generateObject, streamObject, and streamUI

So I was assuming this works as a stream, not as a result I get back once it's finished? That makes no sense, since then I would just use generateObject instead.

@Kitenite
Author

Kitenite commented Nov 5, 2024

@automationghost the problem seems to be on the Anthropic side.
For tool calling, they don't actually stream the object (they send the whole object value at the end). streamObject seems to be just a tool call to Anthropic so this is what happens. I agree, a note on the streamObject doc would have saved a lot of work.

Disclaimer: I'm not a maintainer of either package but I used both the Anthropic SDK and vercel/ai for my use case

@automationghost

automationghost commented Nov 5, 2024

@automationghost the problem seems to be on the Anthropic side. For tool calling, they don't actually stream the object (they send the whole object value at the end). streamObject seems to be just a tool call to Anthropic so this is what happens. I agree, a note on the streamObject doc would have saved a lot of work.

Disclaimer: I'm not a maintainer of either package but I used both the Anthropic SDK and vercel/ai for my use case

K, thanks. I guess it "works" in a fake way then, so essentially this won't change. I am experimenting with the Anthropic API and was looking for ways to get streamed structured output there as well. Guess you'd then need to resort to streamText and some hacky solution to get that going. Are you by chance aware of any libraries that do that for non-OpenAI models? I have used the partial-json lib before but would hope not to have to use it. Any experience with other models?

@Kitenite
Author

Kitenite commented Nov 5, 2024

Are you by chance are aware by now of any libaries that do that for non openai models ? I have used partial json lib before but would hope not to have to use it. Any experience with other models?

I am not aware of any for Anthropic. I've found the above hack to be performant enough. With some better lint handling, since Claude likes to wrap code in backticks, it's working like a charm.

@automationghost

automationghost commented Nov 5, 2024

Are you by chance are aware by now of any libaries that do that for non openai models ? I have used partial json lib before but would hope not to have to use it. Any experience with other models?

I am not aware of any for Anthropic. I've found the above hack to be performant enough. W/ some better linting as Claude likes to wrap code in backticks it's working like a charm.

Thanks for the feedback. I googled for an hour to figure things out.
Turns out you are spot on with your approach of using code blocks and parsing the results out of them. As of now, this is the way to get kind-of-streamed structured output from models across the board. Even v0 does it similarly (it creates multiple code blocks in its system prompt) and parses the results out of them, so it doesn't even use a schema inside the code blocks but describes the desired output instead. Your little hack works in a similar way and is easier to parse.

I have made a copy-paste example from your example:

import { streamText } from "ai";
import { Allow, parse } from "partial-json";
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
import dotenv from "dotenv";
import { createAnthropic } from "@ai-sdk/anthropic";

dotenv.config();

// The API key belongs on the provider instance, not on streamText.
const anthropic = createAnthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const schema = z.object({
  code: z.object({
    code_output: z.string(),
  }),
});

try {
  const result = await streamText({
    model: anthropic("claude-3-5-sonnet-20241022", {
      cacheControl: false,
    }),
    system: "You are a seasoned React expert." + getFormatString(),
    prompt: "Create an example React component using MUI with a TanStack table",
  });

  let fullText = "";
  for await (const partialText of result.textStream) {
    fullText += partialText;
    const partialObject = parse(fullText, Allow.ALL);

    if (partialObject && partialObject.code && partialObject.code.code_output) {
      console.clear();
      console.log(partialObject.code.code_output);
    }
  }

  /* const fullObject = parse(fullText, Allow.ALL);
  console.log("fullObject", fullObject); */
} catch (error) {
  // This snippet runs at module scope, so there is no `this` here.
  console.error("Error receiving stream", error);
}

function getFormatString() {
  const jsonFormat = JSON.stringify(zodToJsonSchema(schema));
  return `\nReturn your response only in this JSON format: <format>${jsonFormat}</format>`;
}

@Kitenite
Author

Kitenite commented Nov 5, 2024

I have made a copy paste example from your example.

Nice, one more thing that may help is this helper just to handle random wrapped code:

export function stripFullText(fullText: string) {
    let text = fullText;

    // Check the longer '```json' prefix before the bare '```' prefix,
    // otherwise stripping '```' first leaves the 'json' tag behind.
    if (text.startsWith('```json\n')) {
        text = text.slice(8);
    } else if (text.startsWith('```')) {
        text = text.slice(3);
    }

    if (text.endsWith('```')) {
        text = text.slice(0, -3);
    }
    return text;
}

Usage:

export function parseObjectFromText(
    text: string,
): DeepPartial<z.infer<typeof StreamReponseObject>> {
    const cleanedText = stripFullText(text);
    return parse(cleanedText, Allow.ALL) as DeepPartial<z.infer<typeof StreamReponseObject>>;
}

@automationghost

automationghost commented Nov 5, 2024

I have made a copy paste example from your example.

Nice, one more thing that may help is this helper just to handle random wrapped code: […]

That's actually something v0 also describes in its system prompt, so this randomly wrapped code doesn't happen in the first place :). But yes, it might come in handy, we'll see. Thanks!
