
Cache completions #389

Open
@abrichr

Description

Is there a mechanism to avoid hitting the API when the prompt hasn't changed?

For example:

import ell

@ell.simple(model="gpt-4o")
def hello(name: str):
    """You are a helpful assistant.""" # System prompt
    return f"Say hello to {name}!" # User prompt

greeting = hello("Sam Altman")
print(greeting)

If we run this script twice, the second run does not need to call the API, provided we persist the result of the function call to disk.

Normally we can accomplish this with joblib.Memory:

from joblib import Memory
import ell

memory = Memory("./cache")

@memory.cache()
@ell.simple(model="gpt-4o")
def hello(name: str):
    """You are a helpful assistant.""" # System prompt
    return f"Say hello to {name}!" # User prompt

greeting = hello("Sam Altman")
print(greeting)

Now if we run this script twice, the API is not hit on the second run.

This behaves as expected when the function arguments change: for example, calling hello("Sam") hits the API, because the arguments differ from the cached call.

However, if we change a prompt literal inside the function (the docstring or the returned string), joblib does not pick up the change, and the stale result is returned.
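One possible workaround, sketched here with the standard library only, is to build the cache key from the prompt function's compiled body and docstring as well as its arguments, so that editing a prompt literal produces a new key. The names `prompt_aware_cache`, `code_fingerprint`, and the `wrap` parameter are hypothetical helpers invented for this sketch and are not part of ell or joblib. It also assumes the wrapped call returns something JSON-serializable, such as a string.

```python
import functools
import hashlib
import json
import types
from pathlib import Path


def code_fingerprint(code: types.CodeType) -> str:
    """Summarize a code object by its bytecode, names, and constants."""
    consts = [
        # Recurse into nested code objects (lambdas, comprehensions) so the
        # fingerprint does not depend on memory addresses in their repr.
        code_fingerprint(c) if isinstance(c, types.CodeType) else repr(c)
        for c in code.co_consts
    ]
    return repr((code.co_code.hex(), code.co_names, consts))


def prompt_aware_cache(cache_dir, wrap=None):
    """Cache results on disk, keyed on function body, docstring, and arguments.

    `wrap` is an optional decorator (hypothetical parameter) applied to the raw
    function; the wrapped callable only runs on a cache miss.
    """
    directory = Path(cache_dir)

    def decorator(fn):
        call = wrap(fn) if wrap is not None else fn
        # Fingerprint the undecorated function, so prompt literals are visible.
        body = code_fingerprint(fn.__code__)

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            key_material = json.dumps(
                [fn.__qualname__, body, fn.__doc__,
                 repr(args), repr(sorted(kwargs.items()))]
            )
            key = hashlib.sha256(key_material.encode("utf-8")).hexdigest()
            path = directory / f"{key}.json"
            if path.exists():
                # Cache hit: same prompt text and same arguments as before.
                return json.loads(path.read_text(encoding="utf-8"))
            result = call(*args, **kwargs)
            directory.mkdir(parents=True, exist_ok=True)
            path.write_text(json.dumps(result), encoding="utf-8")
            return result

        return wrapper

    return decorator
```

With ell, this would presumably be used as `@prompt_aware_cache("./cache", wrap=ell.simple(model="gpt-4o"))` directly above `def hello(name: str)`, replacing both decorators in the example above. Note the limits of this approach: the key does not track global variables or helper functions that the prompt references, and since bytecode differs between Python versions, upgrading Python invalidates the cache.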

Any suggestions for avoiding unnecessary API calls would be appreciated!
