Update context window management to avoid context shifts #3176
What are you trying to do?
Today, when the context window limit is reached, a "context shift" occurs, which effectively halves the number of tokens in the context window to make room for new generations. We should avoid this: OpenAI and other tools instead enforce token limits that, when reached, stop generation and let the user know.
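For illustration, here is a minimal Go sketch of the proposed behavior. All names (`generate`, `numCtx`, `nextToken`, the reason strings) are hypothetical, not Ollama's actual internals: instead of shifting the context when the window fills up, generation stops and the reason is surfaced to the caller.

```go
package main

import "fmt"

// generateResult is a hypothetical summary of why generation ended.
type generateResult struct {
	Text   string
	Reason string // e.g. "stop" or "token_limit"
}

// generate is a sketch: it stops when the context window is full
// instead of performing a context shift (discarding older tokens).
func generate(promptTokens, numCtx int, nextToken func() (string, bool)) generateResult {
	used := promptTokens
	var out string
	for {
		if used >= numCtx {
			// Today: shift the context (drop ~half the tokens) and keep going.
			// Proposed: stop and report the reason to the caller.
			return generateResult{Text: out, Reason: "token_limit"}
		}
		tok, done := nextToken()
		if done {
			return generateResult{Text: out, Reason: "stop"}
		}
		out += tok
		used++
	}
}

func main() {
	i := 0
	// Prompt nearly fills a 2048-token window; the model would keep going.
	res := generate(2040, 2048, func() (string, bool) {
		i++
		return " token", i > 100
	})
	fmt.Printf("reason=%s, generated %d chars\n", res.Reason, len(res.Text))
}
```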
How should we solve this?
A few ideas:
- Make sure at least x% of the context window remains available for generation beyond the prompt
- Add a `reason` or similar key to `/api/generate` and `/api/chat` so it's obvious when the token limit is hit (see the sketch after this list)
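To sketch the second idea, the response types for `/api/generate` and `/api/chat` could carry such a key. The field name `done_reason` and its values are assumptions for illustration, not the actual Ollama API:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// GenerateResponse sketches what an /api/generate reply might look like
// with the proposed key; "done_reason" and its values are hypothetical.
type GenerateResponse struct {
	Model      string `json:"model"`
	Response   string `json:"response"`
	Done       bool   `json:"done"`
	DoneReason string `json:"done_reason,omitempty"` // e.g. "stop" or "token_limit"
}

func main() {
	resp := GenerateResponse{
		Model:      "llama2",
		Response:   "...truncated output...",
		Done:       true,
		DoneReason: "token_limit",
	}
	b, _ := json.MarshalIndent(resp, "", "  ")
	fmt.Println(string(b))
}
```

A client could then branch on this field, for example to warn the user or retry with a larger `num_ctx`.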
What is the impact of not solving this?
Context shifting can cause run-on generations and lower-quality responses.
Anything else?
No response