Open
Description
Goal
- Jan needs an elegant way to deal with model context length issues
Possible Scope
- e.g. Logic for Thread > context length?
- e.g. User can adjust the context length to the model within model bounds
- e.g. Support longer context lengths when both the model and the hardware allow it
- e.g. Jan has adaptive context length, given GGUF or model.yaml, and hardware detection
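The adaptive-context idea above could be sketched roughly as follows. This is an illustrative sketch only: `ModelInfo`, `pickContextLength`, and the linear KV-cache memory assumption are hypothetical, not Jan's actual API.

```typescript
// Hypothetical sketch: clamp a requested context length to what the model
// (per its GGUF / model.yaml metadata) and the detected hardware allow.
interface ModelInfo {
  maxContextLength: number; // e.g. read from GGUF metadata or model.yaml
}

function pickContextLength(
  requested: number,
  model: ModelInfo,
  freeMemoryBytes: number, // from hardware detection
  bytesPerToken: number,   // rough assumption: KV cache grows linearly per token
): number {
  const hardwareMax = Math.floor(freeMemoryBytes / bytesPerToken);
  // Never exceed the model's trained context or what memory can hold.
  return Math.max(1, Math.min(requested, model.maxContextLength, hardwareMax));
}
```

For example, a request for 8192 tokens against a 4096-token model would be clamped down to 4096 regardless of available memory.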
Linked Issues
Cortex Issue
Original Post
Problem
In some cases, a user's thread can exceed the model's context limit of 4096 tokens (~4,000 words), but we haven't implemented any solution to handle this.
Success Criteria
- Show an alert notifying users when they exceed the context length
- When the context length is exceeded, delete the earliest user message (but never the system message)
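The second criterion (dropping the earliest user message while preserving the system prompt) could look roughly like this. All names here are illustrative, and the ~4-characters-per-token estimate stands in for a real tokenizer.

```typescript
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Very rough token estimate (~4 chars per token); a real tokenizer
// for the loaded model should be used in practice.
const estimateTokens = (msgs: Message[]): number =>
  msgs.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);

// Drop the oldest non-system message until the thread fits the context window.
function trimThread(messages: Message[], contextLimit: number): Message[] {
  const trimmed = [...messages];
  while (estimateTokens(trimmed) > contextLimit) {
    const idx = trimmed.findIndex((m) => m.role !== 'system');
    if (idx === -1) break; // only the system prompt remains; nothing left to drop
    trimmed.splice(idx, 1);
  }
  return trimmed;
}
```

This is a simple MVP policy; smarter strategies (summarizing dropped turns, keeping the most recent N exchanges) could replace it later.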
Additional context
Bug:
As discussed with @hahuyhoang411:
- Jan throws an error when a thread exceeds the context length
- Recommend that users delete messages themselves or create a new thread
Design:
(This is the MVP for now. In the future we will have a standardized error format that directs users to the Discourse forum, where they can find the answer; see specs: https://www.notion.so/jan-ai/Standardized-Error-Format-for-Jan-abea56d32d6648bb8c6835f9176f800c?pvs=4)
Status: Scheduled