
roadmap: Jan Context Length issues #2320

Open
@hahuyhoang411

Description

Goal

  • Jan needs an elegant way to deal with model context length issues

Possible Scope

  • e.g. Logic for when a thread exceeds the context length?
  • e.g. Users can adjust the context length within the model's bounds
  • e.g. Support longer context lengths when both the model and the hardware support them
  • e.g. Jan adapts the context length based on GGUF or model.yaml metadata and hardware detection
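A minimal sketch of how the "within model bounds" and "hardware supported" ideas above could combine. This is not Jan's actual implementation: `ContextLimits`, `resolveContextLength`, and the 512-token floor are hypothetical, and the sketch assumes the model maximum has already been read from GGUF or model.yaml and the hardware maximum has already been estimated elsewhere.

```typescript
// Hypothetical limits; how Jan would obtain these values is not specified in the issue.
interface ContextLimits {
  modelMax: number;    // largest context the model supports (e.g. from GGUF or model.yaml)
  hardwareMax: number; // largest context that fits in available memory
  min?: number;        // smallest context we allow; defaults to 512 (assumption)
}

// Clamp the user's requested context length to what both the model and the
// hardware can handle. With no request, fall back to the largest safe value.
function resolveContextLength(
  requested: number | undefined,
  limits: ContextLimits
): number {
  const min = limits.min ?? 512;
  const upper = Math.max(min, Math.min(limits.modelMax, limits.hardwareMax));
  const wanted = requested ?? upper;
  return Math.min(upper, Math.max(min, Math.floor(wanted)));
}

// Example: the model allows 32768 tokens, but the hardware only fits 4096.
const effective = resolveContextLength(8192, { modelMax: 32768, hardwareMax: 4096 });
console.log(effective);
```

Taking the minimum of the two upper bounds means a user setting can never push the model past what it was trained for or past what the machine can hold.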

Linked Issues

Cortex Issue

Original Post

Problem
In some cases, a thread can exceed the model's context limit of 4096 tokens (roughly 3,000 English words). We have not yet implemented any way to handle this.

Success Criteria

  1. Show an alert that notifies users when they exceed the context length
  2. Delete the oldest user message (not the system message) when the context length is exceeded
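The second criterion could be sketched roughly as follows. Everything here is an assumption for illustration: the `ChatMessage` shape, the `trimToContext` name, and especially `estimateTokens`, which uses a crude 4-characters-per-token guess in place of a real tokenizer. The sketch also drops the oldest non-system message of either role (user or assistant) and never drops the latest message, which goes slightly beyond what the criterion states.

```typescript
// Hypothetical message shape; Jan's real thread types may differ.
type Role = "system" | "user" | "assistant";
interface ChatMessage {
  role: Role;
  content: string;
}

// Rough stand-in for a real tokenizer: about 4 characters per token (assumption).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function totalTokens(messages: ChatMessage[]): number {
  return messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
}

// Drop the oldest non-system messages until the thread fits within ctxLen.
// Returns the trimmed list plus the number dropped, so the UI can show an alert.
function trimToContext(
  messages: ChatMessage[],
  ctxLen: number
): { kept: ChatMessage[]; dropped: number } {
  const kept = [...messages];
  let dropped = 0;
  while (totalTokens(kept) > ctxLen) {
    const idx = kept.findIndex((m) => m.role !== "system");
    // Stop if only system messages remain or only the latest message is left.
    if (idx === -1 || idx === kept.length - 1) break;
    kept.splice(idx, 1);
    dropped++;
  }
  return { kept, dropped };
}
```

A caller would show the alert from the first criterion whenever `dropped` is greater than zero, or whenever the returned thread still does not fit.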

Additional context
Bug:


@imtuyethan

As discussed with @hahuyhoang411:

  • Show an error when the thread exceeds the context length
  • Recommend that users delete messages themselves or create a new thread
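The MVP behaviour described above could look something like this. It is only a sketch: `checkThreadFits`, `ThreadCheck`, the error wording, and the 4-characters-per-token estimate in `approxTokens` are all hypothetical, and a real implementation would use the model's own tokenizer and the standardized error format once that exists.

```typescript
// Result of the check: either the thread fits, or we return a user-facing error.
type ThreadCheck =
  | { ok: true; usedTokens: number }
  | { ok: false; usedTokens: number; message: string };

// Crude token estimate, about 4 characters per token (assumption, not a real tokenizer).
function approxTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Check a thread's message contents against the context length before sending.
function checkThreadFits(contents: string[], ctxLen: number): ThreadCheck {
  const usedTokens = contents.reduce((sum, c) => sum + approxTokens(c), 0);
  if (usedTokens <= ctxLen) {
    return { ok: true, usedTokens };
  }
  return {
    ok: false,
    usedTokens,
    message:
      `This thread uses about ${usedTokens} tokens, which exceeds the ` +
      `context length of ${ctxLen}. Delete some messages or start a new thread.`,
  };
}
```

Returning a result object rather than throwing keeps the decision about how to display the error in the UI layer.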

Design:

https://www.figma.com/file/ytn1nRZ17FUmJHTlhmZB9f/Jan-App-(version-1)?type=design&node-id=6847-111809&mode=design&t=ErX19MBkMjVhBSjO-4

[Screenshot, 2024-03-27 at 3:59:07 PM]

(This is the MVP for now. In the future we will have a standardized error format that directs users to the Discourse forum, where they can find the answer. See specs: https://www.notion.so/jan-ai/Standardized-Error-Format-for-Jan-abea56d32d6648bb8c6835f9176f800c?pvs=4)
