Cannot index 5MB PDF with default settings using bedrock  #94

Open
@dirkpetersen

Description

I am trying to upload this file (5 MB, 2,384,000 characters) to LibreChat with the Bedrock API enabled:
https://pve.proxmox.com/pve-docs/pve-admin-guide.pdf

I tried both the dev and dev-lite containers. In each case the LibreChat GUI shows an upload error ("An Error occurred while uploading a file"), but the logs contain no real error, even with DEBUG_RAG_API=true. Strange.

However, if I set CHUNK_SIZE=5000, the upload works. These are my RAG settings:

DEBUG_RAG_API=true
RAG_USE_FULL_CONTEXT=true
PDF_EXTRACT_IMAGES=false # false is default
CHUNK_SIZE=5000 # 1500 is default

AWS_DEFAULT_REGION=us-west-2
AWS_ACCESS_KEY_ID=cc
AWS_SECRET_ACCESS_KEY=cc

EMBEDDINGS_PROVIDER=bedrock
EMBEDDINGS_MODEL=amazon.titan-embed-text-v1

RAG_API_URL=http://host-gateway:8000
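For scale, here is a rough sketch of how many chunks this document would produce at the default CHUNK_SIZE versus 5000. It assumes a plain sliding-window split over the 2,384,000 characters reported above. The helper `estimate_chunks` and its `chunk_overlap` parameter are illustrative names, not part of the RAG API, and the real splitter may cut on separators and produce somewhat different counts. This only shows the difference in the number of embedding inputs and does not establish the cause of the upload error.

```python
import math


def estimate_chunks(total_chars: int, chunk_size: int, chunk_overlap: int = 0) -> int:
    """Estimate the chunk count for a fixed-size sliding-window split.

    Hypothetical helper for illustration; a real text splitter may differ.
    """
    if chunk_size <= chunk_overlap:
        raise ValueError("chunk_size must be larger than chunk_overlap")
    if total_chars <= 0:
        return 0
    if total_chars <= chunk_size:
        return 1
    # Each chunk after the first advances by (chunk_size - chunk_overlap).
    step = chunk_size - chunk_overlap
    return 1 + math.ceil((total_chars - chunk_size) / step)


TOTAL_CHARS = 2_384_000  # character count reported in this issue

for size in (1500, 5000):  # default CHUNK_SIZE vs. the value that works
    print(f"CHUNK_SIZE={size}: ~{estimate_chunks(TOTAL_CHARS, size)} chunks")
```

With no overlap this gives roughly 1590 chunks at CHUNK_SIZE=1500 and 477 at CHUNK_SIZE=5000, so the default setting sends about three times as many inputs to the embeddings provider.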
