Cannot index 5MB PDF with default settings using bedrock #94
Open
Description
I am trying to upload this file (5 MB, 2,384,000 characters) to LibreChat with the Bedrock API enabled:
https://pve.proxmox.com/pve-docs/pve-admin-guide.pdf
I tried both the dev and dev-lite containers. In both cases the LibreChat GUI shows an upload error ("An Error occurred while uploading a file"), but no real error appears in the logs, even with DEBUG_RAG_API=true, which is strange.
However, if I set CHUNK_SIZE=5000, the upload works. These are my RAG settings:
DEBUG_RAG_API=true
RAG_USE_FULL_CONTEXT=true
PDF_EXTRACT_IMAGES=false # false is default
CHUNK_SIZE=5000 # 1500 is default
AWS_DEFAULT_REGION=us-west-2
AWS_ACCESS_KEY_ID=cc
AWS_SECRET_ACCESS_KEY=cc
EMBEDDINGS_PROVIDER=bedrock
EMBEDDINGS_MODEL=amazon.titan-embed-text-v1
RAG_API_URL=http://host-gateway:8000
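
For reference, here is a rough sketch of how many chunks each CHUNK_SIZE would produce for this file. It assumes a plain fixed-size character splitter and one embedding request per chunk; both are assumptions, not confirmed behavior of the RAG API, and `estimate_chunks` is a hypothetical helper written for this illustration. It only shows that the default setting creates about three times as many chunks. It does not establish why the upload fails.

```python
import math


def estimate_chunks(total_chars: int, chunk_size: int, chunk_overlap: int = 0) -> int:
    """Estimate the chunk count for a simple fixed-size character splitter.

    Hypothetical helper: a real splitter may cut on separators and
    produce a somewhat different number of chunks.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if total_chars <= chunk_size:
        return 1
    # Each chunk after the first advances by (chunk_size - chunk_overlap) characters.
    step = chunk_size - chunk_overlap
    return 1 + math.ceil((total_chars - chunk_size) / step)


TOTAL_CHARS = 2_384_000  # size of the PDF text reported in this issue

default_chunks = estimate_chunks(TOTAL_CHARS, 1500)  # default CHUNK_SIZE
larger_chunks = estimate_chunks(TOTAL_CHARS, 5000)   # the setting that works

print(default_chunks)  # 1590
print(larger_chunks)   # 477
```

If the failure depends on the number of embedding requests (for example a timeout or rate limit on the Bedrock side), this difference would be consistent with CHUNK_SIZE=5000 succeeding, but that remains a guess until the logs show a concrete error.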