PDF Encrypted Case #2764
PR Summary
This pull request enhances the handling of encrypted PDF files in the Danswer project's file processing module.
- Improved read_pdf_file function in backend/danswer/file_processing/extract_file_text.py to gracefully handle encrypted PDFs without passwords
- Added logging for encrypted PDFs, allowing them to be discoverable by title without processing content
- Ensures encrypted PDFs don't cause errors during processing, maintaining system stability
- Aligns with existing error handling patterns in the file, improving overall code consistency
- May require updates to integration tests to cover new encrypted PDF handling scenarios
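The behavior summarized above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name read_pdf_text and its parameters are hypothetical, and pdf_reader is assumed to behave like a pypdf PdfReader (is_encrypted, decrypt(), pages, and extract_text() on each page). It also assumes decrypt() returns a falsy value when the password does not work, as it does in recent pypdf versions.

```python
import logging

logger = logging.getLogger(__name__)


def read_pdf_text(pdf_reader, metadata, pdf_pass=None):
    """Return (text, metadata); the text is empty if the PDF stays encrypted.

    Hypothetical helper: `pdf_reader` is assumed to expose the pypdf
    PdfReader interface (is_encrypted, decrypt, pages).
    """
    if pdf_reader.is_encrypted and pdf_pass is not None:
        # Assumption: decrypt() returns a falsy value (0) on a wrong password.
        if not pdf_reader.decrypt(pdf_pass):
            logger.warning("Unable to decrypt pdf with the provided password")
            return "", metadata
    elif pdf_reader.is_encrypted:
        # No password at all: skip the content but keep the metadata, so the
        # document can still be found by title.
        logger.warning("No password available to decrypt pdf, returning empty")
        return "", metadata

    text = "\n".join(page.extract_text() for page in pdf_reader.pages)
    return text, metadata
```

Returning the metadata alongside an empty string is what keeps the file indexable by title even though none of its content is processed.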
1 file(s) reviewed, 3 comment(s)
    else:
        logger.warning("No Password available to to decrypt pdf")
elif pdf_reader.is_encrypted:
    logger.warning("No Password available to to decrypt pdf, returning empty")
syntax: typo in 'to to', should be 'to'
elif pdf_reader.is_encrypted:
    logger.warning("No Password available to to decrypt pdf, returning empty")
    return "", metadata
style: consider moving this block before the previous condition to avoid nested if statements
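One way to read this suggestion is to turn the no-password case into a guard clause that runs first, so the decryption branch no longer needs its own else. The sketch below only illustrates that ordering. The name extract_with_guard and its parameters are hypothetical, and pdf_reader is assumed to follow the pypdf PdfReader interface, with decrypt() returning a falsy value on failure.

```python
import logging

logger = logging.getLogger(__name__)


def extract_with_guard(pdf_reader, metadata, pdf_pass=None):
    """Hypothetical reordering: handle the no-password case before anything else."""
    # Guard clause: encrypted file and nothing to decrypt it with.
    if pdf_reader.is_encrypted and pdf_pass is None:
        logger.warning("No password available to decrypt pdf, returning empty")
        return "", metadata

    # From here on, an encrypted reader always has a password to try.
    # Assumption: decrypt() returns a falsy value (0) when the password fails.
    if pdf_reader.is_encrypted and not pdf_reader.decrypt(pdf_pass):
        logger.warning("Unable to decrypt pdf with the provided password")
        return "", metadata

    text = "\n".join(page.extract_text() for page in pdf_reader.pages)
    return text, metadata
```

With the early returns up front, the extraction path sits at a single indentation level, and each failure case has its own log message.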
No description provided.