Some jobs report UNSUPPORTED_FILE_TYPE Exception but Status is Good According to API #576
Open
Description
Describe the bug
I have a directory containing several hundred HTML files. When I parse them with the load_data method in the Python llama_parse package, the code fails with the UNSUPPORTED_FILE_TYPE exception.
- Each time, it happens with different files
- If I check the status in the web UI within a few seconds of the failure, I see "failed" and the error. But if I refresh the page, it lists success. The results seem correct.
- One can also see successful job status using the API
lp = llama_parse.LlamaParse(
api_key=lp,
result_type='markdown',
verbose=False,
use_vendor_multimodal_model=True,
vendor_multimodal_model_name='openai-gpt4o',
[... other options..]
)
lp.load_data(files)  # files is a list of paths
File ".../llama_parse/base.py", line 654, in _get_job_result
raise Exception(exception_str)
Exception: Job ID: 3e3b9c18-8059-4c4d-bf84-a138d73e2208 failed with status: ERROR, Error code: UNSUPPORTED_FILE_TYPE, Error message: Unsupported file type.
[...]
% curl -X 'GET' \
'https://api.cloud.llamaindex.ai/api/parsing/job/3e3b9c18-8059-4c4d-bf84-a138d73e2208' \
-H 'accept: application/json' \
-H "Authorization: Bearer $LLAMA_CLOUD_API_KEY"
{"id":"3e3b9c18-8059-4c4d-bf84-a138d73e2208","status":"SUCCESS"}%
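The manual check above could be scripted to tell spurious failures from real ones. The following is a minimal sketch, not part of llama_parse: it assumes the exception message keeps the "Job ID: ..." format shown in the traceback and that the job endpoint returns a JSON body with a "status" field, as in the curl output. The helper names (extract_job_id, fetch_job_status, failure_is_spurious) are hypothetical.

```python
import json
import re
import urllib.request

# Endpoint taken from the curl call in this report.
JOB_URL = "https://api.cloud.llamaindex.ai/api/parsing/job/{job_id}"
# Assumption: the exception text always contains "Job ID: <uuid>".
JOB_ID_RE = re.compile(r"Job ID: ([0-9a-f-]{36})")


def extract_job_id(message):
    """Return the job ID embedded in an exception message, or None."""
    match = JOB_ID_RE.search(message)
    return match.group(1) if match else None


def fetch_job_status(job_id, api_key):
    """Ask the API for the current status of a job (same request as the curl call)."""
    request = urllib.request.Request(
        JOB_URL.format(job_id=job_id),
        headers={
            "accept": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["status"]


def failure_is_spurious(exc, api_key, fetch=fetch_job_status):
    """True if the client raised an error but the API now reports SUCCESS."""
    job_id = extract_job_id(str(exc))
    return job_id is not None and fetch(job_id, api_key) == "SUCCESS"
```

Wrapping the load_data call in a try/except and passing the caught exception to failure_is_spurious would show how many of the reported failures the API later lists as successful. The fetch parameter exists only so the check can be exercised without network access.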
Job ID
3e3b9c18-8059-4c4d-bf84-a138d73e2208
Client:
- Python Library