
How to configure and avoid Open AI rate limiting when ingesting files? #1601

Open
emahpour opened this issue Nov 17, 2024 · 4 comments

@emahpour

Describe the problem
I uploaded a JSON file with around 4,000 entries. While monitoring the processes, I realized OpenAI was enforcing rate limits and the application became unresponsive as it kept retrying the failed calls to OpenAI.
What is the recommended way to avoid running into this problem?

To Reproduce
Create a JSON file with a large number of entries (e.g., 4,000 rows).

  1. Upload the file as a document
  2. Initiate Graph Creation Process

Expected behavior
Graph generation should be processed with consideration of OpenAI API rate limits.

Screenshots
(screenshot: OpenAI rate-limit errors during graph creation)

@NolanTrem
Collaborator

NolanTrem commented Nov 17, 2024

Edit: I didn't realize this was a graph process. If you're using the full version and a single job fails, you can retry that job; look for the orchestration cookbook in the docs.

Given that this is a JSON file, it might make sense for you to upload entries as chunks rather than as a single document. The embedding requests are sent in batches with exponential backoff, though, so I suspect that this will eventually succeed. If you're using the full version and it fails, you can always retry the job, which is especially helpful when you've broken the file up or have many files.
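
Roughly, the batching-plus-backoff behavior looks like the sketch below. This is only illustrative, written directly against the OpenAI Python SDK rather than the actual ingestion code, and the model name, batch size, and retry parameters are placeholders you'd tune to your own rate limits:

```python
import time

from openai import OpenAI, RateLimitError

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def embed_in_batches(texts, batch_size=256, model="text-embedding-3-small",
                     max_retries=6, base_delay=1.0):
    """Embed a list of strings in batches, backing off exponentially on rate limits."""
    embeddings = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        for attempt in range(max_retries):
            try:
                resp = client.embeddings.create(model=model, input=batch)
                embeddings.extend(item.embedding for item in resp.data)
                break
            except RateLimitError:
                # Sleep 1s, 2s, 4s, ... before retrying the same batch.
                time.sleep(base_delay * (2 ** attempt))
        else:
            raise RuntimeError(
                f"Batch starting at index {start} was still rate limited after {max_retries} retries"
            )
    return embeddings
```

The key point is that a rate-limited batch is retried with increasing delays rather than immediately, so the provider's per-minute request/token limits have time to clear.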

@emahpour
Author

Even using Hatchet with smaller chunks, it can technically run into the same rate limit issue, no?
Is there any configuration to apply rate limiting in the Hatchet queues?

@NolanTrem
Collaborator

I think what you're looking for then is the batch_size parameter in the configuration file. The default is 256. Changing this would only impact future graphs, though.
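
Roughly, that would look something like this in the config file. This is a sketch assuming batch_size lives under the embedding section of an r2r.toml-style config; check your own configuration file for the exact section name and location:

```toml
[embedding]
# Default is 256; smaller values send fewer texts per embedding request,
# at the cost of more requests overall.
batch_size = 64
```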

@emahpour
Author

Unfortunately, lowering batch_size did not help; instead it overloaded the container's CPU with a flood of retry attempts that kept failing. There should be a better approach than brute-forcing and hoping everything eventually gets processed.

(screenshot: container CPU usage spiking during repeated retries)
