[Feature] Distributed data collector (ray) #930
Conversation
# Broadcast agent weights
self.local_collector().update_policy_weights_()
state_dict = {"policy_state_dict": self._local_collector.policy.state_dict()}
state_dict = ray.put(state_dict)
Q: in regular data collectors we leave this to the user, or ask explicitly whether the update should be done at each iteration.
Do we want to do the same here, i.e., isolate the weight update in a dedicated method that may or may not be called at each iteration?
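One way the suggestion could look, as a minimal sketch (the class name, the `update_after_each_iter` flag, and the counter are illustrative assumptions, not the actual TorchRL implementation; `ray.put` is replaced by a counter so the sketch runs standalone):

```python
# Hypothetical sketch: isolate the weight broadcast in a dedicated method,
# with an opt-in flag to call it at every iteration. Not the real API.

class DistributedCollectorSketch:
    def __init__(self, update_after_each_iter=False):
        self.update_after_each_iter = update_after_each_iter
        self.broadcast_count = 0  # track how often weights were pushed

    def update_policy_weights_(self):
        # In the real collector this would ray.put() the policy
        # state_dict and ship it to the remote collectors.
        self.broadcast_count += 1

    def iterate(self, n_iters):
        for _ in range(n_iters):
            # ... collect a batch from the remote workers ...
            if self.update_after_each_iter:
                self.update_policy_weights_()

# Opt-in: weights broadcast automatically at every iteration.
auto = DistributedCollectorSketch(update_after_each_iter=True)
auto.iterate(3)
print(auto.broadcast_count)  # 3

# Manual control: the user calls update_policy_weights_() themselves.
manual = DistributedCollectorSketch()
manual.iterate(3)
print(manual.broadcast_count)  # 0
```

This mirrors how the regular collectors leave the choice to the user while still allowing a per-iteration update.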
samples_ready = []
while len(samples_ready) < self.num_collectors:
    samples_ready, samples_not_ready = ray.wait(
        pending_samples, num_returns=len(pending_samples), timeout=0.001)
What's the role of timeout here? Is the value standard? Should it be a hyperparameter or a global variable?
From the Ray docs: if timeout is set, ray.wait returns either when the requested number of IDs is ready or when the timeout is reached, whichever occurs first. If it is not set, the function simply waits until that number of objects is ready and returns exactly that many object refs. The default value is None.
For now we can just set timeout=None, but if we want to implement the preemption mechanism we will need to check regularly how many of the tasks have finished. We can either set it to a reasonable value or allow users to define it.
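The preemption-style polling pattern described here can be sketched with the standard library's `concurrent.futures.wait` as a stand-in for `ray.wait` (same `(done, not_done)` return shape and `timeout` semantics); the helper name and the short poll value are illustrative assumptions:

```python
# Sketch: poll with a short, user-configurable timeout and act as soon
# as enough tasks are done, instead of blocking until all complete.
import concurrent.futures
import time

def collect(i):
    time.sleep(0.01 * i)  # simulate collectors finishing at different times
    return i

def wait_for_quorum(futures, quorum, poll_timeout=0.001):
    """Return once `quorum` futures are done, polling every `poll_timeout` s."""
    done = set()
    while len(done) < quorum:
        done, not_done = concurrent.futures.wait(futures, timeout=poll_timeout)
        # With a preemption mechanism, this is where the slowest
        # `not_done` tasks could be cancelled once the quorum is met.
    return done, not_done

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(collect, i) for i in range(4)]
    done, not_done = wait_for_quorum(futures, quorum=4)
print(len(done))  # 4
```

Exposing `poll_timeout` as a constructor argument (with a sane default) would let users tune the trade-off between responsiveness and polling overhead.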
Co-authored-by: Vincent Moens <vincentmoens@gmail.com>
Sorry for the late reply.
I can't spot it: what is it that you changed? Also, why aren't the tests following the same pattern as the other distributed collectors? I think it's fine if there is a good reason.
LGTM, see my comment in the PR
Awesome!
Lovely! You went the extra length with this!
Description
This PR adds a new class, DistributedCollector, which enables distributed data collection with TorchRL collectors using Ray.
@vmoens This approach required adding a new method to the collectors: one that takes a single step of the iterator. I have done it for SyncDataCollector. Do you think that is fine?
The PR also includes an example of distributed data collection for the SyncDataCollector case.
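The "single step of the iterator" idea can be sketched as follows; the class and its `next()` method are a simplified stand-in for the actual SyncDataCollector change, with plain lists in place of TensorDicts:

```python
# Hypothetical sketch: a collector wrapping its own iterator once and
# exposing a method that returns one batch per call, so a remote Ray
# actor can be asked for exactly one collection step at a time.

class SyncCollectorSketch:
    def __init__(self, frames_per_batch, total_frames):
        self.frames_per_batch = frames_per_batch
        self.total_frames = total_frames
        self._iterator = iter(self._collect())

    def _collect(self):
        # Generator over batches; in TorchRL each batch would be a
        # TensorDict of environment frames rather than a list of ints.
        collected = 0
        while collected < self.total_frames:
            batch = list(range(collected, collected + self.frames_per_batch))
            collected += self.frames_per_batch
            yield batch

    def next(self):
        """Take a single step of the iterator and return one batch."""
        return next(self._iterator)

collector = SyncCollectorSketch(frames_per_batch=4, total_frames=12)
print(collector.next())  # [0, 1, 2, 3]
print(collector.next())  # [4, 5, 6, 7]
```

Keeping a single persistent iterator (rather than re-creating it per call) preserves the collector's internal state between remote calls, which is what makes stepwise remote collection possible.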
Types of changes
What types of changes does your code introduce? Remove all that do not apply:
Checklist
Go over all the following points, and put an x in all the boxes that apply. If you are unsure about any of these, don't hesitate to ask. We are here to help!