[Feature] Fine control over devices in collectors #1835
Merged
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1835
Note: Links to docs will display an error until the docs builds have been completed.
❌ 3 New Failures, 1 Pending, 4 Unrelated Failures
As of commit a7110b0 with merge base 6277226:
NEW FAILURES - The following jobs have failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
facebook-github-bot added the CLA Signed label Jan 25, 2024
vmoens changed the title from "[Feature] Fine control over devices in collectors" to "[WIP, Feature] Fine control over devices in collectors" Jan 25, 2024
vmoens added the enhancement and Refactoring labels Jan 25, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_single | 62.6108ms | 60.6368ms | 16.4916 Ops/s | 16.2024 Ops/s | |
test_sync | 35.0002ms | 34.0588ms | 29.3610 Ops/s | 29.4708 Ops/s | |
test_async | 0.1296s | 33.4832ms | 29.8657 Ops/s | 30.9023 Ops/s | |
test_simple | 0.4699s | 0.4183s | 2.3907 Ops/s | 2.3115 Ops/s | |
test_transformed | 0.6227s | 0.5737s | 1.7430 Ops/s | 1.6795 Ops/s | |
test_serial | 1.3703s | 1.3250s | 0.7547 Ops/s | 0.7323 Ops/s | |
test_parallel | 1.2644s | 1.1889s | 0.8411 Ops/s | 0.8401 Ops/s | |
test_step_mdp_speed[True-True-True-True-True] | 0.1607ms | 21.4148μs | 46.6967 KOps/s | 45.9580 KOps/s | |
test_step_mdp_speed[True-True-True-True-False] | 53.8010μs | 13.0185μs | 76.8138 KOps/s | 75.4249 KOps/s | |
test_step_mdp_speed[True-True-True-False-True] | 43.5110μs | 12.4748μs | 80.1613 KOps/s | 79.4873 KOps/s | |
test_step_mdp_speed[True-True-True-False-False] | 39.5840μs | 7.5246μs | 132.8969 KOps/s | 131.8002 KOps/s | |
test_step_mdp_speed[True-True-False-True-True] | 60.8540μs | 23.0551μs | 43.3744 KOps/s | 43.2874 KOps/s | |
test_step_mdp_speed[True-True-False-True-False] | 45.1950μs | 14.3272μs | 69.7974 KOps/s | 68.0338 KOps/s | |
test_step_mdp_speed[True-True-False-False-True] | 46.3070μs | 13.7300μs | 72.8330 KOps/s | 72.3675 KOps/s | |
test_step_mdp_speed[True-True-False-False-False] | 28.0530μs | 8.8271μs | 113.2879 KOps/s | 111.1805 KOps/s | |
test_step_mdp_speed[True-False-True-True-True] | 65.0130μs | 24.0038μs | 41.6600 KOps/s | 40.9134 KOps/s | |
test_step_mdp_speed[True-False-True-True-False] | 39.5040μs | 15.6704μs | 63.8147 KOps/s | 62.0855 KOps/s | |
test_step_mdp_speed[True-False-True-False-True] | 57.6980μs | 13.7449μs | 72.7540 KOps/s | 72.1365 KOps/s | |
test_step_mdp_speed[True-False-True-False-False] | 38.1320μs | 8.8151μs | 113.4411 KOps/s | 111.6400 KOps/s | |
test_step_mdp_speed[True-False-False-True-True] | 85.7810μs | 25.1199μs | 39.8090 KOps/s | 39.0601 KOps/s | |
test_step_mdp_speed[True-False-False-True-False] | 98.7000μs | 16.8298μs | 59.4186 KOps/s | 57.4238 KOps/s | |
test_step_mdp_speed[True-False-False-False-True] | 47.4090μs | 14.6560μs | 68.2314 KOps/s | 66.4878 KOps/s | |
test_step_mdp_speed[True-False-False-False-False] | 36.7490μs | 9.9821μs | 100.1792 KOps/s | 97.2962 KOps/s | |
test_step_mdp_speed[False-True-True-True-True] | 64.5210μs | 24.1238μs | 41.4528 KOps/s | 40.9980 KOps/s | |
test_step_mdp_speed[False-True-True-True-False] | 61.0150μs | 15.6785μs | 63.7818 KOps/s | 61.8277 KOps/s | |
test_step_mdp_speed[False-True-True-False-True] | 51.8060μs | 15.9499μs | 62.6962 KOps/s | 61.6868 KOps/s | |
test_step_mdp_speed[False-True-True-False-False] | 39.5440μs | 10.0963μs | 99.0466 KOps/s | 96.9192 KOps/s | |
test_step_mdp_speed[False-True-False-True-True] | 68.3080μs | 25.1982μs | 39.6854 KOps/s | 39.1767 KOps/s | |
test_step_mdp_speed[False-True-False-True-False] | 69.6400μs | 16.7774μs | 59.6039 KOps/s | 57.5854 KOps/s | |
test_step_mdp_speed[False-True-False-False-True] | 43.1500μs | 16.9546μs | 58.9810 KOps/s | 57.2304 KOps/s | |
test_step_mdp_speed[False-True-False-False-False] | 34.7550μs | 11.1931μs | 89.3405 KOps/s | 86.0068 KOps/s | |
test_step_mdp_speed[False-False-True-True-True] | 74.9800μs | 26.3803μs | 37.9070 KOps/s | 36.6367 KOps/s | |
test_step_mdp_speed[False-False-True-True-False] | 46.5380μs | 18.1489μs | 55.0999 KOps/s | 52.7390 KOps/s | |
test_step_mdp_speed[False-False-True-False-True] | 48.1500μs | 17.0214μs | 58.7496 KOps/s | 57.2618 KOps/s | |
test_step_mdp_speed[False-False-True-False-False] | 34.4850μs | 11.2417μs | 88.9546 KOps/s | 85.1929 KOps/s | |
test_step_mdp_speed[False-False-False-True-True] | 64.5710μs | 27.3011μs | 36.6286 KOps/s | 35.2840 KOps/s | |
test_step_mdp_speed[False-False-False-True-False] | 46.0460μs | 19.1302μs | 52.2733 KOps/s | 49.5183 KOps/s | |
test_step_mdp_speed[False-False-False-False-True] | 48.0100μs | 18.1315μs | 55.1526 KOps/s | 53.3442 KOps/s | |
test_step_mdp_speed[False-False-False-False-False] | 68.0580μs | 12.1442μs | 82.3440 KOps/s | 77.7995 KOps/s | |
test_values[generalized_advantage_estimate-True-True] | 12.4822ms | 12.2505ms | 81.6292 Ops/s | 80.5270 Ops/s | |
test_values[vec_generalized_advantage_estimate-True-True] | 34.9676ms | 27.7670ms | 36.0139 Ops/s | 36.4085 Ops/s | |
test_values[td0_return_estimate-False-False] | 0.2355ms | 0.1814ms | 5.5122 KOps/s | 5.5081 KOps/s | |
test_values[td1_return_estimate-False-False] | 28.9916ms | 26.1219ms | 38.2820 Ops/s | 38.2658 Ops/s | |
test_values[vec_td1_return_estimate-False-False] | 35.7462ms | 27.8858ms | 35.8605 Ops/s | 36.0130 Ops/s | |
test_values[td_lambda_return_estimate-True-False] | 39.7976ms | 36.5688ms | 27.3457 Ops/s | 27.0481 Ops/s | |
test_values[vec_td_lambda_return_estimate-True-False] | 34.8708ms | 27.6785ms | 36.1291 Ops/s | 36.3274 Ops/s | |
test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.5174ms | 8.0156ms | 124.7562 Ops/s | 116.3695 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 10.1327ms | 1.8793ms | 532.1151 Ops/s | 495.5201 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5876ms | 0.4309ms | 2.3206 KOps/s | 2.2765 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 47.1210ms | 38.6052ms | 25.9032 Ops/s | 25.9491 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 13.8989ms | 2.6695ms | 374.6029 Ops/s | 379.1030 Ops/s | |
test_dqn_speed | 16.0569ms | 7.3372ms | 136.2916 Ops/s | 131.3168 Ops/s | |
test_ddpg_speed | 20.8676ms | 14.3344ms | 69.7622 Ops/s | 62.6039 Ops/s | |
test_sac_speed | 37.1564ms | 28.8375ms | 34.6770 Ops/s | 33.5912 Ops/s | |
test_redq_speed | 26.5272ms | 15.7061ms | 63.6695 Ops/s | 63.6825 Ops/s | |
test_redq_deprec_speed | 33.9492ms | 25.7463ms | 38.8405 Ops/s | 37.9636 Ops/s | |
test_td3_speed | 28.8913ms | 19.7973ms | 50.5119 Ops/s | 48.8921 Ops/s | |
test_cql_speed | 93.5330ms | 86.3713ms | 11.5779 Ops/s | 10.0658 Ops/s | |
test_a2c_speed | 33.8200ms | 26.8737ms | 37.2111 Ops/s | 36.2486 Ops/s | |
test_ppo_speed | 29.1337ms | 26.6606ms | 37.5086 Ops/s | 36.7487 Ops/s | |
test_reinforce_speed | 35.6343ms | 26.5262ms | 37.6985 Ops/s | 38.3879 Ops/s | |
test_iql_speed | 72.4773ms | 63.9367ms | 15.6405 Ops/s | 15.6691 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.0661ms | 2.8128ms | 355.5224 Ops/s | 357.1878 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 8.7792ms | 0.5282ms | 1.8932 KOps/s | 1.8876 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 8.8068ms | 0.5024ms | 1.9904 KOps/s | 2.0120 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.7381ms | 2.9380ms | 340.3656 Ops/s | 365.0922 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 8.7373ms | 0.5288ms | 1.8912 KOps/s | 1.9330 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 5.6081ms | 0.4936ms | 2.0259 KOps/s | 2.0109 KOps/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.7527ms | 2.6121ms | 382.8353 Ops/s | 402.1104 Ops/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 9.0261ms | 0.6731ms | 1.4858 KOps/s | 1.4926 KOps/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 10.1498ms | 0.6346ms | 1.5758 KOps/s | 1.5888 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.1848ms | 2.9248ms | 341.9030 Ops/s | 364.9601 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6726ms | 0.5228ms | 1.9129 KOps/s | 1.8661 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.3720ms | 0.4997ms | 2.0012 KOps/s | 1.9718 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.1334ms | 2.7876ms | 358.7259 Ops/s | 362.0951 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6456ms | 0.5157ms | 1.9390 KOps/s | 1.8786 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6617ms | 0.4883ms | 2.0480 KOps/s | 1.6607 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.7089ms | 2.4797ms | 403.2670 Ops/s | 404.5776 Ops/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 8.6051ms | 0.6582ms | 1.5194 KOps/s | 1.4835 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8623ms | 0.6200ms | 1.6128 KOps/s | 1.5815 KOps/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 94.3639ms | 9.7777ms | 102.2741 Ops/s | 83.0481 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 15.8764ms | 13.6388ms | 73.3201 Ops/s | 72.6085 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 5.8849ms | 3.3071ms | 302.3797 Ops/s | 298.2234 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 95.9108ms | 11.5940ms | 86.2514 Ops/s | 100.0549 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 19.6264ms | 13.6687ms | 73.1599 Ops/s | 72.6467 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 5.8131ms | 3.2660ms | 306.1881 Ops/s | 302.4910 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1051s | 10.4102ms | 96.0601 Ops/s | 82.7517 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1073s | 15.8047ms | 63.2722 Ops/s | 71.1787 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 6.1340ms | 3.5085ms | 285.0209 Ops/s | 280.2432 Ops/s |
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_single | 0.1211s | 0.1205s | 8.2955 Ops/s | 8.4598 Ops/s | |
test_sync | 97.1235ms | 96.5872ms | 10.3533 Ops/s | 10.3943 Ops/s | |
test_async | 0.1833s | 92.3810ms | 10.8247 Ops/s | 10.8199 Ops/s | |
test_single_pixels | 0.1406s | 0.1406s | 7.1143 Ops/s | 7.0517 Ops/s | |
test_sync_pixels | 83.9111ms | 81.7513ms | 12.2322 Ops/s | 12.8437 Ops/s | |
test_async_pixels | 0.1779s | 75.5902ms | 13.2292 Ops/s | 13.9751 Ops/s | |
test_simple | 0.9065s | 0.8377s | 1.1937 Ops/s | 1.1878 Ops/s | |
test_transformed | 1.1523s | 1.0864s | 0.9204 Ops/s | 0.8927 Ops/s | |
test_serial | 2.2948s | 2.2912s | 0.4365 Ops/s | 0.4222 Ops/s | |
test_parallel | 1.9673s | 1.8862s | 0.5302 Ops/s | 0.5233 Ops/s | |
test_step_mdp_speed[True-True-True-True-True] | 95.6310μs | 32.9036μs | 30.3918 KOps/s | 30.9892 KOps/s | |
test_step_mdp_speed[True-True-True-True-False] | 35.9710μs | 19.6809μs | 50.8108 KOps/s | 50.6306 KOps/s | |
test_step_mdp_speed[True-True-True-False-True] | 41.9010μs | 18.7712μs | 53.2732 KOps/s | 55.0939 KOps/s | |
test_step_mdp_speed[True-True-True-False-False] | 26.9810μs | 11.2571μs | 88.8326 KOps/s | 90.5844 KOps/s | |
test_step_mdp_speed[True-True-False-True-True] | 60.9710μs | 34.7779μs | 28.7539 KOps/s | 29.3854 KOps/s | |
test_step_mdp_speed[True-True-False-True-False] | 41.6300μs | 21.6848μs | 46.1152 KOps/s | 47.7242 KOps/s | |
test_step_mdp_speed[True-True-False-False-True] | 44.4700μs | 20.5471μs | 48.6687 KOps/s | 50.2006 KOps/s | |
test_step_mdp_speed[True-True-False-False-False] | 32.3210μs | 13.1995μs | 75.7603 KOps/s | 78.2562 KOps/s | |
test_step_mdp_speed[True-False-True-True-True] | 0.1058ms | 36.9042μs | 27.0972 KOps/s | 27.5794 KOps/s | |
test_step_mdp_speed[True-False-True-True-False] | 53.6500μs | 23.9871μs | 41.6892 KOps/s | 43.0276 KOps/s | |
test_step_mdp_speed[True-False-True-False-True] | 47.9200μs | 20.7132μs | 48.2784 KOps/s | 49.9350 KOps/s | |
test_step_mdp_speed[True-False-True-False-False] | 32.2100μs | 13.0987μs | 76.3434 KOps/s | 77.6529 KOps/s | |
test_step_mdp_speed[True-False-False-True-True] | 53.4900μs | 38.3958μs | 26.0445 KOps/s | 26.3099 KOps/s | |
test_step_mdp_speed[True-False-False-True-False] | 42.7010μs | 25.2662μs | 39.5786 KOps/s | 39.8297 KOps/s | |
test_step_mdp_speed[True-False-False-False-True] | 42.2600μs | 22.4106μs | 44.6217 KOps/s | 45.8205 KOps/s | |
test_step_mdp_speed[True-False-False-False-False] | 38.1500μs | 15.1099μs | 66.1820 KOps/s | 67.7022 KOps/s | |
test_step_mdp_speed[False-True-True-True-True] | 58.5310μs | 36.8789μs | 27.1158 KOps/s | 27.6086 KOps/s | |
test_step_mdp_speed[False-True-True-True-False] | 54.3010μs | 23.4012μs | 42.7329 KOps/s | 42.6589 KOps/s | |
test_step_mdp_speed[False-True-True-False-True] | 43.0300μs | 24.6811μs | 40.5168 KOps/s | 41.2483 KOps/s | |
test_step_mdp_speed[False-True-True-False-False] | 30.4000μs | 14.9832μs | 66.7412 KOps/s | 67.4056 KOps/s | |
test_step_mdp_speed[False-True-False-True-True] | 68.6310μs | 38.4689μs | 25.9950 KOps/s | 26.5272 KOps/s | |
test_step_mdp_speed[False-True-False-True-False] | 50.1910μs | 25.3677μs | 39.4202 KOps/s | 39.7128 KOps/s | |
test_step_mdp_speed[False-True-False-False-True] | 53.9600μs | 25.8754μs | 38.6467 KOps/s | 39.2338 KOps/s | |
test_step_mdp_speed[False-True-False-False-False] | 36.4810μs | 16.7495μs | 59.7034 KOps/s | 61.1898 KOps/s | |
test_step_mdp_speed[False-False-True-True-True] | 61.3810μs | 40.6870μs | 24.5779 KOps/s | 25.2061 KOps/s | |
test_step_mdp_speed[False-False-True-True-False] | 47.5110μs | 27.2064μs | 36.7561 KOps/s | 36.9147 KOps/s | |
test_step_mdp_speed[False-False-True-False-True] | 46.9500μs | 26.3610μs | 37.9349 KOps/s | 38.5072 KOps/s | |
test_step_mdp_speed[False-False-True-False-False] | 34.1610μs | 16.8025μs | 59.5150 KOps/s | 59.9176 KOps/s | |
test_step_mdp_speed[False-False-False-True-True] | 60.4410μs | 41.2941μs | 24.2165 KOps/s | 24.3771 KOps/s | |
test_step_mdp_speed[False-False-False-True-False] | 47.2800μs | 28.8983μs | 34.6041 KOps/s | 34.5347 KOps/s | |
test_step_mdp_speed[False-False-False-False-True] | 46.3610μs | 28.0033μs | 35.7101 KOps/s | 36.8950 KOps/s | |
test_step_mdp_speed[False-False-False-False-False] | 41.9300μs | 18.5888μs | 53.7958 KOps/s | 54.9019 KOps/s | |
test_values[generalized_advantage_estimate-True-True] | 26.9177ms | 25.6825ms | 38.9370 Ops/s | 40.0381 Ops/s | |
test_values[vec_generalized_advantage_estimate-True-True] | 82.7422ms | 3.2219ms | 310.3743 Ops/s | 311.4832 Ops/s | |
test_values[td0_return_estimate-False-False] | 99.3020μs | 61.6810μs | 16.2124 KOps/s | 16.0253 KOps/s | |
test_values[td1_return_estimate-False-False] | 57.3700ms | 55.0607ms | 18.1618 Ops/s | 18.7011 Ops/s | |
test_values[vec_td1_return_estimate-False-False] | 2.0860ms | 1.7878ms | 559.3554 Ops/s | 563.4426 Ops/s | |
test_values[td_lambda_return_estimate-True-False] | 90.7529ms | 89.4448ms | 11.1801 Ops/s | 11.7416 Ops/s | |
test_values[vec_td_lambda_return_estimate-True-False] | 4.0702ms | 1.8397ms | 543.5714 Ops/s | 553.8366 Ops/s | |
test_gae_speed[generalized_advantage_estimate-False-1-512] | 25.5065ms | 25.3299ms | 39.4790 Ops/s | 41.7734 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.8666ms | 0.7021ms | 1.4244 KOps/s | 1.4088 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7251ms | 0.6556ms | 1.5253 KOps/s | 1.5188 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5170ms | 1.4567ms | 686.4599 Ops/s | 684.2040 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.9343ms | 0.6732ms | 1.4853 KOps/s | 1.4787 KOps/s | |
test_dqn_speed | 7.5939ms | 7.2275ms | 138.3611 Ops/s | 132.5893 Ops/s | |
test_ddpg_speed | 14.8358ms | 14.1002ms | 70.9210 Ops/s | 68.6949 Ops/s | |
test_sac_speed | 28.5679ms | 28.2829ms | 35.3571 Ops/s | 31.5892 Ops/s | |
test_redq_speed | 14.4193ms | 13.0993ms | 76.3400 Ops/s | 75.6365 Ops/s | |
test_redq_deprec_speed | 24.4171ms | 23.4315ms | 42.6776 Ops/s | 41.6607 Ops/s | |
test_td3_speed | 28.1682ms | 19.2556ms | 51.9329 Ops/s | 50.8344 Ops/s | |
test_cql_speed | 84.5048ms | 82.4972ms | 12.1216 Ops/s | 12.0749 Ops/s | |
test_a2c_speed | 27.8919ms | 26.8173ms | 37.2894 Ops/s | 37.1492 Ops/s | |
test_ppo_speed | 27.9535ms | 26.5351ms | 37.6859 Ops/s | 37.1229 Ops/s | |
test_reinforce_speed | 27.4771ms | 26.2329ms | 38.1200 Ops/s | 38.3855 Ops/s | |
test_iql_speed | 0.1551s | 61.5638ms | 16.2433 Ops/s | 17.4909 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.7879ms | 3.6125ms | 276.8143 Ops/s | 277.2711 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9979ms | 0.8492ms | 1.1776 KOps/s | 1.1793 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.9371ms | 0.8227ms | 1.2156 KOps/s | 1.2175 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.9437ms | 3.6043ms | 277.4469 Ops/s | 278.3521 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.9709ms | 0.8386ms | 1.1924 KOps/s | 1.1979 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.9268ms | 0.8119ms | 1.2317 KOps/s | 1.2353 KOps/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.4628ms | 3.3018ms | 302.8682 Ops/s | 303.2476 Ops/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0766ms | 0.9678ms | 1.0333 KOps/s | 1.0362 KOps/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.0582ms | 0.9411ms | 1.0626 KOps/s | 1.0626 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.7566ms | 3.6303ms | 275.4576 Ops/s | 276.9499 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9457ms | 0.8514ms | 1.1746 KOps/s | 1.1801 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.9330ms | 0.8247ms | 1.2126 KOps/s | 1.2158 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.8262ms | 3.6143ms | 276.6798 Ops/s | 277.0607 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.9958ms | 0.8364ms | 1.1956 KOps/s | 1.1977 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.9401ms | 0.8112ms | 1.2328 KOps/s | 1.2328 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.4376ms | 3.3032ms | 302.7327 Ops/s | 303.4445 Ops/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0949ms | 0.9646ms | 1.0367 KOps/s | 1.0347 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.0663ms | 0.9400ms | 1.0638 KOps/s | 1.0600 KOps/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1371s | 10.2980ms | 97.1060 Ops/s | 98.7327 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 17.1878ms | 14.2348ms | 70.2502 Ops/s | 70.6935 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 5.4114ms | 3.3755ms | 296.2490 Ops/s | 294.4101 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1193s | 10.0033ms | 99.9674 Ops/s | 82.1481 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 16.7113ms | 14.2182ms | 70.3326 Ops/s | 70.8107 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 5.9201ms | 3.3935ms | 294.6795 Ops/s | 294.8993 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1185s | 10.1418ms | 98.6014 Ops/s | 98.5261 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1254s | 16.6141ms | 60.1899 Ops/s | 69.8672 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 6.3268ms | 3.5901ms | 278.5474 Ops/s | 277.9814 Ops/s |
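The Change column is empty in both tables above, but it can be recomputed from the Ops and Ops on Repo HEAD columns. The following is a minimal sketch, assuming the change is meant as the relative difference between the two throughputs; `parse_ops` and `relative_change` are hypothetical helper names, not part of the benchmark tooling, and only the two units that appear in the tables are handled.

```python
def parse_ops(cell: str) -> float:
    """Convert a table cell such as '46.6967 KOps/s' to operations per second."""
    value, unit = cell.split()
    # Only the units that appear in the tables are supported in this sketch.
    scale = {"Ops/s": 1.0, "KOps/s": 1e3}[unit]
    return float(value) * scale


def relative_change(ops: str, ops_on_head: str) -> float:
    """Fractional throughput change of this PR relative to repo HEAD."""
    head = parse_ops(ops_on_head)
    return (parse_ops(ops) - head) / head


# Values copied from the test_single row of the first table.
change = relative_change("16.4916 Ops/s", "16.2024 Ops/s")
print(f"test_single: {change:+.2%}")
```

A positive value means the PR branch is faster than HEAD for that benchmark. Differences of a few percent are within what run-to-run noise can produce on shared CI machines, so they are best read as indicative only.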
…device-collectors
vmoens changed the title from "[WIP, Feature] Fine control over devices in collectors" to "[Feature] Fine control over devices in collectors" Jan 30, 2024
Labels
bc breaking: backward compatibility breaking change
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement: New feature or request
Environments: Adds or modifies an environment wrapper
Refactoring: Refactoring of an existing feature
Description
In this PR I propose a new API for the collector devices.
It offers more fine-grained device allocation for single-process or multiprocessed collectors: one can now specify a generic device for the collector through the `device` argument, or set the storing, env or policy device individually. The policy/env/storage devices always take precedence over the main device if they are passed (e.g., `device="cpu", policy_device="cuda"` will execute the policy on cuda, but leave the env and storage on `"cpu"`).
Importantly, mixed devices will now be allowed in modules, environments, specs and collectors (neither the policy, the env nor the tensordict storage used to pass data will need to be on a single device).
I.e., a policy can now have a None device, read its input on one device and produce its output on another device (or on more than one device).
Restrictions on CompositeSpec devices are lifted, which also means that the device of a composite spec must now be specified explicitly if we don't want it to be None (as is the case for TensorDict).
These changes are mildly BC-breaking.
Distributed collectors now always deliver data on the default device, and all the devices passed are dispatched to the remote collectors.
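The precedence rule described above, where a component-specific device overrides the generic one, can be illustrated with a small pure-Python sketch. This is not the collector's actual implementation: `resolve_devices` and `ResolvedDevices` are hypothetical names, and the `env_device` and `storing_device` keyword names are assumptions modeled on the `policy_device` argument quoted in the description.

```python
from typing import NamedTuple, Optional


class ResolvedDevices(NamedTuple):
    """Device each collector component would end up on."""

    policy: Optional[str]
    env: Optional[str]
    storing: Optional[str]


def resolve_devices(
    device: Optional[str] = None,
    policy_device: Optional[str] = None,
    env_device: Optional[str] = None,      # assumed keyword name
    storing_device: Optional[str] = None,  # assumed keyword name
) -> ResolvedDevices:
    """Hypothetical helper: a specific device wins over the generic `device`."""

    def pick(specific: Optional[str]) -> Optional[str]:
        # Fall back to the generic device only when no specific one is given.
        return specific if specific is not None else device

    return ResolvedDevices(
        policy=pick(policy_device),
        env=pick(env_device),
        storing=pick(storing_device),
    )


# The example from the description: policy on cuda, env and storage on cpu.
resolved = resolve_devices(device="cpu", policy_device="cuda")
print(resolved)
```

When nothing is passed, every field stays `None` in this sketch, which mirrors the idea that a component is no longer required to be tied to a single device.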
TODO:
cc @skandermoalla