[BugFix] Fix sampling without replacement with ndim storages #1999
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1999
Note: Links to docs will display an error until the docs builds have been completed.
❌ 6 New Failures
As of commit 887b83e with merge base fe6c070: NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_single | 60.9970ms | 60.2178ms | 16.6064 Ops/s | 16.7231 Ops/s | |
test_sync | 33.6331ms | 32.8454ms | 30.4456 Ops/s | 30.8374 Ops/s | |
test_async | 62.6889ms | 29.9776ms | 33.3583 Ops/s | 33.9784 Ops/s | |
test_simple | 0.4857s | 0.4249s | 2.3533 Ops/s | 2.3710 Ops/s | |
test_transformed | 0.6215s | 0.5706s | 1.7524 Ops/s | 1.7230 Ops/s | |
test_serial | 1.4759s | 1.4107s | 0.7089 Ops/s | 0.7215 Ops/s | |
test_parallel | 1.4465s | 1.3934s | 0.7177 Ops/s | 0.7234 Ops/s | |
test_step_mdp_speed[True-True-True-True-True] | 0.2390ms | 21.3347μs | 46.8721 KOps/s | 46.1631 KOps/s | |
test_step_mdp_speed[True-True-True-True-False] | 38.2710μs | 13.0412μs | 76.6801 KOps/s | 76.8079 KOps/s | |
test_step_mdp_speed[True-True-True-False-True] | 34.7850μs | 12.4971μs | 80.0185 KOps/s | 78.5767 KOps/s | |
test_step_mdp_speed[True-True-True-False-False] | 27.9320μs | 7.6123μs | 131.3657 KOps/s | 129.4680 KOps/s | |
test_step_mdp_speed[True-True-False-True-True] | 53.6790μs | 23.0498μs | 43.3843 KOps/s | 43.6090 KOps/s | |
test_step_mdp_speed[True-True-False-True-False] | 42.0680μs | 14.3931μs | 69.4779 KOps/s | 68.8721 KOps/s | |
test_step_mdp_speed[True-True-False-False-True] | 65.9600μs | 13.6471μs | 73.2756 KOps/s | 72.1810 KOps/s | |
test_step_mdp_speed[True-True-False-False-False] | 59.2800μs | 8.7845μs | 113.8372 KOps/s | 111.6352 KOps/s | |
test_step_mdp_speed[True-False-True-True-True] | 54.9020μs | 24.0845μs | 41.5205 KOps/s | 40.8037 KOps/s | |
test_step_mdp_speed[True-False-True-True-False] | 68.4370μs | 15.3794μs | 65.0220 KOps/s | 62.9618 KOps/s | |
test_step_mdp_speed[True-False-True-False-True] | 46.8270μs | 13.7119μs | 72.9291 KOps/s | 72.1179 KOps/s | |
test_step_mdp_speed[True-False-True-False-False] | 28.6430μs | 8.7728μs | 113.9892 KOps/s | 111.8438 KOps/s | |
test_step_mdp_speed[True-False-False-True-True] | 51.0140μs | 25.3054μs | 39.5173 KOps/s | 39.1193 KOps/s | |
test_step_mdp_speed[True-False-False-True-False] | 42.7590μs | 16.8501μs | 59.3467 KOps/s | 58.6947 KOps/s | |
test_step_mdp_speed[True-False-False-False-True] | 48.0390μs | 14.8598μs | 67.2958 KOps/s | 66.1725 KOps/s | |
test_step_mdp_speed[True-False-False-False-False] | 41.2260μs | 10.0394μs | 99.6071 KOps/s | 98.3348 KOps/s | |
test_step_mdp_speed[False-True-True-True-True] | 50.8550μs | 23.9702μs | 41.7185 KOps/s | 40.9585 KOps/s | |
test_step_mdp_speed[False-True-True-True-False] | 41.7080μs | 15.5990μs | 64.1065 KOps/s | 63.3907 KOps/s | |
test_step_mdp_speed[False-True-True-False-True] | 50.6640μs | 16.1261μs | 62.0112 KOps/s | 61.5763 KOps/s | |
test_step_mdp_speed[False-True-True-False-False] | 43.1400μs | 10.0878μs | 99.1293 KOps/s | 98.6607 KOps/s | |
test_step_mdp_speed[False-True-False-True-True] | 65.1910μs | 25.8730μs | 38.6504 KOps/s | 38.7059 KOps/s | |
test_step_mdp_speed[False-True-False-True-False] | 64.5000μs | 16.9242μs | 59.0871 KOps/s | 58.6996 KOps/s | |
test_step_mdp_speed[False-True-False-False-True] | 39.9740μs | 17.2545μs | 57.9560 KOps/s | 58.2829 KOps/s | |
test_step_mdp_speed[False-True-False-False-False] | 38.9920μs | 11.3143μs | 88.3834 KOps/s | 88.3810 KOps/s | |
test_step_mdp_speed[False-False-True-True-True] | 90.0090μs | 26.3691μs | 37.9232 KOps/s | 36.6376 KOps/s | |
test_step_mdp_speed[False-False-True-True-False] | 57.1360μs | 18.0055μs | 55.5385 KOps/s | 55.0027 KOps/s | |
test_step_mdp_speed[False-False-True-False-True] | 47.0970μs | 17.2524μs | 57.9630 KOps/s | 57.4908 KOps/s | |
test_step_mdp_speed[False-False-True-False-False] | 56.6250μs | 11.3277μs | 88.2790 KOps/s | 88.1337 KOps/s | |
test_step_mdp_speed[False-False-False-True-True] | 54.4610μs | 27.2829μs | 36.6530 KOps/s | 35.3251 KOps/s | |
test_step_mdp_speed[False-False-False-True-False] | 55.8040μs | 19.1503μs | 52.2185 KOps/s | 51.8672 KOps/s | |
test_step_mdp_speed[False-False-False-False-True] | 60.8140μs | 18.1366μs | 55.1372 KOps/s | 54.7685 KOps/s | |
test_step_mdp_speed[False-False-False-False-False] | 56.2240μs | 12.2703μs | 81.4973 KOps/s | 80.4569 KOps/s | |
test_values[generalized_advantage_estimate-True-True] | 11.0056ms | 9.5661ms | 104.5357 Ops/s | 104.0785 Ops/s | |
test_values[vec_generalized_advantage_estimate-True-True] | 35.9879ms | 33.5975ms | 29.7641 Ops/s | 29.9755 Ops/s | |
test_values[td0_return_estimate-False-False] | 0.2540ms | 0.1843ms | 5.4269 KOps/s | 5.6825 KOps/s | |
test_values[td1_return_estimate-False-False] | 27.7087ms | 24.3411ms | 41.0827 Ops/s | 42.4168 Ops/s | |
test_values[vec_td1_return_estimate-False-False] | 35.0941ms | 33.6865ms | 29.6855 Ops/s | 29.8909 Ops/s | |
test_values[td_lambda_return_estimate-True-False] | 35.6119ms | 34.8973ms | 28.6555 Ops/s | 29.1999 Ops/s | |
test_values[vec_td_lambda_return_estimate-True-False] | 35.0851ms | 33.6788ms | 29.6923 Ops/s | 29.8647 Ops/s | |
test_gae_speed[generalized_advantage_estimate-False-1-512] | 11.5883ms | 8.3932ms | 119.1436 Ops/s | 120.2800 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.4637ms | 2.0577ms | 485.9837 Ops/s | 512.0872 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.6700ms | 0.3578ms | 2.7946 KOps/s | 2.8168 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 50.1522ms | 46.1261ms | 21.6797 Ops/s | 23.7528 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 3.7624ms | 3.0706ms | 325.6667 Ops/s | 330.5338 Ops/s | |
test_dqn_speed | 6.9518ms | 1.3785ms | 725.4361 Ops/s | 743.9947 Ops/s | |
test_ddpg_speed | 3.4652ms | 2.7274ms | 366.6543 Ops/s | 376.8843 Ops/s | |
test_sac_speed | 9.2401ms | 8.2822ms | 120.7415 Ops/s | 121.6956 Ops/s | |
test_redq_speed | 14.4575ms | 13.1785ms | 75.8814 Ops/s | 76.2355 Ops/s | |
test_redq_deprec_speed | 79.1337ms | 14.3340ms | 69.7643 Ops/s | 77.2293 Ops/s | |
test_td3_speed | 8.7955ms | 8.3411ms | 119.8885 Ops/s | 122.8891 Ops/s | |
test_cql_speed | 38.0072ms | 36.4701ms | 27.4197 Ops/s | 27.7541 Ops/s | |
test_a2c_speed | 8.8137ms | 7.5616ms | 132.2471 Ops/s | 135.4036 Ops/s | |
test_ppo_speed | 9.0147ms | 8.0887ms | 123.6298 Ops/s | 130.0478 Ops/s | |
test_reinforce_speed | 12.6824ms | 7.2544ms | 137.8479 Ops/s | 150.2258 Ops/s | |
test_iql_speed | 34.1959ms | 33.2739ms | 30.0536 Ops/s | 30.4747 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.6069ms | 2.3975ms | 417.1067 Ops/s | 428.4256 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.0198ms | 0.5160ms | 1.9379 KOps/s | 1.9775 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7758ms | 0.4885ms | 2.0470 KOps/s | 2.0873 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.6707ms | 2.3866ms | 419.0128 Ops/s | 429.7131 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.0749ms | 0.4975ms | 2.0099 KOps/s | 1.9948 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7390ms | 0.4730ms | 2.1142 KOps/s | 2.1168 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.8467ms | 1.3215ms | 756.7024 Ops/s | 792.0921 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.7422ms | 1.2413ms | 805.5826 Ops/s | 837.5714 Ops/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.6665ms | 2.3575ms | 424.1754 Ops/s | 437.7523 Ops/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9961ms | 0.6134ms | 1.6303 KOps/s | 1.6302 KOps/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9390ms | 0.5893ms | 1.6970 KOps/s | 1.6898 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.3547ms | 2.2686ms | 440.7955 Ops/s | 436.1876 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8623ms | 0.5204ms | 1.9216 KOps/s | 1.6244 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.9079ms | 0.4897ms | 2.0420 KOps/s | 2.1212 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.8777ms | 2.3663ms | 422.6050 Ops/s | 443.9349 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6147ms | 0.4954ms | 2.0188 KOps/s | 2.0436 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7587ms | 0.4842ms | 2.0653 KOps/s | 2.0933 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.6361ms | 2.4056ms | 415.7010 Ops/s | 417.0885 Ops/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1827ms | 0.6212ms | 1.6099 KOps/s | 1.3480 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8668ms | 0.5973ms | 1.6743 KOps/s | 1.6744 KOps/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 91.4688ms | 7.0329ms | 142.1887 Ops/s | 178.9965 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 16.7212ms | 12.3563ms | 80.9303 Ops/s | 81.4689 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 4.3326ms | 1.1279ms | 886.5911 Ops/s | 983.6858 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 94.5798ms | 7.1364ms | 140.1264 Ops/s | 136.1541 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 14.8538ms | 12.4244ms | 80.4867 Ops/s | 81.9493 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 4.1833ms | 1.1298ms | 885.0827 Ops/s | 923.0552 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 97.1122ms | 5.8168ms | 171.9152 Ops/s | 135.5284 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1008s | 14.3977ms | 69.4556 Ops/s | 79.3424 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.1144ms | 1.3781ms | 725.6343 Ops/s | 706.6403 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_single | 0.1157s | 0.1155s | 8.6563 Ops/s | 8.5871 Ops/s | |
test_sync | 95.2852ms | 95.0553ms | 10.5202 Ops/s | 10.3786 Ops/s | |
test_async | 0.1791s | 90.7689ms | 11.0170 Ops/s | 10.9784 Ops/s | |
test_single_pixels | 0.1259s | 0.1254s | 7.9762 Ops/s | 7.9111 Ops/s | |
test_sync_pixels | 81.9354ms | 80.7189ms | 12.3887 Ops/s | 12.2627 Ops/s | |
test_async_pixels | 0.1495s | 65.3788ms | 15.2955 Ops/s | 15.2790 Ops/s | |
test_simple | 0.8925s | 0.8334s | 1.1998 Ops/s | 1.1834 Ops/s | |
test_transformed | 1.1196s | 1.0631s | 0.9406 Ops/s | 0.9267 Ops/s | |
test_serial | 2.4834s | 2.4270s | 0.4120 Ops/s | 0.4018 Ops/s | |
test_parallel | 2.1678s | 2.1075s | 0.4745 Ops/s | 0.4710 Ops/s | |
test_step_mdp_speed[True-True-True-True-True] | 0.1011ms | 33.4079μs | 29.9330 KOps/s | 29.7217 KOps/s | |
test_step_mdp_speed[True-True-True-True-False] | 0.1631ms | 20.1140μs | 49.7167 KOps/s | 49.2096 KOps/s | |
test_step_mdp_speed[True-True-True-False-True] | 35.9310μs | 19.0081μs | 52.6091 KOps/s | 53.0392 KOps/s | |
test_step_mdp_speed[True-True-True-False-False] | 29.7410μs | 11.1241μs | 89.8948 KOps/s | 88.6914 KOps/s | |
test_step_mdp_speed[True-True-False-True-True] | 58.2610μs | 34.6945μs | 28.8230 KOps/s | 28.4584 KOps/s | |
test_step_mdp_speed[True-True-False-True-False] | 46.8910μs | 21.8295μs | 45.8095 KOps/s | 45.2638 KOps/s | |
test_step_mdp_speed[True-True-False-False-True] | 40.2210μs | 20.8159μs | 48.0402 KOps/s | 47.7399 KOps/s | |
test_step_mdp_speed[True-True-False-False-False] | 30.6110μs | 13.1026μs | 76.3206 KOps/s | 74.2729 KOps/s | |
test_step_mdp_speed[True-False-True-True-True] | 61.3910μs | 37.2876μs | 26.8186 KOps/s | 26.6475 KOps/s | |
test_step_mdp_speed[True-False-True-True-False] | 45.8010μs | 23.9040μs | 41.8340 KOps/s | 41.3479 KOps/s | |
test_step_mdp_speed[True-False-True-False-True] | 40.1600μs | 21.2214μs | 47.1222 KOps/s | 48.1805 KOps/s | |
test_step_mdp_speed[True-False-True-False-False] | 32.1600μs | 13.2127μs | 75.6846 KOps/s | 75.4623 KOps/s | |
test_step_mdp_speed[True-False-False-True-True] | 58.6010μs | 38.7111μs | 25.8324 KOps/s | 25.3458 KOps/s | |
test_step_mdp_speed[True-False-False-True-False] | 45.9710μs | 25.6258μs | 39.0231 KOps/s | 38.8261 KOps/s | |
test_step_mdp_speed[True-False-False-False-True] | 45.9710μs | 22.1634μs | 45.1194 KOps/s | 44.3898 KOps/s | |
test_step_mdp_speed[True-False-False-False-False] | 34.9000μs | 15.0083μs | 66.6296 KOps/s | 66.4990 KOps/s | |
test_step_mdp_speed[False-True-True-True-True] | 64.3320μs | 37.5432μs | 26.6359 KOps/s | 26.6353 KOps/s | |
test_step_mdp_speed[False-True-True-True-False] | 43.9400μs | 23.6812μs | 42.2277 KOps/s | 42.0344 KOps/s | |
test_step_mdp_speed[False-True-True-False-True] | 50.3410μs | 24.2223μs | 41.2842 KOps/s | 40.3293 KOps/s | |
test_step_mdp_speed[False-True-True-False-False] | 36.6300μs | 15.0265μs | 66.5491 KOps/s | 67.8245 KOps/s | |
test_step_mdp_speed[False-True-False-True-True] | 64.1010μs | 39.8163μs | 25.1153 KOps/s | 25.3613 KOps/s | |
test_step_mdp_speed[False-True-False-True-False] | 59.6210μs | 25.6759μs | 38.9470 KOps/s | 38.5787 KOps/s | |
test_step_mdp_speed[False-True-False-False-True] | 50.9110μs | 26.3309μs | 37.9782 KOps/s | 37.0734 KOps/s | |
test_step_mdp_speed[False-True-False-False-False] | 36.2410μs | 16.7416μs | 59.7315 KOps/s | 58.4528 KOps/s | |
test_step_mdp_speed[False-False-True-True-True] | 67.5120μs | 40.9931μs | 24.3944 KOps/s | 24.4408 KOps/s | |
test_step_mdp_speed[False-False-True-True-False] | 49.8710μs | 27.4561μs | 36.4217 KOps/s | 35.5320 KOps/s | |
test_step_mdp_speed[False-False-True-False-True] | 47.5710μs | 26.5084μs | 37.7239 KOps/s | 37.8663 KOps/s | |
test_step_mdp_speed[False-False-True-False-False] | 35.6810μs | 16.6428μs | 60.0862 KOps/s | 58.3384 KOps/s | |
test_step_mdp_speed[False-False-False-True-True] | 68.5210μs | 41.9975μs | 23.8110 KOps/s | 23.7610 KOps/s | |
test_step_mdp_speed[False-False-False-True-False] | 52.5910μs | 29.2207μs | 34.2223 KOps/s | 33.9195 KOps/s | |
test_step_mdp_speed[False-False-False-False-True] | 42.3510μs | 28.0557μs | 35.6434 KOps/s | 35.8501 KOps/s | |
test_step_mdp_speed[False-False-False-False-False] | 41.0500μs | 18.4426μs | 54.2222 KOps/s | 53.6272 KOps/s | |
test_values[generalized_advantage_estimate-True-True] | 26.8668ms | 26.4790ms | 37.7657 Ops/s | 35.4544 Ops/s | |
test_values[vec_generalized_advantage_estimate-True-True] | 86.7080ms | 3.3153ms | 301.6361 Ops/s | 288.4627 Ops/s | |
test_values[td0_return_estimate-False-False] | 0.1028ms | 66.0000μs | 15.1515 KOps/s | 14.6590 KOps/s | |
test_values[td1_return_estimate-False-False] | 56.9697ms | 55.9616ms | 17.8694 Ops/s | 17.1131 Ops/s | |
test_values[vec_td1_return_estimate-False-False] | 2.1739ms | 1.7830ms | 560.8656 Ops/s | 557.2825 Ops/s | |
test_values[td_lambda_return_estimate-True-False] | 90.9388ms | 88.8689ms | 11.2525 Ops/s | 10.8420 Ops/s | |
test_values[vec_td_lambda_return_estimate-True-False] | 2.1456ms | 1.7791ms | 562.0888 Ops/s | 558.6854 Ops/s | |
test_gae_speed[generalized_advantage_estimate-False-1-512] | 26.5052ms | 26.0341ms | 38.4112 Ops/s | 38.9473 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.9150ms | 0.7204ms | 1.3881 KOps/s | 1.3615 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7359ms | 0.6868ms | 1.4560 KOps/s | 1.4793 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.6172ms | 1.4701ms | 680.2051 Ops/s | 674.5316 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.9713ms | 0.6882ms | 1.4530 KOps/s | 1.4363 KOps/s | |
test_dqn_speed | 8.1140ms | 1.4716ms | 679.5484 Ops/s | 617.5489 Ops/s | |
test_ddpg_speed | 3.0900ms | 2.7521ms | 363.3631 Ops/s | 358.5430 Ops/s | |
test_sac_speed | 8.7136ms | 8.1406ms | 122.8409 Ops/s | 121.1985 Ops/s | |
test_redq_speed | 11.0077ms | 10.1408ms | 98.6112 Ops/s | 96.8959 Ops/s | |
test_redq_deprec_speed | 11.4759ms | 11.0327ms | 90.6393 Ops/s | 88.4069 Ops/s | |
test_td3_speed | 15.8196ms | 8.1666ms | 122.4502 Ops/s | 121.3105 Ops/s | |
test_cql_speed | 26.4805ms | 25.2004ms | 39.6819 Ops/s | 39.3810 Ops/s | |
test_a2c_speed | 5.7727ms | 5.4812ms | 182.4413 Ops/s | 179.5641 Ops/s | |
test_ppo_speed | 6.2156ms | 5.8546ms | 170.8062 Ops/s | 169.4295 Ops/s | |
test_reinforce_speed | 4.7363ms | 4.5203ms | 221.2241 Ops/s | 221.2841 Ops/s | |
test_iql_speed | 19.8145ms | 19.2566ms | 51.9303 Ops/s | 52.0642 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.0069ms | 2.9055ms | 344.1691 Ops/s | 344.1852 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.2394ms | 0.5401ms | 1.8516 KOps/s | 1.8222 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6901ms | 0.5154ms | 1.9404 KOps/s | 1.9292 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.1000ms | 2.9206ms | 342.3975 Ops/s | 341.8731 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.2629ms | 0.5335ms | 1.8743 KOps/s | 1.8384 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6632ms | 0.5077ms | 1.9696 KOps/s | 1.9450 KOps/s | |
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.6433ms | 1.5370ms | 650.6259 Ops/s | 646.4109 Ops/s | |
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6269ms | 1.4603ms | 684.7912 Ops/s | 670.2791 Ops/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.1372ms | 2.9977ms | 333.5885 Ops/s | 331.3470 Ops/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9280ms | 0.6666ms | 1.5001 KOps/s | 1.4844 KOps/s | |
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.1082s | 0.7242ms | 1.3808 KOps/s | 1.5450 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.0105ms | 2.8941ms | 345.5317 Ops/s | 345.8247 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6600ms | 0.5412ms | 1.8478 KOps/s | 1.8335 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 4.8975ms | 0.5238ms | 1.9092 KOps/s | 1.5743 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.2205ms | 2.9377ms | 340.4044 Ops/s | 342.0949 Ops/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.1057s | 0.6774ms | 1.4762 KOps/s | 1.8499 KOps/s | |
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6573ms | 0.5096ms | 1.9622 KOps/s | 1.9539 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.1255ms | 3.0466ms | 328.2343 Ops/s | 331.2959 Ops/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.7741ms | 0.6684ms | 1.4962 KOps/s | 1.4919 KOps/s | |
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 4.9604ms | 0.6444ms | 1.5519 KOps/s | 1.2724 KOps/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1070s | 8.7714ms | 114.0063 Ops/s | 150.5914 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 17.2516ms | 15.0581ms | 66.4095 Ops/s | 63.0597 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.9368ms | 1.0688ms | 935.6512 Ops/s | 934.5577 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 99.6234ms | 6.6950ms | 149.3659 Ops/s | 116.7187 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 17.2661ms | 14.9544ms | 66.8699 Ops/s | 63.8407 Ops/s | |
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 2.1498ms | 1.1237ms | 889.9106 Ops/s | 929.8765 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1007s | 8.9784ms | 111.3790 Ops/s | 140.3155 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 17.6600ms | 15.3836ms | 65.0042 Ops/s | 62.6849 Ops/s | |
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.4057ms | 1.4135ms | 707.4393 Ops/s | 696.8788 Ops/s |

# like a non-zero through stacking.
def tuple_to_tensor(traj_idx, lengths=lengths):
    if isinstance(traj_idx, tuple):
        traj_idx = torch.arange(len(storage), device=lengths.device).view(
I compared this with np.ravel_multi_index, using
torch.as_tensor(np.ravel_multi_index(tuple(idx.numpy() for idx in unravelled), shape))
Runtimes are roughly equivalent, with a slight advantage for the numpy version.
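
For readers unfamiliar with np.ravel_multi_index, here is a minimal sketch of what the call in the comment above computes. The storage shape and the index values are hypothetical and chosen only for illustration; they do not come from the PR.

```python
import numpy as np

# Hypothetical storage shape and sample indices, chosen only for illustration.
shape = (3, 4, 5)
unravelled = (
    np.array([0, 1, 2]),  # index along dim 0
    np.array([3, 0, 2]),  # index along dim 1
    np.array([4, 1, 0]),  # index along dim 2
)

# Row-major flattening: flat = i0 * (4 * 5) + i1 * 5 + i2
flat = np.ravel_multi_index(unravelled, shape)

# np.unravel_index is the inverse mapping, so the round trip recovers the input.
recovered = np.unravel_index(flat, shape)
```

In the PR the index arrays are torch tensors, which is why the comment converts them with idx.numpy() before the call and wraps the result in torch.as_tensor.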
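This is much slower, at about 2.5x the runtime of the numpy solution:
import numpy as np

def ravel_multi_index(x, shape):
    out = 0
    # Cumulative products of the reversed shape, e.g. (3, 4, 5) -> [1, 5, 20, 60]
    shape_modif = np.cumprod(list(reversed((*shape, 1))))
    for i, idx in enumerate(reversed(x)):
        out += idx * shape_modif[i]
    return out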
A more vectorized version still underperforms both the arange approach and numpy:
import torch

def ravel_multi_index(x, shape):
    # Row-major strides, e.g. shape (3, 4, 5) -> [[20, 5, 1]]
    shape_modif = torch.flipud(
        torch.cumprod(torch.tensor(list(reversed((*shape[1:], 1)))), 0)
    ).unsqueeze(0)
    return (torch.stack(x, -1) * shape_modif).sum(-1)
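
As a sanity check on the alternatives discussed in this thread, the sketch below verifies that the stride arithmetic and an arange lookup produce the same flat indices as np.ravel_multi_index. It is written in numpy only, as an analogue of the torch snippets. The helper names are hypothetical, and the arange variant is an assumption about what the truncated diff line does (an arange reshaped to the storage shape and indexed with the tuple), not the PR's actual code.

```python
import numpy as np

def ravel_with_strides(idx, shape):
    """Flatten a tuple of index arrays using explicit row-major strides."""
    # strides[k] is the product of shape[k + 1:], e.g. (3, 4, 5) -> [20, 5, 1]
    strides = np.cumprod((1, *shape[:0:-1]))[::-1]
    return (np.stack(idx, axis=-1) * strides).sum(axis=-1)

def ravel_with_arange(idx, shape):
    """Look flat positions up in an arange reshaped to the storage shape."""
    n = int(np.prod(shape))
    return np.arange(n).reshape(shape)[idx]

# Hypothetical shape and random indices, for illustration only.
shape = (3, 4, 5)
rng = np.random.default_rng(0)
idx = tuple(rng.integers(0, s, size=8) for s in shape)

expected = np.ravel_multi_index(idx, shape)
from_strides = ravel_with_strides(idx, shape)
from_arange = ravel_with_arange(idx, shape)
```

The sketch checks equivalence only. Relative speed depends on hardware, tensor sizes, and library versions, so the timings reported in the comments above should not be read off this example.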
(cherry picked from commit 07eb02d)