[BugFix] Fix RNNs trajectory split in VMAP calls #1736
Merged
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1736
Note: Links to docs will display an error until the docs builds have been completed.
⏳ 1 Pending, 7 Unrelated Failures
As of commit e4830db with merge base 25bd8a5:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
facebook-github-bot added the CLA Signed label Dec 6, 2023
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_single | 62.4729ms | 62.3050ms | 16.0501 Ops/s | 15.4727 Ops/s | |
test_sync | 36.1186ms | 34.0500ms | 29.3686 Ops/s | 29.0412 Ops/s | |
test_async | 77.8250ms | 33.4864ms | 29.8629 Ops/s | 29.9895 Ops/s | |
test_simple | 0.4848s | 0.4312s | 2.3189 Ops/s | 2.3322 Ops/s | |
test_transformed | 0.6568s | 0.5977s | 1.6730 Ops/s | 1.6940 Ops/s | |
test_serial | 1.3741s | 1.3273s | 0.7534 Ops/s | 0.7602 Ops/s | |
test_parallel | 1.3150s | 1.2737s | 0.7851 Ops/s | 0.7722 Ops/s | |
test_step_mdp_speed[True-True-True-True-True] | 0.2596ms | 22.4059μs | 44.6312 KOps/s | 45.1755 KOps/s | |
test_step_mdp_speed[True-True-True-True-False] | 45.2950μs | 13.7440μs | 72.7589 KOps/s | 74.4933 KOps/s | |
test_step_mdp_speed[True-True-True-False-True] | 54.2410μs | 13.7468μs | 72.7441 KOps/s | 73.9382 KOps/s | |
test_step_mdp_speed[True-True-True-False-False] | 28.1330μs | 8.2757μs | 120.8353 KOps/s | 123.6047 KOps/s | |
test_step_mdp_speed[True-True-False-True-True] | 74.5800μs | 23.9708μs | 41.7173 KOps/s | 42.5877 KOps/s | |
test_step_mdp_speed[True-True-False-True-False] | 37.2490μs | 14.9508μs | 66.8861 KOps/s | 67.9856 KOps/s | |
test_step_mdp_speed[True-True-False-False-True] | 51.5370μs | 14.8891μs | 67.1634 KOps/s | 67.9013 KOps/s | |
test_step_mdp_speed[True-True-False-False-False] | 73.9130μs | 9.7346μs | 102.7258 KOps/s | 106.6703 KOps/s | |
test_step_mdp_speed[True-False-True-True-True] | 0.1133ms | 25.3665μs | 39.4221 KOps/s | 40.4230 KOps/s | |
test_step_mdp_speed[True-False-True-True-False] | 52.9690μs | 16.1471μs | 61.9306 KOps/s | 61.7724 KOps/s | |
test_step_mdp_speed[True-False-True-False-True] | 50.8240μs | 14.7903μs | 67.6117 KOps/s | 66.3864 KOps/s | |
test_step_mdp_speed[True-False-True-False-False] | 51.1950μs | 9.5848μs | 104.3313 KOps/s | 105.3482 KOps/s | |
test_step_mdp_speed[True-False-False-True-True] | 60.1730μs | 26.3154μs | 38.0006 KOps/s | 38.2871 KOps/s | |
test_step_mdp_speed[True-False-False-True-False] | 43.1410μs | 17.5197μs | 57.0785 KOps/s | 58.6994 KOps/s | |
test_step_mdp_speed[True-False-False-False-True] | 47.1180μs | 16.3052μs | 61.3301 KOps/s | 63.3293 KOps/s | |
test_step_mdp_speed[True-False-False-False-False] | 41.4480μs | 10.7092μs | 93.3775 KOps/s | 95.7040 KOps/s | |
test_step_mdp_speed[False-True-True-True-True] | 79.9500μs | 25.5745μs | 39.1015 KOps/s | 40.1772 KOps/s | |
test_step_mdp_speed[False-True-True-True-False] | 38.6230μs | 16.3699μs | 61.0876 KOps/s | 61.7471 KOps/s | |
test_step_mdp_speed[False-True-True-False-True] | 43.2410μs | 17.5254μs | 57.0600 KOps/s | 58.0485 KOps/s | |
test_step_mdp_speed[False-True-True-False-False] | 40.8760μs | 10.7599μs | 92.9381 KOps/s | 92.6964 KOps/s | |
test_step_mdp_speed[False-True-False-True-True] | 82.0860μs | 26.7091μs | 37.4405 KOps/s | 38.2819 KOps/s | |
test_step_mdp_speed[False-True-False-True-False] | 47.6090μs | 17.5577μs | 56.9550 KOps/s | 57.3317 KOps/s | |
test_step_mdp_speed[False-True-False-False-True] | 69.5170μs | 18.3429μs | 54.5169 KOps/s | 54.1544 KOps/s | |
test_step_mdp_speed[False-True-False-False-False] | 58.3990μs | 11.8972μs | 84.0531 KOps/s | 85.7811 KOps/s | |
test_step_mdp_speed[False-False-True-True-True] | 64.7700μs | 27.7817μs | 35.9949 KOps/s | 36.6284 KOps/s | |
test_step_mdp_speed[False-False-True-True-False] | 62.0360μs | 18.8021μs | 53.1856 KOps/s | 53.9781 KOps/s | |
test_step_mdp_speed[False-False-True-False-True] | 58.0690μs | 18.3436μs | 54.5148 KOps/s | 53.6599 KOps/s | |
test_step_mdp_speed[False-False-True-False-False] | 43.8020μs | 11.8702μs | 84.2448 KOps/s | 84.4889 KOps/s | |
test_step_mdp_speed[False-False-False-True-True] | 61.5150μs | 28.7104μs | 34.8306 KOps/s | 34.8649 KOps/s | |
test_step_mdp_speed[False-False-False-True-False] | 54.3320μs | 19.9077μs | 50.2319 KOps/s | 51.0596 KOps/s | |
test_step_mdp_speed[False-False-False-False-True] | 48.9810μs | 19.7084μs | 50.7398 KOps/s | 51.7002 KOps/s | |
test_step_mdp_speed[False-False-False-False-False] | 36.8290μs | 13.0796μs | 76.4552 KOps/s | 77.9678 KOps/s | |
test_values[generalized_advantage_estimate-True-True] | 16.5994ms | 11.9141ms | 83.9344 Ops/s | 84.9609 Ops/s | |
test_values[vec_generalized_advantage_estimate-True-True] | 35.0960ms | 27.6827ms | 36.1236 Ops/s | 36.8629 Ops/s | |
test_values[td0_return_estimate-False-False] | 0.2384ms | 0.1743ms | 5.7387 KOps/s | 5.6848 KOps/s | |
test_values[td1_return_estimate-False-False] | 25.2784ms | 25.0908ms | 39.8553 Ops/s | 40.0601 Ops/s | |
test_values[vec_td1_return_estimate-False-False] | 36.0232ms | 27.7448ms | 36.0428 Ops/s | 34.5767 Ops/s | |
test_values[td_lambda_return_estimate-True-False] | 35.5488ms | 35.1436ms | 28.4547 Ops/s | 28.4911 Ops/s | |
test_values[vec_td_lambda_return_estimate-True-False] | 36.9708ms | 27.8804ms | 35.8675 Ops/s | 36.9607 Ops/s | |
test_gae_speed[generalized_advantage_estimate-False-1-512] | 7.9694ms | 7.8479ms | 127.4219 Ops/s | 128.4934 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.1367ms | 1.8937ms | 528.0724 Ops/s | 563.5431 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 8.6675ms | 0.4356ms | 2.2956 KOps/s | 2.2668 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 44.3928ms | 39.0428ms | 25.6129 Ops/s | 25.6919 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 2.7381ms | 2.6102ms | 383.1192 Ops/s | 392.4063 Ops/s | |
test_dqn_speed | 8.1862ms | 1.6178ms | 618.1329 Ops/s | 614.4082 Ops/s | |
test_ddpg_speed | 12.1894ms | 3.6107ms | 276.9511 Ops/s | 271.2368 Ops/s | |
test_sac_speed | 18.2228ms | 10.1580ms | 98.4442 Ops/s | 98.4520 Ops/s | |
test_redq_speed | 26.0371ms | 19.4827ms | 51.3276 Ops/s | 51.6185 Ops/s | |
test_redq_deprec_speed | 85.7187ms | 16.3149ms | 61.2935 Ops/s | 65.1665 Ops/s | |
test_td3_speed | 18.3125ms | 10.3685ms | 96.4459 Ops/s | 94.3028 Ops/s | |
test_cql_speed | 45.6702ms | 38.0346ms | 26.2919 Ops/s | 25.9011 Ops/s | |
test_a2c_speed | 16.5492ms | 8.0851ms | 123.6839 Ops/s | 123.3517 Ops/s | |
test_ppo_speed | 16.5383ms | 8.4347ms | 118.5579 Ops/s | 118.8014 Ops/s | |
test_reinforce_speed | 17.0420ms | 7.2367ms | 138.1847 Ops/s | 138.7615 Ops/s | |
test_iql_speed | 42.1560ms | 34.1999ms | 29.2399 Ops/s | 28.5804 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.3113ms | 1.8458ms | 541.7770 Ops/s | 544.2045 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.1063s | 2.1620ms | 462.5452 Ops/s | 494.7293 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.1818ms | 1.9559ms | 511.2844 Ops/s | 481.6566 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.9271ms | 1.8543ms | 539.2768 Ops/s | 531.1994 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.1043s | 2.1585ms | 463.2796 Ops/s | 476.5990 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.7859ms | 1.9571ms | 510.9670 Ops/s | 477.8078 Ops/s | |
test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.3768ms | 1.8484ms | 541.0022 Ops/s | 522.3526 Ops/s | |
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.1053s | 2.1773ms | 459.2812 Ops/s | 491.6693 Ops/s | |
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 2.5979ms | 1.9338ms | 517.1148 Ops/s | 495.2307 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.7699ms | 1.8532ms | 539.6053 Ops/s | 530.8084 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.1070s | 2.1910ms | 456.4211 Ops/s | 495.8942 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 2.9042ms | 1.9980ms | 500.5127 Ops/s | 510.6322 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.5688ms | 1.9133ms | 522.6638 Ops/s | 536.2025 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.1358s | 2.3783ms | 420.4676 Ops/s | 512.4814 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 2.9786ms | 1.9673ms | 508.3050 Ops/s | 511.8242 Ops/s | |
test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.2206ms | 1.8655ms | 536.0606 Ops/s | 538.7182 Ops/s | |
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.1160s | 2.1715ms | 460.5040 Ops/s | 510.5925 Ops/s | |
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 2.7042ms | 1.9696ms | 507.7169 Ops/s | 509.6035 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1667s | 17.6158ms | 56.7672 Ops/s | 58.4661 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1053s | 16.1956ms | 61.7450 Ops/s | 61.4954 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 0.1093s | 16.3162ms | 61.2889 Ops/s | 54.9021 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1188s | 16.7884ms | 59.5651 Ops/s | 61.6821 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1111s | 16.3934ms | 61.0002 Ops/s | 61.9529 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 0.1091s | 16.4373ms | 60.8371 Ops/s | 62.0388 Ops/s | |
test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1160s | 16.7060ms | 59.8588 Ops/s | 62.2903 Ops/s | |
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1199s | 17.0734ms | 58.5707 Ops/s | 61.8246 Ops/s | |
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 0.1141s | 16.4904ms | 60.6414 Ops/s | 61.5437 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_single | 0.1153s | 0.1150s | 8.6934 Ops/s | 8.5465 Ops/s | |
test_sync | 0.1006s | 0.1002s | 9.9804 Ops/s | 9.9557 Ops/s | |
test_async | 0.2658s | 99.0745ms | 10.0934 Ops/s | 10.3963 Ops/s | |
test_single_pixels | 0.1381s | 0.1378s | 7.2585 Ops/s | 7.2684 Ops/s | |
test_sync_pixels | 93.6747ms | 91.7365ms | 10.9008 Ops/s | 10.7997 Ops/s | |
test_async_pixels | 0.1742s | 86.0832ms | 11.6167 Ops/s | 11.5436 Ops/s | |
test_simple | 0.9002s | 0.8349s | 1.1978 Ops/s | 1.1990 Ops/s | |
test_transformed | 1.1242s | 1.0619s | 0.9417 Ops/s | 0.9428 Ops/s | |
test_serial | 2.3627s | 2.3028s | 0.4342 Ops/s | 0.4349 Ops/s | |
test_parallel | 2.5106s | 2.4429s | 0.4094 Ops/s | 0.4126 Ops/s | |
test_step_mdp_speed[True-True-True-True-True] | 71.9210μs | 27.7418μs | 36.0467 KOps/s | 35.6217 KOps/s | |
test_step_mdp_speed[True-True-True-True-False] | 39.3300μs | 16.8743μs | 59.2618 KOps/s | 58.9770 KOps/s | |
test_step_mdp_speed[True-True-True-False-True] | 40.6010μs | 16.4936μs | 60.6298 KOps/s | 60.4426 KOps/s | |
test_step_mdp_speed[True-True-True-False-False] | 28.7200μs | 9.9874μs | 100.1264 KOps/s | 98.2805 KOps/s | |
test_step_mdp_speed[True-True-False-True-True] | 56.3000μs | 29.4817μs | 33.9193 KOps/s | 33.8222 KOps/s | |
test_step_mdp_speed[True-True-False-True-False] | 41.4200μs | 18.0488μs | 55.4053 KOps/s | 53.2611 KOps/s | |
test_step_mdp_speed[True-True-False-False-True] | 44.8900μs | 17.9490μs | 55.7133 KOps/s | 54.9891 KOps/s | |
test_step_mdp_speed[True-True-False-False-False] | 43.6510μs | 11.5271μs | 86.7518 KOps/s | 85.7221 KOps/s | |
test_step_mdp_speed[True-False-True-True-True] | 67.3820μs | 30.9700μs | 32.2894 KOps/s | 31.9507 KOps/s | |
test_step_mdp_speed[True-False-True-True-False] | 44.4300μs | 19.8865μs | 50.2855 KOps/s | 49.5662 KOps/s | |
test_step_mdp_speed[True-False-True-False-True] | 43.4200μs | 18.0999μs | 55.2489 KOps/s | 55.1560 KOps/s | |
test_step_mdp_speed[True-False-True-False-False] | 31.5100μs | 11.6410μs | 85.9031 KOps/s | 84.9741 KOps/s | |
test_step_mdp_speed[True-False-False-True-True] | 59.1400μs | 33.1113μs | 30.2012 KOps/s | 30.6049 KOps/s | |
test_step_mdp_speed[True-False-False-True-False] | 48.2310μs | 21.4734μs | 46.5693 KOps/s | 46.3052 KOps/s | |
test_step_mdp_speed[True-False-False-False-True] | 55.2110μs | 19.5025μs | 51.2755 KOps/s | 51.1870 KOps/s | |
test_step_mdp_speed[True-False-False-False-False] | 33.7110μs | 13.2172μs | 75.6592 KOps/s | 75.2350 KOps/s | |
test_step_mdp_speed[False-True-True-True-True] | 62.9700μs | 30.9220μs | 32.3394 KOps/s | 32.3101 KOps/s | |
test_step_mdp_speed[False-True-True-True-False] | 43.6500μs | 20.2183μs | 49.4602 KOps/s | 49.7100 KOps/s | |
test_step_mdp_speed[False-True-True-False-True] | 46.2200μs | 22.0639μs | 45.3229 KOps/s | 46.2934 KOps/s | |
test_step_mdp_speed[False-True-True-False-False] | 37.1210μs | 13.1858μs | 75.8393 KOps/s | 75.1700 KOps/s | |
test_step_mdp_speed[False-True-False-True-True] | 61.5010μs | 33.5261μs | 29.8275 KOps/s | 30.4481 KOps/s | |
test_step_mdp_speed[False-True-False-True-False] | 50.2300μs | 21.2835μs | 46.9847 KOps/s | 45.7998 KOps/s | |
test_step_mdp_speed[False-True-False-False-True] | 0.1835ms | 22.9255μs | 43.6196 KOps/s | 43.6419 KOps/s | |
test_step_mdp_speed[False-True-False-False-False] | 89.1510μs | 14.9223μs | 67.0138 KOps/s | 68.5061 KOps/s | |
test_step_mdp_speed[False-False-True-True-True] | 66.0610μs | 33.7138μs | 29.6614 KOps/s | 29.5733 KOps/s | |
test_step_mdp_speed[False-False-True-True-False] | 49.0300μs | 23.0663μs | 43.3532 KOps/s | 42.7691 KOps/s | |
test_step_mdp_speed[False-False-True-False-True] | 59.5100μs | 23.0523μs | 43.3795 KOps/s | 43.2179 KOps/s | |
test_step_mdp_speed[False-False-True-False-False] | 39.3300μs | 14.7487μs | 67.8024 KOps/s | 67.3332 KOps/s | |
test_step_mdp_speed[False-False-False-True-True] | 71.0610μs | 35.1313μs | 28.4647 KOps/s | 28.2006 KOps/s | |
test_step_mdp_speed[False-False-False-True-False] | 48.7910μs | 24.6406μs | 40.5834 KOps/s | 39.8929 KOps/s | |
test_step_mdp_speed[False-False-False-False-True] | 49.5710μs | 23.5736μs | 42.4203 KOps/s | 41.7288 KOps/s | |
test_step_mdp_speed[False-False-False-False-False] | 44.2810μs | 16.4649μs | 60.7351 KOps/s | 61.9318 KOps/s | |
test_values[generalized_advantage_estimate-True-True] | 26.6140ms | 25.8910ms | 38.6234 Ops/s | 37.4947 Ops/s | |
test_values[vec_generalized_advantage_estimate-True-True] | 85.7796ms | 3.2969ms | 303.3171 Ops/s | 310.0842 Ops/s | |
test_values[td0_return_estimate-False-False] | 0.1044ms | 66.4625μs | 15.0461 KOps/s | 14.6697 KOps/s | |
test_values[td1_return_estimate-False-False] | 60.1695ms | 56.9134ms | 17.5706 Ops/s | 17.6307 Ops/s | |
test_values[vec_td1_return_estimate-False-False] | 2.1589ms | 1.7864ms | 559.7978 Ops/s | 572.1826 Ops/s | |
test_values[td_lambda_return_estimate-True-False] | 92.9865ms | 91.2995ms | 10.9530 Ops/s | 10.8590 Ops/s | |
test_values[vec_td_lambda_return_estimate-True-False] | 2.0178ms | 1.7821ms | 561.1276 Ops/s | 577.1634 Ops/s | |
test_gae_speed[generalized_advantage_estimate-False-1-512] | 25.2971ms | 24.4476ms | 40.9038 Ops/s | 40.7368 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.8818ms | 0.7185ms | 1.3918 KOps/s | 1.3949 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7746ms | 0.6944ms | 1.4400 KOps/s | 1.4340 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5481ms | 1.4756ms | 677.7084 Ops/s | 678.6787 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.9791ms | 0.6930ms | 1.4430 KOps/s | 1.3780 KOps/s | |
test_dqn_speed | 7.7039ms | 1.3681ms | 730.9192 Ops/s | 724.2191 Ops/s | |
test_ddpg_speed | 4.4519ms | 3.0791ms | 324.7704 Ops/s | 303.0028 Ops/s | |
test_sac_speed | 9.9046ms | 8.7512ms | 114.2694 Ops/s | 115.6547 Ops/s | |
test_redq_speed | 16.0487ms | 15.3840ms | 65.0026 Ops/s | 65.1090 Ops/s | |
test_redq_deprec_speed | 13.8389ms | 12.4675ms | 80.2085 Ops/s | 81.7304 Ops/s | |
test_td3_speed | 17.8773ms | 8.8848ms | 112.5513 Ops/s | 113.2423 Ops/s | |
test_cql_speed | 32.6927ms | 31.2298ms | 32.0207 Ops/s | 33.0996 Ops/s | |
test_a2c_speed | 8.5030ms | 7.1345ms | 140.1635 Ops/s | 143.1502 Ops/s | |
test_ppo_speed | 8.8351ms | 7.4654ms | 133.9508 Ops/s | 137.2279 Ops/s | |
test_reinforce_speed | 7.6448ms | 6.1752ms | 161.9390 Ops/s | 166.5317 Ops/s | |
test_iql_speed | 0.1273s | 29.2328ms | 34.2082 Ops/s | 38.0672 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.7141ms | 2.1244ms | 470.7132 Ops/s | 468.1961 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.4279ms | 2.2776ms | 439.0546 Ops/s | 382.3039 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.8170ms | 2.2957ms | 435.5893 Ops/s | 432.9420 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.7382ms | 2.1093ms | 474.0904 Ops/s | 475.6538 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.4474ms | 2.2960ms | 435.5352 Ops/s | 435.2561 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.1122ms | 2.2920ms | 436.2970 Ops/s | 433.6112 Ops/s | |
test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.4948ms | 2.1306ms | 469.3418 Ops/s | 471.1282 Ops/s | |
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 3.7273ms | 2.2951ms | 435.7093 Ops/s | 433.5276 Ops/s | |
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.1187s | 2.5660ms | 389.7155 Ops/s | 433.0458 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.3264ms | 2.1198ms | 471.7463 Ops/s | 471.7108 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.2093ms | 2.2980ms | 435.1568 Ops/s | 435.1688 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.3050ms | 2.2857ms | 437.5082 Ops/s | 435.4531 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.5548ms | 2.1283ms | 469.8566 Ops/s | 471.0720 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.4413ms | 2.2956ms | 435.6139 Ops/s | 435.5780 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.5739ms | 2.3053ms | 433.7895 Ops/s | 434.0599 Ops/s | |
test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.6422ms | 2.1342ms | 468.5667 Ops/s | 472.2010 Ops/s | |
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.2044ms | 2.1497ms | 465.1890 Ops/s | 433.5703 Ops/s | |
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 3.1839ms | 2.2998ms | 434.8189 Ops/s | 433.8295 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.2181s | 18.2760ms | 54.7165 Ops/s | 55.4495 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1250s | 14.0430ms | 71.2098 Ops/s | 61.8474 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 0.1256s | 16.3552ms | 61.1428 Ops/s | 72.0612 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1273s | 16.3855ms | 61.0294 Ops/s | 61.6957 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1268s | 16.3877ms | 61.0215 Ops/s | 61.5741 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 0.1261s | 16.4227ms | 60.8914 Ops/s | 61.7320 Ops/s | |
test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1261s | 14.0852ms | 70.9967 Ops/s | 61.7646 Ops/s | |
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1287s | 16.3716ms | 61.0814 Ops/s | 61.6801 Ops/s | |
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 0.1307s | 14.2142ms | 70.3524 Ops/s | 72.0727 Ops/s |
albertbou92 approved these changes Dec 6, 2023
Labels: bug, CLA Signed
Description
Fixes #1735
The issue was that, during a call to `torch.vmap`, a newly created tensor was being populated in place with a non-leaf tensor, which vmap rejected.
The solution is to rely on `torch.masked_scatter` instead of `__setitem__` (i.e., `tensor[mask] = smth`).
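
To illustrate the swap, here is a minimal sketch comparing the two ways of filling a fresh tensor from a boolean mask. It is not the code from this PR: the helper names `fill_with_setitem` and `fill_with_masked_scatter`, the shapes, and the values are hypothetical. It only checks that both approaches give the same result in eager mode and does not reproduce the vmap failure itself.

```python
import torch


def fill_with_setitem(mask: torch.Tensor, values: torch.Tensor) -> torch.Tensor:
    # In-place write through __setitem__ (the pattern the description says was replaced).
    out = torch.zeros(mask.shape, dtype=values.dtype)
    out[mask] = values
    return out


def fill_with_masked_scatter(mask: torch.Tensor, values: torch.Tensor) -> torch.Tensor:
    # Out-of-place alternative: returns a new tensor rather than mutating `out`.
    out = torch.zeros(mask.shape, dtype=values.dtype)
    return out.masked_scatter(mask, values)


# Hypothetical data: 4 True entries in the mask, 4 source values.
mask = torch.tensor([[True, False, True], [False, True, True]])
values = torch.arange(1.0, 5.0)

a = fill_with_setitem(mask, values)
b = fill_with_masked_scatter(mask, values)
print(torch.equal(a, b))
```

Both helpers copy `values` into the `True` positions of `mask` in row-major order. The difference is that `masked_scatter` builds its result functionally instead of writing into an existing tensor, which is the property the fix relies on.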