[Feature] DQN compatibility with compile #2571

Merged 47 commits on Dec 15, 2024

@vmoens (Contributor) commented Nov 15, 2024

[ghstack-poisoned]
pytorch-bot bot commented Nov 15, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2571

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 2 New Failures, 8 Unrelated Failures

As of commit e5a358b with merge base bb6f87a:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the `CLA Signed` label Nov 15, 2024
vmoens added a commit that referenced this pull request Nov 15, 2024
ghstack-source-id: 3d2b4d32e61eae7ef867057b4bcc4ba82d8118f7
Pull Request resolved: #2571
[ghstack-poisoned]
github-actions bot commented Dec 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.4308s 0.4293s 2.3291 Ops/s 2.1817 Ops/s $\textbf{\color{#35bf28}+6.76\%}$
test_transformed 0.6127s 0.6093s 1.6413 Ops/s 1.5601 Ops/s $\textbf{\color{#35bf28}+5.20\%}$
test_serial 1.3511s 1.3366s 0.7482 Ops/s 0.7284 Ops/s $\color{#35bf28}+2.72\%$
test_parallel 1.3110s 1.2972s 0.7709 Ops/s 0.7590 Ops/s $\color{#35bf28}+1.57\%$
test_step_mdp_speed[True-True-True-True-True] 0.2123ms 29.5953μs 33.7892 KOps/s 32.1821 KOps/s $\color{#35bf28}+4.99\%$
test_step_mdp_speed[True-True-True-True-False] 56.3700μs 17.6059μs 56.7990 KOps/s 54.3276 KOps/s $\color{#35bf28}+4.55\%$
test_step_mdp_speed[True-True-True-False-True] 56.6660μs 17.0496μs 58.6525 KOps/s 58.8982 KOps/s $\color{#d91a1a}-0.42\%$
test_step_mdp_speed[True-True-True-False-False] 51.1110μs 10.0038μs 99.9624 KOps/s 97.6044 KOps/s $\color{#35bf28}+2.42\%$
test_step_mdp_speed[True-True-False-True-True] 77.2480μs 32.2898μs 30.9696 KOps/s 30.8894 KOps/s $\color{#35bf28}+0.26\%$
test_step_mdp_speed[True-True-False-True-False] 65.2720μs 19.4278μs 51.4727 KOps/s 50.4369 KOps/s $\color{#35bf28}+2.05\%$
test_step_mdp_speed[True-True-False-False-True] 0.6011ms 18.8813μs 52.9624 KOps/s 52.9290 KOps/s $\color{#35bf28}+0.06\%$
test_step_mdp_speed[True-True-False-False-False] 38.2410μs 11.8041μs 84.7165 KOps/s 82.6202 KOps/s $\color{#35bf28}+2.54\%$
test_step_mdp_speed[True-False-True-True-True] 91.5140μs 33.6517μs 29.7162 KOps/s 29.3836 KOps/s $\color{#35bf28}+1.13\%$
test_step_mdp_speed[True-False-True-True-False] 51.7970μs 21.3231μs 46.8976 KOps/s 45.8400 KOps/s $\color{#35bf28}+2.31\%$
test_step_mdp_speed[True-False-True-False-True] 61.5850μs 18.6795μs 53.5346 KOps/s 52.8547 KOps/s $\color{#35bf28}+1.29\%$
test_step_mdp_speed[True-False-True-False-False] 51.0350μs 11.8513μs 84.3791 KOps/s 83.0074 KOps/s $\color{#35bf28}+1.65\%$
test_step_mdp_speed[True-False-False-True-True] 87.8340μs 35.4998μs 28.1692 KOps/s 28.1524 KOps/s $\color{#35bf28}+0.06\%$
test_step_mdp_speed[True-False-False-True-False] 93.4340μs 22.8889μs 43.6892 KOps/s 42.9449 KOps/s $\color{#35bf28}+1.73\%$
test_step_mdp_speed[True-False-False-False-True] 81.3280μs 20.1470μs 49.6353 KOps/s 49.0129 KOps/s $\color{#35bf28}+1.27\%$
test_step_mdp_speed[True-False-False-False-False] 40.6460μs 13.5188μs 73.9709 KOps/s 72.9543 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[False-True-True-True-True] 71.6340μs 33.3553μs 29.9803 KOps/s 29.4424 KOps/s $\color{#35bf28}+1.83\%$
test_step_mdp_speed[False-True-True-True-False] 55.4240μs 21.2601μs 47.0364 KOps/s 45.4173 KOps/s $\color{#35bf28}+3.56\%$
test_step_mdp_speed[False-True-True-False-True] 51.8970μs 21.4148μs 46.6967 KOps/s 46.9624 KOps/s $\color{#d91a1a}-0.57\%$
test_step_mdp_speed[False-True-True-False-False] 44.6740μs 13.3389μs 74.9685 KOps/s 74.5728 KOps/s $\color{#35bf28}+0.53\%$
test_step_mdp_speed[False-True-False-True-True] 72.0340μs 35.1325μs 28.4636 KOps/s 27.8184 KOps/s $\color{#35bf28}+2.32\%$
test_step_mdp_speed[False-True-False-True-False] 80.1030μs 22.9830μs 43.5104 KOps/s 42.5948 KOps/s $\color{#35bf28}+2.15\%$
test_step_mdp_speed[False-True-False-False-True] 2.8101ms 22.9269μs 43.6169 KOps/s 43.3573 KOps/s $\color{#35bf28}+0.60\%$
test_step_mdp_speed[False-True-False-False-False] 53.7810μs 14.9130μs 67.0557 KOps/s 66.6114 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[False-False-True-True-True] 0.6122ms 37.0911μs 26.9606 KOps/s 26.9848 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[False-False-True-True-False] 55.6440μs 24.8556μs 40.2325 KOps/s 39.3950 KOps/s $\color{#35bf28}+2.13\%$
test_step_mdp_speed[False-False-True-False-True] 83.7560μs 22.8832μs 43.7002 KOps/s 44.4548 KOps/s $\color{#d91a1a}-1.70\%$
test_step_mdp_speed[False-False-True-False-False] 44.9840μs 14.9467μs 66.9046 KOps/s 66.6167 KOps/s $\color{#35bf28}+0.43\%$
test_step_mdp_speed[False-False-False-True-True] 89.6070μs 38.3983μs 26.0429 KOps/s 25.6802 KOps/s $\color{#35bf28}+1.41\%$
test_step_mdp_speed[False-False-False-True-False] 79.2570μs 26.3054μs 38.0151 KOps/s 36.8232 KOps/s $\color{#35bf28}+3.24\%$
test_step_mdp_speed[False-False-False-False-True] 55.0630μs 24.2940μs 41.1625 KOps/s 41.2347 KOps/s $\color{#d91a1a}-0.17\%$
test_step_mdp_speed[False-False-False-False-False] 66.8250μs 16.4849μs 60.6614 KOps/s 59.6053 KOps/s $\color{#35bf28}+1.77\%$
test_values[generalized_advantage_estimate-True-True] 11.4554ms 9.5069ms 105.1867 Ops/s 106.0848 Ops/s $\color{#d91a1a}-0.85\%$
test_values[vec_generalized_advantage_estimate-True-True] 40.3718ms 37.1488ms 26.9187 Ops/s 27.6887 Ops/s $\color{#d91a1a}-2.78\%$
test_values[td0_return_estimate-False-False] 0.2416ms 0.1813ms 5.5162 KOps/s 5.5352 KOps/s $\color{#d91a1a}-0.34\%$
test_values[td1_return_estimate-False-False] 27.3847ms 24.2553ms 41.2280 Ops/s 42.1270 Ops/s $\color{#d91a1a}-2.13\%$
test_values[vec_td1_return_estimate-False-False] 41.5074ms 35.7197ms 27.9957 Ops/s 27.6298 Ops/s $\color{#35bf28}+1.32\%$
test_values[td_lambda_return_estimate-True-False] 37.0006ms 34.5348ms 28.9563 Ops/s 28.8814 Ops/s $\color{#35bf28}+0.26\%$
test_values[vec_td_lambda_return_estimate-True-False] 38.4370ms 35.6585ms 28.0438 Ops/s 26.9014 Ops/s $\color{#35bf28}+4.25\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.2932ms 8.2878ms 120.6596 Ops/s 121.8982 Ops/s $\color{#d91a1a}-1.02\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.3978ms 1.8867ms 530.0220 Ops/s 556.8892 Ops/s $\color{#d91a1a}-4.82\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4148ms 0.3593ms 2.7831 KOps/s 2.7608 KOps/s $\color{#35bf28}+0.81\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 50.5717ms 48.7592ms 20.5089 Ops/s 20.2576 Ops/s $\color{#35bf28}+1.24\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.7891ms 3.0432ms 328.5971 Ops/s 304.9419 Ops/s $\textbf{\color{#35bf28}+7.76\%}$
test_dqn_speed[False-None] 1.7651ms 1.3689ms 730.5180 Ops/s 714.4918 Ops/s $\color{#35bf28}+2.24\%$
test_dqn_speed[False-backward] 1.8798ms 1.8414ms 543.0708 Ops/s 535.1741 Ops/s $\color{#35bf28}+1.48\%$
test_dqn_speed[True-None] 0.7674ms 0.4617ms 2.1660 KOps/s 2.1129 KOps/s $\color{#35bf28}+2.51\%$
test_dqn_speed[True-backward] 0.9273ms 0.8809ms 1.1353 KOps/s 875.1538 Ops/s $\textbf{\color{#35bf28}+29.72\%}$
test_dqn_speed[reduce-overhead-None] 0.6062ms 0.4641ms 2.1549 KOps/s 2.0946 KOps/s $\color{#35bf28}+2.87\%$
test_dqn_speed[reduce-overhead-backward] 1.0604ms 0.8967ms 1.1151 KOps/s 1.1016 KOps/s $\color{#35bf28}+1.23\%$
test_ddpg_speed[False-None] 3.5704ms 2.8558ms 350.1645 Ops/s 343.5486 Ops/s $\color{#35bf28}+1.93\%$
test_ddpg_speed[False-backward] 4.1203ms 3.9740ms 251.6332 Ops/s 247.5250 Ops/s $\color{#35bf28}+1.66\%$
test_ddpg_speed[True-None] 1.4576ms 0.9949ms 1.0051 KOps/s 987.3217 Ops/s $\color{#35bf28}+1.80\%$
test_ddpg_speed[True-backward] 1.9422ms 1.8750ms 533.3258 Ops/s 515.3567 Ops/s $\color{#35bf28}+3.49\%$
test_ddpg_speed[reduce-overhead-None] 1.6600ms 0.9964ms 1.0036 KOps/s 984.4337 Ops/s $\color{#35bf28}+1.95\%$
test_ddpg_speed[reduce-overhead-backward] 1.9036ms 1.8696ms 534.8791 Ops/s 523.0572 Ops/s $\color{#35bf28}+2.26\%$
test_sac_speed[False-None] 8.9941ms 7.9775ms 125.3518 Ops/s 121.8157 Ops/s $\color{#35bf28}+2.90\%$
test_sac_speed[False-backward] 13.4488ms 11.0125ms 90.8061 Ops/s 91.1158 Ops/s $\color{#d91a1a}-0.34\%$
test_sac_speed[True-None] 2.3474ms 1.8293ms 546.6585 Ops/s 544.0914 Ops/s $\color{#35bf28}+0.47\%$
test_sac_speed[True-backward] 9.7307ms 3.9890ms 250.6892 Ops/s 267.7028 Ops/s $\textbf{\color{#d91a1a}-6.36\%}$
test_sac_speed[reduce-overhead-None] 2.1546ms 1.8289ms 546.7631 Ops/s 543.2537 Ops/s $\color{#35bf28}+0.65\%$
test_sac_speed[reduce-overhead-backward] 3.7841ms 3.5757ms 279.6623 Ops/s 278.0149 Ops/s $\color{#35bf28}+0.59\%$
test_redq_speed[False-None] 19.7288ms 13.9970ms 71.4437 Ops/s 74.9670 Ops/s $\color{#d91a1a}-4.70\%$
test_redq_speed[False-backward] 24.7226ms 22.4735ms 44.4968 Ops/s 44.8101 Ops/s $\color{#d91a1a}-0.70\%$
test_redq_speed[True-None] 5.6032ms 4.8595ms 205.7820 Ops/s 208.1809 Ops/s $\color{#d91a1a}-1.15\%$
test_redq_speed[True-backward] 13.2432ms 12.2571ms 81.5851 Ops/s 79.6197 Ops/s $\color{#35bf28}+2.47\%$
test_redq_speed[reduce-overhead-None] 5.3896ms 4.7534ms 210.3738 Ops/s 204.1099 Ops/s $\color{#35bf28}+3.07\%$
test_redq_speed[reduce-overhead-backward] 13.5888ms 12.4147ms 80.5494 Ops/s 80.1315 Ops/s $\color{#35bf28}+0.52\%$
test_redq_deprec_speed[False-None] 15.2148ms 13.3078ms 75.1438 Ops/s 74.7969 Ops/s $\color{#35bf28}+0.46\%$
test_redq_deprec_speed[False-backward] 20.6573ms 18.8031ms 53.1828 Ops/s 51.1771 Ops/s $\color{#35bf28}+3.92\%$
test_redq_deprec_speed[True-None] 4.3370ms 3.6725ms 272.2920 Ops/s 272.8579 Ops/s $\color{#d91a1a}-0.21\%$
test_redq_deprec_speed[True-backward] 9.2336ms 8.4385ms 118.5048 Ops/s 116.9918 Ops/s $\color{#35bf28}+1.29\%$
test_redq_deprec_speed[reduce-overhead-None] 4.1204ms 3.6570ms 273.4460 Ops/s 273.5504 Ops/s $\color{#d91a1a}-0.04\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.6186ms 8.1001ms 123.4554 Ops/s 122.1051 Ops/s $\color{#35bf28}+1.11\%$
test_td3_speed[False-None] 9.7075ms 8.0301ms 124.5311 Ops/s 121.9295 Ops/s $\color{#35bf28}+2.13\%$
test_td3_speed[False-backward] 10.9828ms 10.4431ms 95.7566 Ops/s 94.5833 Ops/s $\color{#35bf28}+1.24\%$
test_td3_speed[True-None] 2.0220ms 1.7109ms 584.4761 Ops/s 567.7429 Ops/s $\color{#35bf28}+2.95\%$
test_td3_speed[True-backward] 3.4181ms 3.3297ms 300.3271 Ops/s 290.4989 Ops/s $\color{#35bf28}+3.38\%$
test_td3_speed[reduce-overhead-None] 1.8947ms 1.7013ms 587.7821 Ops/s 572.1311 Ops/s $\color{#35bf28}+2.74\%$
test_td3_speed[reduce-overhead-backward] 3.4812ms 3.3746ms 296.3357 Ops/s 295.7421 Ops/s $\color{#35bf28}+0.20\%$
test_cql_speed[False-None] 38.7020ms 36.9329ms 27.0761 Ops/s 27.1125 Ops/s $\color{#d91a1a}-0.13\%$
test_cql_speed[False-backward] 49.9428ms 46.5477ms 21.4833 Ops/s 21.3132 Ops/s $\color{#35bf28}+0.80\%$
test_cql_speed[True-None] 16.8930ms 15.9713ms 62.6121 Ops/s 63.2422 Ops/s $\color{#d91a1a}-1.00\%$
test_cql_speed[True-backward] 24.6466ms 22.6400ms 44.1696 Ops/s 44.2783 Ops/s $\color{#d91a1a}-0.25\%$
test_cql_speed[reduce-overhead-None] 17.5816ms 15.9840ms 62.5625 Ops/s 63.6282 Ops/s $\color{#d91a1a}-1.67\%$
test_cql_speed[reduce-overhead-backward] 24.0335ms 22.6358ms 44.1778 Ops/s 44.5489 Ops/s $\color{#d91a1a}-0.83\%$
test_a2c_speed[False-None] 8.4662ms 7.1656ms 139.5560 Ops/s 138.1382 Ops/s $\color{#35bf28}+1.03\%$
test_a2c_speed[False-backward] 15.8308ms 14.3732ms 69.5740 Ops/s 68.6668 Ops/s $\color{#35bf28}+1.32\%$
test_a2c_speed[True-None] 4.8839ms 4.2826ms 233.5018 Ops/s 234.1726 Ops/s $\color{#d91a1a}-0.29\%$
test_a2c_speed[True-backward] 11.6053ms 10.7310ms 93.1883 Ops/s 92.1850 Ops/s $\color{#35bf28}+1.09\%$
test_a2c_speed[reduce-overhead-None] 4.4893ms 4.1913ms 238.5898 Ops/s 237.7591 Ops/s $\color{#35bf28}+0.35\%$
test_a2c_speed[reduce-overhead-backward] 12.1918ms 11.1102ms 90.0077 Ops/s 91.7085 Ops/s $\color{#d91a1a}-1.85\%$
test_ppo_speed[False-None] 8.4227ms 7.4709ms 133.8522 Ops/s 132.8183 Ops/s $\color{#35bf28}+0.78\%$
test_ppo_speed[False-backward] 16.3910ms 15.0659ms 66.3749 Ops/s 67.7767 Ops/s $\color{#d91a1a}-2.07\%$
test_ppo_speed[True-None] 4.4163ms 3.6837ms 271.4640 Ops/s 271.0850 Ops/s $\color{#35bf28}+0.14\%$
test_ppo_speed[True-backward] 10.5169ms 9.6539ms 103.5848 Ops/s 103.3838 Ops/s $\color{#35bf28}+0.19\%$
test_ppo_speed[reduce-overhead-None] 4.0902ms 3.6878ms 271.1663 Ops/s 269.8612 Ops/s $\color{#35bf28}+0.48\%$
test_ppo_speed[reduce-overhead-backward] 11.3728ms 10.1122ms 98.8908 Ops/s 104.2411 Ops/s $\textbf{\color{#d91a1a}-5.13\%}$
test_reinforce_speed[False-None] 8.4338ms 6.6195ms 151.0678 Ops/s 152.0255 Ops/s $\color{#d91a1a}-0.63\%$
test_reinforce_speed[False-backward] 10.1038ms 9.9494ms 100.5084 Ops/s 101.3371 Ops/s $\color{#d91a1a}-0.82\%$
test_reinforce_speed[True-None] 3.1056ms 2.6602ms 375.9119 Ops/s 377.6812 Ops/s $\color{#d91a1a}-0.47\%$
test_reinforce_speed[True-backward] 9.6644ms 8.6374ms 115.7752 Ops/s 115.5869 Ops/s $\color{#35bf28}+0.16\%$
test_reinforce_speed[reduce-overhead-None] 3.2662ms 2.6823ms 372.8118 Ops/s 375.9838 Ops/s $\color{#d91a1a}-0.84\%$
test_reinforce_speed[reduce-overhead-backward] 8.9670ms 8.5902ms 116.4120 Ops/s 116.9043 Ops/s $\color{#d91a1a}-0.42\%$
test_iql_speed[False-None] 33.2565ms 32.4125ms 30.8523 Ops/s 30.4642 Ops/s $\color{#35bf28}+1.27\%$
test_iql_speed[False-backward] 47.3885ms 45.2289ms 22.1098 Ops/s 21.9745 Ops/s $\color{#35bf28}+0.62\%$
test_iql_speed[True-None] 11.6245ms 10.6873ms 93.5694 Ops/s 92.2197 Ops/s $\color{#35bf28}+1.46\%$
test_iql_speed[True-backward] 22.5183ms 21.7715ms 45.9315 Ops/s 45.5552 Ops/s $\color{#35bf28}+0.83\%$
test_iql_speed[reduce-overhead-None] 11.3944ms 10.7247ms 93.2426 Ops/s 94.1955 Ops/s $\color{#d91a1a}-1.01\%$
test_iql_speed[reduce-overhead-backward] 22.8365ms 21.8119ms 45.8465 Ops/s 44.9818 Ops/s $\color{#35bf28}+1.92\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.4261ms 5.0266ms 198.9429 Ops/s 193.4101 Ops/s $\color{#35bf28}+2.86\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9909ms 0.5162ms 1.9371 KOps/s 1.9005 KOps/s $\color{#35bf28}+1.92\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7666ms 0.4931ms 2.0279 KOps/s 1.1603 KOps/s $\textbf{\color{#35bf28}+74.78\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.3343ms 4.8344ms 206.8519 Ops/s 207.6152 Ops/s $\color{#d91a1a}-0.37\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.9084ms 0.5084ms 1.9669 KOps/s 1.9845 KOps/s $\color{#d91a1a}-0.89\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8257ms 0.4957ms 2.0174 KOps/s 2.0787 KOps/s $\color{#d91a1a}-2.95\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.8464ms 1.6335ms 612.1743 Ops/s 609.8213 Ops/s $\color{#35bf28}+0.39\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.3902ms 1.5978ms 625.8476 Ops/s 637.3313 Ops/s $\color{#d91a1a}-1.80\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.5250ms 4.8994ms 204.1058 Ops/s 201.2176 Ops/s $\color{#35bf28}+1.44\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2232ms 0.6470ms 1.5455 KOps/s 1.5453 KOps/s $\color{#35bf28}+0.01\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.2542ms 0.6289ms 1.5900 KOps/s 1.6016 KOps/s $\color{#d91a1a}-0.73\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.4764ms 4.8002ms 208.3228 Ops/s 209.2082 Ops/s $\color{#d91a1a}-0.42\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.2291ms 0.5139ms 1.9459 KOps/s 1.9439 KOps/s $\color{#35bf28}+0.11\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8103ms 0.4963ms 2.0149 KOps/s 2.0170 KOps/s $\color{#d91a1a}-0.10\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.1470ms 4.7323ms 211.3116 Ops/s 205.9574 Ops/s $\color{#35bf28}+2.60\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0008ms 0.5050ms 1.9804 KOps/s 2.0053 KOps/s $\color{#d91a1a}-1.24\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6924ms 0.4756ms 2.1027 KOps/s 2.0960 KOps/s $\color{#35bf28}+0.32\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.3365ms 4.9276ms 202.9387 Ops/s 200.8127 Ops/s $\color{#35bf28}+1.06\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0397ms 0.6552ms 1.5262 KOps/s 1.4973 KOps/s $\color{#35bf28}+1.93\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9976ms 0.6265ms 1.5962 KOps/s 1.6167 KOps/s $\color{#d91a1a}-1.27\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.5802ms 4.2163ms 237.1749 Ops/s 230.5372 Ops/s $\color{#35bf28}+2.88\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.6525ms 2.3126ms 432.4209 Ops/s 468.9599 Ops/s $\textbf{\color{#d91a1a}-7.79\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 3.7575ms 1.2716ms 786.4281 Ops/s 755.9594 Ops/s $\color{#35bf28}+4.03\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4077s 12.3386ms 81.0464 Ops/s 239.5448 Ops/s $\textbf{\color{#d91a1a}-66.17\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 4.9975ms 2.2344ms 447.5525 Ops/s 436.5787 Ops/s $\color{#35bf28}+2.51\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.9796ms 1.3443ms 743.9051 Ops/s 727.4572 Ops/s $\color{#35bf28}+2.26\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.9241ms 4.4232ms 226.0822 Ops/s 242.7685 Ops/s $\textbf{\color{#d91a1a}-6.87\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.4552ms 2.4301ms 411.4999 Ops/s 392.3966 Ops/s $\color{#35bf28}+4.87\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.0301ms 1.4569ms 686.3727 Ops/s 685.9685 Ops/s $\color{#35bf28}+0.06\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.6697ms 11.3259ms 88.2931 Ops/s 82.3663 Ops/s $\textbf{\color{#35bf28}+7.20\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.4906ms 15.1128ms 66.1690 Ops/s 66.3794 Ops/s $\color{#d91a1a}-0.32\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.8161ms 19.8974ms 50.2579 Ops/s 49.4853 Ops/s $\color{#35bf28}+1.56\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.2576ms 15.0706ms 66.3545 Ops/s 63.1829 Ops/s $\textbf{\color{#35bf28}+5.02\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.4638ms 20.0677ms 49.8313 Ops/s 49.0892 Ops/s $\color{#35bf28}+1.51\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.4031ms 16.4030ms 60.9643 Ops/s 59.7939 Ops/s $\color{#35bf28}+1.96\%$
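
For readers checking the table by hand, the `Change` column appears to be the relative difference between `Ops` (this PR) and `Ops on Repo HEAD`. The sketch below recomputes it from a few rows copied from the CPU table above. The helper names (`parse_ops`, `percent_change`, `classify`) are hypothetical and are not part of the benchmark tooling. The 5% threshold is an inference, not something the bot states: the bolded rows in the CPU table are exactly those beyond ±5%, and their counts match the reported 7 improved and 5 worsened. Because the table shows rounded throughput values, a recomputed percentage can differ from the printed one in the last digit.

```python
import re

# Multipliers for the two throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(text: str) -> float:
    """Convert a cell such as '1.1353 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([\d.]+)\s*(K?Ops/s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized throughput: {text!r}")
    value, unit = match.groups()
    return float(value) * UNIT_SCALE[unit]


def percent_change(ops_pr: str, ops_head: str) -> float:
    """Relative change of the PR throughput versus repo HEAD, in percent."""
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is assumed from the bolded rows."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_simple": ("2.3291 Ops/s", "2.1817 Ops/s"),
    "test_dqn_speed[True-backward]": ("1.1353 KOps/s", "875.1538 Ops/s"),
    "test_step_mdp_speed[True-True-True-True-True]": (
        "33.7892 KOps/s",
        "32.1821 KOps/s",
    ),
}

for name, (ops_pr, ops_head) in rows.items():
    change = percent_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Note that the second row mixes units (`KOps/s` against `Ops/s`), so both values have to be normalized before they are compared.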

github-actions bot commented Dec 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}23$. Worsened: $\large\color{#d91a1a}6$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.7414s 0.7406s 1.3503 Ops/s 1.3131 Ops/s $\color{#35bf28}+2.83\%$
test_transformed 0.9981s 0.9975s 1.0025 Ops/s 0.9897 Ops/s $\color{#35bf28}+1.29\%$
test_serial 2.1360s 2.1281s 0.4699 Ops/s 0.4653 Ops/s $\color{#35bf28}+0.98\%$
test_parallel 2.0111s 1.9907s 0.5023 Ops/s 0.5069 Ops/s $\color{#d91a1a}-0.91\%$
test_step_mdp_speed[True-True-True-True-True] 0.1845ms 39.7512μs 25.1565 KOps/s 25.3212 KOps/s $\color{#d91a1a}-0.65\%$
test_step_mdp_speed[True-True-True-True-False] 48.0520μs 22.9897μs 43.4978 KOps/s 44.0312 KOps/s $\color{#d91a1a}-1.21\%$
test_step_mdp_speed[True-True-True-False-True] 0.4159ms 21.8417μs 45.7840 KOps/s 45.1928 KOps/s $\color{#35bf28}+1.31\%$
test_step_mdp_speed[True-True-True-False-False] 0.3879ms 12.7123μs 78.6637 KOps/s 78.3603 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[True-True-False-True-True] 77.4630μs 41.6616μs 24.0029 KOps/s 23.3854 KOps/s $\color{#35bf28}+2.64\%$
test_step_mdp_speed[True-True-False-True-False] 0.4166ms 24.6929μs 40.4974 KOps/s 39.6772 KOps/s $\color{#35bf28}+2.07\%$
test_step_mdp_speed[True-True-False-False-True] 0.4150ms 24.7974μs 40.3269 KOps/s 39.9654 KOps/s $\color{#35bf28}+0.90\%$
test_step_mdp_speed[True-True-False-False-False] 42.2610μs 14.9825μs 66.7444 KOps/s 66.6005 KOps/s $\color{#35bf28}+0.22\%$
test_step_mdp_speed[True-False-True-True-True] 0.4407ms 43.9247μs 22.7662 KOps/s 22.1448 KOps/s $\color{#35bf28}+2.81\%$
test_step_mdp_speed[True-False-True-True-False] 0.4043ms 27.1333μs 36.8551 KOps/s 36.2654 KOps/s $\color{#35bf28}+1.63\%$
test_step_mdp_speed[True-False-True-False-True] 0.4256ms 23.9049μs 41.8325 KOps/s 40.6487 KOps/s $\color{#35bf28}+2.91\%$
test_step_mdp_speed[True-False-True-False-False] 37.4110μs 15.0219μs 66.5693 KOps/s 66.5804 KOps/s $\color{#d91a1a}-0.02\%$
test_step_mdp_speed[True-False-False-True-True] 0.4328ms 45.9039μs 21.7846 KOps/s 20.8504 KOps/s $\color{#35bf28}+4.48\%$
test_step_mdp_speed[True-False-False-True-False] 0.4295ms 29.5492μs 33.8419 KOps/s 33.7194 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[True-False-False-False-True] 0.4039ms 26.2966μs 38.0277 KOps/s 36.6435 KOps/s $\color{#35bf28}+3.78\%$
test_step_mdp_speed[True-False-False-False-False] 43.4310μs 17.0816μs 58.5424 KOps/s 58.4897 KOps/s $\color{#35bf28}+0.09\%$
test_step_mdp_speed[False-True-True-True-True] 0.4215ms 44.3181μs 22.5641 KOps/s 22.0013 KOps/s $\color{#35bf28}+2.56\%$
test_step_mdp_speed[False-True-True-True-False] 0.4020ms 27.3496μs 36.5636 KOps/s 35.9952 KOps/s $\color{#35bf28}+1.58\%$
test_step_mdp_speed[False-True-True-False-True] 0.4084ms 27.9403μs 35.7906 KOps/s 34.9392 KOps/s $\color{#35bf28}+2.44\%$
test_step_mdp_speed[False-True-True-False-False] 42.3720μs 16.6213μs 60.1637 KOps/s 58.8319 KOps/s $\color{#35bf28}+2.26\%$
test_step_mdp_speed[False-True-False-True-True] 76.4030μs 46.3996μs 21.5519 KOps/s 20.9989 KOps/s $\color{#35bf28}+2.63\%$
test_step_mdp_speed[False-True-False-True-False] 0.4101ms 29.3837μs 34.0325 KOps/s 33.6379 KOps/s $\color{#35bf28}+1.17\%$
test_step_mdp_speed[False-True-False-False-True] 3.1918ms 30.8485μs 32.4165 KOps/s 32.7570 KOps/s $\color{#d91a1a}-1.04\%$
test_step_mdp_speed[False-True-False-False-False] 50.3020μs 18.5864μs 53.8028 KOps/s 53.1105 KOps/s $\color{#35bf28}+1.30\%$
test_step_mdp_speed[False-False-True-True-True] 0.4287ms 48.9571μs 20.4261 KOps/s 20.0837 KOps/s $\color{#35bf28}+1.70\%$
test_step_mdp_speed[False-False-True-True-False] 0.4077ms 31.4884μs 31.7577 KOps/s 30.9974 KOps/s $\color{#35bf28}+2.45\%$
test_step_mdp_speed[False-False-True-False-True] 0.4079ms 30.0867μs 33.2372 KOps/s 32.5609 KOps/s $\color{#35bf28}+2.08\%$
test_step_mdp_speed[False-False-True-False-False] 45.9710μs 18.5414μs 53.9333 KOps/s 51.9401 KOps/s $\color{#35bf28}+3.84\%$
test_step_mdp_speed[False-False-False-True-True] 73.8120μs 49.8730μs 20.0509 KOps/s 19.4281 KOps/s $\color{#35bf28}+3.21\%$
test_step_mdp_speed[False-False-False-True-False] 0.4074ms 34.0611μs 29.3590 KOps/s 29.1172 KOps/s $\color{#35bf28}+0.83\%$
test_step_mdp_speed[False-False-False-False-True] 0.4064ms 31.9315μs 31.3170 KOps/s 31.2822 KOps/s $\color{#35bf28}+0.11\%$
test_step_mdp_speed[False-False-False-False-False] 0.3947ms 21.0632μs 47.4761 KOps/s 47.7159 KOps/s $\color{#d91a1a}-0.50\%$
test_values[generalized_advantage_estimate-True-True] 25.1726ms 24.5167ms 40.7884 Ops/s 40.5657 Ops/s $\color{#35bf28}+0.55\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1008s 2.9128ms 343.3088 Ops/s 348.8152 Ops/s $\color{#d91a1a}-1.58\%$
test_values[td0_return_estimate-False-False] 0.1033ms 79.6212μs 12.5595 KOps/s 12.4870 KOps/s $\color{#35bf28}+0.58\%$
test_values[td1_return_estimate-False-False] 55.4153ms 55.0084ms 18.1791 Ops/s 18.2865 Ops/s $\color{#d91a1a}-0.59\%$
test_values[vec_td1_return_estimate-False-False] 1.3596ms 1.0794ms 926.4754 Ops/s 929.5458 Ops/s $\color{#d91a1a}-0.33\%$
test_values[td_lambda_return_estimate-True-False] 89.8686ms 87.1299ms 11.4771 Ops/s 11.5642 Ops/s $\color{#d91a1a}-0.75\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3837ms 1.0764ms 929.0556 Ops/s 926.4800 Ops/s $\color{#35bf28}+0.28\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.6513ms 24.1957ms 41.3297 Ops/s 41.2419 Ops/s $\color{#35bf28}+0.21\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0397ms 0.7444ms 1.3433 KOps/s 1.3348 KOps/s $\color{#35bf28}+0.64\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7631ms 0.6829ms 1.4643 KOps/s 1.5064 KOps/s $\color{#d91a1a}-2.79\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.8642ms 1.4749ms 678.0052 Ops/s 678.1106 Ops/s $\color{#d91a1a}-0.02\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8194ms 0.6917ms 1.4456 KOps/s 1.4766 KOps/s $\color{#d91a1a}-2.10\%$
test_dqn_speed[False-None] 6.9076ms 1.5041ms 664.8559 Ops/s 675.3676 Ops/s $\color{#d91a1a}-1.56\%$
test_dqn_speed[False-backward] 2.1620ms 2.0854ms 479.5303 Ops/s 476.2656 Ops/s $\color{#35bf28}+0.69\%$
test_dqn_speed[True-None] 0.6787ms 0.5364ms 1.8643 KOps/s 1.8579 KOps/s $\color{#35bf28}+0.35\%$
test_dqn_speed[True-backward] 1.1667ms 1.0918ms 915.9416 Ops/s 828.4933 Ops/s $\textbf{\color{#35bf28}+10.56\%}$
test_dqn_speed[reduce-overhead-None] 0.6315ms 0.5490ms 1.8214 KOps/s 1.8019 KOps/s $\color{#35bf28}+1.08\%$
test_dqn_speed[reduce-overhead-backward] 1.0051ms 0.9496ms 1.0531 KOps/s 933.2143 Ops/s $\textbf{\color{#35bf28}+12.85\%}$
test_ddpg_speed[False-None] 3.1705ms 2.8317ms 353.1431 Ops/s 352.7416 Ops/s $\color{#35bf28}+0.11\%$
test_ddpg_speed[False-backward] 4.2931ms 4.0473ms 247.0798 Ops/s 242.5732 Ops/s $\color{#35bf28}+1.86\%$
test_ddpg_speed[True-None] 1.1424ms 1.0687ms 935.6942 Ops/s 921.2162 Ops/s $\color{#35bf28}+1.57\%$
test_ddpg_speed[True-backward] 2.1999ms 2.1359ms 468.1841 Ops/s 434.0801 Ops/s $\textbf{\color{#35bf28}+7.86\%}$
test_ddpg_speed[reduce-overhead-None] 1.1543ms 1.0853ms 921.3815 Ops/s 919.8313 Ops/s $\color{#35bf28}+0.17\%$
test_ddpg_speed[reduce-overhead-backward] 1.7504ms 1.6204ms 617.1425 Ops/s 563.8179 Ops/s $\textbf{\color{#35bf28}+9.46\%}$
test_sac_speed[False-None] 8.4623ms 7.9589ms 125.6451 Ops/s 125.5374 Ops/s $\color{#35bf28}+0.09\%$
test_sac_speed[False-backward] 11.2443ms 10.7606ms 92.9313 Ops/s 90.3594 Ops/s $\color{#35bf28}+2.85\%$
test_sac_speed[True-None] 1.6624ms 1.5309ms 653.1945 Ops/s 640.9744 Ops/s $\color{#35bf28}+1.91\%$
test_sac_speed[True-backward] 3.6896ms 3.2495ms 307.7425 Ops/s 307.7829 Ops/s $\color{#d91a1a}-0.01\%$
test_sac_speed[reduce-overhead-None] 23.0512ms 12.4624ms 80.2415 Ops/s 80.2822 Ops/s $\color{#d91a1a}-0.05\%$
test_sac_speed[reduce-overhead-backward] 1.3905ms 1.3174ms 759.0997 Ops/s 669.1748 Ops/s $\textbf{\color{#35bf28}+13.44\%}$
test_redq_speed[False-None] 8.1294ms 7.4225ms 134.7250 Ops/s 133.8217 Ops/s $\color{#35bf28}+0.68\%$
test_redq_speed[False-backward] 12.0608ms 11.1410ms 89.7582 Ops/s 86.4299 Ops/s $\color{#35bf28}+3.85\%$
test_redq_speed[True-None] 2.1098ms 1.9998ms 500.0508 Ops/s 494.0515 Ops/s $\color{#35bf28}+1.21\%$
test_redq_speed[True-backward] 3.7188ms 3.6363ms 275.0067 Ops/s 261.2147 Ops/s $\textbf{\color{#35bf28}+5.28\%}$
test_redq_speed[reduce-overhead-None] 2.0729ms 1.9938ms 501.5576 Ops/s 497.0452 Ops/s $\color{#35bf28}+0.91\%$
test_redq_speed[reduce-overhead-backward] 4.1858ms 3.6665ms 272.7426 Ops/s 259.7110 Ops/s $\textbf{\color{#35bf28}+5.02\%}$
test_redq_deprec_speed[False-None] 9.8303ms 8.9652ms 111.5429 Ops/s 110.9693 Ops/s $\color{#35bf28}+0.52\%$
test_redq_deprec_speed[False-backward] 12.3194ms 11.8480ms 84.4025 Ops/s 81.6930 Ops/s $\color{#35bf28}+3.32\%$
test_redq_deprec_speed[True-None] 2.4128ms 2.3254ms 430.0415 Ops/s 419.8582 Ops/s $\color{#35bf28}+2.43\%$
test_redq_deprec_speed[True-backward] 4.3086ms 3.9918ms 250.5133 Ops/s 237.1266 Ops/s $\textbf{\color{#35bf28}+5.65\%}$
test_redq_deprec_speed[reduce-overhead-None] 2.4191ms 2.3165ms 431.6828 Ops/s 431.2549 Ops/s $\color{#35bf28}+0.10\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.5135ms 4.0102ms 249.3615 Ops/s 249.7299 Ops/s $\color{#d91a1a}-0.15\%$
test_td3_speed[False-None] 8.0201ms 7.8586ms 127.2486 Ops/s 128.2276 Ops/s $\color{#d91a1a}-0.76\%$
test_td3_speed[False-backward] 10.5772ms 10.1110ms 98.9018 Ops/s 100.1164 Ops/s $\color{#d91a1a}-1.21\%$
test_td3_speed[True-None] 1.6693ms 1.5963ms 626.4628 Ops/s 638.1550 Ops/s $\color{#d91a1a}-1.83\%$
test_td3_speed[True-backward] 3.1410ms 3.1019ms 322.3828 Ops/s 323.1476 Ops/s $\color{#d91a1a}-0.24\%$
test_td3_speed[reduce-overhead-None] 49.7743ms 25.5624ms 39.1200 Ops/s 36.7261 Ops/s $\textbf{\color{#35bf28}+6.52\%}$
test_td3_speed[reduce-overhead-backward] 1.3303ms 1.2669ms 789.3173 Ops/s 693.5950 Ops/s $\textbf{\color{#35bf28}+13.80\%}$
test_cql_speed[False-None] 17.0792ms 16.5323ms 60.4875 Ops/s 60.4625 Ops/s $\color{#35bf28}+0.04\%$
test_cql_speed[False-backward] 22.0083ms 21.5374ms 46.4308 Ops/s 45.5589 Ops/s $\color{#35bf28}+1.91\%$
test_cql_speed[True-None] 3.0476ms 2.9457ms 339.4729 Ops/s 339.8239 Ops/s $\color{#d91a1a}-0.10\%$
test_cql_speed[True-backward] 5.4828ms 5.0837ms 196.7053 Ops/s 194.5065 Ops/s $\color{#35bf28}+1.13\%$
test_cql_speed[reduce-overhead-None] 21.4523ms 13.1441ms 76.0797 Ops/s 76.0621 Ops/s $\color{#35bf28}+0.02\%$
test_cql_speed[reduce-overhead-backward] 1.5601ms 1.4968ms 668.0861 Ops/s 599.2689 Ops/s $\textbf{\color{#35bf28}+11.48\%}$
test_a2c_speed[False-None] 3.3269ms 3.1632ms 316.1396 Ops/s 314.2850 Ops/s $\color{#35bf28}+0.59\%$
test_a2c_speed[False-backward] 6.5461ms 5.9983ms 166.7143 Ops/s 158.4188 Ops/s $\textbf{\color{#35bf28}+5.24\%}$
test_a2c_speed[True-None] 1.0562ms 0.9963ms 1.0037 KOps/s 997.9119 Ops/s $\color{#35bf28}+0.58\%$
test_a2c_speed[True-backward] 2.6928ms 2.6094ms 383.2302 Ops/s 361.9684 Ops/s $\textbf{\color{#35bf28}+5.87\%}$
test_a2c_speed[reduce-overhead-None] 21.2312ms 11.4659ms 87.2153 Ops/s 86.1733 Ops/s $\color{#35bf28}+1.21\%$
test_a2c_speed[reduce-overhead-backward] 0.9948ms 0.9567ms 1.0453 KOps/s 888.4125 Ops/s $\textbf{\color{#35bf28}+17.66\%}$
test_ppo_speed[False-None] 3.9140ms 3.7005ms 270.2344 Ops/s 273.1902 Ops/s $\color{#d91a1a}-1.08\%$
test_ppo_speed[False-backward] 7.1762ms 6.7578ms 147.9762 Ops/s 143.2079 Ops/s $\color{#35bf28}+3.33\%$
test_ppo_speed[True-None] 1.0352ms 0.9575ms 1.0444 KOps/s 1.0600 KOps/s $\color{#d91a1a}-1.47\%$
test_ppo_speed[True-backward] 2.6562ms 2.5535ms 391.6213 Ops/s 391.9274 Ops/s $\color{#d91a1a}-0.08\%$
test_ppo_speed[reduce-overhead-None] 0.5581ms 0.5016ms 1.9938 KOps/s 1.9304 KOps/s $\color{#35bf28}+3.28\%$
test_ppo_speed[reduce-overhead-backward] 0.9864ms 0.9459ms 1.0572 KOps/s 1.0177 KOps/s $\color{#35bf28}+3.88\%$
test_reinforce_speed[False-None] 2.3476ms 2.2434ms 445.7531 Ops/s 441.4848 Ops/s $\color{#35bf28}+0.97\%$
test_reinforce_speed[False-backward] 3.7052ms 3.2160ms 310.9431 Ops/s 311.0866 Ops/s $\color{#d91a1a}-0.05\%$
test_reinforce_speed[True-None] 0.9186ms 0.8374ms 1.1942 KOps/s 1.2028 KOps/s $\color{#d91a1a}-0.71\%$
test_reinforce_speed[True-backward] 2.4752ms 2.4052ms 415.7584 Ops/s 388.4382 Ops/s $\textbf{\color{#35bf28}+7.03\%}$
test_reinforce_speed[reduce-overhead-None] 22.5110ms 11.6725ms 85.6717 Ops/s 86.3921 Ops/s $\color{#d91a1a}-0.83\%$
test_reinforce_speed[reduce-overhead-backward] 1.0586ms 1.0252ms 975.4427 Ops/s 959.0172 Ops/s $\color{#35bf28}+1.71\%$
test_iql_speed[False-None] 9.7013ms 9.1499ms 109.2909 Ops/s 109.9889 Ops/s $\color{#d91a1a}-0.63\%$
test_iql_speed[False-backward] 13.2476ms 12.6932ms 78.7826 Ops/s 78.8214 Ops/s $\color{#d91a1a}-0.05\%$
test_iql_speed[True-None] 1.8906ms 1.7653ms 566.4806 Ops/s 571.5241 Ops/s $\color{#d91a1a}-0.88\%$
test_iql_speed[True-backward] 4.3346ms 4.2521ms 235.1756 Ops/s 233.9091 Ops/s $\color{#35bf28}+0.54\%$
test_iql_speed[reduce-overhead-None] 20.4699ms 11.5295ms 86.7338 Ops/s 110.8517 Ops/s $\textbf{\color{#d91a1a}-21.76\%}$
test_iql_speed[reduce-overhead-backward] 1.4597ms 1.4049ms 711.7804 Ops/s 637.9139 Ops/s $\textbf{\color{#35bf28}+11.58\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.8777ms 6.4174ms 155.8258 Ops/s 153.8090 Ops/s $\color{#35bf28}+1.31\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6777ms 0.2716ms 3.6824 KOps/s 3.1384 KOps/s $\textbf{\color{#35bf28}+17.33\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4080ms 0.2494ms 4.0104 KOps/s 3.4884 KOps/s $\textbf{\color{#35bf28}+14.97\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5397ms 6.2076ms 161.0918 Ops/s 161.0681 Ops/s $\color{#35bf28}+0.01\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.5566ms 0.2858ms 3.4993 KOps/s 3.4530 KOps/s $\color{#35bf28}+1.34\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5379ms 0.2851ms 3.5081 KOps/s 4.1401 KOps/s $\textbf{\color{#d91a1a}-15.26\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4401ms 1.2328ms 811.1701 Ops/s 803.4549 Ops/s $\color{#35bf28}+0.96\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5021ms 1.1840ms 844.5785 Ops/s 854.6761 Ops/s $\color{#d91a1a}-1.18\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6178ms 6.3820ms 156.6912 Ops/s 155.9071 Ops/s $\color{#35bf28}+0.50\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.8882ms 0.4428ms 2.2584 KOps/s 2.4613 KOps/s $\textbf{\color{#d91a1a}-8.24\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6347ms 0.4205ms 2.3780 KOps/s 2.2441 KOps/s $\textbf{\color{#35bf28}+5.97\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3021ms 6.1729ms 161.9983 Ops/s 160.0216 Ops/s $\color{#35bf28}+1.24\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9270ms 0.2722ms 3.6741 KOps/s 3.4797 KOps/s $\textbf{\color{#35bf28}+5.59\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4437ms 0.2484ms 4.0257 KOps/s 3.9823 KOps/s $\color{#35bf28}+1.09\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5029ms 6.1204ms 163.3891 Ops/s 161.5691 Ops/s $\color{#35bf28}+1.13\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.3361ms 0.3563ms 2.8065 KOps/s 2.9506 KOps/s $\color{#d91a1a}-4.88\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5747ms 0.3306ms 3.0249 KOps/s 4.0843 KOps/s $\textbf{\color{#d91a1a}-25.94\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5270ms 6.3274ms 158.0438 Ops/s 157.2574 Ops/s $\color{#35bf28}+0.50\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1928ms 0.4073ms 2.4551 KOps/s 2.4640 KOps/s $\color{#d91a1a}-0.36\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6052ms 0.3859ms 2.5917 KOps/s 2.5820 KOps/s $\color{#35bf28}+0.38\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.8873ms 5.2111ms 191.8994 Ops/s 192.5939 Ops/s $\color{#d91a1a}-0.36\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 10.2491ms 2.0680ms 483.5554 Ops/s 448.3594 Ops/s $\textbf{\color{#35bf28}+7.85\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.7348ms 1.1863ms 842.9493 Ops/s 812.6192 Ops/s $\color{#35bf28}+3.73\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.8067ms 5.2455ms 190.6389 Ops/s 192.2909 Ops/s $\color{#d91a1a}-0.86\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.4880s 11.7917ms 84.8057 Ops/s 422.0524 Ops/s $\textbf{\color{#d91a1a}-79.91\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.0659ms 1.0994ms 909.6019 Ops/s 852.4967 Ops/s $\textbf{\color{#35bf28}+6.70\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 8.4044ms 5.4730ms 182.7146 Ops/s 33.3335 Ops/s $\textbf{\color{#35bf28}+448.14\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.5927ms 2.1640ms 462.1064 Ops/s 471.9053 Ops/s $\color{#d91a1a}-2.08\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 8.6897ms 1.4251ms 701.7129 Ops/s 732.0584 Ops/s $\color{#d91a1a}-4.15\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.4374ms 13.2377ms 75.5418 Ops/s 75.2926 Ops/s $\color{#35bf28}+0.33\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 21.7004ms 18.5121ms 54.0188 Ops/s 59.5560 Ops/s $\textbf{\color{#d91a1a}-9.30\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.2809ms 17.8543ms 56.0089 Ops/s 54.6859 Ops/s $\color{#35bf28}+2.42\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.5112ms 18.1338ms 55.1456 Ops/s 56.8236 Ops/s $\color{#d91a1a}-2.95\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.7490ms 17.4714ms 57.2363 Ops/s 55.5358 Ops/s $\color{#35bf28}+3.06\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.4336ms 19.2888ms 51.8435 Ops/s 52.7809 Ops/s $\color{#d91a1a}-1.78\%$
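
The last column of the rows above is consistent with a relative change between the two Ops/s columns. A minimal sketch of that arithmetic follows. It assumes the first Ops/s value is the PR and the second is the base (the column header is not part of this excerpt), and `relative_change` is a hypothetical helper, not part of the benchmark tooling.

```python
def relative_change(pr_ops: float, base_ops: float) -> float:
    """Percent change of the first throughput relative to the second.

    Assumption: the first Ops/s column is the PR and the second is the base.
    """
    return (pr_ops - base_ops) / base_ops * 100.0


# Values copied from the
# test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] row.
change = relative_change(182.7146, 33.3335)
print(f"{change:+.2f}%")  # prints "+448.14%"
```

The same formula reproduces the -79.91% reported for `test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400]` (84.8057 Ops/s against 422.0524 Ops/s).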

vmoens added a commit that referenced this pull request Dec 14, 2024
ghstack-source-id: b853800ff1661109f635a60e84a7534d74988b09
Pull Request resolved: #2571
@vmoens vmoens added the enhancement New feature or request label Dec 14, 2024
@vmoens vmoens merged commit e5a358b into gh/vmoens/41/base Dec 15, 2024
66 of 72 checks passed
vmoens added a commit that referenced this pull request Dec 15, 2024
ghstack-source-id: 113dc8c4a5562d217ed867ace1942b2f6b8a39f9
Pull Request resolved: #2571
@vmoens vmoens deleted the gh/vmoens/41/head branch December 15, 2024 01:41