[Feature] SAC compatibility with compile #2655

Merged
merged 6 commits into from
Dec 16, 2024

@vmoens (Contributor) commented Dec 16, 2024

[ghstack-poisoned]

pytorch-bot bot commented Dec 16, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2655

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 8 Unrelated Failures

As of commit 18f92c6 with merge base 187de7c (image):

NEW FAILURE - The following job has failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Dec 16, 2024
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: ef858e8d49d28d2db724835d787150d6177b1e13
Pull Request resolved: #2655
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: ec5fe4f410c9b5bece189a8845c5d937a64f858b
Pull Request resolved: #2655
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: ce4cbedbeeaa58ac7da30a74a3c3a240d2280f8a
Pull Request resolved: #2655
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: fca7dc8cd197ed9fdb4048d99a9e608e6e100af4
Pull Request resolved: #2655
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: 4dfbef136a90a2eb4bc13e31c3c9533f7145a8f4
Pull Request resolved: #2655
[ghstack-poisoned]

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}3$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.4291s 0.4277s 2.3378 Ops/s 2.2238 Ops/s $\textbf{\color{#35bf28}+5.13\%}$
test_transformed 0.6036s 0.6025s 1.6599 Ops/s 1.5879 Ops/s $\color{#35bf28}+4.54\%$
test_serial 1.3499s 1.3476s 0.7421 Ops/s 0.7185 Ops/s $\color{#35bf28}+3.28\%$
test_parallel 1.3830s 1.3061s 0.7657 Ops/s 0.7624 Ops/s $\color{#35bf28}+0.43\%$
test_step_mdp_speed[True-True-True-True-True] 0.1639ms 29.4204μs 33.9900 KOps/s 33.8978 KOps/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[True-True-True-True-False] 47.4890μs 17.4289μs 57.3760 KOps/s 57.3100 KOps/s $\color{#35bf28}+0.12\%$
test_step_mdp_speed[True-True-True-False-True] 57.2470μs 16.8214μs 59.4481 KOps/s 59.2979 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[True-True-True-False-False] 35.6460μs 9.8699μs 101.3183 KOps/s 102.7175 KOps/s $\color{#d91a1a}-1.36\%$
test_step_mdp_speed[True-True-False-True-True] 80.1990μs 31.9493μs 31.2996 KOps/s 31.5508 KOps/s $\color{#d91a1a}-0.80\%$
test_step_mdp_speed[True-True-False-True-False] 56.1250μs 19.5432μs 51.1686 KOps/s 51.7697 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[True-True-False-False-True] 0.6425ms 18.8679μs 53.0001 KOps/s 53.4509 KOps/s $\color{#d91a1a}-0.84\%$
test_step_mdp_speed[True-True-False-False-False] 36.7490μs 11.7005μs 85.4661 KOps/s 85.1767 KOps/s $\color{#35bf28}+0.34\%$
test_step_mdp_speed[True-False-True-True-True] 84.8890μs 33.8209μs 29.5675 KOps/s 29.5587 KOps/s $\color{#35bf28}+0.03\%$
test_step_mdp_speed[True-False-True-True-False] 63.3060μs 21.2743μs 47.0051 KOps/s 48.0099 KOps/s $\color{#d91a1a}-2.09\%$
test_step_mdp_speed[True-False-True-False-True] 51.8170μs 18.4299μs 54.2596 KOps/s 54.3455 KOps/s $\color{#d91a1a}-0.16\%$
test_step_mdp_speed[True-False-True-False-False] 45.4550μs 11.6884μs 85.5552 KOps/s 85.8505 KOps/s $\color{#d91a1a}-0.34\%$
test_step_mdp_speed[True-False-False-True-True] 93.5970μs 35.0019μs 28.5699 KOps/s 28.2218 KOps/s $\color{#35bf28}+1.23\%$
test_step_mdp_speed[True-False-False-True-False] 69.7200μs 22.9041μs 43.6604 KOps/s 44.1686 KOps/s $\color{#d91a1a}-1.15\%$
test_step_mdp_speed[True-False-False-False-True] 52.8780μs 20.2087μs 49.4837 KOps/s 49.1737 KOps/s $\color{#35bf28}+0.63\%$
test_step_mdp_speed[True-False-False-False-False] 37.6200μs 13.3572μs 74.8657 KOps/s 76.0851 KOps/s $\color{#d91a1a}-1.60\%$
test_step_mdp_speed[False-True-True-True-True] 75.2210μs 33.4451μs 29.8997 KOps/s 29.8602 KOps/s $\color{#35bf28}+0.13\%$
test_step_mdp_speed[False-True-True-True-False] 60.2120μs 21.5673μs 46.3665 KOps/s 46.8655 KOps/s $\color{#d91a1a}-1.06\%$
test_step_mdp_speed[False-True-True-False-True] 58.6890μs 20.7525μs 48.1870 KOps/s 47.6408 KOps/s $\color{#35bf28}+1.15\%$
test_step_mdp_speed[False-True-True-False-False] 50.5950μs 12.8618μs 77.7497 KOps/s 77.1188 KOps/s $\color{#35bf28}+0.82\%$
test_step_mdp_speed[False-True-False-True-True] 75.4410μs 34.8265μs 28.7138 KOps/s 28.3021 KOps/s $\color{#35bf28}+1.45\%$
test_step_mdp_speed[False-True-False-True-False] 71.6830μs 22.7389μs 43.9775 KOps/s 43.6092 KOps/s $\color{#35bf28}+0.84\%$
test_step_mdp_speed[False-True-False-False-True] 2.7494ms 22.7133μs 44.0271 KOps/s 43.4776 KOps/s $\color{#35bf28}+1.26\%$
test_step_mdp_speed[False-True-False-False-False] 40.8660μs 14.6905μs 68.0714 KOps/s 66.9959 KOps/s $\color{#35bf28}+1.61\%$
test_step_mdp_speed[False-False-True-True-True] 84.4070μs 37.0981μs 26.9556 KOps/s 27.1434 KOps/s $\color{#d91a1a}-0.69\%$
test_step_mdp_speed[False-False-True-True-False] 0.5859ms 24.7159μs 40.4598 KOps/s 40.0057 KOps/s $\color{#35bf28}+1.13\%$
test_step_mdp_speed[False-False-True-False-True] 57.8380μs 22.6954μs 44.0618 KOps/s 44.1443 KOps/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[False-False-True-False-False] 57.0870μs 14.4112μs 69.3903 KOps/s 67.1982 KOps/s $\color{#35bf28}+3.26\%$
test_step_mdp_speed[False-False-False-True-True] 95.4290μs 38.0226μs 26.3001 KOps/s 25.7444 KOps/s $\color{#35bf28}+2.16\%$
test_step_mdp_speed[False-False-False-True-False] 72.3250μs 26.4648μs 37.7861 KOps/s 37.9363 KOps/s $\color{#d91a1a}-0.40\%$
test_step_mdp_speed[False-False-False-False-True] 58.3290μs 23.9251μs 41.7971 KOps/s 41.4040 KOps/s $\color{#35bf28}+0.95\%$
test_step_mdp_speed[False-False-False-False-False] 58.3890μs 16.1542μs 61.9035 KOps/s 61.6656 KOps/s $\color{#35bf28}+0.39\%$
test_values[generalized_advantage_estimate-True-True] 9.7024ms 9.2165ms 108.5012 Ops/s 103.8720 Ops/s $\color{#35bf28}+4.46\%$
test_values[vec_generalized_advantage_estimate-True-True] 37.2726ms 35.5608ms 28.1209 Ops/s 28.2656 Ops/s $\color{#d91a1a}-0.51\%$
test_values[td0_return_estimate-False-False] 0.2422ms 0.1773ms 5.6398 KOps/s 5.6278 KOps/s $\color{#35bf28}+0.21\%$
test_values[td1_return_estimate-False-False] 25.7017ms 23.0599ms 43.3653 Ops/s 42.5840 Ops/s $\color{#35bf28}+1.83\%$
test_values[vec_td1_return_estimate-False-False] 38.2710ms 35.8455ms 27.8975 Ops/s 28.0963 Ops/s $\color{#d91a1a}-0.71\%$
test_values[td_lambda_return_estimate-True-False] 35.6235ms 33.3493ms 29.9856 Ops/s 29.3240 Ops/s $\color{#35bf28}+2.26\%$
test_values[vec_td_lambda_return_estimate-True-False] 37.6968ms 35.7887ms 27.9418 Ops/s 27.9581 Ops/s $\color{#d91a1a}-0.06\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.3742ms 8.1155ms 123.2205 Ops/s 116.9093 Ops/s $\textbf{\color{#35bf28}+5.40\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.4988ms 1.9736ms 506.6923 Ops/s 515.2124 Ops/s $\color{#d91a1a}-1.65\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4381ms 0.3629ms 2.7554 KOps/s 2.7721 KOps/s $\color{#d91a1a}-0.60\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 49.4287ms 48.1514ms 20.7678 Ops/s 21.6875 Ops/s $\color{#d91a1a}-4.24\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.9120ms 3.0217ms 330.9382 Ops/s 330.6267 Ops/s $\color{#35bf28}+0.09\%$
test_dqn_speed[False-None] 6.0195ms 1.4029ms 712.8243 Ops/s 714.7876 Ops/s $\color{#d91a1a}-0.27\%$
test_dqn_speed[False-backward] 2.0213ms 1.8991ms 526.5747 Ops/s 536.9262 Ops/s $\color{#d91a1a}-1.93\%$
test_dqn_speed[True-None] 0.7828ms 0.4668ms 2.1424 KOps/s 2.0943 KOps/s $\color{#35bf28}+2.30\%$
test_dqn_speed[True-backward] 0.9695ms 0.8953ms 1.1170 KOps/s 1.0996 KOps/s $\color{#35bf28}+1.58\%$
test_dqn_speed[reduce-overhead-None] 0.7523ms 0.4688ms 2.1333 KOps/s 2.1097 KOps/s $\color{#35bf28}+1.12\%$
test_dqn_speed[reduce-overhead-backward] 0.9735ms 0.9098ms 1.0991 KOps/s 1.0979 KOps/s $\color{#35bf28}+0.11\%$
test_ddpg_speed[False-None] 3.9005ms 2.9164ms 342.8864 Ops/s 344.8360 Ops/s $\color{#d91a1a}-0.57\%$
test_ddpg_speed[False-backward] 4.7299ms 4.1028ms 243.7351 Ops/s 250.0545 Ops/s $\color{#d91a1a}-2.53\%$
test_ddpg_speed[True-None] 1.2614ms 1.0069ms 993.1058 Ops/s 968.3148 Ops/s $\color{#35bf28}+2.56\%$
test_ddpg_speed[True-backward] 1.9335ms 1.8837ms 530.8695 Ops/s 519.9313 Ops/s $\color{#35bf28}+2.10\%$
test_ddpg_speed[reduce-overhead-None] 1.1926ms 0.9904ms 1.0097 KOps/s 968.0129 Ops/s $\color{#35bf28}+4.30\%$
test_ddpg_speed[reduce-overhead-backward] 1.9689ms 1.8874ms 529.8157 Ops/s 517.7472 Ops/s $\color{#35bf28}+2.33\%$
test_sac_speed[False-None] 9.2716ms 8.1255ms 123.0696 Ops/s 124.1092 Ops/s $\color{#d91a1a}-0.84\%$
test_sac_speed[False-backward] 11.2159ms 10.8713ms 91.9849 Ops/s 93.0812 Ops/s $\color{#d91a1a}-1.18\%$
test_sac_speed[True-None] 2.2050ms 1.8105ms 552.3247 Ops/s 539.6015 Ops/s $\color{#35bf28}+2.36\%$
test_sac_speed[True-backward] 4.4579ms 3.5518ms 281.5500 Ops/s 282.8711 Ops/s $\color{#d91a1a}-0.47\%$
test_sac_speed[reduce-overhead-None] 2.4002ms 1.8180ms 550.0680 Ops/s 541.7945 Ops/s $\color{#35bf28}+1.53\%$
test_sac_speed[reduce-overhead-backward] 3.5737ms 3.4870ms 286.7781 Ops/s 280.9837 Ops/s $\color{#35bf28}+2.06\%$
test_redq_speed[False-None] 19.5499ms 14.0879ms 70.9827 Ops/s 76.1906 Ops/s $\textbf{\color{#d91a1a}-6.84\%}$
test_redq_speed[False-backward] 23.1701ms 22.3452ms 44.7523 Ops/s 45.4539 Ops/s $\color{#d91a1a}-1.54\%$
test_redq_speed[True-None] 4.9229ms 4.4516ms 224.6374 Ops/s 222.5090 Ops/s $\color{#35bf28}+0.96\%$
test_redq_speed[True-backward] 13.5217ms 11.9991ms 83.3394 Ops/s 83.6525 Ops/s $\color{#d91a1a}-0.37\%$
test_redq_speed[reduce-overhead-None] 5.3378ms 4.4526ms 224.5881 Ops/s 222.0474 Ops/s $\color{#35bf28}+1.14\%$
test_redq_speed[reduce-overhead-backward] 12.1386ms 11.9224ms 83.8760 Ops/s 82.9292 Ops/s $\color{#35bf28}+1.14\%$
test_redq_deprec_speed[False-None] 14.4964ms 12.7081ms 78.6900 Ops/s 77.7742 Ops/s $\color{#35bf28}+1.18\%$
test_redq_deprec_speed[False-backward] 20.3526ms 18.4175ms 54.2963 Ops/s 54.1637 Ops/s $\color{#35bf28}+0.24\%$
test_redq_deprec_speed[True-None] 4.7606ms 3.5354ms 282.8512 Ops/s 276.4710 Ops/s $\color{#35bf28}+2.31\%$
test_redq_deprec_speed[True-backward] 9.1527ms 8.0781ms 123.7912 Ops/s 123.7801 Ops/s $+0.01\%$
test_redq_deprec_speed[reduce-overhead-None] 4.0111ms 3.5298ms 283.3033 Ops/s 278.8284 Ops/s $\color{#35bf28}+1.60\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.6230ms 7.9508ms 125.7736 Ops/s 123.8501 Ops/s $\color{#35bf28}+1.55\%$
test_td3_speed[False-None] 8.2035ms 7.9649ms 125.5508 Ops/s 123.5176 Ops/s $\color{#35bf28}+1.65\%$
test_td3_speed[False-backward] 10.8341ms 10.4199ms 95.9705 Ops/s 96.1410 Ops/s $\color{#d91a1a}-0.18\%$
test_td3_speed[True-None] 1.8636ms 1.6983ms 588.8163 Ops/s 571.9151 Ops/s $\color{#35bf28}+2.96\%$
test_td3_speed[True-backward] 3.3878ms 3.3044ms 302.6223 Ops/s 281.4635 Ops/s $\textbf{\color{#35bf28}+7.52\%}$
test_td3_speed[reduce-overhead-None] 1.8901ms 1.6934ms 590.5239 Ops/s 561.4740 Ops/s $\textbf{\color{#35bf28}+5.17\%}$
test_td3_speed[reduce-overhead-backward] 3.4354ms 3.3011ms 302.9317 Ops/s 296.3272 Ops/s $\color{#35bf28}+2.23\%$
test_cql_speed[False-None] 39.2091ms 36.1316ms 27.6766 Ops/s 27.2520 Ops/s $\color{#35bf28}+1.56\%$
test_cql_speed[False-backward] 49.2996ms 46.5483ms 21.4831 Ops/s 21.4864 Ops/s $\color{#d91a1a}-0.02\%$
test_cql_speed[True-None] 16.6705ms 15.6194ms 64.0231 Ops/s 62.4088 Ops/s $\color{#35bf28}+2.59\%$
test_cql_speed[True-backward] 23.3996ms 22.6010ms 44.2458 Ops/s 42.8787 Ops/s $\color{#35bf28}+3.19\%$
test_cql_speed[reduce-overhead-None] 18.0488ms 15.9130ms 62.8417 Ops/s 63.2802 Ops/s $\color{#d91a1a}-0.69\%$
test_cql_speed[reduce-overhead-backward] 23.3917ms 22.1483ms 45.1502 Ops/s 44.4239 Ops/s $\color{#35bf28}+1.64\%$
test_a2c_speed[False-None] 9.1656ms 7.1287ms 140.2783 Ops/s 137.7825 Ops/s $\color{#35bf28}+1.81\%$
test_a2c_speed[False-backward] 14.3741ms 14.0399ms 71.2254 Ops/s 69.6592 Ops/s $\color{#35bf28}+2.25\%$
test_a2c_speed[True-None] 4.9403ms 4.1864ms 238.8665 Ops/s 236.3591 Ops/s $\color{#35bf28}+1.06\%$
test_a2c_speed[True-backward] 11.0215ms 10.5751ms 94.5615 Ops/s 93.1959 Ops/s $\color{#35bf28}+1.47\%$
test_a2c_speed[reduce-overhead-None] 5.1936ms 4.1819ms 239.1246 Ops/s 235.0086 Ops/s $\color{#35bf28}+1.75\%$
test_a2c_speed[reduce-overhead-backward] 11.8306ms 10.6183ms 94.1769 Ops/s 93.3423 Ops/s $\color{#35bf28}+0.89\%$
test_ppo_speed[False-None] 7.5223ms 7.3851ms 135.4082 Ops/s 133.2492 Ops/s $\color{#35bf28}+1.62\%$
test_ppo_speed[False-backward] 15.3343ms 14.6612ms 68.2071 Ops/s 67.7418 Ops/s $\color{#35bf28}+0.69\%$
test_ppo_speed[True-None] 4.0545ms 3.6750ms 272.1102 Ops/s 267.0520 Ops/s $\color{#35bf28}+1.89\%$
test_ppo_speed[True-backward] 9.9678ms 9.5307ms 104.9244 Ops/s 104.1411 Ops/s $\color{#35bf28}+0.75\%$
test_ppo_speed[reduce-overhead-None] 4.0203ms 3.6649ms 272.8582 Ops/s 269.4086 Ops/s $\color{#35bf28}+1.28\%$
test_ppo_speed[reduce-overhead-backward] 9.8533ms 9.5601ms 104.6009 Ops/s 104.0320 Ops/s $\color{#35bf28}+0.55\%$
test_reinforce_speed[False-None] 7.4448ms 6.4892ms 154.1010 Ops/s 152.0597 Ops/s $\color{#35bf28}+1.34\%$
test_reinforce_speed[False-backward] 10.6024ms 9.8517ms 101.5056 Ops/s 100.5662 Ops/s $\color{#35bf28}+0.93\%$
test_reinforce_speed[True-None] 3.5098ms 2.6958ms 370.9473 Ops/s 371.3517 Ops/s $\color{#d91a1a}-0.11\%$
test_reinforce_speed[True-backward] 8.8801ms 8.5045ms 117.5845 Ops/s 115.8982 Ops/s $\color{#35bf28}+1.45\%$
test_reinforce_speed[reduce-overhead-None] 3.2710ms 2.6292ms 380.3415 Ops/s 371.4722 Ops/s $\color{#35bf28}+2.39\%$
test_reinforce_speed[reduce-overhead-backward] 9.4746ms 8.5013ms 117.6294 Ops/s 114.9888 Ops/s $\color{#35bf28}+2.30\%$
test_iql_speed[False-None] 33.2844ms 31.5332ms 31.7126 Ops/s 31.1206 Ops/s $\color{#35bf28}+1.90\%$
test_iql_speed[False-backward] 45.7760ms 44.1266ms 22.6621 Ops/s 22.2964 Ops/s $\color{#35bf28}+1.64\%$
test_iql_speed[True-None] 11.6136ms 10.5556ms 94.7364 Ops/s 92.9761 Ops/s $\color{#35bf28}+1.89\%$
test_iql_speed[True-backward] 22.1540ms 21.3479ms 46.8431 Ops/s 46.7186 Ops/s $\color{#35bf28}+0.27\%$
test_iql_speed[reduce-overhead-None] 11.3825ms 10.4817ms 95.4042 Ops/s 92.7091 Ops/s $\color{#35bf28}+2.91\%$
test_iql_speed[reduce-overhead-backward] 22.4261ms 21.4196ms 46.6861 Ops/s 46.5162 Ops/s $\color{#35bf28}+0.37\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3571ms 4.9001ms 204.0789 Ops/s 199.6227 Ops/s $\color{#35bf28}+2.23\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8861ms 0.5028ms 1.9888 KOps/s 1.9722 KOps/s $\color{#35bf28}+0.84\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7225ms 0.4757ms 2.1022 KOps/s 2.0659 KOps/s $\color{#35bf28}+1.76\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.9638ms 4.7315ms 211.3512 Ops/s 215.2984 Ops/s $\color{#d91a1a}-1.83\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.2557ms 0.5179ms 1.9309 KOps/s 2.0183 KOps/s $\color{#d91a1a}-4.33\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8091ms 0.4647ms 2.1521 KOps/s 2.1484 KOps/s $\color{#35bf28}+0.17\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.3693ms 1.6229ms 616.1782 Ops/s 604.6947 Ops/s $\color{#35bf28}+1.90\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.7679ms 1.5667ms 638.2776 Ops/s 629.8050 Ops/s $\color{#35bf28}+1.35\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.3549ms 4.8392ms 206.6454 Ops/s 205.2856 Ops/s $\color{#35bf28}+0.66\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.3382ms 0.6317ms 1.5830 KOps/s 1.5306 KOps/s $\color{#35bf28}+3.42\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0516ms 0.6115ms 1.6352 KOps/s 1.6211 KOps/s $\color{#35bf28}+0.87\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.8082ms 4.7150ms 212.0880 Ops/s 212.1173 Ops/s $\color{#d91a1a}-0.01\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9267ms 0.5075ms 1.9703 KOps/s 1.9338 KOps/s $\color{#35bf28}+1.89\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8515ms 0.4863ms 2.0564 KOps/s 2.0760 KOps/s $\color{#d91a1a}-0.94\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.1867ms 4.7223ms 211.7633 Ops/s 213.6493 Ops/s $\color{#d91a1a}-0.88\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.2809ms 0.4951ms 2.0196 KOps/s 2.0276 KOps/s $\color{#d91a1a}-0.39\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6930ms 0.4633ms 2.1585 KOps/s 2.0820 KOps/s $\color{#35bf28}+3.67\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.2553ms 4.8589ms 205.8081 Ops/s 199.9175 Ops/s $\color{#35bf28}+2.95\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2765ms 0.6318ms 1.5827 KOps/s 1.5507 KOps/s $\color{#35bf28}+2.07\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0509ms 0.6126ms 1.6323 KOps/s 1.5990 KOps/s $\color{#35bf28}+2.09\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.4086ms 4.1693ms 239.8496 Ops/s 38.4322 Ops/s $\textbf{\color{#35bf28}+524.08\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 0.3976s 10.1902ms 98.1338 Ops/s 433.7825 Ops/s $\textbf{\color{#d91a1a}-77.38\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 3.7538ms 1.2515ms 799.0105 Ops/s 746.6219 Ops/s $\textbf{\color{#35bf28}+7.02\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 5.5631ms 4.1690ms 239.8645 Ops/s 229.9625 Ops/s $\color{#35bf28}+4.31\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.1566ms 2.3553ms 424.5758 Ops/s 431.0581 Ops/s $\color{#d91a1a}-1.50\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.7021ms 1.2151ms 822.9826 Ops/s 824.8922 Ops/s $\color{#d91a1a}-0.23\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3643s 11.5248ms 86.7695 Ops/s 238.6658 Ops/s $\textbf{\color{#d91a1a}-63.64\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 6.5247ms 2.4060ms 415.6275 Ops/s 389.8454 Ops/s $\textbf{\color{#35bf28}+6.61\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.1217ms 1.5329ms 652.3476 Ops/s 679.9195 Ops/s $\color{#d91a1a}-4.06\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 12.0743ms 11.5100ms 86.8811 Ops/s 82.9833 Ops/s $\color{#35bf28}+4.70\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 16.7983ms 15.1535ms 65.9914 Ops/s 66.1975 Ops/s $\color{#d91a1a}-0.31\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 22.1670ms 20.1663ms 49.5878 Ops/s 48.4388 Ops/s $\color{#35bf28}+2.37\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 16.2088ms 15.2264ms 65.6753 Ops/s 64.5275 Ops/s $\color{#35bf28}+1.78\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.3451ms 19.8797ms 50.3025 Ops/s 49.0729 Ops/s $\color{#35bf28}+2.51\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.9602ms 16.5293ms 60.4985 Ops/s 58.6549 Ops/s $\color{#35bf28}+3.14\%$
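The Change column above appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below reproduces that arithmetic for a few CPU rows. The helper names (`parse_ops`, `percent_change`, `classify`) are hypothetical and not part of the benchmark tooling. The 5% cutoff used to label a row as improved or worsened is an assumption inferred from which rows are bolded in the table, not a documented setting.

```python
# Multipliers for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '57.3760 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def percent_change(ops: str, ops_head: str) -> float:
    """Relative change of this PR's throughput versus repo HEAD, in percent."""
    new, base = parse_ops(ops), parse_ops(ops_head)
    return (new - base) / base * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption, not a known setting."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_simple": ("2.3378 Ops/s", "2.2238 Ops/s"),
    "test_sac_speed[True-None]": ("552.3247 Ops/s", "539.6015 Ops/s"),
    "test_redq_speed[False-None]": ("70.9827 Ops/s", "76.1906 Ops/s"),
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table rounds throughput to four decimals (and to fewer significant digits when a value is shown in KOps/s), a change recomputed this way can differ from the reported one in the last digit.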

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.7507s 0.7490s 1.3351 Ops/s 1.2938 Ops/s $\color{#35bf28}+3.19\%$
test_transformed 1.0028s 1.0024s 0.9976 Ops/s 0.9912 Ops/s $\color{#35bf28}+0.65\%$
test_serial 2.1414s 2.1397s 0.4674 Ops/s 0.4647 Ops/s $\color{#35bf28}+0.56\%$
test_parallel 2.0228s 1.9828s 0.5043 Ops/s 0.5115 Ops/s $\color{#d91a1a}-1.40\%$
test_step_mdp_speed[True-True-True-True-True] 0.1673ms 39.3781μs 25.3948 KOps/s 24.8007 KOps/s $\color{#35bf28}+2.40\%$
test_step_mdp_speed[True-True-True-True-False] 55.0310μs 22.6515μs 44.1473 KOps/s 43.5736 KOps/s $\color{#35bf28}+1.32\%$
test_step_mdp_speed[True-True-True-False-True] 49.6510μs 21.4870μs 46.5399 KOps/s 44.6337 KOps/s $\color{#35bf28}+4.27\%$
test_step_mdp_speed[True-True-True-False-False] 46.5110μs 12.5698μs 79.5555 KOps/s 79.1670 KOps/s $\color{#35bf28}+0.49\%$
test_step_mdp_speed[True-True-False-True-True] 86.9210μs 41.4479μs 24.1266 KOps/s 23.9330 KOps/s $\color{#35bf28}+0.81\%$
test_step_mdp_speed[True-True-False-True-False] 58.2010μs 24.6884μs 40.5048 KOps/s 41.0651 KOps/s $\color{#d91a1a}-1.36\%$
test_step_mdp_speed[True-True-False-False-True] 62.7010μs 23.8594μs 41.9122 KOps/s 41.3064 KOps/s $\color{#35bf28}+1.47\%$
test_step_mdp_speed[True-True-False-False-False] 46.8800μs 14.9175μs 67.0353 KOps/s 67.9843 KOps/s $\color{#d91a1a}-1.40\%$
test_step_mdp_speed[True-False-True-True-True] 71.9810μs 44.1329μs 22.6588 KOps/s 22.3807 KOps/s $\color{#35bf28}+1.24\%$
test_step_mdp_speed[True-False-True-True-False] 54.5110μs 26.7939μs 37.3219 KOps/s 37.0720 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[True-False-True-False-True] 59.2310μs 23.6609μs 42.2639 KOps/s 40.5095 KOps/s $\color{#35bf28}+4.33\%$
test_step_mdp_speed[True-False-True-False-False] 43.8700μs 14.8200μs 67.4765 KOps/s 65.7266 KOps/s $\color{#35bf28}+2.66\%$
test_step_mdp_speed[True-False-False-True-True] 86.0020μs 45.4049μs 22.0241 KOps/s 21.2366 KOps/s $\color{#35bf28}+3.71\%$
test_step_mdp_speed[True-False-False-True-False] 51.9710μs 28.5306μs 35.0501 KOps/s 34.1037 KOps/s $\color{#35bf28}+2.78\%$
test_step_mdp_speed[True-False-False-False-True] 62.6810μs 25.7938μs 38.7690 KOps/s 37.8705 KOps/s $\color{#35bf28}+2.37\%$
test_step_mdp_speed[True-False-False-False-False] 0.3809ms 16.5662μs 60.3637 KOps/s 58.6602 KOps/s $\color{#35bf28}+2.90\%$
test_step_mdp_speed[False-True-True-True-True] 74.5210μs 43.8012μs 22.8304 KOps/s 22.3228 KOps/s $\color{#35bf28}+2.27\%$
test_step_mdp_speed[False-True-True-True-False] 0.4310ms 26.9024μs 37.1714 KOps/s 36.8520 KOps/s $\color{#35bf28}+0.87\%$
test_step_mdp_speed[False-True-True-False-True] 0.4137ms 27.7326μs 36.0587 KOps/s 34.8134 KOps/s $\color{#35bf28}+3.58\%$
test_step_mdp_speed[False-True-True-False-False] 0.4128ms 16.5713μs 60.3451 KOps/s 58.8670 KOps/s $\color{#35bf28}+2.51\%$
test_step_mdp_speed[False-True-False-True-True] 79.3710μs 45.3743μs 22.0389 KOps/s 21.2769 KOps/s $\color{#35bf28}+3.58\%$
test_step_mdp_speed[False-True-False-True-False] 0.4266ms 29.1450μs 34.3112 KOps/s 34.5802 KOps/s $\color{#d91a1a}-0.78\%$
test_step_mdp_speed[False-True-False-False-True] 3.3442ms 30.0173μs 33.3141 KOps/s 33.5443 KOps/s $\color{#d91a1a}-0.69\%$
test_step_mdp_speed[False-True-False-False-False] 51.8110μs 18.6243μs 53.6933 KOps/s 53.1426 KOps/s $\color{#35bf28}+1.04\%$
test_step_mdp_speed[False-False-True-True-True] 0.4559ms 48.2358μs 20.7315 KOps/s 20.3533 KOps/s $\color{#35bf28}+1.86\%$
test_step_mdp_speed[False-False-True-True-False] 0.4388ms 31.2780μs 31.9713 KOps/s 31.8542 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[False-False-True-False-True] 0.4261ms 29.5157μs 33.8803 KOps/s 32.4962 KOps/s $\color{#35bf28}+4.26\%$
test_step_mdp_speed[False-False-True-False-False] 59.1610μs 18.4619μs 54.1655 KOps/s 52.3576 KOps/s $\color{#35bf28}+3.45\%$
test_step_mdp_speed[False-False-False-True-True] 0.4396ms 48.7018μs 20.5331 KOps/s 19.9212 KOps/s $\color{#35bf28}+3.07\%$
test_step_mdp_speed[False-False-False-True-False] 0.4249ms 33.1884μs 30.1310 KOps/s 29.7116 KOps/s $\color{#35bf28}+1.41\%$
test_step_mdp_speed[False-False-False-False-True] 0.4350ms 31.3944μs 31.8528 KOps/s 32.3157 KOps/s $\color{#d91a1a}-1.43\%$
test_step_mdp_speed[False-False-False-False-False] 0.4229ms 20.5170μs 48.7400 KOps/s 48.4258 KOps/s $\color{#35bf28}+0.65\%$
test_values[generalized_advantage_estimate-True-True] 25.4353ms 24.5090ms 40.8014 Ops/s 41.3817 Ops/s $\color{#d91a1a}-1.40\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1149s 3.1924ms 313.2456 Ops/s 355.0592 Ops/s $\textbf{\color{#d91a1a}-11.78\%}$
test_values[td0_return_estimate-False-False] 0.1036ms 80.0608μs 12.4905 KOps/s 12.5316 KOps/s $\color{#d91a1a}-0.33\%$
test_values[td1_return_estimate-False-False] 56.8781ms 55.5790ms 17.9924 Ops/s 18.5525 Ops/s $\color{#d91a1a}-3.02\%$
test_values[vec_td1_return_estimate-False-False] 1.3238ms 1.0810ms 925.1030 Ops/s 926.1543 Ops/s $\color{#d91a1a}-0.11\%$
test_values[td_lambda_return_estimate-True-False] 90.2151ms 87.8948ms 11.3772 Ops/s 11.6506 Ops/s $\color{#d91a1a}-2.35\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2946ms 1.0741ms 930.9873 Ops/s 927.5659 Ops/s $\color{#35bf28}+0.37\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.3750ms 24.1039ms 41.4870 Ops/s 41.1849 Ops/s $\color{#35bf28}+0.73\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0343ms 0.7484ms 1.3362 KOps/s 1.3332 KOps/s $\color{#35bf28}+0.23\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8593ms 0.6655ms 1.5026 KOps/s 1.5044 KOps/s $\color{#d91a1a}-0.12\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5256ms 1.4792ms 676.0472 Ops/s 676.6368 Ops/s $\color{#d91a1a}-0.09\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7551ms 0.6790ms 1.4727 KOps/s 1.4738 KOps/s $\color{#d91a1a}-0.08\%$
test_dqn_speed[False-None] 7.0476ms 1.5038ms 664.9810 Ops/s 666.3626 Ops/s $\color{#d91a1a}-0.21\%$
test_dqn_speed[False-backward] 2.3097ms 2.0947ms 477.3990 Ops/s 474.5411 Ops/s $\color{#35bf28}+0.60\%$
test_dqn_speed[True-None] 0.9547ms 0.5336ms 1.8742 KOps/s 1.8409 KOps/s $\color{#35bf28}+1.81\%$
test_dqn_speed[True-backward] 1.2553ms 1.1954ms 836.5374 Ops/s 817.9807 Ops/s $\color{#35bf28}+2.27\%$
test_dqn_speed[reduce-overhead-None] 0.6232ms 0.5481ms 1.8244 KOps/s 1.7653 KOps/s $\color{#35bf28}+3.35\%$
test_dqn_speed[reduce-overhead-backward] 1.1309ms 1.0621ms 941.5000 Ops/s 929.8864 Ops/s $\color{#35bf28}+1.25\%$
test_ddpg_speed[False-None] 3.1324ms 2.8122ms 355.5885 Ops/s 347.3204 Ops/s $\color{#35bf28}+2.38\%$
test_ddpg_speed[False-backward] 4.5878ms 4.1562ms 240.6046 Ops/s 238.0404 Ops/s $\color{#35bf28}+1.08\%$
test_ddpg_speed[True-None] 1.1542ms 1.0756ms 929.6944 Ops/s 927.4850 Ops/s $\color{#35bf28}+0.24\%$
test_ddpg_speed[True-backward] 2.3271ms 2.2793ms 438.7366 Ops/s 434.9631 Ops/s $\color{#35bf28}+0.87\%$
test_ddpg_speed[reduce-overhead-None] 1.5032ms 1.0826ms 923.6725 Ops/s 895.8134 Ops/s $\color{#35bf28}+3.11\%$
test_ddpg_speed[reduce-overhead-backward] 1.8226ms 1.7593ms 568.4126 Ops/s 559.7306 Ops/s $\color{#35bf28}+1.55\%$
test_sac_speed[False-None] 8.3488ms 7.9583ms 125.6542 Ops/s 123.9977 Ops/s $\color{#35bf28}+1.34\%$
test_sac_speed[False-backward] 11.5469ms 11.0851ms 90.2115 Ops/s 89.8644 Ops/s $\color{#35bf28}+0.39\%$
test_sac_speed[True-None] 1.6953ms 1.5614ms 640.4327 Ops/s 624.8173 Ops/s $\color{#35bf28}+2.50\%$
test_sac_speed[True-backward] 3.4913ms 3.4099ms 293.2657 Ops/s 293.8115 Ops/s $\color{#d91a1a}-0.19\%$
test_sac_speed[reduce-overhead-None] 22.6393ms 12.5443ms 79.7176 Ops/s 80.9672 Ops/s $\color{#d91a1a}-1.54\%$
test_sac_speed[reduce-overhead-backward] 1.3865ms 1.3298ms 751.9677 Ops/s 756.1436 Ops/s $\color{#d91a1a}-0.55\%$
test_redq_speed[False-None] 8.4040ms 7.4299ms 134.5910 Ops/s 132.5501 Ops/s $\color{#35bf28}+1.54\%$
test_redq_speed[False-backward] 12.1613ms 11.2330ms 89.0237 Ops/s 89.0429 Ops/s $\color{#d91a1a}-0.02\%$
test_redq_speed[True-None] 2.1897ms 2.0225ms 494.4341 Ops/s 499.4307 Ops/s $\color{#d91a1a}-1.00\%$
test_redq_speed[True-backward] 3.7235ms 3.6334ms 275.2261 Ops/s 258.6503 Ops/s $\textbf{\color{#35bf28}+6.41\%}$
test_redq_speed[reduce-overhead-None] 2.0910ms 1.9898ms 502.5706 Ops/s 492.4742 Ops/s $\color{#35bf28}+2.05\%$
test_redq_speed[reduce-overhead-backward] 3.7920ms 3.6893ms 271.0565 Ops/s 271.5987 Ops/s $\color{#d91a1a}-0.20\%$
test_redq_deprec_speed[False-None] 10.1541ms 9.2023ms 108.6682 Ops/s 109.7371 Ops/s $\color{#d91a1a}-0.97\%$
test_redq_deprec_speed[False-backward] 12.8154ms 12.0671ms 82.8698 Ops/s 82.6180 Ops/s $\color{#35bf28}+0.30\%$
test_redq_deprec_speed[True-None] 2.4130ms 2.3224ms 430.5924 Ops/s 422.5614 Ops/s $\color{#35bf28}+1.90\%$
test_redq_deprec_speed[True-backward] 4.4723ms 4.0089ms 249.4465 Ops/s 248.2589 Ops/s $\color{#35bf28}+0.48\%$
test_redq_deprec_speed[reduce-overhead-None] 2.4402ms 2.3245ms 430.1927 Ops/s 430.6202 Ops/s $\color{#d91a1a}-0.10\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.6938ms 4.1948ms 238.3904 Ops/s 239.5152 Ops/s $\color{#d91a1a}-0.47\%$
test_td3_speed[False-None] 7.8866ms 7.8344ms 127.6422 Ops/s 125.5290 Ops/s $\color{#35bf28}+1.68\%$
test_td3_speed[False-backward] 11.0107ms 10.3915ms 96.2321 Ops/s 96.2304 Ops/s $+0.00\%$
test_td3_speed[True-None] 1.5999ms 1.5817ms 632.2478 Ops/s 632.6496 Ops/s $\color{#d91a1a}-0.06\%$
test_td3_speed[True-backward] 3.7309ms 3.2790ms 304.9679 Ops/s 304.2793 Ops/s $\color{#35bf28}+0.23\%$
test_td3_speed[reduce-overhead-None] 83.1907ms 26.2906ms 38.0364 Ops/s 36.9198 Ops/s $\color{#35bf28}+3.02\%$
test_td3_speed[reduce-overhead-backward] 1.4799ms 1.4387ms 695.0870 Ops/s 774.9897 Ops/s $\textbf{\color{#d91a1a}-10.31\%}$
test_cql_speed[False-None] 17.1138ms 16.5561ms 60.4009 Ops/s 59.0130 Ops/s $\color{#35bf28}+2.35\%$
test_cql_speed[False-backward] 22.6774ms 21.9387ms 45.5816 Ops/s 45.4016 Ops/s $\color{#35bf28}+0.40\%$
test_cql_speed[True-None] 3.1092ms 2.9568ms 338.2008 Ops/s 336.1386 Ops/s $\color{#35bf28}+0.61\%$
test_cql_speed[True-backward] 5.5163ms 5.0981ms 196.1527 Ops/s 185.5887 Ops/s $\textbf{\color{#35bf28}+5.69\%}$
test_cql_speed[reduce-overhead-None] 21.3829ms 13.1770ms 75.8898 Ops/s 76.9251 Ops/s $\color{#d91a1a}-1.35\%$
test_cql_speed[reduce-overhead-backward] 1.5733ms 1.5025ms 665.5788 Ops/s 589.8037 Ops/s $\textbf{\color{#35bf28}+12.85\%}$
test_a2c_speed[False-None] 3.5087ms 3.2738ms 305.4562 Ops/s 310.6924 Ops/s $\color{#d91a1a}-1.69\%$
test_a2c_speed[False-backward] 6.6305ms 6.1069ms 163.7496 Ops/s 158.9339 Ops/s $\color{#35bf28}+3.03\%$
test_a2c_speed[True-None] 1.0740ms 1.0025ms 997.5483 Ops/s 989.1140 Ops/s $\color{#35bf28}+0.85\%$
test_a2c_speed[True-backward] 2.6599ms 2.5922ms 385.7659 Ops/s 355.0226 Ops/s $\textbf{\color{#35bf28}+8.66\%}$
test_a2c_speed[reduce-overhead-None] 21.5234ms 11.6044ms 86.1742 Ops/s 86.9978 Ops/s $\color{#d91a1a}-0.95\%$
test_a2c_speed[reduce-overhead-backward] 1.0121ms 0.9584ms 1.0434 KOps/s 925.3893 Ops/s $\textbf{\color{#35bf28}+12.75\%}$
test_ppo_speed[False-None] 3.7812ms 3.6287ms 275.5811 Ops/s 270.3900 Ops/s $\color{#35bf28}+1.92\%$
test_ppo_speed[False-backward] 7.1388ms 6.7686ms 147.7406 Ops/s 143.4617 Ops/s $\color{#35bf28}+2.98\%$
test_ppo_speed[True-None] 1.0279ms 0.9574ms 1.0445 KOps/s 1.0537 KOps/s $\color{#d91a1a}-0.87\%$
test_ppo_speed[True-backward] 2.6163ms 2.5403ms 393.6493 Ops/s 368.4725 Ops/s $\textbf{\color{#35bf28}+6.83\%}$
test_ppo_speed[reduce-overhead-None] 0.5843ms 0.5040ms 1.9842 KOps/s 1.9202 KOps/s $\color{#35bf28}+3.34\%$
test_ppo_speed[reduce-overhead-backward] 1.0303ms 0.9485ms 1.0543 KOps/s 1.0226 KOps/s $\color{#35bf28}+3.09\%$
test_reinforce_speed[False-None] 2.3423ms 2.2329ms 447.8464 Ops/s 440.0597 Ops/s $\color{#35bf28}+1.77\%$
test_reinforce_speed[False-backward] 3.6277ms 3.2055ms 311.9622 Ops/s 307.4090 Ops/s $\color{#35bf28}+1.48\%$
test_reinforce_speed[True-None] 0.9057ms 0.8258ms 1.2109 KOps/s 1.1959 KOps/s $\color{#35bf28}+1.26\%$
test_reinforce_speed[True-backward] 2.5102ms 2.4343ms 410.7984 Ops/s 383.4004 Ops/s $\textbf{\color{#35bf28}+7.15\%}$
test_reinforce_speed[reduce-overhead-None] 21.4993ms 11.4906ms 87.0274 Ops/s 87.3393 Ops/s $\color{#d91a1a}-0.36\%$
test_reinforce_speed[reduce-overhead-backward] 1.1056ms 1.0249ms 975.6695 Ops/s 953.9301 Ops/s $\color{#35bf28}+2.28\%$
test_iql_speed[False-None] 9.9023ms 9.1706ms 109.0442 Ops/s 108.3434 Ops/s $\color{#35bf28}+0.65\%$
test_iql_speed[False-backward] 13.2326ms 12.7305ms 78.5512 Ops/s 77.7731 Ops/s $\color{#35bf28}+1.00\%$
test_iql_speed[True-None] 1.9496ms 1.7739ms 563.7334 Ops/s 557.2508 Ops/s $\color{#35bf28}+1.16\%$
test_iql_speed[True-backward] 4.9025ms 4.4337ms 225.5432 Ops/s 224.9927 Ops/s $\color{#35bf28}+0.24\%$
test_iql_speed[reduce-overhead-None] 20.6510ms 11.5508ms 86.5738 Ops/s 88.1424 Ops/s $\color{#d91a1a}-1.78\%$
test_iql_speed[reduce-overhead-backward] 1.6188ms 1.5777ms 633.8208 Ops/s 637.3893 Ops/s $\color{#d91a1a}-0.56\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.0183ms 6.4922ms 154.0302 Ops/s 153.5188 Ops/s $\color{#35bf28}+0.33\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5572ms 0.3407ms 2.9347 KOps/s 2.9531 KOps/s $\color{#d91a1a}-0.62\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6537ms 0.2995ms 3.3392 KOps/s 3.0852 KOps/s $\textbf{\color{#35bf28}+8.23\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4087ms 6.2015ms 161.2501 Ops/s 159.9089 Ops/s $\color{#35bf28}+0.84\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8290ms 0.2872ms 3.4820 KOps/s 3.1020 KOps/s $\textbf{\color{#35bf28}+12.25\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6475ms 0.2801ms 3.5702 KOps/s 3.4068 KOps/s $\color{#35bf28}+4.79\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4313ms 1.2237ms 817.2055 Ops/s 741.0869 Ops/s $\textbf{\color{#35bf28}+10.27\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4895ms 1.1760ms 850.3564 Ops/s 845.3673 Ops/s $\color{#35bf28}+0.59\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4633ms 6.2897ms 158.9901 Ops/s 155.6443 Ops/s $\color{#35bf28}+2.15\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.8959ms 0.4958ms 2.0168 KOps/s 2.3310 KOps/s $\textbf{\color{#d91a1a}-13.48\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7037ms 0.4708ms 2.1241 KOps/s 2.5873 KOps/s $\textbf{\color{#d91a1a}-17.90\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2891ms 6.1930ms 161.4735 Ops/s 159.3364 Ops/s $\color{#35bf28}+1.34\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.5178ms 0.3030ms 3.3001 KOps/s 3.6627 KOps/s $\textbf{\color{#d91a1a}-9.90\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5107ms 0.2867ms 3.4880 KOps/s 2.9021 KOps/s $\textbf{\color{#35bf28}+20.19\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4275ms 6.1301ms 163.1307 Ops/s 160.1287 Ops/s $\color{#35bf28}+1.87\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.5847ms 0.3375ms 2.9632 KOps/s 3.2440 KOps/s $\textbf{\color{#d91a1a}-8.65\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5458ms 0.3328ms 3.0051 KOps/s 3.2693 KOps/s $\textbf{\color{#d91a1a}-8.08\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5036ms 6.3530ms 157.4052 Ops/s 156.3443 Ops/s $\color{#35bf28}+0.68\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8309ms 0.4264ms 2.3452 KOps/s 2.1497 KOps/s $\textbf{\color{#35bf28}+9.10\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6157ms 0.3883ms 2.5754 KOps/s 2.2551 KOps/s $\textbf{\color{#35bf28}+14.21\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9154ms 5.2258ms 191.3565 Ops/s 184.8892 Ops/s $\color{#35bf28}+3.50\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.1608ms 2.0470ms 488.5193 Ops/s 446.6989 Ops/s $\textbf{\color{#35bf28}+9.36\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.8622ms 1.2293ms 813.4824 Ops/s 779.1553 Ops/s $\color{#35bf28}+4.41\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.2314ms 5.3357ms 187.4162 Ops/s 186.0425 Ops/s $\color{#35bf28}+0.74\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.3847ms 2.0247ms 493.9013 Ops/s 438.3480 Ops/s $\textbf{\color{#35bf28}+12.67\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.1077ms 1.2293ms 813.4588 Ops/s 853.4408 Ops/s $\color{#d91a1a}-4.68\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5076s 15.5894ms 64.1463 Ops/s 32.7967 Ops/s $\textbf{\color{#35bf28}+95.59\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.8580ms 2.2392ms 446.5900 Ops/s 445.8564 Ops/s $\color{#35bf28}+0.16\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.2658ms 1.3430ms 744.5922 Ops/s 739.5812 Ops/s $\color{#35bf28}+0.68\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.7860ms 13.1913ms 75.8076 Ops/s 74.6155 Ops/s $\color{#35bf28}+1.60\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.4354ms 17.3708ms 57.5680 Ops/s 59.5466 Ops/s $\color{#d91a1a}-3.32\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.5309ms 17.7038ms 56.4852 Ops/s 54.5358 Ops/s $\color{#35bf28}+3.57\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.4745ms 17.5350ms 57.0289 Ops/s 57.7470 Ops/s $\color{#d91a1a}-1.24\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.6765ms 17.4176ms 57.4133 Ops/s 55.2533 Ops/s $\color{#35bf28}+3.91\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.4218ms 18.8808ms 52.9638 Ops/s 53.6227 Ops/s $\color{#d91a1a}-1.23\%$

vmoens added a commit that referenced this pull request Dec 16, 2024
ghstack-source-id: b57caeaf6e2d3690fb3311f4c9b8cca8575d3974
Pull Request resolved: #2655
@vmoens vmoens merged commit 18f92c6 into gh/vmoens/56/base Dec 16, 2024
69 of 78 checks passed
@vmoens vmoens deleted the gh/vmoens/56/head branch December 16, 2024 04:01