[Minor,Feature] group_optimizers #2577

Merged 1 commit on Nov 18, 2024

[ghstack-poisoned]

pytorch-bot bot commented Nov 18, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2577

Note: Links to docs will display an error until the docs builds have been completed.

❗ 2 Active SEVs

There are 2 currently active SEVs. If your PR is affected, please view them below:

❌ 3 New Failures, 8 Unrelated Failures

As of commit a51f5c5 with merge base 83a7a57 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Nov 18, 2024
@vmoens merged commit a51f5c5 into gh/vmoens/46/base Nov 18, 2024
56 of 64 checks passed
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: 81a94ed641544a420bb1c455921ca6a17ecd6a22
Pull Request resolved: #2577
@vmoens deleted the gh/vmoens/46/head branch November 18, 2024 15:20
@vmoens added the enhancement label Nov 18, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}12$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.4371s 0.4332s 2.3085 Ops/s 2.2423 Ops/s $\color{#35bf28}+2.95\%$
test_transformed 0.6251s 0.6152s 1.6255 Ops/s 1.6164 Ops/s $\color{#35bf28}+0.56\%$
test_serial 1.3676s 1.3551s 0.7379 Ops/s 0.7386 Ops/s $\color{#d91a1a}-0.09\%$
test_parallel 1.2874s 1.2753s 0.7841 Ops/s 0.7683 Ops/s $\color{#35bf28}+2.06\%$
test_step_mdp_speed[True-True-True-True-True] 0.1222ms 28.1914μs 35.4718 KOps/s 37.1191 KOps/s $\color{#d91a1a}-4.44\%$
test_step_mdp_speed[True-True-True-True-False] 48.1190μs 16.1894μs 61.7687 KOps/s 63.7480 KOps/s $\color{#d91a1a}-3.10\%$
test_step_mdp_speed[True-True-True-False-True] 39.9950μs 15.8046μs 63.2728 KOps/s 64.0795 KOps/s $\color{#d91a1a}-1.26\%$
test_step_mdp_speed[True-True-True-False-False] 41.5780μs 9.2282μs 108.3634 KOps/s 111.4921 KOps/s $\color{#d91a1a}-2.81\%$
test_step_mdp_speed[True-True-False-True-True] 82.6840μs 29.8879μs 33.4583 KOps/s 33.8647 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[True-True-False-True-False] 44.1130μs 17.9990μs 55.5588 KOps/s 56.8082 KOps/s $\color{#d91a1a}-2.20\%$
test_step_mdp_speed[True-True-False-False-True] 60.4330μs 17.5490μs 56.9835 KOps/s 57.9719 KOps/s $\color{#d91a1a}-1.70\%$
test_step_mdp_speed[True-True-False-False-False] 49.9430μs 11.0632μs 90.3897 KOps/s 93.3345 KOps/s $\color{#d91a1a}-3.16\%$
test_step_mdp_speed[True-False-True-True-True] 87.5630μs 31.4886μs 31.7576 KOps/s 32.2181 KOps/s $\color{#d91a1a}-1.43\%$
test_step_mdp_speed[True-False-True-True-False] 47.6490μs 19.6718μs 50.8341 KOps/s 51.7890 KOps/s $\color{#d91a1a}-1.84\%$
test_step_mdp_speed[True-False-True-False-True] 73.2770μs 17.5267μs 57.0559 KOps/s 57.5546 KOps/s $\color{#d91a1a}-0.87\%$
test_step_mdp_speed[True-False-True-False-False] 40.6250μs 10.9462μs 91.3558 KOps/s 93.1768 KOps/s $\color{#d91a1a}-1.95\%$
test_step_mdp_speed[True-False-False-True-True] 87.0530μs 33.0416μs 30.2649 KOps/s 30.3765 KOps/s $\color{#d91a1a}-0.37\%$
test_step_mdp_speed[True-False-False-True-False] 70.9530μs 21.3914μs 46.7477 KOps/s 47.8547 KOps/s $\color{#d91a1a}-2.31\%$
test_step_mdp_speed[True-False-False-False-True] 62.2560μs 19.0826μs 52.4038 KOps/s 53.0782 KOps/s $\color{#d91a1a}-1.27\%$
test_step_mdp_speed[True-False-False-False-False] 64.1290μs 12.5869μs 79.4480 KOps/s 81.4299 KOps/s $\color{#d91a1a}-2.43\%$
test_step_mdp_speed[False-True-True-True-True] 65.2730μs 31.3351μs 31.9130 KOps/s 32.6160 KOps/s $\color{#d91a1a}-2.16\%$
test_step_mdp_speed[False-True-True-True-False] 83.2350μs 19.6175μs 50.9749 KOps/s 51.8277 KOps/s $\color{#d91a1a}-1.65\%$
test_step_mdp_speed[False-True-True-False-True] 72.2630μs 20.0691μs 49.8279 KOps/s 50.8135 KOps/s $\color{#d91a1a}-1.94\%$
test_step_mdp_speed[False-True-True-False-False] 64.2710μs 12.3040μs 81.2744 KOps/s 82.7231 KOps/s $\color{#d91a1a}-1.75\%$
test_step_mdp_speed[False-True-False-True-True] 95.2280μs 33.1441μs 30.1713 KOps/s 30.6378 KOps/s $\color{#d91a1a}-1.52\%$
test_step_mdp_speed[False-True-False-True-False] 80.6310μs 21.5338μs 46.4385 KOps/s 48.2138 KOps/s $\color{#d91a1a}-3.68\%$
test_step_mdp_speed[False-True-False-False-True] 2.9556ms 22.4929μs 44.4585 KOps/s 46.4615 KOps/s $\color{#d91a1a}-4.31\%$
test_step_mdp_speed[False-True-False-False-False] 39.7240μs 13.7838μs 72.5491 KOps/s 73.8113 KOps/s $\color{#d91a1a}-1.71\%$
test_step_mdp_speed[False-False-True-True-True] 88.0440μs 34.8721μs 28.6762 KOps/s 29.1796 KOps/s $\color{#d91a1a}-1.73\%$
test_step_mdp_speed[False-False-True-True-False] 79.8190μs 22.6877μs 44.0767 KOps/s 44.7953 KOps/s $\color{#d91a1a}-1.60\%$
test_step_mdp_speed[False-False-True-False-True] 49.8730μs 21.6916μs 46.1009 KOps/s 47.0103 KOps/s $\color{#d91a1a}-1.93\%$
test_step_mdp_speed[False-False-True-False-False] 49.2320μs 14.0079μs 71.3881 KOps/s 74.4018 KOps/s $\color{#d91a1a}-4.05\%$
test_step_mdp_speed[False-False-False-True-True] 77.4750μs 36.4903μs 27.4046 KOps/s 28.0734 KOps/s $\color{#d91a1a}-2.38\%$
test_step_mdp_speed[False-False-False-True-False] 52.9290μs 24.7034μs 40.4802 KOps/s 42.4139 KOps/s $\color{#d91a1a}-4.56\%$
test_step_mdp_speed[False-False-False-False-True] 96.4710μs 23.0978μs 43.2942 KOps/s 44.0878 KOps/s $\color{#d91a1a}-1.80\%$
test_step_mdp_speed[False-False-False-False-False] 44.0620μs 15.4088μs 64.8978 KOps/s 66.9764 KOps/s $\color{#d91a1a}-3.10\%$
test_values[generalized_advantage_estimate-True-True] 11.3793ms 9.7848ms 102.1989 Ops/s 103.7871 Ops/s $\color{#d91a1a}-1.53\%$
test_values[vec_generalized_advantage_estimate-True-True] 37.4308ms 35.3972ms 28.2508 Ops/s 29.8809 Ops/s $\textbf{\color{#d91a1a}-5.46\%}$
test_values[td0_return_estimate-False-False] 0.2149ms 0.1652ms 6.0540 KOps/s 5.8864 KOps/s $\color{#35bf28}+2.85\%$
test_values[td1_return_estimate-False-False] 27.3052ms 24.5213ms 40.7809 Ops/s 42.1128 Ops/s $\color{#d91a1a}-3.16\%$
test_values[vec_td1_return_estimate-False-False] 38.4936ms 35.5563ms 28.1244 Ops/s 29.8131 Ops/s $\textbf{\color{#d91a1a}-5.66\%}$
test_values[td_lambda_return_estimate-True-False] 39.3311ms 35.4295ms 28.2251 Ops/s 29.4305 Ops/s $\color{#d91a1a}-4.10\%$
test_values[vec_td_lambda_return_estimate-True-False] 39.1300ms 35.5758ms 28.1090 Ops/s 29.7479 Ops/s $\textbf{\color{#d91a1a}-5.51\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 12.6990ms 8.6934ms 115.0302 Ops/s 117.3948 Ops/s $\color{#d91a1a}-2.01\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.3483ms 1.7990ms 555.8659 Ops/s 556.5600 Ops/s $\color{#d91a1a}-0.12\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5587ms 0.3611ms 2.7694 KOps/s 2.8193 KOps/s $\color{#d91a1a}-1.77\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 41.3498ms 39.3527ms 25.4112 Ops/s 28.4510 Ops/s $\textbf{\color{#d91a1a}-10.68\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.9333ms 3.0467ms 328.2206 Ops/s 327.8376 Ops/s $\color{#35bf28}+0.12\%$
test_dqn_speed[False-None] 1.9073ms 1.3321ms 750.7132 Ops/s 745.4116 Ops/s $\color{#35bf28}+0.71\%$
test_dqn_speed[False-backward] 1.8374ms 1.7984ms 556.0438 Ops/s 550.1796 Ops/s $\color{#35bf28}+1.07\%$
test_dqn_speed[True-None] 0.6267ms 0.4566ms 2.1902 KOps/s 2.1080 KOps/s $\color{#35bf28}+3.90\%$
test_dqn_speed[True-backward] 1.0119ms 0.8892ms 1.1246 KOps/s 1.1064 KOps/s $\color{#35bf28}+1.64\%$
test_dqn_speed[reduce-overhead-None] 0.6304ms 0.4679ms 2.1372 KOps/s 2.0772 KOps/s $\color{#35bf28}+2.89\%$
test_dqn_speed[reduce-overhead-backward] 0.9387ms 0.8832ms 1.1322 KOps/s 1.1220 KOps/s $\color{#35bf28}+0.91\%$
test_ddpg_speed[False-None] 3.5233ms 2.7962ms 357.6289 Ops/s 353.0877 Ops/s $\color{#35bf28}+1.29\%$
test_ddpg_speed[False-backward] 6.2934ms 4.0874ms 244.6568 Ops/s 254.0990 Ops/s $\color{#d91a1a}-3.72\%$
test_ddpg_speed[True-None] 1.1605ms 0.9960ms 1.0040 KOps/s 970.5269 Ops/s $\color{#35bf28}+3.45\%$
test_ddpg_speed[True-backward] 1.9891ms 1.8892ms 529.3371 Ops/s 426.6352 Ops/s $\textbf{\color{#35bf28}+24.07\%}$
test_ddpg_speed[reduce-overhead-None] 1.2674ms 0.9958ms 1.0042 KOps/s 979.5830 Ops/s $\color{#35bf28}+2.51\%$
test_ddpg_speed[reduce-overhead-backward] 1.9733ms 1.8824ms 531.2344 Ops/s 522.7876 Ops/s $\color{#35bf28}+1.62\%$
test_sac_speed[False-None] 9.6100ms 7.8741ms 126.9981 Ops/s 126.4693 Ops/s $\color{#35bf28}+0.42\%$
test_sac_speed[False-backward] 11.3450ms 10.5039ms 95.2032 Ops/s 92.0890 Ops/s $\color{#35bf28}+3.38\%$
test_sac_speed[True-None] 2.4201ms 1.8113ms 552.0894 Ops/s 535.3956 Ops/s $\color{#35bf28}+3.12\%$
test_sac_speed[True-backward] 3.8558ms 3.7196ms 268.8464 Ops/s 281.3057 Ops/s $\color{#d91a1a}-4.43\%$
test_sac_speed[reduce-overhead-None] 2.2498ms 1.8357ms 544.7495 Ops/s 538.8684 Ops/s $\color{#35bf28}+1.09\%$
test_sac_speed[reduce-overhead-backward] 3.5937ms 3.4873ms 286.7530 Ops/s 284.9230 Ops/s $\color{#35bf28}+0.64\%$
test_redq_speed[False-None] 13.8783ms 12.5978ms 79.3787 Ops/s 78.5138 Ops/s $\color{#35bf28}+1.10\%$
test_redq_speed[False-backward] 23.9828ms 22.2688ms 44.9059 Ops/s 45.1125 Ops/s $\color{#d91a1a}-0.46\%$
test_redq_speed[True-None] 6.2839ms 4.8736ms 205.1877 Ops/s 190.5988 Ops/s $\textbf{\color{#35bf28}+7.65\%}$
test_redq_speed[True-backward] 13.1408ms 12.1910ms 82.0280 Ops/s 82.0852 Ops/s $\color{#d91a1a}-0.07\%$
test_redq_speed[reduce-overhead-None] 5.0781ms 4.5891ms 217.9068 Ops/s 205.0634 Ops/s $\textbf{\color{#35bf28}+6.26\%}$
test_redq_speed[reduce-overhead-backward] 14.7841ms 12.6952ms 78.7697 Ops/s 81.2893 Ops/s $\color{#d91a1a}-3.10\%$
test_redq_deprec_speed[False-None] 15.4536ms 13.4332ms 74.4422 Ops/s 77.8454 Ops/s $\color{#d91a1a}-4.37\%$
test_redq_deprec_speed[False-backward] 20.4273ms 19.0839ms 52.4001 Ops/s 53.2908 Ops/s $\color{#d91a1a}-1.67\%$
test_redq_deprec_speed[True-None] 10.4168ms 3.6599ms 273.2299 Ops/s 277.0212 Ops/s $\color{#d91a1a}-1.37\%$
test_redq_deprec_speed[True-backward] 9.2418ms 8.0334ms 124.4805 Ops/s 117.2842 Ops/s $\textbf{\color{#35bf28}+6.14\%}$
test_redq_deprec_speed[reduce-overhead-None] 4.1832ms 3.7703ms 265.2295 Ops/s 264.6159 Ops/s $\color{#35bf28}+0.23\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.0343ms 8.5593ms 116.8316 Ops/s 115.8109 Ops/s $\color{#35bf28}+0.88\%$
test_td3_speed[False-None] 8.3527ms 7.7569ms 128.9172 Ops/s 128.0195 Ops/s $\color{#35bf28}+0.70\%$
test_td3_speed[False-backward] 11.6358ms 10.0826ms 99.1806 Ops/s 94.1528 Ops/s $\textbf{\color{#35bf28}+5.34\%}$
test_td3_speed[True-None] 1.9518ms 1.7133ms 583.6750 Ops/s 576.1306 Ops/s $\color{#35bf28}+1.31\%$
test_td3_speed[True-backward] 3.6920ms 3.4775ms 287.5618 Ops/s 293.8829 Ops/s $\color{#d91a1a}-2.15\%$
test_td3_speed[reduce-overhead-None] 1.9363ms 1.6919ms 591.0365 Ops/s 572.4296 Ops/s $\color{#35bf28}+3.25\%$
test_td3_speed[reduce-overhead-backward] 3.4813ms 3.2732ms 305.5126 Ops/s 297.9921 Ops/s $\color{#35bf28}+2.52\%$
test_cql_speed[False-None] 38.1723ms 35.6131ms 28.0795 Ops/s 27.7742 Ops/s $\color{#35bf28}+1.10\%$
test_cql_speed[False-backward] 47.1036ms 45.2886ms 22.0806 Ops/s 21.4168 Ops/s $\color{#35bf28}+3.10\%$
test_cql_speed[True-None] 17.1361ms 15.5989ms 64.1071 Ops/s 63.9948 Ops/s $\color{#35bf28}+0.18\%$
test_cql_speed[True-backward] 27.4201ms 22.9033ms 43.6619 Ops/s 44.8209 Ops/s $\color{#d91a1a}-2.59\%$
test_cql_speed[reduce-overhead-None] 16.5995ms 15.5662ms 64.2417 Ops/s 63.2888 Ops/s $\color{#35bf28}+1.51\%$
test_cql_speed[reduce-overhead-backward] 23.0918ms 22.5585ms 44.3293 Ops/s 44.6748 Ops/s $\color{#d91a1a}-0.77\%$
test_a2c_speed[False-None] 7.7772ms 7.0053ms 142.7481 Ops/s 141.8302 Ops/s $\color{#35bf28}+0.65\%$
test_a2c_speed[False-backward] 15.4536ms 14.1146ms 70.8486 Ops/s 71.4052 Ops/s $\color{#d91a1a}-0.78\%$
test_a2c_speed[True-None] 3.5893ms 3.2906ms 303.8946 Ops/s 295.2287 Ops/s $\color{#35bf28}+2.94\%$
test_a2c_speed[True-backward] 11.0955ms 9.8501ms 101.5215 Ops/s 94.1103 Ops/s $\textbf{\color{#35bf28}+7.88\%}$
test_a2c_speed[reduce-overhead-None] 3.7492ms 3.2860ms 304.3247 Ops/s 291.9075 Ops/s $\color{#35bf28}+4.25\%$
test_a2c_speed[reduce-overhead-backward] 10.3472ms 10.0156ms 99.8440 Ops/s 100.3153 Ops/s $\color{#d91a1a}-0.47\%$
test_ppo_speed[False-None] 9.1734ms 7.6807ms 130.1965 Ops/s 134.5448 Ops/s $\color{#d91a1a}-3.23\%$
test_ppo_speed[False-backward] 27.8133ms 15.5780ms 64.1930 Ops/s 68.9511 Ops/s $\textbf{\color{#d91a1a}-6.90\%}$
test_ppo_speed[True-None] 4.3495ms 3.6822ms 271.5749 Ops/s 265.4083 Ops/s $\color{#35bf28}+2.32\%$
test_ppo_speed[True-backward] 10.3814ms 9.5495ms 104.7174 Ops/s 103.2391 Ops/s $\color{#35bf28}+1.43\%$
test_ppo_speed[reduce-overhead-None] 3.9845ms 3.7151ms 269.1715 Ops/s 263.4933 Ops/s $\color{#35bf28}+2.15\%$
test_ppo_speed[reduce-overhead-backward] 10.1764ms 9.5835ms 104.3463 Ops/s 103.4739 Ops/s $\color{#35bf28}+0.84\%$
test_reinforce_speed[False-None] 7.7967ms 6.4472ms 155.1060 Ops/s 153.8780 Ops/s $\color{#35bf28}+0.80\%$
test_reinforce_speed[False-backward] 13.9605ms 9.7958ms 102.0847 Ops/s 103.7910 Ops/s $\color{#d91a1a}-1.64\%$
test_reinforce_speed[True-None] 3.3674ms 2.6518ms 377.0968 Ops/s 362.7467 Ops/s $\color{#35bf28}+3.96\%$
test_reinforce_speed[True-backward] 9.3184ms 8.6192ms 116.0195 Ops/s 115.2565 Ops/s $\color{#35bf28}+0.66\%$
test_reinforce_speed[reduce-overhead-None] 3.9137ms 2.6228ms 381.2779 Ops/s 362.4631 Ops/s $\textbf{\color{#35bf28}+5.19\%}$
test_reinforce_speed[reduce-overhead-backward] 10.3129ms 8.7049ms 114.8773 Ops/s 115.0440 Ops/s $\color{#d91a1a}-0.14\%$
test_iql_speed[False-None] 33.8926ms 32.3759ms 30.8871 Ops/s 29.9207 Ops/s $\color{#35bf28}+3.23\%$
test_iql_speed[False-backward] 47.8973ms 45.2363ms 22.1062 Ops/s 21.2903 Ops/s $\color{#35bf28}+3.83\%$
test_iql_speed[True-None] 11.6978ms 10.7756ms 92.8020 Ops/s 87.7477 Ops/s $\textbf{\color{#35bf28}+5.76\%}$
test_iql_speed[True-backward] 23.0961ms 22.1868ms 45.0718 Ops/s 45.5885 Ops/s $\color{#d91a1a}-1.13\%$
test_iql_speed[reduce-overhead-None] 11.4657ms 10.9639ms 91.2088 Ops/s 91.1964 Ops/s $\color{#35bf28}+0.01\%$
test_iql_speed[reduce-overhead-backward] 24.0840ms 22.0019ms 45.4507 Ops/s 43.9669 Ops/s $\color{#35bf28}+3.37\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.4164ms 4.8430ms 206.4819 Ops/s 198.3276 Ops/s $\color{#35bf28}+4.11\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.6734ms 0.5228ms 1.9128 KOps/s 1.9090 KOps/s $\color{#35bf28}+0.20\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 1.5096ms 0.4962ms 2.0155 KOps/s 2.0351 KOps/s $\color{#d91a1a}-0.96\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.0579ms 4.6605ms 214.5701 Ops/s 214.2358 Ops/s $\color{#35bf28}+0.16\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.7663ms 0.5079ms 1.9688 KOps/s 1.9487 KOps/s $\color{#35bf28}+1.03\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 1.2044ms 0.4903ms 2.0395 KOps/s 2.0611 KOps/s $\color{#d91a1a}-1.05\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.4632ms 1.6390ms 610.1313 Ops/s 609.6114 Ops/s $\color{#35bf28}+0.09\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 3.8703ms 1.5911ms 628.5032 Ops/s 630.8726 Ops/s $\color{#d91a1a}-0.38\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2879ms 4.8002ms 208.3238 Ops/s 207.8518 Ops/s $\color{#35bf28}+0.23\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.7984ms 0.6358ms 1.5727 KOps/s 1.5186 KOps/s $\color{#35bf28}+3.56\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8831ms 0.6025ms 1.6597 KOps/s 1.6209 KOps/s $\color{#35bf28}+2.39\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.8562ms 4.6288ms 216.0404 Ops/s 214.3174 Ops/s $\color{#35bf28}+0.80\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.5202ms 0.5202ms 1.9222 KOps/s 1.9539 KOps/s $\color{#d91a1a}-1.62\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 7.4899ms 0.5097ms 1.9617 KOps/s 1.9704 KOps/s $\color{#d91a1a}-0.44\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.1825ms 4.8386ms 206.6732 Ops/s 206.9545 Ops/s $\color{#d91a1a}-0.14\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1612ms 0.5175ms 1.9324 KOps/s 1.9787 KOps/s $\color{#d91a1a}-2.34\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 1.5472ms 0.4990ms 2.0041 KOps/s 2.0739 KOps/s $\color{#d91a1a}-3.37\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.2575ms 4.9917ms 200.3315 Ops/s 212.9467 Ops/s $\textbf{\color{#d91a1a}-5.92\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.1973ms 0.6475ms 1.5445 KOps/s 1.5350 KOps/s $\color{#35bf28}+0.62\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.3558ms 0.6220ms 1.6077 KOps/s 1.5919 KOps/s $\color{#35bf28}+0.99\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.3018ms 4.0890ms 244.5558 Ops/s 38.8210 Ops/s $\textbf{\color{#35bf28}+529.96\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.4962ms 2.3502ms 425.4888 Ops/s 393.6262 Ops/s $\textbf{\color{#35bf28}+8.09\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.6369ms 1.2569ms 795.6178 Ops/s 755.1959 Ops/s $\textbf{\color{#35bf28}+5.35\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.3898s 11.8099ms 84.6746 Ops/s 234.0351 Ops/s $\textbf{\color{#d91a1a}-63.82\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 6.1833ms 2.2224ms 449.9688 Ops/s 431.4914 Ops/s $\color{#35bf28}+4.28\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.8053ms 1.1865ms 842.8192 Ops/s 684.9454 Ops/s $\textbf{\color{#35bf28}+23.05\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 5.6202ms 4.2684ms 234.2778 Ops/s 237.5932 Ops/s $\color{#d91a1a}-1.40\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.3799ms 2.4851ms 402.3922 Ops/s 403.5113 Ops/s $\color{#d91a1a}-0.28\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.1320ms 1.4702ms 680.1955 Ops/s 648.1807 Ops/s $\color{#35bf28}+4.94\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 11.2825ms 10.9394ms 91.4131 Ops/s 88.8492 Ops/s $\color{#35bf28}+2.89\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 15.1246ms 14.4380ms 69.2619 Ops/s 69.5331 Ops/s $\color{#d91a1a}-0.39\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.7934ms 19.8976ms 50.2573 Ops/s 50.0406 Ops/s $\color{#35bf28}+0.43\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 15.8881ms 14.7998ms 67.5684 Ops/s 69.0961 Ops/s $\color{#d91a1a}-2.21\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.1562ms 20.1592ms 49.6052 Ops/s 50.1638 Ops/s $\color{#d91a1a}-1.11\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.4043ms 15.8602ms 63.0508 Ops/s 63.8501 Ops/s $\color{#d91a1a}-1.25\%$
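
For readers checking the table, the columns appear to be related in a simple way: `Ops` is roughly the reciprocal of `Mean`, and `Change` is the relative difference between `Ops` and `Ops on Repo HEAD`. The sketch below reproduces a few rows under that reading. The function names are hypothetical, and the 5% cutoff for "Improved"/"Worsened" is an assumption inferred from which rows are bolded (it matches the 12 and 7 counts in the summary line); it is not stated by the benchmark bot.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime given in seconds."""
    return 1.0 / mean_seconds


def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops_pr / ops_head - 1.0) * 100.0


# Assumption: a row counts as improved/worsened when |change| is at least 5%.
# This cutoff is inferred from the bolded rows, not documented in the comment.
THRESHOLD_PERCENT = 5.0


def classify(change_percent: float, threshold: float = THRESHOLD_PERCENT) -> str:
    """Label a row the way the summary line seems to count it."""
    if change_percent >= threshold:
        return "improved"
    if change_percent <= -threshold:
        return "worsened"
    return "neutral"


# (Ops, Ops on Repo HEAD) copied from rows of the CPU table above.
rows = {
    "test_simple": (2.3085, 2.2423),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (244.5558, 38.8210),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": (84.6746, 234.0351),
}

for name, (ops_pr, ops_head) in rows.items():
    change = percent_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table rounds `Mean` to four digits, recomputing `Ops` from it only matches to about three decimal places. The two `ListStorage` populate rows swing in opposite directions by large factors, which may point to benchmark noise rather than a real effect of this change.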


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}14$. Worsened: $\large\color{#d91a1a}14$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.7469s 0.7462s 1.3401 Ops/s 1.3095 Ops/s $\color{#35bf28}+2.33\%$
test_transformed 0.9884s 0.9878s 1.0123 Ops/s 1.0139 Ops/s $\color{#d91a1a}-0.16\%$
test_serial 2.1391s 2.1365s 0.4681 Ops/s 0.4664 Ops/s $\color{#35bf28}+0.35\%$
test_parallel 2.1157s 1.9900s 0.5025 Ops/s 0.4968 Ops/s $\color{#35bf28}+1.14\%$
test_step_mdp_speed[True-True-True-True-True] 0.1260ms 34.2505μs 29.1967 KOps/s 28.3621 KOps/s $\color{#35bf28}+2.94\%$
test_step_mdp_speed[True-True-True-True-False] 48.4410μs 20.3963μs 49.0285 KOps/s 49.0335 KOps/s $\color{#d91a1a}-0.01\%$
test_step_mdp_speed[True-True-True-False-True] 0.2139ms 19.4697μs 51.3620 KOps/s 51.0525 KOps/s $\color{#35bf28}+0.61\%$
test_step_mdp_speed[True-True-True-False-False] 37.1010μs 11.5155μs 86.8396 KOps/s 87.0024 KOps/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[True-True-False-True-True] 69.2410μs 37.8585μs 26.4141 KOps/s 26.5669 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[True-True-False-True-False] 55.7810μs 22.3247μs 44.7935 KOps/s 44.9030 KOps/s $\color{#d91a1a}-0.24\%$
test_step_mdp_speed[True-True-False-False-True] 56.2110μs 22.0397μs 45.3727 KOps/s 46.2750 KOps/s $\color{#d91a1a}-1.95\%$
test_step_mdp_speed[True-True-False-False-False] 42.3110μs 13.6227μs 73.4069 KOps/s 74.2149 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[True-False-True-True-True] 92.5520μs 39.2641μs 25.4686 KOps/s 25.0144 KOps/s $\color{#35bf28}+1.82\%$
test_step_mdp_speed[True-False-True-True-False] 56.5910μs 24.4262μs 40.9396 KOps/s 40.9716 KOps/s $\color{#d91a1a}-0.08\%$
test_step_mdp_speed[True-False-True-False-True] 55.2110μs 21.8812μs 45.7012 KOps/s 45.9258 KOps/s $\color{#d91a1a}-0.49\%$
test_step_mdp_speed[True-False-True-False-False] 41.4110μs 13.5624μs 73.7335 KOps/s 72.8484 KOps/s $\color{#35bf28}+1.22\%$
test_step_mdp_speed[True-False-False-True-True] 84.2610μs 41.2173μs 24.2616 KOps/s 24.0421 KOps/s $\color{#35bf28}+0.91\%$
test_step_mdp_speed[True-False-False-True-False] 59.2620μs 26.0154μs 38.4387 KOps/s 37.7726 KOps/s $\color{#35bf28}+1.76\%$
test_step_mdp_speed[True-False-False-False-True] 57.8420μs 23.7263μs 42.1473 KOps/s 42.4723 KOps/s $\color{#d91a1a}-0.77\%$
test_step_mdp_speed[True-False-False-False-False] 43.9910μs 15.4221μs 64.8420 KOps/s 64.5563 KOps/s $\color{#35bf28}+0.44\%$
test_step_mdp_speed[False-True-True-True-True] 68.7420μs 39.6830μs 25.1997 KOps/s 25.1443 KOps/s $\color{#35bf28}+0.22\%$
test_step_mdp_speed[False-True-True-True-False] 73.2210μs 24.4686μs 40.8687 KOps/s 40.5466 KOps/s $\color{#35bf28}+0.79\%$
test_step_mdp_speed[False-True-True-False-True] 56.2310μs 25.6678μs 38.9594 KOps/s 38.8525 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[False-True-True-False-False] 45.2600μs 15.0051μs 66.6438 KOps/s 65.9874 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[False-True-False-True-True] 0.1937ms 40.5275μs 24.6746 KOps/s 23.7952 KOps/s $\color{#35bf28}+3.70\%$
test_step_mdp_speed[False-True-False-True-False] 59.4020μs 26.0061μs 38.4525 KOps/s 38.0352 KOps/s $\color{#35bf28}+1.10\%$
test_step_mdp_speed[False-True-False-False-True] 3.6127ms 27.1798μs 36.7920 KOps/s 36.7750 KOps/s $\color{#35bf28}+0.05\%$
test_step_mdp_speed[False-True-False-False-False] 56.2610μs 16.7224μs 59.8000 KOps/s 58.0603 KOps/s $\color{#35bf28}+3.00\%$
test_step_mdp_speed[False-False-True-True-True] 95.3820μs 43.6619μs 22.9033 KOps/s 23.0415 KOps/s $\color{#d91a1a}-0.60\%$
test_step_mdp_speed[False-False-True-True-False] 91.0320μs 27.4625μs 36.4133 KOps/s 35.2160 KOps/s $\color{#35bf28}+3.40\%$
test_step_mdp_speed[False-False-True-False-True] 74.6210μs 26.5059μs 37.7275 KOps/s 36.6052 KOps/s $\color{#35bf28}+3.07\%$
test_step_mdp_speed[False-False-True-False-False] 73.7720μs 16.7460μs 59.7156 KOps/s 58.6199 KOps/s $\color{#35bf28}+1.87\%$
test_step_mdp_speed[False-False-False-True-True] 83.9420μs 44.2671μs 22.5902 KOps/s 22.0520 KOps/s $\color{#35bf28}+2.44\%$
test_step_mdp_speed[False-False-False-True-False] 84.1720μs 29.3704μs 34.0479 KOps/s 33.1399 KOps/s $\color{#35bf28}+2.74\%$
test_step_mdp_speed[False-False-False-False-True] 58.4710μs 28.3959μs 35.2164 KOps/s 34.7253 KOps/s $\color{#35bf28}+1.41\%$
test_step_mdp_speed[False-False-False-False-False] 49.8610μs 18.4824μs 54.1056 KOps/s 53.1836 KOps/s $\color{#35bf28}+1.73\%$
test_values[generalized_advantage_estimate-True-True] 27.0447ms 26.0286ms 38.4193 Ops/s 37.9540 Ops/s $\color{#35bf28}+1.23\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1043s 2.9850ms 335.0030 Ops/s 352.2553 Ops/s $\color{#d91a1a}-4.90\%$
test_values[td0_return_estimate-False-False] 88.2120μs 66.2249μs 15.1001 KOps/s 15.0953 KOps/s $\color{#35bf28}+0.03\%$
test_values[td1_return_estimate-False-False] 58.4306ms 57.9063ms 17.2693 Ops/s 16.6351 Ops/s $\color{#35bf28}+3.81\%$
test_values[vec_td1_return_estimate-False-False] 1.3197ms 1.0900ms 917.4057 Ops/s 919.5235 Ops/s $\color{#d91a1a}-0.23\%$
test_values[td_lambda_return_estimate-True-False] 93.8337ms 91.9154ms 10.8796 Ops/s 10.4434 Ops/s $\color{#35bf28}+4.18\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3006ms 1.0843ms 922.2677 Ops/s 918.4065 Ops/s $\color{#35bf28}+0.42\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 26.0795ms 25.8025ms 38.7559 Ops/s 37.2675 Ops/s $\color{#35bf28}+3.99\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0381ms 0.7703ms 1.2982 KOps/s 1.3439 KOps/s $\color{#d91a1a}-3.40\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8156ms 0.6962ms 1.4365 KOps/s 1.4922 KOps/s $\color{#d91a1a}-3.73\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5297ms 1.4764ms 677.3037 Ops/s 678.2256 Ops/s $\color{#d91a1a}-0.14\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8836ms 0.7026ms 1.4232 KOps/s 1.4596 KOps/s $\color{#d91a1a}-2.49\%$
test_dqn_speed[False-None] 0.1013s 1.4653ms 682.4725 Ops/s 764.9195 Ops/s $\textbf{\color{#d91a1a}-10.78\%}$
test_dqn_speed[False-backward] 1.9212ms 1.8081ms 553.0541 Ops/s 550.3473 Ops/s $\color{#35bf28}+0.49\%$
test_dqn_speed[True-None] 1.1340ms 0.5635ms 1.7746 KOps/s 1.7561 KOps/s $\color{#35bf28}+1.05\%$
test_dqn_speed[True-backward] 1.0738ms 1.0338ms 967.3072 Ops/s 982.8097 Ops/s $\color{#d91a1a}-1.58\%$
test_dqn_speed[reduce-overhead-None] 0.9282ms 0.5669ms 1.7638 KOps/s 1.7868 KOps/s $\color{#d91a1a}-1.28\%$
test_dqn_speed[reduce-overhead-backward] 1.3581ms 1.0659ms 938.2037 Ops/s 991.2392 Ops/s $\textbf{\color{#d91a1a}-5.35\%}$
test_ddpg_speed[False-None] 3.3489ms 2.7102ms 368.9827 Ops/s 370.6926 Ops/s $\color{#d91a1a}-0.46\%$
test_ddpg_speed[False-backward] 4.2841ms 3.9796ms 251.2806 Ops/s 254.2039 Ops/s $\color{#d91a1a}-1.15\%$
test_ddpg_speed[True-None] 1.4464ms 1.2660ms 789.9114 Ops/s 810.7269 Ops/s $\color{#d91a1a}-2.57\%$
test_ddpg_speed[True-backward] 2.4180ms 2.2852ms 437.5950 Ops/s 445.5060 Ops/s $\color{#d91a1a}-1.78\%$
test_ddpg_speed[reduce-overhead-None] 1.6913ms 1.2566ms 795.8072 Ops/s 807.4015 Ops/s $\color{#d91a1a}-1.44\%$
test_ddpg_speed[reduce-overhead-backward] 2.3016ms 2.2567ms 443.1183 Ops/s 448.5969 Ops/s $\color{#d91a1a}-1.22\%$
test_sac_speed[False-None] 8.5997ms 7.5269ms 132.8571 Ops/s 132.5206 Ops/s $\color{#35bf28}+0.25\%$
test_sac_speed[False-backward] 11.1142ms 10.7701ms 92.8497 Ops/s 92.6997 Ops/s $\color{#35bf28}+0.16\%$
test_sac_speed[True-None] 2.3609ms 2.0312ms 492.3255 Ops/s 491.6623 Ops/s $\color{#35bf28}+0.13\%$
test_sac_speed[True-backward] 4.1229ms 3.9936ms 250.3983 Ops/s 224.8784 Ops/s $\textbf{\color{#35bf28}+11.35\%}$
test_sac_speed[reduce-overhead-None] 2.3440ms 2.0351ms 491.3671 Ops/s 495.0688 Ops/s $\color{#d91a1a}-0.75\%$
test_sac_speed[reduce-overhead-backward] 4.1211ms 4.0021ms 249.8676 Ops/s 252.7289 Ops/s $\color{#d91a1a}-1.13\%$
test_redq_speed[False-None] 15.6341ms 10.6885ms 93.5581 Ops/s 71.5579 Ops/s $\textbf{\color{#35bf28}+30.74\%}$
test_redq_speed[False-backward] 18.6477ms 17.5043ms 57.1287 Ops/s 52.6928 Ops/s $\textbf{\color{#35bf28}+8.42\%}$
test_redq_speed[True-None] 3.7406ms 3.4992ms 285.7778 Ops/s 268.2163 Ops/s $\textbf{\color{#35bf28}+6.55\%}$
test_redq_speed[True-backward] 9.1057ms 8.6758ms 115.2638 Ops/s 115.7454 Ops/s $\color{#d91a1a}-0.42\%$
test_redq_speed[reduce-overhead-None] 4.8285ms 3.6051ms 277.3875 Ops/s 269.5997 Ops/s $\color{#35bf28}+2.89\%$
test_redq_speed[reduce-overhead-backward] 9.3395ms 8.8595ms 112.8737 Ops/s 113.3072 Ops/s $\color{#d91a1a}-0.38\%$
test_redq_deprec_speed[False-None] 0.2280s 13.2990ms 75.1935 Ops/s 93.1934 Ops/s $\textbf{\color{#d91a1a}-19.31\%}$
test_redq_deprec_speed[False-backward] 16.0210ms 15.5195ms 64.4351 Ops/s 64.0436 Ops/s $\color{#35bf28}+0.61\%$
test_redq_deprec_speed[True-None] 3.5973ms 3.2568ms 307.0521 Ops/s 305.9173 Ops/s $\color{#35bf28}+0.37\%$
test_redq_deprec_speed[True-backward] 7.6287ms 7.3133ms 136.7375 Ops/s 134.3435 Ops/s $\color{#35bf28}+1.78\%$
test_redq_deprec_speed[reduce-overhead-None] 3.4175ms 3.2341ms 309.2086 Ops/s 311.2157 Ops/s $\color{#d91a1a}-0.64\%$
test_redq_deprec_speed[reduce-overhead-backward] 7.3869ms 7.1937ms 139.0112 Ops/s 142.4217 Ops/s $\color{#d91a1a}-2.39\%$
test_td3_speed[False-None] 7.6186ms 7.4498ms 134.2312 Ops/s 134.2484 Ops/s $\color{#d91a1a}-0.01\%$
test_td3_speed[False-backward] 10.7729ms 10.3544ms 96.5774 Ops/s 96.1975 Ops/s $\color{#35bf28}+0.39\%$
test_td3_speed[True-None] 2.3821ms 1.9383ms 515.9246 Ops/s 526.1711 Ops/s $\color{#d91a1a}-1.95\%$
test_td3_speed[True-backward] 5.2278ms 4.1207ms 242.6759 Ops/s 259.1283 Ops/s $\textbf{\color{#d91a1a}-6.35\%}$
test_td3_speed[reduce-overhead-None] 1.9604ms 1.9149ms 522.2164 Ops/s 524.3839 Ops/s $\color{#d91a1a}-0.41\%$
test_td3_speed[reduce-overhead-backward] 3.8721ms 3.7663ms 265.5155 Ops/s 269.6632 Ops/s $\color{#d91a1a}-1.54\%$
test_cql_speed[False-None] 27.6623ms 24.7552ms 40.3956 Ops/s 40.5382 Ops/s $\color{#d91a1a}-0.35\%$
test_cql_speed[False-backward] 37.9947ms 34.0773ms 29.3450 Ops/s 29.7658 Ops/s $\color{#d91a1a}-1.41\%$
test_cql_speed[True-None] 11.5944ms 10.9416ms 91.3944 Ops/s 94.3970 Ops/s $\color{#d91a1a}-3.18\%$
test_cql_speed[True-backward] 17.1736ms 16.7294ms 59.7748 Ops/s 61.8130 Ops/s $\color{#d91a1a}-3.30\%$
test_cql_speed[reduce-overhead-None] 11.3133ms 10.8710ms 91.9881 Ops/s 93.8488 Ops/s $\color{#d91a1a}-1.98\%$
test_cql_speed[reduce-overhead-backward] 17.9035ms 16.6484ms 60.0660 Ops/s 61.6102 Ops/s $\color{#d91a1a}-2.51\%$
test_a2c_speed[False-None] 5.5733ms 5.2223ms 191.4874 Ops/s 186.3737 Ops/s $\color{#35bf28}+2.74\%$
test_a2c_speed[False-backward] 12.0578ms 11.6612ms 85.7542 Ops/s 83.9401 Ops/s $\color{#35bf28}+2.16\%$
test_a2c_speed[True-None] 3.3695ms 3.0288ms 330.1590 Ops/s 328.0623 Ops/s $\color{#35bf28}+0.64\%$
test_a2c_speed[True-backward] 8.6493ms 8.4560ms 118.2592 Ops/s 120.4478 Ops/s $\color{#d91a1a}-1.82\%$
test_a2c_speed[reduce-overhead-None] 3.4661ms 3.0559ms 327.2393 Ops/s 325.2129 Ops/s $\color{#35bf28}+0.62\%$
test_a2c_speed[reduce-overhead-backward] 8.7557ms 8.3656ms 119.5375 Ops/s 120.2280 Ops/s $\color{#d91a1a}-0.57\%$
test_ppo_speed[False-None] 6.2801ms 5.5730ms 179.4351 Ops/s 179.6878 Ops/s $\color{#d91a1a}-0.14\%$
test_ppo_speed[False-backward] 12.5907ms 12.2815ms 81.4232 Ops/s 82.6990 Ops/s $\color{#d91a1a}-1.54\%$
test_ppo_speed[True-None] 3.5419ms 3.4302ms 291.5316 Ops/s 292.0370 Ops/s $\color{#d91a1a}-0.17\%$
test_ppo_speed[True-backward] 8.7705ms 8.2628ms 121.0239 Ops/s 122.4525 Ops/s $\color{#d91a1a}-1.17\%$
test_ppo_speed[reduce-overhead-None] 3.6871ms 3.4196ms 292.4357 Ops/s 294.0021 Ops/s $\color{#d91a1a}-0.53\%$
test_ppo_speed[reduce-overhead-backward] 8.5747ms 8.3293ms 120.0576 Ops/s 124.7932 Ops/s $\color{#d91a1a}-3.79\%$
test_reinforce_speed[False-None] 4.8385ms 4.4521ms 224.6135 Ops/s 224.0883 Ops/s $\color{#35bf28}+0.23\%$
test_reinforce_speed[False-backward] 7.6929ms 7.2529ms 137.8768 Ops/s 137.7079 Ops/s $\color{#35bf28}+0.12\%$
test_reinforce_speed[True-None] 2.4085ms 2.1986ms 454.8289 Ops/s 487.9024 Ops/s $\textbf{\color{#d91a1a}-6.78\%}$
test_reinforce_speed[True-backward] 7.4603ms 7.1202ms 140.4451 Ops/s 122.0498 Ops/s $\textbf{\color{#35bf28}+15.07\%}$
test_reinforce_speed[reduce-overhead-None] 2.5916ms 2.2089ms 452.7043 Ops/s 454.5306 Ops/s $\color{#d91a1a}-0.40\%$
test_reinforce_speed[reduce-overhead-backward] 7.2765ms 7.0665ms 141.5120 Ops/s 143.2488 Ops/s $\color{#d91a1a}-1.21\%$
test_iql_speed[False-None] 24.7359ms 19.8251ms 50.4410 Ops/s 51.4955 Ops/s $\color{#d91a1a}-2.05\%$
test_iql_speed[False-backward] 30.9719ms 30.1753ms 33.1396 Ops/s 33.6544 Ops/s $\color{#d91a1a}-1.53\%$
test_iql_speed[True-None] 7.2502ms 6.6939ms 149.3905 Ops/s 149.1852 Ops/s $\color{#35bf28}+0.14\%$
test_iql_speed[True-backward] 19.8274ms 16.8745ms 59.2611 Ops/s 65.3850 Ops/s $\textbf{\color{#d91a1a}-9.37\%}$
test_iql_speed[reduce-overhead-None] 7.0347ms 6.7319ms 148.5471 Ops/s 147.4357 Ops/s $\color{#35bf28}+0.75\%$
test_iql_speed[reduce-overhead-backward] 16.0126ms 15.6095ms 64.0636 Ops/s 65.4288 Ops/s $\color{#d91a1a}-2.09\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4452ms 6.2884ms 159.0241 Ops/s 158.6438 Ops/s $\color{#35bf28}+0.24\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.2874s 0.5266ms 1.8990 KOps/s 2.9707 KOps/s $\textbf{\color{#d91a1a}-36.08\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4856ms 0.2797ms 3.5750 KOps/s 3.1364 KOps/s $\textbf{\color{#35bf28}+13.98\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4104ms 6.0823ms 164.4120 Ops/s 164.4288 Ops/s $\color{#d91a1a}-0.01\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0030ms 0.3055ms 3.2730 KOps/s 3.4236 KOps/s $\color{#d91a1a}-4.40\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5553ms 0.2888ms 3.4627 KOps/s 3.2039 KOps/s $\textbf{\color{#35bf28}+8.08\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4661ms 1.2740ms 784.9512 Ops/s 725.0365 Ops/s $\textbf{\color{#35bf28}+8.26\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4184ms 1.2197ms 819.8506 Ops/s 753.0888 Ops/s $\textbf{\color{#35bf28}+8.87\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3620ms 6.2388ms 160.2878 Ops/s 158.6565 Ops/s $\color{#35bf28}+1.03\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.6313ms 0.4108ms 2.4342 KOps/s 2.0697 KOps/s $\textbf{\color{#35bf28}+17.61\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6476ms 0.3945ms 2.5351 KOps/s 2.1608 KOps/s $\textbf{\color{#35bf28}+17.32\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2695ms 6.0800ms 164.4746 Ops/s 161.2811 Ops/s $\color{#35bf28}+1.98\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8937ms 0.3469ms 2.8831 KOps/s 2.8994 KOps/s $\color{#d91a1a}-0.56\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6033ms 0.3277ms 3.0514 KOps/s 3.0744 KOps/s $\color{#d91a1a}-0.75\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 9.4357ms 6.0923ms 164.1418 Ops/s 163.4018 Ops/s $\color{#35bf28}+0.45\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1708ms 0.3470ms 2.8820 KOps/s 3.3062 KOps/s $\textbf{\color{#d91a1a}-12.83\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5495ms 0.3303ms 3.0275 KOps/s 3.6524 KOps/s $\textbf{\color{#d91a1a}-17.11\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5467ms 6.2593ms 159.7616 Ops/s 159.5949 Ops/s $\color{#35bf28}+0.10\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9867ms 0.4484ms 2.2302 KOps/s 2.2251 KOps/s $\color{#35bf28}+0.23\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7847ms 0.4302ms 2.3247 KOps/s 2.3155 KOps/s $\color{#35bf28}+0.39\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0495ms 5.4839ms 182.3505 Ops/s 187.6075 Ops/s $\color{#d91a1a}-2.80\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 6.5655ms 2.0235ms 494.2020 Ops/s 439.3961 Ops/s $\textbf{\color{#35bf28}+12.47\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 9.0163ms 1.2699ms 787.4434 Ops/s 805.6339 Ops/s $\color{#d91a1a}-2.26\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4009s 13.4107ms 74.5671 Ops/s 189.8785 Ops/s $\textbf{\color{#d91a1a}-60.73\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.1902ms 1.9995ms 500.1368 Ops/s 444.4358 Ops/s $\textbf{\color{#35bf28}+12.53\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.0669ms 1.2193ms 820.1595 Ops/s 872.1295 Ops/s $\textbf{\color{#d91a1a}-5.96\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.6160ms 5.7143ms 174.9985 Ops/s 185.0100 Ops/s $\textbf{\color{#d91a1a}-5.41\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.0814ms 2.2015ms 454.2322 Ops/s 407.7918 Ops/s $\textbf{\color{#35bf28}+11.39\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.3873ms 1.4239ms 702.2964 Ops/s 799.9041 Ops/s $\textbf{\color{#d91a1a}-12.20\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.9011ms 13.2215ms 75.6344 Ops/s 77.5219 Ops/s $\color{#d91a1a}-2.43\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.5011ms 16.7719ms 59.6236 Ops/s 60.4566 Ops/s $\color{#d91a1a}-1.38\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.3061ms 17.9042ms 55.8529 Ops/s 56.5286 Ops/s $\color{#d91a1a}-1.20\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.6882ms 17.2622ms 57.9302 Ops/s 59.4300 Ops/s $\color{#d91a1a}-2.52\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.4995ms 18.7062ms 53.4583 Ops/s 56.5522 Ops/s $\textbf{\color{#d91a1a}-5.47\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 19.3958ms 18.5877ms 53.7991 Ops/s 55.1717 Ops/s $\color{#d91a1a}-2.49\%$
