[Feature] TensorDictMap Query module #2305

Merged (5 commits) on Oct 14, 2024


[ghstack-poisoned]

pytorch-bot bot commented Jul 22, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2305

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 5 Unrelated Failures

As of commit f51a567 with merge base 194a5ff:

NEW FAILURES - The following jobs have failed:

  • Build Windows Wheels / pytorch/rl / upload / wheel-py3_9-cpu (gh)
    ERROR: failed to solve: python:3.12-slim: failed to resolve source metadata for docker.io/library/python:3.12-slim: failed to copy: httpReadSeeker: failed open: unexpected status code https://registry-1.docker.io/v2/library/python/manifests/sha256:af4e85f1cac90dd3771e47292ea7c8a9830abfabbe4faa5c53f158854c2e819d: 429 Too Many Requests - Server message: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
  • Habitat Tests on Linux / tests (3.9, 12.1) / linux-job (gh)
    RuntimeError: Command docker exec -t 1eeee06f3fe9d0802ab60b0b80615587d538b813c3e729fe36e32d23d764408b /exec failed with exit code 134

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label on Jul 22, 2024
@vmoens mentioned this pull request on Jul 22, 2024
@vmoens added the "enhancement" label on Jul 22, 2024
[ghstack-poisoned]

github-actions bot commented Jul 23, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 143. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}5$.

Detailed results (columns: Name | Max | Mean | Ops | Ops on Repo HEAD | Change):
test_simple 0.4191s 0.4177s 2.3942 Ops/s 2.4032 Ops/s $\color{#d91a1a}-0.38\%$
test_transformed 0.7079s 0.6178s 1.6187 Ops/s 1.6896 Ops/s $\color{#d91a1a}-4.20\%$
test_serial 1.4456s 1.3637s 0.7333 Ops/s 0.7455 Ops/s $\color{#d91a1a}-1.63\%$
test_parallel 1.2444s 1.2257s 0.8158 Ops/s 0.7934 Ops/s $\color{#35bf28}+2.83\%$
test_step_mdp_speed[True-True-True-True-True] 0.1883ms 28.9159μs 34.5831 KOps/s 35.0267 KOps/s $\color{#d91a1a}-1.27\%$
test_step_mdp_speed[True-True-True-True-False] 50.5750μs 17.1503μs 58.3079 KOps/s 58.2549 KOps/s $\color{#35bf28}+0.09\%$
test_step_mdp_speed[True-True-True-False-True] 78.5080μs 16.1312μs 61.9916 KOps/s 62.3369 KOps/s $\color{#d91a1a}-0.55\%$
test_step_mdp_speed[True-True-True-False-False] 34.2240μs 9.4978μs 105.2879 KOps/s 103.9227 KOps/s $\color{#35bf28}+1.31\%$
test_step_mdp_speed[True-True-False-True-True] 89.8290μs 31.1109μs 32.1430 KOps/s 32.3700 KOps/s $\color{#d91a1a}-0.70\%$
test_step_mdp_speed[True-True-False-True-False] 48.0900μs 19.2937μs 51.8304 KOps/s 51.5877 KOps/s $\color{#35bf28}+0.47\%$
test_step_mdp_speed[True-True-False-False-True] 77.0440μs 18.2701μs 54.7343 KOps/s 55.3095 KOps/s $\color{#d91a1a}-1.04\%$
test_step_mdp_speed[True-True-False-False-False] 53.4200μs 11.6085μs 86.1437 KOps/s 85.2604 KOps/s $\color{#35bf28}+1.04\%$
test_step_mdp_speed[True-False-True-True-True] 0.1021ms 33.0982μs 30.2132 KOps/s 30.2038 KOps/s $\color{#35bf28}+0.03\%$
test_step_mdp_speed[True-False-True-True-False] 73.2570μs 21.3771μs 46.7790 KOps/s 46.5421 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[True-False-True-False-True] 46.0870μs 18.4377μs 54.2367 KOps/s 55.0247 KOps/s $\color{#d91a1a}-1.43\%$
test_step_mdp_speed[True-False-True-False-False] 61.6060μs 11.6706μs 85.6857 KOps/s 86.9954 KOps/s $\color{#d91a1a}-1.51\%$
test_step_mdp_speed[True-False-False-True-True] 94.7780μs 34.8780μs 28.6713 KOps/s 28.0992 KOps/s $\color{#35bf28}+2.04\%$
test_step_mdp_speed[True-False-False-True-False] 84.8090μs 23.2466μs 43.0170 KOps/s 41.4849 KOps/s $\color{#35bf28}+3.69\%$
test_step_mdp_speed[True-False-False-False-True] 64.4610μs 20.1732μs 49.5706 KOps/s 49.8264 KOps/s $\color{#d91a1a}-0.51\%$
test_step_mdp_speed[True-False-False-False-False] 56.5060μs 13.6572μs 73.2214 KOps/s 72.9916 KOps/s $\color{#35bf28}+0.31\%$
test_step_mdp_speed[False-True-True-True-True] 82.7550μs 33.5387μs 29.8163 KOps/s 30.2006 KOps/s $\color{#d91a1a}-1.27\%$
test_step_mdp_speed[False-True-True-True-False] 60.8840μs 21.3751μs 46.7834 KOps/s 46.1270 KOps/s $\color{#35bf28}+1.42\%$
test_step_mdp_speed[False-True-True-False-True] 83.2060μs 21.4570μs 46.6048 KOps/s 46.3629 KOps/s $\color{#35bf28}+0.52\%$
test_step_mdp_speed[False-True-True-False-False] 2.5544ms 13.4673μs 74.2537 KOps/s 73.3619 KOps/s $\color{#35bf28}+1.22\%$
test_step_mdp_speed[False-True-False-True-True] 0.1124ms 35.0271μs 28.5493 KOps/s 28.4061 KOps/s $\color{#35bf28}+0.50\%$
test_step_mdp_speed[False-True-False-True-False] 68.2380μs 23.4499μs 42.6441 KOps/s 42.2731 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[False-True-False-False-True] 58.0690μs 23.6797μs 42.2302 KOps/s 42.2011 KOps/s $\color{#35bf28}+0.07\%$
test_step_mdp_speed[False-True-False-False-False] 63.4290μs 15.4093μs 64.8957 KOps/s 64.2536 KOps/s $\color{#35bf28}+1.00\%$
test_step_mdp_speed[False-False-True-True-True] 94.4730μs 36.8953μs 27.1038 KOps/s 26.6693 KOps/s $\color{#35bf28}+1.63\%$
test_step_mdp_speed[False-False-True-True-False] 78.4010μs 25.5159μs 39.1912 KOps/s 38.3718 KOps/s $\color{#35bf28}+2.14\%$
test_step_mdp_speed[False-False-True-False-True] 63.5800μs 23.5917μs 42.3878 KOps/s 42.1686 KOps/s $\color{#35bf28}+0.52\%$
test_step_mdp_speed[False-False-True-False-False] 57.4380μs 15.2649μs 65.5100 KOps/s 64.4493 KOps/s $\color{#35bf28}+1.65\%$
test_step_mdp_speed[False-False-False-True-True] 96.9310μs 38.4461μs 26.0104 KOps/s 25.5391 KOps/s $\color{#35bf28}+1.85\%$
test_step_mdp_speed[False-False-False-True-False] 67.9270μs 27.1804μs 36.7913 KOps/s 35.6226 KOps/s $\color{#35bf28}+3.28\%$
test_step_mdp_speed[False-False-False-False-True] 87.1730μs 25.2856μs 39.5482 KOps/s 39.9204 KOps/s $\color{#d91a1a}-0.93\%$
test_step_mdp_speed[False-False-False-False-False] 70.6330μs 17.2294μs 58.0405 KOps/s 56.8699 KOps/s $\color{#35bf28}+2.06\%$
test_values[generalized_advantage_estimate-True-True] 10.5439ms 9.3524ms 106.9249 Ops/s 101.5088 Ops/s $\textbf{\color{#35bf28}+5.34\%}$
test_values[vec_generalized_advantage_estimate-True-True] 38.0089ms 35.9280ms 27.8334 Ops/s 29.8030 Ops/s $\textbf{\color{#d91a1a}-6.61\%}$
test_values[td0_return_estimate-False-False] 0.2222ms 0.1712ms 5.8401 KOps/s 5.5989 KOps/s $\color{#35bf28}+4.31\%$
test_values[td1_return_estimate-False-False] 27.5913ms 23.5383ms 42.4839 Ops/s 41.0513 Ops/s $\color{#35bf28}+3.49\%$
test_values[vec_td1_return_estimate-False-False] 38.8905ms 36.1354ms 27.6737 Ops/s 29.7987 Ops/s $\textbf{\color{#d91a1a}-7.13\%}$
test_values[td_lambda_return_estimate-True-False] 35.2889ms 33.8422ms 29.5489 Ops/s 28.7389 Ops/s $\color{#35bf28}+2.82\%$
test_values[vec_td_lambda_return_estimate-True-False] 39.0478ms 36.1098ms 27.6933 Ops/s 29.7636 Ops/s $\textbf{\color{#d91a1a}-6.96\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 12.3070ms 8.2566ms 121.1149 Ops/s 116.5545 Ops/s $\color{#35bf28}+3.91\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.1042ms 1.7773ms 562.6659 Ops/s 493.1469 Ops/s $\textbf{\color{#35bf28}+14.10\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5150ms 0.3534ms 2.8296 KOps/s 2.7921 KOps/s $\color{#35bf28}+1.34\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 48.7545ms 47.0107ms 21.2717 Ops/s 24.8745 Ops/s $\textbf{\color{#d91a1a}-14.48\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.7470ms 3.0398ms 328.9678 Ops/s 326.6142 Ops/s $\color{#35bf28}+0.72\%$
test_dqn_speed[False-None] 1.7913ms 1.3422ms 745.0731 Ops/s 749.4554 Ops/s $\color{#d91a1a}-0.58\%$
test_dqn_speed[False-backward] 2.1509ms 1.8410ms 543.1975 Ops/s 545.8213 Ops/s $\color{#d91a1a}-0.48\%$
test_dqn_speed[True-None] 1.2446ms 0.4663ms 2.1446 KOps/s 2.0684 KOps/s $\color{#35bf28}+3.68\%$
test_dqn_speed[True-backward] 0.9224ms 0.8818ms 1.1341 KOps/s 1.1121 KOps/s $\color{#35bf28}+1.98\%$
test_dqn_speed[reduce-overhead-None] 0.6845ms 0.4698ms 2.1286 KOps/s 2.1372 KOps/s $\color{#d91a1a}-0.41\%$
test_dqn_speed[reduce-overhead-backward] 0.9355ms 0.8772ms 1.1400 KOps/s 1.1004 KOps/s $\color{#35bf28}+3.60\%$
test_ddpg_speed[False-None] 3.8384ms 2.7898ms 358.4441 Ops/s 352.3652 Ops/s $\color{#35bf28}+1.73\%$
test_ddpg_speed[False-backward] 4.3303ms 3.9496ms 253.1882 Ops/s 241.8656 Ops/s $\color{#35bf28}+4.68\%$
test_ddpg_speed[True-None] 1.3577ms 1.0220ms 978.4809 Ops/s 983.4312 Ops/s $\color{#d91a1a}-0.50\%$
test_ddpg_speed[True-backward] 2.0288ms 1.9278ms 518.7141 Ops/s 523.5745 Ops/s $\color{#d91a1a}-0.93\%$
test_ddpg_speed[reduce-overhead-None] 1.3107ms 1.0247ms 975.8769 Ops/s 987.2365 Ops/s $\color{#d91a1a}-1.15\%$
test_ddpg_speed[reduce-overhead-backward] 2.3907ms 1.9660ms 508.6502 Ops/s 523.0039 Ops/s $\color{#d91a1a}-2.74\%$
test_sac_speed[False-None] 9.6999ms 8.0578ms 124.1041 Ops/s 123.5267 Ops/s $\color{#35bf28}+0.47\%$
test_sac_speed[False-backward] 11.2675ms 10.8078ms 92.5260 Ops/s 91.6661 Ops/s $\color{#35bf28}+0.94\%$
test_sac_speed[True-None] 2.4805ms 1.8675ms 535.4863 Ops/s 535.0530 Ops/s $\color{#35bf28}+0.08\%$
test_sac_speed[True-backward] 3.6815ms 3.5719ms 279.9636 Ops/s 279.3726 Ops/s $\color{#35bf28}+0.21\%$
test_sac_speed[reduce-overhead-None] 5.8541ms 1.8885ms 529.5261 Ops/s 533.1266 Ops/s $\color{#d91a1a}-0.68\%$
test_sac_speed[reduce-overhead-backward] 3.7058ms 3.6053ms 277.3668 Ops/s 275.0708 Ops/s $\color{#35bf28}+0.83\%$
test_redq_speed[False-None] 18.1836ms 13.2197ms 75.6446 Ops/s 76.2478 Ops/s $\color{#d91a1a}-0.79\%$
test_redq_speed[False-backward] 27.1215ms 22.5959ms 44.2558 Ops/s 44.3158 Ops/s $\color{#d91a1a}-0.14\%$
test_redq_speed[True-None] 5.6742ms 4.9620ms 201.5332 Ops/s 204.6146 Ops/s $\color{#d91a1a}-1.51\%$
test_redq_speed[True-backward] 13.4955ms 12.5289ms 79.8156 Ops/s 80.4837 Ops/s $\color{#d91a1a}-0.83\%$
test_redq_speed[reduce-overhead-None] 6.0316ms 5.1216ms 195.2512 Ops/s 204.9644 Ops/s $\color{#d91a1a}-4.74\%$
test_redq_speed[reduce-overhead-backward] 13.3221ms 12.6282ms 79.1879 Ops/s 80.5231 Ops/s $\color{#d91a1a}-1.66\%$
test_redq_deprec_speed[False-None] 15.1300ms 12.9355ms 77.3069 Ops/s 77.7242 Ops/s $\color{#d91a1a}-0.54\%$
test_redq_deprec_speed[False-backward] 20.2482ms 18.8572ms 53.0302 Ops/s 53.3842 Ops/s $\color{#d91a1a}-0.66\%$
test_redq_deprec_speed[True-None] 4.2981ms 3.7361ms 267.6586 Ops/s 269.6950 Ops/s $\color{#d91a1a}-0.76\%$
test_redq_deprec_speed[True-backward] 9.4172ms 8.6326ms 115.8406 Ops/s 120.0898 Ops/s $\color{#d91a1a}-3.54\%$
test_redq_deprec_speed[reduce-overhead-None] 4.3492ms 3.6738ms 272.1940 Ops/s 275.5025 Ops/s $\color{#d91a1a}-1.20\%$
test_redq_deprec_speed[reduce-overhead-backward] 9.6369ms 8.5491ms 116.9715 Ops/s 122.3062 Ops/s $\color{#d91a1a}-4.36\%$
test_td3_speed[False-None] 9.1259ms 7.8525ms 127.3488 Ops/s 124.9256 Ops/s $\color{#35bf28}+1.94\%$
test_td3_speed[False-backward] 10.6214ms 10.1665ms 98.3620 Ops/s 96.1021 Ops/s $\color{#35bf28}+2.35\%$
test_td3_speed[True-None] 1.9067ms 1.7616ms 567.6815 Ops/s 560.6387 Ops/s $\color{#35bf28}+1.26\%$
test_td3_speed[True-backward] 3.7942ms 3.4062ms 293.5843 Ops/s 285.0962 Ops/s $\color{#35bf28}+2.98\%$
test_td3_speed[reduce-overhead-None] 2.1031ms 1.7711ms 564.6246 Ops/s 564.4405 Ops/s $\color{#35bf28}+0.03\%$
test_td3_speed[reduce-overhead-backward] 5.0346ms 3.4323ms 291.3471 Ops/s 295.3158 Ops/s $\color{#d91a1a}-1.34\%$
test_cql_speed[False-None] 55.9261ms 37.0979ms 26.9557 Ops/s 27.6836 Ops/s $\color{#d91a1a}-2.63\%$
test_cql_speed[False-backward] 51.8645ms 46.8956ms 21.3240 Ops/s 21.1230 Ops/s $\color{#35bf28}+0.95\%$
test_cql_speed[True-None] 18.3753ms 15.8985ms 62.8992 Ops/s 62.8656 Ops/s $\color{#35bf28}+0.05\%$
test_cql_speed[True-backward] 29.5467ms 22.7415ms 43.9725 Ops/s 44.4570 Ops/s $\color{#d91a1a}-1.09\%$
test_cql_speed[reduce-overhead-None] 16.5070ms 15.9875ms 62.5487 Ops/s 63.0601 Ops/s $\color{#d91a1a}-0.81\%$
test_cql_speed[reduce-overhead-backward] 23.7902ms 22.5582ms 44.3297 Ops/s 44.1175 Ops/s $\color{#35bf28}+0.48\%$
test_a2c_speed[False-None] 8.7233ms 7.2422ms 138.0801 Ops/s 134.6263 Ops/s $\color{#35bf28}+2.57\%$
test_a2c_speed[False-backward] 14.6714ms 14.4037ms 69.4266 Ops/s 67.5765 Ops/s $\color{#35bf28}+2.74\%$
test_a2c_speed[True-None] 3.7674ms 3.3480ms 298.6836 Ops/s 294.3162 Ops/s $\color{#35bf28}+1.48\%$
test_a2c_speed[True-backward] 10.4115ms 9.9330ms 100.6742 Ops/s 98.2914 Ops/s $\color{#35bf28}+2.42\%$
test_a2c_speed[reduce-overhead-None] 3.6288ms 3.3221ms 301.0101 Ops/s 296.7163 Ops/s $\color{#35bf28}+1.45\%$
test_a2c_speed[reduce-overhead-backward] 10.7815ms 10.0891ms 99.1172 Ops/s 99.5740 Ops/s $\color{#d91a1a}-0.46\%$
test_ppo_speed[False-None] 9.8621ms 7.5513ms 132.4278 Ops/s 130.7668 Ops/s $\color{#35bf28}+1.27\%$
test_ppo_speed[False-backward] 15.6301ms 15.0313ms 66.5277 Ops/s 65.1749 Ops/s $\color{#35bf28}+2.08\%$
test_ppo_speed[True-None] 4.0817ms 3.7690ms 265.3219 Ops/s 265.9848 Ops/s $\color{#d91a1a}-0.25\%$
test_ppo_speed[True-backward] 12.2631ms 10.1659ms 98.3678 Ops/s 101.7844 Ops/s $\color{#d91a1a}-3.36\%$
test_ppo_speed[reduce-overhead-None] 3.9961ms 3.7405ms 267.3409 Ops/s 264.2850 Ops/s $\color{#35bf28}+1.16\%$
test_ppo_speed[reduce-overhead-backward] 10.9868ms 10.0577ms 99.4263 Ops/s 97.1241 Ops/s $\color{#35bf28}+2.37\%$
test_reinforce_speed[False-None] 7.2190ms 6.4994ms 153.8610 Ops/s 149.4317 Ops/s $\color{#35bf28}+2.96\%$
test_reinforce_speed[False-backward] 11.8644ms 9.8653ms 101.3654 Ops/s 99.8862 Ops/s $\color{#35bf28}+1.48\%$
test_reinforce_speed[True-None] 3.1640ms 2.7484ms 363.8430 Ops/s 367.4071 Ops/s $\color{#d91a1a}-0.97\%$
test_reinforce_speed[True-backward] 9.5238ms 8.9784ms 111.3786 Ops/s 109.1430 Ops/s $\color{#35bf28}+2.05\%$
test_reinforce_speed[reduce-overhead-None] 3.5115ms 2.7662ms 361.5075 Ops/s 367.7668 Ops/s $\color{#d91a1a}-1.70\%$
test_reinforce_speed[reduce-overhead-backward] 9.5677ms 9.0269ms 110.7799 Ops/s 110.6119 Ops/s $\color{#35bf28}+0.15\%$
test_iql_speed[False-None] 33.9317ms 32.4360ms 30.8299 Ops/s 29.9723 Ops/s $\color{#35bf28}+2.86\%$
test_iql_speed[False-backward] 45.8301ms 44.5059ms 22.4689 Ops/s 21.8812 Ops/s $\color{#35bf28}+2.69\%$
test_iql_speed[True-None] 14.1747ms 13.6241ms 73.3995 Ops/s 72.6267 Ops/s $\color{#35bf28}+1.06\%$
test_iql_speed[True-backward] 26.3043ms 25.1693ms 39.7309 Ops/s 39.8868 Ops/s $\color{#d91a1a}-0.39\%$
test_iql_speed[reduce-overhead-None] 14.3371ms 13.8050ms 72.4375 Ops/s 72.0649 Ops/s $\color{#35bf28}+0.52\%$
test_iql_speed[reduce-overhead-backward] 30.4845ms 25.7728ms 38.8005 Ops/s 39.5900 Ops/s $\color{#d91a1a}-1.99\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.3117ms 4.9357ms 202.6036 Ops/s 198.3574 Ops/s $\color{#35bf28}+2.14\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.9786ms 0.4904ms 2.0391 KOps/s 2.0428 KOps/s $\color{#d91a1a}-0.18\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6931ms 0.4694ms 2.1305 KOps/s 2.1467 KOps/s $\color{#d91a1a}-0.75\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.9317ms 4.9451ms 202.2187 Ops/s 197.4770 Ops/s $\color{#35bf28}+2.40\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7858ms 0.4835ms 2.0681 KOps/s 2.0571 KOps/s $\color{#35bf28}+0.54\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7034ms 0.4625ms 2.1621 KOps/s 2.1547 KOps/s $\color{#35bf28}+0.34\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.0370ms 1.5773ms 633.9755 Ops/s 622.9575 Ops/s $\color{#35bf28}+1.77\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.8266ms 1.5269ms 654.9367 Ops/s 642.2732 Ops/s $\color{#35bf28}+1.97\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 7.0141ms 5.0991ms 196.1117 Ops/s 190.3455 Ops/s $\color{#35bf28}+3.03\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.5568ms 0.6206ms 1.6113 KOps/s 1.5872 KOps/s $\color{#35bf28}+1.51\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8488ms 0.5987ms 1.6704 KOps/s 1.6628 KOps/s $\color{#35bf28}+0.45\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.3972ms 4.8671ms 205.4604 Ops/s 193.0289 Ops/s $\textbf{\color{#35bf28}+6.44\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7067ms 0.4870ms 2.0532 KOps/s 2.0483 KOps/s $\color{#35bf28}+0.24\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 7.8793ms 0.4766ms 2.0981 KOps/s 2.0987 KOps/s $\color{#d91a1a}-0.03\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 7.7125ms 4.9822ms 200.7150 Ops/s 189.5772 Ops/s $\textbf{\color{#35bf28}+5.88\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.5171ms 0.4903ms 2.0394 KOps/s 2.0410 KOps/s $\color{#d91a1a}-0.08\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6758ms 0.4719ms 2.1191 KOps/s 2.1809 KOps/s $\color{#d91a1a}-2.83\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.4286ms 5.2222ms 191.4920 Ops/s 192.3443 Ops/s $\color{#d91a1a}-0.44\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.4461ms 0.6352ms 1.5744 KOps/s 1.5746 KOps/s $\color{#d91a1a}-0.01\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8338ms 0.6053ms 1.6519 KOps/s 1.6488 KOps/s $\color{#35bf28}+0.19\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4770s 13.7694ms 72.6246 Ops/s 220.4532 Ops/s $\textbf{\color{#d91a1a}-67.06\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.2699ms 2.3433ms 426.7425 Ops/s 392.7470 Ops/s $\textbf{\color{#35bf28}+8.66\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.8772ms 1.1927ms 838.4455 Ops/s 786.9968 Ops/s $\textbf{\color{#35bf28}+6.54\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.1469ms 4.2650ms 234.4655 Ops/s 233.2310 Ops/s $\color{#35bf28}+0.53\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 7.0432ms 2.3522ms 425.1423 Ops/s 432.5034 Ops/s $\color{#d91a1a}-1.70\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.2864ms 1.3280ms 753.0207 Ops/s 772.2336 Ops/s $\color{#d91a1a}-2.49\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3954s 12.4456ms 80.3496 Ops/s 33.8948 Ops/s $\textbf{\color{#35bf28}+137.06\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.9909ms 2.4626ms 406.0733 Ops/s 424.6032 Ops/s $\color{#d91a1a}-4.36\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.7606ms 1.4515ms 688.9637 Ops/s 659.5056 Ops/s $\color{#35bf28}+4.47\%$
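
For anyone cross-checking the table above: the Change column appears to be the relative difference between Ops (this PR) and Ops on Repo HEAD. The sketch below reproduces three rows under that assumption. The function names `percent_change` and `classify` are hypothetical, and the 5% threshold is only inferred from which rows are bolded in the table; it is not taken from the benchmark workflow itself.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the bold rows."""
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_transformed": (1.6187, 1.6896),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (72.6246, 220.4532),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (80.3496, 33.8948),
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions the three rows come out at -4.20%, -67.06% and +137.06%, matching the reported values. The two large `test_rb_populate` swings both have a Max far above the Mean (0.4770s vs 13.7694ms, 0.3954s vs 12.4456ms), which suggests outlier runs rather than a real regression or speed-up.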


github-actions bot commented Jul 23, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 143. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}13$.

Detailed results (columns: Name | Max | Mean | Ops | Ops on Repo HEAD | Change):
test_simple 0.7336s 0.7231s 1.3830 Ops/s 1.4008 Ops/s $\color{#d91a1a}-1.27\%$
test_transformed 1.0407s 0.9673s 1.0338 Ops/s 1.0570 Ops/s $\color{#d91a1a}-2.20\%$
test_serial 2.1481s 2.0717s 0.4827 Ops/s 0.4842 Ops/s $\color{#d91a1a}-0.30\%$
test_parallel 1.9422s 1.8684s 0.5352 Ops/s 0.5377 Ops/s $\color{#d91a1a}-0.47\%$
test_step_mdp_speed[True-True-True-True-True] 0.1949ms 37.9364μs 26.3599 KOps/s 26.6075 KOps/s $\color{#d91a1a}-0.93\%$
test_step_mdp_speed[True-True-True-True-False] 0.2151ms 22.3081μs 44.8268 KOps/s 45.1059 KOps/s $\color{#d91a1a}-0.62\%$
test_step_mdp_speed[True-True-True-False-True] 47.9710μs 20.3157μs 49.2231 KOps/s 50.3317 KOps/s $\color{#d91a1a}-2.20\%$
test_step_mdp_speed[True-True-True-False-False] 65.7510μs 11.6162μs 86.0865 KOps/s 84.9230 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[True-True-False-True-True] 0.1062ms 40.4994μs 24.6917 KOps/s 25.2890 KOps/s $\color{#d91a1a}-2.36\%$
test_step_mdp_speed[True-True-False-True-False] 45.7600μs 24.4440μs 40.9098 KOps/s 40.9879 KOps/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[True-True-False-False-True] 60.3410μs 23.3670μs 42.7955 KOps/s 43.8683 KOps/s $\color{#d91a1a}-2.45\%$
test_step_mdp_speed[True-True-False-False-False] 48.9110μs 14.4236μs 69.3307 KOps/s 69.1266 KOps/s $\color{#35bf28}+0.30\%$
test_step_mdp_speed[True-False-True-True-True] 0.1725ms 42.4046μs 23.5823 KOps/s 23.3635 KOps/s $\color{#35bf28}+0.94\%$
test_step_mdp_speed[True-False-True-True-False] 62.2010μs 27.1129μs 36.8828 KOps/s 36.7806 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[True-False-True-False-True] 85.1010μs 22.8465μs 43.7704 KOps/s 44.9467 KOps/s $\color{#d91a1a}-2.62\%$
test_step_mdp_speed[True-False-True-False-False] 0.4194ms 14.3097μs 69.8825 KOps/s 69.8389 KOps/s $\color{#35bf28}+0.06\%$
test_step_mdp_speed[True-False-False-True-True] 88.7910μs 45.2239μs 22.1122 KOps/s 22.3474 KOps/s $\color{#d91a1a}-1.05\%$
test_step_mdp_speed[True-False-False-True-False] 0.1138ms 29.5953μs 33.7891 KOps/s 34.1455 KOps/s $\color{#d91a1a}-1.04\%$
test_step_mdp_speed[True-False-False-False-True] 52.5310μs 25.5103μs 39.1999 KOps/s 40.1369 KOps/s $\color{#d91a1a}-2.33\%$
test_step_mdp_speed[True-False-False-False-False] 44.2200μs 17.1208μs 58.4084 KOps/s 58.2005 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[False-True-True-True-True] 82.5010μs 42.8858μs 23.3177 KOps/s 23.4674 KOps/s $\color{#d91a1a}-0.64\%$
test_step_mdp_speed[False-True-True-True-False] 89.3910μs 26.6350μs 37.5446 KOps/s 36.9582 KOps/s $\color{#35bf28}+1.59\%$
test_step_mdp_speed[False-True-True-False-True] 57.4710μs 26.9729μs 37.0743 KOps/s 36.3877 KOps/s $\color{#35bf28}+1.89\%$
test_step_mdp_speed[False-True-True-False-False] 2.9506ms 17.1863μs 58.1857 KOps/s 51.6055 KOps/s $\textbf{\color{#35bf28}+12.75\%}$
test_step_mdp_speed[False-True-False-True-True] 0.1726ms 46.0096μs 21.7346 KOps/s 21.9431 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[False-True-False-True-False] 55.0810μs 29.4761μs 33.9258 KOps/s 33.7582 KOps/s $\color{#35bf28}+0.50\%$
test_step_mdp_speed[False-True-False-False-True] 64.7010μs 30.7166μs 32.5557 KOps/s 33.3576 KOps/s $\color{#d91a1a}-2.40\%$
test_step_mdp_speed[False-True-False-False-False] 41.8310μs 19.5805μs 51.0713 KOps/s 52.2297 KOps/s $\color{#d91a1a}-2.22\%$
test_step_mdp_speed[False-False-True-True-True] 79.8220μs 48.4908μs 20.6225 KOps/s 21.1114 KOps/s $\color{#d91a1a}-2.32\%$
test_step_mdp_speed[False-False-True-True-False] 63.7110μs 31.8580μs 31.3893 KOps/s 31.1929 KOps/s $\color{#35bf28}+0.63\%$
test_step_mdp_speed[False-False-True-False-True] 56.5310μs 29.5400μs 33.8524 KOps/s 34.4793 KOps/s $\color{#d91a1a}-1.82\%$
test_step_mdp_speed[False-False-True-False-False] 0.1733ms 19.5015μs 51.2782 KOps/s 52.4827 KOps/s $\color{#d91a1a}-2.30\%$
test_step_mdp_speed[False-False-False-True-True] 99.2820μs 50.5027μs 19.8009 KOps/s 20.1347 KOps/s $\color{#d91a1a}-1.66\%$
test_step_mdp_speed[False-False-False-True-False] 63.0810μs 34.9034μs 28.6505 KOps/s 29.0350 KOps/s $\color{#d91a1a}-1.32\%$
test_step_mdp_speed[False-False-False-False-True] 65.6710μs 32.9251μs 30.3720 KOps/s 31.2552 KOps/s $\color{#d91a1a}-2.83\%$
test_step_mdp_speed[False-False-False-False-False] 56.4510μs 21.9408μs 45.5773 KOps/s 46.2604 KOps/s $\color{#d91a1a}-1.48\%$
test_values[generalized_advantage_estimate-True-True] 25.0638ms 24.6704ms 40.5344 Ops/s 40.0719 Ops/s $\color{#35bf28}+1.15\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1023s 2.9267ms 341.6830 Ops/s 360.2288 Ops/s $\textbf{\color{#d91a1a}-5.15\%}$
test_values[td0_return_estimate-False-False] 84.8210μs 64.6536μs 15.4670 KOps/s 15.0390 KOps/s $\color{#35bf28}+2.85\%$
test_values[td1_return_estimate-False-False] 55.1343ms 54.4572ms 18.3630 Ops/s 17.6491 Ops/s $\color{#35bf28}+4.05\%$
test_values[vec_td1_return_estimate-False-False] 1.4118ms 1.0639ms 939.9547 Ops/s 938.7133 Ops/s $\color{#35bf28}+0.13\%$
test_values[td_lambda_return_estimate-True-False] 87.1889ms 86.1652ms 11.6056 Ops/s 11.6596 Ops/s $\color{#d91a1a}-0.46\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3865ms 1.0580ms 945.1375 Ops/s 938.7483 Ops/s $\color{#35bf28}+0.68\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.7180ms 24.3421ms 41.0811 Ops/s 41.1595 Ops/s $\color{#d91a1a}-0.19\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0512ms 0.7291ms 1.3715 KOps/s 1.3731 KOps/s $\color{#d91a1a}-0.12\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8033ms 0.6457ms 1.5487 KOps/s 1.5494 KOps/s $\color{#d91a1a}-0.05\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6053ms 1.4602ms 684.8348 Ops/s 686.4341 Ops/s $\color{#d91a1a}-0.23\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8318ms 0.6846ms 1.4607 KOps/s 1.5086 KOps/s $\color{#d91a1a}-3.18\%$
test_dqn_speed[False-None] 7.1599ms 1.3175ms 759.0046 Ops/s 757.6091 Ops/s $\color{#35bf28}+0.18\%$
test_dqn_speed[False-backward] 2.0296ms 1.8360ms 544.6754 Ops/s 538.0500 Ops/s $\color{#35bf28}+1.23\%$
test_dqn_speed[True-None] 0.7180ms 0.5614ms 1.7813 KOps/s 1.7804 KOps/s $\color{#35bf28}+0.05\%$
test_dqn_speed[True-backward] 1.0560ms 1.0068ms 993.2889 Ops/s 873.5478 Ops/s $\textbf{\color{#35bf28}+13.71\%}$
test_dqn_speed[reduce-overhead-None] 0.7106ms 0.5571ms 1.7949 KOps/s 1.7302 KOps/s $\color{#35bf28}+3.74\%$
test_dqn_speed[reduce-overhead-backward] 1.1100ms 1.0233ms 977.2509 Ops/s 976.2229 Ops/s $\color{#35bf28}+0.11\%$
test_ddpg_speed[False-None] 3.3936ms 2.6857ms 372.3428 Ops/s 369.3441 Ops/s $\color{#35bf28}+0.81\%$
test_ddpg_speed[False-backward] 4.2808ms 3.9929ms 250.4449 Ops/s 252.0412 Ops/s $\color{#d91a1a}-0.63\%$
test_ddpg_speed[True-None] 1.5681ms 1.2613ms 792.8620 Ops/s 790.3255 Ops/s $\color{#35bf28}+0.32\%$
test_ddpg_speed[True-backward] 2.4462ms 2.2898ms 436.7129 Ops/s 444.2785 Ops/s $\color{#d91a1a}-1.70\%$
test_ddpg_speed[reduce-overhead-None] 1.4802ms 1.2722ms 786.0662 Ops/s 780.9594 Ops/s $\color{#35bf28}+0.65\%$
test_ddpg_speed[reduce-overhead-backward] 2.4569ms 2.2687ms 440.7879 Ops/s 444.4018 Ops/s $\color{#d91a1a}-0.81\%$
test_sac_speed[False-None] 7.8737ms 7.5395ms 132.6340 Ops/s 129.9235 Ops/s $\color{#35bf28}+2.09\%$
test_sac_speed[False-backward] 11.3015ms 10.8818ms 91.8963 Ops/s 91.2623 Ops/s $\color{#35bf28}+0.69\%$
test_sac_speed[True-None] 2.2468ms 2.0673ms 483.7281 Ops/s 482.2828 Ops/s $\color{#35bf28}+0.30\%$
test_sac_speed[True-backward] 4.4621ms 4.0482ms 247.0227 Ops/s 235.6173 Ops/s $\color{#35bf28}+4.84\%$
test_sac_speed[reduce-overhead-None] 2.2453ms 2.0701ms 483.0609 Ops/s 481.0060 Ops/s $\color{#35bf28}+0.43\%$
test_sac_speed[reduce-overhead-backward] 4.2901ms 4.0774ms 245.2547 Ops/s 244.7340 Ops/s $\color{#35bf28}+0.21\%$
test_redq_speed[False-None] 15.8306ms 10.8579ms 92.0986 Ops/s 83.0241 Ops/s $\textbf{\color{#35bf28}+10.93\%}$
test_redq_speed[False-backward] 19.2665ms 17.8094ms 56.1501 Ops/s 54.9731 Ops/s $\color{#35bf28}+2.14\%$
test_redq_speed[True-None] 3.9002ms 3.7000ms 270.2733 Ops/s 268.5700 Ops/s $\color{#35bf28}+0.63\%$
test_redq_speed[True-backward] 9.4550ms 8.8789ms 112.6271 Ops/s 112.8067 Ops/s $\color{#d91a1a}-0.16\%$
test_redq_speed[reduce-overhead-None] 4.0074ms 3.6780ms 271.8865 Ops/s 275.0147 Ops/s $\color{#d91a1a}-1.14\%$
test_redq_speed[reduce-overhead-backward] 9.2347ms 8.8359ms 113.1742 Ops/s 108.5747 Ops/s $\color{#35bf28}+4.24\%$
test_redq_deprec_speed[False-None] 11.3506ms 10.6577ms 93.8290 Ops/s 91.8625 Ops/s $\color{#35bf28}+2.14\%$
test_redq_deprec_speed[False-backward] 16.3179ms 15.7101ms 63.6531 Ops/s 63.2897 Ops/s $\color{#35bf28}+0.57\%$
test_redq_deprec_speed[True-None] 3.6706ms 3.2787ms 304.9969 Ops/s 296.2755 Ops/s $\color{#35bf28}+2.94\%$
test_redq_deprec_speed[True-backward] 7.4907ms 7.2526ms 137.8809 Ops/s 138.2509 Ops/s $\color{#d91a1a}-0.27\%$
test_redq_deprec_speed[reduce-overhead-None] 3.5050ms 3.2447ms 308.1910 Ops/s 310.2653 Ops/s $\color{#d91a1a}-0.67\%$
test_redq_deprec_speed[reduce-overhead-backward] 7.5454ms 7.2297ms 138.3185 Ops/s 143.8504 Ops/s $\color{#d91a1a}-3.85\%$
test_td3_speed[False-None] 8.3148ms 7.4944ms 133.4335 Ops/s 131.2021 Ops/s $\color{#35bf28}+1.70\%$
test_td3_speed[False-backward] 10.7944ms 10.3692ms 96.4393 Ops/s 95.0517 Ops/s $\color{#35bf28}+1.46\%$
test_td3_speed[True-None] 2.0091ms 1.9512ms 512.5056 Ops/s 508.1797 Ops/s $\color{#35bf28}+0.85\%$
test_td3_speed[True-backward] 3.9911ms 3.8232ms 261.5620 Ops/s 225.6041 Ops/s $\textbf{\color{#35bf28}+15.94\%}$
test_td3_speed[reduce-overhead-None] 2.0039ms 1.9564ms 511.1346 Ops/s 507.2911 Ops/s $\color{#35bf28}+0.76\%$
test_td3_speed[reduce-overhead-backward] 4.0031ms 3.8084ms 262.5763 Ops/s 259.7502 Ops/s $\color{#35bf28}+1.09\%$
test_cql_speed[False-None] 27.7422ms 25.3957ms 39.3768 Ops/s 39.3292 Ops/s $\color{#35bf28}+0.12\%$
test_cql_speed[False-backward] 39.1274ms 35.2065ms 28.4038 Ops/s 28.7373 Ops/s $\color{#d91a1a}-1.16\%$
test_cql_speed[True-None] 11.6562ms 11.1831ms 89.4208 Ops/s 88.3909 Ops/s $\color{#35bf28}+1.17\%$
test_cql_speed[True-backward] 18.0512ms 17.1912ms 58.1692 Ops/s 56.7822 Ops/s $\color{#35bf28}+2.44\%$
test_cql_speed[reduce-overhead-None] 11.8435ms 11.3511ms 88.0973 Ops/s 88.7065 Ops/s $\color{#d91a1a}-0.69\%$
test_cql_speed[reduce-overhead-backward] 17.8820ms 17.1623ms 58.2673 Ops/s 57.2412 Ops/s $\color{#35bf28}+1.79\%$
test_a2c_speed[False-None] 7.5581ms 5.4185ms 184.5537 Ops/s 180.4189 Ops/s $\color{#35bf28}+2.29\%$
test_a2c_speed[False-backward] 12.3166ms 12.0908ms 82.7073 Ops/s 82.4229 Ops/s $\color{#35bf28}+0.35\%$
test_a2c_speed[True-None] 3.4475ms 3.1647ms 315.9822 Ops/s 314.6107 Ops/s $\color{#35bf28}+0.44\%$
test_a2c_speed[True-backward] 8.9123ms 8.6982ms 114.9667 Ops/s 113.7288 Ops/s $\color{#35bf28}+1.09\%$
test_a2c_speed[reduce-overhead-None] 3.5087ms 3.1183ms 320.6884 Ops/s 321.1352 Ops/s $\color{#d91a1a}-0.14\%$
test_a2c_speed[reduce-overhead-backward] 9.0240ms 8.7106ms 114.8025 Ops/s 114.8653 Ops/s $\color{#d91a1a}-0.05\%$
test_ppo_speed[False-None] 6.0156ms 5.7621ms 173.5480 Ops/s 168.3630 Ops/s $\color{#35bf28}+3.08\%$
test_ppo_speed[False-backward] 12.9948ms 12.5691ms 79.5603 Ops/s 77.5435 Ops/s $\color{#35bf28}+2.60\%$
test_ppo_speed[True-None] 3.6773ms 3.4910ms 286.4478 Ops/s 283.1978 Ops/s $\color{#35bf28}+1.15\%$
test_ppo_speed[True-backward] 8.5925ms 8.3705ms 119.4678 Ops/s 119.0137 Ops/s $\color{#35bf28}+0.38\%$
test_ppo_speed[reduce-overhead-None] 3.6886ms 3.4994ms 285.7607 Ops/s 283.0152 Ops/s $\color{#35bf28}+0.97\%$
test_ppo_speed[reduce-overhead-backward] 8.4994ms 8.3224ms 120.1569 Ops/s 119.2447 Ops/s $\color{#35bf28}+0.77\%$
test_reinforce_speed[False-None] 6.4803ms 4.5084ms 221.8087 Ops/s 217.7572 Ops/s $\color{#35bf28}+1.86\%$
test_reinforce_speed[False-backward] 7.7442ms 7.3649ms 135.7799 Ops/s 135.0992 Ops/s $\color{#35bf28}+0.50\%$
test_reinforce_speed[True-None] 2.5021ms 2.2809ms 438.4248 Ops/s 444.1627 Ops/s $\color{#d91a1a}-1.29\%$
test_reinforce_speed[True-backward] 7.5105ms 7.2701ms 137.5496 Ops/s 135.7723 Ops/s $\color{#35bf28}+1.31\%$
test_reinforce_speed[reduce-overhead-None] 2.5012ms 2.2909ms 436.5015 Ops/s 435.4280 Ops/s $\color{#35bf28}+0.25\%$
test_reinforce_speed[reduce-overhead-backward] 7.3828ms 7.1713ms 139.4450 Ops/s 137.6866 Ops/s $\color{#35bf28}+1.28\%$
test_iql_speed[False-None] 25.7008ms 20.1697ms 49.5794 Ops/s 49.2946 Ops/s $\color{#35bf28}+0.58\%$
test_iql_speed[False-backward] 36.1839ms 30.7447ms 32.5259 Ops/s 33.0988 Ops/s $\color{#d91a1a}-1.73\%$
test_iql_speed[True-None] 8.9080ms 8.0349ms 124.4566 Ops/s 123.5271 Ops/s $\color{#35bf28}+0.75\%$
test_iql_speed[True-backward] 17.4441ms 16.9458ms 59.0118 Ops/s 56.8400 Ops/s $\color{#35bf28}+3.82\%$
test_iql_speed[reduce-overhead-None] 8.5668ms 8.1379ms 122.8814 Ops/s 125.2144 Ops/s $\color{#d91a1a}-1.86\%$
test_iql_speed[reduce-overhead-backward] 18.0542ms 17.2282ms 58.0442 Ops/s 58.1961 Ops/s $\color{#d91a1a}-0.26\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2382ms 6.0528ms 165.2124 Ops/s 163.6296 Ops/s $\color{#35bf28}+0.97\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9457ms 0.3281ms 3.0475 KOps/s 3.0947 KOps/s $\color{#d91a1a}-1.52\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5645ms 0.3249ms 3.0782 KOps/s 3.3967 KOps/s $\textbf{\color{#d91a1a}-9.38\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2150ms 5.9055ms 169.3323 Ops/s 167.5738 Ops/s $\color{#35bf28}+1.05\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7644ms 0.3359ms 2.9771 KOps/s 3.8581 KOps/s $\textbf{\color{#d91a1a}-22.84\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6586ms 0.3196ms 3.1294 KOps/s 4.6875 KOps/s $\textbf{\color{#d91a1a}-33.24\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6682ms 1.4025ms 713.0353 Ops/s 799.4484 Ops/s $\textbf{\color{#d91a1a}-10.81\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5847ms 1.3198ms 757.7144 Ops/s 858.2820 Ops/s $\textbf{\color{#d91a1a}-11.72\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3700ms 6.1503ms 162.5927 Ops/s 162.4639 Ops/s $\color{#35bf28}+0.08\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.2387ms 0.4738ms 2.1106 KOps/s 2.4304 KOps/s $\textbf{\color{#d91a1a}-13.16\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7396ms 0.4318ms 2.3157 KOps/s 2.6770 KOps/s $\textbf{\color{#d91a1a}-13.50\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3468ms 6.0964ms 164.0313 Ops/s 165.0320 Ops/s $\color{#d91a1a}-0.61\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.0844ms 0.2940ms 3.4008 KOps/s 2.7922 KOps/s $\textbf{\color{#35bf28}+21.80\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4677ms 0.2422ms 4.1289 KOps/s 2.9374 KOps/s $\textbf{\color{#35bf28}+40.56\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3577ms 6.0248ms 165.9806 Ops/s 164.3893 Ops/s $\color{#35bf28}+0.97\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0353ms 0.3095ms 3.2309 KOps/s 2.8032 KOps/s $\textbf{\color{#35bf28}+15.26\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5270ms 0.2170ms 4.6078 KOps/s 2.9820 KOps/s $\textbf{\color{#35bf28}+54.52\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4245ms 6.2148ms 160.9062 Ops/s 159.7968 Ops/s $\color{#35bf28}+0.69\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8262ms 0.3983ms 2.5106 KOps/s 2.2682 KOps/s $\textbf{\color{#35bf28}+10.69\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6083ms 0.3645ms 2.7434 KOps/s 2.3416 KOps/s $\textbf{\color{#35bf28}+17.16\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4473s 14.2129ms 70.3587 Ops/s 187.4898 Ops/s $\textbf{\color{#d91a1a}-62.47\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 6.8437ms 2.0794ms 480.9016 Ops/s 456.6601 Ops/s $\textbf{\color{#35bf28}+5.31\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 9.2717ms 1.3206ms 757.2238 Ops/s 830.2348 Ops/s $\textbf{\color{#d91a1a}-8.79\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 8.4121ms 5.3600ms 186.5673 Ops/s 184.6110 Ops/s $\color{#35bf28}+1.06\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.1925ms 2.0687ms 483.4038 Ops/s 448.9932 Ops/s $\textbf{\color{#35bf28}+7.66\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.2423ms 1.3033ms 767.2774 Ops/s 832.8722 Ops/s $\textbf{\color{#d91a1a}-7.88\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4371s 14.1745ms 70.5491 Ops/s 183.4370 Ops/s $\textbf{\color{#d91a1a}-61.54\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.9904ms 2.2569ms 443.0807 Ops/s 402.0714 Ops/s $\textbf{\color{#35bf28}+10.20\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.4774ms 1.4542ms 687.6593 Ops/s 799.0295 Ops/s $\textbf{\color{#d91a1a}-13.94\%}$
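
For anyone reading the last column: the percentages above appear to be the relative difference between the two Ops/s columns. The sketch below is not the benchmark bot's code. It assumes the first Ops/s value is this PR and the second is the baseline, and the helper names `relative_change` and `format_change` are hypothetical. The sample values are copied from rows in the table.

```python
def relative_change(pr_ops: float, base_ops: float) -> float:
    """Relative throughput change in percent (assumed: PR vs. baseline)."""
    return (pr_ops - base_ops) / base_ops * 100.0


def format_change(pr_ops: float, base_ops: float) -> str:
    """Render the change with an explicit sign and two decimals."""
    return f"{relative_change(pr_ops, base_ops):+.2f}%"


if __name__ == "__main__":
    # (test name, Ops/s assumed to be this PR, Ops/s assumed to be the baseline)
    rows = [
        ("test_cql_speed[True-None]", 89.4208, 88.3909),
        ("test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]", 70.3587, 187.4898),
    ]
    for name, pr_ops, base_ops in rows:
        print(name, format_change(pr_ops, base_ops))
    # test_cql_speed[True-None] +1.17%
    # test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] -62.47%
```

Under that assumption, the two large `test_rb_populate` ListStorage regressions (-62.47% and -61.54%) come with a maximum time of about 0.44s against a minimum of about 14ms, so a single slow iteration may be dominating the average rather than a consistent slowdown.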

vmoens added 2 commits August 3, 2024 17:50
[ghstack-poisoned]
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Aug 7, 2024
ghstack-source-id: 9d70a3ea62402ee0619822c87dc9c05d8219101b
Pull Request resolved: #2305
[ghstack-poisoned]
@vmoens vmoens merged commit f51a567 into gh/vmoens/2/base Oct 14, 2024
71 of 78 checks passed
vmoens added a commit that referenced this pull request Oct 14, 2024
ghstack-source-id: d0ef69f42342b66e5d3e39f4029818373169bca0
Pull Request resolved: #2305
@vmoens vmoens deleted the gh/vmoens/2/head branch October 14, 2024 20:29
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. enhancement New feature or request
2 participants