[BugFix, Feature] Fix DDQN implementation #1737

Merged
merged 2 commits into from
Dec 7, 2023

Conversation

vmoens
Contributor

@vmoens vmoens commented Dec 6, 2023

No description provided.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Dec 6, 2023

pytorch-bot bot commented Dec 6, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1737

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 1 Pending, 3 Unrelated Failures

As of commit ffd9518 with merge base 841f8d9:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@vmoens vmoens linked an issue Dec 6, 2023 that may be closed by this pull request
2 tasks

github-actions bot commented Dec 6, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 89. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}11$.

Detailed results:
Name Max Mean Ops Ops on Repo HEAD Change
test_single 63.2481ms 62.7858ms 15.9272 Ops/s 14.8040 Ops/s $\textbf{\color{#35bf28}+7.59\%}$
test_sync 35.9388ms 34.2901ms 29.1629 Ops/s 27.9333 Ops/s $\color{#35bf28}+4.40\%$
test_async 55.2821ms 32.9975ms 30.3053 Ops/s 30.1825 Ops/s $\color{#35bf28}+0.41\%$
test_simple 0.4798s 0.4301s 2.3249 Ops/s 2.3209 Ops/s $\color{#35bf28}+0.17\%$
test_transformed 0.6354s 0.5894s 1.6966 Ops/s 1.6809 Ops/s $\color{#35bf28}+0.94\%$
test_serial 1.3565s 1.3158s 0.7600 Ops/s 0.7394 Ops/s $\color{#35bf28}+2.79\%$
test_parallel 1.3241s 1.2892s 0.7757 Ops/s 0.7779 Ops/s $\color{#d91a1a}-0.29\%$
test_step_mdp_speed[True-True-True-True-True] 0.1725ms 22.7674μs 43.9225 KOps/s 44.8446 KOps/s $\color{#d91a1a}-2.06\%$
test_step_mdp_speed[True-True-True-True-False] 39.3540μs 13.8606μs 72.1472 KOps/s 73.9615 KOps/s $\color{#d91a1a}-2.45\%$
test_step_mdp_speed[True-True-True-False-True] 34.3250μs 13.8564μs 72.1687 KOps/s 72.8380 KOps/s $\color{#d91a1a}-0.92\%$
test_step_mdp_speed[True-True-True-False-False] 33.2420μs 8.3839μs 119.2758 KOps/s 122.4744 KOps/s $\color{#d91a1a}-2.61\%$
test_step_mdp_speed[True-True-False-True-True] 52.1770μs 24.2678μs 41.2068 KOps/s 42.2424 KOps/s $\color{#d91a1a}-2.45\%$
test_step_mdp_speed[True-True-False-True-False] 45.9660μs 15.4343μs 64.7907 KOps/s 67.4932 KOps/s $\color{#d91a1a}-4.00\%$
test_step_mdp_speed[True-True-False-False-True] 46.2970μs 15.2514μs 65.5678 KOps/s 66.8904 KOps/s $\color{#d91a1a}-1.98\%$
test_step_mdp_speed[True-True-False-False-False] 43.7720μs 9.7034μs 103.0572 KOps/s 104.4918 KOps/s $\color{#d91a1a}-1.37\%$
test_step_mdp_speed[True-False-True-True-True] 54.2910μs 25.7576μs 38.8235 KOps/s 39.4805 KOps/s $\color{#d91a1a}-1.66\%$
test_step_mdp_speed[True-False-True-True-False] 52.8790μs 16.7272μs 59.7827 KOps/s 62.5405 KOps/s $\color{#d91a1a}-4.41\%$
test_step_mdp_speed[True-False-True-False-True] 44.4230μs 15.2073μs 65.7577 KOps/s 67.3349 KOps/s $\color{#d91a1a}-2.34\%$
test_step_mdp_speed[True-False-True-False-False] 32.8620μs 9.5922μs 104.2515 KOps/s 105.8477 KOps/s $\color{#d91a1a}-1.51\%$
test_step_mdp_speed[True-False-False-True-True] 59.4510μs 26.7315μs 37.4090 KOps/s 38.0744 KOps/s $\color{#d91a1a}-1.75\%$
test_step_mdp_speed[True-False-False-True-False] 41.4780μs 17.9521μs 55.7038 KOps/s 57.9388 KOps/s $\color{#d91a1a}-3.86\%$
test_step_mdp_speed[True-False-False-False-True] 41.5980μs 16.3034μs 61.3370 KOps/s 62.0879 KOps/s $\color{#d91a1a}-1.21\%$
test_step_mdp_speed[True-False-False-False-False] 31.8900μs 11.0104μs 90.8234 KOps/s 94.4526 KOps/s $\color{#d91a1a}-3.84\%$
test_step_mdp_speed[False-True-True-True-True] 74.1180μs 25.4813μs 39.2445 KOps/s 39.8680 KOps/s $\color{#d91a1a}-1.56\%$
test_step_mdp_speed[False-True-True-True-False] 37.0390μs 16.5738μs 60.3361 KOps/s 61.5582 KOps/s $\color{#d91a1a}-1.99\%$
test_step_mdp_speed[False-True-True-False-True] 40.4350μs 17.3124μs 57.7621 KOps/s 57.7839 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[False-True-True-False-False] 30.8280μs 10.9553μs 91.2799 KOps/s 93.1125 KOps/s $\color{#d91a1a}-1.97\%$
test_step_mdp_speed[False-True-False-True-True] 67.9360μs 26.8081μs 37.3022 KOps/s 38.1272 KOps/s $\color{#d91a1a}-2.16\%$
test_step_mdp_speed[False-True-False-True-False] 86.9620μs 17.5821μs 56.8760 KOps/s 58.0054 KOps/s $\color{#d91a1a}-1.95\%$
test_step_mdp_speed[False-True-False-False-True] 76.5360μs 18.1981μs 54.9509 KOps/s 54.0547 KOps/s $\color{#35bf28}+1.66\%$
test_step_mdp_speed[False-True-False-False-False] 59.5180μs 12.1741μs 82.1414 KOps/s 84.1288 KOps/s $\color{#d91a1a}-2.36\%$
test_step_mdp_speed[False-False-True-True-True] 66.4640μs 27.8863μs 35.8599 KOps/s 35.9752 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[False-False-True-True-False] 49.0320μs 18.9888μs 52.6627 KOps/s 53.5725 KOps/s $\color{#d91a1a}-1.70\%$
test_step_mdp_speed[False-False-True-False-True] 43.2610μs 18.5252μs 53.9806 KOps/s 53.6187 KOps/s $\color{#35bf28}+0.68\%$
test_step_mdp_speed[False-False-True-False-False] 43.1010μs 11.9346μs 83.7899 KOps/s 84.2339 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[False-False-False-True-True] 58.5490μs 28.6001μs 34.9649 KOps/s 34.9296 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[False-False-False-True-False] 42.8400μs 20.3678μs 49.0971 KOps/s 50.8968 KOps/s $\color{#d91a1a}-3.54\%$
test_step_mdp_speed[False-False-False-False-True] 48.8320μs 19.5853μs 51.0587 KOps/s 51.2960 KOps/s $\color{#d91a1a}-0.46\%$
test_step_mdp_speed[False-False-False-False-False] 59.8510μs 13.2090μs 75.7061 KOps/s 77.1249 KOps/s $\color{#d91a1a}-1.84\%$
test_values[generalized_advantage_estimate-True-True] 19.3488ms 12.1461ms 82.3313 Ops/s 82.2453 Ops/s $\color{#35bf28}+0.10\%$
test_values[vec_generalized_advantage_estimate-True-True] 36.5738ms 27.2330ms 36.7202 Ops/s 37.4750 Ops/s $\color{#d91a1a}-2.01\%$
test_values[td0_return_estimate-False-False] 0.2243ms 0.1552ms 6.4422 KOps/s 5.0954 KOps/s $\textbf{\color{#35bf28}+26.43\%}$
test_values[td1_return_estimate-False-False] 26.9685ms 25.2506ms 39.6030 Ops/s 38.9023 Ops/s $\color{#35bf28}+1.80\%$
test_values[vec_td1_return_estimate-False-False] 35.9779ms 27.1721ms 36.8024 Ops/s 37.3008 Ops/s $\color{#d91a1a}-1.34\%$
test_values[td_lambda_return_estimate-True-False] 37.9492ms 35.7133ms 28.0008 Ops/s 27.8639 Ops/s $\color{#35bf28}+0.49\%$
test_values[vec_td_lambda_return_estimate-True-False] 36.1492ms 27.1244ms 36.8671 Ops/s 36.9690 Ops/s $\color{#d91a1a}-0.28\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.3137ms 8.1090ms 123.3191 Ops/s 124.6660 Ops/s $\color{#d91a1a}-1.08\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.1650ms 1.8601ms 537.6137 Ops/s 543.6527 Ops/s $\color{#d91a1a}-1.11\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 11.0770ms 0.4418ms 2.2637 KOps/s 2.2369 KOps/s $\color{#35bf28}+1.20\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 49.7962ms 39.6816ms 25.2006 Ops/s 26.0976 Ops/s $\color{#d91a1a}-3.44\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 11.9962ms 2.5584ms 390.8738 Ops/s 388.2429 Ops/s $\color{#35bf28}+0.68\%$
test_dqn_speed 11.8764ms 1.6411ms 609.3576 Ops/s 609.7263 Ops/s $\color{#d91a1a}-0.06\%$
test_ddpg_speed 14.4149ms 3.8646ms 258.7596 Ops/s 273.4573 Ops/s $\textbf{\color{#d91a1a}-5.37\%}$
test_sac_speed 20.9693ms 10.2941ms 97.1431 Ops/s 96.8322 Ops/s $\color{#35bf28}+0.32\%$
test_redq_speed 29.3511ms 19.1806ms 52.1359 Ops/s 50.8958 Ops/s $\color{#35bf28}+2.44\%$
test_redq_deprec_speed 87.8426ms 16.3322ms 61.2286 Ops/s 64.0184 Ops/s $\color{#d91a1a}-4.36\%$
test_td3_speed 18.0378ms 10.3772ms 96.3651 Ops/s 94.2602 Ops/s $\color{#35bf28}+2.23\%$
test_cql_speed 40.1895ms 39.0238ms 25.6254 Ops/s 25.5658 Ops/s $\color{#35bf28}+0.23\%$
test_a2c_speed 19.9088ms 8.9628ms 111.5727 Ops/s 115.0178 Ops/s $\color{#d91a1a}-3.00\%$
test_ppo_speed 20.8638ms 9.2714ms 107.8586 Ops/s 110.5943 Ops/s $\color{#d91a1a}-2.47\%$
test_reinforce_speed 18.7343ms 7.8668ms 127.1163 Ops/s 129.0526 Ops/s $\color{#d91a1a}-1.50\%$
test_iql_speed 45.1369ms 35.3840ms 28.2613 Ops/s 28.1322 Ops/s $\color{#35bf28}+0.46\%$
test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.5166ms 1.9634ms 509.3241 Ops/s 528.5601 Ops/s $\color{#d91a1a}-3.64\%$
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.1052s 2.2962ms 435.5108 Ops/s 501.4566 Ops/s $\textbf{\color{#d91a1a}-13.15\%}$
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3.2179ms 2.0968ms 476.9068 Ops/s 499.9296 Ops/s $\color{#d91a1a}-4.61\%$
test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.5848ms 1.9489ms 513.1060 Ops/s 524.1819 Ops/s $\color{#d91a1a}-2.11\%$
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.1063s 2.3187ms 431.2709 Ops/s 488.8212 Ops/s $\textbf{\color{#d91a1a}-11.77\%}$
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3.4139ms 2.0863ms 479.3119 Ops/s 484.5819 Ops/s $\color{#d91a1a}-1.09\%$
test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 3.2035ms 1.9487ms 513.1653 Ops/s 524.7014 Ops/s $\color{#d91a1a}-2.20\%$
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.1067s 2.3297ms 429.2455 Ops/s 468.7646 Ops/s $\textbf{\color{#d91a1a}-8.43\%}$
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 3.4900ms 2.1117ms 473.5529 Ops/s 471.7474 Ops/s $\color{#35bf28}+0.38\%$
test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 3.0657ms 1.9700ms 507.6096 Ops/s 511.8896 Ops/s $\color{#d91a1a}-0.84\%$
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.1115s 2.3140ms 432.1482 Ops/s 489.7017 Ops/s $\textbf{\color{#d91a1a}-11.75\%}$
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 4.2031ms 2.1843ms 457.8029 Ops/s 480.9862 Ops/s $\color{#d91a1a}-4.82\%$
test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.8094ms 2.0456ms 488.8584 Ops/s 506.4464 Ops/s $\color{#d91a1a}-3.47\%$
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.1157s 2.4355ms 410.5977 Ops/s 488.8028 Ops/s $\textbf{\color{#d91a1a}-16.00\%}$
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 4.1644ms 2.1948ms 455.6128 Ops/s 480.9122 Ops/s $\textbf{\color{#d91a1a}-5.26\%}$
test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 2.8965ms 2.0809ms 480.5696 Ops/s 526.3578 Ops/s $\textbf{\color{#d91a1a}-8.70\%}$
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.1129s 2.4145ms 414.1640 Ops/s 490.3746 Ops/s $\textbf{\color{#d91a1a}-15.54\%}$
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 3.1872ms 2.1530ms 464.4771 Ops/s 487.0311 Ops/s $\color{#d91a1a}-4.63\%$
test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1795s 18.3040ms 54.6329 Ops/s 56.3830 Ops/s $\color{#d91a1a}-3.10\%$
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 0.1166s 16.8596ms 59.3134 Ops/s 60.8320 Ops/s $\color{#d91a1a}-2.50\%$
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 0.1167s 16.9992ms 58.8263 Ops/s 61.1481 Ops/s $\color{#d91a1a}-3.80\%$
test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1155s 16.8350ms 59.4001 Ops/s 53.7469 Ops/s $\textbf{\color{#35bf28}+10.52\%}$
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.1190s 16.9444ms 59.0166 Ops/s 61.4770 Ops/s $\color{#d91a1a}-4.00\%$
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 0.1231s 17.3022ms 57.7960 Ops/s 60.4701 Ops/s $\color{#d91a1a}-4.42\%$
test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1197s 17.0568ms 58.6276 Ops/s 65.9400 Ops/s $\textbf{\color{#d91a1a}-11.09\%}$
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 0.1166s 16.7645ms 59.6499 Ops/s 59.5587 Ops/s $\color{#35bf28}+0.15\%$
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 0.1217s 19.3667ms 51.6350 Ops/s 58.0029 Ops/s $\textbf{\color{#d91a1a}-10.98\%}$
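
The Change column above is consistent with the relative difference between Ops and Ops on Repo HEAD, and the bolded entries match the Improved/Worsened totals if a ±5% cutoff is assumed. The sketch below reproduces three rows from the table. The function names and the 5% threshold are assumptions inferred from the numbers, not taken from the benchmark workflow itself.

```python
# Recompute the "Change" column from the two throughput columns.
# ASSUMPTION: change = (ops - ops_head) / ops_head, inferred from the table.
# ASSUMPTION: a row counts as improved/worsened beyond +/-5%, inferred from
# which entries are bolded; the real workflow may use a different rule.

def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change as improved, worsened, or within noise."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "within threshold"


# (Ops, Ops on Repo HEAD) copied from the CPU table above.
rows = {
    "test_single": (15.9272, 14.8040),
    "test_ddpg_speed": (258.7596, 273.4573),
    "test_dqn_speed": (609.3576, 609.7263),
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under these assumptions, `test_dqn_speed` stays within the threshold on CPU, so the DQN benchmark itself does not appear among the flagged rows.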


github-actions bot commented Dec 6, 2023

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 92. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}4$.

Detailed results:
Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1241s 0.1240s 8.0641 Ops/s 8.0795 Ops/s $\color{#d91a1a}-0.19\%$
test_sync 0.1029s 0.1025s 9.7576 Ops/s 9.7095 Ops/s $\color{#35bf28}+0.50\%$
test_async 0.2835s 99.9024ms 10.0098 Ops/s 10.0532 Ops/s $\color{#d91a1a}-0.43\%$
test_single_pixels 0.1465s 0.1461s 6.8453 Ops/s 7.4872 Ops/s $\textbf{\color{#d91a1a}-8.57\%}$
test_sync_pixels 98.2095ms 96.2610ms 10.3884 Ops/s 10.4504 Ops/s $\color{#d91a1a}-0.59\%$
test_async_pixels 0.1893s 91.2012ms 10.9648 Ops/s 10.7046 Ops/s $\color{#35bf28}+2.43\%$
test_simple 0.9740s 0.8948s 1.1176 Ops/s 1.1117 Ops/s $\color{#35bf28}+0.54\%$
test_transformed 1.2143s 1.1461s 0.8725 Ops/s 0.8699 Ops/s $\color{#35bf28}+0.30\%$
test_serial 2.5720s 2.5039s 0.3994 Ops/s 0.3987 Ops/s $\color{#35bf28}+0.18\%$
test_parallel 2.5702s 2.4924s 0.4012 Ops/s 0.4020 Ops/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[True-True-True-True-True] 0.1683ms 33.9739μs 29.4344 KOps/s 28.1573 KOps/s $\color{#35bf28}+4.54\%$
test_step_mdp_speed[True-True-True-True-False] 45.1010μs 20.2582μs 49.3628 KOps/s 47.6654 KOps/s $\color{#35bf28}+3.56\%$
test_step_mdp_speed[True-True-True-False-True] 52.0610μs 20.5271μs 48.7160 KOps/s 48.4649 KOps/s $\color{#35bf28}+0.52\%$
test_step_mdp_speed[True-True-True-False-False] 41.3710μs 12.1897μs 82.0363 KOps/s 81.3480 KOps/s $\color{#35bf28}+0.85\%$
test_step_mdp_speed[True-True-False-True-True] 81.4910μs 36.7959μs 27.1769 KOps/s 26.8204 KOps/s $\color{#35bf28}+1.33\%$
test_step_mdp_speed[True-True-False-True-False] 42.7910μs 22.4737μs 44.4964 KOps/s 44.6698 KOps/s $\color{#d91a1a}-0.39\%$
test_step_mdp_speed[True-True-False-False-True] 63.7420μs 22.3589μs 44.7249 KOps/s 44.4750 KOps/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[True-True-False-False-False] 55.0310μs 14.0815μs 71.0151 KOps/s 69.3159 KOps/s $\color{#35bf28}+2.45\%$
test_step_mdp_speed[True-False-True-True-True] 68.5510μs 38.7406μs 25.8127 KOps/s 25.6407 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[True-False-True-True-False] 53.9110μs 24.5065μs 40.8055 KOps/s 40.3140 KOps/s $\color{#35bf28}+1.22\%$
test_step_mdp_speed[True-False-True-False-True] 56.1610μs 22.3195μs 44.8038 KOps/s 43.3429 KOps/s $\color{#35bf28}+3.37\%$
test_step_mdp_speed[True-False-True-False-False] 33.0700μs 14.1660μs 70.5916 KOps/s 70.4040 KOps/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[True-False-False-True-True] 0.1030ms 39.7087μs 25.1834 KOps/s 24.1197 KOps/s $\color{#35bf28}+4.41\%$
test_step_mdp_speed[True-False-False-True-False] 67.8310μs 26.2575μs 38.0844 KOps/s 38.2344 KOps/s $\color{#d91a1a}-0.39\%$
test_step_mdp_speed[True-False-False-False-True] 43.3910μs 23.8636μs 41.9049 KOps/s 40.1900 KOps/s $\color{#35bf28}+4.27\%$
test_step_mdp_speed[True-False-False-False-False] 44.1210μs 15.8020μs 63.2833 KOps/s 62.6366 KOps/s $\color{#35bf28}+1.03\%$
test_step_mdp_speed[False-True-True-True-True] 59.6210μs 38.1745μs 26.1955 KOps/s 25.4880 KOps/s $\color{#35bf28}+2.78\%$
test_step_mdp_speed[False-True-True-True-False] 54.0910μs 24.2685μs 41.2057 KOps/s 40.3985 KOps/s $\color{#35bf28}+2.00\%$
test_step_mdp_speed[False-True-True-False-True] 59.1510μs 26.6092μs 37.5810 KOps/s 37.4016 KOps/s $\color{#35bf28}+0.48\%$
test_step_mdp_speed[False-True-True-False-False] 31.4010μs 15.8911μs 62.9281 KOps/s 62.2886 KOps/s $\color{#35bf28}+1.03\%$
test_step_mdp_speed[False-True-False-True-True] 69.4310μs 40.1732μs 24.8922 KOps/s 24.4479 KOps/s $\color{#35bf28}+1.82\%$
test_step_mdp_speed[False-True-False-True-False] 57.6710μs 26.2452μs 38.1023 KOps/s 38.0156 KOps/s $\color{#35bf28}+0.23\%$
test_step_mdp_speed[False-True-False-False-True] 49.2510μs 28.0995μs 35.5878 KOps/s 34.2524 KOps/s $\color{#35bf28}+3.90\%$
test_step_mdp_speed[False-True-False-False-False] 90.0320μs 17.7669μs 56.2844 KOps/s 55.3970 KOps/s $\color{#35bf28}+1.60\%$
test_step_mdp_speed[False-False-True-True-True] 74.6020μs 42.3620μs 23.6061 KOps/s 23.2790 KOps/s $\color{#35bf28}+1.40\%$
test_step_mdp_speed[False-False-True-True-False] 58.8900μs 28.1435μs 35.5321 KOps/s 35.5251 KOps/s $\color{#35bf28}+0.02\%$
test_step_mdp_speed[False-False-True-False-True] 56.8810μs 28.0690μs 35.6265 KOps/s 34.8477 KOps/s $\color{#35bf28}+2.23\%$
test_step_mdp_speed[False-False-True-False-False] 36.6310μs 17.8070μs 56.1578 KOps/s 56.3564 KOps/s $\color{#d91a1a}-0.35\%$
test_step_mdp_speed[False-False-False-True-True] 0.1084ms 43.5097μs 22.9834 KOps/s 22.8822 KOps/s $\color{#35bf28}+0.44\%$
test_step_mdp_speed[False-False-False-True-False] 51.5910μs 30.1388μs 33.1798 KOps/s 32.9706 KOps/s $\color{#35bf28}+0.63\%$
test_step_mdp_speed[False-False-False-False-True] 60.3920μs 29.7561μs 33.6066 KOps/s 33.5032 KOps/s $\color{#35bf28}+0.31\%$
test_step_mdp_speed[False-False-False-False-False] 47.5710μs 19.4756μs 51.3464 KOps/s 50.2921 KOps/s $\color{#35bf28}+2.10\%$
test_values[generalized_advantage_estimate-True-True] 27.3612ms 26.7581ms 37.3718 Ops/s 37.2800 Ops/s $\color{#35bf28}+0.25\%$
test_values[vec_generalized_advantage_estimate-True-True] 90.7575ms 3.4045ms 293.7300 Ops/s 296.9179 Ops/s $\color{#d91a1a}-1.07\%$
test_values[td0_return_estimate-False-False] 0.1052ms 69.4181μs 14.4055 KOps/s 14.8066 KOps/s $\color{#d91a1a}-2.71\%$
test_values[td1_return_estimate-False-False] 58.2961ms 57.9311ms 17.2619 Ops/s 17.4845 Ops/s $\color{#d91a1a}-1.27\%$
test_values[vec_td1_return_estimate-False-False] 2.0182ms 1.7596ms 568.3242 Ops/s 571.2529 Ops/s $\color{#d91a1a}-0.51\%$
test_values[td_lambda_return_estimate-True-False] 94.6425ms 92.7005ms 10.7874 Ops/s 11.0174 Ops/s $\color{#d91a1a}-2.09\%$
test_values[vec_td_lambda_return_estimate-True-False] 2.1037ms 1.7598ms 568.2530 Ops/s 570.4731 Ops/s $\color{#d91a1a}-0.39\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.8655ms 25.6084ms 39.0497 Ops/s 39.9921 Ops/s $\color{#d91a1a}-2.36\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.9210ms 0.7552ms 1.3241 KOps/s 1.3476 KOps/s $\color{#d91a1a}-1.74\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7749ms 0.7118ms 1.4048 KOps/s 1.4200 KOps/s $\color{#d91a1a}-1.07\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5318ms 1.5017ms 665.9267 Ops/s 668.3528 Ops/s $\color{#d91a1a}-0.36\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1.0168ms 0.7428ms 1.3462 KOps/s 1.3726 KOps/s $\color{#d91a1a}-1.93\%$
test_dqn_speed 4.3263ms 1.5089ms 662.7342 Ops/s 663.5660 Ops/s $\color{#d91a1a}-0.13\%$
test_ddpg_speed 4.8655ms 3.4521ms 289.6778 Ops/s 269.2719 Ops/s $\textbf{\color{#35bf28}+7.58\%}$
test_sac_speed 0.1009s 10.4355ms 95.8272 Ops/s 105.6036 Ops/s $\textbf{\color{#d91a1a}-9.26\%}$
test_redq_speed 17.4853ms 17.0978ms 58.4872 Ops/s 58.9181 Ops/s $\color{#d91a1a}-0.73\%$
test_redq_deprec_speed 14.8574ms 13.5045ms 74.0493 Ops/s 75.3215 Ops/s $\color{#d91a1a}-1.69\%$
test_td3_speed 19.7440ms 9.8099ms 101.9377 Ops/s 102.8343 Ops/s $\color{#d91a1a}-0.87\%$
test_cql_speed 35.5671ms 33.2125ms 30.1092 Ops/s 29.9995 Ops/s $\color{#35bf28}+0.37\%$
test_a2c_speed 9.4561ms 7.6178ms 131.2723 Ops/s 135.7103 Ops/s $\color{#d91a1a}-3.27\%$
test_ppo_speed 9.3584ms 7.9961ms 125.0605 Ops/s 129.7416 Ops/s $\color{#d91a1a}-3.61\%$
test_reinforce_speed 8.4202ms 6.6820ms 149.6563 Ops/s 158.9197 Ops/s $\textbf{\color{#d91a1a}-5.83\%}$
test_iql_speed 30.3217ms 28.4511ms 35.1481 Ops/s 35.7843 Ops/s $\color{#d91a1a}-1.78\%$
test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.8596ms 2.4790ms 403.3912 Ops/s 399.4555 Ops/s $\color{#35bf28}+0.99\%$
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 4.0299ms 2.6719ms 374.2645 Ops/s 327.8796 Ops/s $\textbf{\color{#35bf28}+14.15\%}$
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3.8131ms 2.7049ms 369.6968 Ops/s 371.5453 Ops/s $\color{#d91a1a}-0.50\%$
test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.7112ms 2.4981ms 400.2979 Ops/s 399.2131 Ops/s $\color{#35bf28}+0.27\%$
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 4.3143ms 2.6943ms 371.1503 Ops/s 373.0619 Ops/s $\color{#d91a1a}-0.51\%$
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3.8038ms 2.6788ms 373.2952 Ops/s 369.9654 Ops/s $\color{#35bf28}+0.90\%$
test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 3.0768ms 2.4935ms 401.0365 Ops/s 399.2107 Ops/s $\color{#35bf28}+0.46\%$
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.7803ms 2.6888ms 371.9180 Ops/s 371.9436 Ops/s $-0.01\%$
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 3.9056ms 2.6839ms 372.5857 Ops/s 371.8115 Ops/s $\color{#35bf28}+0.21\%$
test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 3.0901ms 2.5046ms 399.2638 Ops/s 397.9710 Ops/s $\color{#35bf28}+0.32\%$
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3.9540ms 2.6846ms 372.4981 Ops/s 372.4265 Ops/s $\color{#35bf28}+0.02\%$
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 4.1006ms 2.6922ms 371.4380 Ops/s 371.5812 Ops/s $\color{#d91a1a}-0.04\%$
test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 3.1705ms 2.5157ms 397.5021 Ops/s 399.8207 Ops/s $\color{#d91a1a}-0.58\%$
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.6175ms 2.6902ms 371.7165 Ops/s 372.8910 Ops/s $\color{#d91a1a}-0.31\%$
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 4.3344ms 2.6957ms 370.9669 Ops/s 369.4833 Ops/s $\color{#35bf28}+0.40\%$
test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 3.1247ms 2.4984ms 400.2631 Ops/s 399.7616 Ops/s $\color{#35bf28}+0.13\%$
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.8729ms 2.7006ms 370.2912 Ops/s 369.8921 Ops/s $\color{#35bf28}+0.11\%$
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 3.7449ms 2.6921ms 371.4530 Ops/s 369.5286 Ops/s $\color{#35bf28}+0.52\%$
test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.2278s 20.1777ms 49.5596 Ops/s 49.1892 Ops/s $\color{#35bf28}+0.75\%$
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 0.1460s 18.7636ms 53.2948 Ops/s 53.5830 Ops/s $\color{#d91a1a}-0.54\%$
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 0.1402s 18.4724ms 54.1348 Ops/s 51.5916 Ops/s $\color{#35bf28}+4.93\%$
test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1365s 15.9805ms 62.5764 Ops/s 53.7169 Ops/s $\textbf{\color{#35bf28}+16.49\%}$
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.1468s 18.8466ms 53.0600 Ops/s 62.7063 Ops/s $\textbf{\color{#d91a1a}-15.38\%}$
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 0.1451s 18.8158ms 53.1467 Ops/s 53.8885 Ops/s $\color{#d91a1a}-1.38\%$
test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1445s 18.7660ms 53.2877 Ops/s 53.6823 Ops/s $\color{#d91a1a}-0.73\%$
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 0.1464s 18.7604ms 53.3037 Ops/s 53.0812 Ops/s $\color{#35bf28}+0.42\%$
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 0.1360s 18.6402ms 53.6475 Ops/s 53.1147 Ops/s $\color{#35bf28}+1.00\%$

@vmoens vmoens added bug Something isn't working enhancement New feature or request labels Dec 6, 2023
@vmoens vmoens marked this pull request as ready for review December 6, 2023 12:06

@Arlaz Arlaz left a comment

Thank you for this improvement! Everything seems OK on my side too.

@vmoens vmoens merged commit ee89728 into main Dec 7, 2023
57 of 60 checks passed
@vmoens vmoens deleted the double_dqn branch February 27, 2024 00:47
Development

Successfully merging this pull request may close these issues.

[BUG] No real DDQN when using delay_value
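
For context on the linked issue: a standard DQN with a delayed (target) value network and Double DQN differ only in which network picks the next action. DQN takes the maximum of the target network's values, while Double DQN selects the action with the online network and evaluates it with the target network. The sketch below illustrates that difference in plain Python for a single transition. The function names are hypothetical, and this is neither TorchRL's loss code nor a reproduction of this PR's diff.

```python
# Illustrative one-transition targets; names are hypothetical, not TorchRL APIs.

def dqn_target(reward, done, gamma, q_target_next):
    """DQN with a target network: select AND evaluate with the target net."""
    bootstrap = 0.0 if done else gamma * max(q_target_next)
    return reward + bootstrap


def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double DQN: select with the online net, evaluate with the target net."""
    # Index of the greedy action according to the online network.
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    bootstrap = 0.0 if done else gamma * q_target_next[best_action]
    return reward + bootstrap


# Next-state action values from the two networks (made-up numbers).
q_online_next = [1.0, 3.0, 2.0]
q_target_next = [5.0, 0.5, 4.0]

print(dqn_target(1.0, False, 0.9, q_target_next))
print(double_dqn_target(1.0, False, 0.9, q_online_next, q_target_next))
```

With these made-up values the two targets differ because the online network prefers action 1 while the target network's largest value belongs to action 0. When the online and target networks agree on the greedy action, the two targets coincide.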