[Feature] Use non-default mp start method in ParallelEnv #1966

Merged on Feb 27, 2024 (2 commits)

Conversation

vmoens
Contributor

@vmoens vmoens commented Feb 26, 2024

TODO:

  • Solve the threading issue
  • Write tests

cc @teopir
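
For readers unfamiliar with the term in the title: a "start method" is how Python's `multiprocessing` creates child processes (`fork`, `spawn`, or `forkserver`). The sketch below uses only the standard library to show how a non-default start method can be selected through a context object without changing the process-wide default. It is an illustration of the general mechanism, not the code of this PR; the helper `launch` is a hypothetical name introduced here, and nothing in it reflects the actual `ParallelEnv` signature.

```python
import multiprocessing as mp


def launch(ctx, target, args=()):
    """Run ``target(*args)`` in a child process created by ``ctx``.

    ``launch`` is a hypothetical helper for this sketch. It returns the
    child's exit code (0 on success).
    """
    proc = ctx.Process(target=target, args=args)
    proc.start()
    proc.join()
    return proc.exitcode


if __name__ == "__main__":
    # get_context returns an object bound to one start method; the
    # global default used by plain mp.Process is left untouched.
    ctx = mp.get_context("spawn")
    # The built-in print can be pickled by reference, which "spawn" needs.
    exit_code = launch(ctx, print, ("hello from a spawned child",))
    print("child exit code:", exit_code)
```

The `if __name__ == "__main__":` guard matters with `spawn`, because the child re-imports the main module and would otherwise try to start processes of its own.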


pytorch-bot bot commented Feb 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1966

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 2 Unrelated Failures

As of commit 4cd9247 with merge base 8f04818:

NEW FAILURE - The following job has failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the `CLA Signed` label Feb 26, 2024
@vmoens vmoens added the `enhancement` label Feb 26, 2024

github-actions bot commented Feb 26, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 89. Improved: $\large\color{#35bf28}1$. Worsened: $\large\color{#d91a1a}15$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 61.5759ms | 61.0799ms | 16.3720 Ops/s | 16.7281 Ops/s | $\color{#d91a1a}-2.13\%$ |
| test_sync | 39.6469ms | 34.0627ms | 29.3576 Ops/s | 30.4998 Ops/s | $\color{#d91a1a}-3.74\%$ |
| test_async | 60.2498ms | 30.5710ms | 32.7107 Ops/s | 31.1709 Ops/s | $\color{#35bf28}+4.94\%$ |
| test_simple | 0.4832s | 0.4355s | 2.2962 Ops/s | 2.3443 Ops/s | $\color{#d91a1a}-2.05\%$ |
| test_transformed | 0.6306s | 0.5816s | 1.7195 Ops/s | 1.7134 Ops/s | $\color{#35bf28}+0.36\%$ |
| test_serial | 1.4661s | 1.4211s | 0.7037 Ops/s | 0.7053 Ops/s | $\color{#d91a1a}-0.23\%$ |
| test_parallel | 1.4452s | 1.3914s | 0.7187 Ops/s | 0.7312 Ops/s | $\color{#d91a1a}-1.71\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 0.1420ms | 21.1162μs | 47.3569 KOps/s | 47.3924 KOps/s | $\color{#d91a1a}-0.07\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 50.0330μs | 12.8331μs | 77.9237 KOps/s | 77.6871 KOps/s | $\color{#35bf28}+0.30\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 34.1340μs | 12.3431μs | 81.0167 KOps/s | 81.5575 KOps/s | $\color{#d91a1a}-0.66\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 33.7330μs | 7.4923μs | 133.4702 KOps/s | 133.8753 KOps/s | $\color{#d91a1a}-0.30\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 56.9260μs | 22.5017μs | 44.4411 KOps/s | 44.8561 KOps/s | $\color{#d91a1a}-0.93\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 39.8040μs | 14.1369μs | 70.7367 KOps/s | 70.8328 KOps/s | $\color{#d91a1a}-0.14\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 40.2060μs | 13.4726μs | 74.2249 KOps/s | 74.0575 KOps/s | $\color{#35bf28}+0.23\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 35.6760μs | 8.7207μs | 114.6701 KOps/s | 115.2769 KOps/s | $\color{#d91a1a}-0.53\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 53.2490μs | 24.0623μs | 41.5589 KOps/s | 42.1138 KOps/s | $\color{#d91a1a}-1.32\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 57.7880μs | 15.5302μs | 64.3907 KOps/s | 64.6288 KOps/s | $\color{#d91a1a}-0.37\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 44.7940μs | 13.6093μs | 73.4792 KOps/s | 74.1359 KOps/s | $\color{#d91a1a}-0.89\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 34.0340μs | 8.7376μs | 114.4473 KOps/s | 115.2439 KOps/s | $\color{#d91a1a}-0.69\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 55.6340μs | 25.3501μs | 39.4476 KOps/s | 40.1816 KOps/s | $\color{#d91a1a}-1.83\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 42.5490μs | 16.9704μs | 58.9262 KOps/s | 60.0593 KOps/s | $\color{#d91a1a}-1.89\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 49.9830μs | 14.8067μs | 67.5368 KOps/s | 67.9995 KOps/s | $\color{#d91a1a}-0.68\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 31.1480μs | 9.9181μs | 100.8259 KOps/s | 101.8199 KOps/s | $\color{#d91a1a}-0.98\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 51.3250μs | 23.8481μs | 41.9320 KOps/s | 41.9131 KOps/s | $\color{#35bf28}+0.04\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 49.1120μs | 15.4819μs | 64.5914 KOps/s | 64.3722 KOps/s | $\color{#35bf28}+0.34\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 40.6860μs | 15.7386μs | 63.5381 KOps/s | 62.6887 KOps/s | $\color{#35bf28}+1.35\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 49.6130μs | 9.9921μs | 100.0793 KOps/s | 99.7172 KOps/s | $\color{#35bf28}+0.36\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 38.3820μs | 25.2739μs | 39.5665 KOps/s | 38.8485 KOps/s | $\color{#35bf28}+1.85\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 38.3320μs | 16.7266μs | 59.7849 KOps/s | 59.9204 KOps/s | $\color{#d91a1a}-0.23\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 47.4190μs | 17.1396μs | 58.3444 KOps/s | 59.0960 KOps/s | $\color{#d91a1a}-1.27\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 37.1400μs | 11.2375μs | 88.9881 KOps/s | 88.6886 KOps/s | $\color{#35bf28}+0.34\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 57.4170μs | 26.4932μs | 37.7455 KOps/s | 37.8537 KOps/s | $\color{#d91a1a}-0.29\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 57.5380μs | 18.1740μs | 55.0235 KOps/s | 55.6073 KOps/s | $\color{#d91a1a}-1.05\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 39.6440μs | 17.2183μs | 58.0777 KOps/s | 58.8895 KOps/s | $\color{#d91a1a}-1.38\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 32.3710μs | 11.0892μs | 90.1782 KOps/s | 89.4810 KOps/s | $\color{#35bf28}+0.78\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 58.9800μs | 27.3324μs | 36.5866 KOps/s | 36.4715 KOps/s | $\color{#35bf28}+0.32\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 49.7530μs | 19.0590μs | 52.4687 KOps/s | 52.5142 KOps/s | $\color{#d91a1a}-0.09\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 58.8500μs | 18.1911μs | 54.9720 KOps/s | 55.6660 KOps/s | $\color{#d91a1a}-1.25\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 43.1610μs | 12.2864μs | 81.3905 KOps/s | 82.2171 KOps/s | $\color{#d91a1a}-1.01\%$ |
| test_values[generalized_advantage_estimate-True-True] | 10.2921ms | 9.2221ms | 108.4347 Ops/s | 109.9929 Ops/s | $\color{#d91a1a}-1.42\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 51.2115ms | 35.3997ms | 28.2488 Ops/s | 30.5813 Ops/s | $\textbf{\color{#d91a1a}-7.63\%}$ |
| test_values[td0_return_estimate-False-False] | 0.2039ms | 0.1642ms | 6.0913 KOps/s | 6.1535 KOps/s | $\color{#d91a1a}-1.01\%$ |
| test_values[td1_return_estimate-False-False] | 25.4390ms | 22.7963ms | 43.8667 Ops/s | 44.1070 Ops/s | $\color{#d91a1a}-0.54\%$ |
| test_values[vec_td1_return_estimate-False-False] | 35.9677ms | 35.0080ms | 28.5649 Ops/s | 30.4145 Ops/s | $\textbf{\color{#d91a1a}-6.08\%}$ |
| test_values[td_lambda_return_estimate-True-False] | 35.2786ms | 32.7033ms | 30.5779 Ops/s | 30.7263 Ops/s | $\color{#d91a1a}-0.48\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 36.6527ms | 35.0799ms | 28.5064 Ops/s | 30.3721 Ops/s | $\textbf{\color{#d91a1a}-6.14\%}$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.2061ms | 8.0225ms | 124.6487 Ops/s | 122.3828 Ops/s | $\color{#35bf28}+1.85\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.3201ms | 2.0105ms | 497.3964 Ops/s | 497.8034 Ops/s | $\color{#d91a1a}-0.08\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5944ms | 0.3491ms | 2.8645 KOps/s | 2.8838 KOps/s | $\color{#d91a1a}-0.67\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 44.8810ms | 43.1946ms | 23.1510 Ops/s | 25.1722 Ops/s | $\textbf{\color{#d91a1a}-8.03\%}$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 3.9185ms | 3.0429ms | 328.6332 Ops/s | 330.8078 Ops/s | $\color{#d91a1a}-0.66\%$ |
| test_dqn_speed | 72.1338ms | 1.4882ms | 671.9307 Ops/s | 727.8788 Ops/s | $\textbf{\color{#d91a1a}-7.69\%}$ |
| test_ddpg_speed | 2.9597ms | 2.7806ms | 359.6354 Ops/s | 359.0870 Ops/s | $\color{#35bf28}+0.15\%$ |
| test_sac_speed | 9.4900ms | 8.3524ms | 119.7259 Ops/s | 122.2713 Ops/s | $\color{#d91a1a}-2.08\%$ |
| test_redq_speed | 14.1971ms | 13.1960ms | 75.7803 Ops/s | 78.0626 Ops/s | $\color{#d91a1a}-2.92\%$ |
| test_redq_deprec_speed | 15.4658ms | 13.7973ms | 72.4780 Ops/s | 77.1737 Ops/s | $\textbf{\color{#d91a1a}-6.08\%}$ |
| test_td3_speed | 8.7756ms | 8.3532ms | 119.7144 Ops/s | 122.4418 Ops/s | $\color{#d91a1a}-2.23\%$ |
| test_cql_speed | 37.8212ms | 36.4626ms | 27.4254 Ops/s | 27.9648 Ops/s | $\color{#d91a1a}-1.93\%$ |
| test_a2c_speed | 8.2137ms | 7.6070ms | 131.4584 Ops/s | 136.7028 Ops/s | $\color{#d91a1a}-3.84\%$ |
| test_ppo_speed | 9.0719ms | 7.8136ms | 127.9821 Ops/s | 129.2576 Ops/s | $\color{#d91a1a}-0.99\%$ |
| test_reinforce_speed | 7.4660ms | 6.7421ms | 148.3222 Ops/s | 152.2991 Ops/s | $\color{#d91a1a}-2.61\%$ |
| test_iql_speed | 34.2219ms | 32.9521ms | 30.3471 Ops/s | 30.9336 Ops/s | $\color{#d91a1a}-1.90\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.5989ms | 2.1805ms | 458.6013 Ops/s | 482.5277 Ops/s | $\color{#d91a1a}-4.96\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.9955ms | 0.5186ms | 1.9284 KOps/s | 2.0380 KOps/s | $\textbf{\color{#d91a1a}-5.38\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7654ms | 0.4699ms | 2.1282 KOps/s | 2.1294 KOps/s | $\color{#d91a1a}-0.05\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.6553ms | 2.3089ms | 433.1073 Ops/s | 489.9737 Ops/s | $\textbf{\color{#d91a1a}-11.61\%}$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6081ms | 0.4861ms | 2.0574 KOps/s | 2.0510 KOps/s | $\color{#35bf28}+0.31\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 86.9319ms | 0.5346ms | 1.8704 KOps/s | 2.1677 KOps/s | $\textbf{\color{#d91a1a}-13.71\%}$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.6517ms | 2.2991ms | 434.9580 Ops/s | 464.6205 Ops/s | $\textbf{\color{#d91a1a}-6.38\%}$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.8449ms | 0.6116ms | 1.6350 KOps/s | 1.6646 KOps/s | $\color{#d91a1a}-1.78\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 3.7116ms | 0.5831ms | 1.7151 KOps/s | 1.7360 KOps/s | $\color{#d91a1a}-1.20\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.5912ms | 2.4001ms | 416.6579 Ops/s | 486.2304 Ops/s | $\textbf{\color{#d91a1a}-14.31\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8042ms | 0.5130ms | 1.9494 KOps/s | 1.7453 KOps/s | $\textbf{\color{#35bf28}+11.69\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 4.4187ms | 0.4905ms | 2.0386 KOps/s | 2.1242 KOps/s | $\color{#d91a1a}-4.03\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.6808ms | 2.4754ms | 403.9801 Ops/s | 486.1510 Ops/s | $\textbf{\color{#d91a1a}-16.90\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6428ms | 0.5043ms | 1.9828 KOps/s | 2.0543 KOps/s | $\color{#d91a1a}-3.48\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7195ms | 0.4703ms | 2.1261 KOps/s | 2.1450 KOps/s | $\color{#d91a1a}-0.88\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.4272ms | 2.3200ms | 431.0342 Ops/s | 458.7803 Ops/s | $\textbf{\color{#d91a1a}-6.05\%}$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1375ms | 0.6129ms | 1.6316 KOps/s | 1.6587 KOps/s | $\color{#d91a1a}-1.63\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 1.0424ms | 0.5939ms | 1.6837 KOps/s | 1.7207 KOps/s | $\color{#d91a1a}-2.15\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1020s | 5.5748ms | 179.3798 Ops/s | 185.6238 Ops/s | $\color{#d91a1a}-3.36\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 14.0835ms | 11.6522ms | 85.8211 Ops/s | 85.6846 Ops/s | $\color{#35bf28}+0.16\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.7807ms | 1.0750ms | 930.2098 Ops/s | 995.2308 Ops/s | $\textbf{\color{#d91a1a}-6.53\%}$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 85.1690ms | 6.8113ms | 146.8145 Ops/s | 151.0166 Ops/s | $\color{#d91a1a}-2.78\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 14.0264ms | 11.6385ms | 85.9218 Ops/s | 86.1148 Ops/s | $\color{#d91a1a}-0.22\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 3.7278ms | 1.0951ms | 913.1885 Ops/s | 996.4614 Ops/s | $\textbf{\color{#d91a1a}-8.36\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 83.8663ms | 5.4691ms | 182.8470 Ops/s | 183.8331 Ops/s | $\color{#d91a1a}-0.54\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 93.6176ms | 13.5753ms | 73.6630 Ops/s | 72.9312 Ops/s | $\color{#35bf28}+1.00\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.0872ms | 1.3442ms | 743.9229 Ops/s | 742.2843 Ops/s | $\color{#35bf28}+0.22\%$ |
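
The columns above are not defined in the comment. The numbers appear consistent with `Ops` being the reciprocal of `Mean` and `Change` being the relative difference between `Ops` and `Ops on Repo HEAD`. The sketch below checks that reading against the `test_single` row; it is an inference from the reported values, not the benchmark action's actual code, and the function names are hypothetical.

```python
def ops_per_second(mean_seconds):
    """Throughput implied by a mean run time, assuming Ops = 1 / Mean."""
    return 1.0 / mean_seconds


def relative_change(ops, ops_on_head):
    """Percent change of this PR's throughput relative to repo HEAD."""
    return 100.0 * (ops - ops_on_head) / ops_on_head


# test_single row: Mean 61.0799ms, 16.3720 Ops/s vs 16.7281 Ops/s on HEAD.
print(f"{ops_per_second(61.0799e-3):.3f} Ops/s")
print(f"{relative_change(16.3720, 16.7281):+.2f}%")  # prints -2.13%
```

Under this reading a negative `Change` means the PR branch ran slower than HEAD for that benchmark.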


github-actions bot commented Feb 26, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 92. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}7$.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1153s | 0.1138s | 8.7897 Ops/s | 8.7505 Ops/s | $\color{#35bf28}+0.45\%$ |
| test_sync | 95.6225ms | 95.3596ms | 10.4866 Ops/s | 10.4128 Ops/s | $\color{#35bf28}+0.71\%$ |
| test_async | 0.1814s | 91.9067ms | 10.8806 Ops/s | 10.9138 Ops/s | $\color{#d91a1a}-0.30\%$ |
| test_single_pixels | 0.2054s | 0.1441s | 6.9415 Ops/s | 7.4227 Ops/s | $\textbf{\color{#d91a1a}-6.48\%}$ |
| test_sync_pixels | 83.0110ms | 80.5173ms | 12.4197 Ops/s | 12.5944 Ops/s | $\color{#d91a1a}-1.39\%$ |
| test_async_pixels | 0.1443s | 70.3403ms | 14.2166 Ops/s | 15.1816 Ops/s | $\textbf{\color{#d91a1a}-6.36\%}$ |
| test_simple | 0.8142s | 0.8102s | 1.2343 Ops/s | 1.2013 Ops/s | $\color{#35bf28}+2.75\%$ |
| test_transformed | 1.0277s | 1.0265s | 0.9742 Ops/s | 0.9646 Ops/s | $\color{#35bf28}+1.00\%$ |
| test_serial | 2.4368s | 2.3802s | 0.4201 Ops/s | 0.4181 Ops/s | $\color{#35bf28}+0.49\%$ |
| test_parallel | 2.0907s | 2.0567s | 0.4862 Ops/s | 0.4911 Ops/s | $\color{#d91a1a}-0.99\%$ |
| test_step_mdp_speed[True-True-True-True-True] | 82.9920μs | 33.0024μs | 30.3009 KOps/s | 30.6792 KOps/s | $\color{#d91a1a}-1.23\%$ |
| test_step_mdp_speed[True-True-True-True-False] | 43.6210μs | 19.4996μs | 51.2831 KOps/s | 51.6344 KOps/s | $\color{#d91a1a}-0.68\%$ |
| test_step_mdp_speed[True-True-True-False-True] | 46.3510μs | 18.6882μs | 53.5096 KOps/s | 55.1461 KOps/s | $\color{#d91a1a}-2.97\%$ |
| test_step_mdp_speed[True-True-True-False-False] | 40.4310μs | 11.1646μs | 89.5691 KOps/s | 92.4347 KOps/s | $\color{#d91a1a}-3.10\%$ |
| test_step_mdp_speed[True-True-False-True-True] | 66.9410μs | 34.7205μs | 28.8014 KOps/s | 29.3111 KOps/s | $\color{#d91a1a}-1.74\%$ |
| test_step_mdp_speed[True-True-False-True-False] | 49.6210μs | 21.2946μs | 46.9602 KOps/s | 48.1882 KOps/s | $\color{#d91a1a}-2.55\%$ |
| test_step_mdp_speed[True-True-False-False-True] | 48.4210μs | 20.7117μs | 48.2818 KOps/s | 50.8224 KOps/s | $\color{#d91a1a}-5.00\%$ |
| test_step_mdp_speed[True-True-False-False-False] | 86.8820μs | 13.1234μs | 76.1997 KOps/s | 78.3016 KOps/s | $\color{#d91a1a}-2.68\%$ |
| test_step_mdp_speed[True-False-True-True-True] | 64.1810μs | 37.2081μs | 26.8759 KOps/s | 27.9688 KOps/s | $\color{#d91a1a}-3.91\%$ |
| test_step_mdp_speed[True-False-True-True-False] | 47.2010μs | 23.4558μs | 42.6334 KOps/s | 43.8957 KOps/s | $\color{#d91a1a}-2.88\%$ |
| test_step_mdp_speed[True-False-True-False-True] | 39.4900μs | 20.6229μs | 48.4898 KOps/s | 48.7492 KOps/s | $\color{#d91a1a}-0.53\%$ |
| test_step_mdp_speed[True-False-True-False-False] | 31.7810μs | 12.9513μs | 77.2123 KOps/s | 78.4817 KOps/s | $\color{#d91a1a}-1.62\%$ |
| test_step_mdp_speed[True-False-False-True-True] | 62.3710μs | 38.4617μs | 25.9999 KOps/s | 26.4649 KOps/s | $\color{#d91a1a}-1.76\%$ |
| test_step_mdp_speed[True-False-False-True-False] | 49.3010μs | 25.2437μs | 39.6138 KOps/s | 40.5367 KOps/s | $\color{#d91a1a}-2.28\%$ |
| test_step_mdp_speed[True-False-False-False-True] | 50.7710μs | 22.5105μs | 44.4236 KOps/s | 46.1639 KOps/s | $\color{#d91a1a}-3.77\%$ |
| test_step_mdp_speed[True-False-False-False-False] | 34.2810μs | 14.9783μs | 66.7631 KOps/s | 69.6624 KOps/s | $\color{#d91a1a}-4.16\%$ |
| test_step_mdp_speed[False-True-True-True-True] | 56.5820μs | 36.8191μs | 27.1598 KOps/s | 28.0142 KOps/s | $\color{#d91a1a}-3.05\%$ |
| test_step_mdp_speed[False-True-True-True-False] | 45.2210μs | 23.2961μs | 42.9256 KOps/s | 43.6989 KOps/s | $\color{#d91a1a}-1.77\%$ |
| test_step_mdp_speed[False-True-True-False-True] | 55.5110μs | 24.7327μs | 40.4323 KOps/s | 41.5014 KOps/s | $\color{#d91a1a}-2.58\%$ |
| test_step_mdp_speed[False-True-True-False-False] | 36.4010μs | 14.8950μs | 67.1368 KOps/s | 69.2403 KOps/s | $\color{#d91a1a}-3.04\%$ |
| test_step_mdp_speed[False-True-False-True-True] | 62.7710μs | 38.2580μs | 26.1383 KOps/s | 26.1232 KOps/s | $\color{#35bf28}+0.06\%$ |
| test_step_mdp_speed[False-True-False-True-False] | 43.4310μs | 25.1798μs | 39.7144 KOps/s | 40.2950 KOps/s | $\color{#d91a1a}-1.44\%$ |
| test_step_mdp_speed[False-True-False-False-True] | 61.8410μs | 26.1674μs | 38.2156 KOps/s | 39.0158 KOps/s | $\color{#d91a1a}-2.05\%$ |
| test_step_mdp_speed[False-True-False-False-False] | 33.4910μs | 16.4091μs | 60.9419 KOps/s | 61.9001 KOps/s | $\color{#d91a1a}-1.55\%$ |
| test_step_mdp_speed[False-False-True-True-True] | 65.5310μs | 40.9764μs | 24.4043 KOps/s | 25.2642 KOps/s | $\color{#d91a1a}-3.40\%$ |
| test_step_mdp_speed[False-False-True-True-False] | 54.3810μs | 27.4064μs | 36.4878 KOps/s | 37.6069 KOps/s | $\color{#d91a1a}-2.98\%$ |
| test_step_mdp_speed[False-False-True-False-True] | 59.3420μs | 26.3400μs | 37.9651 KOps/s | 39.0179 KOps/s | $\color{#d91a1a}-2.70\%$ |
| test_step_mdp_speed[False-False-True-False-False] | 51.8910μs | 16.8422μs | 59.3745 KOps/s | 62.1798 KOps/s | $\color{#d91a1a}-4.51\%$ |
| test_step_mdp_speed[False-False-False-True-True] | 61.6910μs | 42.2696μs | 23.6577 KOps/s | 24.1214 KOps/s | $\color{#d91a1a}-1.92\%$ |
| test_step_mdp_speed[False-False-False-True-False] | 58.1710μs | 29.1878μs | 34.2608 KOps/s | 34.9523 KOps/s | $\color{#d91a1a}-1.98\%$ |
| test_step_mdp_speed[False-False-False-False-True] | 56.1610μs | 28.2298μs | 35.4236 KOps/s | 37.0898 KOps/s | $\color{#d91a1a}-4.49\%$ |
| test_step_mdp_speed[False-False-False-False-False] | 84.1210μs | 18.3811μs | 54.4037 KOps/s | 56.3750 KOps/s | $\color{#d91a1a}-3.50\%$ |
| test_values[generalized_advantage_estimate-True-True] | 24.7836ms | 24.2547ms | 41.2292 Ops/s | 40.6296 Ops/s | $\color{#35bf28}+1.48\%$ |
| test_values[vec_generalized_advantage_estimate-True-True] | 80.9131ms | 3.1767ms | 314.7892 Ops/s | 301.2727 Ops/s | $\color{#35bf28}+4.49\%$ |
| test_values[td0_return_estimate-False-False] | 99.2620μs | 60.1896μs | 16.6142 KOps/s | 16.5659 KOps/s | $\color{#35bf28}+0.29\%$ |
| test_values[td1_return_estimate-False-False] | 52.7707ms | 52.0843ms | 19.1996 Ops/s | 18.7085 Ops/s | $\color{#35bf28}+2.63\%$ |
| test_values[vec_td1_return_estimate-False-False] | 2.1247ms | 1.7566ms | 569.2853 Ops/s | 569.0948 Ops/s | $\color{#35bf28}+0.03\%$ |
| test_values[td_lambda_return_estimate-True-False] | 83.2763ms | 83.0292ms | 12.0440 Ops/s | 11.7477 Ops/s | $\color{#35bf28}+2.52\%$ |
| test_values[vec_td_lambda_return_estimate-True-False] | 3.9420ms | 1.7925ms | 557.8673 Ops/s | 558.6854 Ops/s | $\color{#d91a1a}-0.15\%$ |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 22.9045ms | 22.7444ms | 43.9668 Ops/s | 42.5713 Ops/s | $\color{#35bf28}+3.28\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.9028ms | 0.6897ms | 1.4499 KOps/s | 1.4345 KOps/s | $\color{#35bf28}+1.07\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7035ms | 0.6451ms | 1.5501 KOps/s | 1.5418 KOps/s | $\color{#35bf28}+0.53\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.4773ms | 1.4471ms | 691.0156 Ops/s | 688.7955 Ops/s | $\color{#35bf28}+0.32\%$ |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.9239ms | 0.6660ms | 1.5014 KOps/s | 1.4970 KOps/s | $\color{#35bf28}+0.30\%$ |
| test_dqn_speed | 8.8801ms | 1.4746ms | 678.1500 Ops/s | 640.5885 Ops/s | $\textbf{\color{#35bf28}+5.86\%}$ |
| test_ddpg_speed | 3.9409ms | 2.8019ms | 356.8985 Ops/s | 370.1837 Ops/s | $\color{#d91a1a}-3.59\%$ |
| test_sac_speed | 9.3110ms | 8.0595ms | 124.0765 Ops/s | 124.8829 Ops/s | $\color{#d91a1a}-0.65\%$ |
| test_redq_speed | 11.0071ms | 10.1352ms | 98.6659 Ops/s | 100.2710 Ops/s | $\color{#d91a1a}-1.60\%$ |
| test_redq_deprec_speed | 12.5229ms | 11.3849ms | 87.8354 Ops/s | 89.0770 Ops/s | $\color{#d91a1a}-1.39\%$ |
| test_td3_speed | 8.2300ms | 8.0868ms | 123.6584 Ops/s | 124.7792 Ops/s | $\color{#d91a1a}-0.90\%$ |
| test_cql_speed | 25.7920ms | 24.9439ms | 40.0900 Ops/s | 40.8228 Ops/s | $\color{#d91a1a}-1.79\%$ |
| test_a2c_speed | 5.7647ms | 5.5199ms | 181.1639 Ops/s | 197.4889 Ops/s | $\textbf{\color{#d91a1a}-8.27\%}$ |
| test_ppo_speed | 6.0478ms | 5.7146ms | 174.9907 Ops/s | 186.4470 Ops/s | $\textbf{\color{#d91a1a}-6.14\%}$ |
| test_reinforce_speed | 5.0227ms | 4.4747ms | 223.4773 Ops/s | 244.8136 Ops/s | $\textbf{\color{#d91a1a}-8.72\%}$ |
| test_iql_speed | 19.8032ms | 19.0760ms | 52.4218 Ops/s | 54.6524 Ops/s | $\color{#d91a1a}-4.08\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.9661ms | 2.8535ms | 350.4518 Ops/s | 353.7636 Ops/s | $\color{#d91a1a}-0.94\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.6614ms | 0.5290ms | 1.8904 KOps/s | 1.9066 KOps/s | $\color{#d91a1a}-0.85\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7179ms | 0.5051ms | 1.9798 KOps/s | 1.9955 KOps/s | $\color{#d91a1a}-0.79\%$ |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.1046ms | 2.8879ms | 346.2747 Ops/s | 352.1513 Ops/s | $\color{#d91a1a}-1.67\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6740ms | 0.5269ms | 1.8980 KOps/s | 1.9234 KOps/s | $\color{#d91a1a}-1.32\%$ |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 4.1407ms | 0.4999ms | 2.0004 KOps/s | 2.0068 KOps/s | $\color{#d91a1a}-0.32\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.0941ms | 2.9908ms | 334.3635 Ops/s | 338.3034 Ops/s | $\color{#d91a1a}-1.16\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.7727ms | 0.6525ms | 1.5326 KOps/s | 1.5511 KOps/s | $\color{#d91a1a}-1.19\%$ |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8379ms | 0.6332ms | 1.5792 KOps/s | 1.6181 KOps/s | $\color{#d91a1a}-2.41\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.9677ms | 2.8872ms | 346.3614 Ops/s | 353.8243 Ops/s | $\color{#d91a1a}-2.11\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.4787ms | 0.5405ms | 1.8502 KOps/s | 1.6301 KOps/s | $\textbf{\color{#35bf28}+13.50\%}$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7194ms | 0.5146ms | 1.9434 KOps/s | 1.9870 KOps/s | $\color{#d91a1a}-2.20\%$ |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.0688ms | 2.8841ms | 346.7239 Ops/s | 348.2026 Ops/s | $\color{#d91a1a}-0.42\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6344ms | 0.5238ms | 1.9091 KOps/s | 1.9212 KOps/s | $\color{#d91a1a}-0.63\%$ |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 4.1721ms | 0.5044ms | 1.9826 KOps/s | 2.0077 KOps/s | $\color{#d91a1a}-1.25\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.1566ms | 3.0155ms | 331.6220 Ops/s | 337.3802 Ops/s | $\color{#d91a1a}-1.71\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.5845ms | 0.6527ms | 1.5321 KOps/s | 1.5440 KOps/s | $\color{#d91a1a}-0.77\%$ |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8385ms | 0.6270ms | 1.5950 KOps/s | 1.6048 KOps/s | $\color{#d91a1a}-0.61\%$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1044s | 8.6363ms | 115.7909 Ops/s | 140.4894 Ops/s | $\textbf{\color{#d91a1a}-17.58\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 16.8894ms | 14.4002ms | 69.4434 Ops/s | 69.8861 Ops/s | $\color{#d91a1a}-0.63\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.1249ms | 1.0329ms | 968.1566 Ops/s | 875.2359 Ops/s | $\textbf{\color{#35bf28}+10.62\%}$ |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1009s | 6.7483ms | 148.1854 Ops/s | 116.8924 Ops/s | $\textbf{\color{#35bf28}+26.77\%}$ |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 17.0073ms | 14.3511ms | 69.6812 Ops/s | 70.0890 Ops/s | $\color{#d91a1a}-0.58\%$ |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 2.2480ms | 1.0490ms | 953.2821 Ops/s | 813.4450 Ops/s | $\textbf{\color{#35bf28}+17.19\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1023s | 9.0406ms | 110.6120 Ops/s | 142.1618 Ops/s | $\textbf{\color{#d91a1a}-22.19\%}$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 17.3142ms | 14.7633ms | 67.7357 Ops/s | 68.1105 Ops/s | $\color{#d91a1a}-0.55\%$ |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.4664ms | 1.3754ms | 727.0856 Ops/s | 687.5782 Ops/s | $\textbf{\color{#35bf28}+5.75\%}$ |

@vmoens vmoens marked this pull request as ready for review February 27, 2024 00:34
@vmoens vmoens merged commit cadf4d9 into main Feb 27, 2024
65 of 68 checks passed
@vmoens vmoens deleted the parallel_env_ctx branch February 27, 2024 00:46