[Feature] TorchRL2Gym conversion #1795

Merged
vmoens merged 44 commits into main from torchrl-to-gym-env
Jan 19, 2024

Conversation

vmoens
Contributor

@vmoens vmoens commented Jan 13, 2024

Description

Proposes an API to register a TorchRL env in gym(nasium).

The goal is a universal converter from any simulator (including TorchRL itself) to gym.

For instance, you could use dm_control, Brax, or anything else with gym without writing any boilerplate code.

Check the docstrings and tests to learn more!

TODO:

  • Stateless envs: during `step`, we should save the input data in the env so that it is available for the next call to `step`.
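
To make the TODO above concrete, here is a minimal sketch in plain Python. It is not TorchRL's implementation: `StatelessCounterEnv` and `GymLikeAdapter` are hypothetical names, and the env is assumed to take the current state plus an action and return the next state. The adapter caches the latest state so that a gym-style `step(action)` can be called with the action alone.

```python
class StatelessCounterEnv:
    """Hypothetical stateless env: all state travels in the input dict."""

    def reset(self):
        return {"count": 0}

    def step(self, state, action):
        # The next state is a pure function of (state, action).
        count = state["count"] + action
        return {"count": count, "reward": float(action), "done": count >= 3}


class GymLikeAdapter:
    """Caches the last state so that step() only needs the action."""

    def __init__(self, env):
        self.env = env
        self._state = None

    def reset(self):
        self._state = self.env.reset()
        return dict(self._state), {}

    def step(self, action):
        if self._state is None:
            raise RuntimeError("call reset() before step()")
        out = self.env.step(self._state, action)
        # Save the data that the next call to step() will need.
        self._state = {"count": out["count"]}
        obs = {"count": out["count"]}
        # gymnasium-style 5-tuple: obs, reward, terminated, truncated, info
        return obs, out["reward"], out["done"], False, {}


adapter = GymLikeAdapter(StatelessCounterEnv())
first_obs, _ = adapter.reset()
trajectory = [adapter.step(1) for _ in range(3)]
```

The caller never handles the state explicitly; it only passes actions, which is what a gym user would expect.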

Closes #1200 as well

cc @skandermoalla @BY571 @albertbou92 @duburcqa @vikashplus


pytorch-bot bot commented Jan 13, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1795

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 1 Pending, 4 Unrelated Failures

As of commit 62dcee6 with merge base 57139bd (image):

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 13, 2024

github-actions bot commented Jan 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 89. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}2$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1353s 65.5729ms 15.2502 Ops/s 16.6657 Ops/s $\textbf{\color{#d91a1a}-8.49\%}$
test_sync 49.2860ms 32.9282ms 30.3691 Ops/s 25.8330 Ops/s $\textbf{\color{#35bf28}+17.56\%}$
test_async 65.0933ms 32.3859ms 30.8776 Ops/s 31.0756 Ops/s $\color{#d91a1a}-0.64\%$
test_simple 0.4809s 0.4258s 2.3486 Ops/s 2.3273 Ops/s $\color{#35bf28}+0.91\%$
test_transformed 0.6294s 0.5765s 1.7346 Ops/s 1.7128 Ops/s $\color{#35bf28}+1.27\%$
test_serial 1.3723s 1.3298s 0.7520 Ops/s 0.7607 Ops/s $\color{#d91a1a}-1.15\%$
test_parallel 1.2414s 1.2000s 0.8333 Ops/s 0.8138 Ops/s $\color{#35bf28}+2.41\%$
test_step_mdp_speed[True-True-True-True-True] 0.1033ms 21.2097μs 47.1482 KOps/s 46.9529 KOps/s $\color{#35bf28}+0.42\%$
test_step_mdp_speed[True-True-True-True-False] 34.8550μs 12.8457μs 77.8469 KOps/s 76.7546 KOps/s $\color{#35bf28}+1.42\%$
test_step_mdp_speed[True-True-True-False-True] 40.5750μs 12.4220μs 80.5021 KOps/s 80.7626 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[True-True-True-False-False] 26.3190μs 7.5812μs 131.9045 KOps/s 131.5598 KOps/s $\color{#35bf28}+0.26\%$
test_step_mdp_speed[True-True-False-True-True] 52.4070μs 22.5201μs 44.4047 KOps/s 44.0918 KOps/s $\color{#35bf28}+0.71\%$
test_step_mdp_speed[True-True-False-True-False] 50.5540μs 14.1088μs 70.8778 KOps/s 69.5435 KOps/s $\color{#35bf28}+1.92\%$
test_step_mdp_speed[True-True-False-False-True] 40.8960μs 13.6326μs 73.3534 KOps/s 73.4377 KOps/s $\color{#d91a1a}-0.11\%$
test_step_mdp_speed[True-True-False-False-False] 38.6000μs 8.7210μs 114.6656 KOps/s 112.8903 KOps/s $\color{#35bf28}+1.57\%$
test_step_mdp_speed[True-False-True-True-True] 53.7710μs 23.9045μs 41.8331 KOps/s 41.4379 KOps/s $\color{#35bf28}+0.95\%$
test_step_mdp_speed[True-False-True-True-False] 35.1950μs 15.5951μs 64.1226 KOps/s 62.9733 KOps/s $\color{#35bf28}+1.82\%$
test_step_mdp_speed[True-False-True-False-True] 48.1000μs 13.7164μs 72.9056 KOps/s 72.2862 KOps/s $\color{#35bf28}+0.86\%$
test_step_mdp_speed[True-False-True-False-False] 0.1486ms 9.2222μs 108.4342 KOps/s 112.9591 KOps/s $\color{#d91a1a}-4.01\%$
test_step_mdp_speed[True-False-False-True-True] 51.8770μs 25.2221μs 39.6478 KOps/s 39.7552 KOps/s $\color{#d91a1a}-0.27\%$
test_step_mdp_speed[True-False-False-True-False] 41.5170μs 16.7136μs 59.8314 KOps/s 59.4845 KOps/s $\color{#35bf28}+0.58\%$
test_step_mdp_speed[True-False-False-False-True] 55.8640μs 14.9719μs 66.7917 KOps/s 67.5735 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[True-False-False-False-False] 26.7790μs 10.0739μs 99.2665 KOps/s 99.6902 KOps/s $\color{#d91a1a}-0.43\%$
test_step_mdp_speed[False-True-True-True-True] 48.2200μs 23.7362μs 42.1298 KOps/s 41.7132 KOps/s $\color{#35bf28}+1.00\%$
test_step_mdp_speed[False-True-True-True-False] 42.1480μs 15.4951μs 64.5366 KOps/s 63.1483 KOps/s $\color{#35bf28}+2.20\%$
test_step_mdp_speed[False-True-True-False-True] 50.7740μs 15.8088μs 63.2559 KOps/s 62.0517 KOps/s $\color{#35bf28}+1.94\%$
test_step_mdp_speed[False-True-True-False-False] 25.3280μs 10.0773μs 99.2332 KOps/s 97.7830 KOps/s $\color{#35bf28}+1.48\%$
test_step_mdp_speed[False-True-False-True-True] 55.6840μs 24.6982μs 40.4889 KOps/s 39.7702 KOps/s $\color{#35bf28}+1.81\%$
test_step_mdp_speed[False-True-False-True-False] 0.1768ms 16.6509μs 60.0569 KOps/s 58.7146 KOps/s $\color{#35bf28}+2.29\%$
test_step_mdp_speed[False-True-False-False-True] 47.7180μs 16.9616μs 58.9567 KOps/s 58.6788 KOps/s $\color{#35bf28}+0.47\%$
test_step_mdp_speed[False-True-False-False-False] 29.0640μs 11.2578μs 88.8269 KOps/s 87.9024 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[False-False-True-True-True] 52.0470μs 26.0229μs 38.4277 KOps/s 38.2922 KOps/s $\color{#35bf28}+0.35\%$
test_step_mdp_speed[False-False-True-True-False] 43.4610μs 17.9912μs 55.5828 KOps/s 54.6692 KOps/s $\color{#35bf28}+1.67\%$
test_step_mdp_speed[False-False-True-False-True] 48.7910μs 17.0359μs 58.6996 KOps/s 58.5065 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[False-False-True-False-False] 27.1610μs 11.2675μs 88.7510 KOps/s 88.5578 KOps/s $\color{#35bf28}+0.22\%$
test_step_mdp_speed[False-False-False-True-True] 54.6710μs 27.2021μs 36.7619 KOps/s 36.4031 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[False-False-False-True-False] 53.8200μs 19.3497μs 51.6804 KOps/s 51.5189 KOps/s $\color{#35bf28}+0.31\%$
test_step_mdp_speed[False-False-False-False-True] 43.6110μs 18.0314μs 55.4588 KOps/s 55.2560 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[False-False-False-False-False] 42.3890μs 12.3622μs 80.8915 KOps/s 80.0732 KOps/s $\color{#35bf28}+1.02\%$
test_values[generalized_advantage_estimate-True-True] 13.0395ms 11.9970ms 83.3539 Ops/s 82.9559 Ops/s $\color{#35bf28}+0.48\%$
test_values[vec_generalized_advantage_estimate-True-True] 35.3174ms 27.9238ms 35.8118 Ops/s 35.7995 Ops/s $\color{#35bf28}+0.03\%$
test_values[td0_return_estimate-False-False] 0.2649ms 0.1801ms 5.5509 KOps/s 5.5984 KOps/s $\color{#d91a1a}-0.85\%$
test_values[td1_return_estimate-False-False] 25.7993ms 25.5226ms 39.1810 Ops/s 39.0112 Ops/s $\color{#35bf28}+0.44\%$
test_values[vec_td1_return_estimate-False-False] 35.6316ms 28.0751ms 35.6187 Ops/s 35.6768 Ops/s $\color{#d91a1a}-0.16\%$
test_values[td_lambda_return_estimate-True-False] 36.6685ms 35.3600ms 28.2806 Ops/s 27.6181 Ops/s $\color{#35bf28}+2.40\%$
test_values[vec_td_lambda_return_estimate-True-False] 36.0770ms 27.9697ms 35.7529 Ops/s 35.9636 Ops/s $\color{#d91a1a}-0.59\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.8983ms 8.0016ms 124.9748 Ops/s 126.1373 Ops/s $\color{#d91a1a}-0.92\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.8652ms 1.9212ms 520.5009 Ops/s 539.3239 Ops/s $\color{#d91a1a}-3.49\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 8.9789ms 0.4359ms 2.2943 KOps/s 2.3621 KOps/s $\color{#d91a1a}-2.87\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 46.2095ms 38.8642ms 25.7306 Ops/s 26.6868 Ops/s $\color{#d91a1a}-3.58\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 11.1906ms 2.6619ms 375.6710 Ops/s 377.9637 Ops/s $\color{#d91a1a}-0.61\%$
test_dqn_speed 78.5183ms 7.7656ms 128.7725 Ops/s 125.2932 Ops/s $\color{#35bf28}+2.78\%$
test_ddpg_speed 28.1911ms 14.2544ms 70.1539 Ops/s 70.0702 Ops/s $\color{#35bf28}+0.12\%$
test_sac_speed 32.7106ms 28.6083ms 34.9549 Ops/s 34.9161 Ops/s $\color{#35bf28}+0.11\%$
test_redq_speed 48.0342ms 44.1004ms 22.6756 Ops/s 22.5143 Ops/s $\color{#35bf28}+0.72\%$
test_redq_deprec_speed 29.9960ms 25.2385ms 39.6221 Ops/s 40.0765 Ops/s $\color{#d91a1a}-1.13\%$
test_td3_speed 27.9478ms 19.7643ms 50.5964 Ops/s 50.7975 Ops/s $\color{#d91a1a}-0.40\%$
test_cql_speed 87.9028ms 85.6081ms 11.6811 Ops/s 10.7291 Ops/s $\textbf{\color{#35bf28}+8.87\%}$
test_a2c_speed 34.3550ms 25.9833ms 38.4862 Ops/s 37.7601 Ops/s $\color{#35bf28}+1.92\%$
test_ppo_speed 33.2212ms 26.9455ms 37.1119 Ops/s 37.2552 Ops/s $\color{#d91a1a}-0.38\%$
test_reinforce_speed 26.0726ms 25.2583ms 39.5910 Ops/s 39.8197 Ops/s $\color{#d91a1a}-0.57\%$
test_iql_speed 71.9146ms 62.7382ms 15.9392 Ops/s 15.5643 Ops/s $\color{#35bf28}+2.41\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.0402ms 1.3511ms 740.1350 Ops/s 711.7256 Ops/s $\color{#35bf28}+3.99\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 8.7101ms 0.5273ms 1.8965 KOps/s 1.9145 KOps/s $\color{#d91a1a}-0.94\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6787ms 0.4868ms 2.0542 KOps/s 2.0399 KOps/s $\color{#35bf28}+0.70\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 1.8207ms 1.3346ms 749.2702 Ops/s 764.5319 Ops/s $\color{#d91a1a}-2.00\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 8.8876ms 0.5182ms 1.9299 KOps/s 1.9467 KOps/s $\color{#d91a1a}-0.86\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 8.7635ms 0.4977ms 2.0091 KOps/s 2.0744 KOps/s $\color{#d91a1a}-3.15\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 3.4808ms 1.5420ms 648.5098 Ops/s 663.9192 Ops/s $\color{#d91a1a}-2.32\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 9.0224ms 0.6541ms 1.5287 KOps/s 1.5563 KOps/s $\color{#d91a1a}-1.77\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7438ms 0.6169ms 1.6209 KOps/s 1.6020 KOps/s $\color{#35bf28}+1.18\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 1.9592ms 1.3560ms 737.4705 Ops/s 752.8187 Ops/s $\color{#d91a1a}-2.04\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 8.7164ms 0.5241ms 1.9079 KOps/s 1.9551 KOps/s $\color{#d91a1a}-2.41\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5946ms 0.4860ms 2.0577 KOps/s 2.0627 KOps/s $\color{#d91a1a}-0.24\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.0123ms 1.3241ms 755.2362 Ops/s 761.2908 Ops/s $\color{#d91a1a}-0.80\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 9.6966ms 0.5200ms 1.9232 KOps/s 1.9574 KOps/s $\color{#d91a1a}-1.75\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6253ms 0.4804ms 2.0815 KOps/s 2.0313 KOps/s $\color{#35bf28}+2.47\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 2.2860ms 1.5137ms 660.6393 Ops/s 674.5929 Ops/s $\color{#d91a1a}-2.07\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 9.7194ms 0.6582ms 1.5193 KOps/s 1.5573 KOps/s $\color{#d91a1a}-2.44\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7409ms 0.6190ms 1.6155 KOps/s 1.5945 KOps/s $\color{#35bf28}+1.32\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1122s 12.1297ms 82.4423 Ops/s 84.1635 Ops/s $\color{#d91a1a}-2.05\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 15.5984ms 13.4758ms 74.2071 Ops/s 73.9723 Ops/s $\color{#35bf28}+0.32\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 11.7337ms 3.3585ms 297.7523 Ops/s 308.7847 Ops/s $\color{#d91a1a}-3.57\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1042s 10.2182ms 97.8645 Ops/s 101.3376 Ops/s $\color{#d91a1a}-3.43\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 0.1085s 15.4047ms 64.9151 Ops/s 73.8502 Ops/s $\textbf{\color{#d91a1a}-12.10\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 4.7542ms 3.2123ms 311.3054 Ops/s 308.6608 Ops/s $\color{#35bf28}+0.86\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1024s 10.3379ms 96.7313 Ops/s 85.9386 Ops/s $\textbf{\color{#35bf28}+12.56\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 16.0737ms 13.7753ms 72.5939 Ops/s 71.6838 Ops/s $\color{#35bf28}+1.27\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 11.3974ms 3.6071ms 277.2346 Ops/s 288.9310 Ops/s $\color{#d91a1a}-4.05\%$


github-actions bot commented Jan 13, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 92. Improved: $\large\color{#35bf28}39$. Worsened: $\large\color{#d91a1a}1$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1169s 0.1132s 8.8305 Ops/s 8.5036 Ops/s $\color{#35bf28}+3.84\%$
test_sync 0.1743s 0.1030s 9.7065 Ops/s 9.6857 Ops/s $\color{#35bf28}+0.22\%$
test_async 0.1809s 91.4541ms 10.9344 Ops/s 10.9373 Ops/s $\color{#d91a1a}-0.03\%$
test_single_pixels 0.1361s 0.1358s 7.3634 Ops/s 7.1415 Ops/s $\color{#35bf28}+3.11\%$
test_sync_pixels 78.9467ms 75.4887ms 13.2470 Ops/s 13.1373 Ops/s $\color{#35bf28}+0.84\%$
test_async_pixels 0.1369s 71.3868ms 14.0082 Ops/s 13.2864 Ops/s $\textbf{\color{#35bf28}+5.43\%}$
test_simple 0.8951s 0.8268s 1.2094 Ops/s 1.1657 Ops/s $\color{#35bf28}+3.75\%$
test_transformed 1.1200s 1.0582s 0.9450 Ops/s 0.9281 Ops/s $\color{#35bf28}+1.82\%$
test_serial 2.3182s 2.2570s 0.4431 Ops/s 0.4233 Ops/s $\color{#35bf28}+4.67\%$
test_parallel 1.9188s 1.8383s 0.5440 Ops/s 0.5337 Ops/s $\color{#35bf28}+1.93\%$
test_step_mdp_speed[True-True-True-True-True] 0.1700ms 32.6899μs 30.5905 KOps/s 29.5518 KOps/s $\color{#35bf28}+3.52\%$
test_step_mdp_speed[True-True-True-True-False] 0.1110ms 19.3089μs 51.7896 KOps/s 49.1203 KOps/s $\textbf{\color{#35bf28}+5.43\%}$
test_step_mdp_speed[True-True-True-False-True] 49.5700μs 18.1182μs 55.1932 KOps/s 51.7975 KOps/s $\textbf{\color{#35bf28}+6.56\%}$
test_step_mdp_speed[True-True-True-False-False] 45.3210μs 10.9817μs 91.0603 KOps/s 85.9608 KOps/s $\textbf{\color{#35bf28}+5.93\%}$
test_step_mdp_speed[True-True-False-True-True] 80.5910μs 34.0394μs 29.3777 KOps/s 27.6917 KOps/s $\textbf{\color{#35bf28}+6.09\%}$
test_step_mdp_speed[True-True-False-True-False] 54.6010μs 21.2767μs 46.9997 KOps/s 44.0163 KOps/s $\textbf{\color{#35bf28}+6.78\%}$
test_step_mdp_speed[True-True-False-False-True] 0.1112ms 20.2820μs 49.3048 KOps/s 46.6970 KOps/s $\textbf{\color{#35bf28}+5.58\%}$
test_step_mdp_speed[True-True-False-False-False] 45.8800μs 12.8254μs 77.9704 KOps/s 72.4297 KOps/s $\textbf{\color{#35bf28}+7.65\%}$
test_step_mdp_speed[True-False-True-True-True] 73.7810μs 36.1765μs 27.6423 KOps/s 25.7726 KOps/s $\textbf{\color{#35bf28}+7.25\%}$
test_step_mdp_speed[True-False-True-True-False] 53.8000μs 23.2606μs 42.9911 KOps/s 40.0799 KOps/s $\textbf{\color{#35bf28}+7.26\%}$
test_step_mdp_speed[True-False-True-False-True] 47.3800μs 20.0645μs 49.8392 KOps/s 46.4065 KOps/s $\textbf{\color{#35bf28}+7.40\%}$
test_step_mdp_speed[True-False-True-False-False] 33.8400μs 12.8496μs 77.8236 KOps/s 72.0219 KOps/s $\textbf{\color{#35bf28}+8.06\%}$
test_step_mdp_speed[True-False-False-True-True] 67.1710μs 38.2563μs 26.1395 KOps/s 25.2276 KOps/s $\color{#35bf28}+3.61\%$
test_step_mdp_speed[True-False-False-True-False] 48.6800μs 24.5995μs 40.6513 KOps/s 37.7387 KOps/s $\textbf{\color{#35bf28}+7.72\%}$
test_step_mdp_speed[True-False-False-False-True] 48.9500μs 21.7054μs 46.0714 KOps/s 42.8631 KOps/s $\textbf{\color{#35bf28}+7.48\%}$
test_step_mdp_speed[True-False-False-False-False] 52.4610μs 14.5897μs 68.5417 KOps/s 63.9903 KOps/s $\textbf{\color{#35bf28}+7.11\%}$
test_step_mdp_speed[False-True-True-True-True] 64.0510μs 36.2144μs 27.6134 KOps/s 26.2514 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_step_mdp_speed[False-True-True-True-False] 99.8210μs 23.0400μs 43.4029 KOps/s 40.0244 KOps/s $\textbf{\color{#35bf28}+8.44\%}$
test_step_mdp_speed[False-True-True-False-True] 75.5610μs 24.5256μs 40.7738 KOps/s 39.0041 KOps/s $\color{#35bf28}+4.54\%$
test_step_mdp_speed[False-True-True-False-False] 39.6000μs 14.6698μs 68.1671 KOps/s 63.9112 KOps/s $\textbf{\color{#35bf28}+6.66\%}$
test_step_mdp_speed[False-True-False-True-True] 74.9610μs 38.1995μs 26.1783 KOps/s 24.9072 KOps/s $\textbf{\color{#35bf28}+5.10\%}$
test_step_mdp_speed[False-True-False-True-False] 69.7310μs 25.2059μs 39.6733 KOps/s 37.2105 KOps/s $\textbf{\color{#35bf28}+6.62\%}$
test_step_mdp_speed[False-True-False-False-True] 53.9100μs 26.1285μs 38.2723 KOps/s 36.2872 KOps/s $\textbf{\color{#35bf28}+5.47\%}$
test_step_mdp_speed[False-True-False-False-False] 42.2310μs 16.3440μs 61.1845 KOps/s 56.9246 KOps/s $\textbf{\color{#35bf28}+7.48\%}$
test_step_mdp_speed[False-False-True-True-True] 59.6210μs 39.5744μs 25.2689 KOps/s 23.1615 KOps/s $\textbf{\color{#35bf28}+9.10\%}$
test_step_mdp_speed[False-False-True-True-False] 67.3510μs 27.1734μs 36.8007 KOps/s 34.4280 KOps/s $\textbf{\color{#35bf28}+6.89\%}$
test_step_mdp_speed[False-False-True-False-True] 44.3610μs 25.5616μs 39.1212 KOps/s 36.6361 KOps/s $\textbf{\color{#35bf28}+6.78\%}$
test_step_mdp_speed[False-False-True-False-False] 34.7400μs 16.3853μs 61.0302 KOps/s 56.5901 KOps/s $\textbf{\color{#35bf28}+7.85\%}$
test_step_mdp_speed[False-False-False-True-True] 64.2910μs 40.3497μs 24.7833 KOps/s 22.5205 KOps/s $\textbf{\color{#35bf28}+10.05\%}$
test_step_mdp_speed[False-False-False-True-False] 58.4510μs 28.6270μs 34.9321 KOps/s 32.2654 KOps/s $\textbf{\color{#35bf28}+8.26\%}$
test_step_mdp_speed[False-False-False-False-True] 61.0010μs 27.3456μs 36.5689 KOps/s 34.1400 KOps/s $\textbf{\color{#35bf28}+7.11\%}$
test_step_mdp_speed[False-False-False-False-False] 35.7610μs 18.0122μs 55.5178 KOps/s 51.2051 KOps/s $\textbf{\color{#35bf28}+8.42\%}$
test_values[generalized_advantage_estimate-True-True] 24.3225ms 23.9934ms 41.6781 Ops/s 37.8968 Ops/s $\textbf{\color{#35bf28}+9.98\%}$
test_values[vec_generalized_advantage_estimate-True-True] 93.1371ms 3.4289ms 291.6352 Ops/s 296.3627 Ops/s $\color{#d91a1a}-1.60\%$
test_values[td0_return_estimate-False-False] 95.8410μs 61.4260μs 16.2798 KOps/s 15.6007 KOps/s $\color{#35bf28}+4.35\%$
test_values[td1_return_estimate-False-False] 52.1002ms 51.8469ms 19.2876 Ops/s 17.6024 Ops/s $\textbf{\color{#35bf28}+9.57\%}$
test_values[vec_td1_return_estimate-False-False] 2.0687ms 1.7554ms 569.6595 Ops/s 557.4034 Ops/s $\color{#35bf28}+2.20\%$
test_values[td_lambda_return_estimate-True-False] 84.9594ms 83.2015ms 12.0190 Ops/s 11.0803 Ops/s $\textbf{\color{#35bf28}+8.47\%}$
test_values[vec_td_lambda_return_estimate-True-False] 2.0759ms 1.7501ms 571.3971 Ops/s 564.3705 Ops/s $\color{#35bf28}+1.25\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 23.0294ms 22.8396ms 43.7837 Ops/s 42.6230 Ops/s $\color{#35bf28}+2.72\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.8645ms 0.6900ms 1.4494 KOps/s 1.4086 KOps/s $\color{#35bf28}+2.89\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.8057ms 0.6451ms 1.5501 KOps/s 1.5096 KOps/s $\color{#35bf28}+2.68\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.6581ms 1.4489ms 690.1751 Ops/s 681.4311 Ops/s $\color{#35bf28}+1.28\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.9615ms 0.6653ms 1.5030 KOps/s 1.4148 KOps/s $\textbf{\color{#35bf28}+6.24\%}$
test_dqn_speed 10.2954ms 7.0096ms 142.6611 Ops/s 126.3879 Ops/s $\textbf{\color{#35bf28}+12.88\%}$
test_ddpg_speed 14.8480ms 13.8352ms 72.2792 Ops/s 70.2525 Ops/s $\color{#35bf28}+2.88\%$
test_sac_speed 28.9942ms 28.0646ms 35.6321 Ops/s 34.4836 Ops/s $\color{#35bf28}+3.33\%$
test_redq_speed 49.3680ms 45.9951ms 21.7414 Ops/s 21.1068 Ops/s $\color{#35bf28}+3.01\%$
test_redq_deprec_speed 24.6509ms 23.6741ms 42.2403 Ops/s 41.5111 Ops/s $\color{#35bf28}+1.76\%$
test_td3_speed 28.5292ms 19.1736ms 52.1552 Ops/s 50.3675 Ops/s $\color{#35bf28}+3.55\%$
test_cql_speed 83.0395ms 81.2526ms 12.3073 Ops/s 11.9779 Ops/s $\color{#35bf28}+2.75\%$
test_a2c_speed 26.7881ms 25.8606ms 38.6688 Ops/s 37.4761 Ops/s $\color{#35bf28}+3.18\%$
test_ppo_speed 26.8899ms 26.1731ms 38.2072 Ops/s 37.0365 Ops/s $\color{#35bf28}+3.16\%$
test_reinforce_speed 25.8884ms 24.9348ms 40.1046 Ops/s 38.9215 Ops/s $\color{#35bf28}+3.04\%$
test_iql_speed 57.8844ms 56.1449ms 17.8111 Ops/s 17.3312 Ops/s $\color{#35bf28}+2.77\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.5376ms 1.7944ms 557.2946 Ops/s 534.0968 Ops/s $\color{#35bf28}+4.34\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9777ms 0.8379ms 1.1935 KOps/s 1.1810 KOps/s $\color{#35bf28}+1.06\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.9646ms 0.8162ms 1.2252 KOps/s 1.2155 KOps/s $\color{#35bf28}+0.79\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.4382ms 1.7659ms 566.2949 Ops/s 541.0012 Ops/s $\color{#35bf28}+4.68\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9766ms 0.8251ms 1.2119 KOps/s 1.1970 KOps/s $\color{#35bf28}+1.24\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.9648ms 0.8056ms 1.2413 KOps/s 1.2286 KOps/s $\color{#35bf28}+1.03\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 2.9970ms 2.0377ms 490.7562 Ops/s 422.5166 Ops/s $\textbf{\color{#35bf28}+16.15\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1448ms 0.9533ms 1.0490 KOps/s 1.0330 KOps/s $\color{#35bf28}+1.55\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0795ms 0.9328ms 1.0720 KOps/s 1.0631 KOps/s $\color{#35bf28}+0.84\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 2.6003ms 1.8106ms 552.2976 Ops/s 467.3063 Ops/s $\textbf{\color{#35bf28}+18.19\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9848ms 0.8378ms 1.1936 KOps/s 1.1822 KOps/s $\color{#35bf28}+0.96\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.9498ms 0.8152ms 1.2267 KOps/s 1.2159 KOps/s $\color{#35bf28}+0.89\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 2.5740ms 1.7798ms 561.8585 Ops/s 542.7542 Ops/s $\color{#35bf28}+3.52\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9636ms 0.8252ms 1.2119 KOps/s 1.2026 KOps/s $\color{#35bf28}+0.77\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 1.0473ms 0.8037ms 1.2443 KOps/s 1.2315 KOps/s $\color{#35bf28}+1.04\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 2.8037ms 2.0497ms 487.8843 Ops/s 477.9101 Ops/s $\color{#35bf28}+2.09\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1051ms 0.9567ms 1.0452 KOps/s 901.2331 Ops/s $\textbf{\color{#35bf28}+15.98\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0938ms 0.9316ms 1.0734 KOps/s 1.0622 KOps/s $\color{#35bf28}+1.06\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1304s 10.2111ms 97.9322 Ops/s 99.7935 Ops/s $\color{#d91a1a}-1.87\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 0.1230s 15.8483ms 63.0984 Ops/s 71.5408 Ops/s $\textbf{\color{#d91a1a}-11.80\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.3816ms 3.2881ms 304.1247 Ops/s 298.5713 Ops/s $\color{#35bf28}+1.86\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1182s 9.8552ms 101.4691 Ops/s 101.1856 Ops/s $\color{#35bf28}+0.28\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 15.8989ms 13.4841ms 74.1614 Ops/s 71.9419 Ops/s $\color{#35bf28}+3.09\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.7970ms 3.3133ms 301.8119 Ops/s 297.1087 Ops/s $\color{#35bf28}+1.58\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1204s 10.2196ms 97.8512 Ops/s 80.9555 Ops/s $\textbf{\color{#35bf28}+20.87\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 16.3274ms 13.8004ms 72.4616 Ops/s 70.3122 Ops/s $\color{#35bf28}+3.06\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 5.2217ms 3.4654ms 288.5684 Ops/s 280.8510 Ops/s $\color{#35bf28}+2.75\%$

@vmoens vmoens added the enhancement New feature or request label Jan 15, 2024
@vmoens
Contributor Author

vmoens commented Jan 17, 2024

I made some pretty fun improvements: you can now choose which entries are treated as info and which as observations, and the converter also works with stateless envs.

Do you guys think we should have an option to collapse the observation dictionary if there is a single observation in it?
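
As a rough illustration of the two ideas in this comment, here is a plain-Python sketch that uses dicts rather than tensordicts. The function `split_obs_info` and its `collapse_single` flag are hypothetical and only show the behaviour being discussed; they are not the code in this PR.

```python
def split_obs_info(step_output, info_keys, collapse_single=False):
    """Split a flat dict into (observation, info) by key name."""
    info = {k: v for k, v in step_output.items() if k in info_keys}
    obs = {k: v for k, v in step_output.items() if k not in info_keys}
    if collapse_single and len(obs) == 1:
        # Return the lone observation directly instead of a one-entry dict.
        (obs,) = obs.values()
    return obs, info


out = {"pixels": [0, 1, 2], "episode_length": 7}
obs_dict, info = split_obs_info(out, info_keys={"episode_length"})
obs_flat, _ = split_obs_info(out, info_keys={"episode_length"}, collapse_single=True)
```

With collapsing enabled, the return type depends on how many observation entries remain, which is convenient for single-observation envs but makes the output type less predictable.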

@vmoens vmoens added the Data Data-related PR, will launch data-related jobs label Jan 18, 2024
@vmoens vmoens linked an issue Jan 18, 2024 that may be closed by this pull request
@vmoens
Contributor Author

vmoens commented Jan 19, 2024

I had to add a new transform (RemoveEmptySpecs) to remove empty specs and tensordicts from an env's output.

This might be relevant to MARL folks (@matteobettini) since they have highly nested structures.
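
The effect on a nested structure can be sketched with plain dicts. This is only an analogy under stated assumptions: `remove_empty` is a hypothetical helper, and the real transform operates on specs and tensordicts rather than Python dicts.

```python
def remove_empty(tree):
    """Return a copy of a nested dict with empty sub-dicts pruned recursively."""
    pruned = {}
    for key, value in tree.items():
        if isinstance(value, dict):
            value = remove_empty(value)
            if not value:
                # Drop branches that are empty, or became empty after pruning.
                continue
        pruned[key] = value
    return pruned


nested = {"agents": {"obs": [1.0], "extra": {"unused": {}}}, "empty": {}}
cleaned = remove_empty(nested)
```

Note that a branch containing only empty children is removed as well, which matters for deeply nested multi-agent layouts.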

@duburcqa
Contributor

If I remember correctly, this was a feature I requested at some point!

@duburcqa
Contributor

Here we are: #1200

@vmoens
Contributor Author

vmoens commented Jan 19, 2024

Yes, I remembered that too, but I can't find it haha

@vmoens
Contributor Author

vmoens commented Jan 19, 2024

Yep, thanks!
Interestingly, I encountered the same problem with SelectTransform, so I thought a single solution for all cases was preferable.

@vmoens vmoens merged commit c3ffb5a into main Jan 19, 2024
59 of 63 checks passed
@vmoens vmoens deleted the torchrl-to-gym-env branch January 19, 2024 17:03
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Data: Data-related PR, will launch data-related jobs
enhancement: New feature or request
Development

Successfully merging this pull request may close these issues.

[Feature Request] TorchRL to gym API
[Feature Request] CatTensors could delete now empty nested tensordicts
3 participants