[Feature] Execute rollouts with regular nn.Module instances #1947

Merged
merged 2 commits into from
Feb 22, 2024

@vmoens (Contributor) commented Feb 22, 2024

From the docstrings:

            >>> from torch import nn
            >>> from torchrl.envs import GymEnv
            >>> env = GymEnv("CartPole-v1", categorical_action_encoding=True)
            >>> class ArgMaxModule(nn.Module):
            ...     def forward(self, values):
            ...         return values.argmax(-1)
            >>> n_obs = env.observation_spec["observation"].shape[-1]
            >>> n_act = env.action_spec.n
            >>> # A deterministic policy
            >>> policy = nn.Sequential(
            ...     nn.Linear(n_obs, n_act),
            ...     ArgMaxModule())
            >>> env.rollout(max_steps=10, policy=policy)
            TensorDict(
                fields={
                    action: Tensor(shape=torch.Size([10]), device=cpu, dtype=torch.int64, is_shared=False),
                    done: Tensor(shape=torch.Size([10, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                    next: TensorDict(
                        fields={
                            done: Tensor(shape=torch.Size([10, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                            observation: Tensor(shape=torch.Size([10, 4]), device=cpu, dtype=torch.float32, is_shared=False),
                            reward: Tensor(shape=torch.Size([10, 1]), device=cpu, dtype=torch.float32, is_shared=False),
                            terminated: Tensor(shape=torch.Size([10, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                            truncated: Tensor(shape=torch.Size([10, 1]), device=cpu, dtype=torch.bool, is_shared=False)},
                        batch_size=torch.Size([10]),
                        device=cpu,
                        is_shared=False),
                    observation: Tensor(shape=torch.Size([10, 4]), device=cpu, dtype=torch.float32, is_shared=False),
                    terminated: Tensor(shape=torch.Size([10, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                    truncated: Tensor(shape=torch.Size([10, 1]), device=cpu, dtype=torch.bool, is_shared=False)},
                batch_size=torch.Size([10]),
                device=cpu,
                is_shared=False)
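
The docstring shows the user-facing behavior but not how a plain module gets connected to the environment's keyed data. One way this could work is to read the callable's signature and map its arguments onto observation entries. The following is a minimal sketch in plain Python with dictionaries standing in for TensorDicts. It is not TorchRL's implementation, and `wrap_plain_policy`, `obs_keys` and `action_key` are hypothetical names introduced for illustration.

```python
import inspect
from typing import Any, Callable, Dict, Sequence


def wrap_plain_policy(
    policy: Callable[..., Any],
    obs_keys: Sequence[str],
    action_key: str = "action",
) -> Callable[[Dict[str, Any]], Dict[str, Any]]:
    """Hypothetical helper: adapt a plain callable to a dict-in/dict-out policy."""
    params = list(inspect.signature(policy).parameters)
    if len(params) == 1 and len(obs_keys) == 1:
        # Single argument and single observation: pair them regardless of name.
        in_keys = list(obs_keys)
    else:
        # Otherwise require argument names to match observation keys.
        missing = [p for p in params if p not in obs_keys]
        if missing:
            raise KeyError(f"no observation entry for arguments: {missing}")
        in_keys = params

    def dict_policy(data: Dict[str, Any]) -> Dict[str, Any]:
        out = dict(data)  # shallow copy so the input is left untouched
        out[action_key] = policy(*(data[k] for k in in_keys))
        return out

    return dict_policy


def argmax_policy(values):
    # Plain-Python stand-in for the ArgMaxModule in the docstring.
    return max(range(len(values)), key=values.__getitem__)


step = wrap_plain_policy(argmax_policy, obs_keys=["observation"])
result = step({"observation": [0.1, 0.7, 0.2]})
print(result["action"])  # 1
```

The wrapped policy writes its output under the action key and leaves the observation in place, which mirrors the `action` and `observation` entries visible in the rollout output above.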


pytorch-bot bot commented Feb 22, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1947

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Merge Blocking SEV

There is 1 active merge-blocking SEV. Please view it below:

If you must merge, use @pytorchbot merge -f.

❌ 1 New Failure, 2 Unrelated Failures

As of commit 85811dc with merge base bb44067:

NEW FAILURE - The following job has failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Feb 22, 2024
@vmoens added the enhancement label Feb 22, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 89. Improved: 5. Worsened: 26.

Detailed results:
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_single | 0.1144s | 66.1941ms | 15.1071 Ops/s | 15.7600 Ops/s | -4.14% |
| test_sync | 34.8505ms | 33.4726ms | 29.8752 Ops/s | 29.3420 Ops/s | +1.82% |
| test_async | 60.9826ms | 30.3389ms | 32.9610 Ops/s | 31.2846 Ops/s | **+5.36%** |
| test_simple | 0.4304s | 0.4235s | 2.3612 Ops/s | 2.3036 Ops/s | +2.50% |
| test_transformed | 0.6365s | 0.5867s | 1.7045 Ops/s | 1.7215 Ops/s | -0.99% |
| test_serial | 1.4896s | 1.4371s | 0.6958 Ops/s | 0.6987 Ops/s | -0.42% |
| test_parallel | 1.4888s | 1.4403s | 0.6943 Ops/s | 0.7143 Ops/s | -2.80% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1392ms | 22.1360μs | 45.1754 KOps/s | 47.6158 KOps/s | **-5.13%** |
| test_step_mdp_speed[True-True-True-True-False] | 41.7370μs | 13.6094μs | 73.4787 KOps/s | 77.6584 KOps/s | **-5.38%** |
| test_step_mdp_speed[True-True-True-False-True] | 39.0120μs | 12.9035μs | 77.4983 KOps/s | 81.5350 KOps/s | -4.95% |
| test_step_mdp_speed[True-True-True-False-False] | 28.7640μs | 7.9069μs | 126.4721 KOps/s | 134.4262 KOps/s | **-5.92%** |
| test_step_mdp_speed[True-True-False-True-True] | 53.4900μs | 23.4785μs | 42.5922 KOps/s | 44.6268 KOps/s | -4.56% |
| test_step_mdp_speed[True-True-False-True-False] | 44.3320μs | 14.7257μs | 67.9086 KOps/s | 70.8237 KOps/s | -4.12% |
| test_step_mdp_speed[True-True-False-False-True] | 74.2480μs | 14.2413μs | 70.2184 KOps/s | 73.6860 KOps/s | -4.71% |
| test_step_mdp_speed[True-True-False-False-False] | 35.3250μs | 9.0974μs | 109.9215 KOps/s | 115.1041 KOps/s | -4.50% |
| test_step_mdp_speed[True-False-True-True-True] | 63.0770μs | 24.7489μs | 40.4059 KOps/s | 42.0919 KOps/s | -4.01% |
| test_step_mdp_speed[True-False-True-True-False] | 41.7380μs | 16.2650μs | 61.4817 KOps/s | 64.8420 KOps/s | **-5.18%** |
| test_step_mdp_speed[True-False-True-False-True] | 44.2720μs | 14.2767μs | 70.0443 KOps/s | 75.4556 KOps/s | **-7.17%** |
| test_step_mdp_speed[True-False-True-False-False] | 33.5730μs | 9.2088μs | 108.5914 KOps/s | 114.7907 KOps/s | **-5.40%** |
| test_step_mdp_speed[True-False-False-True-True] | 66.2120μs | 25.8735μs | 38.6496 KOps/s | 40.7498 KOps/s | **-5.15%** |
| test_step_mdp_speed[True-False-False-True-False] | 46.2960μs | 17.5658μs | 56.9287 KOps/s | 60.5984 KOps/s | **-6.06%** |
| test_step_mdp_speed[True-False-False-False-True] | 48.5100μs | 15.4307μs | 64.8057 KOps/s | 69.0324 KOps/s | **-6.12%** |
| test_step_mdp_speed[True-False-False-False-False] | 38.2510μs | 10.5491μs | 94.7947 KOps/s | 102.2992 KOps/s | **-7.34%** |
| test_step_mdp_speed[False-True-True-True-True] | 58.5490μs | 24.7059μs | 40.4761 KOps/s | 42.5244 KOps/s | -4.82% |
| test_step_mdp_speed[False-True-True-True-False] | 41.4470μs | 16.2286μs | 61.6197 KOps/s | 65.0929 KOps/s | **-5.34%** |
| test_step_mdp_speed[False-True-True-False-True] | 58.5390μs | 16.3222μs | 61.2662 KOps/s | 63.9225 KOps/s | -4.16% |
| test_step_mdp_speed[False-True-True-False-False] | 34.2640μs | 10.4593μs | 95.6086 KOps/s | 101.1437 KOps/s | **-5.47%** |
| test_step_mdp_speed[False-True-False-True-True] | 41.3370μs | 26.2627μs | 38.0768 KOps/s | 40.2253 KOps/s | **-5.34%** |
| test_step_mdp_speed[False-True-False-True-False] | 49.7820μs | 17.5209μs | 57.0747 KOps/s | 60.5158 KOps/s | **-5.69%** |
| test_step_mdp_speed[False-True-False-False-True] | 42.0480μs | 17.5571μs | 56.9570 KOps/s | 59.8112 KOps/s | -4.77% |
| test_step_mdp_speed[False-True-False-False-False] | 34.2330μs | 11.6679μs | 85.7055 KOps/s | 90.7427 KOps/s | **-5.55%** |
| test_step_mdp_speed[False-False-True-True-True] | 59.5810μs | 27.4369μs | 36.4472 KOps/s | 38.7072 KOps/s | **-5.84%** |
| test_step_mdp_speed[False-False-True-True-False] | 65.1940μs | 18.7144μs | 53.4348 KOps/s | 56.4439 KOps/s | **-5.33%** |
| test_step_mdp_speed[False-False-True-False-True] | 40.9970μs | 17.5354μs | 57.0274 KOps/s | 59.9059 KOps/s | -4.81% |
| test_step_mdp_speed[False-False-True-False-False] | 39.9050μs | 11.5131μs | 86.8573 KOps/s | 90.2829 KOps/s | -3.79% |
| test_step_mdp_speed[False-False-False-True-True] | 65.1000μs | 28.5673μs | 35.0050 KOps/s | 37.0749 KOps/s | **-5.58%** |
| test_step_mdp_speed[False-False-False-True-False] | 84.4570μs | 19.8191μs | 50.4563 KOps/s | 52.9230 KOps/s | -4.66% |
| test_step_mdp_speed[False-False-False-False-True] | 50.0120μs | 18.8092μs | 53.1655 KOps/s | 56.9104 KOps/s | **-6.58%** |
| test_step_mdp_speed[False-False-False-False-False] | 37.3290μs | 12.7410μs | 78.4870 KOps/s | 82.7690 KOps/s | **-5.17%** |
| test_values[generalized_advantage_estimate-True-True] | 10.6218ms | 9.5593ms | 104.6105 Ops/s | 107.3220 Ops/s | -2.53% |
| test_values[vec_generalized_advantage_estimate-True-True] | 37.5892ms | 35.2701ms | 28.3526 Ops/s | 29.9759 Ops/s | **-5.42%** |
| test_values[td0_return_estimate-False-False] | 0.2427ms | 0.1640ms | 6.0974 KOps/s | 5.9476 KOps/s | +2.52% |
| test_values[td1_return_estimate-False-False] | 25.0059ms | 23.8909ms | 41.8570 Ops/s | 42.9685 Ops/s | -2.59% |
| test_values[vec_td1_return_estimate-False-False] | 38.4730ms | 35.7666ms | 27.9590 Ops/s | 30.0250 Ops/s | **-6.88%** |
| test_values[td_lambda_return_estimate-True-False] | 37.2125ms | 34.4082ms | 29.0629 Ops/s | 29.8439 Ops/s | -2.62% |
| test_values[vec_td_lambda_return_estimate-True-False] | 36.9255ms | 35.7774ms | 27.9506 Ops/s | 30.0767 Ops/s | **-7.07%** |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 10.4528ms | 8.4104ms | 118.9002 Ops/s | 122.2409 Ops/s | -2.73% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.4224ms | 1.9203ms | 520.7557 Ops/s | 516.5813 Ops/s | +0.81% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5279ms | 0.3482ms | 2.8722 KOps/s | 2.8730 KOps/s | -0.03% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 45.8837ms | 44.0891ms | 22.6814 Ops/s | 21.0028 Ops/s | **+7.99%** |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 3.7640ms | 3.0064ms | 332.6190 Ops/s | 331.3688 Ops/s | +0.38% |
| test_dqn_speed | 68.0522ms | 1.5025ms | 665.5614 Ops/s | 723.7752 Ops/s | **-8.04%** |
| test_ddpg_speed | 4.5309ms | 2.8332ms | 352.9616 Ops/s | 360.4319 Ops/s | -2.07% |
| test_sac_speed | 9.1274ms | 8.4810ms | 117.9105 Ops/s | 121.1094 Ops/s | -2.64% |
| test_redq_speed | 14.2661ms | 13.2217ms | 75.6331 Ops/s | 76.2040 Ops/s | -0.75% |
| test_redq_deprec_speed | 13.9939ms | 13.2021ms | 75.7456 Ops/s | 76.0732 Ops/s | -0.43% |
| test_td3_speed | 11.0477ms | 8.4743ms | 118.0034 Ops/s | 120.5656 Ops/s | -2.13% |
| test_cql_speed | 38.1316ms | 36.6191ms | 27.3081 Ops/s | 27.7477 Ops/s | -1.58% |
| test_a2c_speed | 9.5971ms | 7.4225ms | 134.7259 Ops/s | 135.7005 Ops/s | -0.72% |
| test_ppo_speed | 8.4723ms | 7.6458ms | 130.7909 Ops/s | 129.0866 Ops/s | +1.32% |
| test_reinforce_speed | 7.1155ms | 6.5738ms | 152.1197 Ops/s | 152.0719 Ops/s | +0.03% |
| test_iql_speed | 34.3651ms | 32.7334ms | 30.5499 Ops/s | 28.6439 Ops/s | **+6.65%** |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.3742ms | 2.9300ms | 341.2997 Ops/s | 360.1116 Ops/s | **-5.22%** |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6300ms | 0.5142ms | 1.9449 KOps/s | 1.9667 KOps/s | -1.11% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.9330ms | 0.4884ms | 2.0474 KOps/s | 1.8484 KOps/s | **+10.77%** |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.3171ms | 2.8249ms | 353.9934 Ops/s | 367.4220 Ops/s | -3.65% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.8694ms | 0.5068ms | 1.9733 KOps/s | 1.9686 KOps/s | +0.24% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6572ms | 0.4802ms | 2.0826 KOps/s | 2.0728 KOps/s | +0.48% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.2689ms | 2.9728ms | 336.3802 Ops/s | 349.3420 Ops/s | -3.71% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.7417ms | 0.6286ms | 1.5908 KOps/s | 1.5967 KOps/s | -0.37% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9031ms | 0.6055ms | 1.6516 KOps/s | 1.6731 KOps/s | -1.29% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 4.0586ms | 2.7862ms | 358.9124 Ops/s | 364.6666 Ops/s | -1.58% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7780ms | 0.5165ms | 1.9361 KOps/s | 1.9667 KOps/s | -1.56% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7877ms | 0.4874ms | 2.0517 KOps/s | 2.0839 KOps/s | -1.55% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.1062ms | 2.7838ms | 359.2174 Ops/s | 359.3269 Ops/s | -0.03% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7921ms | 0.5109ms | 1.9574 KOps/s | 1.9665 KOps/s | -0.46% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8562ms | 0.4823ms | 2.0734 KOps/s | 2.0919 KOps/s | -0.89% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.2868ms | 2.9427ms | 339.8243 Ops/s | 346.2238 Ops/s | -1.85% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9041ms | 0.6370ms | 1.5700 KOps/s | 1.5950 KOps/s | -1.57% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8036ms | 0.6027ms | 1.6593 KOps/s | 1.6661 KOps/s | -0.41% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1074s | 8.0019ms | 124.9710 Ops/s | 105.9414 Ops/s | **+17.96%** |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 15.3589ms | 13.3359ms | 74.9854 Ops/s | 76.7475 Ops/s | -2.30% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 4.3452ms | 2.5777ms | 387.9417 Ops/s | 395.0994 Ops/s | -1.81% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 92.9239ms | 9.4014ms | 106.3675 Ops/s | 110.7870 Ops/s | -3.99% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 16.4754ms | 13.3779ms | 74.7500 Ops/s | 77.3534 Ops/s | -3.37% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 4.1384ms | 2.5322ms | 394.9073 Ops/s | 396.6289 Ops/s | -0.43% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 90.2236ms | 9.4988ms | 105.2762 Ops/s | 128.8100 Ops/s | **-18.27%** |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 15.8369ms | 13.6026ms | 73.5152 Ops/s | 75.7281 Ops/s | -2.92% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 4.8260ms | 2.8092ms | 355.9709 Ops/s | 365.3596 Ops/s | -2.57% |
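
The columns above are related by simple arithmetic, which helps when reading the table. Checking against the reported rows, Ops is the reciprocal of Mean, and Change is the relative difference between Ops and Ops on Repo HEAD. The sketch below reproduces the `test_single` row; the function names are illustrative and are not taken from the benchmark tooling.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime, in operations per second."""
    return 1.0 / mean_seconds


def percent_change(ops: float, ops_on_head: float) -> float:
    """Relative throughput change versus repo HEAD, in percent."""
    return (ops - ops_on_head) / ops_on_head * 100.0


# Values copied from the CPU test_single row.
mean_s = 66.1941e-3   # Mean: 66.1941ms
ops = 15.1071         # Ops
ops_head = 15.7600    # Ops on Repo HEAD

print(round(ops_per_second(mean_s), 2))         # 15.11
print(round(percent_change(ops, ops_head), 2))  # -4.14
```

A negative Change therefore means fewer operations per second on this branch than on HEAD.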


⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 92. Improved: 3. Worsened: 21.

Detailed results:
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_single | 0.1186s | 0.1171s | 8.5370 Ops/s | 8.4152 Ops/s | +1.45% |
| test_sync | 95.9867ms | 95.7316ms | 10.4459 Ops/s | 10.4080 Ops/s | +0.36% |
| test_async | 0.1811s | 91.7008ms | 10.9050 Ops/s | 10.9231 Ops/s | -0.17% |
| test_single_pixels | 0.2008s | 0.1378s | 7.2592 Ops/s | 7.0928 Ops/s | +2.35% |
| test_sync_pixels | 82.6125ms | 79.1135ms | 12.6401 Ops/s | 11.9992 Ops/s | **+5.34%** |
| test_async_pixels | 0.1542s | 75.5559ms | 13.2352 Ops/s | 15.1696 Ops/s | **-12.75%** |
| test_simple | 0.9014s | 0.8416s | 1.1882 Ops/s | 1.2140 Ops/s | -2.12% |
| test_transformed | 1.1322s | 1.0740s | 0.9311 Ops/s | 0.9569 Ops/s | -2.69% |
| test_serial | 2.5633s | 2.4986s | 0.4002 Ops/s | 0.4184 Ops/s | -4.34% |
| test_parallel | 2.1854s | 2.1103s | 0.4739 Ops/s | 0.4835 Ops/s | -1.98% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1187ms | 34.0428μs | 29.3748 KOps/s | 31.0470 KOps/s | **-5.39%** |
| test_step_mdp_speed[True-True-True-True-False] | 94.8520μs | 20.5684μs | 48.6182 KOps/s | 51.7806 KOps/s | **-6.11%** |
| test_step_mdp_speed[True-True-True-False-True] | 37.2200μs | 19.0102μs | 52.6033 KOps/s | 54.6073 KOps/s | -3.67% |
| test_step_mdp_speed[True-True-True-False-False] | 30.0600μs | 11.4858μs | 87.0644 KOps/s | 91.2198 KOps/s | -4.56% |
| test_step_mdp_speed[True-True-False-True-True] | 63.3500μs | 35.6632μs | 28.0401 KOps/s | 29.6094 KOps/s | **-5.30%** |
| test_step_mdp_speed[True-True-False-True-False] | 47.5010μs | 22.3738μs | 44.6951 KOps/s | 47.5042 KOps/s | **-5.91%** |
| test_step_mdp_speed[True-True-False-False-True] | 39.4200μs | 21.2666μs | 47.0222 KOps/s | 50.1090 KOps/s | **-6.16%** |
| test_step_mdp_speed[True-True-False-False-False] | 91.2920μs | 13.4425μs | 74.3912 KOps/s | 77.6206 KOps/s | -4.16% |
| test_step_mdp_speed[True-False-True-True-True] | 58.9410μs | 38.3747μs | 26.0588 KOps/s | 27.7556 KOps/s | **-6.11%** |
| test_step_mdp_speed[True-False-True-True-False] | 40.7110μs | 24.1077μs | 41.4805 KOps/s | 43.6428 KOps/s | -4.95% |
| test_step_mdp_speed[True-False-True-False-True] | 44.6710μs | 20.8144μs | 48.0436 KOps/s | 49.9240 KOps/s | -3.77% |
| test_step_mdp_speed[True-False-True-False-False] | 37.5600μs | 13.5276μs | 73.9232 KOps/s | 78.2995 KOps/s | **-5.59%** |
| test_step_mdp_speed[True-False-False-True-True] | 0.1140ms | 39.8234μs | 25.1109 KOps/s | 26.4504 KOps/s | **-5.06%** |
| test_step_mdp_speed[True-False-False-True-False] | 44.5600μs | 25.9964μs | 38.4669 KOps/s | 40.1646 KOps/s | -4.23% |
| test_step_mdp_speed[True-False-False-False-True] | 38.7100μs | 23.2018μs | 43.1001 KOps/s | 46.3969 KOps/s | **-7.11%** |
| test_step_mdp_speed[True-False-False-False-False] | 33.1710μs | 15.4161μs | 64.8673 KOps/s | 68.9061 KOps/s | **-5.86%** |
| test_step_mdp_speed[False-True-True-True-True] | 70.7310μs | 37.9099μs | 26.3784 KOps/s | 27.6668 KOps/s | -4.66% |
| test_step_mdp_speed[False-True-True-True-False] | 40.8210μs | 24.1133μs | 41.4709 KOps/s | 43.4936 KOps/s | -4.65% |
| test_step_mdp_speed[False-True-True-False-True] | 41.4410μs | 25.1857μs | 39.7050 KOps/s | 42.3972 KOps/s | **-6.35%** |
| test_step_mdp_speed[False-True-True-False-False] | 34.6710μs | 15.4535μs | 64.7101 KOps/s | 68.1053 KOps/s | -4.99% |
| test_step_mdp_speed[False-True-False-True-True] | 65.6710μs | 40.0058μs | 24.9964 KOps/s | 26.4531 KOps/s | **-5.51%** |
| test_step_mdp_speed[False-True-False-True-False] | 62.9910μs | 26.1747μs | 38.2049 KOps/s | 40.0708 KOps/s | -4.66% |
| test_step_mdp_speed[False-True-False-False-True] | 45.0200μs | 26.9743μs | 37.0723 KOps/s | 38.9514 KOps/s | -4.82% |
| test_step_mdp_speed[False-True-False-False-False] | 0.1004ms | 17.3680μs | 57.5771 KOps/s | 60.8788 KOps/s | **-5.42%** |
| test_step_mdp_speed[False-False-True-True-True] | 59.7310μs | 42.1341μs | 23.7337 KOps/s | 25.0937 KOps/s | **-5.42%** |
| test_step_mdp_speed[False-False-True-True-False] | 46.8700μs | 28.1780μs | 35.4886 KOps/s | 37.5017 KOps/s | **-5.37%** |
| test_step_mdp_speed[False-False-True-False-True] | 44.9310μs | 26.6888μs | 37.4689 KOps/s | 39.7802 KOps/s | **-5.81%** |
| test_step_mdp_speed[False-False-True-False-False] | 42.1210μs | 17.3713μs | 57.5661 KOps/s | 61.0583 KOps/s | **-5.72%** |
| test_step_mdp_speed[False-False-False-True-True] | 0.1106ms | 43.1895μs | 23.1538 KOps/s | 24.2395 KOps/s | -4.48% |
| test_step_mdp_speed[False-False-False-True-False] | 44.9720μs | 29.4393μs | 33.9682 KOps/s | 34.9859 KOps/s | -2.91% |
| test_step_mdp_speed[False-False-False-False-True] | 53.6500μs | 28.5547μs | 35.0205 KOps/s | 36.5156 KOps/s | -4.09% |
| test_step_mdp_speed[False-False-False-False-False] | 36.2610μs | 19.1334μs | 52.2646 KOps/s | 55.2875 KOps/s | **-5.47%** |
| test_values[generalized_advantage_estimate-True-True] | 25.9632ms | 25.3650ms | 39.4245 Ops/s | 38.9485 Ops/s | +1.22% |
| test_values[vec_generalized_advantage_estimate-True-True] | 83.1797ms | 3.2285ms | 309.7377 Ops/s | 302.7205 Ops/s | +2.32% |
| test_values[td0_return_estimate-False-False] | 99.9710μs | 62.0926μs | 16.1050 KOps/s | 16.1788 KOps/s | -0.46% |
| test_values[td1_return_estimate-False-False] | 54.9633ms | 54.1766ms | 18.4581 Ops/s | 17.7214 Ops/s | +4.16% |
| test_values[vec_td1_return_estimate-False-False] | 2.0304ms | 1.7637ms | 566.9833 Ops/s | 563.1767 Ops/s | +0.68% |
| test_values[td_lambda_return_estimate-True-False] | 86.7040ms | 86.1501ms | 11.6076 Ops/s | 11.1181 Ops/s | +4.40% |
| test_values[vec_td_lambda_return_estimate-True-False] | 3.9411ms | 1.8031ms | 554.6078 Ops/s | 556.0554 Ops/s | -0.26% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 24.1600ms | 23.9095ms | 41.8243 Ops/s | 40.8093 Ops/s | +2.49% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.9021ms | 0.7050ms | 1.4185 KOps/s | 1.4169 KOps/s | +0.11% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7407ms | 0.6523ms | 1.5329 KOps/s | 1.5358 KOps/s | -0.19% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5808ms | 1.4600ms | 684.9409 Ops/s | 685.3395 Ops/s | -0.06% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.9630ms | 0.6812ms | 1.4679 KOps/s | 1.4835 KOps/s | -1.05% |
| test_dqn_speed | 75.9547ms | 1.6556ms | 604.0075 Ops/s | 689.7207 Ops/s | **-12.43%** |
| test_ddpg_speed | 3.3354ms | 2.9551ms | 338.3989 Ops/s | 349.7322 Ops/s | -3.24% |
| test_sac_speed | 9.2045ms | 8.5684ms | 116.7081 Ops/s | 123.5429 Ops/s | **-5.53%** |
| test_redq_speed | 11.6386ms | 10.7414ms | 93.0977 Ops/s | 95.8338 Ops/s | -2.86% |
| test_redq_deprec_speed | 12.0866ms | 11.4678ms | 87.2007 Ops/s | 85.2441 Ops/s | +2.30% |
| test_td3_speed | 8.6485ms | 8.4327ms | 118.5855 Ops/s | 121.8628 Ops/s | -2.69% |
| test_cql_speed | 27.5666ms | 26.3424ms | 37.9617 Ops/s | 39.0994 Ops/s | -2.91% |
| test_a2c_speed | 6.4062ms | 5.7863ms | 172.8224 Ops/s | 176.0469 Ops/s | -1.83% |
| test_ppo_speed | 6.2943ms | 6.0711ms | 164.7135 Ops/s | 164.9598 Ops/s | -0.15% |
| test_reinforce_speed | 5.7259ms | 4.6794ms | 213.7004 Ops/s | 211.9096 Ops/s | +0.85% |
| test_iql_speed | 24.3711ms | 20.4714ms | 48.8486 Ops/s | 48.8336 Ops/s | +0.03% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.9124ms | 3.7360ms | 267.6644 Ops/s | 271.5061 Ops/s | -1.41% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7308ms | 0.5849ms | 1.7098 KOps/s | 1.7295 KOps/s | -1.14% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7779ms | 0.5612ms | 1.7819 KOps/s | 1.8110 KOps/s | -1.61% |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.0326ms | 3.7783ms | 264.6714 Ops/s | 270.8122 Ops/s | -2.27% |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7479ms | 0.5749ms | 1.7394 KOps/s | 1.8105 KOps/s | -3.92% |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6876ms | 0.5478ms | 1.8253 KOps/s | 1.8944 KOps/s | -3.65% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.0107ms | 3.8574ms | 259.2435 Ops/s | 262.6511 Ops/s | -1.30% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9287ms | 0.7067ms | 1.4151 KOps/s | 1.4302 KOps/s | -1.06% |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8648ms | 0.6848ms | 1.4602 KOps/s | 1.4957 KOps/s | -2.37% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.8732ms | 3.7125ms | 269.3627 Ops/s | 270.8870 Ops/s | -0.56% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6964ms | 0.5824ms | 1.7169 KOps/s | 1.5067 KOps/s | **+13.95%** |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7245ms | 0.5548ms | 1.8025 KOps/s | 1.8460 KOps/s | -2.36% |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 4.0295ms | 3.7303ms | 268.0777 Ops/s | 270.3212 Ops/s | -0.83% |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7710ms | 0.5749ms | 1.7394 KOps/s | 1.7879 KOps/s | -2.71% |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7234ms | 0.5477ms | 1.8259 KOps/s | 1.8700 KOps/s | -2.36% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 4.0425ms | 3.8829ms | 257.5399 Ops/s | 261.1003 Ops/s | -1.36% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9046ms | 0.7224ms | 1.3842 KOps/s | 1.4513 KOps/s | -4.62% |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8711ms | 0.6952ms | 1.4385 KOps/s | 1.3265 KOps/s | **+8.44%** |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1245s | 9.9264ms | 100.7413 Ops/s | 104.8402 Ops/s | -3.91% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 18.5673ms | 16.3490ms | 61.1659 Ops/s | 62.8205 Ops/s | -2.63% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 6.3191ms | 3.1596ms | 316.4947 Ops/s | 323.3744 Ops/s | -2.13% |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1067s | 11.4693ms | 87.1895 Ops/s | 89.5815 Ops/s | -2.67% |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 18.7282ms | 16.3764ms | 61.0634 Ops/s | 62.5061 Ops/s | -2.31% |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 6.6860ms | 3.1373ms | 318.7408 Ops/s | 322.5405 Ops/s | -1.18% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1111s | 11.9369ms | 83.7738 Ops/s | 83.5803 Ops/s | +0.23% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 19.1041ms | 16.6838ms | 59.9385 Ops/s | 62.2982 Ops/s | -3.79% |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 7.0959ms | 3.4761ms | 287.6817 Ops/s | 295.6547 Ops/s | -2.70% |
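
The summary line does not say how "Improved" and "Worsened" are counted. The reported totals match the number of rows the bot emphasized, and every emphasized row has a change beyond 5% in magnitude, so a ±5% threshold is a plausible reading. The sketch below applies that assumed rule to a handful of GPU rows; `THRESHOLD` and `classify` are illustrative names, not part of the benchmark tooling.

```python
from collections import Counter

THRESHOLD = 5.0  # assumption inferred from the table, not documented by the bot


def classify(change_pct: float, threshold: float = THRESHOLD) -> str:
    """Label a percent change as improved, worsened, or within noise."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# A few Change values copied from the GPU results.
sample = {
    "test_single": 1.45,
    "test_sync_pixels": 5.34,
    "test_async_pixels": -12.75,
    "test_serial": -4.34,
    "test_dqn_speed": -12.43,
}

counts = Counter(classify(change) for change in sample.values())
print(counts["improved"], counts["worsened"], counts["unchanged"])  # 1 2 2
```

Under this reading, most of the small negative changes in both tables fall inside the noise band and do not count toward "Worsened".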

@vmoens merged commit 40e9900 into main Feb 22, 2024
65 of 68 checks passed
@vmoens deleted the non-td-policy branch February 27, 2024 00:47