[Refactor,Performance] Faster collectors (bis) #1331
Merged
facebook-github-bot added the CLA Signed label on Jun 28, 2023
# Conflicts: # torchrl/envs/utils.py
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_single | 0.1892s | 0.1857s | 5.3861 Ops/s | 4.6913 Ops/s | |
test_sync | 0.1013s | 97.8309ms | 10.2217 Ops/s | 8.0327 Ops/s | |
test_async | 0.1854s | 95.0111ms | 10.5251 Ops/s | 8.4431 Ops/s | |
test_simple | 0.9737s | 0.8832s | 1.1323 Ops/s | 1.1198 Ops/s | |
test_transformed | 2.2864s | 2.2053s | 0.4535 Ops/s | 0.4736 Ops/s | |
test_serial | 2.8114s | 2.7485s | 0.3638 Ops/s | 0.3778 Ops/s | |
test_parallel | 2.3747s | 2.1575s | 0.4635 Ops/s | 0.4774 Ops/s | |
test_step_mdp_speed[True-True-True-True-True] | 1.3202ms | 54.3867μs | 18.3869 KOps/s | 18.7080 KOps/s | |
test_step_mdp_speed[True-True-True-True-False] | 2.2762ms | 30.9771μs | 32.2819 KOps/s | 32.4563 KOps/s | |
test_step_mdp_speed[True-True-True-False-True] | 4.9490ms | 40.8395μs | 24.4861 KOps/s | 24.1757 KOps/s | |
test_step_mdp_speed[True-True-True-False-False] | 0.4803ms | 22.0723μs | 45.3056 KOps/s | 44.3045 KOps/s | |
test_step_mdp_speed[True-True-False-True-True] | 0.3837ms | 55.3917μs | 18.0532 KOps/s | 17.8371 KOps/s | |
test_step_mdp_speed[True-True-False-True-False] | 0.5658ms | 32.6650μs | 30.6138 KOps/s | 28.8716 KOps/s | |
test_step_mdp_speed[True-True-False-False-True] | 1.1439ms | 42.2791μs | 23.6524 KOps/s | 23.1702 KOps/s | |
test_step_mdp_speed[True-True-False-False-False] | 0.5493ms | 24.3059μs | 41.1422 KOps/s | 40.4941 KOps/s | |
test_step_mdp_speed[True-False-True-True-True] | 0.6469ms | 55.9424μs | 17.8755 KOps/s | 17.0418 KOps/s | |
test_step_mdp_speed[True-False-True-True-False] | 2.2910ms | 34.0688μs | 29.3524 KOps/s | 28.6408 KOps/s | |
test_step_mdp_speed[True-False-True-False-True] | 0.6002ms | 42.3791μs | 23.5965 KOps/s | 23.8015 KOps/s | |
test_step_mdp_speed[True-False-True-False-False] | 0.3552ms | 24.0170μs | 41.6372 KOps/s | 40.0353 KOps/s | |
test_step_mdp_speed[True-False-False-True-True] | 1.5068ms | 59.1872μs | 16.8955 KOps/s | 15.9275 KOps/s | |
test_step_mdp_speed[True-False-False-True-False] | 0.9343ms | 35.6961μs | 28.0142 KOps/s | 26.9587 KOps/s | |
test_step_mdp_speed[True-False-False-False-True] | 4.3551ms | 44.4860μs | 22.4790 KOps/s | 19.1732 KOps/s | |
test_step_mdp_speed[True-False-False-False-False] | 0.5197ms | 25.9347μs | 38.5583 KOps/s | 37.3886 KOps/s | |
test_step_mdp_speed[False-True-True-True-True] | 4.4477ms | 59.0413μs | 16.9373 KOps/s | 16.7771 KOps/s | |
test_step_mdp_speed[False-True-True-True-False] | 1.3981ms | 34.6571μs | 28.8542 KOps/s | 29.5698 KOps/s | |
test_step_mdp_speed[False-True-True-False-True] | 0.3970ms | 47.8693μs | 20.8902 KOps/s | 18.6256 KOps/s | |
test_step_mdp_speed[False-True-True-False-False] | 6.4158ms | 27.2973μs | 36.6337 KOps/s | 36.0154 KOps/s | |
test_step_mdp_speed[False-True-False-True-True] | 0.5516ms | 58.9145μs | 16.9738 KOps/s | 17.1028 KOps/s | |
test_step_mdp_speed[False-True-False-True-False] | 1.1672ms | 35.9291μs | 27.8326 KOps/s | 27.1799 KOps/s | |
test_step_mdp_speed[False-True-False-False-True] | 0.5573ms | 49.4166μs | 20.2361 KOps/s | 20.3988 KOps/s | |
test_step_mdp_speed[False-True-False-False-False] | 0.9514ms | 28.2428μs | 35.4073 KOps/s | 35.5741 KOps/s | |
test_step_mdp_speed[False-False-True-True-True] | 3.1293ms | 61.8573μs | 16.1662 KOps/s | 16.8778 KOps/s | |
test_step_mdp_speed[False-False-True-True-False] | 0.6172ms | 37.5420μs | 26.6369 KOps/s | 26.0692 KOps/s | |
test_step_mdp_speed[False-False-True-False-True] | 0.1831ms | 48.1761μs | 20.7572 KOps/s | 18.1208 KOps/s | |
test_step_mdp_speed[False-False-True-False-False] | 0.4270ms | 27.9910μs | 35.7257 KOps/s | 33.5171 KOps/s | |
test_step_mdp_speed[False-False-False-True-True] | 2.9801ms | 62.7129μs | 15.9457 KOps/s | 16.1196 KOps/s | |
test_step_mdp_speed[False-False-False-True-False] | 2.3710ms | 39.5908μs | 25.2584 KOps/s | 25.4783 KOps/s | |
test_step_mdp_speed[False-False-False-False-True] | 0.9253ms | 49.5415μs | 20.1851 KOps/s | 18.7217 KOps/s | |
test_step_mdp_speed[False-False-False-False-False] | 4.0849ms | 29.7339μs | 33.6316 KOps/s | 30.5640 KOps/s | |
test_values[generalized_advantage_estimate-True-True] | 21.6707ms | 18.8630ms | 53.0139 Ops/s | 53.6339 Ops/s | |
test_values[vec_generalized_advantage_estimate-True-True] | 75.5466ms | 66.1030ms | 15.1279 Ops/s | 14.4239 Ops/s | |
test_values[td0_return_estimate-False-False] | 0.7134ms | 0.3071ms | 3.2564 KOps/s | 2.9574 KOps/s | |
test_values[td1_return_estimate-False-False] | 19.4841ms | 18.0950ms | 55.2639 Ops/s | 58.2740 Ops/s | |
test_values[vec_td1_return_estimate-False-False] | 86.9889ms | 67.0895ms | 14.9055 Ops/s | 14.9019 Ops/s | |
test_values[td_lambda_return_estimate-True-False] | 53.5987ms | 45.7055ms | 21.8792 Ops/s | 21.5424 Ops/s | |
test_values[vec_td_lambda_return_estimate-True-False] | 78.4416ms | 66.6382ms | 15.0064 Ops/s | 14.9267 Ops/s | |
test_gae_speed[generalized_advantage_estimate-False-1-512] | 15.5938ms | 14.3389ms | 69.7404 Ops/s | 71.5780 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 6.7497ms | 4.3936ms | 227.6047 Ops/s | 233.2881 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 2.3877ms | 0.6359ms | 1.5726 KOps/s | 1.5499 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 84.4087ms | 74.7777ms | 13.3730 Ops/s | 14.0316 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 9.6492ms | 5.5209ms | 181.1298 Ops/s | 186.2370 Ops/s | |
test_dqn_speed | 7.9113ms | 2.3938ms | 417.7523 Ops/s | 423.0881 Ops/s | |
test_ddpg_speed | 9.9337ms | 4.4004ms | 227.2501 Ops/s | 228.6864 Ops/s | |
test_sac_speed | 16.6543ms | 12.1767ms | 82.1242 Ops/s | 79.7428 Ops/s | |
test_redq_speed | 30.3917ms | 24.1900ms | 41.3394 Ops/s | 42.2615 Ops/s | |
test_redq_deprec_speed | 22.8509ms | 20.3040ms | 49.2514 Ops/s | 50.7549 Ops/s | |
test_td3_speed | 23.2361ms | 17.7538ms | 56.3258 Ops/s | 61.6967 Ops/s | |
test_cql_speed | 54.7995ms | 47.8879ms | 20.8821 Ops/s | 17.7078 Ops/s | |
test_a2c_speed | 15.2632ms | 10.5665ms | 94.6384 Ops/s | 98.6468 Ops/s | |
test_ppo_speed | 20.7207ms | 11.2802ms | 88.6509 Ops/s | 88.3122 Ops/s | |
test_reinforce_speed | 14.7695ms | 8.5470ms | 116.9999 Ops/s | 113.6091 Ops/s | |
test_iql_speed | 45.2048ms | 41.6881ms | 23.9877 Ops/s | 24.5867 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 0.1409s | 5.6380ms | 177.3691 Ops/s | 207.9854 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 7.6142ms | 5.0055ms | 199.7810 Ops/s | 200.3868 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 9.8131ms | 5.1683ms | 193.4877 Ops/s | 212.6838 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 12.1216ms | 5.1335ms | 194.7987 Ops/s | 183.9582 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 8.2018ms | 5.1565ms | 193.9308 Ops/s | 208.1475 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 7.2151ms | 5.0109ms | 199.5639 Ops/s | 170.9874 Ops/s | |
test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 0.1467s | 5.7025ms | 175.3609 Ops/s | 212.6741 Ops/s | |
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 9.6422ms | 5.0883ms | 196.5275 Ops/s | 195.2171 Ops/s | |
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 14.8343ms | 5.2845ms | 189.2316 Ops/s | 192.0651 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 9.3542ms | 4.9993ms | 200.0278 Ops/s | 202.0956 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 9.3471ms | 5.1211ms | 195.2707 Ops/s | 198.3255 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.1802s | 6.0057ms | 166.5075 Ops/s | 191.9995 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 9.5452ms | 5.0945ms | 196.2883 Ops/s | 204.6935 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 11.8956ms | 5.1343ms | 194.7691 Ops/s | 197.6823 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 8.9045ms | 5.0053ms | 199.7900 Ops/s | 191.6021 Ops/s | |
test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 0.1405s | 5.7054ms | 175.2720 Ops/s | 196.5704 Ops/s | |
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 7.3654ms | 5.1167ms | 195.4396 Ops/s | 197.1261 Ops/s | |
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.1788s | 5.8603ms | 170.6403 Ops/s | 197.9995 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.3916s | 45.8901ms | 21.7912 Ops/s | 22.4059 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1861s | 41.3057ms | 24.2097 Ops/s | 22.0096 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 0.1884s | 42.0834ms | 23.7624 Ops/s | 23.7557 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.2077s | 41.9166ms | 23.8569 Ops/s | 23.6065 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1916s | 42.5099ms | 23.5239 Ops/s | 24.4105 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 0.2047s | 45.2123ms | 22.1179 Ops/s | 23.4627 Ops/s | |
test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1971s | 41.9915ms | 23.8144 Ops/s | 22.1013 Ops/s | |
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1867s | 41.8494ms | 23.8952 Ops/s | 24.1772 Ops/s | |
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 0.1961s | 42.8442ms | 23.3404 Ops/s | 23.9019 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_single | 0.1704s | 0.1699s | 5.8864 Ops/s | 5.1710 Ops/s | |
test_sync | 90.0793ms | 88.0239ms | 11.3606 Ops/s | 9.9377 Ops/s | |
test_async | 0.1721s | 86.1629ms | 11.6059 Ops/s | 10.0400 Ops/s | |
test_simple | 0.8805s | 0.7884s | 1.2683 Ops/s | 1.3161 Ops/s | |
test_transformed | 2.0528s | 1.9794s | 0.5052 Ops/s | 0.5171 Ops/s | |
test_serial | 2.4584s | 2.3818s | 0.4199 Ops/s | 0.4341 Ops/s | |
test_parallel | 1.8794s | 1.8074s | 0.5533 Ops/s | 0.5419 Ops/s | |
test_step_mdp_speed[True-True-True-True-True] | 0.2216ms | 43.2430μs | 23.1251 KOps/s | 23.6474 KOps/s | |
test_step_mdp_speed[True-True-True-True-False] | 0.2150ms | 24.0319μs | 41.6113 KOps/s | 42.4848 KOps/s | |
test_step_mdp_speed[True-True-True-False-True] | 0.1462ms | 30.2507μs | 33.0570 KOps/s | 33.5703 KOps/s | |
test_step_mdp_speed[True-True-True-False-False] | 41.8010μs | 16.8012μs | 59.5196 KOps/s | 60.9452 KOps/s | |
test_step_mdp_speed[True-True-False-True-True] | 0.1589ms | 44.3112μs | 22.5677 KOps/s | 23.0281 KOps/s | |
test_step_mdp_speed[True-True-False-True-False] | 50.0000μs | 25.6981μs | 38.9134 KOps/s | 40.0840 KOps/s | |
test_step_mdp_speed[True-True-False-False-True] | 0.1447ms | 32.2604μs | 30.9978 KOps/s | 31.8297 KOps/s | |
test_step_mdp_speed[True-True-False-False-False] | 48.0010μs | 18.7129μs | 53.4390 KOps/s | 55.2770 KOps/s | |
test_step_mdp_speed[True-False-True-True-True] | 0.1227ms | 45.7098μs | 21.8771 KOps/s | 22.0147 KOps/s | |
test_step_mdp_speed[True-False-True-True-False] | 61.6010μs | 27.5046μs | 36.3576 KOps/s | 37.6358 KOps/s | |
test_step_mdp_speed[True-False-True-False-True] | 0.1262ms | 31.9648μs | 31.2844 KOps/s | 31.9056 KOps/s | |
test_step_mdp_speed[True-False-True-False-False] | 50.2000μs | 18.4261μs | 54.2707 KOps/s | 55.9239 KOps/s | |
test_step_mdp_speed[True-False-False-True-True] | 0.1546ms | 47.0473μs | 21.2552 KOps/s | 21.3604 KOps/s | |
test_step_mdp_speed[True-False-False-True-False] | 58.1010μs | 28.8819μs | 34.6238 KOps/s | 35.5858 KOps/s | |
test_step_mdp_speed[True-False-False-False-True] | 0.1453ms | 33.5414μs | 29.8139 KOps/s | 30.6289 KOps/s | |
test_step_mdp_speed[True-False-False-False-False] | 67.2010μs | 20.0241μs | 49.9398 KOps/s | 51.0707 KOps/s | |
test_step_mdp_speed[False-True-True-True-True] | 0.1504ms | 46.0335μs | 21.7233 KOps/s | 21.9549 KOps/s | |
test_step_mdp_speed[False-True-True-True-False] | 60.6000μs | 27.3346μs | 36.5837 KOps/s | 37.6938 KOps/s | |
test_step_mdp_speed[False-True-True-False-True] | 0.1456ms | 37.0356μs | 27.0010 KOps/s | 27.3387 KOps/s | |
test_step_mdp_speed[False-True-True-False-False] | 0.2772ms | 20.4644μs | 48.8652 KOps/s | 50.0280 KOps/s | |
test_step_mdp_speed[False-True-False-True-True] | 0.1573ms | 47.8914μs | 20.8806 KOps/s | 21.0942 KOps/s | |
test_step_mdp_speed[False-True-False-True-False] | 0.1113ms | 28.9172μs | 34.5815 KOps/s | 35.2366 KOps/s | |
test_step_mdp_speed[False-True-False-False-True] | 0.1387ms | 38.3578μs | 26.0703 KOps/s | 26.2010 KOps/s | |
test_step_mdp_speed[False-True-False-False-False] | 52.0000μs | 22.1564μs | 45.1336 KOps/s | 46.5548 KOps/s | |
test_step_mdp_speed[False-False-True-True-True] | 0.1533ms | 48.9225μs | 20.4405 KOps/s | 20.8729 KOps/s | |
test_step_mdp_speed[False-False-True-True-False] | 0.1173ms | 30.5890μs | 32.6914 KOps/s | 33.5767 KOps/s | |
test_step_mdp_speed[False-False-True-False-True] | 63.1010μs | 39.2234μs | 25.4950 KOps/s | 25.9366 KOps/s | |
test_step_mdp_speed[False-False-True-False-False] | 0.1064ms | 21.7969μs | 45.8780 KOps/s | 47.0852 KOps/s | |
test_step_mdp_speed[False-False-False-True-True] | 0.2385ms | 49.9431μs | 20.0228 KOps/s | 20.3246 KOps/s | |
test_step_mdp_speed[False-False-False-True-False] | 0.1098ms | 32.0035μs | 31.2466 KOps/s | 31.7519 KOps/s | |
test_step_mdp_speed[False-False-False-False-True] | 0.1545ms | 39.6949μs | 25.1921 KOps/s | 25.6122 KOps/s | |
test_step_mdp_speed[False-False-False-False-False] | 50.0010μs | 23.4226μs | 42.6939 KOps/s | 43.9301 KOps/s | |
test_values[generalized_advantage_estimate-True-True] | 16.7459ms | 16.1545ms | 61.9022 Ops/s | 61.3062 Ops/s | |
test_values[vec_generalized_advantage_estimate-True-True] | 56.5632ms | 50.9361ms | 19.6325 Ops/s | 19.1710 Ops/s | |
test_values[td0_return_estimate-False-False] | 0.4517ms | 0.3064ms | 3.2637 KOps/s | 3.4089 KOps/s | |
test_values[td1_return_estimate-False-False] | 15.8342ms | 15.5858ms | 64.1608 Ops/s | 63.5724 Ops/s | |
test_values[vec_td1_return_estimate-False-False] | 53.1953ms | 50.6735ms | 19.7342 Ops/s | 19.2212 Ops/s | |
test_values[td_lambda_return_estimate-True-False] | 39.4961ms | 38.4341ms | 26.0186 Ops/s | 26.0583 Ops/s | |
test_values[vec_td_lambda_return_estimate-True-False] | 58.9599ms | 51.1611ms | 19.5461 Ops/s | 19.0920 Ops/s | |
test_gae_speed[generalized_advantage_estimate-False-1-512] | 13.5418ms | 13.2792ms | 75.3059 Ops/s | 73.1945 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 7.5598ms | 4.2535ms | 235.0991 Ops/s | 228.9204 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 2.2000ms | 0.5897ms | 1.6958 KOps/s | 1.6728 KOps/s | |
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 67.9402ms | 67.3996ms | 14.8369 Ops/s | 14.9897 Ops/s | |
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 6.3622ms | 3.9343ms | 254.1773 Ops/s | 261.3731 Ops/s | |
test_dqn_speed | 2.6318ms | 1.9935ms | 501.6255 Ops/s | 487.4321 Ops/s | |
test_ddpg_speed | 10.3786ms | 3.2902ms | 303.9293 Ops/s | 280.2547 Ops/s | |
test_sac_speed | 12.0594ms | 10.2940ms | 97.1436 Ops/s | 95.1448 Ops/s | |
test_redq_speed | 24.8585ms | 18.4094ms | 54.3202 Ops/s | 54.6378 Ops/s | |
test_redq_deprec_speed | 16.7859ms | 15.6741ms | 63.7995 Ops/s | 63.0728 Ops/s | |
test_td3_speed | 19.5024ms | 14.5638ms | 68.6632 Ops/s | 70.9143 Ops/s | |
test_cql_speed | 47.8674ms | 40.6239ms | 24.6160 Ops/s | 22.8095 Ops/s | |
test_a2c_speed | 9.0028ms | 7.4013ms | 135.1113 Ops/s | 142.9486 Ops/s | |
test_ppo_speed | 20.8610ms | 8.0706ms | 123.9069 Ops/s | 135.4278 Ops/s | |
test_reinforce_speed | 7.1560ms | 5.5983ms | 178.6252 Ops/s | 188.5206 Ops/s | |
test_iql_speed | 29.2524ms | 27.4961ms | 36.3688 Ops/s | 35.1077 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.4035ms | 4.5511ms | 219.7267 Ops/s | 196.5339 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 9.1175ms | 4.6834ms | 213.5220 Ops/s | 219.0414 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 9.4895ms | 4.7057ms | 212.5090 Ops/s | 217.1204 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 0.1520s | 5.1868ms | 192.7964 Ops/s | 225.6472 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 8.0227ms | 4.6747ms | 213.9182 Ops/s | 211.6111 Ops/s | |
test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.1857s | 5.4546ms | 183.3301 Ops/s | 217.4018 Ops/s | |
test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 11.5651ms | 4.5467ms | 219.9393 Ops/s | 193.3705 Ops/s | |
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 9.7504ms | 4.6618ms | 214.5090 Ops/s | 216.5552 Ops/s | |
test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 7.8136ms | 4.6453ms | 215.2733 Ops/s | 217.5755 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 0.1470s | 5.1432ms | 194.4331 Ops/s | 197.0654 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 7.7157ms | 4.6356ms | 215.7223 Ops/s | 217.0088 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 11.9930ms | 4.7296ms | 211.4336 Ops/s | 184.2382 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.1093ms | 4.5214ms | 221.1718 Ops/s | 226.0547 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 9.4435ms | 4.7244ms | 211.6661 Ops/s | 216.9080 Ops/s | |
test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 7.3330ms | 4.6999ms | 212.7685 Ops/s | 217.7595 Ops/s | |
test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 0.1482s | 5.1920ms | 192.6025 Ops/s | 191.0213 Ops/s | |
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 8.1383ms | 4.6986ms | 212.8295 Ops/s | 214.5232 Ops/s | |
test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 9.4354ms | 4.7232ms | 211.7204 Ops/s | 216.8997 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.3343s | 42.0968ms | 23.7548 Ops/s | 24.9902 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1880s | 35.7583ms | 27.9655 Ops/s | 28.1401 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 0.1859s | 35.4190ms | 28.2334 Ops/s | 28.0174 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1838s | 35.1400ms | 28.4576 Ops/s | 28.3727 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1895s | 35.5189ms | 28.1540 Ops/s | 28.1951 Ops/s | |
test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 0.1900s | 38.8619ms | 25.7321 Ops/s | 25.4413 Ops/s | |
test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1934s | 36.2949ms | 27.5521 Ops/s | 27.8152 Ops/s | |
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1937s | 35.6335ms | 28.0635 Ops/s | 27.9271 Ops/s | |
test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 0.1928s | 35.7775ms | 27.9505 Ops/s | 28.4495 Ops/s |
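
The Change column in the table directly above is empty. As a rough sketch only, a relative change can be derived from the Ops and Ops on Repo HEAD columns; the snippet below does this for the first three rows of that table. The helper name `relative_change` and the choice of a percentage are assumptions made for illustration, not the formula the benchmark bot uses.

```python
# Ops and Ops on Repo HEAD (both in Ops/s), copied from the first three
# rows of the table directly above.
rows = {
    "test_single": (5.8864, 5.1710),
    "test_sync": (11.3606, 9.9377),
    "test_async": (11.6059, 10.0400),
}


def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to repo HEAD (hypothetical helper)."""
    return (ops / ops_head - 1.0) * 100.0


changes = {name: relative_change(ops, head) for name, (ops, head) in rows.items()}

for name, pct in changes.items():
    # A positive value means this branch is faster than HEAD for that test.
    print(f"{name}: {pct:+.1f}%")
```

Single benchmark runs are noisy, so small differences in either direction should not be over-interpreted.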
# Conflicts: # torchrl/envs/utils.py
…collector_rollout # Conflicts: # torchrl/envs/transforms/rlhf.py
…collector_rollout
# Conflicts: # torchrl/envs/transforms/rlhf.py # torchrl/envs/utils.py # torchrl/envs/vec_env.py # torchrl/modules/tensordict_module/common.py
# Conflicts: # torchrl/_utils.py # torchrl/envs/transforms/transforms.py
matteobettini approved these changes on Jul 7, 2023
LGTM, I just have a question about the 2 clones. These are very expensive, so I just want to make sure there is absolutely no way to avoid them.
Labels: CLA Signed, performance
The perf of stack onto is better than the perf of stack (when stack is followed by a call to contiguous(); otherwise no real stack occurs).
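
To make the comparison concrete, here is a minimal sketch of the two patterns on plain `torch` tensors. It is not the code from this PR: the tensor shapes and the names `steps` and `dest` are hypothetical, and `torch.stack(..., out=...)` is used only as a stand-in for stacking onto an existing destination.

```python
import torch

# Hypothetical per-step outputs standing in for the items being stacked.
steps = [torch.full((4, 3), float(i)) for i in range(5)]

# Pattern 1: a plain stack allocates a new result tensor on every call.
stacked = torch.stack(steps, dim=0)

# Pattern 2: stack onto a destination that is allocated once and reused.
dest = torch.empty(5, 4, 3)
ptr_before = dest.data_ptr()
torch.stack(steps, dim=0, out=dest)

# Both patterns produce the same values; the second one wrote into the
# existing storage instead of allocating a new tensor.
same_values = torch.equal(stacked, dest)
reused_storage = dest.data_ptr() == ptr_before
print(same_values, reused_storage)
```

The benefit of the second pattern comes from skipping the allocation on each call, which matters most when the same destination is filled repeatedly.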