[Performance] Better names handling in LazyStackTD #482
Merged
facebook-github-bot added the CLA Signed label on Jul 8, 2023. This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 36.5000μs | 19.7140μs | 50.7254 KOps/s | 50.3945 KOps/s | |
test_plain_set_stack_nested | 0.2243ms | 0.1808ms | 5.5304 KOps/s | 5.5104 KOps/s | |
test_plain_set_nested_inplace | 88.8010μs | 23.1873μs | 43.1271 KOps/s | 42.3931 KOps/s | |
test_plain_set_stack_nested_inplace | 0.2481ms | 0.2138ms | 4.6773 KOps/s | 4.5585 KOps/s | |
test_items | 43.1010μs | 3.5762μs | 279.6268 KOps/s | 305.0550 KOps/s | |
test_items_nested | 0.3706ms | 0.3465ms | 2.8858 KOps/s | 2.6327 KOps/s | |
test_items_nested_locked | 2.0413ms | 0.3611ms | 2.7696 KOps/s | 2.8296 KOps/s | |
test_items_nested_leaf | 0.2380ms | 0.2107ms | 4.7450 KOps/s | 4.6471 KOps/s | |
test_items_stack_nested | 1.9749ms | 1.9106ms | 523.3876 Ops/s | 341.6974 Ops/s | |
test_items_stack_nested_leaf | 1.8096ms | 1.7346ms | 576.5040 Ops/s | 361.6908 Ops/s | |
test_items_stack_nested_locked | 1.0454ms | 0.9373ms | 1.0669 KOps/s | 810.3591 Ops/s | |
test_keys | 24.5000μs | 4.8431μs | 206.4808 KOps/s | 208.8662 KOps/s | |
test_keys_nested | 1.9769ms | 0.1717ms | 5.8242 KOps/s | 5.6567 KOps/s | |
test_keys_nested_locked | 0.1942ms | 0.1701ms | 5.8802 KOps/s | 5.7529 KOps/s | |
test_keys_nested_leaf | 0.2951ms | 0.1641ms | 6.0942 KOps/s | 5.5703 KOps/s | |
test_keys_stack_nested | 1.8871ms | 1.6961ms | 589.6019 Ops/s | 364.8520 Ops/s | |
test_keys_stack_nested_leaf | 2.0113ms | 1.7018ms | 587.6292 Ops/s | 365.6810 Ops/s | |
test_keys_stack_nested_locked | 0.8567ms | 0.7249ms | 1.3794 KOps/s | 981.6144 Ops/s | |
test_values | 8.0000μs | 1.2228μs | 817.7731 KOps/s | 701.4518 KOps/s | |
test_values_nested | 93.4010μs | 64.0652μs | 15.6091 KOps/s | 15.1843 KOps/s | |
test_values_nested_locked | 95.2020μs | 63.8259μs | 15.6676 KOps/s | 15.1973 KOps/s | |
test_values_nested_leaf | 0.1231ms | 56.4928μs | 17.7014 KOps/s | 17.2581 KOps/s | |
test_values_stack_nested | 1.5877ms | 1.5276ms | 654.6207 Ops/s | 392.4694 Ops/s | |
test_values_stack_nested_leaf | 1.6399ms | 1.5217ms | 657.1713 Ops/s | 393.7564 Ops/s | |
test_values_stack_nested_locked | 0.8418ms | 0.6215ms | 1.6090 KOps/s | 1.1052 KOps/s | |
test_membership | 19.6000μs | 1.7871μs | 559.5607 KOps/s | 532.8338 KOps/s | |
test_membership_nested | 26.6000μs | 3.5420μs | 282.3254 KOps/s | 271.1346 KOps/s | |
test_membership_nested_leaf | 26.9000μs | 3.5024μs | 285.5180 KOps/s | 271.2350 KOps/s | |
test_membership_stacked_nested | 32.5000μs | 13.5651μs | 73.7188 KOps/s | 70.2036 KOps/s | |
test_membership_stacked_nested_leaf | 64.0000μs | 13.5508μs | 73.7963 KOps/s | 70.0473 KOps/s | |
test_membership_nested_last | 31.5010μs | 7.2711μs | 137.5307 KOps/s | 131.9665 KOps/s | |
test_membership_nested_leaf_last | 34.0000μs | 7.2657μs | 137.6323 KOps/s | 130.8132 KOps/s | |
test_membership_stacked_nested_last | 0.2452ms | 0.2159ms | 4.6325 KOps/s | 4.4438 KOps/s | |
test_membership_stacked_nested_leaf_last | 38.5010μs | 16.0368μs | 62.3566 KOps/s | 59.0942 KOps/s | |
test_nested_getleaf | 73.1010μs | 15.1142μs | 66.1629 KOps/s | 65.0607 KOps/s | |
test_nested_get | 0.1943ms | 14.4759μs | 69.0806 KOps/s | 68.9992 KOps/s | |
test_stacked_getleaf | 0.9605ms | 0.8451ms | 1.1833 KOps/s | 707.1999 Ops/s | |
test_stacked_get | 0.8900ms | 0.8101ms | 1.2345 KOps/s | 739.4170 Ops/s | |
test_nested_getitemleaf | 48.3000μs | 15.3712μs | 65.0568 KOps/s | 64.1006 KOps/s | |
test_nested_getitem | 43.8000μs | 14.5888μs | 68.5460 KOps/s | 67.4643 KOps/s | |
test_stacked_getitemleaf | 1.1439ms | 0.8508ms | 1.1754 KOps/s | 700.5907 Ops/s | |
test_stacked_getitem | 0.9823ms | 0.8299ms | 1.2050 KOps/s | 734.4859 Ops/s | |
test_lock_nested | 74.1324ms | 1.4760ms | 677.5128 Ops/s | 707.2141 Ops/s | |
test_lock_stack_nested | 87.2093ms | 16.2647ms | 61.4828 Ops/s | 61.3237 Ops/s | |
test_unlock_nested | 69.1038ms | 1.4795ms | 675.8865 Ops/s | 656.5249 Ops/s | |
test_unlock_stack_nested | 0.1022s | 17.0166ms | 58.7662 Ops/s | 59.9334 Ops/s | |
test_flatten_speed | 1.1360ms | 1.0124ms | 987.7036 Ops/s | 967.1909 Ops/s | |
test_unflatten_speed | 1.9637ms | 1.8014ms | 555.1329 Ops/s | 519.7613 Ops/s | |
test_common_ops | 1.3760ms | 1.0909ms | 916.6547 Ops/s | 911.6756 Ops/s | |
test_creation | 37.9010μs | 6.1784μs | 161.8553 KOps/s | 162.7801 KOps/s | |
test_creation_empty | 31.5010μs | 13.9186μs | 71.8461 KOps/s | 70.1621 KOps/s | |
test_creation_nested_1 | 62.4010μs | 24.8109μs | 40.3048 KOps/s | 39.0455 KOps/s | |
test_creation_nested_2 | 68.8020μs | 27.5208μs | 36.3362 KOps/s | 36.4954 KOps/s | |
test_clone | 0.1992ms | 23.8160μs | 41.9885 KOps/s | 40.4078 KOps/s | |
test_getitem[int] | 0.1047ms | 30.1904μs | 33.1231 KOps/s | 33.0110 KOps/s | |
test_getitem[slice_int] | 98.3010μs | 64.0166μs | 15.6209 KOps/s | 15.5474 KOps/s | |
test_getitem[range] | 0.1261ms | 67.7193μs | 14.7668 KOps/s | 15.0698 KOps/s | |
test_getitem[tuple] | 0.1653ms | 59.0979μs | 16.9211 KOps/s | 16.7552 KOps/s | |
test_getitem[list] | 96.3020μs | 58.2089μs | 17.1795 KOps/s | 16.9407 KOps/s | |
test_setitem_dim[int] | 61.1000μs | 32.4346μs | 30.8312 KOps/s | 30.1666 KOps/s | |
test_setitem_dim[slice_int] | 0.1008ms | 66.6406μs | 15.0059 KOps/s | 14.6818 KOps/s | |
test_setitem_dim[range] | 0.1091ms | 63.9477μs | 15.6378 KOps/s | 15.4247 KOps/s | |
test_setitem_dim[tuple] | 98.8020μs | 58.9954μs | 16.9505 KOps/s | 16.5688 KOps/s | |
test_setitem | 0.2225ms | 31.6545μs | 31.5911 KOps/s | 30.0681 KOps/s | |
test_set | 0.1979ms | 30.2475μs | 33.0605 KOps/s | 31.4474 KOps/s | |
test_set_shared | 0.4165ms | 0.1767ms | 5.6604 KOps/s | 5.6229 KOps/s | |
test_update | 0.2259ms | 34.2443μs | 29.2020 KOps/s | 28.0682 KOps/s | |
test_update_nested | 0.2343ms | 50.6550μs | 19.7414 KOps/s | 18.7928 KOps/s | |
test_set_nested | 0.1784ms | 33.5755μs | 29.7836 KOps/s | 28.2965 KOps/s | |
test_set_nested_new | 0.2318ms | 51.7611μs | 19.3195 KOps/s | 18.7101 KOps/s | |
test_select | 2.3049ms | 94.9993μs | 10.5264 KOps/s | 10.2833 KOps/s | |
test_unbind_speed | 0.7282ms | 0.6274ms | 1.5938 KOps/s | 1.5888 KOps/s | |
test_unbind_speed_stack0 | 3.4515ms | 3.0928ms | 323.3364 Ops/s | 255.1125 Ops/s | |
test_unbind_speed_stack1 | 3.4151μs | 0.4310μs | 2.3202 MOps/s | 2.1105 MOps/s | |
test_creation[device0] | 0.5458ms | 0.4384ms | 2.2808 KOps/s | 2.2822 KOps/s | |
test_creation_from_tensor | 0.5951ms | 0.4908ms | 2.0373 KOps/s | 2.0316 KOps/s | |
test_add_one[memmap_tensor0] | 1.3587ms | 32.2044μs | 31.0516 KOps/s | 30.2074 KOps/s | |
test_contiguous[memmap_tensor0] | 0.1074ms | 8.7092μs | 114.8205 KOps/s | 109.2337 KOps/s | |
test_stack[memmap_tensor0] | 67.1010μs | 26.2219μs | 38.1361 KOps/s | 37.3146 KOps/s | |
test_memmaptd_index | 0.3304ms | 0.2749ms | 3.6378 KOps/s | 3.5056 KOps/s | |
test_memmaptd_index_astensor | 1.3043ms | 1.1793ms | 847.9714 Ops/s | 819.5922 Ops/s | |
test_memmaptd_index_op | 2.5272ms | 2.3841ms | 419.4451 Ops/s | 410.5599 Ops/s | |
test_reshape_pytree | 0.1056ms | 36.2998μs | 27.5483 KOps/s | 26.9232 KOps/s | |
test_reshape_td | 83.4010μs | 43.8861μs | 22.7863 KOps/s | 22.4789 KOps/s | |
test_view_pytree | 0.1380ms | 33.7766μs | 29.6063 KOps/s | 28.7544 KOps/s | |
test_view_td | 30.5010μs | 8.4904μs | 117.7807 KOps/s | 115.9792 KOps/s | |
test_unbind_pytree | 77.5010μs | 37.3945μs | 26.7419 KOps/s | 26.0471 KOps/s | |
test_unbind_td | 0.2053ms | 93.8288μs | 10.6577 KOps/s | 10.7499 KOps/s | |
test_split_pytree | 89.3010μs | 42.9006μs | 23.3097 KOps/s | 22.1641 KOps/s | |
test_split_td | 0.8901ms | 0.1123ms | 8.9054 KOps/s | 8.5948 KOps/s | |
test_add_pytree | 0.1099ms | 45.7420μs | 21.8617 KOps/s | 21.2855 KOps/s | |
test_add_td | 0.1158ms | 74.8637μs | 13.3576 KOps/s | 13.7557 KOps/s | |
test_distributed | 23.0010μs | 8.4238μs | 118.7117 KOps/s | 115.6390 KOps/s | |
test_tdmodule | 0.2057ms | 28.2387μs | 35.4124 KOps/s | 35.1451 KOps/s | |
test_tdmodule_dispatch | 0.3021ms | 54.7005μs | 18.2814 KOps/s | 8.0924 KOps/s | |
test_tdseq | 0.6030ms | 33.1476μs | 30.1681 KOps/s | 29.5710 KOps/s | |
test_tdseq_dispatch | 0.2172ms | 66.8924μs | 14.9494 KOps/s | 14.6476 KOps/s | |
test_instantiation_functorch | 2.0620ms | 1.5680ms | 637.7673 Ops/s | 615.6777 Ops/s | |
test_instantiation_td | 2.0919ms | 1.3133ms | 761.4605 Ops/s | 733.6300 Ops/s | |
test_exec_functorch | 0.2425ms | 0.1811ms | 5.5209 KOps/s | 5.3457 KOps/s | |
test_exec_td | 0.2710ms | 0.1728ms | 5.7882 KOps/s | 5.6527 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.3172ms | 1.1605ms | 861.6862 Ops/s | 610.4136 Ops/s | |
test_vmap_mlp_speed[True-False] | 1.0522ms | 0.5942ms | 1.6829 KOps/s | 1.6981 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.9686ms | 0.9814ms | 1.0189 KOps/s | 728.5250 Ops/s | |
test_vmap_mlp_speed[False-False] | 11.1605ms | 0.4492ms | 2.2259 KOps/s | 2.2865 KOps/s | |
test_vmap_transformer_speed[True-True] | 14.7679ms | 13.8752ms | 72.0710 Ops/s | 53.2357 Ops/s | |
test_vmap_transformer_speed[True-False] | 9.4918ms | 8.5114ms | 117.4892 Ops/s | 120.2111 Ops/s | |
test_vmap_transformer_speed[False-True] | 13.2990ms | 12.4904ms | 80.0612 Ops/s | 54.4092 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.9025ms | 7.9833ms | 125.2615 Ops/s | 121.7401 Ops/s |
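
The Change column is empty in this capture of the table. As a rough aid for reading it, here is a minimal sketch of how the relative change could be recomputed from the two throughput columns. It assumes that "Ops" is the PR branch and "Ops on Repo HEAD" is the baseline; `parse_ops` and `relative_change` are hypothetical helper names, not part of the benchmark tooling.

```python
import re

# Multipliers for the unit prefixes that appear in the Ops columns.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a string such as '1.0669 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized throughput: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(ops_pr: str, ops_head: str) -> float:
    """Fractional throughput change of the PR relative to repo HEAD (assumed baseline)."""
    head = parse_ops(ops_head)
    return (parse_ops(ops_pr) - head) / head


# Two rows copied from the table above: (Ops, Ops on Repo HEAD).
rows = {
    "test_items_stack_nested": ("523.3876 Ops/s", "341.6974 Ops/s"),
    "test_items_stack_nested_locked": ("1.0669 KOps/s", "810.3591 Ops/s"),
}
changes = {name: relative_change(*ops) for name, ops in rows.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1%}")
```

For the two rows used here the sketch reports gains of roughly 53% and 32%, in line with the improvement visible across the stacked (`*_stack_*`, `test_stacked_*`) benchmarks. Small differences on other rows may be within run-to-run noise, which this sketch does not account for.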

Merging as it solves a bug formerly hidden in RL.
Labels: CLA Signed, Performance