Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
perf(crypto/merkle): reduce allocations in HashFromByteSlices (celest…
…iaorg#1365) ## Description ~~Reduce allocations in `HashAlternatives` by adding `case 2` which is very common and doesn't require 2 additional recursive calls. As a result 20% less allocations and 10% less CPU time.~~ We can have better solution by using fact that `hash.Hash` implements `io.Writer`. Instead of allocation slice that we want to hash we can do this step by step. Results are even better (30% less CPU and 60% less allocations): ``` celestia-core % go-perftuner bstat a.txt b.txt args: [a.txt b.txt]name old time/op new time/op delta HashAlternatives/recursive-10 31.7µs ± 3% 21.8µs ± 0% -31.23% (p=0.002 n=6+6) HashAlternatives/iterative-10 32.1µs ± 3% 31.9µs ± 1% ~ (p=0.537 n=5+6) name old alloc/op new alloc/op delta HashAlternatives/recursive-10 25.7kB ± 0% 6.6kB ± 0% -74.45% (p=0.002 n=6+6) HashAlternatives/iterative-10 28.4kB ± 0% 28.4kB ± 0% ~ (all equal) name old allocs/op new allocs/op delta HashAlternatives/recursive-10 502 ± 0% 202 ± 0% -59.76% (p=0.002 n=6+6) HashAlternatives/iterative-10 503 ± 0% 503 ± 0% ~ (all equal) ``` Note: same solutions is used in `cometbft` (see: https://github.com/cometbft/cometbft/blob/main/crypto/merkle/hash.go#L26) <details> <summary>Old benchmark</summary> ``` celestia-core % go-perftuner bstat a.txt b.txt args: [a.txt b.txt]name old time/op new time/op delta HashAlternatives/recursive-10 31.6µs ± 3% 27.9µs ± 0% -11.66% (p=0.002 n=6+6) HashAlternatives/iterative-10 32.3µs ± 3% 31.9µs ± 2% ~ (p=0.240 n=6+6) name old alloc/op new alloc/op delta HashAlternatives/recursive-10 25.4kB ± 0% 22.2kB ± 0% -12.59% (p=0.002 n=6+6) HashAlternatives/iterative-10 28.1kB ± 0% 28.1kB ± 0% ~ (all equal) name old allocs/op new allocs/op delta HashAlternatives/recursive-10 497 ± 0% 397 ± 0% -20.12% (p=0.002 n=6+6) HashAlternatives/iterative-10 498 ± 0% 498 ± 0% ~ (all equal) ``` </details> Also, changing benchmark by 1 digit: instead of 100 nodes create 101 to cover all code branches (a bit better code coverage). --- #### PR checklist - [x] Tests written/updated - [ ] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog) - [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments
- Loading branch information