Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbl: Refactor to avoid unsafe #688

Merged
merged 1 commit into from
Dec 21, 2021

Conversation

aewag
Copy link
Contributor

@aewag aewag commented Dec 19, 2021

MACs/cmac benchmarks were used to evaluate the refactoring, as dbl does not contain its own. The refactoring of dbl seems to have no effect on the MACs/cmac performance.

Benchmarks results with dbl refactored:

$ git checkout 90c46e717ab2321022104ead2f9060b1327f5710; rustup run nightly cargo bench
Previous HEAD position was 9d876cb Bump digest from 0.10.0 to 0.10.1 (#100)
HEAD is now at 90c46e7 Test dbl safe code
    Finished bench [optimized] target(s) in 0.02s
     Running unittests (/home/alwagner/working_directory/tests/MACs/target/release/deps/cmac-5a75094f2021ff95)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running unittests (/home/alwagner/working_directory/tests/MACs/target/release/deps/aes128_cmac-7ce19bbcc576eef1)

running 4 tests
test bench1_10    ... bench:         235 ns/iter (+/- 102) = 42 MB/s
test bench2_100   ... bench:       2,294 ns/iter (+/- 55) = 43 MB/s
test bench3_1000  ... bench:      22,824 ns/iter (+/- 1,547) = 43 MB/s
test bench3_10000 ... bench:     228,810 ns/iter (+/- 16,931) = 43 MB/s

test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured; 0 filtered out; finished in 5.51s

     Running unittests (/home/alwagner/working_directory/tests/MACs/target/release/deps/aes256_cmac-e6b672d2c21213f3)

running 4 tests
test bench1_10    ... bench:         312 ns/iter (+/- 10) = 32 MB/s
test bench2_100   ... bench:       3,030 ns/iter (+/- 159) = 33 MB/s
test bench3_1000  ... bench:      30,607 ns/iter (+/- 2,214) = 32 MB/s
test bench3_10000 ... bench:     303,722 ns/iter (+/- 16,212) = 32 MB/s

test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured; 0 filtered out; finished in 3.86s

Benchmark results with dbl untouched:

$ git checkout 9d876cb237bbbddee527d8126663268bea91b6e9; rustup run nightly cargo bench
Previous HEAD position was 90c46e7 Test dbl safe code
HEAD is now at 9d876cb Bump digest from 0.10.0 to 0.10.1 (#100)
    Finished bench [optimized] target(s) in 0.02s
     Running unittests (/home/alwagner/working_directory/tests/MACs/target/release/deps/cmac-b1ec60cee9662e1a)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running unittests (/home/alwagner/working_directory/tests/MACs/target/release/deps/aes128_cmac-137d219e2208c539)

running 4 tests
test bench1_10    ... bench:         236 ns/iter (+/- 13) = 42 MB/s
test bench2_100   ... bench:       2,286 ns/iter (+/- 124) = 43 MB/s
test bench3_1000  ... bench:      22,978 ns/iter (+/- 1,822) = 43 MB/s
test bench3_10000 ... bench:     228,643 ns/iter (+/- 10,610) = 43 MB/s

test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured; 0 filtered out; finished in 1.73s

     Running unittests (/home/alwagner/working_directory/tests/MACs/target/release/deps/aes256_cmac-b77e6ae6be42117d)

running 4 tests
test bench1_10    ... bench:         312 ns/iter (+/- 54) = 32 MB/s
test bench2_100   ... bench:       3,070 ns/iter (+/- 128) = 32 MB/s
test bench3_1000  ... bench:      30,668 ns/iter (+/- 2,124) = 32 MB/s
test bench3_10000 ... bench:     304,873 ns/iter (+/- 15,266) = 32 MB/s

test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured; 0 filtered out; finished in 12.60s

Copy link
Member

@tarcieri tarcieri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tarcieri tarcieri requested a review from newpavlov December 19, 2021 19:54
Copy link
Member

@newpavlov newpavlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! It would be nice to get performance measurements without the force-soft feature enabled for aes, but I guess performance should be the same.

While we at it can you please also add #[inline] attributes on methods?

dbl/src/lib.rs Outdated
let mut val = [0u64; 2];
for (s, v) in self.chunks_exact(size_of::<u64>()).zip(val.iter_mut()) {
*v = u64::from_be_bytes(s.try_into().unwrap());
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can replace it simply with:

let mut val = [
    u64::from_be_bytes(self[..8].try_into().unwrap()),
    u64::from_be_bytes(self[8..].try_into().unwrap()),
];

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated for the methods dbl and inv_dbl having a block size of 128 bits.

For the block size of 256 bits I kept the for loop. IMO its a bit better readable, but can change it as well.

dbl/src/lib.rs Outdated Show resolved Hide resolved
@aewag aewag force-pushed the dbl-refactor-unsafe branch from 9511613 to 57db0e7 Compare December 20, 2021 15:38
@aewag
Copy link
Contributor Author

aewag commented Dec 20, 2021

Thank you! It would be nice to get performance measurements without the force-soft feature enabled for aes, but I guess performance should be the same.

Below is the benchmark without force-soft feature. PR seems to have no effect.

Before:

running 4 tests                                     
test bench1_10    ... bench:          19 ns/iter (+/- 1) = 526 MB/s
test bench2_100   ... bench:         118 ns/iter (+/- 6) = 847 MB/s
test bench3_1000  ... bench:       1,097 ns/iter (+/- 42) = 911 MB/s
test bench3_10000 ... bench:      10,927 ns/iter (+/- 455) = 915 MB/s
                                                    
running 4 tests                                                                                          
test bench1_10    ... bench:          22 ns/iter (+/- 2) = 454 MB/s
test bench2_100   ... bench:         148 ns/iter (+/- 9) = 675 MB/s                                      
test bench3_1000  ... bench:       1,412 ns/iter (+/- 45) = 708 MB/s
test bench3_10000 ... bench:      13,912 ns/iter (+/- 597) = 718 MB/s

After (dbl refactored):

running 4 tests                                                                                          
test bench1_10    ... bench:          19 ns/iter (+/- 3) = 526 MB/s        
test bench2_100   ... bench:         117 ns/iter (+/- 4) = 854 MB/s 
test bench3_1000  ... bench:       1,115 ns/iter (+/- 34) = 896 MB/s                   
test bench3_10000 ... bench:      10,960 ns/iter (+/- 258) = 912 MB/s

running 4 tests
test bench1_10    ... bench:          21 ns/iter (+/- 1) = 476 MB/s
test bench2_100   ... bench:         146 ns/iter (+/- 7) = 684 MB/s
test bench3_1000  ... bench:       1,411 ns/iter (+/- 94) = 708 MB/s
test bench3_10000 ... bench:      14,091 ns/iter (+/- 360) = 709 MB/s

While we at it can you please also add #[inline] attributes on methods?

Done.

@newpavlov
Copy link
Member

Thank you!

@newpavlov newpavlov merged commit 9701c71 into RustCrypto:master Dec 21, 2021
@aewag aewag deleted the dbl-refactor-unsafe branch January 4, 2022 16:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants