Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HMAC-SHA-256 test failures on upcoming gcc-15 (after partial union initialization changes) #9814

Open
trofi opened this issue Dec 1, 2024 · 9 comments
Assignees
Labels
bug component-crypto Crypto primitives and low-level interfaces size-s Estimated task size: small (~2d)

Comments

@trofi
Copy link
Contributor

trofi commented Dec 1, 2024

Summary

gcc-15 recently merged https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=0547dbb725b6d8e878a79e28a2e171eafcfbc1aa into it's master branch. Next gcc-15 release will break HMAC tests on mbedtls as:

  The following tests FAILED:
          91 - psa_crypto-suite (Failed)
          113 - psa_crypto_storage_format.v0-suite (Failed)

An example psa_crypto-suite failure is:

PSA MAC setup: good, HMAC-SHA-256 ................................. FAILED
  psa_mac_sign_setup(operation, key, alg) == *status
  at line 199, /home/slyfox/dev/git/mbedtls/tests/suites/test_suite_psa_crypto.function
  lhs = 0x0000000000000000 = 0
  rhs = 0xffffffffffffff77 = -137

System information

Mbed TLS version (number or commit id): 2ca6c28
Operating system and version: linux-6.11.8, NixOS with modified gcc from master branch.
Configuration (if not default, please attach mbedtls_config.h): default.
Compiler and options (if you used a pre-built binary, please indicate how you obtained it): gcc from master branch.
Additional environment information: gcc from master branch.

Expected behavior

The tests should pass.

Actual behavior

cmake reports the following failures:

  The following tests FAILED:
          91 - psa_crypto-suite (Failed)
          113 - psa_crypto_storage_format.v0-suite (Failed)

Steps to reproduce

Pick latest gcc master commit and build mbedtls against it.

Additional information

gcc changed partial union initialization handling in https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=0547dbb725b6d8e878a79e28a2e171eafcfbc1aa

That means that PSA_MAC_OPERATION_INIT is not a correct initializer for all active fields of psa_mac_operation_t.

For example in tests/src/psa_exercise_key.c psa_mac_operation_t operation = PSA_MAC_OPERATION_INIT; and leaves hash contexts (used as &hmac->hash_ctx) with uninitialized data (specifically hmac->hash_ctx.id).

valgrind shows a good location of under-initialized union fields as:

$ valgrind --track-origins=yes --trace-children=yes --num-callers=50 --track-fds=yes --leak-check=full --show-reachable=yes --malloc-fill=0xE1 --free-fill=0xF1 tests/test_suite_psa_crypto
...
==2758824== Conditional jump or move depends on uninitialised value(s)
==2758824==    at 0x483C6B: psa_hash_setup (psa_crypto.c:2298)
==2758824==    by 0x490ADA: psa_hmac_setup_internal (psa_crypto_mac.c:90)
==2758824==    by 0x490ADA: psa_mac_setup (psa_crypto_mac.c:299)
==2758824==    by 0x48412C: psa_driver_wrapper_mac_sign_setup (psa_crypto_driver_wrappers.h:2297)
==2758824==    by 0x48412C: psa_mac_setup (psa_crypto.c:2619)
==2758824==    by 0x4083BC: test_mac_key_policy (test_suite_psa_crypto.function:2192)
==2758824==    by 0x408877: test_mac_key_policy_wrapper (test_suite_psa_crypto.function:2264)
==2758824==    by 0x429F4E: dispatch_test (main_test.function:170)
==2758824==    by 0x42A813: execute_tests (host_test.function:676)
==2758824==    by 0x40247A: main (main_test.function:263)
==2758824==  Uninitialised value was created by a stack allocation
==2758824==    at 0x40822C: test_mac_key_policy (test_suite_psa_crypto.function:2167)
...

Here psa_hash_setup (psa_crypto.c:2298) tries to access operation->hash_ctx.id.

The following change works around some of the test failures (the full fix needs a lot more of those):

--- a/library/psa_crypto_mac.c
+++ b/library/psa_crypto_mac.c
@@ -467,2 +467,3 @@ psa_status_t mbedtls_psa_mac_compute(
     mbedtls_psa_mac_operation_t operation = MBEDTLS_PSA_MAC_OPERATION_INIT;
+    memset(&operation, 0, sizeof(operation));

--- a/tests/src/psa_exercise_key.c
+++ b/tests/src/psa_exercise_key.c
@@ -125,2 +125,3 @@ static int exercise_mac_key(mbedtls_svc_key_id_t key,
     psa_mac_operation_t operation = PSA_MAC_OPERATION_INIT;
+    memset(&operation, 0, sizeof (operation));
     const unsigned char input[] = "foo";
--- a/tests/suites/test_suite_psa_crypto.function
+++ b/tests/suites/test_suite_psa_crypto.function
@@ -2170,2 +2170,3 @@ void mac_key_policy(int policy_usage_arg,
     psa_mac_operation_t operation = PSA_MAC_OPERATION_INIT;
+    memset(&operation, 0, sizeof(operation));
     psa_key_type_t key_type = key_type_arg;
@@ -3493,2 +3494,3 @@ void mac_setup(int key_type_arg,
     psa_mac_operation_t operation = PSA_MAC_OPERATION_INIT;
+    memset(&operation, 0, sizeof (operation));
     psa_status_t status = PSA_ERROR_GENERIC_ERROR;
@@ -3536,2 +3538,3 @@ void mac_bad_order()
     psa_mac_operation_t operation = PSA_MAC_OPERATION_INIT;
+    memset(&operation, 0, sizeof (operation));
     uint8_t sign_mac[PSA_MAC_MAX_SIZE + 10] = { 0 };
@@ -3701,2 +3704,3 @@ void mac_sign(int key_type_arg,
     psa_mac_operation_t operation = PSA_MAC_OPERATION_INIT;
+    memset(&operation, 0, sizeof (operation));
     psa_key_attributes_t attributes = PSA_KEY_ATTRIBUTES_INIT;
@@ -3787,2 +3791,3 @@ void mac_verify(int key_type_arg,
     psa_mac_operation_t operation = PSA_MAC_OPERATION_INIT;
+    memset(&operation, 0, sizeof (operation));
     psa_key_attributes_t attributes = PSA_KEY_ATTRIBUTES_INIT;

But I think it needs a wider set of fixes to account for partial enum initializations all over the place.

Another workaround is to use newly added -fzero-init-padding-bits=unions compiler flag.

What would be the best fix here?

Thanks!

@trofi
Copy link
Contributor Author

trofi commented Dec 1, 2024

Here is my understanding which fields do and do not get initialized. mbedtls code is close to this one:

// at tests/src/psa_exercise_key.c:
  psa_mac_operation_t operation = PSA_MAC_OPERATION_INIT;

// library/psa_crypto.c:

  if (operation.hash_ctx.id != 0) { return error; }
  //...

// include/psa/crypto_struct.h:
  #define PSA_MAC_OPERATION_INIT { 0, 0, 0, { 0 } }

// include/psa/crypto.h:
  typedef struct psa_mac_operation_s psa_mac_operation_t;

// include/psa/crypto_struct.h:
  struct psa_mac_operation_s {
    unsigned int id;
    uint8_t mac_size;
    unsigned int is_sign : 1;
    psa_driver_mac_context_t ctx;
  };

// include/psa/crypto_driver_contexts_composites.h:
  typedef union {
    unsigned dummy; /* Make sure this union is always non-empty */
    mbedtls_psa_mac_operation_t mbedtls_ctx;
  } psa_driver_mac_context_t;

// include/psa/crypto_builtin_composites.h:
  typedef struct {
    psa_algorithm_t alg;
    union {
        unsigned dummy; /* Make the union non-empty even with no supported algorithms. */
        mbedtls_psa_hmac_operation_t hmac;
        mbedtls_cipher_context_t cmac;
    } ctx;
  } mbedtls_psa_mac_operation_t;

// include/psa/crypto_builtin_composites.h
  typedef struct {
    /** The HMAC algorithm in use */
    psa_algorithm_t alg;
    /** The hash context. */
    struct psa_hash_operation_s hash_ctx;
    /** The HMAC part of the context. */
    uint8_t opad[PSA_HMAC_MAX_HASH_BLOCK_SIZE];
  } mbedtls_psa_hmac_operation_t;

// include/psa/crypto_types.h
  typedef uint32_t psa_algorithm_t;

// include/psa/crypto_struct.h
  struct psa_hash_operation_s {
    /** Unique ID indicating which driver got assigned to do the
     * operation. Since driver contexts are driver-specific, swapping
     * drivers halfway through the operation is not supported.
     * ID values are auto-generated in psa_driver_wrappers.h.
     * ID value zero means the context is not valid or not assigned to
     * any driver (i.e. the driver context is not active, in use). */
    unsigned int id;
    psa_driver_hash_context_t ctx;
  };

If we compressit into a single declaration it looks this way:

  struct {
    unsigned int id; // initialized below
    uint8_t mac_size; // initialized below
    unsigned int is_sign : 1; // initialized below
    union {
      unsigned dummy; // initialized below
      struct {
        uint32_t alg; // initialized below, alias of `dummy`

        // anything below is NOT initialized

        union {
          unsigned dummy;
          struct {
              uint32_t alg;
              struct {
                  unsigned int id; // <- we are about to use this field
                  psa_driver_hash_context_t ctx;
              } hash_ctx;
              uint8_t opad[PSA_HMAC_MAX_HASH_BLOCK_SIZE];
          } hmac;
          // ..
       } mbedtls_ctx;
    } ctx;
  } operation = {
    0, // id
    0, // mac_size
    0, // is_sign
    { 0 } // ctx.dummy
  };

  if (operation.hash_ctx.id != 0) { return error; }

I hope that makes it clearer.

@gilles-peskine-arm gilles-peskine-arm added bug component-crypto Crypto primitives and low-level interfaces labels Dec 2, 2024
@gilles-peskine-arm
Copy link
Contributor

gilles-peskine-arm commented Dec 2, 2024

Thanks for letting us know! This is a library bug:

psa_mac_operation_t operation = PSA_MAC_OPERATION_INIT;
psa_mac_setup_sign(&operation, ...);

or

psa_mac_operation_t operation = {0};
psa_mac_setup_sign(&operation, ...);

are both supposed to be correct code. The setup function should work with whatever gets initialized. The fix should be pretty straightforward (put that memset in the setup function) once we figure out which functions are affected.

Unfortunately, none of the compilers we test with take advantage of the possibility of optimizing union initializations, so our testing can't find this bug. (The only compiler I had encountered that does this optimization, at least for our code base, is CompCert.) Eventually we'll test with GCC 15, but we prefer not to hook pre-release versions of compilers in our CI.

@gilles-peskine-arm gilles-peskine-arm self-assigned this Dec 2, 2024
gilles-peskine-arm added a commit to Mbed-TLS/mbedtls-test that referenced this issue Dec 7, 2024
Temporarily make a recent GCC snapshot available for testing. We make it
available separately, not as gcc-latest, because it is known to break
current branches (Mbed-TLS/mbedtls#9814).

Once #9814 is fixed in all tested branches (including non-ancient pull
requests), which will surely be after the GCC 15 release, we should
switch gcc-latest to GCC 15 (or above).

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
gilles-peskine-arm added a commit to gilles-peskine-arm/mbedtls that referenced this issue Dec 7, 2024
Non-regression for Mbed-TLS#9814

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
gilles-peskine-arm added a commit to Mbed-TLS/mbedtls-test that referenced this issue Dec 7, 2024
Temporarily make a recent GCC snapshot available for testing. We make it
available separately, not as gcc-latest, because it is known to break
current branches (Mbed-TLS/mbedtls#9814).

Once #9814 is fixed in all tested branches (including non-ancient pull
requests), which will surely be after the GCC 15 release, we should
switch gcc-latest to GCC 15 (or above).

Signed-off-by: Gilles Peskine <Gilles.Peskine@arm.com>
@gilles-peskine-arm gilles-peskine-arm added the size-s Estimated task size: small (~2d) label Dec 10, 2024
@gilles-peskine-arm
Copy link
Contributor

As far as I understand, Clang made a similar optimization to union initialization, but backed it out. There, it was an originally unintended consequence of an LLVM optimization. The optimization was present in some 18.x releases (it's not clear to me exactly which ones), but was patched by many distributions, and was removed before 19.0.

For example, on Godbolt (thanks to @bensze01 for that example), in init_local_bad, to initialize a union {uint8_t u8; uint64_t u64;}, Clang 18.1.0 emits a byte move whereas Clang 19.1.0 emits a word move. Clang 18.1.3 from Ubuntu on my machine emits a word move.

@gilles-peskine-arm
Copy link
Contributor

gilles-peskine-arm commented Dec 10, 2024

Some notes on how we're going to fix that. There are several things we can do (not mutually exclusive):

  • Change unions so that their first member is always the largest one. (We can't use designated initializers since we want to retain C99 C++ compatibility in public headers.) This way {0} sets everything to zero. Note that we can't really control the order and sizes when people add drivers, so we'd have to change unions to doubly nested union like this:

    typedef struct {
        …
        union {
            uint8_t[sizeof(psa_driver_hash_context_t)] bytes;
            psa_driver_hash_context_t ctx;
        } ctx;
    } psa_hash_operation_t;
    

    Benefit: guarantees that unions are always initialized to what is naively expected.

    Limitation: this is an ABI change (not an API change because there are no transparent types that are unions), so not acceptable for LTS branches.

  • Change PSA_xxx_OPERATION_INIT to be something more complicated than {0}@bensze01 suggested

    #define INIT_UNION(type) (((union {char init[sizeof(type)]; type val;}) {{0}}).val)
    

    Benefit: guarantees that unions are always initialized to what is naively expected.

    Downside: doesn't help if application code uses {0} or {}, so this is not enough in minor releases (which currently means the 2.28 and 3.6 LTS branches), it's only acceptable for the upcoming TF-PSA-Crypto 1.0 (part of Mbed TLS 4.0). This would be compliant with the PSA API specification (see [API] note below).

  • Change psa_mac_sign_setup and such to memset the union to zero.

    Benefit: very targeted change, working even if the application initializes the structure with {0}.

    Downside: slightly inefficient when the initializer already sets the structure to all-bits-zero.

    Downside: needs us to get it right in all functions, rather than just having one idiom for initialization.


[API] The PSA Crypto API specification explains that you can initialize operation objects in multiple ways (e.g. psa_hash_operation_t, there's similar wording for all similar opaque structure types.) One of the ways is PSA_HASH_OPERATION_INIT. Another way is “Initialize the object to logical zero values by declaring the object as static or global without an explicit initializer”. We don't explicitly state that {} or {0} is a valid initialization. So technically, if {0} is not a valid initializer, we aren't breaking any explicit promise. But that seems to me like a gotcha. I actually didn't remember that {0} wasn't guaranteed to be a valid initializer in the PSA spec until I looked it up! (This changed between version 1.0beta3 and 1.0rc0 of the API specification.)

@thesamesam
Copy link

thesamesam commented Dec 10, 2024

Note that designated initializers are C99 and mbedtls already requires C99 per

* A C99 toolchain (compiler, linker, archiver). We actively test with GCC 5.4, Clang 3.8, Arm Compiler 6, IAR 8 and Visual Studio 2017. More recent versions should work. Slightly older versions may work.

@gilles-peskine-arm
Copy link
Contributor

Note that designated initializers are C99

Oh right, my bad. But in C++ they've only been introduced recently, in C++20. And our public headers are meant to be compatible with C++. We've never officially said which version of C++, but C++20 is still a bit too recent to require. So we can't use designated initializers in public headers, I was just misremembering why.

gentoo-bot pushed a commit to gentoo/gentoo that referenced this issue Dec 23, 2024
GCC 15 makes a change to union initialisation and exposes a bug in
mbedtls. Build with the new -fzero-init-padding-bits=unions flag if
supported to fix the testsuite until the upstream bug is fixed.

Bug: Mbed-TLS/mbedtls#9814
Closes: https://bugs.gentoo.org/946544
Signed-off-by: Sam James <sam@gentoo.org>
@mpg
Copy link
Contributor

mpg commented Dec 24, 2024

change unions to doubly nested union like this

Note: when reading the example it wasn't immediately obvious to me where the nesting was happening - that's because psa_driver_hash_context_t is a union type.

Limitation: this is an ABI change

I seem to be a bit slow this morning, but how exactly is it an ABI change?

@gilles-peskine-arm
Copy link
Contributor

Limitation: this is an ABI change

I seem to be a bit slow this morning, but how exactly is it an ABI change?

In practice, the layout in memory is very likely to be the same. But a compiler is allowed to compile union {a_t a; b_t b;} and union {b_t b; a_t a;} differently. For example, it can store information about which member of the union was written last in some padding. So in theory, it's an ABI change.

By the way, I wasn't fully precise: it is an API change! It changes the meaning of {0} as an initializer for the type. But it isn't an incompatible API change because we'd only be making {0} “better”.

@mpg
Copy link
Contributor

mpg commented Dec 24, 2024

Ah OK, yes, I was thinking we'd get the same memory layout, but of course the standard makes to guarantees about that.

Downside: doesn't help if application code uses {0} or {}, so this is not enough in minor releases (which currently means the 2.28 and 3.6 LTS branches), it's only acceptable for the upcoming TF-PSA-Crypto 1.0 (part of Mbed TLS 4.0). This would be compliant with the PSA API specification (see [API] note below).

That's acceptable in the upcoming major release, but still not super user-friendly: people who already use {0} or {} will need to find every place they do so and replace it with the proper macro. If they miss a place, the outcome is likely that their code will compile but be incorrect.

I get a feeling that's a situation where someone (either us with the first and last approach, or all users with the middle approach) will need to make sure they're doing the right thing in all places, under penalty of code that compiles but is not correct (with some compiles). It's probably better for us to be taking that burden - and hopefully at some point we can have tests ensuring we do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug component-crypto Crypto primitives and low-level interfaces size-s Estimated task size: small (~2d)
Projects
None yet
Development

No branches or pull requests

4 participants