Skip to content

p2p/conn/secret_connection.go allows MITM #3010

Closed
@HarryR

Description

@HarryR

The MakeSecretConnection function has a bug which allows a third party to MITM the connection and decrypt/modify data in a way which both sides think the long-term key of either side is the same.

This is due to a lack of validation that the ephemeral or long-term keys are not of low-order.

  1. In shareEphPub the ephemeral keys are transmitted to each other.
  2. Then in computeDHSecret a shared secret is derived from the two ephemeral keys using curve25519.ScalarMult
  3. Using that secret the send and receive keys, and a challenge, are derived using deriveSecretAndChallenge
  4. The challenge and long-term public key is then signed and transmitted to the other party using

The problem stems from computeDHSecret not validating the ephemeral public key, if the connection is intercepted and an ephemeral key consisting of all zeros is injected then the secret from computeDHSecret will be the same for both parties for every handshake with any key. That is - the two parties, because of the MITM injecting a null ephemeral key in place of the remotes key, will then both derive the same secret which is also known to the MITM actor.

The derived recvSecret, sendSecret and challenge is now also known by the MITM actor, and will be the same for every connection handshake that is intercepted. The authSigMsg by the long-term keys of either side will also be the same for every intercepted connection, and because the send/recv secrets are known the MITM actor can replay the auth signature message to the other party to impersonate any public key that it's previously observed.

So:

  • Full active intercept with decrypt/re-encrypt
  • Impersonation of any peer using signature re-play

This means SecretConnection isn't secure if connections are intercepted / redirected.

And any peers identity can be impersonated at the connection-level (as in, it'll show that peer in the remote peers list), unless the long-term public key of the peer is validated elsewhere (e.g. expecting a signature from the remote peers nodeInfo.ID() cannot be forged, only re-played during the handshake).

Affected functions:

Example code:

package main

import (
	"fmt"
	crand "crypto/rand"
	"golang.org/x/crypto/curve25519"
	"golang.org/x/crypto/nacl/box"
)


func genEphKeys() (ephPub, ephPriv *[32]byte) {
	var err error
	ephPub, ephPriv, err = box.GenerateKey(crand.Reader)
	if err != nil {
		panic("Could not generate ephemeral keypairs")
	}
	return
}

func computeDHSecret(remPubKey, locPrivKey *[32]byte) (shrKey *[32]byte) {
	shrKey = new([32]byte)
	curve25519.ScalarMult(shrKey, locPrivKey, remPubKey)
	return
}

func main() {
	//remPubKey, _ := genEphKeys()
	var remPubKey [32]byte
	_, locPrivKey := genEphKeys()

	shrKey := computeDHSecret(&remPubKey, locPrivKey)

	fmt.Println("remPubKey =", remPubKey)
	fmt.Println("locPrivKey =", locPrivKey)
	fmt.Println("shrKey =", shrKey)
}

Suggested fix:

Make a reader specifically for public keys, perform low-order point validation on the keys after reading.

Activity

added
T:bugType Bug (Confirmed)
T:securityType: Security (specify priority)
C:p2pComponent: P2P pkg
on Dec 12, 2018
tarcieri

tarcieri commented on Dec 13, 2018

@tarcieri

Handling of all-zero X25519 public keys is a somewhat contentious issue, and something of a red herring given the attack described here. I've generally been weakly in favor of detecting and rejecting them, but explicit handling of them is unnecessary in a correctly-designed protocol.

We can start by looking at what Noise does:

https://noiseprotocol.org/noise.html

Section 12.1:

Executes the Curve25519 DH function (aka "X25519" in [7]). Invalid public key values will produce an output of all zeros.

Alternatively, implementations are allowed to detect inputs that produce an all-zeros output and signal an error instead. This behavior is discouraged because it adds complexity and implementation variance, and does not improve security. This behavior is allowed because it might match the behavior of some software.

The Noise protocol spec effectively says that implementations MAY detect inputs that produce all-zero outputs, but SHOULD ignore them and output all zeroes in this case.

Backing up a bit and looking at the reported issue, we can see the real problem here:

The problem stems from computeDHSecret not validating the ephemeral public key, if the connection is intercepted and an ephemeral key consisting of all zeros is injected

In this case, an active attacker is modifying one of the handshake messages. The real issue here is handshake malleability: a well designed protocol MUST detect and reject this regardless of how the messages were modified. Exactly how the attacker is modifying the message is irrelevant: any modifications to any handshake parameters MUST be rejected. The injection of an all-zero public key is something of a red herring: if an attacker can successfully modify the ephemeral keys without being detected, they could just as easily substitute one for which they know the discrete log, rather than one which is all zeroes.

TLS and Noise both solve this problem with transcript hashing (Noise calls it a "handshake hash" - see section 2.2): both sides compute a digest of all messages sent and received during a handshake. If both sides do not compute the same transcript hash, this MUST abort the connection in some way. In TLS or Noise this manifests as a MAC verification failure.

My recommendation here would be to adopt one of the Noise handshake patterns outright, or barring that, adding a "handshake hash" similar to the one Noise uses to the existing handshake.

As for the all-zero public key issue, I could go either way on that. If the protocol were designed correctly, this is an attack which can only be performed by one of the legitimate connection endpoints (i.e. self-MitM, a threat I don't care about). There are good arguments on both sides for detecting it or ignoring it, although as I said earlier I'm weakly in favor of detecting it, but I feel that way because I think it might provide value detecting buggy implementations, not for security purposes.

HarryR

HarryR commented on Dec 13, 2018

@HarryR
Author

I am suggesting a low-effort fix which doesn't break network-layer compatibility with existing clients.

if an attacker can successfully modify the ephemeral keys without being detected, they could just as easily substitute one for which they know the discrete log, rather than one which is all zeroes.

With SecretConnection you could use an arbitrary ephemeral key, but you can't re-play the challenge signature of either side (which you can do, using a low-order point as the interceptor/malicious-actor's ephemeral key, hence this ticket).

If you are expecting a specific peer and after the handshake the peers long-term ID is different then you can detect interception and disconnect the connection. But... I can't see where in the code verifies that the long-term key that it ended up connecting to was the one it expected (e.g. if you are connecting to a peer with a known long-term key, but get another one instead, you have no authenticity).

For necessary point validation checks, see: https://github.com/jedisct1/libsodium/blob/cfb0f94704841f943a5a11d9e335da409c55d58a/src/libsodium/crypto_core/ed25519/core_ed25519.c#L7

This bug is specifically to do with not validating the ephemeral keys with the current protocol. Validating the public keys fixes the bug.


Re: Noise and validating points being unnecessary, this pushes a hash of the entire conversation from both sides into the challenge signature, rather than just something derived from the DH kex between ephemeral keys. It binds the long-term keys to the contents of the handshake regardless of whether or not the ephemeral points are valid.

However, it doesn't protect against a malicious actor forcing you to derive a predictable session key, which would allow one party to intercept many connections after the handshake, without any communication with the malicious actor.

There are a handful of other bugs throughout Tendermint and related projects which stem from not validating public keys, I highly suggest you do it, just switching to Noise will not solve those problem, and IMO doesn't really fully fix this one either.

tarcieri

tarcieri commented on Dec 13, 2018

@tarcieri

I'd agree that point validation provides a band-aid for this specific attack, and as I said earlier is probably a good idea in general. However:

For necessary point validation checks, see:
https://github.com/jedisct1/libsodium/blob/cfb0f94704841f943a5a11d9e335da409c55d58a/src/libsodium/crypto_core/ed25519/core_ed25519.c#L7

Note that these checks are for the twisted Edwards form of Curve25519 (i.e. "edwards25519"), and most of them are irrelevant or orthogonal to this discussion (the has_small_order one in particular is actually relevant).

The context of this discussion is Diffie-Hellman over the Montgomery form of Curve25519, a.k.a. "X25519". The two curve forms have different wire formats: Ed25519 uses a compressed point, whereas X25519 uses a Montgomery-u coordinate (a.k.a. "x") alone, and in doing so strategically eliminates all other classes of invalid points besides the low order ones we're discussing here (see below).

The point validation you're suggesting can be found on the original Curve25519 ECDH web site:

https://cr.yp.to/ecdh.html

How do I validate Curve25519 public keys?
Don't. The Curve25519 function was carefully designed to allow all 32-byte strings as Diffie-Hellman public keys. Relevant lower-level facts: the number of points of this elliptic curve over the base field is 8 times the prime 2^252 + 27742317777372353535851937790883648493; the number of points of the twist is 4 times the prime 2^253 - 55484635554744707071703875581767296995. This is discussed in more detail in the curve25519 paper.
There are some unusual non-Diffie-Hellman elliptic-curve protocols that need to ensure ``contributory'' behavior. In those protocols, you should reject the 32-byte strings that, in little-endian form, represent 0, 1, 325606250916557431795983626356110631294008115727848805560023387167927233504
(which has order 8), 39382357235489614581723060781553021112529911719440698176882885853963445705823 (which also has order 8), 2^255 - 19 - 1, 2^255 - 19, 2^255 - 19 + 1, 2^255 - 19 + 325606250916557431795983626356110631294008115727848805560023387167927233504, 2^255 - 19 + 39382357235489614581723060781553021112529911719440698176882885853963445705823, 2(2^255 - 19) - 1, 2(2^255 - 19), and 2(2^255 - 19) + 1. But these exclusions are unnecessary for Diffie-Hellman.

The context being discussed here is Diffie-Hellman.

So is djb wrong when he says "Don't"? Again, that's debatable. The "May the Forth" paper suggests it's probably a good idea to defend against microarchitectural attacks, for example:

Rejecting Known Bad Points.
To protect against small subgroup attacks against Curve25519 and related curves that have
a small set of low-order elements, an implementation can simply
check if the received public key is in the set. Bernstein [12] provides
a full list of these points for Curve25519, but suggests that rejecting
these points is only necessary for protocols that wish to ensure
“contributory” behavior. Langley and Hamburg [53] have a similar
suggestion. We argue that rejecting these points would also give
better side-channel protection. While this protection may seem
unnecessary when used with constant-time code, as Kaufmann
et al. [50] demonstrate, constant-time code is fragile and may fail
to provide adequate protection.

All that said: the core problem here is handshake malleability, not low order points. Malleating these keys MUST result in a MAC verify fail for the protocol to be secure.

HarryR

HarryR commented on Dec 13, 2018

@HarryR
Author

Bernstein [12] provides a full list of these points for Curve25519, but suggests that rejecting these points is only necessary for protocols that wish to ensure “contributory” behavior.

For contributory behaviour (e.g. Diffie-Hellman KEX) it is necessary to verify the point is not of low order. Checks I'd use are:

  • Multiply the point by the cofactor (8) and check the point isn't infinity to verify it's not a point of low-order.
  • Then multiply original point by the prime-ordered sub-group (L) to verify it is infinity (to check it's on the main group)
  • Verify curve equation holds for point
  • Verify the output after DH KEX isn't all zeros as a sanity measure, although technically unnecessary?

The sodium implementation has hard-coded low-order points, but needs to do a scalar multiply by L to verify it's on the prime-ordered group.

I'm unsure about the canonical test, it does some bit fiddling and the differences for ed25519 vs curve25519 aren't immediately obvious.

tarcieri

tarcieri commented on Dec 13, 2018

@tarcieri

For contributory behaviour (e.g. Diffie-Hellman KEX) it is necessary to verify the point is not of low order.

While there are good reasons for doing so which I have already covered, the Curve25519 web site explicitly calls out Diffie-Hellman as a use case for which the low order check is unnecessary.

https://cr.yp.to/ecdh.html

Here is the relevant text about contributory behavior:

There are some unusual non-Diffie-Hellman (edit: emphasis mine) elliptic-curve protocols that need to ensure "contributory'' behavior. In those protocols, you should reject [...] But these exclusions are unnecessary for Diffie-Hellman.

HarryR

HarryR commented on Dec 13, 2018

@HarryR
Author

Righto. I will leave this for you to figure out.

However, I would argue that the statement is wrong, because it lead to the bug described above.

tarcieri

tarcieri commented on Dec 13, 2018

@tarcieri

The same bug does not exist in TLS or Noise, regardless of whether they check for low-order points, and that's because the root cause is a malleable, attacker-controlled ephemeral key.

Beyond that, I think "low order points" are a complete red herring here: this attack is specific to the point at Montgomery-u 0, which so happens to be in the set of low order points, and is possible because any number multiplied by 0 is 0.

To my knowledge you cannot perform a similar attack with any of the other blacklisted points.

HarryR

HarryR commented on Dec 13, 2018

@HarryR
Author

Beyond that, I think "low order points" are a complete red herring here: this attack is specific to the point at Montgomery-u 0, which so happens to be in the set of low order points, and is possible because any number multiplied by 0 is 0.

The same attack applies to all points of low-order, for example

func main() {
        _, locPrivKey := genEphKeys()

        /* 325606250916557431795983626356110631294008115727848805560023387167927233504
           (order 8) */
        /*
        remPubKey := [32]byte{ 0xe0, 0xeb, 0x7a, 0x7c, 0x3b, 0x41, 0xb8, 0xae, 0x16, 0x56, 0xe3,
          0xfa, 0xf1, 0x9f, 0xc4, 0x6a, 0xda, 0x09, 0x8d, 0xeb, 0x9c, 0x32,
          0xb1, 0xfd, 0x86, 0x62, 0x05, 0x16, 0x5f, 0x49, 0xb8, 0x00 }
        */

        /* 39382357235489614581723060781553021112529911719440698176882885853963445705823
           (order 8) */
        /*
        remPubKey := [32]byte{ 0x5f, 0x9c, 0x95, 0xbc, 0xa3, 0x50, 0x8c, 0x24, 0xb1, 0xd0, 0xb1,
          0x55, 0x9c, 0x83, 0xef, 0x5b, 0x04, 0x44, 0x5c, 0xc4, 0x58, 0x1c,
          0x8e, 0x86, 0xd8, 0x22, 0x4e, 0xdd, 0xd0, 0x9f, 0x11, 0x57 }
        */

        /* p-1 (order 2) */
        remPubKey := [32]byte{ 0xec, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
          0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
          0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x7f }

        shrKey := computeDHSecret(&remPubKey, locPrivKey)

        fmt.Println("remPubKey =", remPubKey)
        fmt.Println("locPrivKey =", locPrivKey)
        fmt.Println("shrKey =", shrKey)
}
tarcieri

tarcieri commented on Dec 13, 2018

@tarcieri

Touché. I'm not able to execute that code right now, but is shrKey zero?

HarryR

HarryR commented on Dec 13, 2018

@HarryR
Author

Yup it is all zeros.

With other curves there should be n potential points as the result when one of the points is of order n, but I think x25519 may be weird in that sense and just return all zeros instead, is worth double-checking.

A sufficient check, instead of key validation, may be validating that the result of the scalarmult is non-zero.

tarcieri

tarcieri commented on Dec 13, 2018

@tarcieri

Yeah, so the Noise protocol spec discusses that case here:

https://noiseprotocol.org/noise.html

Alternatively, implementations are allowed to detect inputs that produce an all-zeros output and signal an error instead (Section 12.1)

I think it would probably be best to detect and return an error in this case, but note that Noise is also designed to be robust against it:

Executes the Curve25519 DH function (aka "X25519" in [7]). Invalid public key values will produce an output of all zeros.

23 remaining items

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Labels

C:p2pComponent: P2P pkgT:bugType Bug (Confirmed)T:securityType: Security (specify priority)

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions

    p2p/conn/secret_connection.go allows MITM · Issue #3010 · tendermint/tendermint