Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get rid of entropy checks #1584

Closed
garrettr opened this issue Feb 15, 2017 · 9 comments · Fixed by #5954
Closed

Get rid of entropy checks #1584

garrettr opened this issue Feb 15, 2017 · 9 comments · Fixed by #5954
Labels
Milestone

Comments

@garrettr
Copy link
Contributor

In source.py, we check for "available entropy" before generating a key. This is done because GPG blocks on "sufficient entropy" from /dev/random to generate a GPG key, and we don't want to block returning the HTTP response.

After discussing this with some more-knowledgeable security folks, a couple of things have come to my attention:

  • libgcrypt has apparently improved their DRBG code since I last investigated this issue. It now appears to allow the use of urandom (although this may require some configuration, and it's not clear to me how/when this is done), will use getrandom if it's available, etc.
  • libgcrypt has relaxed the formerly stringent checks to use /dev/random and only /dev/random. So if we can't configure libgcrypt to use a non-blocking CSPRNG, we may now be able to trick it by symlinking urandom, or using mknod, etc.

Making this change could have a variety of potential benefits:

However, we may need to use a more recent version of GPG/libgcrypt to take advantage of these recent improvements. This might require upgrading our entire stack from Ubuntu 14.04LTS - a change that we want to make anyway, but one that is non-trivial.

@ghost
Copy link

ghost commented Nov 28, 2017

@redshiftzero
Copy link
Contributor

According to this upstream ticket libgcrypt 1.8 and above will use getrandom() if it's available, but unfortunately @emkll found that xenial is using libgcrypt 1.6.

However another option would be to use the only-urandom gcrypt config option. From the docs this option will:

Always use the non-blocking /dev/urandom or the respective system call instead of the blocking /dev/random. If Libgcrypt is used early in the boot process of the system, this option should only be used if the system also supports the getrandom system call.

Since we're not generating source keys early in the boot process, this might be sufficient - if we want consider this path, we should next determine how long it takes for the entropy pool to be initialized.

@eloquence
Copy link
Member

For the 6/12-6/26 sprint, we've committed to at least an 8 hour time-boxed investigation to determine whether removal of these checks is feasible (this may include some external outreach to security experts). Since removing the checks would obviate the need for the "Flag for reply" workflow, it would not only make journalists' lives easier, but also save quite a bit of time in client development (freedomofpress/securedrop-client#247).

@eloquence
Copy link
Member

if it's available, but unfortunately @emkll found that xenial is using libgcrypt 1.6.

Self-hosting that package is not on the table, so if that's the only solution we'll have to wait until all prod instances have migrated to Bionic before we can potentially fully remove entropy checks (with perhaps a transitional stage where Xenial/Bionic behavior will differ).

However another option would be to use the only-urandom gcrypt config option.

@redshiftzero has committed to do a quick investigation during the 10/23-11/6 sprint into whether this or another workaround might indeed do the trick.

@redshiftzero
Copy link
Contributor

Above I mentioned the possibility of using the only_urandom option (see the relevant discussion of this option here). This option will use getrandom() if it is supported instead of /dev/urandom directly. We are using a version of the kernel that supports the getrandom() syscall. This is what we want as it will block shortly after boot if the RNG is not yet seeded. Otherwise once /dev/urandom is seeded, it will return high quality randomness from /dev/urandom indefinitely (here's a nice talk about /dev/urandom useful for application developers (PDF)).

We should be able to use this config option and remove both the flag for reply and entropy check logic, but I think we'll want to keep the async key generation.

Concretely I think we should:

  1. Use the only_urandom configuration option. When we move to bionic, we'll land on the version of libgcrypt that uses the new getrandom syscall for key generation by default (using this patch). At this time we no longer will need to set only_urandom.
  2. Continue to generate keys asynchronously unless testing shows that key generation is significantly faster. As seen by the plot below, currently key generation is not snappy enough to do synchronously (note that this was generated using libgcrypt out of the box, i.e. reading 300 bytes from /dev/random. It's worth checking again with the only_urandom option turned on but I didn't get a chance to do this):

keygen

  1. Remove the entropy checks and the flag for reply flow. We can release v2 of the API without the flag API endpoint, we remove the corresponding flag view functions, and the flagged column in the sources table in the database. At this time, we should remove haveged also as we will be using /dev/urandom on modern Linux and as such it is not necessary (and may even reduce security).
  2. We will continue to need to handle the case in the UI where the source has submitted something but does not yet have a key. This can happen in cases even without the entropy checking logic, for example if the system is rebooted shortly right after a source submits and key generation has not completed yet.
  3. When a source user signs back in, the logic will be much simpler: we check if they have a key, and if not, we attempt to generate their key again. This is the only time we can do the key generation since we require the user's codename for key generation, which we do not persistently store anywhere.

@eloquence
Copy link
Member

Thanks for the summary, @redshiftzero! Aside from the reboot scenario your mention, did your investigation shed light on how likely key generation is to be skipped (the check in https://github.com/freedomofpress/securedrop/blob/develop/securedrop/source_app/main.py#L221-L234) on typical prod systems, in a denial of service attack or high traffic scenario?

@eloquence
Copy link
Member

When a source user signs back in, the logic will be much simpler: we check if they have a key, and if not, we attempt to generate their key again. This is the only time we can do the key generation since we require the user's codename for key generation, which we do not persistently store anywhere.

Just leaving a note here that we should probably keep some version of the "Sorry we haven't responded yet" message in the UI, to make it clear to the source that the lack of response from the news org is not due to lack of interest.

@eloquence
Copy link
Member

This could be a nice stretch goal for 2.0.0, tentatively adding to milestone.

@eloquence
Copy link
Member

@rmol has committed to taking a stab at this during the 5/5-5/19 sprint, focusing on following up on Jen's earlier investigation and potentially removing the "Flag for reply" workflow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants