[core-updates] Entropy starvation during boot

Done. Submitted by Marius Bakke.
Details
2 participants
  • Ludovic Courtès
  • Marius Bakke
Owner
unassigned
Severity
important
Marius Bakke wrote on 24 Sep 2019 17:48
(address . bug-guix@gnu.org)
87sgolae6l.fsf@devup.no
Hello,

After reconfiguring on the 'core-updates' branch, systems using the
OpenSSH service will occasionally (not always!) hang forever during
boot, waiting for entropy. Moving the mouse or mashing the keyboard
allows the boot to proceed.

I don't think this is limited to OpenSSH, but anything that calls
getrandom() during startup.

There is some information about this problem and various workarounds
here, including links to recent LKML discussions:

https://daniel-lange.com/archives/152-hello-buster.html

For Guix, I believe adding (service urandom-seed-service-type) to
%base-services should be sufficient, but have not verified this yet.

Ludovic Courtès wrote on 2 Oct 2019 15:42
(name . Marius Bakke)(address . mbakke@fastmail.com)(address . 37501@debbugs.gnu.org)
875zl7i7qf.fsf@gnu.org
Hello!

Marius Bakke <mbakke@fastmail.com> skribis:

> After reconfiguring on the 'core-updates' branch, systems using the
> OpenSSH service will occasionally (not always!) hang forever during
> boot, waiting for entropy. Moving the mouse or mashing the keyboard
> allows the boot to proceed.
>
> I don't think this is limited to OpenSSH, but anything that calls
> getrandom() during startup.
>
> There is some information about this problem and various workarounds
> here, including links to recent LKML discussions:
>
> https://daniel-lange.com/archives/152-hello-buster.html
>
> For Guix, I believe adding (service urandom-seed-service-type) to
> %base-services should be sufficient, but have not verified this yet.

‘urandom-seed’ is already part of ‘%base-services’. I have Guix System
on ‘core-updates’ since Sept. 19th, and I haven’t experienced the
problem (or at least it always booted without troubles; could it be that
I didn’t notice?).

Did you remove ‘urandom-seed’ from your list of services?

Thanks,
Ludo’.
Ludovic Courtès wrote on 3 Oct 2019 00:19
control message for bug #37501
(address . control@debbugs.gnu.org)
87wodmdc34.fsf@gnu.org
severity 37501 important
quit
Ludovic Courtès wrote on 3 Oct 2019 00:29
Re: bug#37501: [core-updates] Entropy starvation during boot
(name . Marius Bakke)(address . mbakke@fastmail.com)(address . 37501@debbugs.gnu.org)
87h84qdbmg.fsf@gnu.org
Hi again,

Marius Bakke <mbakke@fastmail.com> skribis:

> After reconfiguring on the 'core-updates' branch, systems using the
> OpenSSH service will occasionally (not always!) hang forever during
> boot, waiting for entropy. Moving the mouse or mashing the keyboard
> allows the boot to proceed.
>
> I don't think this is limited to OpenSSH, but anything that calls
> getrandom() during startup.
>
> There is some information about this problem and various workarounds
> here, including links to recent LKML discussions:
>
> https://daniel-lange.com/archives/152-hello-buster.html

I read some of these, and our ‘urandom-seed-service-type’ has the same
bug as <https://github.com/systemd/systemd/issues/4271>. Namely, we
write the previous seed to /dev/urandom but we don’t credit the
entropy.

The attached patch fixes that, and I think it should fix the problem you
reported. Could people give it a try?

I’m interested in seeing the value of
/proc/sys/kernel/random/entropy_avail with and without this patch right
after boot (don’t try it in ‘guix system vm’ because there’s no seed
there.)

I wasn’t sure how much to add to the entropy count, but I think it’s
safe to account for all the bits of the seed since we know that it comes
from /dev/urandom.
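
As a rough illustration of that accounting, the credited amount is eight bits per byte of the seed file. The sketch below is plain shell rather than the Guile code in the patch, and `seed_bits` is a hypothetical helper name that is not part of Guix.

```shell
#!/bin/sh
# seed_bits FILE: print the size of FILE in bits, i.e. the value that
# would be passed to add-to-entropy-count for that seed.
# Hypothetical helper for illustration only; not part of Guix.
seed_bits() {
    bytes=$(wc -c < "$1") || return 1
    echo $(( bytes * 8 ))
}
```

For a 512-byte seed file, `seed_bits` prints 4096.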

Thoughts?

Ludo’.
diff --git a/gnu/services/base.scm b/gnu/services/base.scm
index 25716ef152..3fe5cb3329 100644
--- a/gnu/services/base.scm
+++ b/gnu/services/base.scm
@@ -573,7 +573,13 @@ file systems, as well as corresponding @file{/etc/fstab} entries.")))
                         (lambda (seed)
                           (call-with-output-file "/dev/urandom"
                             (lambda (urandom)
-                              (dump-port seed urandom))))))
+                              (dump-port seed urandom)
+
+                              ;; Writing SEED to URANDOM isn't enough: we must
+                              ;; also tell the kernel to account for these
+                              ;; extra bits of entropy.
+                              (let ((bits (* 8 (stat:size (stat seed)))))
+                                (add-to-entropy-count urandom bits)))))))
 
                     ;; Try writing from /dev/hwrng into /dev/urandom.
                     ;; It seems that the file /dev/hwrng always exists, even
diff --git a/guix/build/syscalls.scm b/guix/build/syscalls.scm
index f2fdb4d9d1..bbf2531c79 100644
--- a/guix/build/syscalls.scm
+++ b/guix/build/syscalls.scm
@@ -68,6 +68,7 @@
             statfs
             free-disk-space
             device-in-use?
+            add-to-entropy-count
 
             processes
             mkdtemp!
@@ -706,6 +707,33 @@ backend device."
              (list (strerror err))
              (list err))))))
 
+
+;;;
+;;; Random.
+;;;
+
+;; From <uapi/linux/random.h>.
+(define RNDADDTOENTCNT #x40045201)
+
+(define (add-to-entropy-count port-or-fd n)
+  "Add N to the kernel's entropy count (the value that can be read from
+/proc/sys/kernel/random/entropy_avail).  PORT-OR-FD must correspond to
+/dev/urandom or /dev/random.  Raise to 'system-error with EPERM when the
+caller lacks root privileges."
+  (let ((fd  (if (port? port-or-fd)
+                 (fileno port-or-fd)
+                 port-or-fd))
+        (box (make-bytevector (sizeof int))))
+    (bytevector-sint-set! box 0 n (native-endianness)
+                          (sizeof int))
+    (let-values (((ret err)
+                  (%ioctl fd RNDADDTOENTCNT
+                          (bytevector->pointer box))))
+      (unless (zero? err)
+        (throw 'system-error "add-to-entropy-count" "~A"
+               (list (strerror err))
+               (list err))))))
+
 
 ;;;
 ;;; Containers.
diff --git a/tests/syscalls.scm b/tests/syscalls.scm
index eeb223b950..1b3121e503 100644
--- a/tests/syscalls.scm
+++ b/tests/syscalls.scm
@@ -567,6 +567,19 @@
   (let ((result (call-with-input-file "/var/run/utmpx" read-utmpx)))
     (or (utmpx? result) (eof-object? result))))
 
+(when (zero? (getuid))
+  (test-skip 1))
+(test-equal "add-to-entropy-count"
+  EPERM
+  (call-with-output-file "/dev/urandom"
+    (lambda (port)
+      (catch 'system-error
+        (lambda ()
+          (add-to-entropy-count port 77)
+          #f)
+        (lambda args
+          (system-error-errno args))))))
+
 (test-end)
 
 (false-if-exception (delete-file temp-file))
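
The `RNDADDTOENTCNT` value hard-coded in the patch can be cross-checked against the `_IOW('R', 0x01, int)` definition in the kernel headers. The shell arithmetic below is only a sketch: it assumes the asm-generic ioctl layout (direction in bits 30-31, argument size in bits 16-29, type in bits 8-15, number in bits 0-7) and a 4-byte `int`. Some architectures use a different layout, so the constant is not necessarily portable. `ioc_write` is a hypothetical helper name.

```shell
#!/bin/sh
# ioc_write TYPE NR SIZE: encode a "write" ioctl request number using the
# asm-generic layout (an assumption; alpha, mips, powerpc and sparc differ).
ioc_write() {
    printf '0x%08x\n' $(( (1 << 30) | ($3 << 16) | ($1 << 8) | $2 ))
}

# 'R' is ASCII 0x52; request number 0x01; the argument is a 4-byte int.
ioc_write 0x52 0x01 4
```

Under those assumptions this prints 0x40045201, which matches the `#x40045201` used in the patch.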
Marius Bakke wrote on 4 Oct 2019 00:10
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 37501@debbugs.gnu.org)
87zhih7a5m.fsf@devup.no
Ludovic Courtès <ludo@gnu.org> writes:

> Hi again,
>
> Marius Bakke <mbakke@fastmail.com> skribis:
>
>> After reconfiguring on the 'core-updates' branch, systems using the
>> OpenSSH service will occasionally (not always!) hang forever during
>> boot, waiting for entropy. Moving the mouse or mashing the keyboard
>> allows the boot to proceed.
>>
>> I don't think this is limited to OpenSSH, but anything that calls
>> getrandom() during startup.
>>
>> There is some information about this problem and various workarounds
>> here, including links to recent LKML discussions:
>>
>> https://daniel-lange.com/archives/152-hello-buster.html
>
> I read some of these, and our ‘urandom-seed-service-type’ has the same
> bug as <https://github.com/systemd/systemd/issues/4271>. Namely, we
> write the previous seed to /dev/urandom but we don’t credit the
> entropy.
>
> The attached patch fixes that, and I think it should fix the problem you
> reported. Could people give it a try?

Good catch, LGTM. Unfortunately it does not fix the problem.

> I’m interested in seeing the value of
> /proc/sys/kernel/random/entropy_avail with and without this patch right
> after boot (don’t try it in ‘guix system vm’ because there’s no seed
> there.)

before - 243
after - 2419

I don't know why this change was insufficient. Perhaps the kernel
does not consider such a seed alone trustworthy enough? I also tried to
increase the seed size to no avail.

I found this patch in the 5.4 kernel tree after reading the commit log
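of random.c:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3f2dc2798b81531fd93a3b9b7c39da47ec689e55

...which *does* solve the problem.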

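The comments in the merge commit suggest that it is not necessarily a
good solution, so I think we should let it "settle" a bit upstream
before pushing it. It does look rather sledgehammer-y...

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3f2dc2798b81531fd93a3b9b7c39da47ec689e55

Thoughts?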

I have attached a patch that adds Linus' fix for the curious:
diff --git a/gnu/local.mk b/gnu/local.mk
index 9f8ce842b6..b9b6ea3ae7 100644
--- a/gnu/local.mk
+++ b/gnu/local.mk
@@ -1078,6 +1078,7 @@ dist_patch_DATA =						\
   %D%/packages/patches/lierolibre-remove-arch-warning.patch	\
   %D%/packages/patches/lierolibre-try-building-other-arch.patch	\
   %D%/packages/patches/linkchecker-tests-require-network.patch	\
+  %D%/packages/patches/linux-libre-active-entropy.patch		\
   %D%/packages/patches/linux-pam-no-setfsuid.patch		\
   %D%/packages/patches/lirc-localstatedir.patch			\
   %D%/packages/patches/lirc-reproducible-build.patch		\
diff --git a/gnu/packages/linux.scm b/gnu/packages/linux.scm
index 6664620c04..dda95c29ac 100644
--- a/gnu/packages/linux.scm
+++ b/gnu/packages/linux.scm
@@ -420,7 +420,8 @@ corresponding UPSTREAM-SOURCE (an origin), using the given DEBLOB-SCRIPTS."
 
 (define-public linux-libre-5.2-source
   (source-with-patches linux-libre-5.2-pristine-source
-                       (list %boot-logo-patch
+                       (list (search-patch "linux-libre-active-entropy.patch")
+                             %boot-logo-patch
                              %linux-libre-arm-export-__sync_icache_dcache-patch)))
 
 (define-public linux-libre-4.19-source
diff --git a/gnu/packages/patches/linux-libre-active-entropy.patch b/gnu/packages/patches/linux-libre-active-entropy.patch
new file mode 100644
index 0000000000..8f081f4a19
--- /dev/null
+++ b/gnu/packages/patches/linux-libre-active-entropy.patch
@@ -0,0 +1,86 @@
+Try to actively add entropy instead of waiting forever.
+Fixes <https://bugs.gnu.org/37501>.
+
+Taken from upstream:
+https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/patch/?id=50ee7529ec4500c88f8664560770a7a1b65db72b
+
+diff --git a/drivers/char/random.c b/drivers/char/random.c
+index 5d5ea4ce1442..2fda6166c1dd 100644
+--- a/drivers/char/random.c
++++ b/drivers/char/random.c
+@@ -1731,6 +1731,56 @@ void get_random_bytes(void *buf, int nbytes)
+ }
+ EXPORT_SYMBOL(get_random_bytes);
+ 
++
++/*
++ * Each time the timer fires, we expect that we got an unpredictable
++ * jump in the cycle counter. Even if the timer is running on another
++ * CPU, the timer activity will be touching the stack of the CPU that is
++ * generating entropy..
++ *
++ * Note that we don't re-arm the timer in the timer itself - we are
++ * happy to be scheduled away, since that just makes the load more
++ * complex, but we do not want the timer to keep ticking unless the
++ * entropy loop is running.
++ *
++ * So the re-arming always happens in the entropy loop itself.
++ */
++static void entropy_timer(struct timer_list *t)
++{
++	credit_entropy_bits(&input_pool, 1);
++}
++
++/*
++ * If we have an actual cycle counter, see if we can
++ * generate enough entropy with timing noise
++ */
++static void try_to_generate_entropy(void)
++{
++	struct {
++		unsigned long now;
++		struct timer_list timer;
++	} stack;
++
++	stack.now = random_get_entropy();
++
++	/* Slow counter - or none. Don't even bother */
++	if (stack.now == random_get_entropy())
++		return;
++
++	timer_setup_on_stack(&stack.timer, entropy_timer, 0);
++	while (!crng_ready()) {
++		if (!timer_pending(&stack.timer))
++			mod_timer(&stack.timer, jiffies+1);
++		mix_pool_bytes(&input_pool, &stack.now, sizeof(stack.now));
++		schedule();
++		stack.now = random_get_entropy();
++	}
++
++	del_timer_sync(&stack.timer);
++	destroy_timer_on_stack(&stack.timer);
++	mix_pool_bytes(&input_pool, &stack.now, sizeof(stack.now));
++}
++
+ /*
+  * Wait for the urandom pool to be seeded and thus guaranteed to supply
+  * cryptographically secure random numbers. This applies to: the /dev/urandom
+@@ -1745,7 +1795,17 @@ int wait_for_random_bytes(void)
+ {
+ 	if (likely(crng_ready()))
+ 		return 0;
+-	return wait_event_interruptible(crng_init_wait, crng_ready());
++
++	do {
++		int ret;
++		ret = wait_event_interruptible_timeout(crng_init_wait, crng_ready(), HZ);
++		if (ret)
++			return ret > 0 ? 0 : ret;
++
++		try_to_generate_entropy();
++	} while (!crng_ready());
++
++	return 0;
+ }
+ EXPORT_SYMBOL(wait_for_random_bytes);
+

Ludovic Courtès wrote on 4 Oct 2019 11:01
(name . Marius Bakke)(address . mbakke@fastmail.com)(address . 37501@debbugs.gnu.org)
87h84oyjdy.fsf@gnu.org
Hi Marius,

Marius Bakke <mbakke@fastmail.com> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> I read some of these, and our ‘urandom-seed-service-type’ has the same
>> bug as <https://github.com/systemd/systemd/issues/4271>. Namely, we
>> write the previous seed to /dev/urandom but we don’t credit the
>> entropy.
>>
>> The attached patch fixes that, and I think it should fix the problem you
>> reported. Could people give it a try?
>
> Good catch, LGTM. Unfortunately it does not fix the problem.
>
>> I’m interested in seeing the value of
>> /proc/sys/kernel/random/entropy_avail with and without this patch right
>> after boot (don’t try it in ‘guix system vm’ because there’s no seed
>> there.)
>
> before - 243
> after - 2419

Is that with or without sshd running?

Do we have strong evidence that sshd is stuck in getrandom(2)?

That seems weird to me because we use #:pid-file for sshd, and thus
either sshd produces its PID file and we’re done (‘ssh-daemon’ is
considered started and life goes on), or sshd fails to produce its PID
file within 15 seconds and we kill it and consider that ‘ssh-daemon’
failed to start.

The only way this can hang is if sshd calls getrandom(2) before
daemonizing.
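
To make the expected behaviour concrete, here is a minimal shell sketch of a "wait for the PID file or give up" loop. It is not the Shepherd's actual implementation: `wait_for_pid_file` and its arguments are hypothetical, and the 15-second limit mentioned above is passed in as a parameter.

```shell
#!/bin/sh
# wait_for_pid_file FILE TIMEOUT: poll once per second until FILE exists
# and is non-empty, or until TIMEOUT seconds have elapsed.
# Returns 0 when the PID file appeared, non-zero on timeout.
# Hypothetical helper; the real service manager works differently.
wait_for_pid_file() {
    file=$1
    remaining=$2
    while [ "$remaining" -gt 0 ]; do
        [ -s "$file" ] && return 0
        sleep 1
        remaining=$(( remaining - 1 ))
    done
    # Final check after the last sleep.
    [ -s "$file" ]
}
```

A caller would treat a non-zero status as "failed to start" and kill the process, which is the outcome described above for a daemon that never writes its PID file.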

Looking at ‘main’ in sshd.c, I see:

  seed_rng();
  […]
  already_daemon = daemonized();

which I think means sshd indeed calls getrandom(2) before it has forked.
That explains the situation. :-/

(‘seed_rng’ uses ‘RAND_status’ from OpenSSL, which supports several
methods but presumably defaults to getrandom(2)?)

> I don't know why this change was insufficient. Perhaps the kernel
> does not consider such a seed alone trustworthy enough? I also tried to
> increase the seed size to no avail.

Can you try to do:

(add-to-entropy-count urandom (expt 2 17))

to see if that changes anything at all?

I checked with strace and the RNDADDTOENTCNT binding seems to be passing
its argument as expected.

> I found this patch in the 5.4 kernel tree after reading the commit log
> of random.c:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3f2dc2798b81531fd93a3b9b7c39da47ec689e55
>
> ...which *does* solve the problem.
>
> The comments in the merge commit suggests that it is not necessarily a
> good solution, so I think we should let it "settle" a bit upstream
> before pushing it. It does look rather sledgehammer-y...
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3f2dc2798b81531fd93a3b9b7c39da47ec689e55
>
> Thoughts?

If it has to be that way, we can use this patch and we can always remove
it later if we have a better solution.

At any rate, I’d rather not block ‘core-updates’ any longer.

Thoughts?

Thanks for investigating!

Ludo’.
Ludovic Courtès wrote on 4 Oct 2019 11:07
(name . Marius Bakke)(address . mbakke@fastmail.com)(address . 37501@debbugs.gnu.org)
87a7agyj2x.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

> + (let ((bits (* 8 (stat:size (stat seed)))))
> + (add-to-entropy-count urandom bits)))))))

Oh we also need to do that below, when reading from /dev/hwrng:

  (when buf
    (call-with-output-file "/dev/urandom"
      (lambda (urandom)
        (put-bytevector urandom buf)
        (let ((bits (* 8 (bytevector-length buf))))
          (add-to-entropy-count urandom bits)))))  ;<- here

Ludo’.
Ludovic Courtès wrote on 4 Oct 2019 11:15
(name . Marius Bakke)(address . mbakke@fastmail.com)
87y2y0x453.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

> I read some of these, and our ‘urandom-seed-service-type’ has the same
> bug as <https://github.com/systemd/systemd/issues/4271>. Namely, we
> write the previous seed to /dev/urandom but we don’t credit the
> entropy.

Now that I think about it, ‘urandom-seed’ normally contributes 512 bytes
of entropy, but immediately after it *consumes* 512 bytes of entropy:

  ;; Immediately refresh the seed in case the system doesn't
  ;; shut down cleanly.
  (call-with-input-file "/dev/urandom"
    (lambda (urandom)
      (let ((previous-umask (umask #o077))
            (buf (make-bytevector 512)))
        (mkdir-p (dirname #$%random-seed-file))
        (get-bytevector-n! urandom buf 0 512)
        (call-with-output-file #$%random-seed-file
          (lambda (seed)
            (put-bytevector seed buf)))
        (umask previous-umask))))

This comes from commit 71cb237a7d98dafda7dfbb5f3ba7c68463310383 by Leo.

What about deleting the seed instead of populating it right at boot
time?

That way, we would actually have entropy available at boot time. In
case of a crash, the system may lack entropy upon reboot, but that’s
better than always lacking entropy when booting.
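
The proposal can be sketched in shell as "load the saved seed, then delete it" instead of "load the saved seed, then read a fresh one". This is an illustration under assumptions, not the Guix service code: `load_and_delete_seed` is a hypothetical name, both paths are parameters, and a plain write to the device only mixes the data in without crediting entropy, so the ioctl from the earlier patch would still be needed.

```shell
#!/bin/sh
# load_and_delete_seed SEED DEVICE: feed the saved SEED into DEVICE, then
# remove SEED so nothing is read back from the RNG at boot time.
# Hypothetical helper; writing alone does not credit any entropy.
load_and_delete_seed() {
    [ -f "$1" ] || return 0      # no seed was saved: nothing to do
    cat "$1" > "$2" || return 1  # keep the seed if the write failed
    rm -f "$1"
}
```

On a real system the first argument would be whatever `%random-seed-file` expands to and the second would be /dev/urandom. After a crash the seed file is simply absent on the next boot, which is the trade-off described above.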

Marius, Leo, WDYT?

(If we wanted to go fancy, we could spawn a separate process that will
attempt to refill the seed minutes after the system has booted.)

Thanks,
Ludo’.
Marius Bakke wrote on 5 Oct 2019 14:56
(name . Ludovic Courtès)(address . ludo@gnu.org)
87pnjb73la.fsf@devup.no
Ludovic Courtès <ludo@gnu.org> writes:

> Ludovic Courtès <ludo@gnu.org> skribis:
>
>> I read some of these, and our ‘urandom-seed-service-type’ has the same
>> bug as <https://github.com/systemd/systemd/issues/4271>. Namely, we
>> write the previous seed to /dev/urandom but we don’t credit the
>> entropy.
>
> Now that I think about it, ‘urandom-seed’ normally contributes 512 bytes
> of entropy, but immediately after it *consumes* 512 bytes of entropy:
>
> ;; Immediately refresh the seed in case the system doesn't
> ;; shut down cleanly.
> (call-with-input-file "/dev/urandom"
> (lambda (urandom)
> (let ((previous-umask (umask #o077))
> (buf (make-bytevector 512)))
> (mkdir-p (dirname #$%random-seed-file))
> (get-bytevector-n! urandom buf 0 512)
> (call-with-output-file #$%random-seed-file
> (lambda (seed)
> (put-bytevector seed buf)))
> (umask previous-umask))))
>
> This comes from commit 71cb237a7d98dafda7dfbb5f3ba7c68463310383 by Leo.
>
> What about deleting the seed instead of populating it right at boot
> time?
>
> That way, we would actually have entropy available at boot time. In
> case of a crash, the system may lack entropy upon reboot, but that’s
> better than always lacking entropy when booting.
>
> Marius, Leo, WDYT?

I tried it, but it did not make any discernible difference in the
available entropy right after boot, nor did it aid the CRNG seeding.

So I think we should go with Linus' solution for now, as well as your
original fix Ludo because it seems to be the right thing to do anyway.

Ludovic Courtès wrote on 5 Oct 2019 22:08
(name . Marius Bakke)(address . mbakke@fastmail.com)
87imp3nefk.fsf@gnu.org
Hi,

Marius Bakke <mbakke@fastmail.com> skribis:

> I tried it, but it did not make any discernible difference in the
> available entropy right after boot, nor did it aid the CRNG seeding.

Bah, too bad, though it still doesn’t sound right to consume this much
entropy right from the start. I’m surprised it doesn’t make any
difference when you remove that bit.

Perhaps we should print the value of /proc/…/entropy_avail in several
places during boot time to get a better understanding.
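
One way to do that is a small helper that prints a label next to the current estimate, called at each point of interest. The sketch below is an assumption about how such logging could look, not something that exists in Guix; `log_entropy` is a hypothetical name, and the optional second argument exists only so the helper can be tried against an ordinary file.

```shell
#!/bin/sh
# log_entropy LABEL [FILE]: print LABEL followed by the entropy estimate.
# FILE defaults to the kernel's entropy_avail file.
# Hypothetical helper for illustration only.
log_entropy() {
    file=${2:-/proc/sys/kernel/random/entropy_avail}
    if [ -r "$file" ]; then
        echo "$1: entropy_avail=$(cat "$file")"
    else
        echo "$1: entropy_avail=unavailable"
    fi
}
```

Calling, say, `log_entropy "before urandom-seed"` and `log_entropy "after urandom-seed"` around the service's start code would show where the estimate changes during boot.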

> So I think we should go with Linus' solution for now, as well as your
> original fix Ludo because it seems to be the right thing to do anyway.

Sounds good. I’ve pushed it as
81bc4533aa1d7d81472c1d8d9f697ba2a9c9cbf9.

Thanks!

Ludo’.
Marius Bakke wrote on 6 Oct 2019 18:42
(name . Ludovic Courtès)(address . ludo@gnu.org)
877e5h7rlo.fsf@devup.no
Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Marius Bakke <mbakke@fastmail.com> skribis:
>
>> I tried it, but it did not make any discernible difference in the
>> available entropy right after boot, nor did it aid the CRNG seeding.
>
> Bah, too bad, though it still doesn’t sound right to consume this much
> entropy right from the start. I’m surprised it doesn’t make any
> difference when you remove that bit.

I guess generating 512 random bytes does not cost a lot of entropy.
Writing that made me curious, so I tested it:

$ cat /proc/sys/kernel/random/entropy_avail
3938
$ head -c 512 /dev/urandom > /dev/null && !!
3947

Wait, what? Trying again...

$ head -c 512 /dev/urandom > /dev/null && cat /proc/sys/kernel/random/entropy_avail
3693
[...typing this section of the email...]
$ head -c 512 /dev/urandom > /dev/null && cat /proc/sys/kernel/random/entropy_avail
3898

> Perhaps we should print the value of /proc/…/entropy_avail in several
> places during boot time to get a better understanding.

That could be useful. My understanding is that we were waiting for the
kernel to be absolutely certain that the entropy pool is sufficiently
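random, i.e. "state 2" from this overview:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=43838a23a05fbd13e47d750d3dfd77001536dd33

Once it is initialized, we get an "endless" stream of good random data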
thanks to the entropy pool and ChaCha20(?).

See also this article for an overview of the discussions that led to
Torvalds' patch:

https://lwn.net/SubscriberLink/800509/de787577364be340/

Anyway, I pushed the upstream fix in
dd6989711370c43676edc974f86c8586f21f80f6.

Closed
Marius Bakke wrote on 6 Oct 2019 19:38
(name . Ludovic Courtès)(address . ludo@gnu.org)
874l0l7p19.fsf@devup.no
Marius Bakke <mbakke@fastmail.com> writes:

> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Hi,
>>
>> Marius Bakke <mbakke@fastmail.com> skribis:
>>
>>> I tried it, but it did not make any discernible difference in the
>>> available entropy right after boot, nor did it aid the CRNG seeding.
>>
>> Bah, too bad, though it still doesn’t sound right to consume this much
>> entropy right from the start. I’m surprised it doesn’t make any
>> difference when you remove that bit.
>
> I guess generating 512 random bytes does not cost a lot of entropy.
> Writing that made me curious, so I tested it:
>
> $ cat /proc/sys/kernel/random/entropy_avail
> 3938
> $ head -c 512 /dev/urandom > /dev/null && !!
> 3947
>
> Wait, what? Trying again...
>
> $ head -c 512 /dev/urandom > /dev/null && cat /proc/sys/kernel/random/entropy_avail
> 3693
> [...typing this section of the email...]
> $ head -c 512 /dev/urandom > /dev/null && cat /proc/sys/kernel/random/entropy_avail
> 3898
>
>> Perhaps we should print the value of /proc/…/entropy_avail in several
>> places during boot time to get a better understanding.
>
> That could be useful. My understanding is that we were waiting for the
> kernel to be absolutely certain that the entropy pool is sufficiently
> random, i.e. "state 2" from this overview:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=43838a23a05fbd13e47d750d3dfd77001536dd33
>
> Once it is initialized, we get an "endless" stream of good random data
> thanks to the entropy pool and ChaCha20(?).

Answering a question I got from reading my own email: I guess the reason
we can read from /dev/urandom before getrandom(2) works, is that
/dev/urandom does not require the CRNG to be in "state 2".

Closed
Ludovic Courtès wrote on 7 Oct 2019 00:03
(name . Marius Bakke)(address . mbakke@fastmail.com)
871rvpbkgo.fsf@gnu.org
Hello!

Marius Bakke <mbakke@fastmail.com> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> Bah, too bad, though it still doesn’t sound right to consume this much
>> entropy right from the start. I’m surprised it doesn’t make any
>> difference when you remove that bit.
>
> I guess generating 512 random bytes does not cost a lot of entropy.
> Writing that made me curious, so I tested it:
>
> $ cat /proc/sys/kernel/random/entropy_avail
> 3938
> $ head -c 512 /dev/urandom > /dev/null && !!
> 3947
>
> Wait, what? Trying again...
>
> $ head -c 512 /dev/urandom > /dev/null && cat /proc/sys/kernel/random/entropy_avail
> 3693
> [...typing this section of the email...]
> $ head -c 512 /dev/urandom > /dev/null && cat /proc/sys/kernel/random/entropy_avail
> 3898

Uh! But that’s once the system is running, and with a long-enough pause
in between reads… maybe?

>> Perhaps we should print the value of /proc/…/entropy_avail in several
>> places during boot time to get a better understanding.
>
> That could be useful. My understanding is that we were waiting for the
> kernel to be absolutely certain that the entropy pool is sufficiently
> random, i.e. "state 2" from this overview:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=43838a23a05fbd13e47d750d3dfd77001536dd33
>
> Once it is initialized, we get an "endless" stream of good random data
> thanks to the entropy pool and ChaCha20(?).
>
> See also this article for an overview of the discussions that lead to
> Torvalds' patch:
>
> https://lwn.net/SubscriberLink/800509/de787577364be340/

Interesting, thanks for the link!

> Anyway, I pushed the upstream fix in
> dd6989711370c43676edc974f86c8586f21f80f6.

Coolio, now merging is no longer blocked due to entropy starvation! :-)

Ludo’.
Closed

This issue is archived.

To comment on this conversation send email to 37501@debbugs.gnu.org