guix should not ask me to report a bug if it fails due to running out of memory

  • Open
Details
2 participants
  • Zack Weinberg
  • Simon Tournier
Owner
unassigned
Submitted by
Zack Weinberg
Severity
normal
Zack Weinberg wrote on 28 Jul 18:02 +0200
(address . bug-guix@gnu.org)
8f273f8d-50db-463b-a0c1-cff561661f7e@app.fastmail.com
If a guix operation fails due to running out of memory, it's reported
to the user as an internal error, instructing them to report a bug to
bug-guix. For example:

# guix pull
Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...
Building from this channel:
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 100.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 100.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 100.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
module-import 2KiB 56KiB/s 00:00 ???????????????????? 100.0%
module-import-compiled 1.3MiB 2.4MiB/s 00:01 ???????????????????? 100.0%
compute-guix-derivation 1KiB 877KiB/s 00:00 ???????????????????? 100.0%
Computing Guix derivation for 'x86_64-linux'... \
GC Warning: Failed to expand heap by 8388608 bytes
[previous line repeats 79 more times]
GC Warning: Failed to expand heap by 69632 bytes
GC Warning: Out of Memory! Heap size: 402 MiB. Returning NULL!
Warning: Unwind-only out of memory exception; skipping pre-unwind handler.
guix pull: error: You found a bug: the program
'/gnu/store/6pckga173mjbpfz7a2bl93h4gimd8fp5-compute-guix-derivation'
failed to compute the derivation for Guix (version:
"46a64c7fdd057283063aae6df058579bb07c4b6a"; system: "x86_64-linux";
host version: "e4aabf42b33346849cb565199cfafc49d4f0aeff";
pull-version: 1). Please report the COMPLETE output above by email to
<bug-guix@gnu.org>.

The fact that it ran the computer out of RAM is *not* a bug per se --
perhaps we could talk about reducing "guix pull"'s memory
requirements, but that's a big hairy design challenge probably.

But that means there *is* a bug: the fatal exception handler should
have a dedicated message for out-of-memory conditions, directing the
user to add more swap or something, instead of reporting a bug.
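
A minimal sketch of the kind of dispatch described above, written in Python purely for illustration. Guix itself is written in Guile Scheme, and this is not its actual error-handling code. The helper name `describe_failure` and the wording of both messages are hypothetical.

```python
def describe_failure(exc: BaseException) -> str:
    """Pick the advice shown for a fatal error (hypothetical helper)."""
    if isinstance(exc, MemoryError):
        # Resource exhaustion: a bug report would not help the user,
        # so point at the resource instead.
        return ("error: ran out of memory; "
                "try adding RAM or swap space and run the command again")
    # Anything else is unexpected, so asking for a report is reasonable.
    return ("error: unexpected failure; "
            "please report the complete output to <bug-guix@gnu.org>")


# Show both branches side by side.
for example in (MemoryError(), RuntimeError("boom")):
    print(describe_failure(example))
```

The point of the sketch is only that the out-of-memory case is recognized before the generic "You found a bug" branch is reached.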

For reference, this is a server with this much RAM as of when I ran
the above command:

# free -h
               total        used        free      shared  buff/cache   available
Mem:           969Mi       229Mi       557Mi          0B       319Mi       739Mi
Swap:          613Mi        39Mi       574Mi

Enabling zswap did not permit a successful "guix pull" but increasing
the amount of available swap space to 2Gi did. (Memory overcommit is
disabled. Peak memory usage, with the increased swap, was roughly
1.5Gi.)
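
These numbers are roughly consistent with how Linux computes its commit limit when overcommit is disabled (strict accounting, vm.overcommit_memory=2): swap plus a percentage of RAM given by vm.overcommit_ratio. The sketch below assumes the kernel default ratio of 50, which the report does not state, and it ignores huge pages. `commit_limit_mib` is a hypothetical helper name.

```python
def commit_limit_mib(ram_mib: float, swap_mib: float,
                     ratio_percent: float = 50) -> float:
    """Approximate CommitLimit in MiB under strict overcommit accounting.

    ratio_percent stands for vm.overcommit_ratio; 50 is the kernel
    default and is an assumption here, not a value from the report.
    """
    return swap_mib + ram_mib * ratio_percent / 100


PEAK_MIB = 1.5 * 1024                          # "roughly 1.5Gi" peak usage
original_limit = commit_limit_mib(969, 613)    # RAM and swap from `free -h`
enlarged_limit = commit_limit_mib(969, 2048)   # swap increased to 2Gi

print("original swap fits peak:", original_limit >= PEAK_MIB)
print("enlarged swap fits peak:", enlarged_limit >= PEAK_MIB)
```

Under that assumption the original configuration can commit about 1.1 GiB in total, below the observed peak, while the 2Gi swap configuration can commit about 2.5 GiB.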

zw
Simon Tournier wrote on 3 Sep 16:39 +0200
87ed60u7hc.fsf@gmail.com
Hi,

On Sun, 28 Jul 2024 at 16:02, "Zack Weinberg" via Bug reports for GNU Guix <bug-guix@gnu.org> wrote:

> Computing Guix derivation for 'x86_64-linux'... \
> GC Warning: Failed to expand heap by 8388608 bytes
> [previous line repeats 79 more times]
> GC Warning: Failed to expand heap by 69632 bytes
> GC Warning: Out of Memory! Heap size: 402 MiB. Returning NULL!
> Warning: Unwind-only out of memory exception; skipping pre-unwind handler.
> guix pull: error: You found a bug: the program
> '/gnu/store/6pckga173mjbpfz7a2bl93h4gimd8fp5-compute-guix-derivation'
> failed to compute the derivation for Guix (version:
> "46a64c7fdd057283063aae6df058579bb07c4b6a"; system: "x86_64-linux";
> host version: "e4aabf42b33346849cb565199cfafc49d4f0aeff";

I guess this bug has been fixed since then. Do you still see it?

As an aside, I am not able to reproduce it:

guix time-machine -q --commit=e4aabf42b33346849cb565199cfafc49d4f0aeff \
-- time-machine -q --commit=46a64c7fdd057283063aae6df058579bb07c4b6a \
-- describe

This passes for me.

Cheers,
simon
Zack Weinberg wrote on 3 Sep 17:30 +0200
1cf310cf-8d7f-4f6d-9b07-397ebdaa1ffc@app.fastmail.com
On Tue, Sep 3, 2024, at 10:39 AM, Simon Tournier wrote:
> On Sun, 28 Jul 2024 at 16:02, "Zack Weinberg" via Bug reports for GNU
> Guix <bug-guix@gnu.org> wrote:
>
>> Computing Guix derivation for 'x86_64-linux'... \
>> GC Warning: Failed to expand heap by 8388608 bytes
>> [previous line repeats 79 more times]
>> GC Warning: Failed to expand heap by 69632 bytes
>> GC Warning: Out of Memory! Heap size: 402 MiB. Returning NULL!
>> Warning: Unwind-only out of memory exception; skipping pre-unwind handler.
>> guix pull: error: You found a bug: the program
>> '/gnu/store/6pckga173mjbpfz7a2bl93h4gimd8fp5-compute-guix-derivation'
>> failed to compute the derivation for Guix (version:
>> "46a64c7fdd057283063aae6df058579bb07c4b6a"; system: "x86_64-linux";
>> host version: "e4aabf42b33346849cb565199cfafc49d4f0aeff";
>
> I guess this bug has been fixed since then. Do you still see it?

I reconfigured my Guix system to avoid the bug, so I cannot easily
trigger it anymore. However, this is not a bug that would go away by
accident. Until someone writes code specifically intended to produce
a better error message when Guile runs out of memory, the bug will be
live.

> As an aside, I am not able to reproduce it:
>
> guix time-machine -q --commit=e4aabf42b33346849cb565199cfafc49d4f0aeff \
> -- time-machine -q --commit=46a64c7fdd057283063aae6df058579bb07c4b6a \
> -- describe

In order to reproduce the bug you need to be working in an environment
(probably a virtual machine is easiest) with a limited amount of
system memory (512MB of RAM and no swap should do it) and memory
overcommit disabled (sysctl -w vm.overcommit_memory=2).
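
For quick experiments without a VM, a per-process address-space limit can stand in for a small machine. This is a different mechanism from the setup described above: it caps a single process rather than the whole system, and it has not been checked against "guix pull", so treat it as an approximation only. The sketch assumes Linux and Python's standard `resource` module; `run_with_memory_cap` is a hypothetical helper.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(argv, cap_bytes):
    """Run argv with its address space capped at cap_bytes (Linux)."""
    def apply_limit():
        # Runs in the child just before exec; allocations beyond the
        # cap then fail instead of being granted.
        resource.setrlimit(resource.RLIMIT_AS, (cap_bytes, cap_bytes))

    return subprocess.run(argv, preexec_fn=apply_limit,
                          capture_output=True, text=True)


# Stand-in workload: a child that tries to allocate 1 GiB under a
# 512 MiB cap. Replace it with the command under test.
workload = [sys.executable, "-c", "x = bytearray(1 << 30)"]
result = run_with_memory_cap(workload, 512 * 1024 * 1024)
print("exit status:", result.returncode)
print(result.stderr.strip().splitlines()[-1])
```

The last line of the child's standard error shows how that program reports the failed allocation, which is the behavior this bug report is about.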

Also, I don't know what "guix time-machine" does but I would not
expect "guix describe" to use substantial amounts of memory. You need
to run a command that computes a large derivation, such as "guix pull"
or "guix system reconfigure".

zw
Simon Tournier wrote on 3 Sep 18:09 +0200
(name . Zack Weinberg)(address . zack@owlfolio.org)(address . 72340@debbugs.gnu.org)
874j6wsoqk.fsf@gmail.com
Hi,

On Tue, 03 Sep 2024 at 11:30, "Zack Weinberg" <zack@owlfolio.org> wrote:

> I reconfigured my Guix system to avoid the bug, so I cannot easily
> trigger it anymore. However, this is not a bug that would go away by
> accident.

Yes for sure, that’s not what I meant. :-)

This can happen if a cycle is introduced, i.e., package foo depending on
package foo. Or it can happen for other reasons, as seems to be the case here.

Well, this bug seems to be an instance of bug#47543 [1].

1: bug#47543: “Repeated allocation of very large block” during ‘guix pull’
Ludovic Courtès <ludovic.courtes@inria.fr>
Thu, 01 Apr 2021 16:00:06 +0200
id:87v9966r6h.fsf@inria.fr


> Also, I don't know what "guix time-machine" does but I would not
> expect "guix describe" to use substantial amounts of memory. You need
> to run a command that computes a large derivation, such as "guix pull"
> or "guix system reconfigure".

“guix time-machine” runs “guix pull”. In other words:

guix time-machine -q --commit=e4aabf42b33346849cb565199cfafc49d4f0aeff \
-- time-machine -q --commit=46a64c7fdd057283063aae6df058579bb07c4b6a \
-- describe

is more or less equivalent to:

guix pull --commit=e4aabf42b33346849cb565199cfafc49d4f0aeff -p /tmp/host
/tmp/host/bin/guix pull --commit=46a64c7fdd057283063aae6df058579bb07c4b6a

which mimics what you ran. Admittedly, I ran it on my own machine and not
inside a VM with constrained memory. However, considering bug#47543
[1], I am not convinced that the crash comes from the constrained memory.

Hum, maybe it’s worth merging with bug#47543 [1]. WDYT?

Cheers,
simon
Zack Weinberg wrote on 3 Sep 20:23 +0200
(name . Simon Tournier)(address . zimon.toutoune@gmail.com)(address . 72340@debbugs.gnu.org)
3bb66d15-1597-4c12-b2b3-1fd7a77bffe5@app.fastmail.com
On Tue, Sep 3, 2024, at 12:09 PM, Simon Tournier wrote:
>
> Well, this bug seems to be an instance of bug#47543 [1].

It is not. This bug is "When guix runs out of memory, it prints an error
message that gives the user unhelpful advice." That is not the same as
any particular reason why guix might run out of memory.

> “guix time-machine” runs “guix pull”. In other words:
>
> guix time-machine -q --commit=e4aabf42b33346849cb565199cfafc49d4f0aeff \
> -- time-machine -q --commit=46a64c7fdd057283063aae6df058579bb07c4b6a \
> -- describe
>
> is more or less equivalent to:
>
> guix pull --commit=e4aabf42b33346849cb565199cfafc49d4f0aeff -p /tmp/host
> /tmp/host/bin/guix pull --commit=46a64c7fdd057283063aae6df058579bb07c4b6a
>
> which mimics what you ran.

Thanks for the explanation.

zw