grafts cause “guix environment” to get killed with OOM

Done. Submitted by Ricardo Wurmus.
Details
4 participants
  • Sarah Morgensen
  • Julien Lepiller
  • Ludovic Courtès
  • Ricardo Wurmus
Owner
unassigned
Severity
important
Ricardo Wurmus wrote on 6 Jul 2021 16:38
grafts cause “guix environment” to get killed with OOM
(address . bug-guix@gnu.org)
87sg0rse1n.fsf@elephly.net
With a recent version of Guix, “guix environment” will not
terminate on its own, keeps the CPU busy, and gets killed when the
system eventually runs out of memory.

$ guix describe -f channels

(list (channel
        (name 'guix)
        (url "/home/rekado/dev/gx/branches/master")
        (commit
         "685cfdec94e5e48c4ad28de53466a28dfc258edb")))


$ guix environment pigx-scrnaseq
[wait until it gets killed]

The problem disappears when grafts are disabled:

$ guix environment --no-grafts pigx-scrnaseq
$ [env] yay!

--
Ricardo
Ludovic Courtès wrote on 9 Jul 2021 11:31
control message for bug #49439
(address . control@debbugs.gnu.org)
87eec7j0k6.fsf@gnu.org
severity 49439 important
quit
Sarah Morgensen wrote on 23 Jul 2021 06:59
Re: bug#49439: grafts cause “guix environment” to get killed with OOM
(name . Ricardo Wurmus)(address . rekado@elephly.net)(address . 49439@debbugs.gnu.org)
86tuklr5g6.fsf@mgsn.dev
Hello,

Ricardo Wurmus <rekado@elephly.net> writes:

> With a recent version of Guix, “guix environment” will not
> terminate on its own, keeps the CPU busy, and gets killed when the
> system eventually runs out of memory.
>
> $ guix describe -f channels
>
> (list (channel
> (name 'guix)
> (url "/home/rekado/dev/gx/branches/master")
> (commit
> "685cfdec94e5e48c4ad28de53466a28dfc258edb")))
>
>
> $ guix environment pigx-scrnaseq
> [wait until it gets killed]

I can reproduce this with pigx-scrnaseq as well as a number of other
packages (listed below).

$ ./pre-inst-env guix describe -f channels
(list (channel
        (name 'guix)
        (url "/home/sarah/guix")
        (commit
         "3217a04b0352c2dd13323257b369604eeabfccc3")))

Does not complete within 5 minutes:

package             # inputs   # transitive inputs
                               (from package-transitive-inputs)
pigx-chipseq        48         338
pigx-scrnaseq       41         321
r-cellchat          34         110
pigx-rnaseq         34         343
pigx-bsseq          32         358
pigx-sars-cov2-ww   25         261
r-circus            16         134

Does complete:

r-chipseq           6          37      completes in >2m
r-shortread         17         36      completes in >1m
python-scanpy       25         113     completes in <15s

I suspect it has something to do with the number of transitive inputs,
because it is so prevalent with these R packages, which all use
propagated inputs. However... python-scanpy succeeds in under 15
seconds, and it has more transitive inputs than r-chipseq.

Can we reproduce this with a large number of low-transitivity packages
directly on the command line?

>
> The problem disappears when grafts are disabled:
>
> $ guix environment --no-grafts pigx-scrnaseq
> $ [env] yay!

--
Sarah
Julien Lepiller wrote on 27 Jul 2021 16:52
Re: bug#49439: grafts cause “guix environment” to get killed with OOM
(address . 49439@debbugs.gnu.org)
464151E3-FBF5-4B80-947E-0F1291FD879D@lepiller.eu
I have a similar issue with an OCaml package I use at work. It's not free software, but all its dependencies are. Not all of the dependencies are in Guix yet, so to reproduce you might have to import them first with "guix import opam -r foo" for ocaml-foo.

The package depends on ocaml-ounit, ocaml-lp, ocaml-apronext, menhir, ocaml-async, ocaml-core, ocaml-graph, ocaml-libsvm, ocaml-minisat, ocaml-ppx-deriving-yojson, ocaml-yojson, ocaml-z3, ocaml-zarith and z3.

In total, that's 118 transitive inputs. Building the profile takes 30 minutes for me, on an SSD. The builder uses 1.5 GB of resident memory.

Other than that, I measured time and memory for creating the environment when the profile was already built (no more derivation to build):

`which time` -v ~/guix/pre-inst-env guix environment ocaml-dummy-package -- echo done

User time: 121.43s
System time: 2.28s
Maximum resident: 1803028kB (1.8 GB)

With a warning from GC:

Repeated allocation of very large block (approx. size 35606528)

Note that I get the same numbers with --no-grafts, so it might be a different issue.

"guix build" terminates quickly.

Ludovic Courtès wrote on 27 Jul 2021 18:28
Re: bug#49439: grafts cause “guix environment” to get killed with OOM
(name . Sarah Morgensen)(address . iskarian@mgsn.dev)
87y29r3enb.fsf@gnu.org
Hi,

Sarah Morgensen <iskarian@mgsn.dev> writes:

> Ricardo Wurmus <rekado@elephly.net> writes:
>
>> With a recent version of Guix, “guix environment” will not
>> terminate on its own, keeps the CPU busy, and gets killed when the
>> system eventually runs out of memory.
>>
>> $ guix describe -f channels
>>
>> (list (channel
>> (name 'guix)
>> (url "/home/rekado/dev/gx/branches/master")
>> (commit
>> "685cfdec94e5e48c4ad28de53466a28dfc258edb")))
>>
>>
>> $ guix environment pigx-scrnaseq
>> [wait until it gets killed]
>
> I can reproduce this with pigx-scrnaseq as well as a number of other
> packages (listed below).
>
> $ ./pre-inst-env guix describe -f channels
> (list (channel
> (name 'guix)
> (url "/home/sarah/guix")
> (commit
> "3217a04b0352c2dd13323257b369604eeabfccc3")))
>
> Does not complete within 5 minutes:

What hardware are you using?

Here’s what I observe (with everything already in store and on a hot
cache, with an i7 CPU):

$ guix describe
Generacio 188 Jul 25 2021 12:47:29 (nuna)
guix a92dfbc
repository URL: https://git.savannah.gnu.org/git/guix.git
branch: master
commit: a92dfbce30777de6ca05031e275410cf9f56c84c
$ time GUIX_PROFILING=gc guix environment pigx-scrnaseq --search-paths --no-grafts >/dev/null
Garbage collection statistics:
heap size: 160.31 MiB
allocated: 1440.70 MiB
GC times: 39
time spent in GC: 4.51 seconds (46% of user time)

real 0m7.534s
user 0m9.747s
sys 0m0.167s
$ time GUIX_PROFILING=gc guix environment pigx-scrnaseq --search-paths >/dev/null
Garbage collection statistics:
heap size: 168.31 MiB
allocated: 2111.71 MiB
GC times: 53
time spent in GC: 6.92 seconds (45% of user time)

real 0m12.100s
user 0m15.517s
sys 0m0.198s

Commit 78daf9e02e5bc51f91488d8237cab2050cc060cf optimizes
‘coalesce-duplicate-inputs’, which was quadratic in the number of inputs (!).
After that change, I get:

$ time GUIX_PROFILING=gc ./pre-inst-env guix environment pigx-scrnaseq --search-paths --no-grafts >/dev/null
Garbage collection statistics:
heap size: 168.31 MiB
allocated: 716.58 MiB
GC times: 24
time spent in GC: 2.65 seconds (40% of user time)

real 0m5.720s
user 0m6.708s
sys 0m0.149s
$ time GUIX_PROFILING=gc ./pre-inst-env guix environment pigx-scrnaseq --search-paths >/dev/null
Garbage collection statistics:
heap size: 160.31 MiB
allocated: 1387.96 MiB
GC times: 42
time spent in GC: 5.87 seconds (43% of user time)

real 0m10.821s
user 0m13.513s
sys 0m0.198s
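
As a rough illustration of why a quadratic duplicate-coalescing step hurts with hundreds of inputs, here is a minimal Python sketch. It is not the Guix implementation: the function names and the representation of an input as a (derivation name, outputs) pair are assumptions made for this example. The first version rescans the result list for every input; the second keeps a table keyed by derivation name.

```python
def coalesce_quadratic(inputs):
    """Merge duplicate entries by rescanning the result list for each input.

    INPUTS is a list of (derivation_name, outputs) pairs (hypothetical shape).
    Each input triggers a linear scan of RESULT, so the whole pass is O(n^2).
    """
    result = []
    for name, outputs in inputs:
        for i, (seen, seen_outputs) in enumerate(result):
            if seen == name:
                extra = [o for o in outputs if o not in seen_outputs]
                result[i] = (seen, seen_outputs + extra)
                break
        else:
            # No existing entry for NAME: start a new one.
            result.append((name, list(outputs)))
    return result


def coalesce_linear(inputs):
    """Same result, using a dict keyed by derivation name.

    Dict lookups are O(1) on average, so the pass is linear in the number
    of inputs (output lists are assumed to be short).
    """
    table = {}
    for name, outputs in inputs:
        merged = table.setdefault(name, [])
        for o in outputs:
            if o not in merged:
                merged.append(o)
    return list(table.items())
```

Both functions keep the first-occurrence order of derivation names, so they can be compared directly on the same input.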

Could you tell me if it improves the situation for you?

It’s not the end of the road, but it should be an improvement.

Thanks,
Ludo’.
Sarah Morgensen wrote on 28 Jul 2021 01:35
(name . Ludovic Courtès)(address . ludo@gnu.org)
867dhbpbyt.fsf@mgsn.dev
Hi,

Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> I can reproduce this with pigx-scrnaseq as well as a number of other
>> packages (listed below).
>>
>> $ ./pre-inst-env guix describe -f channels
>> (list (channel
>> (name 'guix)
>> (url "/home/sarah/guix")
>> (commit
>> "3217a04b0352c2dd13323257b369604eeabfccc3")))
>>
>> Does not complete within 5 minutes:

Okay, so all of a sudden I can't reproduce this; even with the same
commit as above, it completes in ~20s.

guix time-machine --commit=3217a04 -- environment pigx-scrnaseq --search-paths >/dev/null

> What hardware are you using?

Virtualbox VM with VT-x etc. on a host i7-6700. The VM has 6GB of memory.

>
> Here’s what I observe (with everything already in store and on a hot
> cache, with an i7 CPU):
>
> $ guix describe
> Generacio 188 Jul 25 2021 12:47:29 (nuna)
> guix a92dfbc
> repository URL: https://git.savannah.gnu.org/git/guix.git
> branch: master
> commit: a92dfbce30777de6ca05031e275410cf9f56c84c
> $ time GUIX_PROFILING=gc guix environment pigx-scrnaseq --search-paths --no-grafts >/dev/null
> Garbage collection statistics:
> heap size: 160.31 MiB
> allocated: 1440.70 MiB
> GC times: 39
> time spent in GC: 4.51 seconds (46% of user time)
>
> real 0m7.534s
> user 0m9.747s
> sys 0m0.167s
> $ time GUIX_PROFILING=gc guix environment pigx-scrnaseq --search-paths >/dev/null
> Garbage collection statistics:
> heap size: 168.31 MiB
> allocated: 2111.71 MiB
> GC times: 53
> time spent in GC: 6.92 seconds (45% of user time)
>
> real 0m12.100s
> user 0m15.517s
> sys 0m0.198s
>
>
> Commit 78daf9e02e5bc51f91488d8237cab2050cc060cf optimizes
> ‘coalesce-duplicate-inputs’, which was quadratic in the number of inputs (!).
> After that change, I get:
>
> $ time GUIX_PROFILING=gc ./pre-inst-env guix environment pigx-scrnaseq --search-paths --no-grafts >/dev/null
> Garbage collection statistics:
> heap size: 168.31 MiB
> allocated: 716.58 MiB
> GC times: 24
> time spent in GC: 2.65 seconds (40% of user time)
>
> real 0m5.720s
> user 0m6.708s
> sys 0m0.149s
> $ time GUIX_PROFILING=gc ./pre-inst-env guix environment pigx-scrnaseq --search-paths >/dev/null
> Garbage collection statistics:
> heap size: 160.31 MiB
> allocated: 1387.96 MiB
> GC times: 42
> time spent in GC: 5.87 seconds (43% of user time)
>
> real 0m10.821s
> user 0m13.513s
> sys 0m0.198s
>
> Could you tell me if it improves the situation for you?

*Now* my experience is like yours:

$ guix describe
Generation 9 Jul 27 2021 12:35:05 (current)
guix d0ec739
repository URL: https://git.savannah.gnu.org/git/guix.git
branch: master
commit: d0ec73907c2995034a34339f4a7c2c72c2e21fea

time GUIX_PROFILING=gc guix time-machine --commit=3217a04 -- environment pigx-scrnaseq --search-paths >/dev/null
Garbage collection statistics:
heap size: 176.31 MiB
allocated: 2107.82 MiB
GC times: 52
time spent in GC: 5.26 seconds (23% of user time)

real 0m20.471s
user 0m22.605s
sys 0m0.372s

$ time GUIX_PROFILING=gc guix environment pigx-scrnaseq --search-paths >/dev/null
Garbage collection statistics:
heap size: 152.31 MiB
allocated: 1367.16 MiB
GC times: 40
time spent in GC: 3.25 seconds (21% of user time)

real 0m14.701s
user 0m15.698s
sys 0m0.361s

But why was it occurring before? The only thing I can think of is that
I didn't have everything in the store first. Is there a way I can prune
just the relevant items from the store to test this?

>
> It’s not the end of the road, but it should be an improvement.
>
> Thanks,
> Ludo’.

--
Sarah
Ludovic Courtès wrote on 28 Jul 2021 12:00
(name . Sarah Morgensen)(address . iskarian@mgsn.dev)
87k0la3ghk.fsf@gnu.org
Hi,

Sarah Morgensen <iskarian@mgsn.dev> writes:

> But why was it occurring before? The only thing I can think of is that
> I didn't have everything in the store first. Is there a way I can prune
> just the relevant items from the store to test this?

You could try something like:

guix gc -D $(guix gc --references $(guix build pigx-scrnaseq)) \
$(guix gc --references $(guix build pigx-scrnaseq --no-grafts))

Thinking about it, the grafts code depends on what’s in the store: when
nothing is in the store, it bounces to the “build handler”, which
accumulates the list of missing store items, until it starts building
them.

Let’s see if I can reproduce that behavior, too…

Thanks,
Ludo’.
Ludovic Courtès wrote on 29 Jul 2021 00:03
(name . Sarah Morgensen)(address . iskarian@mgsn.dev)
87wnpa14fx.fsf@gnu.org
Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> Thinking about it, the grafts code depends on what’s in the store: when
> nothing is in the store, it bounces to the “build handler”, which
> accumulates the list of missing store items, until it starts building
> them.

So I can reproduce the problem Ricardo and you initially reported by
running:

./pre-inst-env guix environment pigx-scrnaseq --search-paths

after removing some of the ungrafted store items with:

guix gc -D $(guix build r-rlang --no-grafts) \
$(guix gc --references $(guix build pigx-scrnaseq --no-grafts))

The seemingly endless CPU usage and unbounded memory use come from the
‘build-accumulator’ build handler. Here, the graph of ‘pigx-scrnaseq’
has many nodes, and many of them depend on, say, ‘r-rlang’. Thus, when
‘r-rlang’ is not in the store, the grafting code keeps asking for it by
calling ‘build-derivations’, which aborts to the build handler; the
build handler saves the .drv file name and the continuation and keeps
going. But since the next package also depends on ‘r-rlang’, we abort
again to the build handler, and so on.

The end result is a very long list of <unresolved> nodes, probably of
this order in this case:

$ guix graph -t reverse-package r-rlang |grep 'label = "'|wc -l
594

Presumably, the captured continuations occupy quite a bit of memory,
hence the quick growth.

I suppose one solution is to fire suspended builds when the build
handler sees a build request for a given derivation for the second time.
It needs more thought and testing…
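
The accumulation pattern described above, and the "fire on second request" idea, can be sketched with a toy Python model. This is only an illustration under simplifying assumptions: the function name `accumulate` and its parameters are hypothetical, and the model stores derivation names only, whereas the real handler also keeps a captured continuation per suspended request.

```python
def accumulate(requests, in_store, fire_on_repeat=False):
    """Toy model of a build handler that suspends requests for missing items.

    REQUESTS is the sequence of derivations asked for, in order; IN_STORE is
    the set of items already available.  Returns a pair:
    (peak number of suspended requests, number of build batches fired).
    """
    store = set(in_store)
    pending = []   # stand-in for the list of <unresolved> nodes
    peak = 0
    batches = 0
    for drv in requests:
        if drv in store:
            continue  # already available, nothing to suspend
        if fire_on_repeat and drv in pending:
            # Second request for the same derivation: build what has been
            # accumulated so far instead of suspending yet another request.
            store.update(pending)
            pending.clear()
            batches += 1
            continue
        pending.append(drv)
        peak = max(peak, len(pending))
    if pending:
        # End of the traversal: build whatever is still suspended.
        store.update(pending)
        batches += 1
    return peak, batches
```

With 594 dependents all asking for the same missing item, `accumulate(["r-rlang.drv"] * 594, set())` suspends every one of them before building, while passing `fire_on_repeat=True` keeps at most one request suspended.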

Ludo’.

PS: Did you know ‘pigx-scrnaseq’ has twice as many nodes as
‘libreoffice’?

$ guix graph -t bag pigx-scrnaseq |grep 'label = "'|wc -l
1359
$ guix graph -t bag libreoffice |grep 'label = "'|wc -l
699

That makes it a great example to study and fix scalability issues!
Sarah Morgensen wrote on 29 Jul 2021 05:20
(name . Ludovic Courtès)(address . ludo@gnu.org)
861r7hpzzv.fsf@mgsn.dev
Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> Hi,
>
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Thinking about it, the grafts code depends on what’s in the store: when
>> nothing is in the store, it bounces to the “build handler”, which
>> accumulates the list of missing store items, until it starts building
>> them.
>
> So I can reproduce the problem Ricardo and you initially reported by
> running:
>
> ./pre-inst-env guix environment pigx-scrnaseq --search-paths
>
> after removing some of the ungrafted store items with:
>
> guix gc -D $(guix build r-rlang --no-grafts) \
> $(guix gc --references $(guix build pigx-scrnaseq --no-grafts))

Same here. I'm glad we were able to pinpoint this!

>
> The seemingly endless CPU usage and unbounded memory use come from the
> ‘build-accumulator’ build handler. Here, the graph of ‘pigx-scrnaseq’
> has many nodes, and many of them depend on, say, ‘r-rlang’. Thus, when
> ‘r-rlang’ is not in the store, the grafting code keeps asking for it by
> calling ‘build-derivations’, which aborts to the build handler; the
> build handler saves the .drv file name and the continuation and keeps
> going. But since the next package also depends on ‘r-rlang’, we abort
> again to the build handler, and so on.
>
> The end result is a very long list of <unresolved> nodes, probably of
> this order in this case:
>
> $ guix graph -t reverse-package r-rlang |grep 'label = "'|wc -l
> 594
>
> Presumably, the captured continuations occupy quite a bit of memory,
> hence the quick growth.
>
> I suppose one solution is to fire suspended builds when the build
> handler sees a build request for a given derivation for the second time.
> It needs more thought and testing…

I wonder if instead it's possible to reduce the memory taken by the
continuations? As someone who has absolutely no experience with the
build/derivation system, it seems like all we *should* need to save is
the derivation file name.

>
> Ludo’.
>
> PS: Did you know ‘pigx-scrnaseq’ has twice as many nodes as
> ‘libreoffice’?
>
> $ guix graph -t bag pigx-scrnaseq |grep 'label = "'|wc -l
> 1359
> $ guix graph -t bag libreoffice |grep 'label = "'|wc -l
> 699
>
> That makes it a great example to study and fix scalability issues!

For those interested, I've compiled a list of the top 60
highest-dependency packages, as reported by package-closure:

pigx 1630
nextcloud-client 1539
akregator 1486
kmail 1484
korganizer 1481
kdepim-runtime 1480
kincidenceeditor 1478
keventviews 1477
kmailcommon 1476
kcalendarsupport 1475
kmessagelib 1474
knotes 1472
kaddressbook 1469
libksieve 1467
kdepim-apps-libs 1465
libgravatar 1463
kpimcommon 1462
akonadi-calendar 1453
pigx-bsseq 1448
elisa 1446
kaffeine 1432
kdenlive 1431
kmailtransport 1431
dolphin-plugins 1426
k3b 1424
libkgapi 1422
dolphin 1421
kopete 1403
pigx-sars-cov2-ww 1401
krdc 1398
baloo-widgets 1397
baloo 1396
pigx-chipseq 1396
krfb 1389
ffmpegthumbs 1388
kget 1382
kmplayer 1381
kdelibs4support 1375
pigx-scrnaseq 1342
kdevelop 1340
kmailimporter 1326
libkdepim 1325
pigx-rnaseq 1324
labplot 1316
smb4k 1315
kleopatra 1311
kalarmcal 1311
choqok 1311
okular 1310
ktnef 1310
ktorrent 1310
kate 1308
akonadi-search 1308
kcalutils 1307
yakuake 1306
khelpcenter 1305
libksysguard 1305
kdeconnect 1304
konsole 1304
libkleo 1304

There seem to be a lot of kde packages in there, so perhaps this issue
isn't as rare as we might otherwise expect?

--
Sarah
Ludovic Courtès wrote on 9 Aug 2021 23:26
(name . Sarah Morgensen)(address . iskarian@mgsn.dev)
87czqmtims.fsf@inria.fr
Hi!

Sarah Morgensen <iskarian@mgsn.dev> writes:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> The seemingly endless CPU usage and unbounded memory use come from the
>> ‘build-accumulator’ build handler. Here, the graph of ‘pigx-scrnaseq’
>> has many nodes, and many of them depend on, say, ‘r-rlang’. Thus, when
>> ‘r-rlang’ is not in the store, the grafting code keeps asking for it by
>> calling ‘build-derivations’, which aborts to the build handler; the
>> build handler saves the .drv file name and the continuation and keeps
>> going. But since the next package also depends on ‘r-rlang’, we abort
>> again to the build handler, and so on.
>>
>> The end result is a very long list of <unresolved> nodes, probably of
>> this order in this case:
>>
>> $ guix graph -t reverse-package r-rlang |grep 'label = "'|wc -l
>> 594
>>
>> Presumably, the captured continuations occupy quite a bit of memory,
>> hence the quick growth.
>>
>> I suppose one solution is to fire suspended builds when the build
>> handler sees a build request for a given derivation for the second time.
>> It needs more thought and testing…
>
> I wonder if instead it's possible to reduce the memory taken by the
> continuations? As someone who has absolutely no experience with the
> build/derivation system, it seems like all we *should* need to save is
> the derivation file name.

Continuations may take a bit of room but the main problem is that we’re
storing too many of them. :-)

Roughly, I think we keep one <unresolved> node per incoming edge into a
package not in the store (‘r-rlang’ in the example above). In a case
like ‘pigx-scrnaseq’, that’s a lot of items.

I added counters and ‘pk’ calls in ‘map/accumulate-builds’ & co. to see
what happens. AFAICS, a cutoff as in the attached patch does the job.
It’s a basic heuristic to avoid unbounded growth, at the expense of
possibly reduced accumulation. For example, here’s the effect on my
user profile with 300+ packages:

$ # with cutoff
$ time GUIX_PROFILING=gc ./pre-inst-env guix upgrade -n

[...]


2,926.7 MB would be downloaded
Garbage collection statistics:
heap size: 119.37 MiB
allocated: 849.56 MiB
GC times: 30
time spent in GC: 2.88 seconds (30% of user time)

real 0m8.987s
user 0m9.482s
sys 0m0.186s
$
$ # without cutoff
$ time GUIX_PROFILING=gc ./pre-inst-env guix upgrade -n

[...]

3,402.5 MB would be downloaded
Garbage collection statistics:
heap size: 127.37 MiB
allocated: 1504.59 MiB
GC times: 46
time spent in GC: 5.31 seconds (29% of user time)

real 0m21.616s
user 0m18.082s
sys 0m0.255s

You can tell that, without the cutoff (2nd run), the command more
accurately finds out what it’s going to have to download since the
“would be downloaded” figure is higher; with the cutoff (1st run), it
“sees less” of what it’s going to end up downloading. This could be
annoying from a UI viewpoint in “extreme” cases like this one, but most
likely the difference would be invisible in many cases.

More importantly, having this cutoff halves the execution time, which is
nice.

The command initially given in this bug report now behaves like this:

$ time GUIX_PROFILING=gc ./pre-inst-env guix environment pigx-scrnaseq --search-paths -n -v2
41.8 MB would be downloaded:
/gnu/store/difgj9gf8l7y5bsilcl42x2vzgh493c6-r-utf8-1.2.2
/gnu/store/z4rpk1sn3jhy8brsr30zccln8mira3z5-r-purrr-0.3.4
/gnu/store/nkiga9rfd8a9gdfsp2rjcqz39s8p2hg3-r-magrittr-2.0.1
/gnu/store/c5y0xlw1fbcwa5p5qnk4xji286hd7cqk-r-vctrs-0.3.8
/gnu/store/86vz257dqy4vm6s7axq7zl2jqxacgngf-r-ellipsis-0.3.2
/gnu/store/m2m7h497g5n6vdrp5pxsnr2v8hsriq5f-r-glue-1.4.2
/gnu/store/0zv1sl91fz3ddq8namdfvf6yr0j1p2vx-texlive-bin-20190410
/gnu/store/7ks38sv5cpg7x76g2fdp9cra3x7dpbyq-texlive-union-51265
/gnu/store/vkdjhkl5s025jsjy7jrr9q7kxlsi2sc4-r-minimal-4.1.0
/gnu/store/pysc6ixb5fj1m2n50dac6aq5lgd5bnv4-r-rlang-0.4.11
Garbage collection statistics:
heap size: 127.31 MiB
allocated: 621.97 MiB
GC times: 24
time spent in GC: 2.82 seconds (37% of user time)

real 0m6.493s
user 0m7.584s
sys 0m0.117s

This time it completes, which is already an improvement ;-). The
41.8 MB download reported is underestimated, but again that’s the
tradeoff we’re making.

Attached is the patch. I’ll go ahead with that if there are no
objections.

> For those interested, I've compiled a list of the top 60
> highest-dependency packages, as reported by package-closure:
>
> pigx 1630
> nextcloud-client 1539
> akregator 1486
> kmail 1484
> korganizer 1481
> kdepim-runtime 1480
> kincidenceeditor 1478
> keventviews 1477
> kmailcommon 1476
> kcalendarsupport 1475

Nice! I really need to stop taking ‘libreoffice’ as an example.

Thanks,
Ludo’.
diff --git a/guix/store.scm b/guix/store.scm
index 1ab2b08b47..340ad8bc10 100644
--- a/guix/store.scm
+++ b/guix/store.scm
@@ -1358,11 +1358,27 @@ on the build output of a previous derivation."
 (define (map/accumulate-builds store proc lst)
   "Apply PROC over each element of LST, accumulating 'build-things' calls and
 coalescing them into a single call."
-  (define result
-    (map (lambda (obj)
-           (with-build-handler build-accumulator
-             (proc obj)))
-         lst))
+  (define accumulation-cutoff
+    ;; Threshold above which we stop accumulating unresolved nodes to avoid
+    ;; pessimal behavior.  See <https://bugs.gnu.org/49439>.
+    30)
+
+  (define-values (result rest)
+    (let loop ((lst lst)
+               (result '())
+               (unresolved 0))
+      (match lst
+        ((head . tail)
+         (match (with-build-handler build-accumulator
+                  (proc head))
+           ((? unresolved? obj)
+            (if (> unresolved accumulation-cutoff)
+                (values (reverse (cons obj result)) tail)
+                (loop tail (cons obj result) (+ 1 unresolved))))
+           (obj
+            (loop tail (cons obj result) unresolved))))
+        (()
+         (values (reverse result) lst)))))
 
   (match (append-map (lambda (obj)
                        (if (unresolved? obj)
@@ -1370,6 +1386,7 @@ coalescing them into a single call."
                            '()))
                      result)
     (()
+     ;; REST is necessarily empty.
      result)
     (to-build
      ;; We've accumulated things TO-BUILD.  Actually build them and resume the
@@ -1382,7 +1399,7 @@ coalescing them into a single call."
                                   ;; unnecessary.
                                   ((unresolved-continuation obj) #f)
                                   obj))
-                            result))))
+                            (append result rest)))))
 
 (define build-things
   (let ((build (operation (build-things (string-list things)
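
For readers less familiar with Scheme, the cutoff loop in the patch above can be paraphrased in Python. This is a sketch of the control flow only, not a port of guix/store.scm: the `Unresolved` class and the `accumulate_with_cutoff` name are stand-ins invented for this example, and `proc` here simply returns either a plain value or an `Unresolved` marker instead of aborting to a build handler.

```python
from dataclasses import dataclass


@dataclass
class Unresolved:
    """Stand-in for an <unresolved> node: a result that still needs a build."""
    item: str


def accumulate_with_cutoff(proc, items, cutoff=30):
    """Apply PROC to ITEMS (a list), stopping early once too many results are unresolved.

    Returns (result, rest): RESULT holds the values computed so far, in
    order, and REST holds the items that were not processed yet.  As in the
    patch, the unresolved node that trips the cutoff is kept in RESULT.
    """
    result = []
    unresolved = 0
    for index, item in enumerate(items):
        obj = proc(item)
        result.append(obj)
        if isinstance(obj, Unresolved):
            if unresolved > cutoff:
                # Stop accumulating; the caller builds what is pending and
                # then processes the remaining items.
                return result, items[index + 1:]
            unresolved += 1
    # Every item was processed, so nothing is left over.
    return result, []
```

With the default cutoff of 30 and the strict `>` comparison, the loop stops at the 32nd unresolved result. The trade-off is the one described in the message: fewer suspended requests in memory, at the cost of a less complete picture of what will be built or downloaded.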
Ludovic Courtès wrote on 10 Aug 2021 17:36
(name . Sarah Morgensen)(address . iskarian@mgsn.dev)
87v94dpb0k.fsf@gnu.org
Hello!

Ludovic Courtès <ludo@gnu.org> writes:

> I added counters and ‘pk’ calls in ‘map/accumulate-builds’ & co. to see
> what happens. AFAICS, a cutoff as in the attached patch does the job.
> It’s a basic heuristic to avoid unbounded growth, at the expense of
> possibly reduced accumulation. For example, here’s the effect on my
> user profile with 300+ packages:

Pushed as fa81971cbae85b39183ccf8f51e8d96ac88fb4ac.

I saw your message on IRC, Sarah; note that because grafts are “dynamic
dependencies” (they depend on previous build results), we cannot know in
advance which grafts we’re going to build, so there’s necessarily at
least a second phase during which grafts get built. See

Thanks!

Ludo’.
Closed

This issue is archived.

To comment on this conversation send email to 49439@debbugs.gnu.org