Hi! Sarah Morgensen skribis: > Ludovic Courtès writes: [...] >> The seemingly endless CPU usage and unbound memory use comes from the >> ‘build-accumulator’ build handler. Here, the graph of ‘pigx-scrnaseq’ >> has many nodes, and many of them depend on, say, ‘r-rlang’. Thus, when >> ‘r-rlang’ is not in the store, the grafting code keeps asking for it by >> calling ‘build-derivations’, which aborts to the build handler; the >> build handler saves the .drv file name and the continuation and keeps >> going. But since the next package also depends on ‘r-langr’, we abort >> again to the build handler, and so on. >> >> The end result is a very long list of nodes, probably of >> this order in this case: >> >> $ guix graph -t reverse-package r-rlang |grep 'label = "'|wc -l >> 594 >> >> Presumably, the captured continuations occupy quite a bit of memory, >> hence the quick growth. >> >> I suppose one solution is to fire suspended builds when the build >> handler sees a build request for a given derivation for the second time. >> It needs more thought and testing… > > I wonder if instead it's possible to reduce the memory taken by the > continuations? As someone who has absolutely no experience with the > build/derivation system, it seems like all we *should* need to save is > the derivation file name. Continuations may take a bit of room but the main problem is that we’re storing too many of them. :-) Roughly, I think we keep one node per incoming edge into a package not in the store (‘r-rlang’ in the example above). In a case like ‘pigx-scrnaseq’, that’s a lot of items. I added counters and ‘pk’ calls in ‘map/accumulate-builds’ & co. to see what happens. AFAICS, a cutoff as in the attached patch does the job. It’s a basic heuristic to avoid unbounded growth, at the expense of possibly reduced accumulation. For example, here’s the effect on my user profile with 300+ packages: --8<---------------cut here---------------start------------->8--- $ # with cutoff $ time GUIX_PROFILING=gc ./pre-inst-env guix upgrade -n [...] 2,926.7 MB would be downloaded Garbage collection statistics: heap size: 119.37 MiB allocated: 849.56 MiB GC times: 30 time spent in GC: 2.88 seconds (30% of user time) real 0m8.987s user 0m9.482s sys 0m0.186s $ $ # without cutoff $ time GUIX_PROFILING=gc ./pre-inst-env guix upgrade -n [...] 3,402.5 MB would be downloaded Garbage collection statistics: heap size: 127.37 MiB allocated: 1504.59 MiB GC times: 46 time spent in GC: 5.31 seconds (29% of user time) real 0m21.616s user 0m18.082s sys 0m0.255s --8<---------------cut here---------------end--------------->8--- You can tell that, without the cutoff (2nd run), the command more accurately finds out what it’s going to have to download since the “would be downloaded” figure is higher; with the cutoff (1st run), it “sees less” of what it’s going to end up downloading. This could be annoying from a UI viewpoint in “extreme” cases like this one, but most likely the difference would be invisible in many cases. More importantly, having this cutoff halves the execution time, which is nice. The command initially given in this bug report now behaves like this: --8<---------------cut here---------------start------------->8--- $ time GUIX_PROFILING=gc ./pre-inst-env guix environment pigx-scrnaseq --search-paths -n -v2 41.8 MB would be downloaded: /gnu/store/difgj9gf8l7y5bsilcl42x2vzgh493c6-r-utf8-1.2.2 /gnu/store/z4rpk1sn3jhy8brsr30zccln8mira3z5-r-purrr-0.3.4 /gnu/store/nkiga9rfd8a9gdfsp2rjcqz39s8p2hg3-r-magrittr-2.0.1 /gnu/store/c5y0xlw1fbcwa5p5qnk4xji286hd7cqk-r-vctrs-0.3.8 /gnu/store/86vz257dqy4vm6s7axq7zl2jqxacgngf-r-ellipsis-0.3.2 /gnu/store/m2m7h497g5n6vdrp5pxsnr2v8hsriq5f-r-glue-1.4.2 /gnu/store/0zv1sl91fz3ddq8namdfvf6yr0j1p2vx-texlive-bin-20190410 /gnu/store/7ks38sv5cpg7x76g2fdp9cra3x7dpbyq-texlive-union-51265 /gnu/store/vkdjhkl5s025jsjy7jrr9q7kxlsi2sc4-r-minimal-4.1.0 /gnu/store/pysc6ixb5fj1m2n50dac6aq5lgd5bnv4-r-rlang-0.4.11 Garbage collection statistics: heap size: 127.31 MiB allocated: 621.97 MiB GC times: 24 time spent in GC: 2.82 seconds (37% of user time) real 0m6.493s user 0m7.584s sys 0m0.117s --8<---------------cut here---------------end--------------->8--- This time it completes, which is already an improvement ;-). The 41.8 MB download reported is underestimated, but again that’s the tradeoff we’re making. Attached is the patch. I’ll go ahead with that if there are no objections. > For those interested, I've compiled a list of the top 60 > highest-dependency packages, as reported by package-closure: > > pigx 1630 > nextcloud-client 1539 > akregator 1486 > kmail 1484 > korganizer 1481 > kdepim-runtime 1480 > kincidenceeditor 1478 > keventviews 1477 > kmailcommon 1476 > kcalendarsupport 1475 Nice! I really need to stop taking ‘libreoffice’ as an example. Thanks, Ludo’.