guix-daemon from "guix pull" does not honor user settings

Done. Submitted by Ricardo Wurmus.
Details
2 participants
  • Ludovic Courtès
  • Ricardo Wurmus
Owner
unassigned
Severity
normal
Ricardo Wurmus wrote on 23 May 2019 23:01
“guix pull” fails on setlocale
(address . bug-guix@gnu.org)
87y32wga23.fsf@mdc-berlin.de
Hi Guix,

I’m getting this weird error on “guix pull”:

[rwurmus@max147.mdc-berlin.net:~] $ guix pull
Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...
Building from this channel:
guix https://git.savannah.gnu.org/git/guix.git e26d628
Computing Guix derivation for 'x86_64-linux'... \@ build-started /gnu/store/pryjyasqnhc69qqjsbvv5f1ksi25mjdc-libgit2-0.28.tar.xz.drv - x86_64-linux /gnu/var/log/guix/drvs/pr//yjyasqnhc69qqjsbvv5f1ksi25mjdc-libgit2-0.28.tar.xz.drv 2110 |@ build-log 2110 252
Backtrace:
2 (primitive-load "/gnu/store/lgad0sg02p56jadwqrq674250d5?")
In ice-9/eval.scm:
619:8 1 (_ #f)
In unknown file:
0 (setlocale 6 "en_US.utf8")

ERROR: In procedure setlocale:
In procedure setlocale: Invalid argument
builder for `/gnu/store/pryjyasqnhc69qqjsbvv5f1ksi25mjdc-libgit2-0.28.tar.xz.drv' failed with exit code 1
@ build-failed /gnu/store/pryjyasqnhc69qqjsbvv5f1ksi25mjdc-libgit2-0.28.tar.xz.drv - 1 builder for `/gnu/store/pryjyasqnhc69qqjsbvv5f1ksi25mjdc-libgit2-0.28.tar.xz.drv' failed with exit code 1
cannot build derivation `/gnu/store/nj6zd6gn3x1rf08ayxxwd1v0fyg71v9c-libgit2-0.28.2.drv': 1 dependencies couldn't be built
cannot build derivation `/gnu/store/82x55s3m26j3rpq45ppijzvvh3rhxhsb-guile-git-0.2.0.drv': 1 dependencies couldn't be built
Backtrace:
In ./guix/store.scm:
1667:8 19 (_ _)
1667:8 18 (_ _)
In ./guix/gexp.scm:
708:2 17 (_ _)
In ./guix/monads.scm:
482:9 16 (_ _)
In ./guix/gexp.scm:
573:13 15 (_ _)
In ./guix/store.scm:
1667:8 14 (_ _)
In ./guix/gexp.scm:
708:2 13 (_ _)
In ./guix/monads.scm:
482:9 12 (_ _)
In ./guix/gexp.scm:
573:13 11 (_ _)
In ./guix/store.scm:
1667:8 10 (_ _)
In ./guix/gexp.scm:
708:2 9 (_ _)
In ./guix/monads.scm:
482:9 8 (_ _)
In ./guix/gexp.scm:
573:13 7 (_ _)
In ./guix/store.scm:
1667:8 6 (_ _)
1690:38 5 (_ #<store-connection 256.99 d5cfb40>)
In ./guix/packages.scm:
936:16 4 (cache! #<weak-table 420/883> #<package guile-git@0.2.?> ?)
In ./guix/grafts.scm:
314:4 3 (graft-derivation #<store-connection 256.99 d5cfb40> # # ?)
192:4 2 (references-oracle #<store-connection 256.99 d5cfb40> #)
201:20 1 (_ _ _)
In ./guix/store.scm:
1203:15 0 (_ #<store-connection 256.99 d5cfb40> _ _)

./guix/store.scm:1203:15: Throw to key `srfi-34' with args `(#<condition &store-protocol-error [message: "build of `/gnu/store/82x55s3m26j3rpq45ppijzvvh3rhxhsb-guile-git-0.2.0.drv' failed" status: 100] d59ede0>)'.
guix pull: error: You found a bug: the program '/gnu/store/2mjaq8zxq60ifqxj3fra7f8gyxxccypm-compute-guix-derivation'
failed to compute the derivation for Guix (version: "e26d628b0fabf5a0aa7c4164a9558c66c61e02ab"; system: "x86_64-linux";
host version: "ebd45195dd10eea9ce2c563697989bd4b27dfdd3"; pull-version: 1).
Please report it by email to <bug-guix@gnu.org>.

I’m using “guix” from the result of a previous “guix pull”, but it’s the
same if I use a git checkout.

The daemon is probably a little special. I’m using the daemon from a
git checkout inside of an environment for “guix”, because localstatedir
in my case is /gnu/var.

I also tried using the daemon from the same “guix pull” tree, after
setting GUIX_DATABASE_DIRECTORY=/gnu/var/guix/db and
GUIX_STATE_DIRECTORY=/gnu/var/guix.

Here’s how I launch the daemon:

#!/bin/bash

export GUIX_PROFILE=/gnu/var/guix/profiles/custom/guix-remote/.guix-profile

# We need this to augment the GUILE_LOAD_PATH such that it includes
# the Guile bindings to gnutls. Sourcing the whole profile is
# overkill, but who cares, eh?
source ${GUIX_PROFILE}/etc/profile

# Fix locale warnings
export GUIX_LOCPATH=${GUIX_PROFILE}/lib/locale

# Fix certificate validation
export SSL_CERT_DIR=${GUIX_PROFILE}/etc/ssl/certs/
#export GUIX_DATABASE_DIRECTORY=/gnu/var/guix/db
#export GUIX_STATE_DIRECTORY=/gnu/var/guix

#/gnu/remote/.guix-pull/bin/guix-daemon \
#/gnu/remote/guix/pre-inst-env guix-daemon \
exec /gnu/remote/guix/pre-inst-env guix-daemon \
    --disable-log-compression \
    --build-users-group=guix-builder \
    --listen=141.80.186.209:9999 \
    --substitute-urls="https://berlin.guixsd.org https://mirror.hydra.gnu.org" "$@"

All communication with the daemon happens over network; the local socket
is not involved, but this doesn’t seem to make any difference here.

The simplest reproducer is to run Guile where the daemon runs and to
evaluate setlocale:

[rwurmus@guix-builder:~] (716) $ /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/bin/guile
guile: warning: failed to install locale
warning: failed to install locale: Invalid argument
GNU Guile 2.2.4
Copyright (C) 1995-2017 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (setlocale 6 "en_US.utf8")
ERROR: In procedure setlocale:
In procedure setlocale: Invalid argument

Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]>

This is expected because GUIX_LOCPATH isn’t set in this environment.
It’s fine when I set GUIX_LOCPATH to the value it has in the above
guix-daemon wrapper:

[rwurmus@guix-builder:~] (719) $ GUIX_LOCPATH=/gnu/var/guix/profiles/custom/guix-remote/.guix-profile/lib/locale /gnu/store/r658y3cgpnf99nxjxqgjiaizx20ac4k0-guile-2.2.4/bin/guile
GNU Guile 2.2.4
Copyright (C) 1995-2017 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> (setlocale 6 "en_US.utf8")
$1 = "en_US.utf8"
scheme@(guile-user)>
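
The two sessions above differ only in GUIX_LOCPATH. A minimal sketch of a check that can be run before starting Guile, assuming the locale data sits either directly under the locale path or one level down in a libc-version subdirectory (for example 2.28/en_US.utf8); the function name has_locale is hypothetical:

```shell
#!/bin/bash
# Sketch: report whether a locale directory is present under a locale path.
# Assumption: the layout is <locpath>/<locale>/ or <locpath>/<version>/<locale>/.
has_locale() {
    local locpath=$1 locale=$2 dir
    # An unset or empty locale path can never provide the locale.
    [ -n "$locpath" ] || return 1
    for dir in "$locpath"/"$locale" "$locpath"/*/"$locale"; do
        [ -d "$dir" ] && return 0
    done
    return 1
}

# Usage (not run here):
#   has_locale "$GUIX_LOCPATH" en_US.utf8 || echo "en_US.utf8 not found" >&2
```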

I don’t understand why Guile as used in the builder of
libgit2-0.28.tar.xz would behave any differently, as the daemon’s
environment looks fine to me:

[rwurmus@guix-builder:~] (723) $ sudo strings /proc/27562/environ
GUIX_LOCPATH=/gnu/var/guix/profiles/custom/guix-remote/.guix-profile/lib/locale
NIX_BUILD_HOOK=/gnu/remote/guix/nix/scripts/offload
NIX_HASH=
NIX_LIBEXEC_DIR=/gnu/remote/guix/nix/scripts
LC_ALL=en_US.UTF-8
GUILE_LOAD_PATH=/gnu/remote/guix:/gnu/remote/guix:/gnu/var/guix/profiles/custom/guix-remote/.guix-profile/share/guile/site/2.2
GUIX_PROFILE=/gnu/var/guix/profiles/custom/guix-remote/.guix-profile
GUILE_LOAD_COMPILED_PATH=/gnu/remote/guix:/gnu/var/guix/profiles/custom/guix-remote/.guix-profile/lib/guile/2.2/site-ccache
PATH=/gnu/remote/guix/scripts:/gnu/remote/guix:/gnu/var/guix/profiles/custom/guix-remote/.guix-profile/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
PWD=/
LANG=en_US.UTF-8
SSL_CERT_DIR=/gnu/var/guix/profiles/custom/guix-remote/.guix-profile/etc/ssl/certs/
SHLVL=0
NIX_ROOT_FINDER=/gnu/remote/guix/nix/scripts/list-runtime-roots
GUIX_UNINSTALLED=1
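
/proc/PID/environ is a NUL-separated list, so a single variable can also be pulled out without "strings". A small sketch; env_lookup is a hypothetical helper, and the file only reflects the environment the process was started with:

```shell
#!/bin/bash
# Sketch: look up one variable in a NUL-separated environment dump such as
# /proc/PID/environ. Prints the value, or nothing if the variable is absent.
env_lookup() {
    local file=$1 name=$2
    # Turn NUL separators into newlines, then keep the first NAME=... entry.
    tr '\0' '\n' < "$file" | sed -n "s/^${name}=//p" | head -n 1
}

# Usage (not run here; reading another user's process needs root):
#   sudo bash -c "$(declare -f env_lookup); env_lookup /proc/27562/environ GUIX_LOCPATH"
```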

What’s going on here?

--
Ricardo
Ricardo Wurmus wrote on 23 May 2019 23:40
(address . 35874@debbugs.gnu.org)(address . ludo@gnu.org)
87woigg88w.fsf@mdc-berlin.de
This is a store corruption bug.

The problem appears to be that I accidentally ran the daemon with the
wrong GUIX_DATABASE_DIRECTORY. The localstatedir is /gnu/var, not /var.

In an attempt to simplify my complicated cluster setup, I wanted to
switch from the git checkout to “guix pull”. I was able to use the Guix
client from “guix pull”, but not the daemon, because of the
localstatedir difference.

When I started the daemon from “guix pull” without having set
GUIX_DATABASE_DIRECTORY and asked Guix to build something, I noticed
this error message:

guix pull: error: cannot unlink `/gnu/store/h90vnqw0nwd0hhm1l5dgxsdrigddfmq4-glibc-2.28/lib/gconv': Directory not empty

Wait, “unlink”? Of course: when a build is not found in the database
but the store contains an item of the same name, the daemon will remove
the existing directory.

In my case, the daemon did not realize that it couldn’t ever find
anything interesting in the database, because it looked in the wrong
localstatedir. So it partially removed store items and then
aborted, leaving the store in a broken state.

Can we make the daemon detect that its understanding of the site differs
from that of the Guix client?
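
One possible wrapper-side guard, sketched here as an illustration only and not as something the daemon itself does: refuse to launch when the store already holds items but the database the daemon would open is missing. The helper name preflight and the db/db.sqlite file name under the state directory are assumptions.

```shell
#!/bin/bash
# Sketch: pre-flight check for a daemon launch script.
# Assumption: the database lives at <state-dir>/db/db.sqlite.
preflight() {
    local store=$1 state=$2
    # A populated store paired with a missing database suggests that the
    # state directory points at the wrong localstatedir.
    if [ -n "$(ls -A "$store" 2>/dev/null)" ] && [ ! -f "$state/db/db.sqlite" ]; then
        echo "store $store is populated but $state/db/db.sqlite is missing" >&2
        return 1
    fi
    return 0
}

# Usage (not run here):
#   preflight /gnu/store /gnu/var/guix && exec guix-daemon "$@"
```

Such a check only catches the case described in this report; it cannot tell whether an existing database actually belongs to the store.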

--
Ricardo
Ludovic Courtès wrote on 24 May 2019 15:49
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 35874@debbugs.gnu.org)
877eagx8s2.fsf@gnu.org
Hello Ricardo,

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:

> This is a store corruption bug.
>
> The problem appears to be that I accidentally ran the daemon with the
> wrong GUIX_DATABASE_DIRECTORY. The localstatedir is /gnu/var, not /var.

Ouch. :-/

> When I started the daemon from “guix pull” without having set
> GUIX_DATABASE_DIRECTORY and I asked Guix to build something I noticed
> this error message:
>
> guix pull: error: cannot unlink `/gnu/store/h90vnqw0nwd0hhm1l5dgxsdrigddfmq4-glibc-2.28/lib/gconv': Directory not empty

When you do ‘guix pull’, the resulting (guix config) is supposed to
honor the settings of the calling ‘guix’: %localstatedir, etc.

It seems that it wasn’t the case here? Could you try again running
‘guix pull’ from a ‘guix’ command that has non-default settings and
check the resulting (guix config) module?

> Can we make the daemon detect that its understanding of the site differs
> from that of the Guix client?

I don’t see how that could be done. The daemon necessarily assumes that
its database is authoritative.

This kind of issue was supposed to happen only when building from
source, but in that case, ./configure tries hard to protect against
that. Here it seems that the real issue is that ‘guix pull’ produces a
‘guix’ that does not honor your settings.

Anyway, I hope you managed to recover from it without too much hassle.

Thanks,
Ludo’.
Ricardo Wurmus wrote on 24 May 2019 16:11
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 35874@debbugs.gnu.org)
87tvdkdjsh.fsf@mdc-berlin.de
Ludovic Courtès <ludo@gnu.org> writes:

>> When I started the daemon from “guix pull” without having set
>> GUIX_DATABASE_DIRECTORY and I asked Guix to build something I noticed
>> this error message:
>>
>> guix pull: error: cannot unlink `/gnu/store/h90vnqw0nwd0hhm1l5dgxsdrigddfmq4-glibc-2.28/lib/gconv': Directory not empty
>
> When you do ‘guix pull’, the resulting (guix config) is supposed to
> honor the settings of the calling ‘guix’: %localstatedir, etc.
>
> It seems that it wasn’t the case here? Could you try again running
> ‘guix pull’ from a ‘guix’ command that has non-default settings and
> check the resulting (guix config) module?

Is (guix config) enough? What about the daemon? I’ve had no problem
with “guix” itself when used with a daemon taken from the git checkout.

>> Can we make the daemon detect that its understanding of the site differs
>> from that of the Guix client?
>
> I don’t see how that could be done. The daemon necessarily assumes that
> its database is authoritative.
>
> This kind of issue was supposed to happen only when building from
> source, but in that case, ./configure tries hard to protect against
> that. Here it seems that the real issue is that ‘guix pull’ produces a
> ‘guix’ that does not honor your settings.

This is confusing, because I *am* using the “guix” client from whatever
“guix pull” produces. It’s just the daemon that works against me when I
take it from the same directory as the “guix” client.

So, “guix-daemon” currently runs from the git checkout, and all users
talk to it with “guix” from various runs of “guix pull” (we initially
pulled using the properly configured version from the git checkout).

> Anyway, I hope you managed to recover from it without too much hassle.

Yes, I was able to identify the corrupt store items and copy the
corresponding items from a separate machine. I was lucky that it
aborted early when trying to delete items, so it seems that it didn’t
get to do all that much damage.

(Curiously, I wasn’t able to run “guix gc --verify=repair,contents”
because Guix claims I don’t have sufficient privileges to repair the
store — I’m running this as root, but who knows how NFS complicates
things…)

--
Ricardo
Ludovic Courtès wrote on 25 May 2019 19:17
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 35874@debbugs.gnu.org)
87ef4mqwqk.fsf@gnu.org
Hi!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:

> Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> When you do ‘guix pull’, the resulting (guix config) is supposed to
>> honor the settings of the calling ‘guix’: %localstatedir, etc.
>>
>> It seems that it wasn’t the case here? Could you try again running
>> ‘guix pull’ from a ‘guix’ command that has non-default settings and
>> check the resulting (guix config) module?
>
> Is (guix config) enough? What about the daemon? I’ve had no problem
> with “guix” itself when used with a daemon taken from the git checkout.

Oooh, good point, the ‘guix-daemon’ package uses a fixed localstatedir.

I believe the patch below solves the problem. WDYT?

> Yes, I was able to identify the corrupt store items and copy the
> corresponding items from a separate machine. I was lucky that it
> aborted early when trying to delete items, so it seems that it didn’t
> get to do all that much damage.

Phheeew.

> (Curiously, I wasn’t able to run “guix gc --verify=repair,contents”
> because Guix claims I don’t have sufficient privileges to repair the
> store — I’m running this as root, but who knows how NFS complicates
> things…)

It’s supposed to work if you’re root, and the privilege claim checks
just that (see nix-daemon.cc):

if (remoteAddr.ss_family == AF_UNIX) {
    […]
    trusted = clientUid == 0;

[…]
case wopVerifyStore: {
    bool checkContents = readInt(from) != 0;
    bool repair = readInt(from) != 0;
    startWork();
    if (repair && !trusted)
        throw Error("you are not privileged to repair paths");
    bool errors = store->verifyStore(checkContents, repair);
    stopWork();
    writeInt(errors, to);
    break;
}
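
In the excerpt, `trusted` is derived from the client UID only inside the AF_UNIX branch. One reading, which is an inference from the quoted code and not confirmed anywhere in this thread, is that a client reaching the daemon over TCP (as with the --listen=141.80.186.209:9999 setup above) is not treated as trusted even when run as root. A sketch of a client-side check under that assumption; daemon_socket_kind is a hypothetical helper, and the set of URI schemes it recognizes is assumed:

```shell
#!/bin/bash
# Sketch: classify a GUIX_DAEMON_SOCKET value as local (Unix socket) or remote.
# Assumption: guix:// and ssh:// URIs are remote; an empty value, a plain
# path, or any other form is treated as a local socket.
daemon_socket_kind() {
    case ${1:-} in
        guix://*|ssh://*) echo remote ;;
        *) echo local ;;
    esac
}

# Usage (not run here):
#   [ "$(daemon_socket_kind "$GUIX_DAEMON_SOCKET")" = local ] \
#       || echo "repair may be refused over a remote connection" >&2
```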

Thanks,
Ludo’.
diff --git a/guix/self.scm b/guix/self.scm
index 6d7569ec19..8cc82de64c 100644
--- a/guix/self.scm
+++ b/guix/self.scm
@@ -603,7 +603,21 @@ Info manual."
   (define (wrap daemon)
     (program-file "guix-daemon"
                   #~(begin
+                      ;; Refer to the right 'guix' command for 'guix
+                      ;; substitute' & co.
                       (setenv "GUIX" #$command)
+
+                      ;; Honor the user's settings rather than those hardcoded
+                      ;; in the 'guix-daemon' package.
+                      (unless (getenv "GUIX_STATE_DIRECTORY")
+                        (setenv "GUIX_STATE_DIRECTORY"
+                                #$(string-append %localstatedir "/guix")))
+                      (unless (getenv "GUIX_CONFIGURATION_DIRECTORY")
+                        (setenv "GUIX_CONFIGURATION_DIRECTORY"
+                                #$(string-append %sysconfdir "/guix")))
+                      (unless (getenv "NIX_STORE_DIR")
+                        (setenv "NIX_STORE_DIR" %storedir))
+
                       (apply execl #$(file-append daemon "/bin/guix-daemon")
                              "guix-daemon" (cdr (command-line))))))
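
The wrapper in the patch sets each variable only when the user has not already set it. A rough shell analogue of that logic, for sites that launch the daemon from a script; the function name is hypothetical and the three default values are illustrative assumptions (/gnu/var and /gnu/store come from this report, /etc is a guess for sysconfdir):

```shell
#!/bin/bash
# Sketch: export daemon settings without overriding the caller's values.
# "${VAR=default}" assigns only when VAR is unset, which mirrors the
# (unless (getenv ...)) tests in the patch.
set_daemon_defaults() {
    : "${GUIX_STATE_DIRECTORY=/gnu/var/guix}"
    : "${GUIX_CONFIGURATION_DIRECTORY=/etc/guix}"
    : "${NIX_STORE_DIR=/gnu/store}"
    export GUIX_STATE_DIRECTORY GUIX_CONFIGURATION_DIRECTORY NIX_STORE_DIR
}

# Usage (not run here):
#   set_daemon_defaults && exec guix-daemon "$@"
```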
Ludovic Courtès wrote on 25 May 2019 19:20
control message for bug #35874
(address . control@debbugs.gnu.org)
87d0k6qwm7.fsf@gnu.org
retitle 35874 guix-daemon from "guix pull" does not honor user settings
quit
Ricardo Wurmus wrote on 26 May 2019 13:55
Re: bug#35874: “guix pull” fails on setlocale
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 35874@debbugs.gnu.org)
874l5hxweb.fsf@mdc-berlin.de
Hi Ludo,

>>> When you do ‘guix pull’, the resulting (guix config) is supposed to
>>> honor the settings of the calling ‘guix’: %localstatedir, etc.
>>>
>>> It seems that it wasn’t the case here? Could you try again running
>>> ‘guix pull’ from a ‘guix’ command that has non-default settings and
>>> check the resulting (guix config) module?
>>
>> Is (guix config) enough? What about the daemon? I’ve had no problem
>> with “guix” itself when used with a daemon taken from the git checkout.
>
> Oooh, good point, the ‘guix-daemon’ package uses a fixed localstatedir.
>
> I believe the patch below solves the problem. WDYT?

Yes, I think this would fix it. I set two of these variables before
(not NIX_STORE_DIR) and it seemed to work fine.

Thanks!

--
Ricardo
Ludovic Courtès wrote on 26 May 2019 23:24
(name . Ricardo Wurmus)(address . ricardo.wurmus@mdc-berlin.de)(address . 35874-done@debbugs.gnu.org)
87imtwkiy9.fsf@gnu.org
Hello!

Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de> skribis:

>> I believe the patch below solves the problem. WDYT?
>
> Yes, I think this would fix it. I set two of these variables before
> (not NIX_STORE_DIR) and it seemed to work fine.

Great. Pushed as dfc69e4b6d4bbc41a4d37b3cc6ea12adb34aaafa.

Thanks,
Ludo’.
Closed

This issue is archived.

To comment on this conversation send email to 35874@debbugs.gnu.org