GSSH ERROR while sending store items with 'guix deploy'

  • Open
  • quality assurance status badge
Details
3 participants
  • Leo Famulari
  • Ludovic Courtès
  • Maxim Cournoyer
Owner
unassigned
Submitted by
Maxim Cournoyer
Severity
normal
M
M
Maxim Cournoyer wrote on 16 Mar 2023 03:02
(name . bug-guix)(address . bug-guix@gnu.org)
87bkktjvwz.fsf@gmail.com
Hi,

I'm trying to update one of my machines using 'guix deploy' and I keep
getting the following error:

Toggle snippet (45 lines)
guix deploy: sending 599 store items (3,794 MiB) to 'x200.local'...
20% ????????????????????? ?
27% ???????????????????????????? ?;;; [2023/03/15 21:57:28.007492, 0] [GSSH ERROR] Parent session is not connected: #<unknown channel (freed) 7fca183c31c0>
Backtrace:
In ice-9/boot-9.scm:
1752:10 19 (with-exception-handler _ _ #:unwind? _ # _)
In guix/scripts/deploy.scm:
166:6 18 (_)
In guix/store.scm:
2168:25 17 (run-with-store #<store-connection 256.99 7fca0e115d70> …)
In gnu/machine/ssh.scm:
528:8 16 (_ _)
530:30 15 (_ _)
In ice-9/boot-9.scm:
1752:10 14 (with-exception-handler _ _ #:unwind? _ # _)
In gnu/machine/ssh.scm:
530:30 13 (_)
In guix/store.scm:
2168:25 12 (run-with-store #<store-connection 256.99 7fca0c81c140> …)
In guix/remote.scm:
136:10 11 (_ _)
In guix/store.scm:
2040:38 10 (_ #<store-connection 256.99 7fca0c81c140>)
In guix/ssh.scm:
530:7 9 (send-files _ _ _ #:recursive? _ #:log-port _)
In guix/store.scm:
1764:12 8 (export-paths #<store-connection 256.99 7fca0c81c140> _ …)
1731:22 7 (export-path #<store-connection 256.99 7fca0c81c140> _ # …)
716:13 6 (process-stderr _ _)
In guix/build/utils.scm:
1014:11 5 (loop _ _)
In unknown file:
4 (put-bytevector #<unknown channel (freed) 7fca183c31c0> …)
In ice-9/boot-9.scm:
1685:16 3 (raise-exception _ #:continuable? _)
1683:16 2 (raise-exception _ #:continuable? _)
1685:16 1 (raise-exception _ #:continuable? _)
1685:16 0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Throw to key `guile-ssh-error' with args `("write_to_channel_port"
"Parent session is not connected" #<unknown channel (freed)
7fca183c31c0> #f)'.

It works for a while, so it'll eventually succeeds, but that seems a new
regression (it didn't use to do that).

--
Thanks,
Maxim
L
L
Ludovic Courtès wrote on 17 Mar 2023 22:15
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 62213@debbugs.gnu.org)
87ttyjw03l.fsf@gnu.org
Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

Toggle quote (6 lines)
> I'm trying to update one of my machines using 'guix deploy' and I keep
> getting the following error:
>
> guix deploy: sending 599 store items (3,794 MiB) to 'x200.local'...
> 20% ????????????????????? ?

[...]

Toggle quote (15 lines)
> 716:13 6 (process-stderr _ _)
> In guix/build/utils.scm:
> 1014:11 5 (loop _ _)
> In unknown file:
> 4 (put-bytevector #<unknown channel (freed) 7fca183c31c0> …)
> In ice-9/boot-9.scm:
> 1685:16 3 (raise-exception _ #:continuable? _)
> 1683:16 2 (raise-exception _ #:continuable? _)
> 1685:16 1 (raise-exception _ #:continuable? _)
> 1685:16 0 (raise-exception _ #:continuable? _)
>
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> Throw to key `guile-ssh-error' with args `("write_to_channel_port"
> "Parent session is not connected" #<unknown channel (freed)

What version of guix-daemon is running on the machine that runs "guix
deploy"?

Do the sshd logs on x200.local show anything interesting at the moment
the connection is dropped?

Thanks,
Ludo'.
M
M
Maxim Cournoyer wrote on 20 Mar 2023 21:22
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 62213@debbugs.gnu.org)
87355z41ht.fsf@gmail.com
Hi Ludo!

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (30 lines)
> Hi,
>
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> I'm trying to update one of my machines using 'guix deploy' and I keep
>> getting the following error:
>>
>> guix deploy: sending 599 store items (3,794 MiB) to 'x200.local'...
>> 20% ????????????????????? ?
>
> [...]
>
>> 716:13 6 (process-stderr _ _)
>> In guix/build/utils.scm:
>> 1014:11 5 (loop _ _)
>> In unknown file:
>> 4 (put-bytevector #<unknown channel (freed) 7fca183c31c0> …)
>> In ice-9/boot-9.scm:
>> 1685:16 3 (raise-exception _ #:continuable? _)
>> 1683:16 2 (raise-exception _ #:continuable? _)
>> 1685:16 1 (raise-exception _ #:continuable? _)
>> 1685:16 0 (raise-exception _ #:continuable? _)
>>
>> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
>> Throw to key `guile-ssh-error' with args `("write_to_channel_port"
>> "Parent session is not connected" #<unknown channel (freed)
>
> What version of guix-daemon is running on the machine that runs "guix
> deploy"?

Two variants of 1.4.0-4.01fd830:

Toggle snippet (7 lines)
$ pgrep -a guix-daemon
6883 /gnu/store/izxlr8c7aml0ajqpblidkq014wyxcczz-guix-1.4.0-4.01fd830/bin/guix-daemon 6822 guixbuild --max-silent-time 0 --timeout 0 --log-compression none --discover=yes --substitute-urls http://127.0.0.1:8181 https://ci.guix.gnu.org https://bordeaux.guix.gnu.org --max-jobs=4
13519 /gnu/store/izxlr8c7aml0ajqpblidkq014wyxcczz-guix-1.4.0-4.01fd830/bin/guix-daemon 13504 guixbuild --max-silent-time 0 --timeout 0 --log-compression none --discover=yes --substitute-urls http://127.0.0.1:8181 https://ci.guix.gnu.org https://bordeaux.guix.gnu.org --max-jobs=4
27855 /gnu/store/izxlr8c7aml0ajqpblidkq014wyxcczz-guix-1.4.0-4.01fd830/bin/guix-daemon --build-users-group guixbuild --max-silent-time 0 --timeout 0 --log-compression none --discover=yes --substitute-urls http://127.0.0.1:8181 https://ci.guix.gnu.org https://bordeaux.guix.gnu.org --max-jobs=4
28995 /gnu/store/izxlr8c7aml0ajqpblidkq014wyxcczz-guix-1.4.0-4.01fd830/bin/guix-daemon 28982 guixbuild --max-silent-time 0 --timeout 0 --log-compression none --discover=yes --substitute-urls http://127.0.0.1:8181 https://ci.guix.gnu.org https://bordeaux.guix.gnu.org --max-jobs=4

Toggle quote (3 lines)
> Do the sshd logs on x200.local show anything interesting at the moment
> the connection is dropped?

In my logs, at around the time I was observing these problems, I see a
couple instances of:

Toggle snippet (17 lines)
Mar 15 22:03:58 localhost shepherd[1]: Service sshd-7 has been started.
Mar 15 22:03:59 localhost sshd[24950]: Accepted publickey for mcournoyer from 192.168.10.15 port 40696 ssh2: RSA SHA256:bNjGrjJ6+adVo155SohxuXbuxeJB+w/+vQzhdFs3vUg
Mar 15 22:07:57 localhost sshd[24952]: ssh_dispatch_run_fatal: Connection from user mcournoyer 192.168.10.15 port 40696: invalid format
Mar 15 22:07:57 localhost shepherd[1]: 0 connections still in use after sshd-7 termination.
Mar 15 22:07:57 localhost shepherd[1]: Service sshd-7 (PID 24950) exited with 255.
Mar 15 22:07:57 localhost shepherd[1]: Service sshd-7 has been disabled.
Mar 15 22:07:57 localhost shepherd[1]: Transient service sshd-7 terminated, now unregistered.
Mar 15 22:14:41 localhost shepherd[1]: Service sshd-8 has been started.
Mar 15 22:14:42 localhost sshd[25142]: Accepted publickey for mcournoyer from 192.168.10.15 port 45596 ssh2: RSA SHA256:bNjGrjJ6+adVo155SohxuXbuxeJB+w/+vQzhdFs3vUg
Mar 15 22:20:02 localhost shepherd[1]: 0 connections still in use after sshd-8 termination.
Mar 15 22:20:02 localhost shepherd[1]: Service sshd-8 (PID 25142) exited with 255.
Mar 15 22:20:02 localhost shepherd[1]: Service sshd-8 has been disabled.
Mar 15 22:20:02 localhost shepherd[1]: Transient service sshd-8 terminated, now unregistered.
Mar 15 22:24:19 localhost shepherd[1]: Service sshd-9 has been started.
Mar 15 22:24:20 localhost sshd[25463]: Accepted publickey for mcournoyer from 192.168.10.15 port 41702 ssh2: RSA SHA256:bNjGrjJ6+adVo155SohxuXbuxeJB+w/+vQzhdFs3vUg

Where the 'ssh_dispatch_run_fatal ... invalid format' stands out to me.

I'll report with more if I can reproduce the issue again.

--
Thanks,
Maxim
L
L
Leo Famulari wrote on 18 May 2023 20:24
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)(address . 62213@debbugs.gnu.org)
ZGZtePnFM6dXMDoP@jasmine.lan
On Wed, Mar 15, 2023 at 10:02:20PM -0400, Maxim Cournoyer wrote:
Toggle quote (6 lines)
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> Throw to key `guile-ssh-error' with args `("write_to_channel_port"
> "Parent session is not connected" #<unknown channel (freed)
> 7fca183c31c0> #f)'.
> --8<---------------cut here---------------end--------------->8---

I'm seeing this too with offloading.
?
Your comment

Commenting via the web interface is currently disabled.

To comment on this conversation send an email to 62213@debbugs.gnu.org

To respond to this issue using the mumi CLI, first switch to it
mumi current 62213
Then, you may apply the latest patchset in this issue (with sign off)
mumi am -- -s
Or, compose a reply to this issue
mumi compose
Or, send patches to this issue
mumi send-email *.patch