[PATCH Shepherd 0/2] Use 'signalfd' on GNU/Linux

DoneSubmitted by Ludovic Courtès.
Details
4 participants
  • Jan Nieuwenhuizen
  • Jelle Licht
  • Ludovic Courtès
  • Mathieu Othacehe
Owner
unassigned
Severity
normal
L
L
Ludovic Courtès wrote on 24 May 16:27 +0200
(address . guix-patches@gnu.org)(name . Ludovic Courtès)(address . ludo@gnu.org)
20200524142700.6151-1-ludo@gnu.org
Hello!
This patch series allows shepherd to use ‘signalfd’ on GNU/Linux.It allows us to avoid race conditions related to signal delivery,which in turn means we can pass an infinite timeout to ‘select’,and thus increase battery life.
More generally, it’s a way to structure the code around the eventloop. The next step will be to make the code entirely reactive,so we can have things like socket activation, being able to startservices that don’t depend on one another concurrently, and all that.(This is actually also be possible without ‘signalfd’ but it’s moreconsistent and robust to have everything visible to ‘select’.)
Thoughts?
I guess the main question is: can it go into 0.8.1, which we outtarelease soon due to https://issues.guix.gnu.org/40981. I’d say“yes”, but I’ll be even more confident if others take a look. :-)
Ludo’.
Ludovic Courtès (2): system: Add support for 'signalfd'. shepherd: Use 'signalfd' when possible.
configure.ac | 18 +++++++++ modules/shepherd.scm | 73 ++++++++++++++++++++++++++++++---- modules/shepherd/service.scm | 19 +++++---- modules/shepherd/system.scm.in | 65 ++++++++++++++++++++++++++++++ 4 files changed, 161 insertions(+), 14 deletions(-)
-- 2.26.2
L
L
Ludovic Courtès wrote on 24 May 16:37 +0200
[PATCH Shepherd 2/2] shepherd: Use 'signalfd' when possible.
(address . 41507@debbugs.gnu.org)
20200524143700.6378-2-ludo@gnu.org
This eliminates possible race conditions related to signal delivery and'select', which in turn means we can pass 'select' an infinite timeoutwithout any risk of missing a signal. It also allows us to structurethe code around a single event loop.
* modules/shepherd.scm (maybe-signal-port): New procedure.(main): Use it and define 'signal-port'. Add 'ports' argument to'next-command'. Pass it to 'select'. Change the timeout argument of'select' to #f when SIGNAL-PORT is true.* modules/shepherd/service.scm (fork+exec-command): Call'unblock-signals' in the child process, right before 'exec-command'.--- modules/shepherd.scm | 73 ++++++++++++++++++++++++++++++++---- modules/shepherd/service.scm | 19 ++++++---- 2 files changed, 78 insertions(+), 14 deletions(-)
Toggle diff (154 lines)diff --git a/modules/shepherd.scm b/modules/shepherd.scmindex 46faab6..ac60070 100644--- a/modules/shepherd.scm+++ b/modules/shepherd.scm@@ -24,6 +24,7 @@ #:use-module (ice-9 format) #:use-module (ice-9 rdelim) ;; Line-based I/O. #:autoload (ice-9 readline) (activate-readline) ;for interactive use+ #:use-module ((ice-9 threads) #:select (all-threads)) #:use-module (oop goops) ;; Defining classes and methods. #:use-module (srfi srfi-1) ;; List library. #:use-module (srfi srfi-26)@@ -60,6 +61,53 @@ socket file at FILE-NAME upon exit of PROC. Return the values of PROC." (close sock) (catch-system-error (delete-file file-name)))))) +(define (maybe-signal-port signals)+ "Return a signal port for SIGNALS, using 'signalfd' on GNU/Linux, or #f if+that is not supported."+ (catch 'system-error+ (lambda ()+ (let ((port (signalfd -1 signals)))+ ;; As per the signalfd(2) man page, block SIGNALS. The tricky bit is+ ;; that SIGNALS must be blocked for all the threads; new threads will+ ;; inherit the signal mask, but we must ensure that neither Guile's+ ;; signal delivery thread nor its finalization thread are already+ ;; running, because if they do, they are not blocking SIGNALS. The+ ;; signal delivery thread is started on the first call to 'sigaction'+ ;; so we arrange to not call 'sigaction' beforehand; as for the+ ;; finalization thread, use 'without-automatic-finalization' to+ ;; temporarily stop it.+ (without-automatic-finalization+ (let ((count (length (all-threads))))+ (if (= 1 count)+ (begin+ (block-signals signals)+ port)+ (begin+ (local-output (l10n "warning: \+already ~a threads running, disabling 'signalfd' support")+ count)+ (close-port port)+ #f))))))+ (lambda args+ (if (= ENOSYS (system-error-errno args))+ #f+ (apply throw args)))))++(define (handle-signal-port port)+ "Read from PORT, a signalfd port, and handle the signal accordingly."+ (let ((signal (consume-signalfd-siginfo port)))+ (cond ((= signal SIGCHLD)+ (handle-SIGCHLD))+ ((= signal SIGINT)+ (catch 'quit+ (lambda ()+ (stop root-service))+ quit-exception-handler))+ ((memv signal (list SIGTERM SIGHUP))+ (stop root-service))+ (else+ #f))))+ ;; Main program. (define (main . args)@@ -82,6 +130,12 @@ socket file at FILE-NAME upon exit of PROC. Return the values of PROC." (= EINVAL errno) ;PR_SET_CHILD_SUBREAPER unavailable (apply throw args))))))) + (define signal-port+ ;; Attempt to create a "signal port" via 'signalfd'. This must be called+ ;; before the 'sigaction' procedure is called, because 'sigaction' spawns+ ;; the signal thread.+ (maybe-signal-port (list SIGCHLD SIGINT SIGTERM SIGHUP)))+ (initialize-cli) (let ((config-file #f)@@ -281,7 +335,9 @@ socket file at FILE-NAME upon exit of PROC. Return the values of PROC." ;; "Failed to autoload handle-SIGCHLD in (ice-9 readline):" (handle-SIGCHLD) - (let next-command ()+ (let next-command ((ports (if signal-port+ (list signal-port sock)+ (list sock)))) (define (read-from sock) (match (accept sock) ((command-source . client-address)@@ -289,14 +345,17 @@ socket file at FILE-NAME upon exit of PROC. Return the values of PROC." (process-connection command-source)) (_ #f))) - ;; XXX: Until we use signalfd(2), there's always a time window+ ;; When not using signalfd(2), there's always a time window ;; before 'select' during which a handler async can be queued ;; but not executed. Work around it by exiting 'select' every ;; few seconds.- (match (select (list sock) (list) (list)- (if poll-services? 0.5 30))- (((sock) _ _)- (read-from sock))+ (match (select ports (list) (list)+ (and (not signal-port)+ (if poll-services? 0.5 30)))+ (((port _ ...) _ _)+ (if (and signal-port (eq? port signal-port))+ (handle-signal-port port)+ (read-from sock))) (_ ;; 'select' returned an empty set, probably due to EINTR. ;; Explicitly call the SIGCHLD handler because we cannot be@@ -306,7 +365,7 @@ socket file at FILE-NAME upon exit of PROC. Return the values of PROC." (when poll-services? (check-for-dead-services))- (next-command))))))))+ (next-command ports)))))))) ;; Start all of SERVICES, which is a list of canonical names (FIXME?), ;; but in a order where all dependencies are fulfilled before wediff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scmindex 45fcf32..a202a98 100644--- a/modules/shepherd/service.scm+++ b/modules/shepherd/service.scm@@ -898,13 +898,18 @@ its PID." (pid (and (sigaction SIGTERM SIG_DFL) (primitive-fork)))) (if (zero? pid)- (exec-command command- #:user user- #:group group- #:log-file log-file- #:directory directory- #:file-creation-mask file-creation-mask- #:environment-variables environment-variables)+ (begin+ ;; Unblock any signals that might have been blocked by the parent+ ;; process.+ (unblock-signals (list SIGCHLD SIGINT SIGHUP SIGTERM))++ (exec-command command+ #:user user+ #:group group+ #:log-file log-file+ #:directory directory+ #:file-creation-mask file-creation-mask+ #:environment-variables environment-variables)) (begin ;; Restore the initial SIGTERM handler. (sigaction SIGTERM term-handler)-- 2.26.2
L
L
Ludovic Courtès wrote on 24 May 16:36 +0200
[PATCH Shepherd 1/2] system: Add support for 'signalfd'.
(address . 41507@debbugs.gnu.org)
20200524143700.6378-1-ludo@gnu.org
* configure.ac: Add 'AC_CHECK_SIZEOF' calls for 'structsignalfd_siginfo' and 'sigset_t'. Add 'AC_COMPUTE_INT' for'SFD_CLOEXEC', 'SIG_BLOCK', and 'SIG_UNBLOCK'. Substitute the results.* modules/shepherd/system.scm.in (allocate-sigset, sigemptyset)(sigaddset, sigset): New procedures.(%sizeof-struct-signalfd-siginfo, SFD_CLOEXEC): New variables.(signalfd, consume-signalfd-siginfo): New procedures.(SIG_BLOCK, SIG_UNBLOCK): New variables.(sigprocmask, block-signals, unblock-signals): New procedures.--- configure.ac | 18 ++++++++++ modules/shepherd/system.scm.in | 65 ++++++++++++++++++++++++++++++++++ 2 files changed, 83 insertions(+)
Toggle diff (123 lines)diff --git a/configure.ac b/configure.acindex 5d11846..052a826 100644--- a/configure.ac+++ b/configure.ac@@ -87,6 +87,24 @@ AC_COMPUTE_INT([PR_SET_CHILD_SUBREAPER], [PR_SET_CHILD_SUBREAPER], [#include <sy AC_SUBST([PR_SET_CHILD_SUBREAPER]) AC_MSG_RESULT([done]) +dnl Check the size of 'signalfd_siginfo'. If it's undefined, returns zero.+AC_CHECK_SIZEOF([struct signalfd_siginfo], [], [#include <sys/signalfd.h>])+AC_CHECK_SIZEOF([sigset_t], [], [#include <signal.h>])++AC_MSG_CHECKING([<sys/signalfd.h> and <sys/signal.h> constants])+AC_COMPUTE_INT([SFD_CLOEXEC], [SFD_CLOEXEC], [#include <sys/signalfd.h>])+AC_COMPUTE_INT([SIG_BLOCK], [SIG_BLOCK], [#include <sys/signal.h>])+AC_COMPUTE_INT([SIG_UNBLOCK], [SIG_UNBLOCK], [#include <sys/signal.h>])+AC_MSG_RESULT([done])++SIZEOF_STRUCT_SIGNALFD_SIGINFO="$ac_cv_sizeof_struct_signalfd_siginfo"+SIZEOF_SIGSET_T="$ac_cv_sizeof_sigset_t"+AC_SUBST([SIZEOF_STRUCT_SIGNALFD_SIGINFO])+AC_SUBST([SIZEOF_SIGSET_T])+AC_SUBST([SFD_CLOEXEC])+AC_SUBST([SIG_BLOCK])+AC_SUBST([SIG_UNBLOCK])+ AC_MSG_CHECKING([whether to build crash handler]) case "$host_os" in linux-gnu*) build_crash_handler=yes;;diff --git a/modules/shepherd/system.scm.in b/modules/shepherd/system.scm.inindex e5ecd1f..872fad4 100644--- a/modules/shepherd/system.scm.in+++ b/modules/shepherd/system.scm.in@@ -20,8 +20,10 @@ (define-module (shepherd system) #:use-module (system foreign)+ #:use-module (ice-9 binary-ports) #:use-module (rnrs bytevectors) #:use-module (srfi srfi-11)+ #:use-module (srfi srfi-26) #:export (disable-reboot-on-ctrl-alt-del reboot halt@@ -30,6 +32,11 @@ prctl PR_SET_CHILD_SUBREAPER getpgid+ SFD_CLOEXEC+ signalfd+ consume-signalfd-siginfo+ block-signals+ unblock-signals without-automatic-finalization)) ;; The <sys/reboot.h> constants.@@ -132,6 +139,64 @@ ctrlaltdel(8) and see kernel/reboot.c in Linux." (list err)) result))))) +(define (allocate-sigset)+ (bytevector->pointer (make-bytevector @SIZEOF_SIGSET_T@)))++(define sigemptyset+ (syscall->procedure int "sigemptyset" '(*)))++(define sigaddset+ (syscall->procedure int "sigaddset" `(* ,int)))++(define (sigset signals)+ "Return a pointer to a fresh 'sigset_t' for SIGNALS."+ (let ((set (allocate-sigset)))+ (sigemptyset set)+ (for-each (cut sigaddset set <>) signals)+ set))++(define %sizeof-struct-signalfd-siginfo+ ;; Size of 'struct signalfd_siginfo' or zero if it doesn't exist, as is the+ ;; case on GNU/Hurd.+ @SIZEOF_STRUCT_SIGNALFD_SIGINFO@)++(define SFD_CLOEXEC @SFD_CLOEXEC@)++(define signalfd+ (let ((proc (syscall->procedure int "signalfd" `(,int * ,int))))+ (lambda* (fd signals #:optional (flags SFD_CLOEXEC))+ "Return an open input port over a signal file descriptor for SIGNALS, a+list of signal constants; if FD is -1, a new file descriptor is allocated,+otherwise FD is returned and its associated state is updated. FLAGS must be a+bitmask of SFD_CLOEXEC or SFD_NONBLOCK."+ (fdopen (proc fd (sigset signals) flags) "r0"))))++(define (consume-signalfd-siginfo port)+ "Read a 'signalfd_siginfo' structure from PORT and discard it. Return the+number of the signal received."+ (let ((bv (get-bytevector-n port %sizeof-struct-signalfd-siginfo)))+ ;; The first 'uint32_t' field of 'struct signalfd_siginfo' is the signal+ ;; number.+ (bytevector-u32-native-ref bv 0)))++(define SIG_BLOCK @SIG_BLOCK@)+(define SIG_UNBLOCK @SIG_UNBLOCK@)++(define sigprocmask+ (let ((proc (syscall->procedure int "pthread_sigmask" `(,int * *))))+ (lambda (how signals)+ "Add SIGNALS, a list of SIG* values, to the set of blocked signals if+HOW is SIG_BLOCK, or unblock them if HOW is SIG_UNBLOCK."+ (proc how (sigset signals) %null-pointer))))++(define (block-signals signals)+ "Block SIGNALS, a list of SIG* values, in the current thread."+ (sigprocmask SIG_BLOCK signals))++(define (unblock-signals signals)+ "Unblock SIGNALS, a list of SIG* values, in the current thread."+ (sigprocmask SIG_UNBLOCK signals))+ ;;; ;;; Guile shenanigans.-- 2.26.2
J
J
Jelle Licht wrote on 24 May 17:13 +0200
Re: [bug#41507] [PATCH Shepherd 0/2] Use 'signalfd' on GNU/Linux
87o8qd2wos.fsf@jlicht.xyz
Ludovic Courtès <ludo@gnu.org> writes:
Toggle quote (4 lines)> Hello!>> This patch series allows shepherd to use ‘signalfd’ on GNU/Linux.
[...]
Toggle quote (2 lines)> Thoughts?
I am extremely happy to see that you got this to work; I bashed my headagainst a wall getting the signal-handling thread to block the rightsignals. This does seem like something that should be in guile properrather than shepherd though.
As I read it, you need to set up a signalfd handlers for specificsignals before ever calling `sigaction'. Does this not have an impact onbeing able to do some forms of REPL-driven development with shepherd?Perhaps it makes sense to document this gotcha (and some of itsimplications) in a location other than an inline comment in`maybe-signal-port'.
Thanks for getting this to work nonetheless! - Jelle
L
L
Ludovic Courtès wrote on 25 May 09:37 +0200
(name . Jelle Licht)(address . jlicht@fsfe.org)(address . 41507@debbugs.gnu.org)
87tv04wjm9.fsf@gnu.org
Hi Jelle,
Jelle Licht <jlicht@fsfe.org> skribis:
Toggle quote (5 lines)> I am extremely happy to see that you got this to work; I bashed my head> against a wall getting the signal-handling thread to block the right> signals. This does seem like something that should be in guile proper> rather than shepherd though.
Oh right, you had first-hand experience fiddling with this!
Yeah ideally Guile would provide some ‘signalfd’ support. I’m not sureexactly how that should work. It’s easier to have ad-hoc support in theShepherd in the meantime.
Toggle quote (4 lines)> As I read it, you need to set up a signalfd handlers for specific> signals before ever calling `sigaction'. Does this not have an impact on> being able to do some forms of REPL-driven development with shepherd?
It depends, I don’t test things like SIGCHLD handling from the REPLanyway. :-)
Toggle quote (4 lines)> Perhaps it makes sense to document this gotcha (and some of its> implications) in a location other than an inline comment in> `maybe-signal-port'.
Where would you document that? Currently, one gets a warning if signalscannot be properly blocked.
Thanks for your feedback!
Ludo’.
J
J
Jelle Licht wrote on 25 May 10:02 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41507@debbugs.gnu.org)
87mu5wv3wg.fsf@jlicht.xyz
Ludovic Courtès <ludo@gnu.org> writes:
Toggle quote (14 lines)>> As I read it, you need to set up a signalfd handlers for specific>> signals before ever calling `sigaction'. Does this not have an impact on>> being able to do some forms of REPL-driven development with shepherd?>> It depends, I don’t test things like SIGCHLD handling from the REPL> anyway. :-)>>> Perhaps it makes sense to document this gotcha (and some of its>> implications) in a location other than an inline comment in>> `maybe-signal-port'.>> Where would you document that? Currently, one gets a warning if signals> cannot be properly blocked.
Being able to use both sigaction-based handlers and signalfd is simplyweird, not forbidden.
When I read
Toggle snippet (3 lines)"already X>1 threads running, disabling 'signalfd' support"
I would already have to understand why this is ("I didn't start anythreads, what is going on").
Your inline comment clearly indicates some possible approaches to debugthis warning. Knowing that I should call this before any sigaction stuffseems relevant enough to be documented. The Texinfo manual could have a(tiny) section on this. The _only_ way to learn this is byalready being a Guile-guru, or by reading the source.
- Jelle
M
M
Mathieu Othacehe wrote on 25 May 14:31 +0200
Re: [bug#41507] [PATCH Shepherd 2/2] shepherd: Use 'signalfd' when possible.
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41507@debbugs.gnu.org)
87a71ww5zx.fsf@gnu.org
Hello Ludo,
This is a big improvement, thank you!
Toggle quote (5 lines)> + (lambda args> + (if (= ENOSYS (system-error-errno args))> + #f> + (apply throw args)))))
Maybe:
(and (= ENOSYS (system-error-errno args)) (apply throw args))
Toggle quote (12 lines)> +> +(define (handle-signal-port port)> + "Read from PORT, a signalfd port, and handle the signal accordingly."> + (let ((signal (consume-signalfd-siginfo port)))> + (cond ((= signal SIGCHLD)> + (handle-SIGCHLD))> + ((= signal SIGINT)> + (catch 'quit> + (lambda ()> + (stop root-service))> + quit-exception-handler))
Maybe we should create a handle-SIGINT, to make sure that the same codeis shared with the sigaction handler?
Toggle quote (5 lines)> + (begin> + ;; Unblock any signals that might have been blocked by the parent> + ;; process.> + (unblock-signals (list SIGCHLD SIGINT SIGHUP SIGTERM))
This made me realize that we may want to disable/reset SIGINT andSIGHUP, in the same way that we do for SIGTERM? This is not related toyour patch anyway.
You could also extend the comment to say that it is only necessary ifusing the signalfd mechanism (because signals are not blockedotherwise).
In any case, this looks fine!
Thanks,
Mathieu
L
L
Ludovic Courtès wrote on 26 May 23:32 +0200
Re: [bug#41507] [PATCH Shepherd 0/2] Use 'signalfd' on GNU/Linux
(name . Jelle Licht)(address . jlicht@fsfe.org)(address . 41507@debbugs.gnu.org)
87eer6mlfs.fsf@gnu.org
Hello,
Jelle Licht <jlicht@fsfe.org> skribis:
Toggle quote (19 lines)> Ludovic Courtès <ludo@gnu.org> writes:>>>> As I read it, you need to set up a signalfd handlers for specific>>> signals before ever calling `sigaction'. Does this not have an impact on>>> being able to do some forms of REPL-driven development with shepherd?>>>> It depends, I don’t test things like SIGCHLD handling from the REPL>> anyway. :-)>>>>> Perhaps it makes sense to document this gotcha (and some of its>>> implications) in a location other than an inline comment in>>> `maybe-signal-port'.>>>> Where would you document that? Currently, one gets a warning if signals>> cannot be properly blocked.>> Being able to use both sigaction-based handlers and signalfd is simply> weird, not forbidden.
It’s not really possible in practice. What happens occasionally if youdo that is:
1. thread A exits ‘select’ due to read event on the signal FD 2. thread B runs handler 3. thread A attempts to read from signal FD, which blocks indefinitely because the signal is no longer “pending”
Toggle quote (7 lines)> When I read >> "already X>1 threads running, disabling 'signalfd' support">> I would already have to understand why this is ("I didn't start any> threads, what is going on").
You mean you as a user? The message is meant as a hint for developersif they have to debug the thing. It’s true that there’s nothing userscan do about it. (But hopefully nobody will see that message, it’s notsupposed to happen when running ‘shepherd’ directly!)
Toggle quote (6 lines)> Your inline comment clearly indicates some possible approaches to debug> this warning. Knowing that I should call this before any sigaction stuff> seems relevant enough to be documented. The Texinfo manual could have a> (tiny) section on this. The _only_ way to learn this is by> already being a Guile-guru, or by reading the source.
I don’t think the *user* manual should document POSIX signal handling inmulti-threaded programs with Linux’ signalfd and with Guile. That’s apretty niche topic that users probably don’t want to know about. :-)
Those who do want to know about it are Shepherd developers, and Ithought they’d quickly stumble upon the comment if they have to fiddlewith this. Hopefully the explanation makes it look a bit less arcane!
Thanks,Ludo’.
L
L
Ludovic Courtès wrote on 27 May 00:13 +0200
Re: [bug#41507] [PATCH Shepherd 2/2] shepherd: Use 'signalfd' when possible.
(name . Mathieu Othacehe)(address . othacehe@gnu.org)
875zcimjj2.fsf@gnu.org
Hi!
Mathieu Othacehe <othacehe@gnu.org> skribis:
Toggle quote (10 lines)>> + (lambda args>> + (if (= ENOSYS (system-error-errno args))>> + #f>> + (apply throw args)))))>> Maybe:>> (and (= ENOSYS (system-error-errno args))> (apply throw args))
It’s not equivalent. :-)
Toggle quote (14 lines)>> +(define (handle-signal-port port)>> + "Read from PORT, a signalfd port, and handle the signal accordingly.">> + (let ((signal (consume-signalfd-siginfo port)))>> + (cond ((= signal SIGCHLD)>> + (handle-SIGCHLD))>> + ((= signal SIGINT)>> + (catch 'quit>> + (lambda ()>> + (stop root-service))>> + quit-exception-handler))>> Maybe we should create a handle-SIGINT, to make sure that the same code> is shared with the sigaction handler?
Good idea, done!
Toggle quote (9 lines)>> + (begin>> + ;; Unblock any signals that might have been blocked by the parent>> + ;; process.>> + (unblock-signals (list SIGCHLD SIGINT SIGHUP SIGTERM))>> This made me realize that we may want to disable/reset SIGINT and> SIGHUP, in the same way that we do for SIGTERM? This is not related to> your patch anyway.
Right, I’ll let you look into it if that’s fine with you.
Toggle quote (4 lines)> You could also extend the comment to say that it is only necessary if> using the signalfd mechanism (because signals are not blocked> otherwise).
Done.
I did:
make dist guix build shepherd --with-source=shepherd-0.8.0.tar.gz \ -s x86_64-linux -s i686-linux -s armhf-linux -s aarch64-linux guix build shepherd --with-source=shepherd-0.8.0.tar.gz \ --target=i586-pc-gnu
and it all passes.
Janneke, could you check that it works on GNU/Hurd?
Ludo’.
Closed
J
J
Jan Nieuwenhuizen wrote on 27 May 07:45 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)
87367mdj88.fsf@gnu.org
Ludovic Courtès writes:
Hello,
Toggle quote (2 lines)> Janneke, could you check that it works on GNU/Hurd?
I haven't caught-up with the discussion, but I just verified thatupstream master~
399bea2 shepherd: Use 'signalfd' when possible.
works on latest wip-hurd-vm!
I did a make dist after finding out (well that was pretty clear from thepatches, but...) that applying the patches need autotools rebootstrap,etc.
Janneke
-- Jan Nieuwenhuizen <janneke@gnu.org> | GNU LilyPond http://lilypond.orgFreelance IT http://JoyofSource.com| Avatar® http://AvatarAcademy.com
Closed
L
L
Ludovic Courtès wrote on 27 May 11:23 +0200
(name . Jan Nieuwenhuizen)(address . janneke@gnu.org)
87blm9loik.fsf@gnu.org
Hi,
Jan Nieuwenhuizen <janneke@gnu.org> skribis:
Toggle quote (13 lines)> Ludovic Courtès writes:>> Hello,>>> Janneke, could you check that it works on GNU/Hurd?>> I haven't caught-up with the discussion, but I just verified that> upstream master~>> 399bea2 shepherd: Use 'signalfd' when possible.>> works on latest wip-hurd-vm!
Yay, awesome, thanks for checking!
Ludo’.
Closed
L
L
Ludovic Courtès wrote on 30 May 19:44 +0200
(name . Mathieu Othacehe)(address . othacehe@gnu.org)(address . 41507@debbugs.gnu.org)
87tuzxs4gj.fsf@gnu.org
Hello Mathieu!
Mathieu Othacehe <othacehe@gnu.org> skribis:
Toggle quote (9 lines)>> + (begin>> + ;; Unblock any signals that might have been blocked by the parent>> + ;; process.>> + (unblock-signals (list SIGCHLD SIGINT SIGHUP SIGTERM))>> This made me realize that we may want to disable/reset SIGINT and> SIGHUP, in the same way that we do for SIGTERM? This is not related to> your patch anyway.
I looked into this and came up with the patches below.
What it does is unconditionally block these four signals before forking.Then the child process installs SIG_DFL handlers for them and unblocksthem.
That way, the child is guaranteed to never execute the originalhandlers, and neither the parent nor the child misses any of thesesignals (previously, the temporary (sigaction SIGTERM SIG_DFL)introduced a window during which shepherd could be killed by a SIGTERMinstead of handling it gracefully.)
WDYT?
(I also thought about using ‘call-with-blocked-asyncs’ instead of‘with-blocked-signals’. That would have prevented signal handling atthe Scheme level from happening, but the unhandled signals in the childwould be lost.)
If it works for you, I’ll do some more testing, and then hopefully wecan release 0.8.1 in the coming days!
Ludo’.
From bc74b5e33625a082ad0d44fe4409d459222aa295 Mon Sep 17 00:00:00 2001From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org>Date: Sat, 30 May 2020 17:44:07 +0200Subject: [PATCH 1/3] system: 'sigprocmask' returns the previous set of blocked signals.
* modules/shepherd/system.scm.in (sigismember, sigset->list): Newprocedures.(sigprocmask): Return the old set of signals.--- modules/shepherd/system.scm.in | 29 ++++++++++++++++++++++++++--- 1 file changed, 26 insertions(+), 3 deletions(-)
Toggle diff (60 lines)diff --git a/modules/shepherd/system.scm.in b/modules/shepherd/system.scm.inindex ac822f8..9c55c69 100644--- a/modules/shepherd/system.scm.in+++ b/modules/shepherd/system.scm.in@@ -148,6 +148,11 @@ ctrlaltdel(8) and see kernel/reboot.c in Linux." (define sigaddset (syscall->procedure int "sigaddset" `(* ,int))) +(define sigismember+ (let ((proc (syscall->procedure int "sigismember" `(* ,int))))+ (lambda (set signal)+ (not (zero? (proc set signal))))))+ (define (sigset signals) "Return a pointer to a fresh 'sigset_t' for SIGNALS." (let ((set (allocate-sigset)))@@ -155,6 +160,20 @@ ctrlaltdel(8) and see kernel/reboot.c in Linux." (for-each (cut sigaddset set <>) signals) set)) +(define sigset->list+ (let ((all-signals+ (filter integer?+ (module-map (lambda (symbol variable)+ (let ((str (symbol->string symbol)))+ (and (string-prefix? "SIG" str)+ (not (string-prefix? "SIG_" str))+ (variable-ref variable))))+ (resolve-interface '(guile))))))+ (lambda (set)+ "Return the list of integers (signal numbers) corresponding to SET, a+sigset pointer."+ (filter (cut sigismember set <>) all-signals))))+ (define %sizeof-struct-signalfd-siginfo ;; Size of 'struct signalfd_siginfo' or zero if it doesn't exist, as is the ;; case on GNU/Hurd.@@ -186,13 +205,17 @@ number of the signal received." (let ((proc (syscall->procedure int "pthread_sigmask" `(,int * *)))) (lambda (how signals) "Add SIGNALS, a list of SIG* values, to the set of blocked signals if-HOW is SIG_BLOCK, or unblock them if HOW is SIG_UNBLOCK."+HOW is SIG_BLOCK, or unblock them if HOW is SIG_UNBLOCK. Return the previous+set of blocked signals as a list of SIG* values."+ (define old+ (allocate-sigset))+ (let-values (((result err)- (proc how (sigset signals) %null-pointer)))+ (proc how (sigset signals) old))) (if (= -1 result) (throw 'system-error "sigprocmask" "~A" (list (strerror err)) (list err))- result)))))+ (sigset->list old)))))) (define (block-signals signals) "Block SIGNALS, a list of SIG* values, in the current thread."-- 2.26.2
From ec3631115f8ef070c939f392bb316ad44360a83c Mon Sep 17 00:00:00 2001From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org>Date: Sat, 30 May 2020 19:28:35 +0200Subject: [PATCH 2/3] system: Add 'with-blocked-signals'.
* configure.ac: Compute and substitute 'SIG_SETMASK'.* modules/shepherd/system.scm.in (SIG_SETMASK): New variable.(set-blocked-signals, call-with-blocked-signals): New procedures.(with-blocked-signals): New macro.--- .dir-locals.el | 4 +++- configure.ac | 2 ++ modules/shepherd/system.scm.in | 21 +++++++++++++++++++++ 3 files changed, 26 insertions(+), 1 deletion(-)
Toggle diff (82 lines)diff --git a/.dir-locals.el b/.dir-locals.elindex 8361cb6..3e64a3e 100644--- a/.dir-locals.el+++ b/.dir-locals.el@@ -9,6 +9,8 @@ (bug-reference-url-format . "http://bugs.gnu.org/%s") (bug-reference-bug-regexp . "<https?://\\(debbugs\\|bugs\\)\\.gnu\\.org/\\([0-9]+\\)>")))- (scheme-mode . ((indent-tabs-mode . nil)))+ (scheme-mode+ . ((indent-tabs-mode . nil)+ (eval . (put 'with-blocked-signals 'scheme-indent-function 1)))) (texinfo-mode . ((indent-tabs-mode . nil) (fill-column . 72))))diff --git a/configure.ac b/configure.acindex 052a826..6c2c7b3 100644--- a/configure.ac+++ b/configure.ac@@ -95,6 +95,7 @@ AC_MSG_CHECKING([<sys/signalfd.h> and <sys/signal.h> constants]) AC_COMPUTE_INT([SFD_CLOEXEC], [SFD_CLOEXEC], [#include <sys/signalfd.h>]) AC_COMPUTE_INT([SIG_BLOCK], [SIG_BLOCK], [#include <sys/signal.h>]) AC_COMPUTE_INT([SIG_UNBLOCK], [SIG_UNBLOCK], [#include <sys/signal.h>])+AC_COMPUTE_INT([SIG_SETMASK], [SIG_SETMASK], [#include <sys/signal.h>]) AC_MSG_RESULT([done]) SIZEOF_STRUCT_SIGNALFD_SIGINFO="$ac_cv_sizeof_struct_signalfd_siginfo"@@ -104,6 +105,7 @@ AC_SUBST([SIZEOF_SIGSET_T]) AC_SUBST([SFD_CLOEXEC]) AC_SUBST([SIG_BLOCK]) AC_SUBST([SIG_UNBLOCK])+AC_SUBST([SIG_SETMASK]) AC_MSG_CHECKING([whether to build crash handler]) case "$host_os" indiff --git a/modules/shepherd/system.scm.in b/modules/shepherd/system.scm.inindex 9c55c69..d606573 100644--- a/modules/shepherd/system.scm.in+++ b/modules/shepherd/system.scm.in@@ -37,6 +37,8 @@ consume-signalfd-siginfo block-signals unblock-signals+ set-blocked-signals+ with-blocked-signals without-automatic-finalization)) ;; The <sys/reboot.h> constants.@@ -200,6 +202,7 @@ number of the signal received." (define SIG_BLOCK @SIG_BLOCK@) (define SIG_UNBLOCK @SIG_UNBLOCK@)+(define SIG_SETMASK @SIG_SETMASK@) (define sigprocmask (let ((proc (syscall->procedure int "pthread_sigmask" `(,int * *))))@@ -225,6 +228,24 @@ set of blocked signals as a list of SIG* values." "Unblock SIGNALS, a list of SIG* values, in the current thread." (sigprocmask SIG_UNBLOCK signals)) +(define (set-blocked-signals signals)+ "Block exactly the signals listed in SIGNALS, a list of SIG* values, in the+current thread."+ (sigprocmask SIG_SETMASK signals))++(define (call-with-blocked-signals signals thunk)+ (let ((previous-set #f))+ (dynamic-wind+ (lambda ()+ (set! previous-set (block-signals signals)))+ thunk+ (lambda ()+ (set-blocked-signals previous-set)))))++(define-syntax-rule (with-blocked-signals signals exp ...)+ "Evaluate EXP... in a context where SIGNALS are blocked."+ (call-with-blocked-signals signals (lambda () exp ...)))+ ;;; ;;; Guile shenanigans.-- 2.26.2
From 576ac6155dcabea47dfdf69fb9a8cc07cecf9695 Mon Sep 17 00:00:00 2001From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= <ludo@gnu.org>Date: Sat, 30 May 2020 19:32:43 +0200Subject: [PATCH 3/3] service: 'fork+exec-command' blocks handled signals before forking.
This is a followup to d190773751ddeddbe0daa8e4a43f76b73c4fd7ac, whichaddressed only SIGTERM instead of all of %PRECIOUS-SIGNALS.Furthermore, setting the SIGTERM handler introduced a window in theshepherd process during which SIGTERM instances would be lost.
* modules/shepherd/service.scm (%precious-signals): New variable.(fork+exec-command): Remove calls to 'sigaction' for SIGTERM. Wrap'let' in 'with-blocked-signals'. Restore default signal handlers in thechild before unblocking signals.--- modules/shepherd/service.scm | 53 ++++++++++++++++++------------------ 1 file changed, 26 insertions(+), 27 deletions(-)
Toggle diff (73 lines)diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scmindex f3ac32a..1bc77b1 100644--- a/modules/shepherd/service.scm+++ b/modules/shepherd/service.scm@@ -870,6 +870,10 @@ false." program (strerror (system-error-errno args))) (primitive-exit 1)))))))) +(define %precious-signals+ ;; Signals that the shepherd process handles.+ (list SIGCHLD SIGINT SIGHUP SIGTERM))+ (define* (fork+exec-command command #:key (user #f)@@ -886,33 +890,28 @@ its PID." (sigaction SIGCHLD handle-SIGCHLD SA_NOCLDSTOP) (set! %sigchld-handler-installed? #t)) - ;; When forking a process, the signal handlers are inherited, until it- ;; forks. If SIGTERM is received by the forked process, before it calls- ;; execv, the installed SIGTERM handler, stopping Shepherd will be called.- ;; To avoid this, save the SIGTERM handler, disable it, and restore it once,- ;; the process has been forked. This way, the forked process will use the- ;; default SIGTERM handler stopping the process.- (let ((term-handler (match (sigaction SIGTERM)- ((proc . _)- proc)))- (pid (and (sigaction SIGTERM SIG_DFL)- (primitive-fork))))- (if (zero? pid)- (begin- ;; Unblock any signals that might have been blocked by the parent- ;; process if using 'signalfd'.- (unblock-signals (list SIGCHLD SIGINT SIGHUP SIGTERM))-- (exec-command command- #:user user- #:group group- #:log-file log-file- #:directory directory- #:file-creation-mask file-creation-mask- #:environment-variables environment-variables))- (begin- ;; Restore the initial SIGTERM handler.- (sigaction SIGTERM term-handler)+ ;; Child processes inherit signal handlers until they exec. If one of+ ;; %PRECIOUS-SIGNALS is received by the child before it execs, the installed+ ;; handler, which stops shepherd, is called. To avoid this, block signals+ ;; so that the child process never executes those handlers.+ (with-blocked-signals %precious-signals+ (let ((pid (primitive-fork)))+ (if (zero? pid)+ (begin+ ;; First restore the default handlers.+ (for-each (cut sigaction <> SIG_DFL) %precious-signals)++ ;; Unblock any signals that have been blocked by the parent+ ;; process.+ (unblock-signals %precious-signals)++ (exec-command command+ #:user user+ #:group group+ #:log-file log-file+ #:directory directory+ #:file-creation-mask file-creation-mask+ #:environment-variables environment-variables)) pid)))) (define* (make-forkexec-constructor command-- 2.26.2
M
M
Mathieu Othacehe wrote on 2 Jun 09:00 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 41507@debbugs.gnu.org)
87imgax88r.fsf@gnu.org
Hey Ludo,
Toggle quote (8 lines)> That way, the child is guaranteed to never execute the original> handlers, and neither the parent nor the child misses any of these> signals (previously, the temporary (sigaction SIGTERM SIG_DFL)> introduced a window during which shepherd could be killed by a SIGTERM> instead of handling it gracefully.)>> WDYT?
Yes, this is indeed much better this way!
Toggle quote (4 lines)> +(define %precious-signals> + ;; Signals that the shepherd process handles.> + (list SIGCHLD SIGINT SIGHUP SIGTERM))
We could maybe factorize this list with the one in "signal-port" in the"main" procedure. This way if we ever add an extra signal, we do notforget to add it to blocked signals list.
Otherwise, this looks fine, thanks for taking care of that :)
Mathieu
L
L
Ludovic Courtès wrote on 2 Jun 23:38 +0200
(name . Mathieu Othacehe)(address . othacehe@gnu.org)(address . 41507@debbugs.gnu.org)
87h7vtrvvo.fsf@gnu.org
Hi,
Mathieu Othacehe <othacehe@gnu.org> skribis:
Toggle quote (18 lines)>> That way, the child is guaranteed to never execute the original>> handlers, and neither the parent nor the child misses any of these>> signals (previously, the temporary (sigaction SIGTERM SIG_DFL)>> introduced a window during which shepherd could be killed by a SIGTERM>> instead of handling it gracefully.)>>>> WDYT?>> Yes, this is indeed much better this way!>>> +(define %precious-signals>> + ;; Signals that the shepherd process handles.>> + (list SIGCHLD SIGINT SIGHUP SIGTERM))>> We could maybe factorize this list with the one in "signal-port" in the> "main" procedure. This way if we ever add an extra signal, we do not> forget to add it to blocked signals list.
Yes, good idea.
I pushed the series with a followup commit to do that:
38e3589 shepherd: Factorize list of handled signals. 576ac61 service: 'fork+exec-command' blocks handled signals before forking. ec36311 system: Add 'with-blocked-signals'. bc74b5e system: 'sigprocmask' returns the previous set of blocked signals.
Let’s see if I can tag a release tomorrow…
Thanks for your feedback!
Ludo’.
?