[shepherd] Timer not executed

Done

Details

2 participants

Ludovic Courtès
Tomas Volf

Owner: unassigned

Submitted by: Tomas Volf

Severity: normal

Debbugs page

Tomas Volf wrote 2 weeks ago

Recipients:(address . bug-guix@gnu.org)

Message-ID:87y0xw1fmz.fsf@wolfsden.cz

Hi,

I might have find a bug in shepherd timers. I have a timer scheduled to

run every 24 hours, the definition is as follow:

Toggle snippet (21 lines)(define %kerberos-log-in-refresh-service
  (let ((name 'kerberos-log-in-refresh))
    (simple-service
     name
     home-shepherd-service-type
     (list (shepherd-service
            (documentation "Refresh the kerberos ticket.")
            (provision (list name))
            (requirement '(kerberos-reachable?))
            (start #~(make-timer-constructor
                      (calendar-event #:hours '(12) #:minutes '(0))
                      (command (list #$%kerberos-log-in))))
            (stop #~(make-timer-destructor))
            (modules (cons '(shepherd service timer)
                           %default-modules))
            (actions (list (shepherd-action
                            (name 'trigger)
                            (documentation "Immediately refresh the ticket.")
                            (procedure #~trigger-timer)))))))))

This should run every 24 hours (at noon) and execute the

%kerberos-log-in script (simple guile program that authenticates against

kerberos).

However that did not happen. Here are the logs:

Toggle snippet (9 lines)

2025-02-22 19:17:00 Service kerberos-log-in running with value #<<process> id: 730 command: ("/gnu/store/8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in")>.

2025-02-22 19:17:00 [8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in] <redacted>

2025-02-23 12:00:02 Waiting anew for timer 'kerberos-log-in-refresh' (resuming from sleep state?).

2025-02-23 22:00:01 Not rotating '/home/<redacted>/.local/state/shepherd/dbus.log', which is below the 8192 B threshold.

The ones from 19:17:00 are from 'kerberos-log-in service, which is

one-shot executed upon login. That went fine.

However the 'kerberos-log-in-refresh is only at 12:00:02, and only as

"Waiting anew ...". The message indicates that the computer might be

resuming from sleep, however that was not the case here. It is a

desktop machine, and it was left running over night.

Here is herd status:

Toggle snippet (21 lines)

$ herd status kerberos-log-in-refresh

● Status of kerberos-log-in-refresh:

It is running since Sat 22 Feb 2025 07:17:00 PM CET (28 hours ago).

Timed service.

Periodically running: /gnu/store/8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in

It is enabled.

Provides: kerberos-log-in-refresh

Requires: kerberos-reachable?

Custom action: trigger

Will be respawned.

Upcoming timer alarms:

Mon 24 Feb 2025 12:00:00 PM CET (in 13 hours)

Tue 25 Feb 2025 12:00:00 PM CET (in 37 hours)

Wed 26 Feb 2025 12:00:00 PM CET (in 3 days)

Thu 27 Feb 2025 12:00:00 PM CET (in 4 days)

Fri 28 Feb 2025 12:00:00 PM CET (in 5 days)

Have a nice day,
Tomas

-- 
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.

Ludovic Courtès wrote 2 weeks ago

Recipients:(name . Tomas Volf)(address . ~@wolfsden.cz)(address . 76516@debbugs.gnu.org)

Message-ID:87cyf7tiro.fsf@gnu.org

Hi Tomas,

Tomas Volf <~@wolfsden.cz> skribis:

Toggle quote (19 lines)> However that did not happen.  Here are the logs:
>
> 2025-02-22 19:17:00 Service kerberos-log-in running with value #<<process> id: 730 command: ("/gnu/store/8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in")>.
> 2025-02-22 19:17:00 [8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in] <redacted>
> 2025-02-22 19:17:00 [8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in] <redacted>
> 2025-02-22 19:17:00 [8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in] <redacted>
> 2025-02-22 19:17:00 [8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in] <redacted>
> 2025-02-23 12:00:02 Waiting anew for timer 'kerberos-log-in-refresh' (resuming from sleep state?).
> 2025-02-23 22:00:01 Not rotating '/home/<redacted>/.local/state/shepherd/dbus.log', which is below the 8192 B threshold.
>
>
> The ones from 19:17:00 are from 'kerberos-log-in service, which is
> one-shot executed upon login.  That went fine.
>
> However the 'kerberos-log-in-refresh is only at 12:00:02, and only as
> "Waiting anew ...".  The message indicates that the computer might be
> resuming from sleep, however that was not the case here.  It is a
> desktop machine, and it was left running over night.

What architecture is this on?

From the excerpt above, the ‘log-rotation’ timer did fire as expected.

Did it also have “Waiting anew” messages?

Ludo’.

Ludovic Courtès wrote 2 weeks ago

Recipients:(name . Tomas Volf)(address . ~@wolfsden.cz)(address . 76516@debbugs.gnu.org)

Message-ID:874j0jtijs.fsf@gnu.org

Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (2 lines)

>> 2025-02-23 12:00:02 Waiting anew for timer 'kerberos-log-in-refresh' (resuming from sleep state?).

The “Waiting anew” message happens when the timer fires 2 seconds or

more later than expected (see ‘sleep-operation/check’), which is indeed

the case here.

It’s not supposed to happen normally. Before we bump that to 10

seconds, say, it would be good to understand why the timer got late

here.

Are there services that could block shepherd somehow, for instance by

calling ‘waitpid’, or running computations at 12:00pm?

Ludo’.

Tomas Volf wrote 2 weeks ago

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 76516@debbugs.gnu.org)

Message-ID:87o6yr17t2.fsf@wolfsden.cz

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (25 lines)> Hi Tomas,
>
> Tomas Volf <~@wolfsden.cz> skribis:
>
>> However that did not happen.  Here are the logs:
>>
>> 2025-02-22 19:17:00 Service kerberos-log-in running with value #<<process> id: 730 command: ("/gnu/store/8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in")>.
>> 2025-02-22 19:17:00 [8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in] <redacted>
>> 2025-02-22 19:17:00 [8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in] <redacted>
>> 2025-02-22 19:17:00 [8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in] <redacted>
>> 2025-02-22 19:17:00 [8m21cnqnllk6g1kcgyj91i5h05s7c0c4-krb-log-in] <redacted>
>> 2025-02-23 12:00:02 Waiting anew for timer 'kerberos-log-in-refresh' (resuming from sleep state?).
>> 2025-02-23 22:00:01 Not rotating '/home/<redacted>/.local/state/shepherd/dbus.log', which is below the 8192 B threshold.
>>
>>
>> The ones from 19:17:00 are from 'kerberos-log-in service, which is
>> one-shot executed upon login.  That went fine.
>>
>> However the 'kerberos-log-in-refresh is only at 12:00:02, and only as
>> "Waiting anew ...".  The message indicates that the computer might be
>> resuming from sleep, however that was not the case here.  It is a
>> desktop machine, and it was left running over night.
>
> What architecture is this on?

x86_64-linux, AMD Ryzen 5 5600G with Radeon Graphics

Toggle quote (4 lines)

> From the excerpt above, the ‘log-rotation’ timer did fire as expected.

> Did it also have “Waiting anew” messages?

No, no such message. Actually, there are only 2 additional lines in the

log file. So the following are the last 4 lines of the

shepherd.log.1.zstd (you did already see the first 2 lines):

Toggle snippet (6 lines)

2025-02-23 12:00:02 Waiting anew for timer 'kerberos-log-in-refresh' (resuming from sleep state?).

2025-02-23 22:00:01 Not rotating '/home/<redacted>/.local/state/shepherd/dbus.log', which is below the 8192 B threshold.

2025-02-23 22:00:01 Rotating log.

The empty line is in the log, that is not a copy&paste error. The next

log (shepherd.log) starts with:

Toggle snippet (4 lines)

2025-02-23 22:00:01 Rotating '/home/<redacted>/.local/state/shepherd/shepherd.log' to '/home/<redacted>/.local/state/shepherd/shepherd.log.1'.

2025-02-23 22:00:01 Rotated '/home/<redacted>/.local/state/shepherd/shepherd.log'.

So there is not much indication what happened.

Tomas

There are only two hard things in Computer Science:

cache invalidation, naming things and off-by-one errors.

Tomas Volf wrote 2 weeks ago

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 76516@debbugs.gnu.org)

Message-ID:87jz9f16yt.fsf@wolfsden.cz

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (12 lines)> Ludovic Courtès <ludo@gnu.org> skribis:
>
>>> 2025-02-23 12:00:02 Waiting anew for timer 'kerberos-log-in-refresh' (resuming from sleep state?).
>
> The “Waiting anew” message happens when the timer fires 2 seconds or
> more later than expected (see ‘sleep-operation/check’), which is indeed
> the case here.
>
> It’s not supposed to happen normally.  Before we bump that to 10
> seconds, say, it would be good to understand why the timer got late
> here.

I definitely agree on this.

(I wonder if there is better way to detect the sleep. I feel like *any*

number will be wrong for someone. Do we know how for example systemd's

timers handle this?)

Toggle quote (4 lines)

> Are there services that could block shepherd somehow, for instance by

> calling ‘waitpid’, or running computations at 12:00pm?

Not really (I think). This is full shepherd status output:

Toggle snippet (15 lines)$ herd status
Started:
 + dbus
 + pulseaudio
 + root
 + timer
 + transient
Running timers:
 + kerberos-log-in-refresh
 + log-rotation
One-shot:
 * kerberos-log-in
 * kerberos-reachable?

I have already shared the definition of kerberos-log-in-refresh. There

is no other timer scheduled (except for log rotation). Other services

are from Guix, with the exception of pulseaudio:

Toggle snippet (14 lines)

(define (home-pulseaudio-shepherd-services _)

"Return a shepherd service to run a pulseaudio daemon.

Currently no configuration is supported."

(list

(shepherd-service

(documentation "Run a pulseaudio daemon.")

(provision '(pulseaudio))

(start #~(make-forkexec-constructor

'(#$(file-append pulseaudio "/bin/pulseaudio")

"--daemonize=false")))

(stop #~(make-kill-destructor)))))

There is a timer scheduled to run every 15 minutes in the system

shepherd, but is it not compute heavy (it just checks error counts from

the root filesystem). The machine has 12 cores, each at ~3GHz, 32GB of

RAM and SSD for /. I am not aware of any significant resource use that

should happen at noon, but even if there would be one, it is hard to

believe shepherd would not get a time slice on *any* core for 2 seconds.

For what it is worth, today the cronjob worked fine, however even today

it was executed at :01, so a second later then it should have been.

Toggle snippet (4 lines)

2025-02-24 12:00:01 Timer 'kerberos-log-in-refresh' spawned process 24129.

2025-02-24 12:00:01 Registering new logger for kerberos-log-in-refresh.

If you have any idea what additional information would be useful, I have

no problem deploying patched shepherd with extra logging to this machine

(assuming you know what extra logs we need).

Tomas

There are only two hard things in Computer Science:

cache invalidation, naming things and off-by-one errors.

Ludovic Courtès wrote 2 weeks ago

Recipients:(name . Tomas Volf)(address . ~@wolfsden.cz)(address . 76516@debbugs.gnu.org)

Message-ID:87h64jrosm.fsf@gnu.org

Tomas Volf <~@wolfsden.cz> skribis:

Toggle quote (4 lines)

> (I wonder if there is better way to detect the sleep. I feel like *any*

> number will be wrong for someone. Do we know how for example systemd's

> timers handle this?)

I believe systemd is the one initiating hibernation, so it has the
information first-hand; in our case this is initiated by elogind and
shepherd doesn’t know.  Probably something to fix.

Anyway, this time drift remains a mystery to me.  I would go for a hack
like this:

Toggle diff (23 lines)diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index adc4530..1587a02 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -2490,6 +2490,10 @@ keyword arguments as @code{fork+exec-command}: @code{#:user},
   "Make an operation that returns @var{timeout} when @var{seconds} have
 elapsed and @var{overslept} when many more seconds have elapsed--this can
 happen if the machine is suspended or put into hibernation mode."
+  (define max-delay
+    ;; Time after which we consider that we missed the deadline.
+    (if (> seconds 180) 10 2))
+
   (let ((expiry (+ (get-internal-real-time)
                    (inexact->exact
                     (round (* seconds internal-time-units-per-second))))))
@@ -2497,7 +2501,7 @@ happen if the machine is suspended or put into hibernation mode."
                     (lambda ()
                       (let* ((now (get-internal-real-time))
                              (delta (- now expiry)))
-                        (if (> delta (* 2 internal-time-units-per-second))
+                        (if (> delta (* max-delay internal-time-units-per-second))
                             overslept
                             timeout))))))

WDYT?

Ludo’.

Tomas Volf wrote 2 weeks ago

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 76516@debbugs.gnu.org)

Message-ID:87bjurymtv.fsf@wolfsden.cz

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (24 lines)> Tomas Volf <~@wolfsden.cz> skribis:
>
>> (I wonder if there is better way to detect the sleep.  I feel like *any*
>> number will be wrong for someone.  Do we know how for example systemd's
>> timers handle this?)
>
> I believe systemd is the one initiating hibernation, so it has the
> information first-hand; in our case this is initiated by elogind and
> shepherd doesn’t know.  Probably something to fix.
>
> Anyway, this time drift remains a mystery to me.  I would go for a hack
> like this:
>
> diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
> index adc4530..1587a02 100644
> --- a/modules/shepherd/service.scm
> +++ b/modules/shepherd/service.scm
> @@ -2490,6 +2490,10 @@ keyword arguments as @code{fork+exec-command}: @code{#:user},
>    "Make an operation that returns @var{timeout} when @var{seconds} have
>  elapsed and @var{overslept} when many more seconds have elapsed--this can
>  happen if the machine is suspended or put into hibernation mode."
> +  (define max-delay
> +    ;; Time after which we consider that we missed the deadline.

I would extend the comment to describe why both 10 and 2 are used.

Toggle quote (9 lines)

> + (if (> seconds 180) 10 2))

> +

> (let ((expiry (+ (get-internal-real-time)

> (inexact->exact

> (round (* seconds internal-time-units-per-second))))))

> @@ -2497,7 +2501,7 @@ happen if the machine is suspended or put into hibernation mode."

> (lambda ()

> (let* ((now (get-internal-real-time))

I have no idea how Shepherd works internally (and much less how Fibers

work), so maybe this comment is completely off, but this seems

suspicious. Should this lambda not get the wake up time as an argument,

instead of calling get-internal-real-time to get the "now"?

I have no idea what guarantees do Fibers make regarding the delays

between detecting that time is up and calling the callback. And after

quick look at the source code I have decided that it is way beyond me to

try to figure it out.

Is there a way to enable logging of the events? So we would know when

fibers decided the timer is up, and when the lambda was called?

Toggle quote (10 lines)>                               (delta (- now expiry)))
> -                        (if (> delta (* 2 internal-time-units-per-second))
> +                        (if (> delta (* max-delay internal-time-units-per-second))
>                              overslept
>                              timeout))))))
>  
>
>
> WDYT?

Well, in *this* particular case it would have resolved the problem, so

great for me I guess. However I have left a suggestion above.

Out of curiosity, I have scheduled a timer event for tomorrow 23:0{0..5}

to see if they will fire with delay. Testing with short timer (closest

whole minute) did not bring any results (the timers were executed

exactly on time), so maybe the long wait is a factor? Will report

tomorrow.

Tomas

PS: Looking into timer.scm, I see this comment

Toggle snippet (6 lines)

;; Reached when resuming from sleep state: we slept

;; significantly more than the requested number of seconds. To

;; avoid triggering every timer when resuming from sleep state,

;; sleep again to remain in sync.

Not sure I would call 2 (or even the 10) a "significantly more". :) If I

expect the cron to sleep for 86400 seconds, 10 more seems... minor.

Maybe (I did not put too much though into this and the numbers are

completely thumb-sucked), the "overslept" could be if the sleep was

longer by more than 10% of the timer period, clipped to be at least 2,

and at most 30 minutes?

If I have a cron scheduled to run once a month, I would guess most

people would prefer to have it run 20 minutes late than to skip a month

completely.

There are only two hard things in Computer Science:

cache invalidation, naming things and off-by-one errors.

Ludovic Courtès wrote 2 weeks ago

Recipients:(name . Tomas Volf)(address . ~@wolfsden.cz)(address . 76516@debbugs.gnu.org)

Message-ID:875xkx3tm5.fsf@gnu.org

Hi,

Tomas Volf <~@wolfsden.cz> skribis:

Toggle quote (5 lines)

> I have no idea how Shepherd works internally (and much less how Fibers

> work), so maybe this comment is completely off, but this seems

> suspicious. Should this lambda not get the wake up time as an argument,

> instead of calling get-internal-real-time to get the "now"?

Yes, it would probably be nicer, but it wouldn’t make much of a

difference here (and it’s not related to the bug: the bug shows that we

sleep longer than asked for).

Toggle quote (3 lines)

> Is there a way to enable logging of the events? So we would know when

> fibers decided the timer is up, and when the lambda was called?

There’s no logging at the Fibers level; all we have is logging by

shepherd itself.

Toggle quote (15 lines)> PS: Looking into timer.scm, I see this comment
>
> ;; Reached when resuming from sleep state: we slept
> ;; significantly more than the requested number of seconds.  To
> ;; avoid triggering every timer when resuming from sleep state,
> ;; sleep again to remain in sync.
>
> Not sure I would call 2 (or even the 10) a "significantly more". :) If I
> expect the cron to sleep for 86400 seconds, 10 more seems... minor.
>
> Maybe (I did not put too much though into this and the numbers are
> completely thumb-sucked), the "overslept" could be if the sleep was
> longer by more than 10% of the timer period, clipped to be at least 2,
> and at most 30 minutes?

Yeah, though there’s no reason for sleeps to drift this much, it’s a

pretty fundamental assumption. Maybe this:

(define max-delay

;; Time after which we consider that we missed the deadline. Tolerate a

;; slight drift, which can happen occasionally.

(max (min (/ seconds 10.) 120) 2))

Thanks,

Ludo’.

Tomas Volf wrote 2 weeks ago

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 76516@debbugs.gnu.org)

Message-ID:87senyq2k1.fsf@wolfsden.cz

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (13 lines)> Hi,
>
> Tomas Volf <~@wolfsden.cz> skribis:
>
>> I have no idea how Shepherd works internally (and much less how Fibers
>> work), so maybe this comment is completely off, but this seems
>> suspicious.  Should this lambda not get the wake up time as an argument,
>> instead of calling get-internal-real-time to get the "now"?
>
> Yes, it would probably be nicer, but it wouldn’t make much of a
> difference here (and it’s not related to the bug: the bug shows that we
> sleep longer than asked for).

I am not sure this is correct. What the bug shows is that the callback

is called later then expected. We do not know how long the sleep was.

Am I missing something?

Toggle quote (25 lines)>
>> Is there a way to enable logging of the events?  So we would know when
>> fibers decided the timer is up, and when the lambda was called?
>
> There’s no logging at the Fibers level; all we have is logging by
> shepherd itself.
>
>> PS: Looking into timer.scm, I see this comment
>>
>> ;; Reached when resuming from sleep state: we slept
>> ;; significantly more than the requested number of seconds.  To
>> ;; avoid triggering every timer when resuming from sleep state,
>> ;; sleep again to remain in sync.
>>
>> Not sure I would call 2 (or even the 10) a "significantly more". :) If I
>> expect the cron to sleep for 86400 seconds, 10 more seems... minor.
>>
>> Maybe (I did not put too much though into this and the numbers are
>> completely thumb-sucked), the "overslept" could be if the sleep was
>> longer by more than 10% of the timer period, clipped to be at least 2,
>> and at most 30 minutes?
>
> Yeah, though there’s no reason for sleeps to drift this much, it’s a
> pretty fundamental assumption.

Does not seem to hold in this particular case (at least for the lower

bound). ¯\_(ツ)_/¯

Toggle quote (7 lines)

> Maybe this:

> (define max-delay

> ;; Time after which we consider that we missed the deadline. Tolerate a

> ;; slight drift, which can happen occasionally.

> (max (min (/ seconds 10.) 120) 2))

That should work, yeah. At least as a temporary measure. :)

Few additional data-points: The timers I have scheduled for almost 24h

in the future fired exactly on time. As for the kerberos-log-in-refresh

timer, twice it fired within the 2 seconds (12:00:01), once outside

(12:00:02).

I was thinking about this some more, and the right solution here

probably is to use netlink to listen for ACPI events, the same way acpid

does. That should provide reliable information about the suspend and

resume events.

Tomas

There are only two hard things in Computer Science:

cache invalidation, naming things and off-by-one errors.

Ludovic Courtès wrote 2 weeks ago

Recipients:(name . Tomas Volf)(address . ~@wolfsden.cz)(address . 76516@debbugs.gnu.org)

Message-ID:87ldtok6ia.fsf@gnu.org

Hi,

Tomas Volf <~@wolfsden.cz> skribis:

Toggle quote (13 lines)>>> I have no idea how Shepherd works internally (and much less how Fibers
>>> work), so maybe this comment is completely off, but this seems
>>> suspicious.  Should this lambda not get the wake up time as an argument,
>>> instead of calling get-internal-real-time to get the "now"?
>>
>> Yes, it would probably be nicer, but it wouldn’t make much of a
>> difference here (and it’s not related to the bug: the bug shows that we
>> sleep longer than asked for).
>
> I am not sure this is correct.  What the bug shows is that the callback
> is called later then expected.  We do not know how long the sleep was.
> Am I missing something?

The bug is that it slept longer than expected, not that it was late, if

you see what I mean.

Toggle quote (9 lines)

>> Maybe this:

>> (define max-delay

>> ;; Time after which we consider that we missed the deadline. Tolerate a

>> ;; slight drift, which can happen occasionally.

>> (max (min (/ seconds 10.) 120) 2))

> That should work, yeah. At least as a temporary measure. :)

Heh, agreed. Pushed as 7a7b4e16f9697c4822b7693e63cc4ba0ace134a2.

Toggle quote (5 lines)

> Few additional data-points: The timers I have scheduled for almost 24h

> in the future fired exactly on time. As for the kerberos-log-in-refresh

> timer, twice it fired within the 2 seconds (12:00:01), once outside

> (12:00:02).

OK.

Toggle quote (5 lines)

> I was thinking about this some more, and the right solution here

> probably is to use netlink to listen for ACPI events, the same way acpid

> does. That should provide reliable information about the suspend and

> resume events.

Sounds like a good idea (though it’s a bit annoying to depend on

guile-netlink and low-level details).

Another thing I had in mind was to use an elogind hook so that shepherd

would know when we’re suspending; this is necessary for other things

such as locking LUKS devices on suspend. But that’s a change for 1.1.x.

Thanks,

Ludo’.

Ludovic Courtès wrote 2 weeks ago

control message for bug #76516

Recipients:(address . control@debbugs.gnu.org)

Message-ID:87jz98k6hk.fsf@gnu.org

close 76516

quit

Tomas Volf wrote 1 weeks ago

Re: bug#76516: [shepherd] Timer not executed

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 76516@debbugs.gnu.org)

Message-ID:87ldtonx0i.fsf@wolfsden.cz

Ludovic Courtès <ludo@gnu.org> writes:

Toggle quote (6 lines)

> [..]

> Another thing I had in mind was to use an elogind hook so that shepherd

> would know when we’re suspending; this is necessary for other things

> such as locking LUKS devices on suspend. But that’s a change for 1.1.x.

I see two possible problem here (both solvable).

1. AFAICT shepherd currently does not depend on elogind at all. Having

it as a run-time dependency might be fine on Guix (assuming we move

elogind into %base-services), but could be annoying on foreign

distributions, especially from non-root user's point of view.

2. How will the hook know what all processes it should let know? There

is no global registry of all running shepherd processes no?

Though I am sure both of these are solvable.

Have a nice day (and thanks for the fix :) ),

Tomas

There are only two hard things in Computer Science:

cache invalidation, naming things and off-by-one errors.

Your comment

Commenting via the web interface is currently disabled.

To comment on this conversation send an email to 76516@debbugs.gnu.org

To respond to this issue using the mumi CLI, first switch to it

mumi current 76516

Then, you may apply the latest patchset in this issue (with sign off)

mumi am -- -s

Or, compose a reply to this issue

mumi compose

Or, send patches to this issue

mumi send-email *.patch

You may also tag this issue. See list of standard tags. For example, to set the confirmed and easy tags

mumi command -t +confirmed -t +easy

Or, remove the moreinfo tag and set the help tag

mumi command -t -moreinfo -t +help

is:open	open issues
is:done	closed issues
submitter:<who>	search issue submitter
author:<who>	search by message author
date:yesterday..now	search by issue date
mdate:3m..2d	search by message date