UPower ignores ‘critical-power-action’

OpenSubmitted by Ludovic Courtès.
Details
2 participants
  • Ludovic Courtès
  • Maxime Devos
Owner
unassigned
Severity
important
L
L
Ludovic Courtès wrote on 30 Aug 17:59 +0200
(address . bug-guix@gnu.org)
87sflden8d.fsf@inria.fr
As discussed on IRC with Tobias, UPower appears to ignore our
‘critical-power-action’ setting. On my machine, I left the default
(‘HybridSleep’) but upowerd dismisses it and chooses ‘PowerOff’:

Toggle snippet (38 lines)
$ guix system describe
Generation 198 Aug 29 2022 00:47:53 (current)
file name: /var/guix/profiles/system-198-link
canonical file name: /gnu/store/85441w3nzqv8lg04gm7601wi9np4qlw7-system
label: GNU with Linux-Libre 5.18.19
bootloader: grub-efi
root device: label: "root"
kernel: /gnu/store/a43ai5qi4vbgm2zywg4b60y71d7whccn-linux-libre-5.18.19/bzImage
channels:
guix:
repository URL: https://git.savannah.gnu.org/git/guix.git
branch: master
commit: 3294fa2ba451c7d5ef42a5d9fac780877f364bc7
configuration file: /gnu/store/lmqb5d0il8zydd0p0vz4kviaq1qg4n9m-configuration.scm
$ upower -d | tail -6
Daemon:
daemon-version: 0.99.15
on-battery: no
lid-is-closed: no
lid-is-present: yes
critical-action: PowerOff
$ sudo herd status upower-daemon
Status of upower-daemon:
It is started.
Running value is 338.
It is enabled.
Provides (upower-daemon).
Requires (dbus-system udev).
Conflicts with ().
Will be respawned.
$ sudo cat /proc/338/environ |xargs -0
PWD=/ UPOWER_CONF_FILE_NAME=/gnu/store/yq6zf8q2l2axy03d99pami3sxrk4784y-UPower.conf SHLVL=0 XDG_DATA_DIRS=/gnu/store/bnsf9il448hl5xjavbhq3rcx355svz2v-glib-2.70.2/share
$ cat /gnu/store/yq6zf8q2l2axy03d99pami3sxrk4784y-UPower.conf |grep Critical
PercentageCritical=3
TimeCritical=300
CriticalPowerAction=HybridSleep

Looking at the code, it could be because the ‘CanHybridSleep’ method
returns false, but why that would happen is unknown to me.

Thoughts?

Ludo’.
L
L
Ludovic Courtès wrote on 30 Aug 18:20 +0200
control message for bug #57490
(address . control@debbugs.gnu.org)
87r10xem9a.fsf@gnu.org
severity 57490 important
quit
L
L
Ludovic Courtès wrote on 30 Aug 18:22 +0200
Re: bug#57490: UPower ignores ‘critical-power-a ction’
(address . 57490@debbugs.gnu.org)
87mtblem6e.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (4 lines)
> As discussed on IRC with Tobias, UPower appears to ignore our
> ‘critical-power-action’ setting. On my machine, I left the default
> (‘HybridSleep’) but upowerd dismisses it and chooses ‘PowerOff’:

Furthermore, powering off is actually ungraceful: ‘halt’ wasn’t invoked;
the machine just stopped abruptly, fsck was needed on reboot, etc.

The “PowerOff” DBus method is implemented by elogind, so there could be
another bug there.

Ludo’.
L
L
Ludovic Courtès wrote on 30 Aug 23:56 +0200
(address . 57490@debbugs.gnu.org)
871qsxe6ol.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (3 lines)
> Looking at the code, it could be because the ‘CanHybridSleep’ method
> returns false, but why that would happen is unknown to me.

Indeed, if we run ‘dbus-monitor --system’ and
‘herd restart upower-daemon’ (in QEMU), we see this:

Toggle snippet (11 lines)
method call time=1661890192.586471 sender=:1.40 -> destination=:1.39 serial=16 path=/org/freedesktop/UPower; interface=org.freedesktop.UPower; member=GetCriticalAction
method call time=1661890192.586862 sender=:1.39 -> destination=:1.1 serial=17 path=/org/freedesktop/login1; interface=org.freedesktop.login1.Manager; member=CanHybridSleep
method return time=1661890192.588676 sender=:1.1 -> destination=:1.39 serial=104 reply_serial=17
string "na"
method call time=1661890192.589034 sender=:1.39 -> destination=:1.1 serial=18 path=/org/freedesktop/login1; interface=org.freedesktop.login1.Manager; member=CanHibernate
method return time=1661890192.591082 sender=:1.1 -> destination=:1.39 serial=105 reply_serial=18
string "na"
method return time=1661890192.591405 sender=:1.39 -> destination=:1.40 serial=19 reply_serial=16
string "PowerOff"

That is, elogind returns “na” to the Can* methods.

Same story on my actual laptop:

Toggle snippet (11 lines)
$ dbus-send --print-reply --system --dest=org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager.CanHybridSleep
method return time=1661890748.184775 sender=:1.1 -> destination=:1.130 serial=253 reply_serial=2
string "na"
$ dbus-send --print-reply --system --dest=org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager.CanHibernate
method return time=1661890756.999248 sender=:1.1 -> destination=:1.131 serial=254 reply_serial=2
string "na"
$ dbus-send --print-reply --system --dest=org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager.CanPowerOff
method return time=1661890761.375007 sender=:1.1 -> destination=:1.132 serial=258 reply_serial=2
string "yes"

This is not surprising since our ‘logind.conf’ reads:

Toggle snippet (3 lines)
HybridSleepState=disk

… meaning that “hybrid sleep” attempts to suspend-to-disk¹, something
that’s not implemented yet in Guix System².

Setting ‘hybrid-sleep-state’ to '("mem") doesn’t help though:
“CanHybridSleep” still returns “na”. I’m looking at ‘can_sleep_state’
in elogind without seeing why it doesn’t return true.

To be continued…

Ludo’.

² But it’s almost there! https://issues.guix.gnu.org/49475
L
L
Ludovic Courtès wrote on 31 Aug 16:46 +0200
(address . 57490@debbugs.gnu.org)
87wnaoa2tv.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (4 lines)
> Setting ‘hybrid-sleep-state’ to '("mem") doesn’t help though:
> “CanHybridSleep” still returns “na”. I’m looking at ‘can_sleep_state’
> in elogind without seeing why it doesn’t return true.

Having changed elogind’s “LogLevel” to “debug” with a 1km-long
‘dbus-send’ command I’ll spare you, I got this in /var/log/debug for my
“CanHybridSleep” method call (in QEMU):

Toggle snippet (6 lines)
Aug 31 16:01:07 localhost elogind[183]: Got message type=method_call sender=:1.78 destination=org.freedesktop.login1 path=/org/freedesktop/login1 interface=org.freedesktop.login1.Manager member=CanHybridSleep cookie=2 reply_cookie=0 signature=n/a error-name=n/a error-message=n/a
Aug 31 16:01:07 localhost elogind[183]: Sleep mode "mem" is supported by the kernel.
Aug 31 16:01:07 localhost elogind[183]: No possible swap partitions or files suitable for hibernation were found in /proc/swaps.
Aug 31 16:01:07 localhost elogind[183]: Sent message type=method_return sender=n/a destination=:1.78 path=n/a interface=n/a member=n/a cookie=190 reply_cookie=2 signature=s error-name=n/a error-message=n/a

Closer inspection of the code confirms what we can guess from the above:
if swap space is missing, ‘can_sleep’ returns false, even if the chosen
sleep state is “mem”:

Toggle snippet (29 lines)
static int can_sleep_internal(const char *verb, bool check_allowed, const SleepConfig *sleep_config) {
bool allow;
char **modes = NULL, **states = NULL;
int r;

assert(STR_IN_SET(verb, "suspend", "hibernate", "hybrid-sleep", "suspend-then-hibernate"));

[...]

#if 0 /// elogind supports setting a suspend mode
if (!can_sleep_state(states) || !can_sleep_disk(modes))
return false;
#else // 0
if (!can_sleep_state(states) ||
!((strcmp("suspend", verb) && can_sleep_disk(modes)) ||
(streq("suspend", verb) && can_sleep_mem(modes))))
return false;
#endif // 0

if (streq(verb, "suspend"))
return true;

if (!enough_swap_for_hibernation())
return -ENOSPC;

return true;
}

(Specifically, ‘enough_swap_for_hibernation’ returns false when there’s
no space space.)

The caller:

Toggle snippet (25 lines)
static int method_can_shutdown_or_sleep(
Manager *m,
sd_bus_message *message,
InhibitWhat w,
const char *action,
const char *action_multiple_sessions,
const char *action_ignore_inhibit,
const char *sleep_verb,
sd_bus_error *error) {

[...]

if (sleep_verb) {
#if 0 /// elogind needs to have the manager being passed
r = can_sleep(sleep_verb);
#else // 0
r = can_sleep(m, sleep_verb);
#endif // 0
if (IN_SET(r, 0, -ENOSPC))
return sd_bus_reply_method_return(message, "s", "na");
if (r < 0)
return r;
}

I find it a bit ridiculous: if we’re choosing “mem”, then we shouldn’t
need to check for swap space.

However, given how ‘hybrid-sleep’ is documented¹, it’s not meant to be
implemented by suspend-to-RAM:

A low-power state where execution of the OS is paused, which might be
slow to enter, and on complete power loss does not result in lost data
but might be slower to exit in that case. This mode is called
suspend-to-both by the kernel.

So, as a conclusion, it would seem that everything here is working as
advertised: no swap, no hybrid-sleep. (We should probably document that
in the manual.)


One issue remains: UPower should have called elogind’s “PowerOff” method
for ordered shutdown before total power outage, but either that didn’t
happen or elogind didn’t do it right (which is weird, because ‘loginctl
poweroff’ DTRT.)

Thoughts?

Ludo’.

L
L
Ludovic Courtès wrote on 5 Sep 14:56 +0200
(address . 57490@debbugs.gnu.org)
87bkru3rq3.fsf@gnu.org
Hi,

Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (5 lines)
> One issue remains: UPower should have called elogind’s “PowerOff” method
> for ordered shutdown before total power outage, but either that didn’t
> happen or elogind didn’t do it right (which is weird, because ‘loginctl
> poweroff’ DTRT.)

I pushed these patches:

eedf71f948 * services: upower: Default to a percentage-based policy.
4765242540 * services: upower: Update default percentage values.

I’m not sure whether they help, but they might: using a
time-estimate-based policy is documented as less reliable, and I suppose
even less so when a battery gets old, as is the case on this laptop.

I’d like to test whether UPower invokes the intended critical action,
but I’m not sure how to simulate a low battery level. Thoughts?

Thanks,
Ludo’.
M
M
Maxime Devos wrote on 5 Sep 22:01 +0200
Re: bug#57490: UPower ignores ‘critical-pow er-action’
16fe99ab-4fa2-bcb1-e70b-b52bbc70c7f5@telenet.be
On 05-09-2022 14:56, Ludovic Courtès wrote:
Toggle quote (2 lines)
> I’d like to test whether UPower invokes the intended critical action,
> but I’m not sure how to simulate a low battery level. Thoughts?
I've found a 'Virtual Battery Driver': https://lwn.net/Articles/440097/
The gmane links are dead, but the linux source tree as a
drivers/power/supply/test_power.c that seems to be about the same thing.
Greetings,
Maxime.
Attachment: OpenPGP_signature
L
L
Ludovic Courtès wrote on 20 Sep 12:05 +0200
Re: bug#57490: UPower ignores ‘critical-power-a ction’
(address . 57490@debbugs.gnu.org)
875yhipdil.fsf@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (9 lines)
> I pushed these patches:
>
> eedf71f948 * services: upower: Default to a percentage-based policy.
> 4765242540 * services: upower: Update default percentage values.
>
> I’m not sure whether they help, but they might: using a
> time-estimate-based policy is documented as less reliable, and I suppose
> even less so when a battery gets old, as is the case on this laptop.

As those at the Ten Years of Guix event noticed, this didn’t help at all.

Ludo’.
L
L
Ludovic Courtès wrote 4 days ago
(name . Maxime Devos)(address . maximedevos@telenet.be)(address . 57490@debbugs.gnu.org)
87wn9i9rfm.fsf@gnu.org
Hi!

Maxime Devos <maximedevos@telenet.be> skribis:

Toggle quote (10 lines)
> On 05-09-2022 14:56, Ludovic Courtès wrote:
>> I’d like to test whether UPower invokes the intended critical action,
>> but I’m not sure how to simulate a low battery level. Thoughts?
>
> I've found a 'Virtual Battery Driver': https://lwn.net/Articles/440097/
>
> The gmane links are dead, but the linux source tree as a
> drivers/power/supply/test_power.c that seems to be about the same
> thing.

I was able to give that a try. Since the default ‘percentage-action’ in
<upower-configuration> is 2, I did this (I configured syslogd to log to
/dev/console):

Toggle snippet (34 lines)
root@antelope ~# modprobe test-power ac_online=off usb_online=off battery_capacity=2 battery_status=discharging
modprobe test-power ac_online=off usb_online=off battery_capacity=2 battery_status=discharging
Oct 2 17:00:20 localhost vmunix: [ 44.294530] __power_supply_register: Expected proper parent device for 'test_ac'
Oct 2 17:00:20 localhost vmunix: [ 44.295049] __power_supply_register: Expected proper parent device for 'test_battery'
Oct 2 17:00:20 localhost vmunix: [ 44.295147] __power_supply_register: Expected proper parent device for 'test_usb'
root@antelope ~# Oct 2 17:00:20 localhost shepherd[1]: [upowerd]
Oct 2 17:00:20 localhost shepherd[1]: [upowerd] (upowerd:186): UPower-Linux-WARNING **: 17:00:20.457: no valid voltage value found for device /sys/devices/virtual/power_supply/test_battery, assuming 10V
Oct 2 17:00:20 localhost shepherd[1]: [upowerd]
Oct 2 17:00:20 localhost shepherd[1]: [upowerd] (upowerd:186): UPower-Linux-WARNING **: 17:00:20.473: USB power supply /sys/devices/virtual/power_supply/test_usb without usb_type property, please report


root@antelope ~# Oct 2 17:00:41 localhost elogind[185]: System is powering down..
Oct 2 17:00:41 localhost elogind[185]: System is powering down..
Oct 2 17:00:41 localhost shepherd[1]: Exiting shepherd...
Oct 2 17:00:41 localhost shepherd[1]: Service ntpd has been stopped.
Oct 2 17:00:41 localhost ntpd[184]: ntpd exiting on signal 15 (Terminated)
Oct 2 17:00:41 localhost avahi-daemon[191]: Got SIGTERM, quitting.
Oct 2 17:00:41 localhost avahi-daemon[191]: Leaving mDNS multicast group on interface ens3.IPv6 with address fec0::e35c:509d:d937:2087.
Oct 2 17:00:41 localhost avahi-daemon[191]: Leaving mDNS multicast group on interface ens3.IPv4 with address 10.0.2.15.
Oct 2 17:00:41 localhost shepherd[1]: Service avahi-daemon has been stopped.
Oct 2 17:00:41 localhost NetworkManager[183]: <info> [1664722841.2220] caught SIGTERM, shutting down normally.
Oct 2 17:00:41 localhost shepherd[1]: Service networking has been stopped.
Oct 2 17:00:41 localhost shepherd[1]: Service console-font-tty6 has been stopped.
Oct 2 17:00:41 localhost shepherd[1]: Service term-tty6 has been stopped.
Oct 2 17:00:41 localhost NetworkManager[183]: <info> [1664722841.2318] dhcp4 (ens3): canceled DHCP transaction
Oct 2 17:00:41 localhost NetworkManager[183]: <info> [1664722841.2318] dhcp4 (ens3): activation: beginning transaction (timeout in 45 seconds)
Oct 2 17:00:41 localhost NetworkManager[183]: <info> [1664722841.2318] dhcp4 (ens3): state changed no lease
Oct 2 17:00:41 localhost NetworkManager[183]: <info> [1664722841.2326] manager: NetworkManager state is now CONNECTED_SITE
Oct 2 17:00:41 localhost NetworkManager[183]: <info> [1664722841.2335] exiting (success)
Oct 2 17:00:41 localhost avahi-daemon[191]: Leaving mDNS multicast group on interface lo.IPv6 with address ::1.
Oct 2 17:00:41 localhost syslogd: exiting on signal 15
[ 70.670560] reboot: Power down

So 20s after a battery at 2% got plugged in, UPower told elogind to
turn off the VM, which told shepherd to do that—exactly as expected.

It’s kinda disappointing to debug something just to find out it’s
working as advertised. I think the conclusion is that 2% is too low a
threshold, at least for my laptop, or maybe the charge estimate is
slightly off on my laptop. I’ll raise that threshold and see if it
works correctly next time I run out of battery…

If someone has better ideas, I’d love to hear them!

Ludo’.
?