syslogd logging kernel messages slowly?

DoneSubmitted by Ludovic Courtès.
Details
2 participants
  • Ludovic Courtès
  • Luis Felipe
Owner
unassigned
Severity
normal
L
L
Ludovic Courtès wrote on 29 May 16:59 +0200
(address . bug-guix@gnu.org)
87y1yk8jwm.fsf@inria.fr
Hello,

I noticed this weird phenomenon on an x86_64 machine (a 2GHz Celeron),
seen in an excerpt of /var/log/messages:

Toggle snippet (34 lines)
May 29 12:34:49 localhost avahi-daemon[349]: avahi-daemon 0.8 starting up.
May 29 12:35:10 localhost vmunix: [ 1.391673] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 1
May 29 12:35:10 localhost vmunix: [ 1.392751] xhci_hcd 0000:00:14.0: hcc params 0x200077c1 hci version 0x100 quirks 0x0000000000009810
May 29 12:35:10 localhost vmunix: [ 1.393050] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.17
May 29 12:35:10 localhost vmunix: [ 1.393057] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
May 29 12:35:10 localhost vmunix: [ 1.393062] usb usb1: Product: xHCI Host Controller
May 29 12:35:10 localhost vmunix: [ 1.393065] usb usb1: Manufacturer: Linux 5.17.11-gnu xhci-hcd
May 29 12:35:10 localhost vmunix: [ 1.393069] usb usb1: SerialNumber: 0000:00:14.0
May 29 12:35:11 localhost vmunix: [ 1.393645] hub 1-0:1.0: USB hub found
May 29 12:35:11 localhost vmunix: [ 1.393669] hub 1-0:1.0: 6 ports detected
May 29 12:35:11 localhost vmunix: [ 1.394524] xhci_hcd 0000:00:14.0: xHCI Host Controller
May 29 12:35:11 localhost vmunix: [ 1.394534] xhci_hcd 0000:00:14.0: new USB bus registered, assigned bus number 2
May 29 12:35:11 localhost vmunix: [ 1.394543] xhci_hcd 0000:00:14.0: Host supports USB 3.0 SuperSpeed
May 29 12:35:11 localhost vmunix: [ 1.394652] usb usb2: New USB device found, idVendor=1d6b, idProduct=0003, bcdDevice= 5.17
May 29 12:34:50 localhost elogind[341]: Watching system buttons on /dev/input/event4 (Darfon HP USB Keyboard)
May 29 12:35:11 localhost vmunix: [ 1.394659] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
May 29 12:35:11 localhost vmunix: [ 1.394664] usb usb2: Product: xHCI Host Controller
May 29 12:35:11 localhost vmunix: [ 1.394667] usb usb2: Manufacturer: Linux 5.17.11-gnu xhci-hcd
May 29 12:35:11 localhost vmunix: [ 1.394670] usb usb2: SerialNumber: 0000:00:14.0
May 29 12:35:11 localhost vmunix: [ 1.395217] hub 2-0:1.0: USB hub found
May 29 12:35:12 localhost vmunix: [ 1.395243] hub 2-0:1.0: 1 port detected
May 29 12:35:12 localhost vmunix: [ 1.395581] i8042: PNP: No PS/2 controller found.
May 29 12:35:12 localhost vmunix: [ 1.395584] i8042: Probing ports directly.
May 29 12:35:12 localhost vmunix: [ 1.396141] serio: i8042 KBD port at 0x60,0x64 irq 1
May 29 12:35:12 localhost vmunix: [ 1.396153] serio: i8042 AUX port at 0x60,0x64 irq 12
May 29 12:35:12 localhost vmunix: [ 1.396481] mousedev: PS/2 mouse device common for all mice
May 29 12:35:12 localhost vmunix: [ 1.396980] rtc_cmos 00:00: RTC can wake from S4
May 29 12:35:12 localhost vmunix: [ 1.397736] rtc_cmos 00:00: registered as rtc0
May 29 12:35:12 localhost vmunix: [ 1.397770] rtc_cmos 00:00: setting system clock to 2022-05-29T10:33:31 UTC (1653820411)
May 29 12:35:13 localhost vmunix: [ 1.397817] rtc_cmos 00:00: alarms up to one month, y3k, 242 bytes nvram
May 29 12:35:13 localhost vmunix: [ 1.397843] i2c_dev: i2c /dev entries driver
May 29 12:34:52 localhost NetworkManager[339]: <info> [1653820492.6955] manager[0xfdb000]: monitoring kernel firmware directory '/lib/firmware'.

The “vmunix” messages take 3 seconds to get logged, although they
correspond to events that occurred in ~0.5 second before syslogd was
started.

During boot up on that machine, if you go to tty12, where syslogd relays
messages, you can indeed see those lines getting displayed pretty slowly.

Has anyone experienced that?

Thanks,
Ludo’.
L
L
Ludovic Courtès wrote on 2 Jun 23:02 +0200
Re: bug#55488: GDM, GNOME: Can't start desktop session after upgrade
(name . Luis Felipe)(address . luis.felipe.la@protonmail.com)
87o7zalqxw.fsf@gnu.org
Hola,

Luis Felipe <luis.felipe.la@protonmail.com> skribis:

Toggle quote (8 lines)
>> Could it be https://issues.guix.gnu.org/55707?At boot time, if you
>> switch to tty12 (by pressing alt-f12) as soon as it’s available, do you
>> see messages that get printed pretty slowly, like 4–5 lines per second?
>
> Maybe? I recorded a video. Image is low quality, but speed is accurate (I changed to tty12 at second 33):
>
> https://luis-felipe.gitlab.io/media/2022/06/guix-system-ab98b51e-boot-TTY12.webm

Oh yes, that’s exactly what I observed on another machine recently.
I’ll trace to investigate further when I can access it.

In the meantime, could you try the following config, which disables
logging on tty12, to see if it boots quicker? The config changes is
twofold; first, after the ‘use-modules’ form of your OS config, add:

Toggle snippet (15 lines)
(define %syslog.conf
;; Custom syslog.conf without /dev/tty12 logging.
(plain-file "custom-syslog.conf"
(let loop ((lines (string-split (plain-file-content %default-syslog.conf)
#\newline))
(result '()))
(match lines
(()
(string-join (reverse result) "\n"))
((head . tail)
(if (string-contains head "/dev/tty12")
(loop tail result) ;drop this line
(loop tail (cons head result))))))))

Second, in the ‘services’ field of your config, replace
‘%desktop-services’ with:

Toggle snippet (6 lines)
(modify-services %desktop-services
(syslog-service-type
_ => (syslog-configuration
(config-file %syslog.conf))))

Dunno if it’s silly, but at least it’ll allow us to check whether it’s
just the framebuffer that’s slowing things down or if it’s something
else.

Thanks for your help!

Ludo’.
L
L
Luis Felipe wrote on 3 Jun 01:51 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)
L6PGP_GXG5oaCpEsXGwEZrfD8cGc19YGAbpGZDDDvmjpgYbLQivsfbawzBf6eHebs_POoLS0kD22OeP7zUWvhMdzeobsSkGSH5PYjgqSEU8=@protonmail.com
Salut,

On Thursday, June 2nd, 2022 at 21:02, Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (12 lines)
> In the meantime, could you try the following config, which disables
> logging on tty12, to see if it boots quicker? The config changes is
> twofold; first, after the ‘use-modules’ form of your OS config, add:
>

> [...]
>

> Dunno if it’s silly, but at least it’ll allow us to check whether it’s
> just the framebuffer that’s slowing things down or if it’s something
> else.

It seems nothing changed. I'm still unable to start a GNOME session, and booting time is about the same. I recorded a new video:

Attachment: signature.asc
L
L
Ludovic Courtès wrote on 4 Jun 00:54 +0200
(name . Luis Felipe)(address . luis.felipe.la@protonmail.com)
87wndxfjcw.fsf@gnu.org
Hi,

Luis Felipe <luis.felipe.la@protonmail.com> skribis:

Toggle quote (2 lines)
> It seems nothing changed. I'm still unable to start a GNOME session, and booting time is about the same.

Indeed.

Here’s another debugging trick; would be great if you could try this:

Toggle snippet (9 lines)
(define strace-syslogd
(program-file "strace-syslogd"
#~(apply execl #$(file-append strace "/bin/strace")
"strace" ;argv[0]
"-f" "-Tt" "-o" "/syslogd.log" "-s" "80"
#$(file-append inetutils "/libexec/syslogd")
(cdr (command-line)))))

and then:

Toggle snippet (6 lines)
(modify-services %desktop-services
(syslog-service-type
_ => (syslog-configuration
(syslogd strace-syslogd))))

This creates a log file, /syslogd.log, which will allow us to see the
time it takes syslogd to read from /proc/kmsg and hopefully to determine
the origin of delays.

TIA!

Ludo’.
L
L
Ludovic Courtès wrote on 4 Jun 19:07 +0200
(name . Luis Felipe)(address . luis.felipe.la@protonmail.com)
878rqcfjcn.fsf@gnu.org
Hi,

Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (22 lines)
> Here’s another debugging trick; would be great if you could try this:
>
> (define strace-syslogd
> (program-file "strace-syslogd"
> #~(apply execl #$(file-append strace "/bin/strace")
> "strace" ;argv[0]
> "-f" "-Tt" "-o" "/syslogd.log" "-s" "80"
> #$(file-append inetutils "/libexec/syslogd")
> (cdr (command-line)))))
>
>
> and then:
>
> (modify-services %desktop-services
> (syslog-service-type
> _ => (syslog-configuration
> (syslogd strace-syslogd))))
>
> This creates a log file, /syslogd.log, which will allow us to see the
> time it takes syslogd to read from /proc/kmsg and hopefully to determine
> the origin of delays.

I tried this on a machine I have access to that exhibits this slowness,
and here’s what I get (excerpt that spans 2+ seconds of syslogd
activity):

Toggle snippet (104 lines)
328 18:46:13 openat(AT_FDCWD, "/dev/console", O_WRONLY|O_CREAT|O_APPEND, 0644) = 4 <0.000099>
328 18:46:13 openat(AT_FDCWD, "/var/log/messages", O_WRONLY|O_CREAT|O_APPEND, 0644) = 5 <0.000075>
328 18:46:13 ioctl(5, TCGETS, 0x7ffe23d6c930) = -1 ENOTTY (Inappropriate ioctl for device) <0.000261>
328 18:46:13 openat(AT_FDCWD, "/var/log/debug", O_WRONLY|O_CREAT|O_APPEND, 0644) = 6 <0.000201>
328 18:46:13 ioctl(6, TCGETS, 0x7ffe23d6c930) = -1 ENOTTY (Inappropriate ioctl for device) <0.000138>
328 18:46:13 openat(AT_FDCWD, "/dev/tty12", O_WRONLY|O_CREAT|O_APPEND, 0644) = 7 <0.001059>
328 18:46:13 ioctl(7, TCGETS, {B38400 opost isig icanon echo ...}) = 0 <0.000101>
328 18:46:13 openat(AT_FDCWD, "/var/log/secure", O_WRONLY|O_CREAT|O_APPEND, 0644) = 8 <0.000077>
328 18:46:13 ioctl(8, TCGETS, 0x7ffe23d6c930) = -1 ENOTTY (Inappropriate ioctl for device) <0.000039>
328 18:46:13 openat(AT_FDCWD, "/var/log/maillog", O_WRONLY|O_CREAT|O_APPEND, 0644) = 9 <0.000070>

[…]

328 18:46:13 read(3, "<5>[ 0.000000] Linux version 5.17.11-gnu (guix@guix) (gcc (GCC) 10.3.0, GNU l"..., 1024) = 981 <0.000083>
328 18:46:13 rt_sigprocmask(SIG_BLOCK, [HUP ALRM], [], 8) = 0 <0.000059>
322 18:46:13 +++ exited with 0 +++
328 18:46:13 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0444, st_size=2962, ...}, 0) = 0 <0.000049>
328 18:46:13 writev(7, [{iov_base="Jun 4 18:46:13", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Linux version 5.17.11-gnu (guix@guix) (gcc (GCC) 10.3.0, "..., iov_len=124}, {iov_base="\r\n", iov_len=2}], 6) = 152 <0.000086>
328 18:46:13 fsync(7) = -1 EINVAL (Invalid argument) <0.000036>
328 18:46:13 writev(6, [{iov_base="Jun 4 18:46:13", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Linux version 5.17.11-gnu (guix@guix) (gcc (GCC) 10.3.0, "..., iov_len=124}, {iov_base="\n", iov_len=1}], 6) = 151 <0.000063>
328 18:46:13 fsync(6) = 0 <0.380857>
328 18:46:13 writev(5, [{iov_base="Jun 4 18:46:13", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Linux version 5.17.11-gnu (guix@guix) (gcc (GCC) 10.3.0, "..., iov_len=124}, {iov_base="\n", iov_len=1}], 6) = 151 <0.000079>
328 18:46:13 fsync(5) = 0 <0.131763>
328 18:46:13 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000213>
328 18:46:13 rt_sigprocmask(SIG_BLOCK, [HUP ALRM], [], 8) = 0 <0.000160>
328 18:46:13 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0444, st_size=2962, ...}, 0) = 0 <0.000049>
328 18:46:13 writev(7, [{iov_base="Jun 4 18:46:13", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Command line: BOOT_IMAGE=/gnu/store/w8py29cnikbg69jvxhxb3"..., iov_len=314}, {iov_base="\r\n", iov_len=2}], 6) = 342 <0.000123>
328 18:46:13 fsync(7) = -1 EINVAL (Invalid argument) <0.000040>
328 18:46:13 writev(6, [{iov_base="Jun 4 18:46:13", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Command line: BOOT_IMAGE=/gnu/store/w8py29cnikbg69jvxhxb3"..., iov_len=314}, {iov_base="\n", iov_len=1}], 6) = 341 <0.000074>
328 18:46:13 fsync(6) = 0 <0.239999>
328 18:46:13 writev(5, [{iov_base="Jun 4 18:46:13", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Command line: BOOT_IMAGE=/gnu/store/w8py29cnikbg69jvxhxb3"..., iov_len=314}, {iov_base="\n", iov_len=1}], 6) = 341 <0.000156>
328 18:46:13 fsync(5) = 0 <0.099143>
328 18:46:14 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000128>
328 18:46:14 rt_sigprocmask(SIG_BLOCK, [HUP ALRM], [], 8) = 0 <0.000180>
328 18:46:14 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0444, st_size=2962, ...}, 0) = 0 <0.000100>
328 18:46:14 writev(7, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] KERNEL supported cpus:", iov_len=45}, {iov_base="\r\n", iov_len=2}], 6) = 73 <0.000189>
328 18:46:14 fsync(7) = -1 EINVAL (Invalid argument) <0.000039>
328 18:46:14 writev(6, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] KERNEL supported cpus:", iov_len=45}, {iov_base="\n", iov_len=1}], 6) = 72 <0.000097>
328 18:46:14 fsync(6) = 0 <0.198157>
328 18:46:14 writev(5, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] KERNEL supported cpus:", iov_len=45}, {iov_base="\n", iov_len=1}], 6) = 72 <0.000284>
328 18:46:14 fsync(5) = 0 <0.098527>
328 18:46:14 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000136>
328 18:46:14 rt_sigprocmask(SIG_BLOCK, [HUP ALRM], [], 8) = 0 <0.000133>
328 18:46:14 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0444, st_size=2962, ...}, 0) = 0 <0.000488>
328 18:46:14 writev(7, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Intel GenuineIntel", iov_len=43}, {iov_base="\r\n", iov_len=2}], 6) = 71 <0.000213>
328 18:46:14 fsync(7) = -1 EINVAL (Invalid argument) <0.000141>
328 18:46:14 writev(6, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Intel GenuineIntel", iov_len=43}, {iov_base="\n", iov_len=1}], 6) = 70 <0.000219>
328 18:46:14 fsync(6) = 0 <0.096564>
328 18:46:14 writev(5, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Intel GenuineIntel", iov_len=43}, {iov_base="\n", iov_len=1}], 6) = 70 <0.000215>
328 18:46:14 fsync(5) = 0 <0.125492>
328 18:46:14 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000128>
328 18:46:14 rt_sigprocmask(SIG_BLOCK, [HUP ALRM], [], 8) = 0 <0.000130>
328 18:46:14 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0444, st_size=2962, ...}, 0) = 0 <0.000252>
328 18:46:14 writev(7, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] AMD AuthenticAMD", iov_len=41}, {iov_base="\r\n", iov_len=2}], 6) = 69 <0.000368>
328 18:46:14 fsync(7) = -1 EINVAL (Invalid argument) <0.000130>
328 18:46:14 writev(6, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] AMD AuthenticAMD", iov_len=41}, {iov_base="\n", iov_len=1}], 6) = 68 <0.000402>
328 18:46:14 fsync(6) = 0 <0.112798>
328 18:46:14 writev(5, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] AMD AuthenticAMD", iov_len=41}, {iov_base="\n", iov_len=1}], 6) = 68 <0.000199>
328 18:46:14 fsync(5) = 0 <0.098952>
328 18:46:14 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000244>
328 18:46:14 rt_sigprocmask(SIG_BLOCK, [HUP ALRM], [], 8) = 0 <0.000128>
328 18:46:14 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0444, st_size=2962, ...}, 0) = 0 <0.000252>
328 18:46:14 writev(7, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Hygon HygonGenuine", iov_len=43}, {iov_base="\r\n", iov_len=2}], 6) = 71 <0.000375>
328 18:46:14 fsync(7) = -1 EINVAL (Invalid argument) <0.000039>
328 18:46:14 writev(6, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Hygon HygonGenuine", iov_len=43}, {iov_base="\n", iov_len=1}], 6) = 70 <0.000099>
328 18:46:14 fsync(6) = 0 <0.113492>
328 18:46:14 writev(5, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Hygon HygonGenuine", iov_len=43}, {iov_base="\n", iov_len=1}], 6) = 70 <0.000998>
328 18:46:14 fsync(5) = 0 <0.106071>
328 18:46:15 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000220>
328 18:46:15 rt_sigprocmask(SIG_BLOCK, [HUP ALRM], [], 8) = 0 <0.000115>
328 18:46:15 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0444, st_size=2962, ...}, 0) = 0 <0.000115>
328 18:46:15 writev(7, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Centaur CentaurHauls", iov_len=45}, {iov_base="\r\n", iov_len=2}], 6) = 73 <0.000262>
328 18:46:15 fsync(7) = -1 EINVAL (Invalid argument) <0.000048>
328 18:46:15 writev(6, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Centaur CentaurHauls", iov_len=45}, {iov_base="\n", iov_len=1}], 6) = 72 <0.000076>
328 18:46:15 fsync(6) = 0 <0.080533>
328 18:46:15 writev(5, [{iov_base="Jun 4 18:46:14", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] Centaur CentaurHauls", iov_len=45}, {iov_base="\n", iov_len=1}], 6) = 72 <0.000198>
328 18:46:15 fsync(5) = 0 <0.107653>
328 18:46:15 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000051>
328 18:46:15 rt_sigprocmask(SIG_BLOCK, [HUP ALRM], [], 8) = 0 <0.000055>
328 18:46:15 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0444, st_size=2962, ...}, 0) = 0 <0.000093>
328 18:46:15 writev(7, [{iov_base="Jun 4 18:46:15", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] zhaoxin Shanghai ", iov_len=45}, {iov_base="\r\n", iov_len=2}], 6) = 73 <0.000197>
328 18:46:15 fsync(7) = -1 EINVAL (Invalid argument) <0.000042>
328 18:46:15 writev(6, [{iov_base="Jun 4 18:46:15", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] zhaoxin Shanghai ", iov_len=45}, {iov_base="\n", iov_len=1}], 6) = 72 <0.000123>
328 18:46:15 fsync(6) = 0 <0.123659>
328 18:46:15 writev(5, [{iov_base="Jun 4 18:46:15", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] zhaoxin Shanghai ", iov_len=45}, {iov_base="\n", iov_len=1}], 6) = 72 <0.000370>
328 18:46:15 fsync(5) = 0 <0.098383>
328 18:46:15 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000138>
328 18:46:15 rt_sigprocmask(SIG_BLOCK, [HUP ALRM], [], 8) = 0 <0.000128>
328 18:46:15 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0444, st_size=2962, ...}, 0) = 0 <0.000254>
328 18:46:15 writev(7, [{iov_base="Jun 4 18:46:15", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] x86/fpu: x87 FPU will use FXSAVE", iov_len=55}, {iov_base="\r\n", iov_len=2}], 6) = 83 <0.000383>
328 18:46:15 fsync(7) = -1 EINVAL (Invalid argument) <0.000135>
328 18:46:15 writev(6, [{iov_base="Jun 4 18:46:15", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] x86/fpu: x87 FPU will use FXSAVE", iov_len=55}, {iov_base="\n", iov_len=1}], 6) = 82 <0.000209>
328 18:46:15 fsync(6) = 0 <0.122001>
328 18:46:15 writev(5, [{iov_base="Jun 4 18:46:15", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] x86/fpu: x87 FPU will use FXSAVE", iov_len=55}, {iov_base="\n", iov_len=1}], 6) = 82 <0.000197>
328 18:46:15 fsync(5) = 0 <0.123772>
328 18:46:15 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000102>
328 18:46:15 rt_sigprocmask(SIG_BLOCK, [HUP ALRM], [], 8) = 0 <0.000140>
328 18:46:15 newfstatat(AT_FDCWD, "/etc/localtime", {st_mode=S_IFREG|0444, st_size=2962, ...}, 0) = 0 <0.000167>
328 18:46:15 writev(7, [{iov_base="Jun 4 18:46:15", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] signal: max sigframe size: 1440", iov_len=54}, {iov_base="\r\n", iov_len=2}], 6) = 82 <0.000273>
328 18:46:15 fsync(7) = -1 EINVAL (Invalid argument) <0.000129>
328 18:46:15 writev(6, [{iov_base="Jun 4 18:46:15", iov_len=15}, {iov_base=" ", iov_len=1}, {iov_base="localhost", iov_len=9}, {iov_base=" ", iov_len=1}, {iov_base="vmunix: [ 0.000000] signal: max sigframe size: 1440", iov_len=54}, {iov_base="\n", iov_len=1}], 6) = 81 <0.000099>
328 18:46:15 fsync(6) = 0 <0.121488>

During that time span, syslogd makes no less than 19 ‘fsync’ calls (not
counting the EINVAL ones), each of which takes between 100ms and
400ms—no wonder it’s slow. (This machine has a low-grade spinning HDD.)

There are two things we can do:

1. Use ‘fdatasync’ rather than ‘fsync’;

2. Explicitly disable syncing for some of the log files by prepending
a hyphen right before the file name in syslogd.conf (info
"(inetutils) syslogd invocation").

I’ll give that a try and report back.

Ludo’.
L
L
Ludovic Courtès wrote on 4 Jun 19:35 +0200
Re: bug#55707: syslogd logging kernel messages slowly?
(name . Luis Felipe)(address . luis.felipe.la@protonmail.com)
874k10fi1d.fsf_-_@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (2 lines)
> 1. Use ‘fdatasync’ rather than ‘fsync’;

Turns out ‘fdatasync’ is about as expensive:

Toggle snippet (13 lines)
1234:301 19:22:00 fdatasync(6) = 0 <0.398749>
1236:301 19:22:01 fdatasync(5) = 0 <0.198520>
1243:301 19:22:01 fdatasync(6) = 0 <0.231935>
1245:301 19:22:01 fdatasync(5) = 0 <0.183503>
1252:301 19:22:01 fdatasync(6) = 0 <0.230854>
1254:301 19:22:02 fdatasync(5) = 0 <0.248839>
1261:301 19:22:02 fdatasync(6) = 0 <0.181513>
1263:301 19:22:02 fdatasync(5) = 0 <0.141267>
1270:301 19:22:02 fdatasync(6) = 0 <0.131548>
1272:301 19:22:02 fdatasync(5) = 0 <0.124369>
1279:301 19:22:02 fdatasync(6) = 0 <0.121414>

Tested with:

Toggle snippet (11 lines)
(define inetutils/fdatasync
(package/inherit inetutils
(version (string-append (package-version inetutils) ".fdatasync"))
(source (origin
(inherit (package-source inetutils))
(modules '((guix build utils)))
(snippet
'(substitute* "src/syslogd.c"
(("fsync") "fdatasync")))))))

It still makes more sense than ‘fsync’ though.

Ludo’.
L
L
Ludovic Courtès wrote on 4 Jun 21:30 +0200
(name . Luis Felipe)(address . luis.felipe.la@protonmail.com)
87sfokdy5r.fsf_-_@gnu.org
Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (4 lines)
> 2. Explicitly disable syncing for some of the log files by prepending
> a hyphen right before the file name in syslogd.conf (info
> "(inetutils) syslogd invocation").

Tried this (in addition to ‘fdatasync’):

Toggle snippet (25 lines)
(define %default-syslog.conf
(plain-file "syslog.conf" "
# Log all error messages, authentication messages of
# level notice or higher and anything of level err or
# higher to the console.
# Don't log private authentication messages!
*.alert;auth.notice;authpriv.none -/dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none /var/log/messages

*.=debug -/var/log/debug

# Same, in a different place.
*.info;mail.none;authpriv.none -/dev/tty12

# The authpriv file has restricted access.
authpriv.* /var/log/secure

# Log all the mail messages in one place.
mail.* -/var/log/maillog
"))

In addition to the hyphens, notice that now only debugging info goes to
/var/log/debug (previously everything going to /var/log/messages would
also go to /var/log/debug, which is kinda silly).

This halves the number of ‘fdatasync’ calls and thus doubles the
throughput, but we’re still spending ~0.1s on ‘fdatasync’ for each line,
which is slow.

So the next solution is to not ‘fdatasync’ on /var/log/messages either:

Toggle snippet (27 lines)
(define %default-syslog.conf
(plain-file "syslog.conf" "
# Log all error messages, authentication messages of
# level notice or higher and anything of level err or
# higher to the console.
# Don't log private authentication messages!
*.alert;auth.notice;authpriv.none -/dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none -/var/log/messages

# Log \"debug\"-level entries and nothing else.
*.=debug -/var/log/debug

# Same, in a different place.
*.info;mail.none;authpriv.none -/dev/tty12

# The authpriv file has restricted access.
# 'fsync' the file after each line.
authpriv.* /var/log/secure

# Log all the mail messages in one place.
mail.* -/var/log/maillog
"))

With this, the system boots much more quickly.

I’m tempted to commit the above syslog.conf. From a look at the code
and doc, it seems that neither rsyslog nor systemd-journald does
anything like “fsync after each logged line”, so the config above sounds
like a reasonable default (at any rate, the current default is
definitely unreasonable on slow spinning disks).

I’m curious to see if it fixes your other boot-up issues, timeouts and
all, Luis.

Ludo’.
L
L
Luis Felipe wrote on 5 Jun 15:19 +0200
Re: bug#55488: GDM, GNOME: Can't start desktop session after upgrade
(name . Ludovic Courtès)(address . ludo@gnu.org)
lqIwB62_rCWJQZnFFvEVMObQSr8a05iCinuAj7TTPt1NVNWXRWqIn6ombMUqcXPOjp1gazR9itgutgU_oW5X5YVHN6-PIUJ_4CfGPQAv9oA=@protonmail.com
Howdy,


On Friday, June 3rd, 2022 at 22:54, Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (39 lines)
> Here’s another debugging trick; would be great if you could try this:
>

> --8<---------------cut here---------------start------------->8---
>

> (define strace-syslogd
> (program-file "strace-syslogd"
> #~(apply execl #$(file-append strace "/bin/strace")
> "strace" ;argv[0]
> "-f" "-Tt" "-o" "/syslogd.log" "-s" "80"
> #$(file-append inetutils "/libexec/syslogd")
> (cdr (command-line)))))
> --8<---------------cut here---------------end--------------->8---
>

>

> and then:
>

> --8<---------------cut here---------------start------------->8---
>

> (modify-services %desktop-services
> (syslog-service-type
> _ => (syslog-configuration
>

> (syslogd strace-syslogd))))
> --8<---------------cut here---------------end--------------->8---
>

>

> This creates a log file, /syslogd.log, which will allow us to see the
> time it takes syslogd to read from /proc/kmsg and hopefully to determine
> the origin of delays.

I see you already tried this on the other machine, but I gave this a try too just in case and sent the log to you privately. I'll try the other tricks next.
Attachment: signature.asc
L
L
Luis Felipe wrote on 5 Jun 17:25 +0200
Re: bug#55707: syslogd logging kernel messages slowly?
(name . Ludovic Courtès)(address . ludo@gnu.org)
YDcBv9iInLBBpElUUnxpfLjIK1LXG1wb_A5c59GZ9fBoegUo0a3aQ8mA38hlKalWZx-yLgSQ-Yuc2bC_1VWE18eoPOv2qnDOtNC1dd_5UUY=@protonmail.com
On Saturday, June 4th, 2022 at 19:30, Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (3 lines)
> I’m curious to see if it fixes your other boot-up issues, timeouts and
> all, Luis.

Do I simply put both definitions (inetutils/fdatasync and %default-syslog.conf) in my system config and reconfigure...? Because I tried that and nothing seems to change.

Or, once defined, do I have to use inetutils/fdatasync and %default-syslog.conf somewhere in my system packages...?
Attachment: signature.asc
L
L
Ludovic Courtès wrote on 6 Jun 12:00 +0200
(name . Luis Felipe)(address . luis.felipe.la@protonmail.com)
87r142cdry.fsf@gnu.org
Hi,

Luis Felipe <luis.felipe.la@protonmail.com> skribis:

Toggle quote (9 lines)
> On Saturday, June 4th, 2022 at 19:30, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> I’m curious to see if it fixes your other boot-up issues, timeouts and
>> all, Luis.
>
> Do I simply put both definitions (inetutils/fdatasync and %default-syslog.conf) in my system config and reconfigure...? Because I tried that and nothing seems to change.
>
> Or, once defined, do I have to use inetutils/fdatasync and %default-syslog.conf somewhere in my system packages...?

Sorry, I wasn’t clear. Let’s forget about ‘inetutils/fdatasync’ because
it doesn’t have a measurable impact AFAICS.

In your config file, you first put:

Toggle snippet (27 lines)
(define %new-default-syslog.conf
(plain-file "syslog.conf" "
# Log all error messages, authentication messages of
# level notice or higher and anything of level err or
# higher to the console.
# Don't log private authentication messages!
*.alert;auth.notice;authpriv.none -/dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none -/var/log/messages

# Log \"debug\"-level entries and nothing else.
*.=debug -/var/log/debug

# Same, in a different place.
*.info;mail.none;authpriv.none -/dev/tty12

# The authpriv file has restricted access.
# 'fsync' the file after each line.
authpriv.* /var/log/secure

# Log all the mail messages in one place.
mail.* -/var/log/maillog
"))

Then, you replace ‘%desktop-services’ with:

Toggle snippet (6 lines)
(modify-services %desktop-services
(syslog-service-type
_ => (syslog-configuration
(config-file %new-default-syslog.conf))))

Thanks for testing!

Ludo’.
L
L
Luis Felipe wrote on 6 Jun 18:31 +0200
(name . Ludovic Courtès)(address . ludo@gnu.org)
Fg9njIuG4A2FgdRgqnQTkiAs2y-P96sAEd-_lJ2_qjY6CioHRy3e074QMwDiUnlifipEfMfkN2T3zwXI28_2oQXh_6zjabCtsoNxglzWoNs=@protonmail.com
Hey,

On Monday, June 6th, 2022 at 10:00, Ludovic Courtès <ludo@gnu.org> wrote:

Toggle quote (39 lines)
> > > I’m curious to see if it fixes your other boot-up issues, timeouts and
> > > all, Luis.
> >

> > Do I simply put both definitions (inetutils/fdatasync and %default-syslog.conf) in my system config and reconfigure...? Because I tried that and nothing seems to change.
> >

> > Or, once defined, do I have to use inetutils/fdatasync and %default-syslog.conf somewhere in my system packages...?
>

> Sorry, I wasn’t clear. Let’s forget about ‘inetutils/fdatasync’ because
> it doesn’t have a measurable impact AFAICS.
>

> In your config file, you first put:
>

> --8<---------------cut here---------------start------------->8---
> (define %new-default-syslog.conf
> [...]
> --8<---------------cut here---------------end--------------->8---
>

>

> Then, you replace ‘%desktop-services’ with:
>

> --8<---------------cut here---------------start------------->8---
>

> (modify-services %desktop-services
> (syslog-service-type
> _ => (syslog-configuration
>

> (config-file %new-default-syslog.conf))))
> --8<---------------cut here---------------end--------------->8---

This does reduce the time it takes to get to GDM log in screen from GRUB. It used to be around 2:35 minutes, but now it takes around 1:25 minutes, which is nice.

Unfortunately, though, I still can't start a GNOME session. I logged in, and stared at a blackish screen for about 5 minutes. Nothing happened :)
Attachment: signature.asc
L
L
Luis Felipe wrote on 6 Jun 20:11 +0200
Re: bug#55488: GDM, GNOME: Can't start desktop session after upgrade
(name . Ludovic Courtès)(address . ludo@gnu.org)
ak52RzsNYnyNcDjSO47mtUWBawvNYhxwdlQl_qoMIzg0CaaYn8If97e-jG_7yg5jeWnqS_-Sjdbz2w2su48alODaQWzeLoavKEwS6FUj1_U=@protonmail.com
It turns out the problem is related to user services. I just disabled my user services, rebooted, and GNOME session started.

I only have one user service defined in ~/.config/shepherd/init.scm as follows

~~~
;;; Definition of my user-level services.
(use-modules (shepherd service))


;;; SERVICE DEFINITIONS
;;; ===================

(define rygel
(make <service>
#:provides '(rygel)
#:docstring "Run the Rygel media server."
#:start (make-forkexec-constructor '("rygel"))
#:stop (make-kill-destructor)
#:respawn? #t))


;;; REGISTER SERVICES
;;; =================

(register-services rygel)


;;; START SERVICES
;;; ==============

(start rygel)

;;; Send shepherd to the background.
(action 'shepherd 'daemonize)
~~~

And I call "shepherd" in the ~/.profile file. This never posed a problem until my recent system generations from April to date, though, where I was never able to start a GNOME session. So I wonder what's going on now...

For what it's worth, the only error I see in ~/.config/shepherd/shepherd.log is this

2022-04-07 11:01:57 system-error("getpgid" "~A" ("No existe el proceso") (3))

"No existe el proceso" meaning, "The process doesn't exist". This error has occurred several times since 2021, though.
Attachment: signature.asc
L
L
Ludovic Courtès wrote on 6 Jun 22:46 +0200
Re: bug#55707: syslogd logging kernel messages slowly?
(name . Luis Felipe)(address . luis.felipe.la@protonmail.com)
87k09tbjuw.fsf@gnu.org
Hi,

Luis Felipe <luis.felipe.la@protonmail.com> skribis:

Toggle quote (2 lines)
> This does reduce the time it takes to get to GDM log in screen from GRUB. It used to be around 2:35 minutes, but now it takes around 1:25 minutes, which is nice.

Alright, so commit 264ca9452fae827d6621b28b8972f4b1d68401a1 closes this
particular issue (#55707).

Thanks,
Ludo’.
Closed
?
Your comment

This issue is archived.

To comment on this conversation send email to 55707@debbugs.gnu.org