ntpd segfaults at boot (take 2)

  • Open
Details
2 participants
  • Danny Milosavljevic
  • Fredrik Salomonsson
Owner
unassigned
Submitted by
Fredrik Salomonsson
Severity
normal
Fredrik Salomonsson wrote on 24 Nov 01:32 +0100
(address . bug-guix@gnu.org)
87iksdii8t.fsf@posteo.net
Hi,

Similar to issue 73873 [0], I'm also seeing ntpd segfault at boot, and
it looks to be due to IPv6.
In /var/log/messages:
Nov 23 16:13:41 localhost ntpd[1019]: ntpd 4.2.8p18@1.4062-o Thu Jan 1 00:00:01 UTC 1970 (1): Starting
Nov 23 16:13:41 localhost ntpd[1019]: Command line: /gnu/store/s4ra0g0ym1q1wh5jrqs60092x1nrb8h9-ntp-4.2.8p18/bin/ntpd -n -c /gnu/store/ghh3m9wzraszf7p4ynac006x96svddbq-ntpd.conf -u ntpd -g
Nov 23 16:13:41 localhost ntpd[1019]: ----------------------------------------------------
Nov 23 16:13:41 localhost ntpd[1019]: ntp-4 is maintained by Network Time Foundation,
Nov 23 16:13:41 localhost ntpd[1019]: Inc. (NTF), a non-profit 501(c)(3) public-benefit
Nov 23 16:13:41 localhost ntpd[1019]: corporation. Support and training for ntp-4 are
Nov 23 16:13:41 localhost ntpd[1019]: available at https://www.nwtime.org/support
Nov 23 16:13:41 localhost ntpd[1019]: ----------------------------------------------------
Nov 23 16:13:41 localhost ntpd[1019]: DEBUG behavior is enabled - a violation of any diagnostic assertion will cause ntpd to abort
Nov 23 16:13:41 localhost ntpd[1019]: proto: precision = 0.040 usec (-24)
Nov 23 16:13:41 localhost ntpd[1019]: baseday_set_day: invalid day (25556), UNIX epoch substituted
Nov 23 16:13:41 localhost ntpd[1019]: basedate set to 1970-01-01
Nov 23 16:13:41 localhost ntpd[1019]: gps base set to 1980-01-06 (week 0)
Nov 23 16:13:41 localhost ntpd[1019]: Listen and drop on 0 v6wildcard [::]:123
Nov 23 16:13:41 localhost ntpd[1019]: Listen and drop on 1 v4wildcard 0.0.0.0:123
Nov 23 16:13:41 localhost ntpd[1019]: Listen normally on 2 lo 127.0.0.1:123
Nov 23 16:13:41 localhost ntpd[1019]: Listen normally on 3 enp37s0 192.168.1.8:123
Nov 23 16:13:41 localhost vmunix: [ 22.648239] ntpd[1019]: segfault at 24 ip 000055fe102ab29b sp 00007ffc26382ca0 error 4 in ntpd[7f29b,55fe1023e000+86000] likely on CPU 0 (core 0, socket 0)
Nov 23 16:13:41 localhost ntpd[1019]: Listen normally on 4 lo [::1]:123
Nov 23 16:13:41 localhost vmunix: [ 22.649529] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
Nov 23 16:13:41 localhost ntpd[1019]: bind(21) AF_INET6 [2001:REDACTED:cedf]:123 flags 0x11 failed: Cannot assign requested address
Nov 23 16:13:41 localhost ntpd[1019]: unable to create socket on enp37s0 (5) for [2001:REDACTED:cedf]:123
Nov 23 16:13:41 localhost shepherd[1]: Service ntpd (PID 1019) terminated with signal 11.
Nov 23 16:13:41 localhost shepherd[1]: Service ntpd has been disabled.
Nov 23 16:13:41 localhost shepherd[1]: (Respawning too fast.)

And `sudo dmesg`:

[ 21.871447] ntpd[954]: segfault at 24 ip 000055abbdf0029b sp 00007ffebf673770 error 4 in ntpd[7f29b,55abbde93000+86000] likely on CPU 7 (core 9, socket 0)
[ 21.871453] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[ 22.002809] ntpd[1005]: segfault at 24 ip 000055ac349d229b sp 00007fff8be14a00 error 4 in ntpd[7f29b,55ac34965000+86000] likely on CPU 12 (core 0, socket 0)
[ 22.002863] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[ 22.131272] ntpd[1008]: segfault at 24 ip 0000556dc1ad529b sp 00007ffef46b9d50 error 4 in ntpd[7f29b,556dc1a68000+86000] likely on CPU 3 (core 3, socket 0)
[ 22.132111] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[ 22.264012] ntpd[1011]: segfault at 24 ip 000055e02824f29b sp 00007fffa1e29970 error 4 in ntpd[7f29b,55e0281e2000+86000] likely on CPU 4 (core 4, socket 0)
[ 22.264019] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[ 22.390893] ntpd[1014]: segfault at 24 ip 0000555b2757129b sp 00007ffe2d0ea050 error 4 in ntpd[7f29b,555b27504000+86000] likely on CPU 4 (core 4, socket 0)
[ 22.390898] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[ 22.517794] ntpd[1016]: segfault at 24 ip 000056387455529b sp 00007ffde75cabf0 error 4 in ntpd[7f29b,5638744e8000+86000] likely on CPU 4 (core 4, socket 0)
[ 22.518953] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57
[ 22.648239] ntpd[1019]: segfault at 24 ip 000055fe102ab29b sp 00007ffc26382ca0 error 4 in ntpd[7f29b,55fe1023e000+86000] likely on CPU 0 (core 0, socket 0)
[ 22.649529] Code: 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 e8 dc 2d f9 ff 44 8b 28 48 89 c5 e8 61 9e ff ff 49 89 c4 48 85 db 0f 84 e5 00 00 00 <44> 0f b7 0b 66 41 83 f9 02 0f 84 f6 00 00 00 66 41 83 f9 0a 74 57

It's been doing that since around the time of issue 73873 [0]. I double
checked and it does use the 2.guix.pool.ntp.org pool. I also reverted
to 0.guix.pool.ntp.org in case that would work for some reason. ntpd
segfaults with both. Did 2.guix.pool.ntp.org stop supporting IPv6?

Thanks

--
s/Fred[re]+i[ck]+/Fredrik/g
Danny Milosavljevic wrote on 15 Dec 01:52 +0100
Problem confirmed
(address . 74501@debbugs.gnu.org)
20241215005246.917D81120AB8@dd30410.kasserver.com
Hi,

I also have this problem on x86_64 znver3.

I disassembled my "Code:" block and I get:

8b 04 25 28 00 00 00    mov    eax, DWORD PTR ds:0x28
48 89 44 24 08          mov    QWORD PTR [rsp+0x8], rax
31 c0                   xor    eax, eax
e8 dc 2d f9 ff          call   <relative_address>
44 8b 28                mov    r13d, DWORD PTR [rax]
48 89 c5                mov    rbp, rax
e8 61 9e ff ff          call   <relative_address>
49 89 c4                mov    r12, rax
48 85 db                test   rbx, rbx
0f 84 e5 00 00 00       je     <forward_jump>
<44> 0f b7 0b           movzx  r9d, WORD PTR [rbx]   ; <-- This is where <44> is
66 41 83 f9 02          cmp    r9w, 0x2
0f 84 f6 00 00 00       je     <forward_jump>
66 41 83 f9 0a          cmp    r9w, 0xa
74 57                   je     <forward_jump>

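The 0x44 byte is the REX prefix of this instruction (with REX.R set), which selects the extended register r9d. The kernel puts <...> around the byte at the faulting instruction pointer, so this movzx is the instruction that faulted. It reads 16 bits from the address in rbx, and the log says "segfault at 24", so rbx held 0x24: non-zero, so it passed the preceding "test rbx, rbx" check, but not a valid pointer. The value read is then compared against 0x2 and 0xa, which on Linux are the values of AF_INET and AF_INET6, so this may be a read of an address family field, though that is only a guess from the disassembly.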
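The fields of the kernel's segfault line can also be pulled apart mechanically, which makes it easy to check that every crash hits the same spot. The following is a minimal sketch in Python, assuming the line format seen in the logs above; the name parse_segfault and the dictionary keys are made up for illustration and are not part of any tool. The first number inside the brackets (7f29b) appears to be an offset into the ntpd binary, but that reading is an assumption here; the sketch only computes the instruction pointer relative to the start of the mapping.

```python
import re

# Regex for the x86 Linux "segfault at ..." line as it appears in the logs
# above. All numeric fields except the PID are printed in hexadecimal.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[]+)\[(?P<off>[0-9a-f]+),(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return the fields of a segfault log line as a dict, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    hex_fields = ("addr", "ip", "sp", "err", "off", "base", "size")
    info = {name: int(m.group(name), 16) for name in hex_fields}
    info["comm"] = m.group("comm")
    info["pid"] = int(m.group("pid"))
    # Instruction pointer relative to the start of the mapping it fell in.
    info["ip_in_mapping"] = info["ip"] - info["base"]
    return info


if __name__ == "__main__":
    line = (
        "[ 22.648239] ntpd[1019]: segfault at 24 ip 000055fe102ab29b "
        "sp 00007ffc26382ca0 error 4 in ntpd[7f29b,55fe1023e000+86000] "
        "likely on CPU 0 (core 0, socket 0)"
    )
    info = parse_segfault(line)
    print(hex(info["addr"]), hex(info["ip_in_mapping"]), hex(info["off"]))
```

Running this over each dmesg line should show whether the fault address and the in-mapping offset are identical across restarts, which would point at one deterministic bad dereference rather than random corruption.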

The error code is a combination of several error bits defined in fault.c in the Linux kernel:

/*
 * Page fault error code bits:
 *
 *   bit 0 ==    0: no page found       1: protection fault
 *   bit 1 ==    0: read access         1: write access
 *   bit 2 ==    0: kernel-mode access  1: user-mode access
 *   bit 3 ==                           1: use of reserved bit detected
 *   bit 4 ==                           1: fault was an instruction fetch
 *   bit 5 ==                           1: protection keys block access
 *   bit 6 ==                           1: shadow stack access fault
 *   bit 15 =                           1: SGX MMU page-fault
 */
enum x86_pf_error_code {
	X86_PF_PROT	=		1 << 0,
	X86_PF_WRITE	=		1 << 1,
	X86_PF_USER	=		1 << 2,
	X86_PF_RSVD	=		1 << 3,
	X86_PF_INSTR	=		1 << 4,
	X86_PF_PK	=		1 << 5,
	X86_PF_SHSTK	=		1 << 6,
	X86_PF_SGX	=		1 << 15,
};

Since ntpd is a user-mode program, X86_PF_USER is set and the error code is at least 4.

The kernel reports "error 4", so X86_PF_USER is the only bit set: the faulting access was a read from user mode of an address that has no page mapped.

In total:

- User-mode access.
- Read access.
- No page found.
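The same decoding can be written down as a small script. This is only a sketch in Python, assuming the bit layout from the enum quoted above; decode_pf_error and X86_PF_BITS are names chosen for illustration, not kernel or library APIs.

```python
# Bit positions and names copied from the x86_pf_error_code enum quoted above.
X86_PF_BITS = {
    0: "X86_PF_PROT",
    1: "X86_PF_WRITE",
    2: "X86_PF_USER",
    3: "X86_PF_RSVD",
    4: "X86_PF_INSTR",
    5: "X86_PF_PK",
    6: "X86_PF_SHSTK",
    15: "X86_PF_SGX",
}


def decode_pf_error(code):
    """Describe an x86 page-fault error code (the number after 'error' in dmesg)."""
    return {
        # Bit 0 clear: no page was mapped; set: a mapped page denied the access.
        "cause": "protection fault" if code & (1 << 0) else "no page found",
        # Bit 1 clear: read; set: write. (An instruction fetch shows up in bit 4.)
        "access": "write" if code & (1 << 1) else "read",
        # Bit 2 clear: the access came from kernel mode; set: from user mode.
        "mode": "user" if code & (1 << 2) else "kernel",
        # Names of all bits that are set, in ascending bit order.
        "flags": [name for bit, name in X86_PF_BITS.items() if code & (1 << bit)],
    }


if __name__ == "__main__":
    # dmesg prints the error code in hexadecimal; the logs above show "error 4".
    print(decode_pf_error(0x4))
```

For the value 4 from the logs, the result has cause "no page found", access "read", and mode "user", matching the three points listed above.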
