Csepp <raingloom@riseup.net> writes:
> Csepp <raingloom@riseup.net> writes:
>
>> Csepp <raingloom@riseup.net> writes:
>>
>>> Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>>>
>>>> Hi!
>>>>
>>>> raingloom <raingloom@riseup.net> writes:
>>>>
>>>>> It's been at 67% on guix-packages-base for at least an hour now. The
>>>>> system itself is responsive and with the swap I gave it, it has more
>>>>> than enough memory. Htop shows three guile processes at the top of the
>>>>> list when sorted by CPU%, their states are S, D, D.
>>>>> Both CPUs are practically idling.
>>>>> This looks like some kind of lockup to me.
>>>>>
>>>>> Fresh install based on bare-bones example on a 32 bit netbook, but the
>>>>> install image used is the latest tagged version, since apparently there
>>>>> is no 32 bit option for edge.
>>>>>
>>>>> I also tried pulling using channel-with-substitutes, since I'm not too
>>>>> keen on locally building everything on such an old machine. Although
>>>>> Guix itself should frankly not take this long to build if we want to be
>>>>> competitive with other distros. Anyways, pulling with that in
>>>>> channels.scm gives a cert related error, so that's great, means old
>>>>> images can't easily be used for installation.
>>>>
>>>> Have you been able to reproduce this? If so, could you share the commit
>>>> you are starting from and the CPU architecture, so that we may hopefully
>>>> reproduce too?
>>>>
>>>> Thanks,
>>>>
>>>> Maxim
>>>
>>> CPU architecture is x86, commit it happened on last time is 347733b.
>>> Other possibly relevant factors:
>>> * spinning rust storage
>>> * 1GB RAM
>>> * encrypted BTRFS root
>>> * 4GB (encrypted) swap
>>> * 128MB zswap
>>>
>>> The last was not there when I originally submitted the bug.
>>>
>>> The swap is relevant because if it's a timing issue it's very possible
>>> some part of the code assumes reads are almost instant, which is not
>>> true with swap, and delaying a read might be exposing a race condition.
>>
>> Happening again.
>> pulled to: 8320c0c
>> pulled from: 4501a50
>>
>> Same system.
>>
>> The system version is from November of last year, because trying to
>> upgrade takes so damn long and often gets stuck on some package with no
>> substitute.
>> So... the situation is not great...
>
> The process status says sleep so it's probably hanging in a syscall?
> Maybe a kernel bug?

Happening again, this time with offloading.
This is getting really annoying.
The offload machine is completely idle. There is a Guile process for
guix-packages-base-builder running on it, and it's in sleeping status. It
ran for 17 minutes, and now its time is no longer increasing.
I'm attaching a GDB backtrace of all the threads.
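
For anyone trying to reproduce this, here is a minimal sketch of how the
S/D states reported above could be inspected from the kernel side. It
assumes a Linux /proc filesystem. The helper names (parse_stat_line,
read_optional, describe) are made up for illustration and are not part of
Guix. /proc/<pid>/wchan may just read "0" on some kernels, and
/proc/<pid>/stack usually needs root and a kernel built with stack trace
support, so both are treated as optional.

```python
import sys


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    # The state letter (R, S, D, Z, ...) is the first field after comm.
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def read_optional(path):
    """Read a file, returning None if it is missing or not readable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def describe(pid):
    """Collect state, wait channel and kernel stack for one process."""
    stat = read_optional(f"/proc/{pid}/stat")
    if stat is None:
        return None
    _, comm, state = parse_stat_line(stat)
    return {
        "pid": pid,
        "comm": comm,
        "state": state,
        # Kernel symbol the task is sleeping in, if the kernel exposes it.
        "wchan": read_optional(f"/proc/{pid}/wchan"),
        # Kernel-side stack; typically only readable by root.
        "stack": read_optional(f"/proc/{pid}/stack"),
    }


if __name__ == "__main__":
    # Usage (hypothetical script name): python3 procstate.py <pid> [<pid> ...]
    for arg in sys.argv[1:]:
        if arg.isdigit():
            print(describe(int(arg)))
```

Run against the PIDs of the stuck guile processes, a D state together with
a non-empty wait channel would point towards the process being blocked
inside the kernel rather than in Guile itself, which is the question
raised in the quoted message. This complements the GDB backtrace, which
only shows the user-space side.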