[GUIX SYSTEM]: Malfunction

  • Done
Details
8 participants
  • Danny Milosavljevic
  • Efraim Flashner
  • Giovanni Biscuolo
  • Julien Lepiller
  • Ludovic Courtès
  • Maxim Cournoyer
  • Mark H Weaver
  • Raghav Gururajan
Owner
unassigned
Submitted by
Raghav Gururajan
Severity
normal
Raghav Gururajan wrote on 31 Aug 2020 11:48
(address . bug-guix@gnu.org)
8d2d6787-6649-2089-8d5e-a43fba6041e7@disroot.org
Hello Guix!

[1] Out of nowhere, when I ran `guix environment foo`, I got:

note: build failure may have been caused by lack of free disk space
builder for `/gnu/store/2ajnpcblwpgzjdhx3050qapy3li31pr5-profile.drv' failed with exit code 1

[2] When I ran the command a second time, I got:

error (ignored): cannot unlink `/tmp/guix-build-profile.drv-0': Read-only file system
error (ignored): cannot unlink `/gnu/store/2ajnpcblwpgzjdhx3050qapy3li31pr5-profile.drv.chroot/tmp/guix-build-profile.drv-0': Read-only file system
guix environment: error: cannot link `/gnu/store/.links/1jd7y4xvj853m4aygnyixci5h2y7a1py6iavp9kwzvcinyniqwbd' to `/gnu/store/3klrs2bkcmypwnmx61q24rc7csgk19f8-profile/share/icons/Adwaita/64x64/emotes/face-smile-big-symbolic.symbolic.png': Read-only file system

[3] When I ran the command a third time, I got:

guix environment: error: fport_read: Connection reset by peer

[4] When I ran the command a fourth time, I got:

guix environment: error: failed to connect to `/var/guix/daemon-socket/socket': Connection refused

[5] So I tried to restart guix-daemon and got some weird output:

sudo: unable to open /var/run/sudo/ts/rg: Read-only file system
Password:
Service guix-daemon is not running.
Service guix-daemon is currently disabled.

[6] Then I tried to enable the daemon:

sudo: unable to open /var/run/sudo/ts/rg: Read-only file system
Password:
Enabled service guix-daemon.

[7] Then I tried to start the daemon:

sudo: unable to open /var/run/sudo/ts/rg: Read-only file system
Password:
Service guix-daemon has been started.

[8] Then I retried `guix environment foo` and got the same error as in [4].

[9] At this point, all the other running applications started to throw
errors about a read-only file system. I could not even save the above
errors in a text editor. Luckily I had IceCat running, so I was able to
email them to myself. IceCat wasn't affected, I think because the
web process was containerized. Everything was back to normal after a restart.

[10] This is the third time this month that I have run into this
situation. It had never happened before this month.

INFO:

`guix describe`

guix dad963a
commit: dad963a4393ea51409baa63817b26b449ed58338

Both my user profile and root profile are on the same commit.

Regards,
RG.
Efraim Flashner wrote on 31 Aug 2020 11:53
(name . Raghav Gururajan)(address . raghavgururajan@disroot.org)(address . 43132@debbugs.gnu.org)
20200831095359.GC1048@E5400
On Mon, Aug 31, 2020 at 05:48:30AM -0400, Raghav Gururajan wrote:
> Hello Guix!
>
> [1] Out of no where, when I did `guix environment foo`, I got:
>
> \note: build failure may have been caused by lack of free disk space
> builder for `/gnu/store/2ajnpcblwpgzjdhx3050qapy3li31pr5-profile.drv'
> failed with exit code 1
>
> [2] When I redid the command 2nd time, I got:
>
> error (ignored): cannot unlink `/tmp/guix-build-profile.drv-0':
> Read-only file system
> error (ignored): cannot unlink
> `/gnu/store/2ajnpcblwpgzjdhx3050qapy3li31pr5-profile.drv.chroot/tmp/guix-build-profile.drv-0':
> Read-only file system
> guix environment: error: cannot link
> `/gnu/store/.links/1jd7y4xvj853m4aygnyixci5h2y7a1py6iavp9kwzvcinyniqwbd' to
> `/gnu/store/3klrs2bkcmypwnmx61q24rc7csgk19f8-profile/share/icons/Adwaita/64x64/emotes/face-smile-big
> symbolic.symbolic.png': Read-only file system
>
> [3] When I redid the command 3rd time, I got:
>
> guix environment: error: fport_read: Connection reset by peer
>
> [4] When I redid the command 4th time, I got:
>
> guix environment: error: failed to connect to
> `/var/guix/daemon-socket/socket': Connection refused
>
> [5] So I tried to restart guix-daemon and got a weird output:
>
> sudo: unable to open /var/run/sudo/ts/rg: Read-only file system
> Password:
> Service guix-daemon is not running.
> Service guix-daemon is currently disabled.
>
> [6] Then I tried to enable the daemon:
>
> sudo: unable to open /var/run/sudo/ts/rg: Read-only file system
> Password:
> Enabled service guix-daemon.
>
> [7] Then I tried to start the daemon:
>
> sudo: unable to open /var/run/sudo/ts/rg: Read-only file system
> Password:
> Service guix-daemon has been started.
>
> [8] Now, I retried the `guix environment foo` and got same error as in 4.
>
> [9] At this point, all the other running applications started to throw
> errors regarding read-only file-system. I could not even save the above
> errors in a text editor. Glad that I had the IceCat running and I was
> able to email it to myself. IceCat wasn't affected, as I think the
> web-process was containerized. Everything was back to normal after restart.
>
> [10] I am experiencing this situation for the 3rd time this month. It
> never happened before this month.
>
> INFO:
>
> `guix describe`
>
> guix dad963a
> repository URL: https://git.savannah.gnu.org/git/guix.git
> commit: dad963a4393ea51409baa63817b26b449ed58338
>
> Both my user profile and root profile are on the same commit.
>
> Regards,
> RG.
>


What's the output of 'df -h' and 'df -i'? The error message is much the
same whether you're out of space or just out of inodes.
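
A minimal sketch of that check, assuming GNU `df` and a POSIX shell. `nearly_full` is a hypothetical helper name, not part of Guix: it reads `df -P` or `df -Pi` output on standard input and lists the mount points at or above a usage threshold. File systems that report no percentage (btrfs shows `-` for inodes) are skipped.

```shell
# Hypothetical helper: print "<mount point> <use%>" for every file system
# whose usage column is at or above the threshold (default 90).
# Mount points containing spaces are not handled in this sketch.
nearly_full() {
    threshold="${1:-90}"
    awk -v t="$threshold" 'NR > 1 && $5 ~ /^[0-9]+%$/ {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= t) print $6 " " $5
    }'
}

# Usage:
#   df -P  | nearly_full 90    # block usage
#   df -Pi | nearly_full 90    # inode usage
```

Running both variants separates "disk full" from "out of inodes", which the build error alone does not do.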


--
Efraim Flashner <efraim@flashner.co.il>
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted
Raghav Gururajan wrote on 31 Aug 2020 12:01
(name . Efraim Flashner)(address . efraim@flashner.co.il)(address . 43132@debbugs.gnu.org)
ca19519b-4fe5-27c1-03c4-c11437eb5fd3@disroot.org
Hi Efraim!

> What's the output of 'df -h' and 'df -i'? There's not much change in
> error message if you're out of space or just out of inodes.

rg@secondary ~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
none            3.8G     0  3.8G   0% /dev
/dev/dm-0       120G   95G   24G  81% /
tmpfs           3.8G  8.6M  3.8G   1% /dev/shm
none            3.8G   20K  3.8G   1% /run/systemd
none            3.8G     0  3.8G   0% /run/user
cgroup          3.8G     0  3.8G   0% /sys/fs/cgroup
tmpfs           771M  4.0K  771M   1% /run/user/1000
/dev/sdb1        60G   59G  638M  99% /media/rg/CARD
rg@secondary ~$ df -i
Filesystem     Inodes IUsed  IFree IUse% Mounted on
none           983678   592 983086    1% /dev
/dev/dm-0           0     0      0     - /
tmpfs          986453    71 986382    1% /dev/shm
none           986453    21 986432    1% /run/systemd
none           986453     2 986451    1% /run/user
cgroup         986453    12 986441    1% /sys/fs/cgroup
tmpfs          986453    14 986439    1% /run/user/1000
/dev/sdb1           0     0      0     - /media/rg/CARD

Regards,
RG.
Giovanni Biscuolo wrote on 31 Aug 2020 12:39
Re: bug#43132: [GUIX SYSTEM]: Malfunction
871rjndrpd.fsf@roquette.i-did-not-set--mail-host-address--so-tickle-me
Hi Raghav

Raghav Gururajan <raghavgururajan@disroot.org> writes:

[...]

> [2] When I redid the command 2nd time, I got:
>
> error (ignored): cannot unlink `/tmp/guix-build-profile.drv-0':
> Read-only file system

It seems connected to a filesystem issue: can you also tell us what's
the output of "mount"?

[...]

Thanks, Gio'

--
Giovanni Biscuolo

Xelera IT Infrastructures
Raghav Gururajan wrote on 31 Aug 2020 12:48
76fc86d2-3a76-03f2-cd96-713a3ccb1a9b@disroot.org
Hi Gio!


> It seems connected to a filesystem issue: can you also tell us what's
> the output of "mount"?

none on /proc type proc (rw,relatime)
none on /dev type devtmpfs (rw,relatime,size=3934712k,nr_inodes=983678,mode=755)
none on /sys type sysfs (rw,relatime)
/dev/mapper/secondary on / type btrfs (rw,relatime,ssd,space_cache,subvolid=5,subvol=/)
none on /dev/pts type devpts (rw,relatime,gid=996,mode=620,ptmxmode=000)
none on /sys/kernel/debug type debugfs (rw,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,relatime)
/dev/mapper/secondary on /gnu/store type btrfs (ro,relatime,ssd,space_cache,subvolid=5,subvol=/)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,relatime)
none on /run/systemd type tmpfs (rw,nosuid,nodev,noexec,relatime,mode=755)
none on /run/user type tmpfs (rw,nosuid,nodev,noexec,relatime,mode=755)
cgroup on /sys/fs/cgroup type tmpfs (rw,relatime)
cgroup on /sys/fs/cgroup/elogind type cgroup (rw,relatime,name=elogind)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu type cgroup (rw,relatime,cpu)
cgroup on /sys/fs/cgroup/cpuacct type cgroup (rw,relatime,cpuacct)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,relatime,freezer)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,relatime,perf_event)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,relatime,pids)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=789160k,mode=700,uid=1000,gid=998)
/dev/sdb1 on /media/rg/CARD type vfat (rw,nosuid,nodev,relatime,uid=1000,gid=998,fmask=0022,dmask=0022,codepage=437,iocharset=iso8859-1,shortname=mixed,showexec,utf8,flush,errors=remount-ro,uhelper=udisks2)

Regards,
RG.
Giovanni Biscuolo wrote on 31 Aug 2020 13:11
87pn77cbou.fsf@roquette.i-did-not-set--mail-host-address--so-tickle-me
Hello Raghav

When forwarding the output of commands next time, please make sure your
MUA does not reformat the relevant lines :-)

This seems to be a system issue on your side, not a Guix bug.

Raghav Gururajan <raghavgururajan@disroot.org> writes:

>> It seems connected to a filesystem issue: can you also tell us what's
>> the output of "mount"?

[...]

> /dev/mapper/secondary on / type btrfs
> (rw,relatime,ssd,space_cache,subvolid=5,subvol=/)

[...]

> /dev/mapper/secondary on /gnu/store type btrfs
> (ro,relatime,ssd,space_cache,subvolid=5,subvol=/)

I see two problems here:

1. the btrfs volume /dev/mapper/secondary seems to be mounted twice, and
with the same subvolume; I have never tried to mount the same btrfs volume
on two different mount points: is this the reason your /gnu/store is
read-only?

2. /gnu/store is mounted read-only, that's why you get the errors

Could you please try removing the /gnu/store mount from your
file system configuration (or fstab if on a foreign distro)?

[...]

HTH! Gio'

--
Giovanni Biscuolo

Xelera IT Infrastructures
Julien Lepiller wrote on 31 Aug 2020 14:07
0819FC2B-8BCA-4479-8CFC-800DF7098EB6@lepiller.eu
No, it's supposed to be like that. /gnu/store is mounted read-only (on the guix system) to prevent you from writing to it. The guix daemon has write access to the store when it wants to add a new item, or garbage collect.
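
Whether a given mount point is currently read-only can be checked from `/proc/mounts`. The following is a sketch only, and `mount_mode` is a hypothetical helper name. On Guix System, `ro` for /gnu/store is the expected result, whereas `ro` for / would be the unusual one and worth comparing against the kernel log.

```shell
# Hypothetical helper: print "ro" or "rw" for the mount point given as $1,
# reading /proc/mounts-style lines (device, mount point, type, options, ...)
# on standard input. If a mount point appears more than once, the last
# entry wins. Exits non-zero when the mount point is not listed.
mount_mode() {
    awk -v mp="$1" '
        $2 == mp { mode = ($4 ~ /(^|,)ro(,|$)/) ? "ro" : "rw" }
        END { if (mode != "") print mode; else exit 1 }'
}

# Usage:
#   mount_mode /          < /proc/mounts
#   mount_mode /gnu/store < /proc/mounts
```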

On 31 August 2020 07:11:13 GMT-04:00, Giovanni Biscuolo <g@xelera.eu> wrote:
>Hello Raghav
>
>when forwarding the output of commands next time, please make sure your
>MUA does not reformat the relevant lines :-)
>
>This seems to be a system issue on your side, not a Guix bug
>
>Raghav Gururajan <raghavgururajan@disroot.org> writes:
>
>>> It seems connected to a filesystem issue: can you also tell us what's
>>> the output of "mount"?
>
>[...]
>
>> /dev/mapper/secondary on / type btrfs
>> (rw,relatime,ssd,space_cache,subvolid=5,subvol=/)
>
>[...]
>
>> /dev/mapper/secondary on /gnu/store type btrfs
>> (ro,relatime,ssd,space_cache,subvolid=5,subvol=/)
>
>I see two problems here:
>
>1. the btrfs volume /dev/mapper/secondary seems mounted twice, and with
>the same subvolume; I never tried to mount the same btrfs volume on two
>different mountpoints: is this the reason your /gnu/store is read-only?
>
>2. /gnu/store is mounted read-only, that's why you get the errors
>
>Please can you try removing the mounting of /gnu/store from your
>filesystem configuration (or fstab if on a foreign distro)?
>
>[...]
>
>HTH! Gio'
>
>--
>Giovanni Biscuolo
>
>Xelera IT Infrastructures
Giovanni Biscuolo wrote on 31 Aug 2020 14:40
87k0xfc7kg.fsf@roquette.i-did-not-set--mail-host-address--so-tickle-me
Julien Lepiller <julien@lepiller.eu> writes:

> No, it's supposed to be like that. /gnu/store is mounted read-only (on
> the guix system) to prevent you from writing to it.

Very sorry for the confusion, I forgot that! (and I did not check before) :-(

>>> /dev/mapper/secondary on / type btrfs
>>> (rw,relatime,ssd,space_cache,subvolid=5,subvol=/)
>>
>>[...]
>>
>>> /dev/mapper/secondary on /gnu/store type btrfs
>>> (ro,relatime,ssd,space_cache,subvolid=5,subvol=/)
^ Same subvolume as /

This is the output from a running Guix system of mine:

/dev/sda5 on / type btrfs (rw,relatime,space_cache,subvolid=5,subvol=/)

[...]

/dev/sda5 on /gnu/store type btrfs (ro,relatime,space_cache,subvolid=5,subvol=/gnu/store)


Thanks! Gio'

[...]

--
Giovanni Biscuolo

Xelera IT Infrastructures
Danny Milosavljevic wrote on 31 Aug 2020 23:17
(name . Raghav Gururajan)(address . raghavgururajan@disroot.org)(address . 43132@debbugs.gnu.org)
20200831231730.021e45f4@scratchpost.org
Hi Raghav,

On Mon, 31 Aug 2020 05:48:30 -0400
Raghav Gururajan <raghavgururajan@disroot.org> wrote:

> Hello Guix!
>
> [1] Out of no where, when I did `guix environment foo`, I got:
>
> \note: build failure may have been caused by lack of free disk space
> builder for `/gnu/store/2ajnpcblwpgzjdhx3050qapy3li31pr5-profile.drv'
> failed with exit code 1
>
> [2] When I redid the command 2nd time, I got:
>
> error (ignored): cannot unlink `/tmp/guix-build-profile.drv-0':
> Read-only file system
> error (ignored): cannot unlink
> `/gnu/store/2ajnpcblwpgzjdhx3050qapy3li31pr5-profile.drv.chroot/tmp/guix-build-profile.drv-0':
> Read-only file system
> guix environment: error: cannot link
> `/gnu/store/.links/1jd7y4xvj853m4aygnyixci5h2y7a1py6iavp9kwzvcinyniqwbd' to
> `/gnu/store/3klrs2bkcmypwnmx61q24rc7csgk19f8-profile/share/icons/Adwaita/64x64/emotes/face-smile-big
> symbolic.symbolic.png': Read-only file system

Usually that means file-system corruption, which was very likely caused by a
hardware (disk) problem. I've had this happen before, and the disk died
shortly afterwards.

What does "sudo dmesg" show around the time it made it read-only?
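
As a rough sketch of what to look for, the kernel log can be filtered for file-system and I/O errors. `fs_errors` is a hypothetical helper, and the patterns are assumptions about typical kernel messages rather than an exhaustive list.

```shell
# Hypothetical helper: keep only the kernel log lines that look like
# file-system or disk trouble. The patterns are illustrative guesses.
fs_errors() {
    grep -iE 'btrfs|i/o error|remount.*read-only|ata[0-9]+.*(error|failed)'
}

# Usage (run as soon as the file system turns read-only, before rebooting):
#   sudo dmesg | fs_errors
```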
Raghav Gururajan wrote on 1 Sep 2020 05:04
(name . Danny Milosavljevic)(address . dannym@scratchpost.org)(address . 43132@debbugs.gnu.org)
c0b89bfe-60be-a0e1-3dca-6f93c9fb5c47@disroot.org
Hi Danny!

> Usually that means file-system corruption, which very likely was caused by a
> hardware (disk) problem. I've had it before, and shortly after the disk died.

Oh no! My disk is an SSD, which is only about 2 years old. Isn't that too
soon?

Btw, is there a tool to check the health of the disk?

> What does "sudo dmesg" show around the time it made it read-only?

Ah, I will have to wait until it happens again.

Regards,
RG.
Danny Milosavljevic wrote on 1 Sep 2020 09:58
(name . Raghav Gururajan)(address . raghavgururajan@disroot.org)(address . 43132@debbugs.gnu.org)
20200901095843.2a2fed41@scratchpost.org
Hi,

On Mon, 31 Aug 2020 23:04:25 -0400
Raghav Gururajan <raghavgururajan@disroot.org> wrote:

> Hi Danny!
>
> > Usually that means file-system corruption, which very likely was caused by a
> > hardware (disk) problem. I've had it before, and shortly after the disk died.
>
> Oh no! My disk is a SSD, which is only about 2 years old. Isn't that too
> soon?
>
> Btw, is there a tool to check the health of the disk?

Yes--usually it's a program in the disk firmware.

You can control it and look at its results using smartctl (in the
smartmontools package).

But I'd advise checking dmesg, because it could also be a RAM problem, or
any number of other things.

(UNIX also has fsck to check the file system, but that already runs
automatically on reboot when problems have arisen, so there is little need
to fiddle with it manually.)
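
A minimal sketch of reading the drive's own health verdict, assuming smartmontools is installed. `smart_verdict` is a hypothetical helper that extracts the result from `smartctl -H` output. The line it looks for is the one printed for ATA drives; other drive types may word it differently.

```shell
# Hypothetical helper: print the overall SMART verdict (e.g. PASSED)
# from `smartctl -H` output on standard input; exit non-zero if the
# verdict line is missing.
smart_verdict() {
    awk -F': *' '
        /self-assessment test result/ { print $2; found = 1 }
        END { if (!found) exit 1 }'
}

# Usage (replace /dev/sdX with the actual device):
#   sudo smartctl -H /dev/sdX | smart_verdict
#   sudo smartctl -a /dev/sdX        # full report, including attributes
```

A PASSED verdict does not rule out a failing drive, so the attribute table and the kernel log are still worth reading.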
Ludovic Courtès wrote on 10 Sep 2020 09:56
(name . Raghav Gururajan)(address . raghavgururajan@disroot.org)
87h7s6jc98.fsf@gnu.org
Hey Raghav,

Did you eventually find what went wrong? Should we close this bug or at
least retitle it?

Thanks,
Ludo’.
Maxim Cournoyer wrote on 14 Sep 2020 17:14
(name . Ludovic Courtès)(address . ludo@gnu.org)
87363k5r1n.fsf@gmail.com
Hello,

Ludovic Courtès <ludo@gnu.org> writes:

> Hey Raghav,
>
> Did you eventually find what went wrong? Should we close this bug or at
> least retitle it?
>
> Thanks,
> Ludo’.

I took Raghav to #btrfs last week, where, with the help of the kind folks
there, a failing drive was established as the most likely culprit.

In other words, Btrfs's checksumming capabilities helped to quickly
discover a hardware problem which might otherwise have silently
caused non-recoverable damage to Raghav's data.

I'm closing this bug now.

Thanks!

Maxim
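
For reference, a sketch of how those checksum errors can be surfaced on a btrfs system, assuming btrfs-progs is installed. `btrfs_error_count` is a hypothetical helper that sums the per-device counters printed by `btrfs device stats`.

```shell
# Hypothetical helper: sum the error counters from `btrfs device stats`
# output, whose lines look like "[/dev/sdX].corruption_errs  0".
btrfs_error_count() {
    awk '{ total += $2 } END { print total + 0 }'
}

# Usage:
#   sudo btrfs scrub start -B /                  # -B: run in the foreground
#   sudo btrfs device stats / | btrfs_error_count
```

A non-zero total after a scrub means btrfs has recorded read, write, flush, corruption or generation errors on the device.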
Closed
Ludovic Courtès wrote on 14 Sep 2020 21:34
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
878sdcb1ac.fsf@gnu.org
Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> I took Raghav to #btrfs last week, where with the help of gentle folks a
> failing drive was established as the most likely culprit.
>
> In other words, Btrfs checksuming capabilities helped quickly
> discovering a hardware problem which might otherwise have silently
> caused non-recoverable damage to Raghav's data.

Good, thanks for following up!

Ludo’.
Raghav Gururajan wrote on 15 Sep 2020 13:51
95c4430f-1a98-50ad-cb6a-0adb1ec6039a@disroot.org
Hi!

> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> I took Raghav to #btrfs last week, where with the help of gentle folks a
>> failing drive was established as the most likely culprit.
>>
>> In other words, Btrfs checksuming capabilities helped quickly
>> discovering a hardware problem which might otherwise have silently
>> caused non-recoverable damage to Raghav's data.
>
> Good, thanks for following up!
>
> Ludo’.

Thank you!

Yeah, it seems like my disk is shot, but I am not sure. I have reinstalled
Guix with ext4 instead of btrfs, as these issues started to arise after
migrating from ext4 to btrfs. So far, my system is doing well. Let's see
how it goes. :-)

Regards,
RG.
Maxim Cournoyer wrote on 15 Sep 2020 15:31
(name . Raghav Gururajan)(address . raghavgururajan@disroot.org)
87tuvzp3nc.fsf@gmail.com
Hello Raghav,

Raghav Gururajan <raghavgururajan@disroot.org> writes:

> Hi!
>
>> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>>
>>> I took Raghav to #btrfs last week, where with the help of gentle folks a
>>> failing drive was established as the most likely culprit.
>>>
>>> In other words, Btrfs checksuming capabilities helped quickly
>>> discovering a hardware problem which might otherwise have silently
>>> caused non-recoverable damage to Raghav's data.
>>
>> Good, thanks for following up!
>>
>> Ludo’.
>
> Thank you!
>
> Yeah, seems like my disk is shot, but I am not sure. I have reinstalled
> guix with ext4, instead of btrfs, as these issues started to arise after
> migration to btrfs from ext4. So far, my system is doing well. Lets see
> how it goes. :-)

Sounds like playing with fire to me :-).

Ext4 won't detect bitrot (silent corruption of your drive's data).
You'll probably wake up one day to an fsck that can't recover
some files, or worse, a completely dead drive.

Your backups would also contain corrupted data (garbage in, garbage
out!).

Maxim
Mark H Weaver wrote on 16 Sep 2020 00:13
(address . 43132-done@debbugs.gnu.org)
871rj2bsdt.fsf@netris.org
Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> Raghav Gururajan <raghavgururajan@disroot.org> writes:
>
>>> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>>>
>>>> I took Raghav to #btrfs last week, where with the help of gentle folks a
>>>> failing drive was established as the most likely culprit.
>>>>
>>>> In other words, Btrfs checksuming capabilities helped quickly
>>>> discovering a hardware problem which might otherwise have silently
>>>> caused non-recoverable damage to Raghav's data.
>>>
>>> Good, thanks for following up!
>>>
>>> Ludo’.
>>
>> Thank you!
>>
>> Yeah, seems like my disk is shot, but I am not sure. I have reinstalled
>> guix with ext4, instead of btrfs, as these issues started to arise after
>> migration to btrfs from ext4. So far, my system is doing well. Lets see
>> how it goes. :-)
>
> Sounds like playing with fire to me :-).
>
> Ext4 won't detect bitrot (silent corruption of your drive's data).
> You'll probably wake one day with a fsck that won't be able to recover
> some files, or worst, a completely dead drive.
>
> Your backups would also contain corrupted data (garbage in, garbage
> out!).

For what it's worth, I wholeheartedly agree with Maxim. Btrfs did you a
great service by calling attention to this problem with your drive, and
it would be a shame to ignore it and switch back to ext4 where your data
may instead be silently corrupted.

I've been using btrfs for several years now on my x86_64 Guix system,
and it has served me well. Previously, I used ext4, which would
silently leave some of my files empty after crashes. I've never seen
that happen with btrfs.

Mark
Raghav Gururajan wrote on 16 Sep 2020 00:38
(address . 43132-done@debbugs.gnu.org)
dd7e1230-04a4-de06-fdb9-2bbdd4c684c0@disroot.org
Hi Mark and Maxim!

>> Ext4 won't detect bitrot (silent corruption of your drive's data).
>> You'll probably wake one day with a fsck that won't be able to recover
>> some files, or worst, a completely dead drive.
>>
>> Your backups would also contain corrupted data (garbage in, garbage
>> out!).
>
> For what it's worth, I wholeheartedly agree with Maxim. Btrfs did you a
> great service by calling attention to this problem with your drive, and
> it would be a shame to ignore it and switch back to ext4 where your data
> may instead be silently corrupted.
>
> I've been using btrfs for several years now on my x86_64 Guix system,
> and it has served me well. Previously, I used ext4, which would
> silently leave some of my files empty after crashes. I've never seen
> that happen with btrfs.

Yeah, makes sense. I have placed an order for WDS100T2B0A.

Thanks folks!

Regards,
RG.