Hello,

Ricardo Wurmus writes:

> Today we discovered a few more things and discussed them on IRC.  Here’s
> a summary.
>
> /var/cache sits on the same storage as /gnu.  We mounted the 5TB ext4
> file system that’s hosted by the SAN at /mnt_test and started copying
> over /var/cache to /mnt_test/var/cache.  Transfer speed was considerably
> faster (not *great*, but reasonably fast) than the copy of
> /gnu/store/trash to the same target.
>
> This confirmed our suspicions that the problem is not with the storage
> array but due to the fact that /gnu/store/trash (and also /gnu/store)
> is an extremely large, flat directory.  /var/cache is not.

There was an interesting thread on the Linux kernel mailing lists about
this very issue earlier this year:

https://lore.kernel.org/linux-fsdevel/206078.1621264018@warthog.procyon.org.uk/

I’m not sure I completely understood all of the concerns discussed
there, but my understanding is that for workloads which don’t
concurrently modify the huge directory, its size isn’t a problem for
btrfs and XFS, and in fact it’s even more efficient to have one big
directory rather than subdirectories¹.  It should also be handled well
even by ext4, IIUC².

The problem for all filesystems is concurrent modification of the
directory (e.g., adding or removing files), because the kernel
serializes directory operations at the VFS layer.  In that case XFS can
also have allocation issues when adding new files if one isn’t careful.³

-- 
Thanks
Thiago

¹ https://lore.kernel.org/linux-fsdevel/20210517232237.GE2893@dread.disaster.area/
² https://lore.kernel.org/linux-fsdevel/6E4DE257-4220-4B5B-B3D0-B67C7BC69BB5@dilger.ca/
³ https://lore.kernel.org/linux-fsdevel/20210519125743.GP2893@dread.disaster.area/
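
As a rough illustration of the flat-versus-subdirectories point above, here is a minimal Python sketch that one could run on a scratch file system to compare the two layouts.  Everything in it is an assumption made for illustration: the helper names, the "item-N" file names, the hash-prefix sharding scheme and the file count are all hypothetical, not anything Guix or the linked threads use.  It runs in a single process, so it only touches the non-concurrent case and does not exercise the VFS-level serialization of concurrent directory modifications; timings will also depend heavily on the file system and on cache state.

```python
import hashlib
import os
import tempfile
import time


def populate_flat(root, n):
    """Create n empty files directly under root (one big directory)."""
    os.makedirs(root, exist_ok=True)
    for i in range(n):
        open(os.path.join(root, f"item-{i}"), "w").close()


def populate_sharded(root, n, width=2):
    """Create n empty files under root/<hash prefix>/ (hypothetical layout)."""
    for i in range(n):
        name = f"item-{i}"
        shard = hashlib.sha256(name.encode()).hexdigest()[:width]
        os.makedirs(os.path.join(root, shard), exist_ok=True)
        open(os.path.join(root, shard, name), "w").close()


def count_files(root):
    """Walk root and count regular files, whatever the layout."""
    return sum(len(files) for _, _, files in os.walk(root))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n = 20000  # arbitrary; far smaller than a real /gnu/store
    with tempfile.TemporaryDirectory() as tmp:
        flat = os.path.join(tmp, "flat")
        sharded = os.path.join(tmp, "sharded")
        for label, fill, root in (("flat", populate_flat, flat),
                                  ("sharded", populate_sharded, sharded)):
            _, t_fill = timed(fill, root, n)
            count, t_walk = timed(count_files, root)
            print(f"{label}: {count} files, "
                  f"create {t_fill:.2f}s, walk {t_walk:.2f}s")
```

To test a particular storage target, pass `dir=` to `tempfile.TemporaryDirectory` (for instance a directory under /mnt_test) so the files land on that file system rather than on the default temporary directory.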