From: ludo@gnu.org (Ludovic Courtès)
To: 24937@debbugs.gnu.org
Cc: Mark H Weaver
Subject: Re: bug#24937: "deleting unused links" GC phase is too slow
Date: Fri, 09 Dec 2016 23:43:57 +0100
In-Reply-To: <87wpg7ffbm.fsf@gnu.org>
Message-ID: <87wpf867v6.fsf@gnu.org>

ludo@gnu.org (Ludovic Courtès) skribis:

> ‘LocalStore::removeUnusedLinks’ traverses all the entries in
> /gnu/store/.links, calls lstat(2) on each one of them, and checks
> ‘st_nlink’ to determine whether they can be deleted.
>
> There are two problems: lstat(2) can be slow on spinning disks as found
> on hydra.gnu.org, and the algorithm runs in time proportional to the
> number of entries in /gnu/store/.links, which is huge on hydra.gnu.org.

On Dec. 2, on guix-sysadmin@gnu.org, Mark described an improvement that
noticeably improved performance:

  The idea is to read the entire /gnu/store/.links directory, sort the
  entries by inode number, and then iterate over the entries in inode
  order, calling 'lstat' on each one and deleting the ones with a link
  count of 1.  The reason this is so much faster is that inodes are
  stored on disk in order of inode number, so this yields a sequential
  access pattern on disk instead of a random one.

  The difficulty is that the directory is too large to comfortably
  store all of the entries in virtual memory.  Instead, the entries
  should be written to temporary files on disk and sorted with a merge
  sort, which preserves sequential access patterns during sorting.
  Fortunately, this is exactly what 'sort' from GNU coreutils does.

  So, for now, I've implemented this as a pair of small C programs used
  in a pipeline with GNU sort.  The first program simply reads a
  directory and writes lines of the form "<inode> <name>" to stdout.
  (Unfortunately, "ls -i" calls stat on each entry, so it can't be
  used.)  This is piped through 'sort -n' and then into another small C
  program that reads these lines, calls 'lstat' on each name, and
  deletes the non-directories with a link count of 1.
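For concreteness, here is a minimal sketch of what such a pair of
programs could look like.  This is only an illustration of the
description above, not Mark's actual code; the file names are made up.

    /* list-inodes.c: print "<inode> <name>" for every entry of the
       given directory.  readdir(3) already provides the inode number
       in 'd_ino', so no stat call is needed.  */
    #include <dirent.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int
    main (int argc, char *argv[])
    {
      if (argc != 2)
        return EXIT_FAILURE;

      DIR *dir = opendir (argv[1]);
      if (dir == NULL)
        return EXIT_FAILURE;

      struct dirent *d;
      while ((d = readdir (dir)) != NULL)
        if (strcmp (d->d_name, ".") != 0 && strcmp (d->d_name, "..") != 0)
          printf ("%llu %s\n", (unsigned long long) d->d_ino, d->d_name);

      closedir (dir);
      return EXIT_SUCCESS;
    }

    /* delete-unlinked.c: read "<inode> <name>" lines (sorted by inode)
       on stdin, lstat each NAME inside the given directory, and delete
       the non-directories whose link count is 1.  */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int
    main (int argc, char *argv[])
    {
      if (argc != 2 || chdir (argv[1]) != 0)
        return EXIT_FAILURE;

      char line[4096];
      while (fgets (line, sizeof line, stdin) != NULL)
        {
          /* The name follows the first space; strip the newline.  */
          char *name = strchr (line, ' ');
          if (name == NULL)
            continue;
          name++;
          name[strcspn (name, "\n")] = '\0';

          struct stat st;
          if (lstat (name, &st) == 0
              && !S_ISDIR (st.st_mode)
              && st.st_nlink == 1)
            unlink (name);
        }

      return EXIT_SUCCESS;
    }

The pipeline would then be something like:

    ./list-inodes /gnu/store/.links | sort -n \
      | ./delete-unlinked /gnu/store/.links

'sort -n' sorts on the leading numeric field, i.e. the inode number,
which is what produces the sequential access pattern when the second
program calls lstat(2).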
Regarding memory usage, I replied:

  Really?  For each entry, we have to store roughly 70 bytes for the
  file name (or 52 if we consider only the basename), plus 8 bytes for
  the inode number; let’s say 64 bytes.

  If we have 10 M entries, that’s 700 MB (or 520 MB), which is a lot,
  but maybe acceptable?

  At worst, we may still see an improvement if we proceed in batches:
  read 10,000 directory entries (0.7 MB), sort them, stat them, then
  read the next 10,000 entries.

WDYT?

Ludo’.
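For illustration, here is a rough sketch of that batching variant,
with a hypothetical batch size of 10,000 entries.  This is invented
code, not part of the bug report.

    /* batch-delete.c: process a directory in fixed-size batches,
       sorting each batch by inode number before calling lstat, to
       approximate a sequential access pattern without holding all
       entries in memory at once.  */
    #include <dirent.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    #define BATCH 10000

    struct entry { ino_t ino; char name[256]; };

    static int
    by_inode (const void *a, const void *b)
    {
      const struct entry *x = a, *y = b;
      return (x->ino > y->ino) - (x->ino < y->ino);
    }

    int
    main (int argc, char *argv[])
    {
      if (argc != 2 || chdir (argv[1]) != 0)
        return EXIT_FAILURE;

      DIR *dir = opendir (".");
      if (dir == NULL)
        return EXIT_FAILURE;

      static struct entry batch[BATCH];
      struct dirent *d = NULL;
      int done = 0;

      while (!done)
        {
          /* Fill the next batch of directory entries.  */
          size_t n = 0;
          while (n < BATCH && (d = readdir (dir)) != NULL)
            {
              if (strcmp (d->d_name, ".") == 0
                  || strcmp (d->d_name, "..") == 0)
                continue;
              batch[n].ino = d->d_ino;
              snprintf (batch[n].name, sizeof batch[n].name, "%s",
                        d->d_name);
              n++;
            }
          done = (d == NULL);

          /* Visit this batch in inode order.  */
          qsort (batch, n, sizeof batch[0], by_inode);
          for (size_t i = 0; i < n; i++)
            {
              struct stat st;
              if (lstat (batch[i].name, &st) == 0
                  && !S_ISDIR (st.st_mode)
                  && st.st_nlink == 1)
                unlink (batch[i].name);
            }
        }

      closedir (dir);
      return EXIT_SUCCESS;
    }

Each batch is only a few megabytes, and within a batch the lstat calls
are at least locally sequential; across batch boundaries the access
pattern is no longer globally sorted, which is the trade-off against
the external-sort pipeline above.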