> Also, replace-with-link doesn't check whether the "containing directory" > is the store like optimisePath_() does, so in theory if another process > tried to make a permanent change to the store's permissions it could be > clobbered when replace-with-link restores the permissions. I don't know > of any instance where this could happen currently, but it's something to > keep in mind. I guess we should also avoid changing permissions on /gnu/store, it would be wiser. > And, of course, there's the inherent visible change of deduplication - > possible modification of inode number, but I don't see how that could > cause problems with any reasonable programs. Yes, that’s fine. >> I think the ‘replace-with-link’ dance is atomic, so we should be fine. >> >> Thoughts? > > replace-with-link is atomic, but it can fail if the "canonical" link in > .links is deleted by the garbage collector between when its existence is > checked in 'deduplicate' and when temp link creation in > replace-with-link happens. Then ENOENT would be returned, and > 'deduplicate' wouldn't catch that exception, nor would optimisePath_() > in nix/libstore/optimise-store.cc. > > The proper behavior there, in my opinion, would be to retry the > deduplication. Attached is a patch that makes 'deduplicate' do that. > > Also, while I'm perusing optimisePath_(), there's a minor bug in which > EMLINK generated by renaming the temp link may still be treated as a > fatal error. This is because errno may be clobbered by unlink() prior to > being checked (according to the errno man page it can still be modified > even if the call succeeds). Indeed, good catch! > From 461064da9e169df3dd939b734bb2860598d113f4 Mon Sep 17 00:00:00 2001 > From: Caleb Ristvedt > Date: Wed, 24 Jun 2020 00:56:50 -0500 > Subject: [PATCH 1/2] deduplication: retry on ENOENT. > > It's possible for the garbage collector to remove the "canonical" link after > it's been detected as existing by 'deduplicate'. This would cause an ENOENT > error when replace-with-link attempts to create the temporary link. This > changes it so that it will properly handle that by retrying. > > * guix/store/deduplication.scm (deduplicate): retry on ENOENT. > --- > guix/store/deduplication.scm | 64 +++++++++++++++++++++++------------- > 1 file changed, 41 insertions(+), 23 deletions(-) > > diff --git a/guix/store/deduplication.scm b/guix/store/deduplication.scm > index 6784ee0b92..b6d94e49c2 100644 > --- a/guix/store/deduplication.scm > +++ b/guix/store/deduplication.scm > @@ -161,26 +161,44 @@ under STORE." > (scandir* path)) > (let ((link-file (string-append links-directory "/" > (bytevector->nix-base32-string hash)))) > - (if (file-exists? link-file) > - (replace-with-link link-file path > - #:swap-directory links-directory) > - (catch 'system-error > - (lambda () > - (link path link-file)) > - (lambda args > - (let ((errno (system-error-errno args))) > - (cond ((= errno EEXIST) > - ;; Someone else put an entry for PATH in > - ;; LINKS-DIRECTORY before we could. Let's use it. > - (replace-with-link path link-file > - #:swap-directory links-directory)) > - ((= errno ENOSPC) > - ;; There's not enough room in the directory index for > - ;; more entries in .links, but that's fine: we can > - ;; just stop. > - #f) > - ((= errno EMLINK) > - ;; PATH has reached the maximum number of links, but > - ;; that's OK: we just can't deduplicate it more. > - #f) > - (else (apply throw args))))))))))) > + (let retry () > + (if (file-exists? link-file) > + (catch 'system-error > + (lambda () > + (replace-with-link link-file path > + #:swap-directory links-directory)) > + (lambda args > + (if (and (= (system-error-errno args) ENOENT) > + ;; ENOENT can be produced because: > + ;; - LINK-FILE has missing directory components > + ;; - LINKS-DIRECTORY has missing directory > + ;; components > + ;; - PATH has missing directory components > + ;; > + ;; the last two are errors, the first just > + ;; requires a retry. > + (file-exists? (dirname path)) > + (file-exists? links-directory)) > + (retry) > + (apply throw args)))) I feel like there are TOCTTOU issues here and redundant ‘stat’ calls, plus the risk of catching a ‘system-error’ coming from “somewhere else.” What about baking this logic directly in ‘replace-with-link’, and replacing ‘file-exists?’ calls by 'system-error handling? > From a14ff0a9dab4d73680befaf9d244d6cce0a73ab3 Mon Sep 17 00:00:00 2001 > From: Caleb Ristvedt > Date: Wed, 24 Jun 2020 01:00:40 -0500 > Subject: [PATCH 2/2] .dir-locals.el: fix call-with-{retrying-}transaction > indenting. > > * .dir-locals.el (call-with-transaction, call-with-retrying-transaction): > change scheme-indent-function property from 2 to 1. LGTM! Thanks for your quick feedback and thorough analyses! Ludo’.