From: Ludovic Courtès
To: Hector Sanjuan
Cc: "go-ipfs-wg@ipfs.io", Pierre Neidhardt, "33899@debbugs.gnu.org" <33899@debbugs.gnu.org>
Subject: Re: [PATCH 0/5] Distributing substitutes over IPFS
Date: Fri, 18 Jan 2019 10:52:49 +0100
Message-ID: <8736pqthqm.fsf@gnu.org>
In-Reply-To: (Hector Sanjuan's message of "Fri, 18 Jan 2019 09:08:02 +0000")
References: <20181228231205.8068-1-ludo@gnu.org> <87r2dfv0nj.fsf@gnu.org>

Hello,

Hector Sanjuan skribis:

> ------- Original Message -------
> On Monday, January 14, 2019 2:17 PM, Ludovic Courtès wrote:

[...]

>> Yes, I’m well aware of “unixfs”.  The problem, as I see it, is that it
>> stores “too much” in one way (we don’t need to store the mtimes or
>> permissions; we could ignore them upon reconstruction though), and “not
>> enough” in another way (the executable bit is lost, IIUC).
>
> Actually the only metadata that Unixfs stores is size:
> https://github.com/ipfs/go-unixfs/blob/master/pb/unixfs.proto and in any
> case the amount of metadata is negligible compared to the actual data
> stored; it serves to give you a progress bar when you are downloading.

Yes, the format I came up with also stores the size so we can eventually
display a progress bar.

> Having IPFS understand which files are part of a single item is important
> because you can pin/unpin, diff, and patch all of them as a whole.  Unixfs
> also takes care of handling the case where the directories need to
> be sharded because there are too many entries.

Isn’t there a way, then, to achieve the same behavior with the custom
format?  The /api/v0/add entry point has a ‘pin’ argument; I suppose we
could leave it set to false except when we add the top-level “directory”
node?  Wouldn’t that give us behavior similar to that of Unixfs?
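For reference, here is roughly what I have in mind, sketched against the
go-ipfs HTTP API (untested; the file name and hash below are mere
placeholders):

  # Add an individual item without pinning it.
  curl -F file=@example-item "http://localhost:5001/api/v0/add?pin=false"

  # Add the top-level "directory" object with the default pin=true, or
  # pin its hash explicitly once it has been assembled:
  curl -X POST "http://localhost:5001/api/v0/pin/add?arg=<root-hash>"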
> When the user puts the single root hash in ipfs.io/ipfs/, it
> will display the underlying files correctly and people will be
> able to navigate the actual tree with both the web UI and the CLI.

Right, though that’s less important in my view.

> Note that every file added to IPFS gets wrapped as a Unixfs
> block anyway.  You are just saving some “directory” nodes by adding
> them separately.

Hmm, weird.  When I do /api/v0/add, I’m really just passing a byte
vector; there’s no notion of a “file” here, AFAICS.  Or am I missing
something?

>> > It will probably need some trial and error to get the multipart right
>> > to upload everything in a single request.  The Go HTTP client code doing
>> > this can be found at:
>> > https://github.com/ipfs/go-ipfs-files/blob/master/multifilereader.go#L96
>> > As you see, a directory part in the multipart will have the Content-Type
>> > header set to "application/x-directory".  The best way to see how "abspath"
>> > etc. is set is probably to sniff an `ipfs add -r` operation (localhost:5001).
>> > Once UnixFSv2 lands, you will be in a position to just drop the sexp file
>> > altogether.
>>
>> Yes, that makes sense.  In the meantime, I guess we have to keep using
>> our own format.
>>
>> What are the performance implications of adding and retrieving files one
>> by one like I did?  I understand we’re doing N HTTP requests to the
>> local IPFS daemon where “ipfs add -r” makes a single request, but this
>> alone can’t be much of a problem since communication happens
>> locally.  Does pinning each file separately somehow incur additional
>> overhead?
>
> Yes, pinning separately is slow and incurs overhead.  Pins are themselves
> stored in a Merkle tree, so pinning involves reading, patching, and saving
> it.  This gets quite slow when you have very large pinsets, because the
> blocks holding your pins grow.  Your pinset will grow very large if you do
> this.  Additionally, the pinning operation itself requires a global lock,
> making it even slower.

OK, I see.

> But even if it were fast, you would not have a way to easily unpin
> anything that becomes obsolete, or to get an overview of where things
> belong.  It is also unlikely that a single IPFS daemon will be able to
> store everything you build, so you might find yourself using IPFS Cluster
> soon to distribute the storage across multiple nodes, and then you will
> effectively be adding remotely.

Currently, ‘guix publish’ stores things for as long as they are requested,
and then for the duration specified with ‘--ttl’.  I suppose we could
have similar behavior with IPFS: if an item hasn’t been requested for
the specified duration, then we unpin it.  Does that make sense?

Thanks for your help!

Ludo’.
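PS: To make the “unpin after the TTL” idea above concrete, a minimal
sketch (the TTL bookkeeping itself is left out and hypothetical; only the
two ‘ipfs’ commands below are real):

  # Drop our pin on an item that has not been requested within the TTL...
  ipfs pin rm <root-hash>

  # ...and let the garbage collector reclaim blocks that no longer have pins.
  ipfs repo gc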