‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, January 14, 2019 2:17 PM, Ludovic Courtès wrote:

> Hi Hector,
>
> Happy new year to you too! :-)
>
> Hector Sanjuan code@hector.link skribis:
>
> > 1. The doc strings usually refer to the IPFS HTTP API as GATEWAY. go-ipfs
> > has a read/write API (on :5001) and a read-only API that we call the
> > "gateway", which runs on :8080. The gateway, apart from handling most of
> > the read-only methods from the HTTP API, also handles paths like "/ipfs/"
> > or "/ipns/" gracefully, and returns an autogenerated webpage for
> > directory-type CIDs. The gateway does not allow publishing. Therefore I
> > think the doc strings should say "IPFS daemon API" rather than "GATEWAY".
>
> Indeed, I’ll change that.
>
> > 2. I'm not proficient enough in Scheme to grasp the details of the
> > "directory" format. If I understand it right, you keep a separate manifest
> > object listing the directory structure, the contents and the executable
> > bit for each file. Thus, when adding a store item you add all the files
> > separately along with this manifest, and when retrieving a store item you
> > fetch the manifest and reconstruct the tree by fetching the contents
> > listed in it (and applying the executable flag). Is this correct? This
> > works, but it can be improved:
>
> That’s correct.
>
> > You can add all the files/folders in a single request. If I'm
> > reading it right, right now each file is added separately (and gets pinned
> > separately). It would probably make sense to add it all in a single
> > request, letting IPFS store the directory structure as "unixfs". You can
> > additionally add the sexp file with the dir structure and executable flags
> > as an extra file in the root folder. This would allow fetching the whole
> > thing with a single request too (/api/v0/get?arg=), and pinning a single
> > hash recursively (rather than each file separately). After getting the
> > whole thing, you will need to chmod +x things accordingly.
>
> Yes, I’m well aware of “unixfs”. The problem, as I see it, is that it
> stores “too much” in a way (we don’t need to store the mtimes or
> permissions; we could ignore them upon reconstruction though), and “not
> enough” in another way (the executable bit is lost, IIUC.)

Actually, the only metadata that Unixfs stores is the size:
https://github.com/ipfs/go-unixfs/blob/master/pb/unixfs.proto
In any case, the amount of metadata is negligible compared to the actual
data stored, and it serves to give you a progress bar while downloading.

Having IPFS understand which files are part of a single item is important
because you can pin/unpin, diff and patch all of them as a whole. Unixfs
also takes care of sharding directories when they have too many entries.
When a user puts the single root hash in ipfs.io/ipfs/, it will display
the underlying files correctly, and people will be able to navigate the
actual tree with both the web and the CLI.

Note that every file added to IPFS gets wrapped as a Unixfs block anyway.
By adding the files separately you are only skipping some "directory"
nodes.

There is an alternative, which is to use IPLD to implement a custom block
format that carries the executable-bit information and nothing else, but I
don't see significant advantages in that at this point given the extra
work it requires.
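To make the suggested flow concrete, here is a rough, untested Go sketch
using the go-ipfs-api client against the local daemon on :5001 (this is
just an illustration of the shape of the calls, not the Guix code, and the
paths are placeholders):

package main

import (
    "fmt"
    "log"

    shell "github.com/ipfs/go-ipfs-api"
)

func main() {
    // Talk to the read/write API of the local daemon on :5001
    // (not the :8080 gateway).
    sh := shell.NewShell("localhost:5001")

    // Add the whole store item (with the extra sexp manifest placed at its
    // root) in one request; IPFS wraps it as a unixfs directory and
    // returns the root hash.
    root, err := sh.AddDir("/tmp/store-item") // placeholder path
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("root:", root)

    // One recursive pin on the root covers every file underneath it
    // (add normally pins the root already; this just shows that a single
    // pin is enough).
    if err := sh.Pin(root); err != nil {
        log.Fatal(err)
    }

    // Retrieval is a single request too; the executable bits still have
    // to be restored afterwards from the manifest.
    if err := sh.Get(root, "/tmp/restored-item"); err != nil {
        log.Fatal(err)
    }
}

These three calls map to /api/v0/add, /api/v0/pin/add and /api/v0/get on
the HTTP API, so the same flow can be reproduced from Guile with three
requests.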
> > It will probably need some trial and error to get the multipart right
> > to upload it all in a single request. The Go HTTP client code doing this
> > can be found at:
> > https://github.com/ipfs/go-ipfs-files/blob/master/multifilereader.go#L96
> > As you see, a directory part in the multipart will have the Content-Type
> > header set to "application/x-directory". The best way to see how
> > "abspath" etc. is set is probably to sniff an `ipfs add -r ` operation
> > (localhost:5001). Once UnixFSv2 lands, you will be in a position to just
> > drop the sexp file altogether.
>
> Yes, that makes sense. In the meantime, I guess we have to keep using
> our own format.
>
> What are the performance implications of adding and retrieving files one
> by one like I did? I understand we’re doing N HTTP requests to the
> local IPFS daemon where “ipfs add -r” makes a single request, but this
> alone can’t be much of a problem since communication is happening
> locally. Does pinning each file separately somehow incur additional
> overhead?

Yes, pinning each file separately is slow and incurs overhead. Pins are
themselves stored in a merkle tree, so every pin operation involves
reading, patching and saving that tree. This gets quite slow once the
pinset is very large, because the pin blocks grow, and your pinset will
grow very large if you pin every file. Additionally, the pinning operation
itself takes a global lock, which slows it down further.

But even if it were fast, you would have no easy way to unpin things that
become obsolete, or to get an overview of what belongs where. It is also
unlikely that a single IPFS daemon will be able to store everything you
build, so you might soon find yourself using IPFS Cluster to distribute
the storage across multiple nodes, and then you will effectively be adding
remotely.

> Thanks for your feedback!
>
> Ludo’.

Thanks for working on this!

Hector
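PS: In case it saves you some of that trial and error, here is a rough,
untested Go sketch of how such a single multipart add request can be built
with the standard library. The "item" / "manifest.sexp" names are made up,
and the exact part headers, path escaping and query parameters are best
confirmed by sniffing `ipfs add -r ` as said above:

package main

import (
    "bytes"
    "fmt"
    "mime/multipart"
    "net/http"
    "net/textproto"
    "net/url"
)

// buildAddBody sketches the multipart body of a single /api/v0/add request
// for a tiny tree: a root directory "item" containing one file
// "manifest.sexp" (both names are placeholders).
func buildAddBody() (*bytes.Buffer, string, error) {
    body := &bytes.Buffer{}
    w := multipart.NewWriter(body)

    // Directory part: no content, Content-Type set to
    // application/x-directory.
    dirHdr := textproto.MIMEHeader{}
    dirHdr.Set("Content-Disposition", `form-data; name="file"; filename="item"`)
    dirHdr.Set("Content-Type", "application/x-directory")
    if _, err := w.CreatePart(dirHdr); err != nil {
        return nil, "", err
    }

    // File part: the filename carries the escaped path below the root.
    fileHdr := textproto.MIMEHeader{}
    fileHdr.Set("Content-Disposition",
        fmt.Sprintf(`form-data; name="file"; filename=%q`,
            url.QueryEscape("item/manifest.sexp")))
    fileHdr.Set("Content-Type", "application/octet-stream")
    part, err := w.CreatePart(fileHdr)
    if err != nil {
        return nil, "", err
    }
    part.Write([]byte("(manifest (executables ...))\n")) // placeholder contents

    if err := w.Close(); err != nil {
        return nil, "", err
    }
    return body, w.FormDataContentType(), nil
}

func main() {
    body, ctype, err := buildAddBody()
    if err != nil {
        panic(err)
    }
    // One request adds the whole tree; the daemon answers with one JSON
    // object per added entry, the root directory (and its hash) last.
    resp, err := http.Post(
        "http://127.0.0.1:5001/api/v0/add?recursive=true&pin=true",
        ctype, body)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    fmt.Println(resp.Status)
}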