Conversation
Wow! I'll need to set aside some time to read fully but just from the cursory read I've done this looks great! :)
```ipldsch
# Root node layout
type HashMapRoot struct {
  hashAlg String
```
Any reason to use the string name over the codec number? I guess we already have a bunch of string overhead from the object keys so it shouldn't be an issue.
I originally did use the multihash varint for this but I got sick of the terseness and indirection for very little gain.
I could switch back if someone has strong feelings about this, but my default preference is for these data structures to be somewhat readable on their own, at least for the debugging case, and to avoid indirection as much as possible (both code indirection, "what is this varint again?", and cognitive indirection, "what's `p` doing again?").
It should be fine, we just need to make sure we don't change the names of these codecs. I wonder if we should come up with a namespaced naming scheme and do a mass-rename first.
```ipldsch
}

type Element union {
  | Link "link"
```
- We may want to call this a "shard" (this is a keyed union).
- I'm not sure if we should require that this be a link. Like the UnixFS-2 proposal, it would be nice if we could abstract over block boundaries (but this SHOULD be a link).
- If we abstract over block boundaries, I wonder if we even need buckets? We could literally say: if there are fewer than 3 values in a subdag, erase block boundaries. Too weird?
I'm not a fan of the term "shard" but am not strongly tied to "link" either.
@mikeal you've talked a lot about inlining entire UnixFSv2 structures into a single block, does that requirement extend to the case where you're needing a HAMT to store directories, or is there some upper size bound where that's not allowed? If you need this data structure to be inlinable then yes, we'll need to do something about this "link" piece.
In terms of having a heuristic for automatic inlining of child nodes, (a) that would make the case for buckets harder to sustain so they'd probably need to go, and (b) we need to figure out how to retain canonical form. Is it as easy as applying the same rules as applied here to buckets: "if child node has <= X, inline it"?
There's a broader issue here of how to pack tree structures into blocks that I keep bumping into with collections, particularly sorted collections, but that might be a discussion for elsewhere.
For Reads
If you avoid defining where links are and build the read side entirely path based you end up with the flexibility to inline wherever you want. With IPLD Generics this concept ends up extending to all of the types, so even something like binary data could be a single inlined binary object or a big multi-block binary object.
For Writes
The heuristics here are a bit tricky: you don't have great knowledge of a block's eventual size before you encode it, which is what you'd need in order to decide when to inline and when not to.
For files in the experimental unixfsv2 implementation we use the `size` of the underlying binary data and target about half the max block size we're looking for, and this `size` property accumulates and bubbles up through the directories as well (but we don't currently shard directories in a HAMT). The obvious problem with this method is that if you have a lot of small files with large file names, the encoding overhead is much larger than the `size`.
But, since the read side is completely agnostic of block boundaries, we have room to improve all of this logic in the future.
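The accumulating-size heuristic described here could be sketched roughly as follows. This is hypothetical Python for illustration only: the function names, the dict-of-bytes directory model, and the 1 MiB block size are all assumptions, not part of unixfsv2.

```python
def estimated_size(node) -> int:
    # A sketch of the accumulating `size` property: leaves are raw binary
    # data, directories sum the sizes of their children. (Hypothetical
    # model; unixfsv2's real bookkeeping is more involved and, as noted
    # above, ignores encoding overhead such as long file names.)
    if isinstance(node, bytes):
        return len(node)
    return sum(estimated_size(child) for child in node.values())

def should_inline(node, max_block_size: int = 1 << 20) -> bool:
    # Target roughly half the max block size, per the heuristic above.
    return estimated_size(node) <= max_block_size // 2
```
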
OK, so for the purpose of this spec we still have to define `Link` in the schema even if it's a union of some kind (I know you don't like that @mikeal, but we need to be able to say clearly where these kinds are in a block, and a Link is an important one), but we should allow for fully inline data structures too.
Currently we have:

```ipldsch
type Element union {
  | Link "link"
  | Bucket "bucket"
} representation keyed
```
I'm going to have to involve @warpfork on this one because we lack the ability to define that one named property could be either a kind (Link) or a complex map representation of some other type (HashMapNode), so I can't just rename "link" to "child" and stick either a Link or HashMapNode in there, I have to do something like this:
```ipldsch
type Element union {
  | Child "child"
  | Bucket "bucket"
} representation keyed

type Child union {
  | Link "link"
  | HashMapNode "inline"
} representation keyed
```
So you end up with either:

```
{
  ...
  "data": [
    ... buckets and other child nodes here
    "child": {
      "link": CID
    },
    ... buckets and other child nodes here
  ]
}
```

or

```
{
  ...
  "data": [
    ... buckets and other child nodes here
    "child": {
      "node": {
        ... actual inline child node here
      }
    },
    ... buckets and other child nodes here
  ]
}
```
The other options we have are kinded (HashMapNode isn't a kind, it's a complex map representation), inline (won't work where one of these things is a kind, i.e. Link, it can only inline two complex map representations with a discriminator key), and envelope (would work but requires a discriminator key in addition to the child key). See docs from here down.
So we can't do a union of `"child": CID` and `"child": { ... HashMapNode here ... }`. Which I know will bother at least @mikeal because it doesn't make block boundaries "transparent". But that's @warpfork's argument to have.
@warpfork could this particular case be solved by treating `HashMapNode` as a `map` for a kinded union?

```ipldsch
type Child union {
  | Link link
  | HashMapNode map
} representation kinded
```
making `"child": CID` or `"child": { ... HashMapNode here }` possible. Is that approach a general solution to the "everything could be a link" desire?
you beat me to the punch. yes, we should be able to do exactly such an expression with a kinded union.
OK, I've gone with that kinded form above, it's neat. I've also removed or altered explicit references to CIDs for read operations such that they could be inline. But I've also added this to the bottom of the summary in the top section of this doc:
By default, each node in an IPLD HashMap is stored in a distinct IPLD block and CIDs are used for child node links. The schema and algorithm presented here also allows for inline child nodes rather than links, with read operations able to traverse multiple nodes within a single block where they are inlined. The production of inlined IPLD HashMaps is left unspecified and users should be aware that inlining breaks canonical form guarantees.
i.e. inline is possible and reads should assume it's possible. How you write them, and under what circumstances, is up to you, and inlining shouldn't be the default mode. See 73ab06b for the precise changes I made on this point. Reviews appreciated.
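For the read side, "reads should assume inlining is possible" amounts to a small branch when resolving a child. A hedged Python sketch (the function name, the `dict`-vs-CID type check, and the `load_block` callback are illustrative assumptions, not from the spec):

```python
def resolve_child(child, load_block):
    # Per the kinded union, an Element's child is either a link (CID) or
    # an inline HashMapNode (a map). `load_block` is a hypothetical
    # callback that fetches and decodes a linked block.
    if isinstance(child, dict):
        return child  # inline node: keep traversing within this block
    return load_block(child)  # link: follow it to the child block
```
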
I'm perfectly happy with that phrasing.
Using a kinded union, as you've done, makes it super clear that we've got the possibility for "inline" vs separate linked blocks -- A+ good.
I also think -- if you want to -- you could get even more prescriptive than that text does:
(With the caveat that refining our exact language and understandings of this has been tricky...) My best thinking at present is that having "implicit links" in general was problematic (as it made canonicality more or less undefined)... but, we're not "in general", here; we're in the scope of an advanced layout.
Inside an advanced layout, we can set the definition of canonicality to whatever we want. Because it's already scoped by the defn of the advanced layout, and since an advanced layout defn already includes descriptions of algorithms/decisions... it therefore follows that adding another decision description for whether or not to inline-vs-link isn't out of character at all; it fits right along with the rest.
In other words -- as long as the decision to inline-vs-link happens inside the scope of the advanced layout, rather than being unspecified and left to some higher level (aka, made into a problem for the user), we're good!
3. Take the left-most `bitWidth` bits, offset by `depth x bitWidth`, from the hash to form an `index`. At each level of the data structure, we increment the section of bits we take from the hash so that the `index` comprises a different set of bits as we move down.
4. If the `index` bit in the node's `map` is `0`, we can be certain that the `key` does not exist in this data structure, so return a `Null` value.
5. If the `index` bit in the node's `map` is `1`, the value may exist. Perform a `popcount()` on the `map` up to `index` such that we count the number of `1` bits up to the `index` bit-position. This gives us `dataIndex`, an index in the `data` array to look up the value or insert a new bucket.
6. If the `dataIndex` element of `data` contains a link (CID) to a child block, increment `depth` and repeat with the child node identified by the link from step **3**.
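The bit-slicing and popcount arithmetic in steps 3 through 6 can be sketched as follows. This is an illustrative Python sketch under an assumed bit-ordering convention (the leftmost bit of the digest and of `map` is treated as position 0); the helper names are mine, not the spec's.

```python
def extract_index(hash_bytes: bytes, depth: int, bit_width: int) -> int:
    # Step 3: take `bit_width` bits from the hash, offset by
    # depth * bit_width, reading from the left of the digest.
    as_int = int.from_bytes(hash_bytes, "big")
    total_bits = len(hash_bytes) * 8
    shift = total_bits - depth * bit_width - bit_width
    return (as_int >> shift) & ((1 << bit_width) - 1)

def data_index(node_map: bytes, index: int) -> int:
    # Step 5: popcount() of the 1 bits in `map` below bit-position `index`,
    # giving the position in the `data` array.
    as_int = int.from_bytes(node_map, "big")
    total_bits = len(node_map) * 8
    # Keep only the bits to the left of `index`, then count them.
    masked = as_int >> (total_bits - index)
    return bin(masked).count("1")
```
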
(keyed union)
I have two variants scattered in this section: "contains a link (CID)" and "contains a bucket (array)", assuming that the reader will identify that the discriminator is in the keyed union. Should it be something like "contains a 'link' property" and "contains a 'bucket' property"?
### `hashAlg`

* The default supported hash algorithm for writing IPLD HashMaps is the x64 form of the 128-bit [MurmurHash3](https://github.com/aappleby/smhasher), identified by the multihash name ['murmur3-128'](https://github.com/multiformats/multicodec/blob/master/table.csv). Note the x86 form will produce different output, so it should not be confused with the x64 form. Additionally, [some JavaScript implementations](https://cimi.io/murmurhash3js-revisited/) do not correctly decompose UTF-8 strings into their constituent bytes for hashing, so will not produce portable results.
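To make the UTF-8 caveat concrete, here is a small Python sketch. SHA-256 stands in for murmur3-128 only because murmur3 is not in the Python standard library; the point is purely that the key's UTF-8 bytes must be the hash input, not truncated code units.

```python
import hashlib

def hash_key(key: str) -> bytes:
    # The spec's default is murmur3-128 (x64); sha256 is a stand-in here.
    # What matters: hash the UTF-8 *bytes* of the key.
    return hashlib.sha256(key.encode("utf-8")).digest()

# A naive implementation that truncates each code point to one byte (the
# bug described above in some JavaScript murmur3 ports) hashes different
# input bytes, so it produces non-portable results:
naive = bytes(ord(c) & 0xFF for c in "café")
correct = "café".encode("utf-8")
assert naive != correct  # 'é' is two bytes in UTF-8
```
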
The current Go implementation relies on the serialization of the two resulting `u64`s, in a specific endianness. This needs to be specced, as it is not part of the original reference implementation. In addition, given that there is no spec, it should be clear how edge cases like spaolacci/murmur3#11 should be handled.
Ref for serialization: https://github.com/spaolacci/murmur3/blob/master/murmur128.go#L52-L61
Also, the go-hamt-ipld implementation does not use a multihash but the raw murmur3 hash; this should probably be clarified, as the reference to multihash is confusing in this case.
@dignifiedquire: re the reference to "multihash" here, it is simply a string identifier that goes into the root block under the property "hashAlg", with the intention that any alternative hash algorithm found at "hashAlg" will also be a valid identifier in the multihash table, that's all. e.g. https://github.com/rvagg/iamap/blob/ec2259f5f6108954d511c688aafb91273dc50828/iamap.js#L92-L94 is the only place I use multihash at all in my implementation, just a check that it's valid. What I've done is clarified it a bit more at the top of the doc where it first references "multihash" and then put this reference in brackets: (identified by the multihash name ...).
Regarding the specifics of how Go implements multihash and the edge cases, can you help me out with some wording for this? I've had a bunch of trouble with murmur3 across platforms already and it's making me a bit wary of this area.
My go-to JavaScript implementation does some awkward concat-byte-to-hex work which I then have to unpack into bytes again. I have no confidence that the values I'm using will end up being cross platform suitable! I even gave up trying to generate compatible murmur3 hashes in JavaScript as produced by the Scala-to-JavaScript version in @magik6k's distributed-wiki-search when I was playing with that. Perhaps it's the wrong hash to be defaulting to?
Soooo, murmur3 is complicated at best, underspecified at worst, I would argue. But it does work nicely overall and is fast.
- In regards to compatible versions: I have this Rust version https://github.com/dignifiedquire/murmur3/blob/nicer-hashing/src/murmur3_x64_128.rs that has fuzz tests to match against the reference implementation, and it now matches the bytes of the go implementation for all cases I've tried so far. This might be useful to use in JS land as well.
- If we want to really rely on murmur3 here, I think we should write a spec for it based on the reference implementation, plus generate a set of test vectors.
- The minimum here, I would say, is:
  - define the (byte) order in which the two `u64`s produced by the reference implementation are supposed to be written to bytes
  - say sth like "in case of differences, the reference implementation is to be taken as the correct version, unless a serious bug is discovered"
- Do we have concrete alternatives in the pipeline we could use that provide similar speeds?
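The first minimum requirement could be stated as simply as a pack format. A hedged Python sketch: the h1-then-h2 big-endian order is my reading of the linked Go serialization code, and is precisely the kind of detail a spec would need to confirm.

```python
import struct

def serialize_digest(h1: int, h2: int) -> bytes:
    # One plausible convention: h1 then h2, each written big-endian
    # (assumed to match spaolacci/murmur3's Sum(); verify before relying
    # on it). Yields the 16-byte digest used as the HAMT hash input.
    return struct.pack(">QQ", h1, h2)
```
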
### Parameters

Configurable parameters for any given IPLD HashMap:
I would really like to see these being optional, such that we don't have to add the overhead of storing these in every hamt that gets serialized (especially for smaller structures)
bonus: if these become optional a regular node and a root node could be the same data structure, allowing for some interesting nesting
@dignifiedquire: optional means we're going to need to be more confident of the default values and their future suitability. I don't think we have that, we don't know how these data structures will be used by real-world applications or what the implications of the default shapes we're defining here are. For example, I have a suspicion that buckets might not be the right approach and that the elasticity in block sizes they will yield might be too messy. But I can't make assertions on that without real world applications and a lot of research work.
My preference would be to stick with explicit definitions for now, and maybe over time it'll be clear what the right parameters are, and then define implicit defaults.
> allowing for some interesting nesting

Not really, though; depth changes everything, a depth=0 node (root) is useless at depth>0. Do you have something concrete in mind?
> Do you have something concrete in mind?

no, just a random idea

> My preference would be to stick with explicit definitions for now

fair enough
I don't even think we need to put them in there. We already have to have context to know that this is a HAMT and how to traverse it. might as well say that we need context to know the width.
Additionally, we can infer the width from the bitfield.
Well, that goes back to the discussion @ #130 I think so I'd rather that all be resolved before I remove these. Right now they're useful since we have no formal way of defining "context" except by inlining it. Fat pointers? CIDv2? We have neither.
The current experiment that I'm working with, the only thing that currently leverages these advanced layouts / generics / composites, is @mikeal's js-types, which embeds a `_type` property that points to the implementation. Search for `_type` in https://github.com/ipld/js-generics/pull/1/files to see it embedded in blocks as they are encoded for a HashMap. The js-types implementation then knows, when it encounters a `_type` at the root of a block, that it can go and find the implementation that matches that string and traverse the advanced layout / generic / composite with the associated logic.
I understand this may not be ideal, but it works at least and in lieu of an alternative I'm not sure how to make this useful without hardwiring all of these parameters into code that creates and reads these blocks.
Good point re `width`, I'll try removing that from my implementation and update here.
So, we shouldn't need an identifier like `_type` in this spec. For the advanced layouts / generics / composites I'm separating out the "definition," which describes the implementation of the data structure, from the actual data. Everything in this spec, including these parameters, is part of the data.
It’s important to keep these separate because we’re exploring in-band and out-of-band signaling mechanisms. Since the “definition” can be applied in multiple ways we need to make sure that definitions stay out of specs like this. As @whyrusleeping notes, there are cases where you already know what data structure you’re working with and don’t even need an in-band description.
The experimental implementation we have will change once we figure out how the "fat pointer" works, and will only read the `_type` field when we encounter the right kind of link or when you explicitly tell it the data you're passing is a definition.
I'll admit that I don't fully understand this desire to remove this kind of data from blocks. I've been imagining that making blocks as self-describing as possible was a worthy goal. There are obvious limitations to that, but CIDs seem to get us over most of those. It seems to me that once you have a CID for something, that's all you should ideally need to understand what's in it, minus all of the programmatic pieces to decode and navigate through it (codecs and these advanced layout pieces). There seems to be a desire to push a lot of that self-describing stuff into the "how we got to here" place. That's the bit I don't get.
This got a little off-track, let me try and float a proposal that I think can resolve it.
In our experimental composite system, we could have something like this:
```
{
  _type: 'IPLD/Experimental/HAMT/0',
  hamt: {
    hashAlg: 'sha2-256',
    bitWidth: 8,
    bucketSize: 3,
    root: {
      data: [],
      map: Bytes()
    }
  }
}
```
The top level node containing `_type` and `hamt` is part of the "definition," which isn't part of this spec and is only there for illustrative purposes.
The root block of this spec is then modified to put `bytes` and `map` under a `root` key. In the schema we can define this as a union of Map or Link. In the Filecoin use case, they'd just stick to encoding the root node all the way down. For everyone else, including people linking to Filecoin's HAMT, they can create a block containing the parameters and then link to this root block.
This gives us the flexibility to avoid encoding parameters which are constant in the application while still allowing people to create pointers to the data containing the parameter context they need in order to read the structure properly. The only thing we need to make sure to note is that if you inline the Map vs Link to a Map you’re going to end up with a different hash.
```ipldsch
  hashAlg String
  bitWidth Int
  bucketSize Int
  map Bytes
```
Shouldn't this be a fixed size?
@dignifiedquire: it's dependent on `bitWidth`, but @warpfork has been very firm in his desire to not introduce dependent types into IPLD schemas, so we can't represent that here.
I've added this to the notes below the schema:

- The size of `map` is determined by `bitWidth` since it holds one bit per possible data element. It must be `1` or `2^bitWidth / 8` bytes long, whichever is largest.
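That note reduces to a one-liner (a Python sketch; the function name is mine, not the spec's):

```python
def map_size_bytes(bit_width: int) -> int:
    # One bit per possible data element (2**bit_width of them), with a
    # floor of one byte for small bit widths.
    return max(1, (1 << bit_width) // 8)
```
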
I am confused, is this a different indexing scheme than used in go-hamt-ipld?
Because looking at https://github.com/ipfs/go-hamt-ipld/blob/master/hamt.go#L73-L74 it needs to always be of size 256 bits.
Yes, this spec is not intended to reflect go-hamt-ipld. It has a flexible `bitWidth`; the go-hamt-ipld data structure is fixed to 8 bits for simplicity. Fixing to `8` gives a fixed `map` size of `32` bytes and a maximum number of data elements of `256`. The reference implementation for the purposes of writing this spec is https://github.com/rvagg/iamap (it's currently missing the ability to deal with inlined child nodes).
```ipldsch
  value Value (implicit "null")
} representation tuple

type Value union {
```
This whole thing might be possible to replace with the `Any` type.
(Which... needs specification. But it's come up more than a few times that we need a way to go back to free-for-all untyped data land, and when that's been happening, I've been calling it `Any`. So I'd say feel free to do so here too, and we should go sort out a full spec for `Any` soon.)
👍 my original draft used `any` as a placeholder, but since we don't have that defined yet ...
```ipldsch
}

type Element union {
  | Child "child"
```
would prefer this to be:

```ipldsch
type Element union {
  | Child "c"
  | Bucket "b"
} representation keyed
```

For brevity
👍
re the outstanding point about the … becoming "c" and "b" for brevity: I'm moving my code to a … But then we get some confusion when we handle the inline case where …
I might have to introduce … (Edit: for clarity, "kinded" would work for …)
I don't know if we've hit this syntactical corner before (!) but I think the following should be a sensible expression that might do what you want:
Does that seem to resonate at all rightly? (I don't know if I've sufficiently fully wrapped my head around the problem space, but I wonder if this might make it possible to drop one of the types entirely? I don't quite fully grok BucketEntry/Bucket/Child/Element/Value/Root and I've got this vague idea that it seems like maybe one more type than is necessary. But I'm not at all prepared to back that feeling up and could easily be wrong.)
@warpfork thanks, you're right that it does collapse one type from the list, …
Full schema with this change now at https://github.com/ipld/specs/blob/b4814bba65df1fcf1a067a5da7e27e4a222653e3/schema-layer/data-structures/hashmap.md#schema, there's an outstanding TODO of removing …
Awesome! 🎉 I also re-read the other types and now they all make sense to me. (I ground my gears for a second figuring out how different the leafward types were and why, though. Maybe a line or two of very terse comment in the schemas about each type's role would help? Just enough to back out to the bigger document.)
This algorithm differs from CHAMP in the following ways:

1. CHAMP separates `map` into `datamap` and `nodemap` for referencing local data elements and local references to child nodes. The `data` array is then split in half such that data elements are stored from the left and the child node links are stored from the right with a reverse index. This allows important speed and cache-locality optimizations for fully in-memory data structures but those optimizations are not present, or make negligible impact in a distributed data structure.
2. CHAMP does not make use, of buckets, nor do common implementations of HAMTs on the JVM (e.g. Clojure, Scala, Java). Storing only entries and links removes the need for an iterative search and compare within buckets and allows direct traversal to the entries required. This is effective for in-memory data structures but is less useful when performing block-by-block traversal with distributed data structures where packing data to reduce traversals may be more important.
Stray comma after "use".
An iteration of #109