First draft of proposal for inline classes #1
I propose to keep this PR open for comments. If the comments are positive enough, we give it a SNIP number and try to proceed to a prototype implementation phase. Otherwise we'll close it.
This is the first draft of the proposal for inline classes and a different representation of generics that avoids boxing.
Co-authored-by: Lorenzo Gabriele <[email protected]>
- Some additional complications might arise for GC.
- It would be difficult to implement the techniques if usable address space needs to be increased significantly beyond 48 bits.
(Moved here to allow inline replies)
I was not sure what would happen when you mix normal classes and inline classes. Here is what I understood; is this correct?
inline class Inner(x: Int)
inline class Outer(y1: Inner, y2: Inner)
Instances of Outer are C-style structs with structure more or less [y1 -> int, y2 -> int], where y1 and y2 are not actually stored but are offsets known at compile time.
case class Inner(x: Int)
inline class Outer(y1: Inner, y2: Inner)
Outer: [y1 -> pointer to Inner, y2 -> pointer to Inner]
inline class Inner(x: Int)
case class Outer(y1: Inner, y2: Inner)
Outer: case class Outer(y1: Int, y2: Int)
Actually, I couldn't find a calling convention, so would the first example be [y1.x -> int, y2.x -> int] instead? (Where again y1.x and y2.x are replaced by an offset at compile time.)
The layout of classes is all as you described. When calling a constructor or function, statically known inline classes of size up to 8 are passed by value, whereas larger inline classes are passed by reference. But that does not affect the field layout.
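The layouts discussed in this thread can be modeled with a small sketch. Everything in it is hypothetical and only for illustration: the `Ty` model, the 4-byte `Int`, the 8-byte pointer, the reading of "size up to 8" as 8 bytes, and the fact that alignment padding is ignored are all assumptions, not part of the proposal.

```scala
object LayoutSketch {
  // Hypothetical model of a field type (illustration only).
  sealed trait Ty
  case object IntTy extends Ty                       // assumed 4-byte primitive
  final case class RefTy(name: String) extends Ty    // pointer to a regular class
  final case class InlineTy(name: String, fields: List[(String, Ty)]) extends Ty

  val PointerSize = 8 // assumption: 64-bit target

  // Flatten a type into (path, size in bytes) slots. Inline classes are
  // expanded in place; regular classes contribute one pointer slot.
  // Alignment padding is deliberately ignored in this sketch.
  def slots(prefix: String, ty: Ty): List[(String, Int)] = ty match {
    case IntTy    => List(prefix -> 4)
    case RefTy(_) => List(prefix -> PointerSize)
    case InlineTy(_, fields) =>
      fields.flatMap { case (n, t) =>
        slots(if (prefix.isEmpty) n else s"$prefix.$n", t)
      }
  }

  def size(ty: Ty): Int = slots("", ty).map(_._2).sum

  // Assumption: the "size up to 8" threshold is measured in bytes.
  def passedByValue(ty: Ty): Boolean = size(ty) <= 8
}
```

Under these assumptions, an inline Outer holding two inline Inner values flattens to the slots y1.x and y2.x (8 bytes in total, so passed by value), while an inline Outer holding two regular Inner instances holds two pointers (16 bytes, so passed by reference).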
So would you call outer.y1.x, or simply outer.y1?
outer.y1.x.
@odersky I like the proposal and it's something I've wanted for a while. However, I'm wondering if this really needs to be a SNIP... could it not be a SIP? It's true that there are some low-level memory layout advantages that can realistically only be achieved by Scala Native; however, I think there is scope for this to also benefit Scala as a wider whole.

One reason why I'd like to understand how this would fit in the wider Scala context is that it would be a shame if Scala Native started to syntactically diverge from Scala itself -- part of the benefit of Scala Native for me is how I can cross-compile to the JVM quite easily if I wish; obviously some people would rather just always use SN or SJVM instead.

I think that a lot of this proposal shares similarities with a few features from GHC: the

That said, just because Scala Native can do it better doesn't mean that Scala JVM and Scala JS cannot do it in a "less performant" fashion. I still think it would provide many of the benefits you rightly describe in the proposal. It might require more boxing or unboxing perhaps, but I still think that provides room for more compact cache coherent data access, etc.

Glad to hear your thoughts and discuss further 🙂
I'd be surprised if one could get the benefits of this proposal on the current JVM. Yes, we can try to extend the value class approach to classes with more than one field (which seems to be more or less what {-# UNPACK #-} does). But value classes are already problematic for possibly losing performance due to unexpected boxing. So it's not clear going further down that road will win anything on average. One can try for sure, but it will be a lot of work, and I don't see anyone having the appetite to do it.

So I think the only way forward is: prototype this on SN. If it's a big win, lobby for the syntax changes to be backported to Scala JVM. As far as I can see, the only syntax change necessary would be to allow

On the other hand, if the project is a success on SN, maybe it can influence and speed up Valhalla and we will get equivalent functionality on the JVM at some point?
Otherwise our computations don't work out since inline classes with byte alignment need 8 consecutive entries in the vtable array, so one of them will hit 000 as an index.
It would be interesting to see what benefits you can get -- I think a system that is a little more robust than the current "best effort"
... i.e. this makes sense.
Ideally though, it would still be valid syntax on the JVM even if it "did nothing" for portability. Though as @keynmol noted earlier (elsewhere), it seems that the current parser already accepts
Well, you could still unpack a multi-arg value class directly into multiple arguments/variables on the JVM, no?
Yeah, the annotation route seems a little hacky to me, I'd rather syntax that "does nothing" over that, I'd think:
Yeah, this is a great point! I see real value in that, and if a SN prototype is the way to get that, then cool!
You can align the class' storage by the alignment of its first field and add padding as you go. That way you can align at
So IIUC basically all instances in a class hierarchy have the same layout. It's just that for parent classes the unused members are considered padding. If that's the case, I suspect there's a way to avoid bloating the size of a parent class instance by laying out the fields of child classes at the end of the inline storage.
I do not understand the reason why inline classes could not be parameterized. One can compute the layout of the inline storage during monomorphization, so there should be no problem writing such a class:
inline class Pair[A, B](first: A, second: B)
The main difficulty is to generate the ClassInstance info for such a type. IIUC, this information roughly matches what Swift (and Hylo) call "witness tables". Perhaps I can shed some light on this concept so that we can figure out whether it can be adapted to this proposal.

Boxing in Swift consists of creating an existential type. Just like in CS literature, an existential type "wraps" a witness with an interface. At runtime, existential types are represented as so-called "existential containers", which have a layout of the form:
[Note: The reason for a 3-word payload in Swift is historical. In practice I find it to be too small for many witnesses, and therefore we usually end up wasting 2 words because we're only storing a pointer to out-of-line storage.]

A value witness table tells the runtime about the "value behavior" of the witness. Specifically, it contains the size of the witness and pointers to methods for copying it, moving it, and destroying it.

The protocol witness table tells the runtime how to apply dynamic dispatch. Protocols in Swift are best understood as traits, and are the primary tool to write generic code. The protocols to which a witness conforms describe the interface of the existential type in which it's been wrapped. At runtime, the protocol witness table is used to look up the implementation of each trait requirement. You may want to have a look at this paper for some formal description of a protocol witness table.

Note that fields can be represented using getters and setters that hardcode the offset. So access to a property can be represented like a method call. Better performance can be achieved using subscripts, but that is perhaps an orthogonal feature that I'm happy to describe in another post.

Say you have this generic program:
The call to
If the call to Should Of course, witness tables can be reused for every boxed instance of the same type, so in practice they are not very expensive.
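To make the witness-table description above more concrete, here is a minimal sketch in plain Scala. It is not Swift's actual runtime layout and not part of the proposal: the names `ValueWitness`, `ShowWitness`, `AnyShow`, `box`, and `describe` are hypothetical, the "protocol" is an invented `Show` trait, and the move operation is left out for brevity.

```scala
object WitnessSketch {
  // Value witness table: size of the witness plus operations to copy and
  // destroy it (the move operation is omitted in this sketch).
  final class ValueWitness[A](val size: Int,
                              val copyValue: A => A,
                              val destroy: A => Unit)

  // Protocol witness table for a hypothetical `Show` protocol.
  trait ShowWitness[A] { def show(a: A): String }

  // Existential container: a payload together with its witness tables.
  // The concrete payload type is hidden behind the type member T.
  trait AnyShow {
    type T
    def payload: T
    def valueWitness: ValueWitness[T]
    def showWitness: ShowWitness[T]
  }

  def box[A](a: A, vw: ValueWitness[A], sw: ShowWitness[A]): AnyShow =
    new AnyShow {
      type T = A
      val payload: A = a
      val valueWitness: ValueWitness[A] = vw
      val showWitness: ShowWitness[A] = sw
    }

  // Dynamic dispatch goes through the protocol witness table.
  def describe(e: AnyShow): String = e.showWitness.show(e.payload)

  // Tables are created once per type and shared by every boxed instance.
  val intValueWitness: ValueWitness[Int] =
    new ValueWitness[Int](4, (x: Int) => x, (_: Int) => ())
  val intShowWitness: ShowWitness[Int] =
    new ShowWitness[Int] { def show(a: Int): String = s"Int($a)" }
}
```

Boxing two different integers with `WitnessSketch.box` reuses the same two table instances, which mirrors the remark that witness tables are shared per type rather than per value.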
From here:
Why not store a header with all the information we need along with the boxed value? Then we would need only a single bit in the reference to tell whether it's a regular reference or one to a boxed struct. This approach would also work on a 32-bit machine, since we'd need to tag pointers with only a single bit. References on a 32-bit machine are typically aligned at 4 bytes, allowing us to use the two least significant bits.

From here:
I'd like to propose going with (mutable) value semantics to address this issue. If inline classes (or structs) behave with value semantics, then we no longer need to care about references to the middle of a potentially stack-allocated object. That's because any "use" of a field would result in a copy. This approach would make inline classes behave differently from regular Scala classes (which would be one more reason to call them structs), but I also suspect it would make it simpler to fit them into the language and lift the immutability restriction.

There is precedent for making value semantics and reference semantics co-exist. C# and Swift are examples. I have a lot of experience with the latter.
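As a rough illustration of the "any use of a field results in a copy" idea, the sketch below simulates it in today's Scala with explicit copies. The `Point` and `Segment` classes and the `demo` method are hypothetical; under real value semantics the compiler would insert (or elide) these copies itself.

```scala
object ValueSemanticsSketch {
  // Hypothetical mutable "struct"; copy() stands in for the implicit copy
  // that value semantics would perform.
  final class Point(var x: Int, var y: Int) {
    def copy(): Point = new Point(x, y)
  }

  final class Segment(start0: Point, end0: Point) {
    private val s = start0.copy() // copy in: no aliasing with the caller
    private val e = end0.copy()
    def start: Point = s.copy()   // copy out: no reference into the middle
    def end: Point = e.copy()
  }

  def demo(): (Int, Int) = {
    val seg = new Segment(new Point(1, 2), new Point(3, 4))
    val p = seg.start
    p.x = 99            // mutates the local copy only
    (p.x, seg.start.x)  // the segment's own field is unchanged
  }
}
```

Because no caller ever holds a reference to the storage inside `Segment`, mutation of `p` cannot be observed through `seg`, which is the property that makes interior references a non-issue.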
Yes, it would be really nice to support 32-bit, as Scala Native is quite portable and useful for 32-bit platforms.
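The single-bit tagging idea suggested earlier in the thread can be sketched as follows. Addresses are modeled as plain `Long` values, and the choice of the lowest bit as the "boxed struct" marker, along with the names `tag`, `isBoxedStruct`, and `address`, are assumptions made for this example only.

```scala
object TagSketch {
  // Assumption: the least significant bit marks a reference to a boxed struct.
  val BoxedStructBit = 1L

  // With 4-byte alignment the two lowest address bits are always zero,
  // so one of them is free to carry the tag.
  def tag(addr: Long): Long = {
    require((addr & 3L) == 0L, "address must be 4-byte aligned")
    addr | BoxedStructBit
  }

  def isBoxedStruct(ref: Long): Boolean = (ref & BoxedStructBit) != 0L

  // Clear the tag bit to recover the real address before dereferencing.
  def address(ref: Long): Long = ref & ~BoxedStructBit
}
```

Since only alignment bits are used, the same scheme fits in a 32-bit reference; the remaining type information would live in the header stored next to the boxed value.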