Skip to content

Remove implicit conversions between LocalVector and Vector #107469

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Ivorforce
Copy link
Member

@Ivorforce Ivorforce commented Jun 12, 2025

Vector copy-conversion to LocalVector, and vice versa, are costly. Conversions are currently implicit, which allows them to happen by accident. This PR changes existing conversions (seems to just be one-way right now) to be explicit instead, preventing accidental performance sinks.

The change exposes current unintended conversions. @godotengine/rendering may want to have a look especially, since there are some conversions that should maybe be addressed.

@Ivorforce Ivorforce added this to the 4.x milestone Jun 12, 2025
@Ivorforce Ivorforce requested a review from a team as a code owner June 12, 2025 22:01
@Ivorforce Ivorforce requested review from a team as code owners June 12, 2025 22:01
@Ivorforce Ivorforce requested review from a team as code owners June 12, 2025 22:01
@Ivorforce Ivorforce removed request for a team June 12, 2025 22:02
@Ivorforce Ivorforce changed the title Change Vector <-> LocalVector conversions to be Span based, and make them explicit Remove implicit conversions between LocalVector and Vector Jun 12, 2025
@Ivorforce Ivorforce force-pushed the vector-localvector-explicit-span-conversions branch from a693e80 to 71b62a8 Compare June 12, 2025 22:38
@Ivorforce Ivorforce requested a review from a team as a code owner June 12, 2025 22:38
@Ivorforce Ivorforce removed the request for review from a team June 12, 2025 22:54
@Ivorforce Ivorforce force-pushed the vector-localvector-explicit-span-conversions branch from 71b62a8 to b63645c Compare June 12, 2025 23:42
return RD::get_singleton()->framebuffer_format_create(attachments);
return RD::get_singleton()->framebuffer_format_create(Vector<RD::AttachmentFormat>(attachments));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great change, as it exposes issues like this, where a thread local variable was used in an effort to improve performance, only to hurt it, as it is immediately copied to the heap.

@stuartcarnie stuartcarnie self-requested a review June 13, 2025 01:56
Copy link
Contributor

@stuartcarnie stuartcarnie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 from me, as implicit conversions should be cheap, zero cost IMHO. This exposes situations where an optimisation was intended, but failed due to the implicit conversion.

@lawnjelly
Copy link
Member

I'll probably do the same in 3.x. I've long thought these implicit conversions were a bad idea especially for heavy types, and I'm sure it will expose some performance bugs.

@akien-mga
Copy link
Member

akien-mga commented Jun 13, 2025

I think this might warrant proper review of each occurrence, as it might highlight that some stuff uses LocalVector when it shouldn't. My understanding is that LocalVector was meant to be local and not passed across method boundaries, which typically use Vector. Here it looks like we have code that creates LocalVector only to convert it to Vector whenever it's passed to a method, which seems wasteful.

@Ivorforce
Copy link
Member Author

Ivorforce commented Jun 13, 2025

I think this might warrant proper review of each occurrence, as it might highlight that some stuff uses LocalVector when it shouldn't.

Agreed, but unfortunately the fixes might not be fully straight-forward. For example, we had a PR recently that optimized by using LocalVector temporarily and converting to Vector only at the end, because the Vector creation overhead is so large. For example, Vector does not retain its buffer on clear() (yet), which can lead to frequent re-allocations during growth.
I'm guessing some of the performance problems exposed here might be best resolved by making use of Span or VectorView instead.

All in all, personally I'd still move forwards with this one now, and leave it to contributors of the areas to apply an appropriate fix as a follow-up.

@@ -239,7 +239,7 @@ void AudioStreamMP3::set_data(const Vector<uint8_t> &p_data) {
}

Vector<uint8_t> AudioStreamMP3::get_data() const {
return data;
return Vector<uint8_t>(data);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a conversion from Vector -> LocalVector right above in set_data. Is that direction not handled in this PR?

Because here it seems there's just no point for data to be LocalVector if the input data is Vector and the output should be too. (Unless I guess we're doing stuff with data where LocalVector really gives a huge benefit and warrants this double conversion.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do this in the codebase sometimes, where we store LocalVector internally but expose Vector interfaces.
The reason is that Vector still has considerable overhead, even on lookup. If we expect user queries to happen rarely, but internal read/writes to occur often, this type of interface is still preferable, even though we're re-allocating for user queries.

Comment on lines +1716 to +1717
surface_tool->set_bones(Vector<int>(bones));
surface_tool->set_weights(Vector<float>(weights));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like bones and weightsshould just beVector` in the first place (same for SpringBone3DGizmo).

Copy link
Member Author

@Ivorforce Ivorforce Jun 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit 2: It looks like the arrays are modified during iteration. In a way, the LocalVector is only used as a blueprint because SurfaceTool receives different buffers during each iteration.
There's probably still optimizations to be done here, but looking at surrounding context I wouldn't feel comfortable simply changing the type above (without testing), due to CoW overheads during the loops, if the arrays are going to be forked each iteration anyway.

@@ -101,7 +101,7 @@ Vector<uint8_t> WaylandThread::_read_fd(int fd) {
data.resize(bytes_read + chunk_size);
}

return data;
return Vector<uint8_t>(data);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC @Riteo to confirm that have data as LocalVector to convert on return is worth it here, as opposed to making it Vector from the start (I see it's resized quite a bit so I suppose the current approach is a good option).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really know these classes in depth so I simply followed the advice of "Use LocalVector if you don't need COW".

I see it's resized quite a bit so I suppose the current approach is a good option

Is resizing that expensive with Vector? I'm taking a look at the template and it looks like it has a shortcut for when the refcount is 1, which in this case it would obviously be. If we can save a memcpy I guess we could switch to a Vector outright.

That's just my understanding though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, anything that writes to Vector (resize especially) is a lot slower with Vector. Whether or not that's important depends on the context of course.

@lawnjelly
Copy link
Member

BTW something to watch for here which has bit me just now doing conversions from Vector to LocalVector, that the default size is unsigned for LocalVector and signed for Vector (certainly in 3.x, and I think in 4.x) so you might have to check the use cases and convert to unsigned, or make sure they are ok with unsigned counters.

Alternative is to use a LocalVector<T, int32_t> instead of LocalVector<T>.

@Ivorforce
Copy link
Member Author

Ivorforce commented Jun 15, 2025

BTW something to watch for here which has bit me just now doing conversions from Vector to LocalVector, that the default size is unsigned for LocalVector and signed for Vector (certainly in 3.x, and I think in 4.x) so you might have to check the use cases and convert to unsigned, or make sure they are ok with unsigned counters.

Alternative is to use a LocalVector<T, int32_t> instead of LocalVector<T>.

In this PR I made no logical changes (just made the conversion explicit), so there should not be any effect on signed-ness of objects' counters.

@clayjohn
Copy link
Member

For the rendering changes, this looks fine for now. We need to move to using Span more widely so we use LocalVector even more. In particular, the methods here that are converting to Vector should be modified to take a Span, so a LocalVector can be used instead

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants