String improvement and benchmark setup #54
base: main
Conversation
harness = false

[features]
default = ["compact_str"]
I wouldn't change the default just yet, TBH. But to get the best wins, we could instead make the Token use a Cow<'_, str> that points directly into the source buffer. Each token would then have the same lifetime as the source buffer, but in the vast majority of cases zero copies would happen, giving the best performance. WDYT?
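A minimal sketch of what that borrowed-token design could look like, using only std (`Token` and `lex_word` are hypothetical names for illustration, not glslpp-rs API):

```rust
use std::borrow::Cow;

// Hypothetical token whose text borrows from the source buffer.
// Cow lets the lexer fall back to an owned String only when the
// lexeme cannot be a direct slice of the source.
#[derive(Debug)]
struct Token<'src> {
    text: Cow<'src, str>,
}

fn lex_word(source: &str) -> Token<'_> {
    // Common case: the lexeme is a plain slice of the source, zero copies.
    let end = source.find(char::is_whitespace).unwrap_or(source.len());
    Token {
        text: Cow::Borrowed(&source[..end]),
    }
}

fn main() {
    let src = "vec4 color;";
    let tok = lex_word(src);
    // The fast path borrows directly; no allocation happened.
    assert!(matches!(tok.text, Cow::Borrowed(_)));
    println!("{}", tok.text); // prints "vec4"
}
```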
Also I'm asking to see if anyone from Naga can review the change at a high level.
> we could instead make the Token use a Cow<'_, str> that points directly into the source buffer. It would make each token have the same lifetime as the source buffer, but in the vast majority of cases zero copies would happen, giving the best performance. WDYT?
I did try this, but calculating the range is harder than I thought: it requires the lexer to track skipped characters, which adds many branches in parse_* and increases complexity. There are also tons of lifetimes involved, so I don't think it's worth it.
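The complication can be illustrated with a hypothetical std-only helper (not glslpp-rs code): when a GLSL line continuation (backslash-newline) splits a lexeme, its text is no longer one contiguous slice of the source, so the lexer has to copy and splice it.

```rust
use std::borrow::Cow;

// Hypothetical illustration of why skipped characters break borrowing:
// if the raw lexeme contains a line continuation, the logical text must
// be rebuilt into an owned String; otherwise it can stay a borrow.
fn lexeme_text(raw: &str) -> Cow<'_, str> {
    if raw.contains("\\\n") {
        // Slow path: the token spans a continuation; copy and splice.
        Cow::Owned(raw.replace("\\\n", ""))
    } else {
        // Fast path: borrow directly from the source buffer.
        Cow::Borrowed(raw)
    }
}

fn main() {
    // No skipped characters: zero-copy borrow.
    assert!(matches!(lexeme_text("identifier"), Cow::Borrowed(_)));
    // A continuation inside the lexeme forces an owned copy.
    assert_eq!(lexeme_text("iden\\\ntifier"), "identifier");
    println!("ok");
}
```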
Using CompactString sounds fine to me. I recently switched from &str to CompactString for a parser I'm working on and expected performance to be much worse, but it turned out to be fine: IIRC it was marginally slower (1-2%), with the benefit of not having lifetimes everywhere. I guess storing most strings inline and copying them at least once costs about the same as not copying them and instead reaching into the source string. Cow<'_, str> might even be slower than CompactString, since every access needs to take a branch; but it would be great to have a benchmark prove it.
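The branch being referred to is essentially the one Cow's Deref performs on every access. A std-only sketch of it (CompactString itself lives in the external compact_str crate, so it is not shown here):

```rust
use std::borrow::Cow;

// Every read of a Cow<str> goes through Deref, which is essentially
// this match on the variant -- the per-access branch mentioned above.
fn as_str<'a>(text: &'a Cow<'a, str>) -> &'a str {
    match text {
        Cow::Borrowed(s) => s,        // points into the source buffer
        Cow::Owned(s) => s.as_str(),  // points into a heap allocation
    }
}

fn main() {
    let borrowed: Cow<str> = Cow::Borrowed("gl_Position");
    let owned: Cow<str> = Cow::Owned(String::from("gl_Position"));
    // Both paths yield the same text; only the storage differs.
    assert_eq!(as_str(&borrowed), as_str(&owned));
    println!("{}", as_str(&borrowed));
}
```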
#extension GL_GOOGLE_include_directive : enable
#extension GL_EXT_samplerless_texture_functions : enable

#include "cloud_render_common.glsl"
Where do these bench files come from? I think glslpp-rs will very quickly produce an error when it encounters the #include, so we're just benchmarking the parsing of the first few lines.
Also, gathering files like this with no license disclosure could be a problem.
> I think glslpp-rs will very quickly produce an error when it encounters the #include, so we're just benchmarking the parsing of the first few lines.
I checked locally: it does produce Err variants, but it doesn't exit, since the benchmark calls collect here:

glslpp-rs/benches/benchmark.rs, line 13 in 54a58e6:
let _ = Preprocessor::new(input).collect::<Vec<_>>();
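The behaviour described (errors are yielded as items and collect keeps draining the iterator) can be reproduced with plain std iterators:

```rust
// Collecting an iterator of Result items into Vec<Result<_, _>> keeps
// going past errors, so a benchmark written this way still drains the
// whole stream. Collecting into Result<Vec<_>, _> would instead
// short-circuit at the first error.
fn main() {
    let items: Vec<Result<u32, &str>> = vec![Ok(1), Err("bad token"), Ok(2)];

    // Vec<Result<...>>: all three items are kept, errors included.
    let all: Vec<Result<u32, &str>> = items.clone().into_iter().collect();
    assert_eq!(all.len(), 3);

    // Result<Vec<...>, ...>: stops at the first Err.
    let short: Result<Vec<u32>, &str> = items.into_iter().collect();
    assert_eq!(short, Err("bad token"));
    println!("ok");
}
```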
> Where do these bench files come from?

> Also gathering files like this with no license disclosure could be a problem.
I marked them here; the original project is MIT licensed. Do I need to add a license disclosure somewhere else? Or do you have a better source to benchmark with?

glslpp-rs/benches/benchmark.rs, line 5 in 54a58e6:
// Copied from https://github.com/qiutang98/flower/tree/dark/install/shader (MIT License)
CompactString as a variant of Ident, Float and Integer, gated behind a feature (enabled by default; a local benchmark shows a 13~15% performance improvement).
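A hypothetical sketch of what such a feature gate could look like (names are illustrative, not the actual glslpp-rs definitions); with the feature off, the payload falls back to a plain String:

```rust
// Assumed feature name "compact_str", matching the Cargo.toml default
// shown above. When the feature is disabled, the code still compiles
// without the external compact_str crate.
#[cfg(feature = "compact_str")]
type TokenText = compact_str::CompactString;
#[cfg(not(feature = "compact_str"))]
type TokenText = String;

// Illustrative token enum: the three variants the PR description names.
#[derive(Debug)]
enum Token {
    Ident(TokenText),
    Float(TokenText),
    Integer(TokenText),
}

fn main() {
    let t = Token::Ident(TokenText::from("main"));
    if let Token::Ident(name) = &t {
        assert_eq!(name.as_str(), "main");
    }
    println!("{:?}", t);
}
```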