Cache __contains_no_intrinsic_headers and thus speedup parse_options ~2x #4479
Conversation
According to line_profiler (https://github.com/pyutils/line_profiler), this speeds up parse_options roughly 2x.
When analyzing Chromium or any other large project, __contains_no_intrinsic_headers is called thousands of times for the same directory name, and each call performs expensive OS operations.
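The caching approach described above can be sketched with functools.lru_cache. This is a hypothetical illustration, not the actual CodeChecker implementation: the function name is adapted from the PR, and the directory-check body (looking for headers ending in "intrin.h") is an assumption.

```python
import functools
import os

@functools.lru_cache(maxsize=None)
def contains_no_intrinsic_headers(path):
    """Return True if no file in `path` looks like a compiler intrinsic
    header. The body is a guess at the real check; the point is that the
    expensive os.listdir() runs only once per distinct directory."""
    if not os.path.isdir(path):
        return True
    # Expensive part: hitting the filesystem. With lru_cache, repeated
    # calls for the same path are served from memory instead.
    return not any(name.endswith('intrin.h') for name in os.listdir(path))
```

Because large builds reference the same handful of compiler directories thousands of times, memoizing by path turns thousands of filesystem lookups into a few.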
It looks like only the GUI tests fail, and that is not related to this PR.
Thanks, caching this slow function is a great improvement. I just have two small questions.
@@ -1289,6 +1288,7 @@ def parse_unique_log(compilation_database,
     uniqueing_re = re.compile(compile_uniqueing)

     skipped_cmp_cmd_count = 0
+    __contains_no_intrinsic_headers.cache_clear()
What is the purpose of clearing the cache in the beginning?
In fact it's only needed for the tests, because they share a static cache storage.
In real scenarios this function is called only once per process, so clearing the cache is a no-op.
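A minimal illustration of the test-isolation point made here: when tests share one process, a module-level lru_cache persists across test cases, so each case has to reset it. The function below is hypothetical, not CodeChecker code.

```python
import functools

call_count = 0

@functools.lru_cache(maxsize=None)
def expensive_check(key):
    """Stand-in for a cached helper shared by several test cases."""
    global call_count
    call_count += 1
    return len(key)

expensive_check('a')           # first "test case" populates the cache
expensive_check.cache_clear()  # reset so the next case starts clean
expensive_check('a')           # recomputed, not served from a stale cache
```

Without the cache_clear() call in the middle, the second call would be a cache hit and the two test cases would silently depend on each other's state.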
@@ -1184,12 +1185,10 @@ def extend_compilation_database_entries(compilation_database):
     for source_file in source_files:
         new_entry = dict(entry)
         new_entry['file'] = source_file
-        entries.append(new_entry)
Why is it useful to build this list?
Well, it's not directly related to the speed-up, but I noticed that the whole entries list is never needed all at once, so I decided to save a bit of memory here.
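The append-to-yield change discussed here can be sketched as follows. This is a simplified, assumed version: the real function is extend_compilation_database_entries, and the 'files' key and entry shape here are illustrative only.

```python
def extend_entries(entries):
    """Yield one compilation-database entry per source file instead of
    appending them all to a list, so a caller that iterates once never
    holds the full expanded list in memory."""
    for entry in entries:
        for source_file in entry.get('files', []):
            new_entry = dict(entry)
            new_entry['file'] = source_file
            yield new_entry
```

Since the caller only ever iterates over the result once, turning the function into a generator trades the materialized list for constant memory per entry.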
Oh, sorry, for some reason I thought it was a "yield -> append" change, but it's just the reverse. :D