[clang][modules] Stop eagerly reading files with diagnostic pragmas #87442

Merged

jansvoboda11
Contributor

This makes it so that the importer doesn't need to stat all input files of a module that contain diagnostic pragmas, reducing file system traffic.

@alexfh This is basically the minimized reproducer you provided in https://reviews.llvm.org/D137213. I don't think it's reasonable to expect this to work.

@cor3ntin cor3ntin added the clang:modules C++20 modules and Clang Header Modules label Apr 3, 2024
@llvmbot
Member

llvmbot commented Apr 3, 2024

@llvm/pr-subscribers-clang-modules

Author: Jan Svoboda (jansvoboda11)

Changes

This makes it so that the importer doesn't need to stat all input files of a module that contain diagnostic pragmas, reducing file system traffic.


Full diff: https://github.com/llvm/llvm-project/pull/87442.diff

2 Files Affected:

  • (modified) clang/lib/Serialization/ASTReader.cpp (-2)
  • (added) clang/test/Modules/home-is-cwd-search-paths.c (+34)
diff --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index 9a39e7d3826e7d..4db0e04b27a06f 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -6624,8 +6624,6 @@ void ASTReader::ReadPragmaDiagnosticMappings(DiagnosticsEngine &Diag) {
              "Invalid data, missing pragma diagnostic states");
       FileID FID = ReadFileID(F, Record, Idx);
       assert(FID.isValid() && "invalid FileID for transition");
-      // FIXME: Remove this once we don't need the side-effects.
-      (void)SourceMgr.getSLocEntryOrNull(FID);
       unsigned Transitions = Record[Idx++];
 
       // Note that we don't need to set up Parent/ParentOffset here, because
diff --git a/clang/test/Modules/home-is-cwd-search-paths.c b/clang/test/Modules/home-is-cwd-search-paths.c
new file mode 100644
index 00000000000000..0b8954e691bc04
--- /dev/null
+++ b/clang/test/Modules/home-is-cwd-search-paths.c
@@ -0,0 +1,34 @@
+// This test demonstrates how -fmodule-map-file-home-is-cwd with -fmodules-embed-all-files
+// extend the importer search paths by relying on the side effects of pragma diagnostic
+// mappings deserialization.
+
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+//--- dir1/a.modulemap
+module a { header "a.h" }
+//--- dir1/a.h
+#include "search.h"
+// The first compilation is configured such that -I search does contain the search.h header.
+//--- dir1/search/search.h
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wparentheses"
+#pragma clang diagnostic pop
+// RUN: cd %t/dir1 && %clang_cc1 -fmodules -I search \
+// RUN:   -emit-module -fmodule-name=a a.modulemap -o %t/a.pcm \
+// RUN:   -fmodules-embed-all-files -fmodule-map-file-home-is-cwd
+
+//--- dir2/b.modulemap
+module b { header "b.h" }
+//--- dir2/b.h
+#include "search.h" // expected-error{{'search.h' file not found}}
+// The second compilation is configured such that -I search is an empty directory.
+// However, since b.pcm simply embeds the headers as "search/search.h", this compilation
+// ends up seeing it too. This relies solely on ASTReader::ReadPragmaDiagnosticMappings()
+// eagerly reading the corresponding INPUT_FILE record before header search happens.
+// Removing the eager deserialization makes this header invisible and so does removing
+// the pragma directives.
+// RUN: mkdir %t/dir2/search
+// RUN: cd %t/dir2 && %clang_cc1 -fmodules -I search \
+// RUN:   -emit-module -fmodule-name=b b.modulemap -o %t/b.pcm \
+// RUN:   -fmodule-file=%t/a.pcm -verify
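
To illustrate why dropping the eager lookup reduces file system traffic, here is a minimal toy model, not Clang's actual code. `CountingFS`, `ToySourceManager`, `loadEntry`, and `readPragmaMappings` are hypothetical names invented for this sketch; the only assumption carried over from the diff is that resolving a source location entry has the side effect of stat-ing the corresponding input file.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a file system that counts stat() calls.
struct CountingFS {
  std::size_t StatCalls = 0;
  bool stat(const std::string &Path) {
    ++StatCalls;
    return !Path.empty();
  }
};

// Hypothetical stand-in for a module's input-file table.
// The index into InputFiles plays the role of a file ID.
class ToySourceManager {
  CountingFS &FS;
  std::vector<std::string> InputFiles;
  std::unordered_map<std::size_t, bool> Loaded;

public:
  ToySourceManager(CountingFS &FS, std::vector<std::string> Files)
      : FS(FS), InputFiles(std::move(Files)) {}

  // Resolve an entry on demand; the file is stat-ed only on first use.
  bool loadEntry(std::size_t ID) {
    auto It = Loaded.find(ID);
    if (It != Loaded.end())
      return It->second;
    bool Ok = FS.stat(InputFiles.at(ID));
    Loaded.emplace(ID, Ok);
    return Ok;
  }
};

// Walks the files that carry pragma diagnostic transitions.
// Eager == true models the removed line: every such file is resolved
// (and therefore stat-ed) up front, whether or not it is used later.
std::size_t readPragmaMappings(ToySourceManager &SM,
                               const std::vector<std::size_t> &FilesWithPragmas,
                               bool Eager) {
  std::size_t Seen = 0;
  for (std::size_t ID : FilesWithPragmas) {
    if (Eager)
      (void)SM.loadEntry(ID);
    ++Seen; // the transitions themselves are read either way
  }
  return Seen;
}
```

In this model, calling `readPragmaMappings` with `Eager` set to `false` touches no files at all; a file is stat-ed only if something later calls `loadEntry` for it.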

jansvoboda11 added a commit to swiftlang/llvm-project that referenced this pull request Apr 3, 2024
…lvm#87442)

This makes it so that the importer doesn't need to stat all input files
of a module that contain diagnostic pragmas, reducing file system
traffic.
Contributor

@Bigcheese Bigcheese left a comment

Looking at 9acb99e342376c6269fb70d1e9665c2790b93b12 I agree it's unexpected that this worked at all. This is just missing search paths.

@jansvoboda11 jansvoboda11 merged commit fc3dff9 into llvm:main Apr 10, 2024
@jansvoboda11 jansvoboda11 deleted the pragma-diagnostic-mappings-side-effect branch April 10, 2024 16:08
@alexfh
Contributor

alexfh commented Apr 22, 2024

Hi Jan, we started seeing a compilation error in some (quite unusual, frankly speaking) code:

pigweed/pw_rpc/public/pw_rpc/internal/channel_list.h:24:10: fatal error: 'vector' file not found
   24 | #include PW_RPC_DYNAMIC_CONTAINER_INCLUDE
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pigweed/pw_rpc/public/pw_rpc/internal/config.h:207:42: note: expanded from macro 'PW_RPC_DYNAMIC_CONTAINER_INCLUDE'
  207 | #define PW_RPC_DYNAMIC_CONTAINER_INCLUDE <vector>
      |                                          ^~~~~~~~
<scratch space>:73:1: note: expanded from here
   73 | <vector>
      | ^~~~~~~~

The code is coming from here: https://pigweed.googlesource.com/pigweed/pigweed/+/refs/heads/main/pw_rpc/public/pw_rpc/internal/channel_list.h#24

This error only reproduces in our modules build.

@jansvoboda11
Contributor Author

@alexfh Can you check it doesn't boil down to the same thing as the issue you reported in https://reviews.llvm.org/D137213?

@sam-mccall
Collaborator

sam-mccall commented Apr 22, 2024

Yes, it's approximately the same problem. We'll fix this internally; thanks & sorry for the noise!

(We have a non-clang include-scanner that computes dependencies to ensure hermetic builds. The indirect include defeats the include scanner, so we were accidentally relying on <vector> being available for some other reason - which I suppose was this one).
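
As a rough illustration of how an indirect include can defeat a textual include scanner, here is a small sketch. It is not the scanner described above, whose implementation is not shown in this thread; `scanLiteralIncludes` is a hypothetical function that assumes the scanner only recognizes literal `#include <...>` and `#include "..."` lines and does not expand macros.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical scanner: collects only include targets that are spelled
// literally as <...> or "...". It performs no macro expansion, so
// `#include SOME_MACRO` is silently skipped.
std::vector<std::string> scanLiteralIncludes(const std::string &Source) {
  static const std::regex Pattern(
      R"re(^\s*#\s*include\s*([<"][^>"]+[>"]))re");
  std::vector<std::string> Found;
  std::istringstream Lines(Source);
  std::string Line;
  while (std::getline(Lines, Line)) {
    std::smatch Match;
    if (std::regex_search(Line, Match, Pattern))
      Found.push_back(Match[1].str());
  }
  return Found;
}
```

Given a source that defines `PW_RPC_DYNAMIC_CONTAINER_INCLUDE` as `<vector>` and then writes `#include PW_RPC_DYNAMIC_CONTAINER_INCLUDE`, this sketch reports no dependency on `<vector>`, so a hermetic build driven by such a scan would not provide that header unless it became available for some other reason.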

@alexfh
Contributor

alexfh commented Apr 22, 2024

> @alexfh Can you check it doesn't boil down to the same thing as the issue you reported in https://reviews.llvm.org/D137213?

Oh, that was long ago ;) And indeed, as @sam-mccall said, it turned out to be a problem with our build setup. Sorry for the noise!

@jansvoboda11
Contributor Author

No worries, happy to get to the bottom of it.
