Skip to content

[DLCov] Origin-Tracking: Collect stack traces in DebugLoc #143592

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: users/SLTozer/dl-coverage-origin-symbolize
Choose a base branch
from

Conversation

SLTozer
Copy link
Contributor

@SLTozer SLTozer commented Jun 10, 2025

This patch is part of a series that adds origin-tracking to the debugify source location coverage checks, allowing us to report symbolized stack traces of the point where missing source locations appear.

This patch adds the logic for collecting stack traces in DebugLoc instances. We do not symbolize the stack traces in this patch - that only happens when we decide to actually print them, which will be the responsibility of debugify. The collection happens in the constructor of a DebugLoc that has neither a valid location nor an annotation; we also collect an extra stack trace every time we call setDebugLoc, as sometimes the more interesting point is not where the DebugLoc was constructed, but where it was applied to an instruction. This takes the form of a getCopied() method on DebugLoc, which is the identity function in normal builds, but adds an extra stack trace in origin-tracking builds.

Copy link
Contributor Author

SLTozer commented Jun 10, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

@SLTozer SLTozer force-pushed the users/SLTozer/dl-coverage-origin-debugloc branch from 8ff21d6 to 6ade803 Compare June 10, 2025 20:13
@SLTozer SLTozer force-pushed the users/SLTozer/dl-coverage-origin-symbolize branch from d10a102 to b2ecf5e Compare June 10, 2025 20:13
@SLTozer SLTozer changed the title [DLCov] Origin-Tracking: Core implementation [DLCov] Origin-Tracking: Collect stack traces in DebugLoc Jun 10, 2025
@SLTozer SLTozer force-pushed the users/SLTozer/dl-coverage-origin-debugloc branch from 6ade803 to 37c0c68 Compare June 11, 2025 08:51
@SLTozer SLTozer force-pushed the users/SLTozer/dl-coverage-origin-symbolize branch from b2ecf5e to fe689dd Compare June 11, 2025 08:51
Copy link

github-actions bot commented Jun 11, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link

github-actions bot commented Jun 11, 2025

✅ With the latest revision this PR passed the undef deprecator.

@llvmbot
Copy link
Member

llvmbot commented Jun 11, 2025

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Stephen Tozer (SLTozer)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/143592.diff

5 Files Affected:

  • (modified) llvm/include/llvm/IR/DebugLoc.h (+42-7)
  • (modified) llvm/include/llvm/IR/Instruction.h (+1-1)
  • (modified) llvm/lib/CodeGen/BranchFolding.cpp (+7)
  • (modified) llvm/lib/IR/DebugLoc.cpp (+21-1)
  • (modified) llvm/lib/IR/Instruction.cpp (+1-1)
diff --git a/llvm/include/llvm/IR/DebugLoc.h b/llvm/include/llvm/IR/DebugLoc.h
index c3d0fb80354a4..1930199607204 100644
--- a/llvm/include/llvm/IR/DebugLoc.h
+++ b/llvm/include/llvm/IR/DebugLoc.h
@@ -27,6 +27,21 @@ namespace llvm {
   class Function;
 
 #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  struct DbgLocOrigin {
+    static constexpr unsigned long MaxDepth = 16;
+    using StackTracesTy =
+        SmallVector<std::pair<int, std::array<void *, MaxDepth>>, 0>;
+    StackTracesTy StackTraces;
+    DbgLocOrigin(bool ShouldCollectTrace);
+    void addTrace();
+    const StackTracesTy &getOriginStackTraces() const { return StackTraces; };
+  };
+#else
+  struct DbgLocOrigin {
+    DbgLocOrigin(bool) {}
+  };
+#endif
   // Used to represent different "kinds" of DebugLoc, expressing that the
   // instruction it is part of is either normal and should contain a valid
   // DILocation, or otherwise describing the reason why the instruction does
@@ -55,22 +70,29 @@ namespace llvm {
     Temporary
   };
 
-  // Extends TrackingMDNodeRef to also store a DebugLocKind, allowing Debugify
-  // to ignore intentionally-empty DebugLocs.
-  class DILocAndCoverageTracking : public TrackingMDNodeRef {
+  // Extends TrackingMDNodeRef to also store a DebugLocKind and Origin,
+  // allowing Debugify to ignore intentionally-empty DebugLocs and display the
+  // code responsible for generating unintentionally-empty DebugLocs.
+  // Currently we only need to track the Origin of this DILoc when using a
+  // DebugLoc that is not annotated (i.e. has DebugLocKind::Normal) and has a
+  // null DILocation, so only collect the origin stacktrace in those cases.
+  class DILocAndCoverageTracking : public TrackingMDNodeRef,
+                                   public DbgLocOrigin {
   public:
     DebugLocKind Kind;
     // Default constructor for empty DebugLocs.
     DILocAndCoverageTracking()
-        : TrackingMDNodeRef(nullptr), Kind(DebugLocKind::Normal) {}
-    // Valid or nullptr MDNode*, normal DebugLocKind.
+        : TrackingMDNodeRef(nullptr), DbgLocOrigin(true),
+          Kind(DebugLocKind::Normal) {}
+    // Valid or nullptr MDNode*, no annotative DebugLocKind.
     DILocAndCoverageTracking(const MDNode *Loc)
-        : TrackingMDNodeRef(const_cast<MDNode *>(Loc)),
+        : TrackingMDNodeRef(const_cast<MDNode *>(Loc)), DbgLocOrigin(!Loc),
           Kind(DebugLocKind::Normal) {}
     LLVM_ABI DILocAndCoverageTracking(const DILocation *Loc);
     // Explicit DebugLocKind, which always means a nullptr MDNode*.
     DILocAndCoverageTracking(DebugLocKind Kind)
-        : TrackingMDNodeRef(nullptr), Kind(Kind) {}
+        : TrackingMDNodeRef(nullptr),
+          DbgLocOrigin(Kind == DebugLocKind::Normal), Kind(Kind) {}
   };
   template <> struct simplify_type<DILocAndCoverageTracking> {
     using SimpleType = MDNode *;
@@ -142,6 +164,19 @@ namespace llvm {
     static inline DebugLoc getDropped() { return DebugLoc(); }
 #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
 
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+    const DbgLocOrigin::StackTracesTy &getOriginStackTraces() const {
+      return Loc.getOriginStackTraces();
+    }
+    DebugLoc getCopied() const {
+      DebugLoc NewDL = *this;
+      NewDL.Loc.addTrace();
+      return NewDL;
+    }
+#else
+    DebugLoc getCopied() const { return *this; }
+#endif
+
     /// Get the underlying \a DILocation.
     ///
     /// \pre !*this or \c isa<DILocation>(getAsMDNode()).
diff --git a/llvm/include/llvm/IR/Instruction.h b/llvm/include/llvm/IR/Instruction.h
index 10fc9c1298607..1d22bdb0c3f43 100644
--- a/llvm/include/llvm/IR/Instruction.h
+++ b/llvm/include/llvm/IR/Instruction.h
@@ -507,7 +507,7 @@ class Instruction : public User,
   LLVM_ABI bool extractProfTotalWeight(uint64_t &TotalVal) const;
 
   /// Set the debug location information for this instruction.
-  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc); }
+  void setDebugLoc(DebugLoc Loc) { DbgLoc = std::move(Loc).getCopied(); }
 
   /// Return the debug location for this node as a DebugLoc.
   const DebugLoc &getDebugLoc() const { return DbgLoc; }
diff --git a/llvm/lib/CodeGen/BranchFolding.cpp b/llvm/lib/CodeGen/BranchFolding.cpp
index e0f7466ceacff..47fc0ec7549e0 100644
--- a/llvm/lib/CodeGen/BranchFolding.cpp
+++ b/llvm/lib/CodeGen/BranchFolding.cpp
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/TargetPassConfig.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
+#include "llvm/Config/llvm-config.h"
 #include "llvm/IR/DebugInfoMetadata.h"
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/IR/Function.h"
@@ -933,7 +934,13 @@ bool BranchFolder::TryTailMergeBlocks(MachineBasicBlock *SuccBB,
 
   // Sort by hash value so that blocks with identical end sequences sort
   // together.
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+  // If origin-tracking is enabled then MergePotentialElt is no longer a POD
+  // type, so we need std::sort instead.
+  std::sort(MergePotentials.begin(), MergePotentials.end());
+#else
   array_pod_sort(MergePotentials.begin(), MergePotentials.end());
+#endif
 
   // Walk through equivalence sets looking for actual exact matches.
   while (MergePotentials.size() > 1) {
diff --git a/llvm/lib/IR/DebugLoc.cpp b/llvm/lib/IR/DebugLoc.cpp
index 0e65ddcec8934..05aad5d393547 100644
--- a/llvm/lib/IR/DebugLoc.cpp
+++ b/llvm/lib/IR/DebugLoc.cpp
@@ -9,11 +9,31 @@
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/Config/llvm-config.h"
 #include "llvm/IR/DebugInfo.h"
+
+#if LLVM_ENABLE_DEBUGLOC_ORIGIN_TRACKING
+#include "llvm/Support/Signals.h"
+
+namespace llvm {
+DbgLocOrigin::DbgLocOrigin(bool ShouldCollectTrace) {
+  if (ShouldCollectTrace) {
+    auto &[Depth, StackTrace] = StackTraces.emplace_back();
+    Depth = sys::getStackTrace(StackTrace);
+  }
+}
+void DbgLocOrigin::addTrace() {
+  if (StackTraces.empty())
+    return;
+  auto &[Depth, StackTrace] = StackTraces.emplace_back();
+  Depth = sys::getStackTrace(StackTrace);
+}
+} // namespace llvm
+#endif
+
 using namespace llvm;
 
 #if LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
 DILocAndCoverageTracking::DILocAndCoverageTracking(const DILocation *L)
-    : TrackingMDNodeRef(const_cast<DILocation *>(L)),
+    : TrackingMDNodeRef(const_cast<DILocation *>(L)), DbgLocOrigin(!L),
       Kind(DebugLocKind::Normal) {}
 #endif // LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING
 
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 109d516c61b7c..123bc7ecce01a 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -1375,7 +1375,7 @@ void Instruction::copyMetadata(const Instruction &SrcInst,
       setMetadata(MD.first, MD.second);
   }
   if (WL.empty() || WLS.count(LLVMContext::MD_dbg))
-    setDebugLoc(SrcInst.getDebugLoc());
+    setDebugLoc(SrcInst.getDebugLoc().getCopied());
 }
 
 Instruction *Instruction::clone() const {

@SLTozer SLTozer marked this pull request as ready for review June 11, 2025 10:53
@SLTozer SLTozer force-pushed the users/SLTozer/dl-coverage-origin-debugloc branch from 37c0c68 to 3d83059 Compare June 11, 2025 11:32
@SLTozer SLTozer force-pushed the users/SLTozer/dl-coverage-origin-symbolize branch 2 times, most recently from 12f5a10 to f47991b Compare June 11, 2025 12:13
@SLTozer SLTozer force-pushed the users/SLTozer/dl-coverage-origin-debugloc branch 2 times, most recently from 5e56291 to 2ff6e13 Compare June 12, 2025 15:21
@SLTozer SLTozer force-pushed the users/SLTozer/dl-coverage-origin-symbolize branch from f47991b to 622d1fb Compare June 12, 2025 15:21
Copy link
Member

@jmorse jmorse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good with some questions

Comment on lines +33 to +34
using StackTracesTy =
SmallVector<std::pair<int, std::array<void *, MaxDepth>>, 0>;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I right in reading this as a 16-wide array stored in a vector that's either zero-lengthed or one (given that on initialization we add a single element to the vector)? I feel there must be a better pattern than storing an array in a vector -- a potentially empty unique_ptr to a std::array, and allocate the array off the heap when it's needed?

Comment on lines +79 to +80
class DILocAndCoverageTracking : public TrackingMDNodeRef,
public DbgLocOrigin {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A reflex inside my head says "prefer composition to inheritance" -- is there a reason we need to inherit from DbgLocOrigin, instead of having it as a named member of DILocAndCoverageTracking?


namespace llvm {
DbgLocOrigin::DbgLocOrigin(bool ShouldCollectTrace) {
if (ShouldCollectTrace) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Early exit instead?

@SLTozer SLTozer force-pushed the users/SLTozer/dl-coverage-origin-symbolize branch from 622d1fb to 46c1770 Compare June 20, 2025 15:04
@SLTozer SLTozer force-pushed the users/SLTozer/dl-coverage-origin-debugloc branch from 2ff6e13 to 4410b5f Compare June 20, 2025 15:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants