Commit 11eb9de

[llvm-debuginfo-analyzer] Add support for LLVM IR format.
Add support for the LLVM IR format so that logical views can be generated from it. Both the textual representation (.ll) and the bitcode (.bc) formats are supported. Note: this patch requires "Add DebugSSAUpdater class to track debug value liveness" (#135349).
1 parent 6f7e498 commit 11eb9de

35 files changed, 5025 additions and 283 deletions

llvm/docs/CommandGuide/llvm-debuginfo-analyzer.rst

Lines changed: 165 additions & 4 deletions
@@ -13,10 +13,11 @@ SYNOPSIS
1313
DESCRIPTION
1414
-----------
1515
:program:`llvm-debuginfo-analyzer` parses debug and text sections in
16-
binary object files and prints their contents in a logical view, which
17-
is a human readable representation that closely matches the structure
18-
of the original user source code. Supported object file formats include
19-
ELF, Mach-O, WebAssembly, PDB and COFF.
16+
binary object files and textual IR representations and prints their
17+
contents in a logical view, which is a human readable representation
18+
that closely matches the structure of the original user source code.
19+
Supported object file formats include ELF, Mach-O, WebAssembly, PDB,
20+
COFF, and IR (textual representation and bitcode).
2021

2122
The **logical view** abstracts the complexity associated with the
2223
different low-level representations of the debugging information that
@@ -2124,6 +2125,138 @@ layout and given the number of matches.
21242125
-----------------------------
21252126
Total 71 8
21262127
2128+
IR (Textual representation and bitcode) SUPPORT
2129+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2130+
The example below is used to show the IR output generated by
2131+
:program:`llvm-debuginfo-analyzer`. We compiled the example for a
2132+
64-bit x86 target with Clang (-O0 -g --target=x86_64-linux), producing LLVM IR:
2133+
2134+
.. code-block:: c++
2135+
2136+
1 using INTPTR = const int *;
2137+
2 int foo(INTPTR ParamPtr, unsigned ParamUnsigned, bool ParamBool) {
2138+
3 if (ParamBool) {
2139+
4 typedef int INTEGER;
2140+
5 const INTEGER CONSTANT = 7;
2141+
6 return CONSTANT;
2142+
7 }
2143+
8 return ParamUnsigned;
2144+
9 }
2145+
2146+
PRINT BASIC DETAILS
2147+
^^^^^^^^^^^^^^^^^^^
2148+
The following command prints basic details for all the logical elements
2149+
sorted by the debug information internal offset; it includes its lexical
2150+
level and debug info format.
2151+
2152+
.. code-block:: none
2153+
2154+
llvm-debuginfo-analyzer --attribute=level,format
2155+
--output-sort=offset
2156+
--print=scopes,symbols,types,lines,instructions
2157+
test-clang.ll
2158+
2159+
or
2160+
2161+
.. code-block:: none
2162+
2163+
llvm-debuginfo-analyzer --attribute=level,format
2164+
--output-sort=offset
2165+
--print=elements
2166+
test-clang.ll
2167+
2168+
Each row represents an element that is present within the debug
2169+
information. The first column represents the scope level, followed by
2170+
the associated line number (if any), and finally the description of
2171+
the element.
2172+
2173+
.. code-block:: none
2174+
2175+
Logical View:
2176+
[000] {File} 'test-clang.ll' -> Textual IR
2177+
2178+
[001] {CompileUnit} 'test.cpp'
2179+
[002] 2 {Function} extern not_inlined 'foo' -> 'int'
2180+
[003] {Block}
2181+
[004] 5 {Variable} 'CONSTANT' -> 'const INTEGER'
2182+
[004] 5 {Line}
2183+
[004] {Code} 'store i32 7, ptr %CONSTANT, align 4, !dbg !32'
2184+
[004] 6 {Line}
2185+
[004] {Code} 'store i32 7, ptr %retval, align 4, !dbg !33'
2186+
[004] 6 {Line}
2187+
[004] {Code} 'br label %return, !dbg !33'
2188+
[003] 2 {Parameter} 'ParamPtr' -> 'INTPTR'
2189+
[003] 2 {Parameter} 'ParamUnsigned' -> 'unsigned int'
2190+
[003] 2 {Parameter} 'ParamBool' -> 'bool'
2191+
[003] 4 {TypeAlias} 'INTEGER' -> 'int'
2192+
[003] 2 {Line}
2193+
[003] {Code} '%retval = alloca i32, align 4'
2194+
[003] {Code} '%ParamPtr.addr = alloca ptr, align 8'
2195+
[003] {Code} '%ParamUnsigned.addr = alloca i32, align 4'
2196+
[003] {Code} '%ParamBool.addr = alloca i8, align 1'
2197+
[003] {Code} '%CONSTANT = alloca i32, align 4'
2198+
[003] {Code} 'store ptr %ParamPtr, ptr %ParamPtr.addr, align 8'
2199+
[003] {Code} 'store i32 %ParamUnsigned, ptr %ParamUnsigned.addr, align 4'
2200+
[003] {Code} '%storedv = zext i1 %ParamBool to i8'
2201+
[003] {Code} 'store i8 %storedv, ptr %ParamBool.addr, align 1'
2202+
[003] 8 {Line}
2203+
[003] {Code} '%1 = load i32, ptr %ParamUnsigned.addr, align 4, !dbg !34'
2204+
[003] 8 {Line}
2205+
[003] {Code} 'store i32 %1, ptr %retval, align 4, !dbg !35'
2206+
[003] 8 {Line}
2207+
[003] {Code} 'br label %return, !dbg !35'
2208+
[003] 9 {Line}
2209+
[003] {Code} '%2 = load i32, ptr %retval, align 4, !dbg !36'
2210+
[003] 9 {Line}
2211+
[003] {Code} 'ret i32 %2, !dbg !36'
2212+
[003] 3 {Line}
2213+
[003] 3 {Line}
2214+
[003] 3 {Line}
2215+
[003] {Code} 'br i1 %loadedv, label %if.then, label %if.end, !dbg !26'
2216+
[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
2217+
2218+
SELECT LOGICAL ELEMENTS
2219+
^^^^^^^^^^^^^^^^^^^^^^^
2220+
The following prints all *instructions*, *symbols* and *types* that
2221+
contain **'LOAD'** or **'store'** (ignoring case) in their names or types, using a tab
2222+
layout and giving the number of matches.
2223+
2224+
.. code-block:: none
2225+
2226+
llvm-debuginfo-analyzer --attribute=level
2227+
--select-nocase --select-regex
2228+
--select=LOAD --select=store
2229+
--report=list
2230+
--print=symbols,types,instructions,summary
2231+
test-clang.ll
2232+
2233+
Logical View:
2234+
[000] {File} 'test-clang.ll'
2235+
2236+
[001] {CompileUnit} 'test.cpp'
2237+
[003] {Code} '%0 = load i8, ptr %ParamBool.addr, align 1, !dbg !26'
2238+
[003] {Code} '%1 = load i32, ptr %ParamUnsigned.addr, align 4, !dbg !34'
2239+
[003] {Code} '%2 = load i32, ptr %retval, align 4, !dbg !36'
2240+
[004] {Code} '%loadedv = trunc i8 %0 to i1, !dbg !26'
2241+
[003] {Code} '%storedv = zext i1 %ParamBool to i8'
2242+
[003] {Code} 'br i1 %loadedv, label %if.then, label %if.end, !dbg !26'
2243+
[003] {Code} 'store i32 %1, ptr %retval, align 4, !dbg !35'
2244+
[003] {Code} 'store i32 %ParamUnsigned, ptr %ParamUnsigned.addr, align 4'
2245+
[004] {Code} 'store i32 7, ptr %CONSTANT, align 4, !dbg !32'
2246+
[004] {Code} 'store i32 7, ptr %retval, align 4, !dbg !33'
2247+
[003] {Code} 'store i8 %storedv, ptr %ParamBool.addr, align 1'
2248+
[003] {Code} 'store ptr %ParamPtr, ptr %ParamPtr.addr, align 8'
2249+
2250+
-----------------------------
2251+
Element Total Printed
2252+
-----------------------------
2253+
Scopes 5 0
2254+
Symbols 4 0
2255+
Types 2 0
2256+
Lines 22 12
2257+
-----------------------------
2258+
Total 33 12
2259+
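The selection step above can be approximated outside the tool. The following is a minimal sketch, assuming that ``--select-nocase --select-regex`` amounts to a case-insensitive regular-expression search over each element's text. ``selectNoCase`` is a hypothetical helper, not part of LLVM, and ``std::regex`` stands in for whatever matcher the tool actually uses.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not an LLVM API): returns the entries of Names that
// match at least one of Patterns, ignoring case. Input order is preserved.
std::vector<std::string>
selectNoCase(const std::vector<std::string> &Names,
             const std::vector<std::string> &Patterns) {
  // Compile every pattern once, with the case-insensitive flag set.
  std::vector<std::regex> Compiled;
  for (const std::string &Pattern : Patterns)
    Compiled.emplace_back(Pattern,
                          std::regex::ECMAScript | std::regex::icase);

  std::vector<std::string> Selected;
  for (const std::string &Name : Names) {
    for (const std::regex &RE : Compiled) {
      // regex_search matches anywhere in the string, so '%loadedv' is
      // selected by the pattern 'LOAD'.
      if (std::regex_search(Name, RE)) {
        Selected.push_back(Name);
        break; // Report each element at most once.
      }
    }
  }
  return Selected;
}
```

With the patterns ``LOAD`` and ``store``, this sketch selects ``%storedv = zext i1 %ParamBool to i8`` because the value name contains ``store``, which is consistent with the ``zext`` and ``trunc`` rows in the listing above.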
21272260
COMPARISON MODE
21282261
^^^^^^^^^^^^^^^
21292262
Given the previous example we found the above debug information issue
@@ -2197,6 +2330,34 @@ giving more context by swapping the reference and target object files.
21972330
The output shows the merging view path (reference and target) with the
21982331
missing and added elements.
21992332

2333+
.. code-block:: none
2334+
2335+
llvm-debuginfo-analyzer --attribute=level,format
2336+
--compare=types
2337+
--report=view
2338+
--print=symbols,types
2339+
test-clang.bc test-dwarf-gcc.o
2340+
2341+
Reference: 'test-clang.bc'
2342+
Target: 'test-dwarf-gcc.o'
2343+
2344+
Logical View:
2345+
[000] {File} 'test-clang.bc' -> Bitcode IR
2346+
2347+
[001] {CompileUnit} 'test.cpp'
2348+
[002] 1 {TypeAlias} 'INTPTR' -> '* const int'
2349+
[002] 2 {Function} extern not_inlined 'foo' -> 'int'
2350+
[003] {Block}
2351+
[004] 5 {Variable} 'CONSTANT' -> 'const INTEGER'
2352+
+[004] 4 {TypeAlias} 'INTEGER' -> 'int'
2353+
[003] 2 {Parameter} 'ParamBool' -> 'bool'
2354+
[003] 2 {Parameter} 'ParamPtr' -> 'INTPTR'
2355+
[003] 2 {Parameter} 'ParamUnsigned' -> 'unsigned int'
2356+
-[003] 4 {TypeAlias} 'INTEGER' -> 'int'
2357+
2358+
This is the same comparison, but this time comparing the Clang bitcode with the
2359+
binary object (DWARF) generated by GCC.
2360+
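The ``+`` and ``-`` markers in the view above can be read as two set differences. The following is a simplified sketch, not the tool's comparison algorithm: ``Element``, ``Diff`` and ``compareElements`` are hypothetical names, and the sketch assumes that the lexical level is part of an element's identity, which would explain why ``INTEGER`` appears once as missing (level 3) and once as added (level 4).

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, simplified stand-in for a logical element.
struct Element {
  unsigned Level;
  unsigned Line;
  std::string Kind;
  std::string Name;

  // Order by all fields so that std::set and std::set_difference agree.
  bool operator<(const Element &Other) const {
    return std::tie(Level, Line, Kind, Name) <
           std::tie(Other.Level, Other.Line, Other.Kind, Other.Name);
  }
};

struct Diff {
  std::vector<Element> Missing; // In the reference only (printed with '-').
  std::vector<Element> Added;   // In the target only (printed with '+').
};

// Computes which elements are missing from, or added to, the target.
Diff compareElements(const std::set<Element> &Reference,
                     const std::set<Element> &Target) {
  Diff Result;
  std::set_difference(Reference.begin(), Reference.end(), Target.begin(),
                      Target.end(), std::back_inserter(Result.Missing));
  std::set_difference(Target.begin(), Target.end(), Reference.begin(),
                      Reference.end(), std::back_inserter(Result.Added));
  return Result;
}
```

Swapping the two arguments swaps the ``Missing`` and ``Added`` lists, which corresponds to swapping the reference and target files on the command line.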
22002361
LOGICAL ELEMENTS
22012362
""""""""""""""""
22022363
It compares individual logical elements without considering if their

llvm/include/llvm/DebugInfo/LogicalView/Core/LVReader.h

Lines changed: 25 additions & 1 deletion
@@ -56,7 +56,7 @@ class LVSplitContext final {
5656

5757
/// The logical reader owns all the logical elements created during
5858
/// the debug information parsing. For its creation it uses a specific
59-
/// bump allocator for each type of logical element.
59+
/// bump allocator for each type of logical element.
6060
class LVReader {
6161
LVBinaryType BinaryType;
6262

@@ -121,7 +121,24 @@ class LVReader {
121121

122122
#undef LV_OBJECT_ALLOCATOR
123123

124+
// Scopes with ranges for current compile unit. It is used to find a line
125+
// giving its exact or closest address. To support comdat functions, all
126+
// addresses for the same section are recorded in the same map.
127+
using LVSectionRanges = std::map<LVSectionIndex, std::unique_ptr<LVRange>>;
128+
LVSectionRanges SectionRanges;
129+
124130
protected:
131+
// Current elements during the processing of a DIE/MDNode.
132+
LVElement *CurrentElement = nullptr;
133+
LVScope *CurrentScope = nullptr;
134+
LVSymbol *CurrentSymbol = nullptr;
135+
LVType *CurrentType = nullptr;
136+
LVLine *CurrentLine = nullptr;
137+
LVOffset CurrentOffset = 0;
138+
139+
// Address ranges collected for current DIE/MDNode/AST Node.
140+
std::vector<LVAddressRange> CurrentRanges;
141+
125142
LVScopeRoot *Root = nullptr;
126143
std::string InputFilename;
127144
std::string FileFormatName;
@@ -132,11 +149,18 @@ class LVReader {
132149
// Only for ELF format. The CodeView is handled in a different way.
133150
LVSectionIndex DotTextSectionIndex = UndefinedSectionIndex;
134151

152+
void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope);
153+
void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope,
154+
LVAddress LowerAddress, LVAddress UpperAddress);
155+
LVRange *getSectionRanges(LVSectionIndex SectionIndex);
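The idea of keeping one range table per section, as the ``SectionRanges`` comment describes, can be illustrated with standard containers. This is only a sketch under stated assumptions: ``AddressRange`` and ``SectionRangesSketch`` are hypothetical types, upper bounds are treated as inclusive, and the linear lookup does not reflect how ``LVRange`` is implemented.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scope's address range.
struct AddressRange {
  uint64_t Lower;
  uint64_t Upper; // Assumed inclusive in this sketch.
  std::string ScopeName;
};

class SectionRangesSketch {
  // One list of ranges per section index, so equal addresses that belong
  // to different sections (for example comdat functions) do not collide.
  std::map<uint64_t, std::vector<AddressRange>> Ranges;

public:
  void addSectionRange(uint64_t SectionIndex, std::string ScopeName,
                       uint64_t Lower, uint64_t Upper) {
    Ranges[SectionIndex].push_back({Lower, Upper, std::move(ScopeName)});
  }

  // Returns the smallest recorded range in the section that contains
  // Address, or nullptr if the section or address is unknown.
  const AddressRange *find(uint64_t SectionIndex, uint64_t Address) const {
    auto It = Ranges.find(SectionIndex);
    if (It == Ranges.end())
      return nullptr;
    const AddressRange *Best = nullptr;
    for (const AddressRange &Range : It->second) {
      if (Address < Range.Lower || Address > Range.Upper)
        continue;
      if (!Best || Range.Upper - Range.Lower < Best->Upper - Best->Lower)
        Best = &Range;
    }
    return Best;
  }
};
```

Moving such a table into the base reader, as this change does for ``SectionRanges``, lets readers that do not derive from ``LVBinaryReader`` share it.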
156+
135157
// Record Compilation Unit entry.
136158
void addCompileUnitOffset(LVOffset Offset, LVScopeCompileUnit *CompileUnit) {
137159
CompileUnits.emplace(Offset, CompileUnit);
138160
}
139161

162+
LVElement *createElement(dwarf::Tag Tag);
163+
140164
// Create the Scope Root.
141165
virtual Error createScopes() {
142166
Root = createScopeRoot();

llvm/include/llvm/DebugInfo/LogicalView/Core/LVSupport.h

Lines changed: 13 additions & 0 deletions
@@ -99,6 +99,19 @@ template <typename T> class LVProperties {
9999
#define KIND_3(ENUM, FIELD, F1, F2, F3) \
100100
BOOL_BIT_3(Kinds, ENUM, FIELD, F1, F2, F3)
101101

102+
const int DEC_WIDTH = 8;
103+
inline FormattedNumber decValue(uint64_t N, unsigned Width = DEC_WIDTH) {
104+
return format_decimal(N, Width);
105+
}
106+
107+
// Output the decimal representation of 'Value'.
108+
inline std::string decString(uint64_t Value, size_t Width = DEC_WIDTH) {
109+
std::string String;
110+
raw_string_ostream Stream(String);
111+
Stream << decValue(Value, Width);
112+
return Stream.str();
113+
}
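For readers without an LLVM build at hand, the effect of ``decString`` can be approximated with the standard library. This is a rough sketch, assuming ``format_decimal`` right-justifies the number and pads it with spaces to the requested width. ``decStringSketch`` is a hypothetical name and is not part of the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical standard-library approximation of decString: formats Value
// in decimal, right-justified and space-padded to at least Width characters.
std::string decStringSketch(uint64_t Value, std::size_t Width = 8) {
  std::ostringstream Stream;
  // std::setw only sets a minimum width; longer numbers are not truncated.
  Stream << std::setw(static_cast<int>(Width)) << Value;
  return Stream.str();
}
```

A fixed default width keeps columns of decimal values aligned in the printed view, in the same way ``HEX_WIDTH`` does for hexadecimal values.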
114+
102115
const int HEX_WIDTH = 12;
103116
inline FormattedNumber hexValue(uint64_t N, unsigned Width = HEX_WIDTH,
104117
bool Upper = false) {

llvm/include/llvm/DebugInfo/LogicalView/LVReaderHandler.h

Lines changed: 9 additions & 3 deletions
@@ -17,6 +17,7 @@
1717
#include "llvm/DebugInfo/LogicalView/Core/LVReader.h"
1818
#include "llvm/DebugInfo/PDB/Native/PDBFile.h"
1919
#include "llvm/Object/Archive.h"
20+
#include "llvm/Object/IRObjectFile.h"
2021
#include "llvm/Object/MachOUniversal.h"
2122
#include "llvm/Object/ObjectFile.h"
2223
#include "llvm/Support/MemoryBuffer.h"
@@ -29,7 +30,9 @@ namespace logicalview {
2930

3031
using LVReaders = std::vector<std::unique_ptr<LVReader>>;
3132
using ArgVector = std::vector<std::string>;
32-
using PdbOrObj = PointerUnion<object::ObjectFile *, pdb::PDBFile *>;
33+
using PdbOrObjOrIr =
34+
PointerUnion<object::ObjectFile *, pdb::PDBFile *, object::IRObjectFile *,
35+
MemoryBufferRef *, StringRef *>;
3336

3437
// This class performs the following tasks:
3538
// - Creates a logical reader for every binary file in the command line,
@@ -60,9 +63,12 @@ class LVReaderHandler {
6063
object::Binary &Binary);
6164
Error handleObject(LVReaders &Readers, StringRef Filename, StringRef Buffer,
6265
StringRef ExePath);
66+
Error handleObject(LVReaders &Readers, StringRef Filename,
67+
MemoryBufferRef Buffer);
6368

64-
Error createReader(StringRef Filename, LVReaders &Readers, PdbOrObj &Input,
65-
StringRef FileFormatName, StringRef ExePath = {});
69+
Error createReader(StringRef Filename, LVReaders &Readers,
70+
PdbOrObjOrIr &Input, StringRef FileFormatName,
71+
StringRef ExePath = {});
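The widened ``PdbOrObjOrIr`` union means ``createReader`` has to branch on which kind of input it was given. The sketch below shows that dispatch pattern with ``std::variant`` standing in for ``llvm::PointerUnion``. All the ``*Stub`` types and ``describeInput`` are hypothetical, and the mapping from input kind to reader is not taken from the patch.

```cpp
#include <string>
#include <variant>

// Hypothetical empty stand-ins for the real LLVM types.
struct ObjectFileStub {};
struct PDBFileStub {};
struct IRObjectFileStub {};
struct MemoryBufferRefStub {};

// std::variant of pointers used here in place of llvm::PointerUnion.
using InputSketch =
    std::variant<ObjectFileStub *, PDBFileStub *, IRObjectFileStub *,
                 MemoryBufferRefStub *, std::string *>;

// Reports which alternative is active; a real handler would construct the
// matching reader at each branch instead of returning a label.
std::string describeInput(const InputSketch &Input) {
  if (std::holds_alternative<ObjectFileStub *>(Input))
    return "object file";
  if (std::holds_alternative<PDBFileStub *>(Input))
    return "PDB file";
  if (std::holds_alternative<IRObjectFileStub *>(Input))
    return "IR object file";
  if (std::holds_alternative<MemoryBufferRefStub *>(Input))
    return "memory buffer";
  return "string";
}
```

Each alternative is a distinct pointer type, so the active member alone identifies the input kind and no separate tag is needed.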
6672

6773
public:
6874
LVReaderHandler() = delete;

llvm/include/llvm/DebugInfo/LogicalView/Readers/LVBinaryReader.h

Lines changed: 1 addition & 11 deletions
@@ -25,6 +25,7 @@
2525
#include "llvm/MC/MCSubtargetInfo.h"
2626
#include "llvm/MC/TargetRegistry.h"
2727
#include "llvm/Object/COFF.h"
28+
#include "llvm/Object/IRObjectFile.h"
2829
#include "llvm/Object/ObjectFile.h"
2930

3031
namespace llvm {
@@ -93,12 +94,6 @@ class LVBinaryReader : public LVReader {
9394
SectionAddresses.emplace(Section.getAddress(), Section);
9495
}
9596

96-
// Scopes with ranges for current compile unit. It is used to find a line
97-
// giving its exact or closest address. To support comdat functions, all
98-
// addresses for the same section are recorded in the same map.
99-
using LVSectionRanges = std::map<LVSectionIndex, std::unique_ptr<LVRange>>;
100-
LVSectionRanges SectionRanges;
101-
10297
// Image base and virtual address for Executable file.
10398
uint64_t ImageBaseAddress = 0;
10499
uint64_t VirtualAddress = 0;
@@ -179,11 +174,6 @@ class LVBinaryReader : public LVReader {
179174
Expected<std::pair<LVSectionIndex, object::SectionRef>>
180175
getSection(LVScope *Scope, LVAddress Address, LVSectionIndex SectionIndex);
181176

182-
void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope);
183-
void addSectionRange(LVSectionIndex SectionIndex, LVScope *Scope,
184-
LVAddress LowerAddress, LVAddress UpperAddress);
185-
LVRange *getSectionRanges(LVSectionIndex SectionIndex);
186-
187177
void includeInlineeLines(LVSectionIndex SectionIndex, LVScope *Function);
188178

189179
Error createInstructions();

llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h

Lines changed: 0 additions & 10 deletions
@@ -39,22 +39,13 @@ class LVDWARFReader final : public LVBinaryReader {
3939
LVAddress CUBaseAddress = 0;
4040
LVAddress CUHighAddress = 0;
4141

42-
// Current elements during the processing of a DIE.
43-
LVElement *CurrentElement = nullptr;
44-
LVScope *CurrentScope = nullptr;
45-
LVSymbol *CurrentSymbol = nullptr;
46-
LVType *CurrentType = nullptr;
47-
LVOffset CurrentOffset = 0;
4842
LVOffset CurrentEndOffset = 0;
4943

5044
// In DWARF v4, the files are 1-indexed.
5145
// In DWARF v5, the files are 0-indexed.
5246
// The DWARF reader expects the indexes as 1-indexed.
5347
bool IncrementFileIndex = false;
5448

55-
// Address ranges collected for current DIE.
56-
std::vector<LVAddressRange> CurrentRanges;
57-
5849
// Symbols with locations for current compile unit.
5950
LVSymbols SymbolsWithLocations;
6051

@@ -82,7 +73,6 @@ class LVDWARFReader final : public LVBinaryReader {
8273

8374
void mapRangeAddress(const object::ObjectFile &Obj) override;
8475

85-
LVElement *createElement(dwarf::Tag Tag);
8676
void traverseDieAndChildren(DWARFDie &DIE, LVScope *Parent,
8777
DWARFDie &SkeletonDie);
8878
// Process the attributes for the given DIE.
