This repository was archived by the owner on Apr 2, 2025. It is now read-only.

Commit 1d908a3

add discussion of known issues in release notes.
1 parent be881d9 commit 1d908a3

1 file changed: +32 -0 lines changed

README.ReleaseNotes

Lines changed: 32 additions & 0 deletions
@@ -239,6 +239,38 @@ bug prevented using hpcrun to profile programs launched with shell scripts.
 -- fixed bug in hpcstruct in getRealPath() that caused hpcstruct
 to sometimes report incorrect file names.

+
+Known issues:
+
+When profiling optimized code with HPCToolkit, one may find that a program
+generates a significant number of "partial unwinds" where the call stack
+can't be unwound all the way up to main. This happens more commonly on
+x86-64 architectures than on PowerPC and ARM. A large number of partial
+unwinds may make it harder to use the top-down calling context view in
+hpcviewer, which works best when call stacks unwind all the way up to main.
+Even with significant numbers of partial unwinds, the bottom-up caller's
+view and the flat view in hpcviewer can be used effectively for analyzing
+performance. Ongoing work aims to improve call stack unwinding of
+optimized code by employing compiler-generated unwinding information
+where available in addition to using binary analysis to discover unwinding
+recipes.
+
+On x86-64, hpcfnbounds is occasionally too aggressive about inferring the
+presence of stripped functions in optimized programs. We have noticed
+this particularly for optimized Fortran. This can cause "partial unwinds",
+where a call stack can't be unwound fully up to main. Improving this
+analysis is the subject of ongoing work.
+
+When HPCToolkit's ompt branch is used with the LLVM OpenMP runtime's OMPT
+support, measurements of programs compiled with GCC sometimes reveal
+implementation-level stack frames that belong to the OpenMP runtime
+system. This will improve with the transition of HPCToolkit and the LLVM
+OpenMP runtime to the new OMPT ABI designed for OpenMP 5.0. This
+transition should occur over the next 6 months. In the meantime, there is
+nothing wrong with the quality of the information collected. The only
+problem is that HPCToolkit's measurements reveal more of an
+implementation-level view of OpenMP than intended.
+
 ----------------------------------------
 HPCToolkit Version 2016.12, Dec 2016
 ----------------------------------------
