@@ -239,6 +239,38 @@ bug prevented using hpcrun to profile programs launched with shell scripts.
-- fixed bug in hpcstruct in getRealPath() that caused hpcstruct
to sometimes report incorrect file names.

+
+ Known issues:
+
+ When profiling optimized code with HPCToolkit, one may find that a program
+ generates a significant number of "partial unwinds" where the call stack
+ can't be unwound all the way up to main. This happens more commonly on
+ x86-64 architectures than on PowerPC and ARM. A large number of partial
+ unwinds may make it harder to use the top-down calling context view in
+ hpcviewer, which works best when call stacks unwind all the way up to main.
+ Even with significant numbers of partial unwinds, the bottom-up caller's
+ view and the flat view in hpcviewer can be used effectively for analyzing
+ performance. Ongoing work aims to improve call stack unwinding of
+ optimized code by employing compiler-generated unwinding information
+ where available in addition to using binary analysis to discover unwinding
+ recipes.
+
+ On x86-64, hpcfnbounds is occasionally too aggressive about inferring the
+ presence of stripped functions in optimized programs. We have noticed
+ this particularly for optimized Fortran. This can cause "partial unwinds",
+ where a call stack can't be unwound fully up to main. Improving this
+ analysis is the subject of ongoing work.
+
+ When HPCToolkit's ompt branch is used with the LLVM OpenMP runtime's OMPT
+ support, measurements of programs compiled with GCC sometimes
+ reveal implementation-level stack frames that belong to the OpenMP
+ runtime system. This will improve with the transition of HPCToolkit
+ and the LLVM OpenMP runtime to the new OMPT ABI designed for OpenMP
+ 5.0. This transition should occur over the next 6 months. In the
+ meantime, there is nothing wrong with the quality of the information
+ collected. The only problem is that HPCToolkit's measurements reveal more
+ of an implementation-level view of OpenMP than intended.
+
----------------------------------------
HPCToolkit Version 2016.12, Dec 2016
----------------------------------------