Remove maxPassCount from NVPW_RawMetricsConfig_BeginPassGroup_Params Initialization #303
Pull Request Description
With CUDA versions 12.4.1 and 12.5.1, running `./papi_native_avail` did not report the correct number of passes for events that require multiple passes (> 1); these events instead showed `Numpass=0`.

[Example output from an A100 not preserved]

This occurred in the function `calculate_num_passes`, where `maxPassCount` was set to 1 in `NVPW_RawMetricsConfig_BeginPassGroup_Params`. This PR removes `maxPassCount` from `calculate_num_passes` and adds further documentation on the behavior of `maxPassCount`.

Note: I was able to recreate this behavior with the `simpleQuery.cpp` script from NVIDIA using CUDA 12.5.1.

Output for `papi_native_avail` with `maxPassCount` removed from `calculate_num_passes`:

[Event output from the A100 not preserved]
Author Checklist
Why this PR exists. Reference all relevant information, including background, issues, test failures, etc.
Commits are self contained and only do one thing
Commits have a header of the form: `module: short description`
Commits have a body (whenever relevant) containing a detailed description of the addressed problem and its solution
The PR needs to pass all the tests