-
Notifications
You must be signed in to change notification settings - Fork 752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pass -foffload-lto instead of -flto for cuda/hip kernels in clangLinkerWrapper #16605
Pass -foffload-lto instead of -flto for cuda/hip kernels in clangLinkerWrapper #16605
Conversation
5b77ca9
to
d24ae0e
Compare
d24ae0e
to
8615f88
Compare
8615f88
to
6c7fbf7
Compare
6c7fbf7
to
da775da
Compare
da775da
to
633d90b
Compare
633d90b
to
ccf221a
Compare
for (auto &Arg : Args.filtered(OPT_offload_opt_eq_minus, OPT_mllvm)) | ||
CmdArgs.append( | ||
{"-Xlinker", | ||
Args.MakeArgString("--plugin-opt=" + StringRef(Arg->getValue()))}); | ||
|
||
if (Triple.isNVPTX() || Triple.isAMDGPU()) { | ||
CmdArgs.push_back("-foffload-lto"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this go upstream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on adding this change to upstream code. I am also wondering why it is not an issue in the upstream compiler and why we see it only with SYCL offloading.
Also @omarahmed1111 can you please comment on the test failure?
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test failure for loop_extended
is a flaky failure not related to this PR, the failed test disabled on this PR: #16612 . I updated the PR, so should be solved now.
for (auto &Arg : Args.filtered(OPT_offload_opt_eq_minus, OPT_mllvm)) | ||
CmdArgs.append( | ||
{"-Xlinker", | ||
Args.MakeArgString("--plugin-opt=" + StringRef(Arg->getValue()))}); | ||
|
||
if (Triple.isNVPTX() || Triple.isAMDGPU()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there an upstream pr for this based on previous feedback?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we are going to do this after merging this. I will open one for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok as long as you'll make an upstream PR lgtm to merge here first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm assuming future upstream pr
@intel/llvm-gatekeepers please merge, Thanks! |
@sarnex llvm upstream PR: llvm/llvm-project#125243 |
ClangLinkerWrapper tool in one of its clang commands to generate ptx kernel binary from llvm bitcode kernel was using
-flto
option which should be only used for cpu code not gpu kernel code. This PR fixes that by changing that to-foffload-lto
for cuda/hip kernels.This fixes 16413 issue.