Add TTI pass initialization to pass managers. #49
Conversation
 std::function<llvm::Error(llvm::Module *)>
-makeOptimizingTransformer(unsigned optLevel, unsigned sizeLevel);
+makeOptimizingTransformer(unsigned optLevel, unsigned sizeLevel,
+                          llvm::TargetMachine *targetMachine = nullptr);
I would not even have a default nullptr value here to avoid forgetting it at call-sites.
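For illustration, dropping the default would look like the following, and every call-site would then have to spell out the target machine explicitly (a sketch, not the committed code; tm is a hypothetical std::unique_ptr<llvm::TargetMachine> owned by the caller):

std::function<llvm::Error(llvm::Module *)>
makeOptimizingTransformer(unsigned optLevel, unsigned sizeLevel,
                          llvm::TargetMachine *targetMachine);

// Callers now have to pass something, even if it is just nullptr:
auto transformer = mlir::makeOptimizingTransformer(
    /*optLevel=*/3, /*sizeLevel=*/0, /*targetMachine=*/tm.get());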
Thanks! I had a similar patch locally that I didn't complete. Ultimately my take was that most of the code here could be moved directly into LLVM, as there isn't much specific to MLIR in the JIT abstraction we have in MLIR. In the meantime, this PR seems like a good thing to do.
Yeah, that would be great.
Do you think adding a utility for this ^ in a separate commit is also a good idea?
This is a suggestion for the squashed commit message (sorry, I didn't add it to the first commit). Thanks, Mehdi!
PR#49 (tensorflow#49) adds TTI pass initialization to the pass managers for a given target machine. However, this may cause an inconsistency if the target machine used by the JIT compiler is not the same as the one used by TTI. This PR allows passing a reference target machine to be used by the JIT compiler. If no reference target machine is provided, a default host target machine is used, including all its sub-target features.
dcaballe#1 is a stepping stone towards ^. Please, let me know what you think. Thanks!
You have conflicts now apparently, can you rebase?
Thanks! It seems a bit strange to me to manually unwrap a TM to build a TM-builder; I'm not sure why ORCJIT does not just take a
Done
Yeah, it would make more sense, since the TM seems to be needed before ORCJIT is created. There is sort of a chicken-and-egg problem, as we need the TM to properly initialize the TTI pass in the
Sure. Please, keep me posted and let me know if I can help. Diego
Hi Mehdi, do you think we could proceed with this PR in the meantime until things are moved to LLVM? That would be very helpful. Thanks,
lib/ExecutionEngine/OptUtils.cpp
Outdated
@@ -69,7 +71,8 @@ void mlir::initializeLLVMPasses() {
 // This behaves similarly to LLVM opt.
 static void populatePassManagers(llvm::legacy::PassManager &modulePM,
                                  llvm::legacy::FunctionPassManager &funcPM,
-                                 unsigned optLevel, unsigned sizeLevel) {
+                                 unsigned optLevel, unsigned sizeLevel,
+                                 llvm::TargetMachine *targetMachine = nullptr) {
There is a single call site, you should be able to skip the default value here
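For readers following the diff, a rough sketch (not necessarily the exact patch) of how the new targetMachine parameter can be used to register TTI with the legacy pass managers:

#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/Target/TargetMachine.h"
#include "llvm/Transforms/IPO/PassManagerBuilder.h"

static void populatePassManagers(llvm::legacy::PassManager &modulePM,
                                 llvm::legacy::FunctionPassManager &funcPM,
                                 unsigned optLevel, unsigned sizeLevel,
                                 llvm::TargetMachine *targetMachine) {
  if (targetMachine) {
    // Register the TTI analysis so target-aware passes (e.g. the
    // vectorizers) see the real sub-target instead of NoTTIImpl.
    modulePM.add(llvm::createTargetTransformInfoWrapperPass(
        targetMachine->getTargetIRAnalysis()));
    funcPM.add(llvm::createTargetTransformInfoWrapperPass(
        targetMachine->getTargetIRAnalysis()));
  }

  llvm::PassManagerBuilder builder;
  builder.OptLevel = optLevel;
  builder.SizeLevel = sizeLevel;
  builder.populateModulePassManager(modulePM);
  builder.populateFunctionPassManager(funcPM);
}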
-/// levels (e.g. -O2 or -Os).
+/// levels (e.g. -O2 or -Os). If not null, \p targetMachine is used to
+/// initialize passes that provide target-specific information to the LLVM
+/// optimizer.
Can you mention that the provided TM is expected to outlive the returned std::function?
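A possible wording for that note (assuming the returned callback captures the raw TargetMachine pointer), just as an example:

/// ... If not null, \p targetMachine is used to initialize passes that
/// provide target-specific information to the LLVM optimizer. The callback
/// keeps a pointer to \p targetMachine, so the target machine must outlive
/// the returned std::function.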
Sorry, I didn't mean to stall the current PR, of course. If you don't mind updating it with the two nits I just commented, this seems OK to merge.
(Sorry I wrote this comment the other day and didn't hit "submit review")
lib/ExecutionEngine/OptUtils.cpp
Outdated
@@ -127,7 +127,8 @@ std::function<llvm::Error(llvm::Module *)> mlir::makeLLVMPassesTransformer(
       continue;

     if (insertOptPasses && optPassesInsertPos == i) {
-      populatePassManagers(modulePM, funcPM, mbOptLevel.getValue(), 0);
+      populatePassManagers(modulePM, funcPM, mbOptLevel.getValue(), 0,
+                           nullptr /*TTI*/);
Shouldn't this function (mlir::makeLLVMPassesTransformer) also take a TM as argument?
Nit: the "standard" way of naming arguments is to prepend with the argument name /* targetMachine=*/ nullptr
(The nice thing is that clang-tidy would then warn when the name in the comment does not match the function declaration name for the parameter)
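If that direction is taken, the declaration might look something like this (a sketch; the actual parameter names and order in OptUtils.h may differ):

std::function<llvm::Error(llvm::Module *)>
makeLLVMPassesTransformer(llvm::ArrayRef<const llvm::PassInfo *> llvmPasses,
                          llvm::Optional<unsigned> mbOptLevel,
                          llvm::TargetMachine *targetMachine,
                          unsigned optPassesInsertPos = 0);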
Thanks, it makes sense. I didn't know that about clang-tidy. Good to know!
-auto transformer =
-    mlir::makeLLVMPassesTransformer(passes, optLevel, optPosition);
+auto transformer = mlir::makeLLVMPassesTransformer(
+    passes, optLevel, /*targetMachine=*/nullptr, optPosition);
Here, can (and should?) we do something like:
auto TMBuilder_or_error = llvm::orc::JITTargetMachineBuilder::detectHost();
if (!TMBuilder_or_error) {
  llvm::errs() << "Failed to create a JITTargetMachineBuilder for the host\n";
  return EXIT_FAILURE;
}
auto TM_or_error = TMBuilder_or_error->createTargetMachine();
if (!TM_or_error) {
  llvm::errs() << "Failed to create a TargetMachine for the host\n";
  return EXIT_FAILURE;
}
auto transformer = mlir::makeLLVMPassesTransformer(
    passes, optLevel, /*targetMachine=*/TM_or_error->get(), optPosition);
No strong opinion here, but maybe we should add this in a separate commit since it changes the runner's behavior. Also, I'm not sure how much we are getting from detectHost().createTargetMachine(), since this is not adding sub-target features. We would need something more in the direction of what I pointed out (dcaballe#1), but I understood you preferred that code in LLVM. Let me know if we can reuse part of that code; I could prepare another commit with that. WDYT?
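For illustration only, a hypothetical sketch of feeding the host CPU name and sub-target features into the JITTargetMachineBuilder before creating the TargetMachine (the helper name is made up, and header locations/APIs vary across LLVM versions):

#include "llvm/ADT/StringMap.h"
#include "llvm/ExecutionEngine/Orc/JITTargetMachineBuilder.h"
#include "llvm/Support/Host.h"
#include "llvm/Target/TargetMachine.h"

// Hypothetical helper: detect the host, then explicitly set the CPU name and
// sub-target features, which a bare detectHost() may not populate.
static llvm::Expected<std::unique_ptr<llvm::TargetMachine>>
createHostTargetMachineWithFeatures() {
  auto tmBuilderOrError = llvm::orc::JITTargetMachineBuilder::detectHost();
  if (!tmBuilderOrError)
    return tmBuilderOrError.takeError();
  tmBuilderOrError->setCPU(llvm::sys::getHostCPUName().str());
  llvm::StringMap<bool> hostFeatures;
  if (llvm::sys::getHostCPUFeatures(hostFeatures))
    for (const auto &feature : hostFeatures)
      tmBuilderOrError->getFeatures().AddFeature(feature.first(),
                                                 feature.second);
  return tmBuilderOrError->createTargetMachine();
}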
Isn't most of the TLI independent of sub-target features?
In any case, adding sub-target features seems like the kind of thing that detectHost().createTargetMachine() should take care of, don't you think?
Vectorization is sub-target dependent, for example.
I agree with you but I feel there must be a reason why it's not there already :). Let me try to add it to LLVM and I'll create another PR to update jit runner if you don't beat me to it.
Many LLVM transformations benefit from knowing the target. This enables optimizations, especially in a JIT context where the target is (generally) well known. Closes #49 COPYBARA_INTEGRATE_REVIEW=tensorflow/mlir#49 from dcaballe:dcaballe/tti ab02f72eb326f660945696e5dadeeb983cf263b3 PiperOrigin-RevId: 261840617
Hi!
This PR is mostly to ask for feedback :). We are using mlir::ExecutionEngine to JIT-compile and execute nGraph ops on CPUs and noticed that the LLVM optimizer wasn't getting the proper target information because the TTI implementation was initialized with NoTTIImpl. This happens because TargetTransformInfoWrapperPass is not added to the pass managers. I think there are a couple of ways to fix this:
1. Add TargetTransformInfoWrapperPass, initialized with the proper TargetMachine, to the pass managers. I tested it by passing a TargetMachine with all the default host sub-target features and a specific cpu name (skylake-avx512). However, I must be missing something, since I'm getting skylake-avx512 assembly with this approach even though the target machine used by OrcJit doesn't have any sub-target features/cpu name set: https://github.com/tensorflow/mlir/blob/master/lib/ExecutionEngine/ExecutionEngine.cpp#L150
2. Replicate mlir::makeLLVMPassesTransformer (https://github.com/tensorflow/mlir/blob/master/lib/ExecutionEngine/OptUtils.cpp#L103) in nGraph to add TargetTransformInfoWrapperPass to the pass managers.
I see value in adding the TargetTransformInfoWrapperPass pass in the MLIR repo so that other users don't hit the same issue or have to replicate the same code. I also see value in having a utility in the MLIR repo that creates a TargetMachine with the default host cpu parameters (cpu name, sub-target features, etc.) and reusing it every time it is needed, for example in the OrcJIT code I pointed to before and also here: https://github.com/tensorflow/mlir/blob/master/lib/ExecutionEngine/ExecutionEngine.cpp#L150 (a rough sketch of such a utility follows below).
Please, let me know what you think. I should be able to help with this.
Thanks!
Diego
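As a rough illustration of that host-TargetMachine utility, a hypothetical sketch (the helper name, header paths, and the exact createTargetMachine overload are assumptions and differ across LLVM versions):

#include "llvm/ADT/Optional.h"
#include "llvm/ADT/StringMap.h"
#include "llvm/MC/SubtargetFeature.h"
#include "llvm/Support/Host.h"
#include "llvm/Support/TargetRegistry.h"
#include "llvm/Target/TargetMachine.h"
#include "llvm/Target/TargetOptions.h"

// Hypothetical helper: build a TargetMachine matching the host, including
// its CPU name and sub-target features. The native target must have been
// initialized (e.g. via llvm::InitializeNativeTarget()) beforehand.
static llvm::TargetMachine *createDefaultHostTargetMachine() {
  std::string triple = llvm::sys::getDefaultTargetTriple();
  std::string error;
  const llvm::Target *target =
      llvm::TargetRegistry::lookupTarget(triple, error);
  if (!target)
    return nullptr;

  // Collect the host's sub-target features (e.g. AVX-512 on skylake-avx512).
  llvm::SubtargetFeatures features;
  llvm::StringMap<bool> hostFeatures;
  if (llvm::sys::getHostCPUFeatures(hostFeatures))
    for (const auto &f : hostFeatures)
      features.AddFeature(f.first(), f.second);

  return target->createTargetMachine(triple, llvm::sys::getHostCPUName(),
                                     features.getString(),
                                     llvm::TargetOptions(),
                                     /*RM=*/llvm::None);
}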