This repository has been archived by the owner on Apr 23, 2021. It is now read-only.

Add TTI pass initialization to pass managers. #49

Closed
wants to merge 10 commits

Conversation

dcaballe
Contributor

Hi!

This PR is mostly to ask for feedback :). We are using mlir::ExecutionEngine to JIT-compile and execute nGraph ops on CPUs and noticed that the LLVM optimizer wasn't getting the proper target information because the TTI implementation was initialized with NoTTIImpl. This happens because TargetTransformInfoWrapperPass is not added to the pass managers.

I think there are a few ways to fix this:

  1. MLIR repo: something like the code attached, which adds TargetTransformInfoWrapperPass, initialized with the proper TargetMachine, to the pass managers (see the sketch after this list). I tested it by passing a TargetMachine with all the default host sub-target features and a specific CPU name (skylake-avx512). However, I must be missing something, since I'm getting skylake-avx512 assembly with this approach even though the target machine used by OrcJit doesn't have any sub-target features/CPU name set: https://github.com/tensorflow/mlir/blob/master/lib/ExecutionEngine/ExecutionEngine.cpp#L150
  2. nGraph repo: using mlir::makeLLVMPassesTransformer (https://github.com/tensorflow/mlir/blob/master/lib/ExecutionEngine/OptUtils.cpp#L103) in nGraph to add TargetTransformInfoWrapperPass to the pass managers.
  3. Any other idea?
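
To give an idea, option 1 boils down to something like the sketch below. The helper name addTargetTransformInfoPass is just illustrative and not the exact code in the attached patch:

#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/Target/TargetMachine.h"

static void addTargetTransformInfoPass(llvm::legacy::PassManager &modulePM,
                                       llvm::legacy::FunctionPassManager &funcPM,
                                       llvm::TargetMachine *targetMachine) {
  if (!targetMachine)
    return;
  // Seed both pass managers with a TTI wrapper built from the target's IR
  // analysis, so later passes query real target info instead of NoTTIImpl.
  modulePM.add(llvm::createTargetTransformInfoWrapperPass(
      targetMachine->getTargetIRAnalysis()));
  funcPM.add(llvm::createTargetTransformInfoWrapperPass(
      targetMachine->getTargetIRAnalysis()));
}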

I see value in adding TargetTransformInfoWrapperPass in the MLIR repo so that other users don't hit the same issue or have to replicate the same code. I also see value in having a utility in the MLIR repo that creates a TargetMachine with the default host CPU parameters (CPU name, sub-target features, etc.) and reusing it every time it is needed (for example, in the OrcJIT code I pointed to before and also here: https://github.com/tensorflow/mlir/blob/master/lib/ExecutionEngine/ExecutionEngine.cpp#L150).
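
Such a utility could look roughly like the sketch below. createHostTargetMachine is just a hypothetical name, and it assumes the native target has already been initialized (e.g. via llvm::InitializeNativeTarget()):

#include "llvm/ADT/StringMap.h"
#include "llvm/MC/SubtargetFeature.h"
#include "llvm/Support/Host.h"
#include "llvm/Support/TargetRegistry.h"
#include "llvm/Target/TargetMachine.h"
#include <memory>

static std::unique_ptr<llvm::TargetMachine> createHostTargetMachine() {
  // Look up the target matching the process triple.
  std::string triple = llvm::sys::getProcessTriple();
  std::string error;
  const llvm::Target *target = llvm::TargetRegistry::lookupTarget(triple, error);
  if (!target)
    return nullptr;

  // Collect the host CPU name and its sub-target features (e.g. the AVX-512
  // ones on a skylake-avx512 machine).
  llvm::SubtargetFeatures features;
  llvm::StringMap<bool> hostFeatures;
  if (llvm::sys::getHostCPUFeatures(hostFeatures))
    for (const auto &feature : hostFeatures)
      features.AddFeature(feature.first(), feature.second);

  return std::unique_ptr<llvm::TargetMachine>(target->createTargetMachine(
      triple, llvm::sys::getHostCPUName(), features.getString(),
      llvm::TargetOptions(), /*RM=*/llvm::None));
}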

Please, let me know what you think. I should be able to help with this.

Thanks!
Diego

std::function<llvm::Error(llvm::Module *)>
makeOptimizingTransformer(unsigned optLevel, unsigned sizeLevel);
makeOptimizingTransformer(unsigned optLevel, unsigned sizeLevel,
llvm::TargetMachine *targetMachine = nullptr);
Contributor

I would not even have a default nullptr value here to avoid forgetting it at call-sites.

@joker-eph
Contributor

Thanks! I had a similar patch locally that I didn't complete.

Ultimately my take was that most of the code here could be moved directly into LLVM, as there isn't much specific to MLIR in the JIT abstraction we have in MLIR. In the meantime, this PR seems like a good thing to do.

@dcaballe
Contributor Author

Ultimately my take was that most of the code here could be moved directly into LLVM, as there isn't much specific to MLIR in the JIT abstraction we have in MLIR.

Yeah, that would be great.

I also see value in having a utility in the MLIR repo that creates a TargetMachine with the default host CPU parameters (CPU name, sub-target features, etc.)

Do you think adding a utility for this ^ in a separate commit is also a good idea?

This is a suggestion for the squashed commit message. Sorry, I didn't add it to the first commit:
Add a TargetMachine parameter to the makeOptimizingTransformer and populatePassManagers APIs, initialize TargetTransformInfoWrapperPass with the right target information, and add it to the pass managers.

Thanks, Mehdi!
Diego

@dcaballe dcaballe changed the title [WIP] Add TTI pass initialization to pass managers. Add TTI pass initialization to pass managers. Jul 19, 2019
dcaballe added a commit to dcaballe/mlir that referenced this pull request Jul 22, 2019
PR#49 (tensorflow#49) adds the TTI
initialization pass to the pass managers for a given target machine.
However, this may cause an inconsistency if the target machine used by
the JIT compiler is not the same as the one used by TTI. This PR allows
passing a reference target machine to be used by the JIT compiler. If no
reference target machine is provided, a default host target machine is
used, including all its sub-target features.
@dcaballe
Contributor Author

I also see value in having a utility in the MLIR repo that creates a TargetMachine with the default host CPU parameters (CPU name, sub-target features, etc.)

Do you think adding a utility for this ^ in a separate commit is also a good idea?

dcaballe#1 is a stepping stone towards ^. Please, let me know what you think.

Thanks!
Diego

@joker-eph
Contributor

You have conflicts now apparently, can you rebase?

dcaballe#1 is a stepping stone towards ^. Please, let me know what you think.

Thanks! It seems a bit strange to me to manually unwrap a TM to build a TM-builder; I'm not sure why ORCJIT does not just take a std::function<std::unique_ptr<TargetMachine>()> callback instead of this class?
Most of the code you're adding is putting this whole utility even more in LLVM land; at this point I'd rather invest time into moving this into LLVM itself than adding so much "low-level" glue code in MLIR.

@dcaballe
Contributor Author

You have conflicts now apparently, can you rebase?

Done

I'm not sure why ORCJIT does not just take a std::function<std::unique_ptr<TargetMachine>()> callback instead of this class?

Yeah, it would make more sense, since the TM seems to be needed before ORCJIT is created. There is sort of a chicken-and-egg problem, as we need the TM to properly initialize the TTI pass in the transformer function that we pass when creating ORCJIT (see the sketch below).
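
Roughly, the ordering constraint looks like this (reusing the hypothetical createHostTargetMachine() helper from my first comment and the new makeOptimizingTransformer overload from this PR; module stands for the MLIR module being JIT-compiled):

// The TargetMachine has to exist first, because the transformer captures it
// to set up the TTI pass...
std::unique_ptr<llvm::TargetMachine> tm = createHostTargetMachine();
auto transformer = mlir::makeOptimizingTransformer(
    /*optLevel=*/3, /*sizeLevel=*/0, /*targetMachine=*/tm.get());

// ...and is only then handed over when the ORC JIT gets created. Ideally the
// JIT would be configured with the very same target description.
auto expectedEngine = mlir::ExecutionEngine::create(module, transformer);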

at this point I'd rather invest time into moving this into LLVM itself than adding so much "low-level" glue code in MLIR.

Sure. Please, keep me posted and let me know if I can help.

Diego

@dcaballe
Contributor Author

Hi Mehdi,

Do you think we could proceed with this PR in the meantime until things are moved to LLVM? That would be very helpful.

Thanks,
Diego

@@ -69,7 +71,8 @@ void mlir::initializeLLVMPasses() {
 // This behaves similarly to LLVM opt.
 static void populatePassManagers(llvm::legacy::PassManager &modulePM,
                                  llvm::legacy::FunctionPassManager &funcPM,
-                                 unsigned optLevel, unsigned sizeLevel) {
+                                 unsigned optLevel, unsigned sizeLevel,
+                                 llvm::TargetMachine *targetMachine = nullptr) {
Contributor

There is a single call site, so you should be able to skip the default value here.

-/// levels (e.g. -O2 or -Os).
+/// levels (e.g. -O2 or -Os). If not null, \p targetMachine is used to
+/// initialize passes that provide target-specific information to the LLVM
+/// optimizer.
Contributor

Can you mention that the provided TM is expected to outlive the returned std::function?

@joker-eph
Contributor

Sorry, I didn't mean to stall the current PR, of course. If you don't mind updating it with the two nits I just commented on, this seems OK to merge.

Contributor

@joker-eph joker-eph left a comment

(Sorry I wrote this comment the other day and didn't hit "submit review")

@@ -127,7 +127,8 @@ std::function<llvm::Error(llvm::Module *)> mlir::makeLLVMPassesTransformer(
       continue;

     if (insertOptPasses && optPassesInsertPos == i) {
-      populatePassManagers(modulePM, funcPM, mbOptLevel.getValue(), 0);
+      populatePassManagers(modulePM, funcPM, mbOptLevel.getValue(), 0,
+                           nullptr /*TTI*/);
Contributor

Shouldn't this function (mlir::makeLLVMPassesTransformer) also take a TM as argument?

Nit: the "standard" way of naming arguments is to prepend with the argument name /* targetMachine=*/ nullptr (The nice thing is that clang-tidy would then warn when the name in the comment does not match the function declaration name for the parameter)

Contributor Author

Thanks, it makes sense. I didn't know that about clang-tidy. Good to know!

-  auto transformer =
-      mlir::makeLLVMPassesTransformer(passes, optLevel, optPosition);
+  auto transformer = mlir::makeLLVMPassesTransformer(
+      passes, optLevel, /*targetMachine=*/nullptr, optPosition);
Contributor

@joker-eph joker-eph Aug 2, 2019

Can (and should?) we do something like this here:

auto tmBuilderOrError = llvm::orc::JITTargetMachineBuilder::detectHost();
if (!tmBuilderOrError) {
  llvm::errs() << "Failed to detect the host: "
               << llvm::toString(tmBuilderOrError.takeError()) << "\n";
  return EXIT_FAILURE;
}
auto tmOrError = tmBuilderOrError->createTargetMachine();
if (!tmOrError) {
  llvm::errs() << "Failed to create a TargetMachine for the host: "
               << llvm::toString(tmOrError.takeError()) << "\n";
  return EXIT_FAILURE;
}
auto transformer = mlir::makeLLVMPassesTransformer(
    passes, optLevel, /*targetMachine=*/tmOrError->get(), optPosition);

Contributor Author

No strong opinion here, but maybe we should add this in a separate commit since it changes the runner's behavior. Also, I'm not sure how much we are getting from detectHost().createTargetMachine(), since it doesn't add sub-target features. We would need something more in the direction of what I pointed out (dcaballe#1), but I understood you preferred that code to live in LLVM. Let me know if we can reuse part of that code; I could prepare another commit with it. WDYT?

Contributor

Isn't most of the TLI independent of sub-target features?
In any case adding sub-target features seems like the kind of thing that detectHost().createTargetMachine() should take care of, don't you think?

Contributor Author

Vectorization is sub-target dependent, for example.

I agree with you, but I feel there must be a reason why it's not there already :). Let me try to add it to LLVM, and I'll create another PR to update the JIT runner if you don't beat me to it.
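
In the meantime, a rough sketch of what adding the host sub-target features on top of detectHost() could look like, using JITTargetMachineBuilder's setCPU()/getFeatures() accessors (this assumes the JITTargetMachineBuilder and Host headers are available and that the surrounding function returns llvm::Error or llvm::Expected):

auto tmBuilderOrError = llvm::orc::JITTargetMachineBuilder::detectHost();
if (!tmBuilderOrError)
  return tmBuilderOrError.takeError();

// Pin the host CPU and its feature set explicitly; this is what sub-target
// dependent passes (e.g. the vectorizers) need to see through TTI.
tmBuilderOrError->setCPU(llvm::sys::getHostCPUName().str());
llvm::StringMap<bool> hostFeatures;
if (llvm::sys::getHostCPUFeatures(hostFeatures))
  for (const auto &feature : hostFeatures)
    tmBuilderOrError->getFeatures().AddFeature(feature.first(), feature.second);

auto tmOrError = tmBuilderOrError->createTargetMachine();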

@ftynse ftynse self-requested a review August 2, 2019 13:49
tensorflow-copybara pushed a commit to tensorflow/tensorflow that referenced this pull request Aug 6, 2019
Many LLVM transformations benefit from knowing the target. This enables optimizations,
especially in a JIT context where the target is (generally) well known.

Closes #49

COPYBARA_INTEGRATE_REVIEW=tensorflow/mlir#49 from dcaballe:dcaballe/tti ab02f72eb326f660945696e5dadeeb983cf263b3
PiperOrigin-RevId: 261840617