Will TensorRT-LLM be available within Triton or will it be a separate server? #6290
Answered by dyastremsky
MatthieuToulemont asked this question in Q&A
Like many, I am pretty stoked about the TensorRT-LLM announcement. I am wondering if this will be accessible from within Triton as a specific backend, or will we need to run a separate process to benefit from TensorRT-LLM?
dyastremsky answered on Sep 13, 2023:
Very happy to hear that, Matthieu! Thanks for sharing. Triton will continue to work as the one solution for all of your AI model inferencing. We are working on a TensorRT-LLM backend that you can easily plug your models into.
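To make the "plug your models into" part concrete, here is a minimal client-side sketch, assuming the eventual TensorRT-LLM backend serves models through Triton's existing HTTP inference API just like any other backend. The model name and tensor names below are hypothetical placeholders, not a confirmed interface:

```python
# A minimal sketch, assuming a TensorRT-LLM backend exposes models through
# Triton's standard inference protocol. "tensorrt_llm_model", "text_input",
# and "text_output" are hypothetical placeholders.
import numpy as np
import tritonclient.http as httpclient

# Connect to a locally running Triton server (default HTTP port).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Triton encodes string inputs as BYTES tensors; wrap the prompt accordingly.
prompt = np.array([["What is TensorRT-LLM?"]], dtype=object)
text_input = httpclient.InferInput("text_input", [1, 1], "BYTES")
text_input.set_data_from_numpy(prompt)

# Request inference from the (hypothetical) TensorRT-LLM-backed model.
result = client.infer(model_name="tensorrt_llm_model", inputs=[text_input])
print(result.as_numpy("text_output"))
```

The point of the sketch is that the request path stays the same as for any other Triton backend; only the model repository entry would point at the new backend.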
Answer selected by dyastremsky