0.0.2
Pre-release
## What's Changed
- Setup repo and package structure by @rmccorm4 in #1
- Add pre-commit hook to upgrade Python syntax by @dyastremsky in #2
- Add initial prototype by @rmccorm4 in #4
- Add README and update default image by @rmccorm4 in #5
- Add rough NGC CLI wrapper by @rmccorm4 in #6
- Basic MPI support by @fpetrini15 in #8
- Populate model repo with TRTLLM templates by @oandreeva-nv in #7
- Minor TRT-LLM tweaks by @rmccorm4 in #11
- Misc fixes by @rmccorm4 in #14
- Add profile subcommand to run perf analyzer by @matthewkotila in #13
- POC: Background Server by @fpetrini15 in #15
- Fix high concurrency generation throughput calculation by @nv-hwoo in #16
- Add demo features for benchmarking LLMs by @rmccorm4 in #17
- Add copyrights and minor cleanup by @rmccorm4 in #19
- Automatic TRT LLM Detail Parsing by @fpetrini15 in #18
- Fix vLLM profiler bug, add fallback logic to server start, cleanup by @rmccorm4 in #20
- Add initial tests for repo subcommand by @rmccorm4 in #21
- Catch errors and improve logging in Profiler by @nv-hwoo in #23
- Bump version to 0.0.2 by @rmccorm4 in #22
## New Contributors
- @dyastremsky made their first contribution in #2
- @matthewkotila made their first contribution in #13
- @nv-hwoo made their first contribution in #16
**Full Changelog**: https://github.com/triton-inference-server/triton_cli/commits/0.0.2