Issues: triton-inference-server/server
#7995 [BUG] [GenAI-Perf] openai-frontend server with --endpoint-type completions (opened Feb 7, 2025 by jihyeonRyu)
#7992 build.py setting docker build args for secrets even when build-secret flag is not present (opened Feb 6, 2025 by BenjaminBraunDev)
#7991 libtriton_fil.so missing on Arm64 containers 24.12 and 25.01 (opened Feb 5, 2025 by dagardner-nv)
#7986 Performance issue - High queue times in perf_analyzer (labels: performance, question; opened Feb 4, 2025 by asaff1)
#7984 Something like "model instance index" inside python backend (labels: enhancement, module: backends, python; opened Feb 3, 2025 by vadimkantorov)
#7981 Expected model dimensions when expected shape is not suitable to batch (opened Jan 31, 2025 by codeofdutyAI)
#7974 Pytorch backend: Model is run in no_grad mode even with INFERENCE_MODE=false (opened Jan 28, 2025 by hakanardo)
#7963 vLLM backend Hugging Face feature branch model loading (labels: enhancement; opened Jan 23, 2025 by knitzschke)
#7956 Unexpected throughput results - increasing instance group count vs. deploying the count distributed on the same card using shared computing windows (labels: performance; opened Jan 21, 2025 by ariel291888)
#7954 How to start/expose the metrics endpoint of the Triton Server via openai_frontend/main.py arguments (opened Jan 21, 2025 by shuknk8s)
#7953 Segmentation fault when crafting a pb_utils.Tensor object in a Triton BLS model (labels: bug; opened Jan 18, 2025 by carldomond7)
#7950 Failed to launch triton-server: "error: creating server: Internal - failed to load all models" (labels: module: backends; opened Jan 17, 2025 by pzydzh)
#7938 Triton crashes with SIGSEGV (labels: crash; opened Jan 15, 2025 by ctxqlxs)
#7932 [Question] Are the libnvinfer_builder_resources necessary in the triton image? (labels: question; opened Jan 14, 2025 by MatthieuToulemont)
#7925 Server build with python BE failing due to missing Boost lib (opened Jan 9, 2025 by buddhapuneeth)
#7914 OpenAI-Compatible Frontend should support world_size larger than 1 (labels: enhancement; opened Jan 3, 2025 by cocodee)
#7912 vllm_backend: What is the right way to use downloaded model + model.json together? (labels: question; opened Jan 2, 2025 by kyoungrok0517)
#7907 Python backend with multiple instances causes unexpected and non-deterministic results (labels: bug; opened Dec 25, 2024 by NadavShmayo)
#7906 MIG deployment of triton causes "CacheManager Init Failed. Error: -17" (labels: bug; opened Dec 25, 2024 by LSC527)
#7905 Shared memory IO bottleneck? (labels: performance; opened Dec 24, 2024 by wensimin)
ProTip! Exclude everything labeled bug with -label:bug.
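
The -label:bug exclusion in the tip above can also be done in code when working with issue data directly. A minimal sketch, assuming issue records shaped like GitHub's REST API issue objects (each carrying a "labels" list of {"name": ...} dicts); the sample records below are abridged from the listing above:

```python
def exclude_label(issues, label):
    """Return the issues that do NOT carry the given label,
    mirroring GitHub's -label:<name> search qualifier."""
    return [
        issue for issue in issues
        if all(l["name"] != label for l in issue["labels"])
    ]

# Abridged sample records from the listing above; the field shapes
# ("number", "title", "labels") follow GitHub's REST API issue objects.
issues = [
    {"number": 7953, "title": "Segmentation fault in Triton BLS model",
     "labels": [{"name": "bug"}]},
    {"number": 7956, "title": "Unexpected throughput results",
     "labels": [{"name": "performance"}]},
    {"number": 7954, "title": "How to expose the metrics endpoint",
     "labels": []},
]

non_bugs = exclude_label(issues, "bug")
print([i["number"] for i in non_bugs])  # [7956, 7954]
```

Issues with no labels at all pass the filter, since all() over an empty labels list is trivially true, which matches how GitHub's search qualifier behaves.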