Issues: triton-inference-server/server
#7995 [BUG] [GenAI-Perf] openai-frontend server with --endpoint-type completions (opened Feb 7, 2025 by jihyeonRyu)
#7992 build.py setting docker build args for secrets even when build-secret flag is not present (opened Feb 6, 2025 by BenjaminBraunDev)
#7991 libtriton_fil.so missing on Arm64 containers 24.12 and 25.01 (opened Feb 5, 2025 by dagardner-nv)
#7986 Performance issue - High queue times in perf_analyzer (labels: performance, question; opened Feb 4, 2025 by asaff1)
#7984 Something like "model instance index" inside python backend (labels: enhancement, module: backends, python; opened Feb 3, 2025 by vadimkantorov)
#7981 Expected model dimensions when expected shape is not suitable to batch (opened Jan 31, 2025 by codeofdutyAI)
#7974 Pytorch backend: Model is run in no_grad mode even with INFERENCE_MODE=false (opened Jan 28, 2025 by hakanardo)
#7963 vLLM backend Hugging Face feature branch model loading (labels: enhancement; opened Jan 23, 2025 by knitzschke)
#7956 Unexpected throughput results - increasing instance group count vs. deploying the count distributed on the same card using shared computing windows (labels: performance; opened Jan 21, 2025 by ariel291888)
#7954 How to start/expose the metrics endpoint of the Triton Server via openai_frontend/main.py arguments (opened Jan 21, 2025 by shuknk8s)
#7953 Segmentation fault when crafting a pb_utils.Tensor object in a Triton BLS model (labels: bug; opened Jan 18, 2025 by carldomond7)
#7950 Failed to launch triton-server: "error: creating server: Internal - failed to load all models" (labels: module: backends; opened Jan 17, 2025 by pzydzh)
#7938 Triton crashes with SIGSEGV (labels: crash; opened Jan 15, 2025 by ctxqlxs)
#7932 [Question] Are the libnvinfer_builder_resources necessary in the triton image? (labels: question; opened Jan 14, 2025 by MatthieuToulemont)
#7925 Server build with python BE failing due to missing Boost lib (opened Jan 9, 2025 by buddhapuneeth)
#7914 OpenAI-Compatible Frontend should support world_size larger than 1 (labels: enhancement; opened Jan 3, 2025 by cocodee)
#7912 vllm_backend: What is the right way to use downloaded model + model.json together? (labels: question; opened Jan 2, 2025 by kyoungrok0517)
#7907 Python backend with multiple instances causes unexpected and non-deterministic results (labels: bug; opened Dec 25, 2024 by NadavShmayo)
#7906 MIG deployment of triton causes "CacheManager Init Failed. Error: -17" (labels: bug; opened Dec 25, 2024 by LSC527)
#7905 Shared memory IO bottleneck? (labels: performance; opened Dec 24, 2024 by wensimin)
ProTip! Exclude everything labeled bug with -label:bug.
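
The -label:bug exclusion in the tip above can also be done in code when working with issue data directly. A minimal sketch, assuming issue records shaped like GitHub's REST API issue objects (each carrying a "labels" list of {"name": ...} dicts); the sample records below are abridged from the listing above:

```python
def exclude_label(issues, label):
    """Return the issues that do NOT carry the given label,
    mirroring GitHub's -label:<name> search qualifier."""
    return [
        issue for issue in issues
        if all(l["name"] != label for l in issue["labels"])
    ]

# Abridged sample records from the listing above; the field shapes
# ("number", "title", "labels") follow GitHub's REST API issue objects.
issues = [
    {"number": 7953, "title": "Segmentation fault in Triton BLS model",
     "labels": [{"name": "bug"}]},
    {"number": 7956, "title": "Unexpected throughput results",
     "labels": [{"name": "performance"}]},
    {"number": 7954, "title": "How to expose the metrics endpoint",
     "labels": []},
]

non_bugs = exclude_label(issues, "bug")
print([i["number"] for i in non_bugs])  # [7956, 7954]
```

Issues with no labels at all pass the filter, since all() over an empty labels list is trivially true, which matches how GitHub's search qualifier behaves.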