v1.56.4
Released by github-actions on 29 Dec 04:56
What's Changed
- Update model_prices_and_context_window.json by @superpoussin22 in #7452
- (Refactor) 🧹 - remove deprecated litellm server by @ishaan-jaff in #7456
- 📖 Docs - Using LiteLLM with 1M rows in spend logs by @ishaan-jaff in #7461
- (Admin UI - 1) - Show the model name directly before or after the "Assistant" label, so it's clear which model produced the given output by @ishaan-jaff in #7459
- (Admin UI - 2) UI chat should render the output in markdown by @ishaan-jaff in #7460
- (Security fix) - Upgrade to `fastapi==0.115.5` by @ishaan-jaff in #7447
- fix OR deepseek by @paul-gauthier in #7425
- (Bug Fix) Add health check support for realtime models by @ishaan-jaff in #7453
- (Refactor) - Reuse litellm.completion/litellm.embedding etc. for health checks by @ishaan-jaff in #7455
- Litellm dev 12 28 2024 p3 by @krrishdholakia in #7464
- Fireworks AI - document inlining support + model access groups for wildcard models by @krrishdholakia in #7458
Full Changelog: v1.56.3...v1.56.4
Docker Run LiteLLM Proxy
```bash
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.56.4
```
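Once the container is up, you can sanity-check the proxy with an OpenAI-compatible request. A minimal sketch, assuming a model named `gpt-3.5-turbo` is configured on the proxy and `sk-1234` is your master key (both are placeholders for your own setup):

```bash
# Send a test chat completion through the proxy (placeholder model name and key)
curl http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello, are you up?"}]
  }'
```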
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
|---|---|---|---|---|---|---|---|---|---|
| /chat/completions | Passed ✅ | 240.0 | 268.74 | 6.12 | 0.0 | 1829 | 0 | 214.29 | 1969.76 |
| Aggregated | Passed ✅ | 240.0 | 268.74 | 6.12 | 0.0 | 1829 | 0 | 214.29 | 1969.76 |