This project is an AI endpoint that exposes a unified, OpenAI-compatible interface to AI models. It is multi-tenant and supports API key load balancing, rate limiting, and logging. It is containerized for easy deployment.
- Azure API proxy
  - Compatible with the OpenAI API format
- API key load balancing
  - Weighted round-robin
  - Adaptive weight
- Multi-tenant
  - Request authentication
  - Configuration isolation
- Rate limiting
  - Tenant-level rate limiting
  - Model-level rate limiting
- Logging
  - Multi-channel writing
  - File splitting
- Containerized deployment
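
The weighted round-robin item above can be illustrated with a minimal sketch. This is not taken from the project's source: the `Peer` and `Balancer` types are hypothetical, and the algorithm shown is the common "smooth" weighted round-robin variant, chosen here only as an assumption. It does not cover the adaptive weight feature and is not safe for concurrent use without a mutex.

```go
package main

import "fmt"

// Peer is a hypothetical upstream with a static weight.
type Peer struct {
	Name    string
	Weight  int
	current int // running score used by the selection loop
}

// Balancer selects peers with smooth weighted round-robin.
type Balancer struct {
	peers []*Peer
	total int // sum of all weights
}

// NewBalancer builds a balancer over the given peers.
func NewBalancer(peers ...*Peer) *Balancer {
	b := &Balancer{peers: peers}
	for _, p := range peers {
		b.total += p.Weight
	}
	return b
}

// Next raises every peer's score by its weight, picks the highest
// score, then lowers the winner's score by the total weight.
// It returns nil when the balancer has no peers.
func (b *Balancer) Next() *Peer {
	var best *Peer
	for _, p := range b.peers {
		p.current += p.Weight
		if best == nil || p.current > best.current {
			best = p
		}
	}
	if best != nil {
		best.current -= b.total
	}
	return best
}

func main() {
	b := NewBalancer(
		&Peer{Name: "openai", Weight: 3},
		&Peer{Name: "azure", Weight: 1},
	)
	for i := 0; i < 8; i++ {
		fmt.Print(b.Next().Name, " ")
	}
	fmt.Println()
}
```

With weights 3 and 1, three of every four picks go to the heavier peer, and the lighter peer's turn falls inside the cycle rather than always at the end.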
- POST /v1/chat/completions
curl --location 'localhost:8080/v1/chat/completions' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer xxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' \
  --data '{
    "stream": true,
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Does Azure OpenAI support customer managed keys?"
      },
      {
        "role": "assistant",
        "content": "Yes, customer managed keys are supported by Azure OpenAI."
      },
      {
        "role": "user",
        "content": "Do other Azure Cognitive Services support this too?"
      }
    ]
  }'
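
The same request can be built from Go. The sketch below is an illustration only: the `Message` and `ChatRequest` types and the `newChatRequest` helper are hypothetical names, and the base URL and key are placeholders matching the curl example above. It constructs the request without sending it.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Message mirrors one entry of the "messages" array.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// ChatRequest mirrors the JSON body of the curl example.
type ChatRequest struct {
	Stream   bool      `json:"stream"`
	Model    string    `json:"model"`
	Messages []Message `json:"messages"`
}

// newChatRequest builds a POST to /v1/chat/completions with the
// Content-Type and Bearer Authorization headers set.
func newChatRequest(baseURL, key string, body ChatRequest) (*http.Request, error) {
	payload, err := json.Marshal(body)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+key)
	return req, nil
}

func main() {
	req, err := newChatRequest("http://localhost:8080", "xxxx", ChatRequest{
		Stream: true,
		Model:  "gpt-3.5-turbo",
		Messages: []Message{
			{Role: "system", Content: "You are a helpful assistant."},
			{Role: "user", Content: "Does Azure OpenAI support customer managed keys?"},
		},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

To send it, pass `req` to `http.DefaultClient.Do` and read the response body; with `"stream": true` the body arrives incrementally, so read it as it comes rather than waiting for the end.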
- gin
- google/uuid
- redis/go-redis
- spf13/cast
- spf13/viper
- go.uber.org/zap
- gorm.io/mysql
- gorm.io/gorm
- tidwall/gjson
logger:
  channels:
    - name: app
      filename: /var/log/app.log
      maxSize: 1
      maxAge: 30
      maxBackups: 10
      compress: false
      level: info
    - name: request
      filename: /var/log/request.log
      maxSize: 1
      maxAge: 30
      maxBackups: 10
      compress: false
      level: info
database:
  addr: root:password@tcp(127.0.0.1:3306)/endpoints?charset=utf8mb4&parseTime=True&loc=Local
redis:
  addr: 127.0.0.1:6379
  password: ""
  db: 0
azure:
  openai:
    models:
      - gpt-3.5-turbo
      - gpt-4
      - gpt-4-32k
    peers:
      - key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        endpoint: https://api.openai.com
        weight: 20
        deployments:
          - model: gpt-3.5-turbo
            # To use the original OpenAI model, set isOpenAI to true.
            isOpenAI: true
          - model: gpt-4
            isOpenAI: true
          - model: gpt-4-32k
            isOpenAI: true
      - key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
        endpoint: https://xxxxx.openai.azure.com
        weight: 20
        deployments:
          - name: gpt-35-turbo
            model: gpt-3.5-turbo
            version: 2023-03-15-preview
          - name: gpt-4
            model: gpt-4
            version: 2023-03-15-preview
          - name: gpt-4-32k
            model: gpt-4-32k
            version: 2023-03-15-preview
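
To show why the two kinds of peer carry different fields, here is a rough sketch of how a deployment entry could map to an upstream URL. How this project does the mapping internally is an assumption. The `Deployment` type and `upstreamURL` function are hypothetical; only the URL shapes come from the public OpenAI and Azure OpenAI REST APIs.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Deployment is a hypothetical mirror of one "deployments" entry.
type Deployment struct {
	Name     string // Azure deployment name, e.g. gpt-35-turbo
	Model    string // model name requested by the client
	Version  string // Azure api-version query value
	IsOpenAI bool   // true when the peer is the original OpenAI API
}

// upstreamURL returns the chat completions URL for a peer endpoint.
// OpenAI selects the model in the request body; Azure selects it
// through the deployment name in the path plus an api-version.
func upstreamURL(endpoint string, d Deployment) string {
	base := strings.TrimRight(endpoint, "/")
	if d.IsOpenAI {
		return base + "/v1/chat/completions"
	}
	q := url.Values{"api-version": {d.Version}}
	return fmt.Sprintf("%s/openai/deployments/%s/chat/completions?%s",
		base, url.PathEscape(d.Name), q.Encode())
}

func main() {
	fmt.Println(upstreamURL("https://api.openai.com",
		Deployment{Model: "gpt-4", IsOpenAI: true}))
	fmt.Println(upstreamURL("https://xxxxx.openai.azure.com",
		Deployment{Name: "gpt-35-turbo", Model: "gpt-3.5-turbo", Version: "2023-03-15-preview"}))
}
```

This is why the Azure entries need `name` and `version` while the `isOpenAI: true` entries only need `model`.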