Enable OpenTelemetry Tracing for ChatQnA on Xeon and Gaudi by merging new compose_telemetry.yaml files

Signed-off-by: Louie, Tsai <[email protected]>
Signed-off-by: Tsai, Louie <[email protected]>
louie-tsai committed Feb 7, 2025
1 parent a4d028e commit 29a3060
Showing 13 changed files with 176 additions and 29 deletions.
20 changes: 19 additions & 1 deletion ChatQnA/README.md
@@ -90,6 +90,13 @@ cd GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon/
# cd GenAIExamples/ChatQnA/docker_compose/nvidia/gpu/
docker compose up -d
```
To enable OpenTelemetry Tracing, the compose_telemetry.yaml file needs to be merged with the default compose.yaml file.
CPU example with the OpenTelemetry tracing feature:

```bash
cd GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon/
docker compose -f compose.yaml -f compose_telemetry.yaml up -d
```
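
Once the merged stack is up, a quick way to confirm the telemetry overlay took effect is to check that the Jaeger container is running and its UI answers. A minimal sketch, assuming the default service name and ports from compose_telemetry.yaml:

```bash
# Confirm the Jaeger all-in-one container is running
docker ps --filter "name=jaeger" --format "{{.Names}}: {{.Status}}"

# The Jaeger UI should respond on port 16686 (expect HTTP 200)
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:16686
```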

Docker Compose will automatically download the required images from Docker Hub:

@@ -232,6 +239,12 @@ cd GenAIExamples/ChatQnA/docker_compose/intel/hpu/gaudi/
docker compose up -d
```

To enable OpenTelemetry Tracing, the compose_telemetry.yaml file needs to be merged with the default compose.yaml file.
```bash
cd GenAIExamples/ChatQnA/docker_compose/intel/hpu/gaudi/
docker compose -f compose.yaml -f compose_telemetry.yaml up -d
```

Refer to the [Gaudi Guide](./docker_compose/intel/hpu/gaudi/README.md) to build docker images from source.

### Deploy ChatQnA on Xeon
@@ -242,6 +255,11 @@ Find the corresponding [compose.yaml](./docker_compose/intel/cpu/xeon/compose.ya
cd GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon/
docker compose up -d
```
To enable OpenTelemetry Tracing, the compose_telemetry.yaml file needs to be merged with the default compose.yaml file.
```bash
cd GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon/
docker compose -f compose.yaml -f compose_telemetry.yaml up -d
```

Refer to the [Xeon Guide](./docker_compose/intel/cpu/xeon/README.md) for more instructions on building docker images from source.

@@ -346,7 +364,7 @@ OPEA microservice deployment can easily be monitored through Grafana dashboards

## Tracing Services with OpenTelemetry Tracing and Jaeger

> NOTE: limited support. Only LLM inference serving with TGI on Gaudi is enabled for this feature.
> NOTE: This feature is disabled by default. Please check the Deploy ChatQnA sections above for how to enable it with the compose_telemetry.yaml file.
OPEA microservices and TGI/TEI serving can easily be traced through Jaeger dashboards in conjunction with the OpenTelemetry Tracing feature. Follow the [README](https://github.com/opea-project/GenAIComps/tree/main/comps/cores/telemetry#tracing) to trace additional functions if needed.
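
For example, once the telemetry overlay is up, traces can be browsed in the Jaeger UI on port 16686 or listed through Jaeger's query API. A hedged sketch, assuming JAEGER_IP is exported as in set_env.sh and Jaeger's default ports:

```bash
# Open the Jaeger UI in a browser
echo "http://$JAEGER_IP:16686"

# List the services that have reported spans via Jaeger's query API
curl -s "http://$JAEGER_IP:16686/api/services"
```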

12 changes: 12 additions & 0 deletions ChatQnA/docker_compose/intel/cpu/xeon/README.md
@@ -44,6 +44,14 @@ To set up environment variables for deploying ChatQnA services, follow these ste
docker compose up -d
```

To enable OpenTelemetry Tracing, the compose_telemetry.yaml file needs to be merged with the default compose.yaml file.
CPU example with the OpenTelemetry tracing feature:

```bash
cd GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon/
docker compose -f compose.yaml -f compose_telemetry.yaml up -d
```

Docker Compose will automatically download the required images from Docker Hub:

```bash
@@ -263,12 +271,16 @@ If using vLLM as the LLM serving backend.
docker compose -f compose.yaml up -d
# Start ChatQnA without Rerank Pipeline
docker compose -f compose_without_rerank.yaml up -d
# Start ChatQnA with Rerank Pipeline and Open Telemetry Tracing
docker compose -f compose.yaml -f compose_telemetry.yaml up -d
```

If using TGI as the LLM serving backend.

```bash
docker compose -f compose_tgi.yaml up -d
# Start ChatQnA with Open Telemetry Tracing
docker compose -f compose_tgi.yaml -f compose_tgi_telemetry.yaml up -d
```
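
Whichever file combination starts the stack must also be passed when stopping it, so the overlay's extra services (such as Jaeger) are torn down too:

```bash
docker compose -f compose_tgi.yaml -f compose_tgi_telemetry.yaml down
```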

### Validate Microservices
Expand Down
27 changes: 27 additions & 0 deletions ChatQnA/docker_compose/intel/cpu/xeon/compose_telemetry.yaml
@@ -0,0 +1,27 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
tei-embedding-service:
command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
tei-reranking-service:
command: --model-id ${RERANK_MODEL_ID} --auto-truncate --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
jaeger:
image: jaegertracing/all-in-one:latest
container_name: jaeger
ports:
- "16686:16686"
- "4317:4317"
- "4318:4318"
- "9411:9411"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
COLLECTOR_ZIPKIN_HOST_PORT: 9411
restart: unless-stopped
chatqna-xeon-backend-server:
environment:
- ENABLE_OPEA_TELEMETRY=true
- TELEMETRY_ENDPOINT=${TELEMETRY_ENDPOINT}
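
This overlay relies on Docker Compose's multi-file merge: scalar keys such as the `command` overrides above replace their counterparts in compose.yaml, while new services such as `jaeger` are simply added. To preview the merged result without starting any containers, a quick sketch:

```bash
cd GenAIExamples/ChatQnA/docker_compose/intel/cpu/xeon/
# Render the effective merged configuration to stdout
docker compose -f compose.yaml -f compose_telemetry.yaml config
```
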
29 changes: 29 additions & 0 deletions ChatQnA/docker_compose/intel/cpu/xeon/compose_tgi_telemetry.yaml
@@ -0,0 +1,29 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
tei-embedding-service:
command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
tei-reranking-service:
command: --model-id ${RERANK_MODEL_ID} --auto-truncate --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
tgi-service:
command: --model-id ${LLM_MODEL_ID} --cuda-graphs 0 --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
jaeger:
image: jaegertracing/all-in-one:latest
container_name: jaeger
ports:
- "16686:16686"
- "4317:4317"
- "4318:4318"
- "9411:9411"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
COLLECTOR_ZIPKIN_HOST_PORT: 9411
restart: unless-stopped
chatqna-xeon-backend-server:
environment:
- ENABLE_OPEA_TELEMETRY=true
- TELEMETRY_ENDPOINT=${TELEMETRY_ENDPOINT}
4 changes: 4 additions & 0 deletions ChatQnA/docker_compose/intel/cpu/xeon/set_env.sh
@@ -14,3 +14,7 @@ export INDEX_NAME="rag-redis"
# Set it as a non-null string, such as true, if you want to enable logging facility,
# otherwise, keep it as "" to disable it.
export LOGFLAG=""
# Set OpenTelemetry Tracing Endpoint
export JAEGER_IP=$(ip route get 8.8.8.8 | grep -oP 'src \K[^ ]+')
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=grpc://$JAEGER_IP:4317
export TELEMETRY_ENDPOINT=http://$JAEGER_IP:4318/v1/traces
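
After sourcing set_env.sh, it is worth sanity-checking that JAEGER_IP resolved to a routable address and that both OTLP endpoints point at it. A minimal sketch, assuming the Jaeger container is already up when the curl check runs (an empty OTLP/HTTP batch should be accepted, though the exact response can vary by collector version):

```bash
source set_env.sh
echo "JAEGER_IP:                    $JAEGER_IP"
echo "OTLP gRPC endpoint (TGI/TEI): $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT"
echo "OTLP HTTP endpoint (OPEA):    $TELEMETRY_ENDPOINT"

# With Jaeger running, an empty trace batch should be accepted (expect HTTP 200)
curl -s -o /dev/null -w "%{http_code}\n" -X POST "$TELEMETRY_ENDPOINT" \
  -H "Content-Type: application/json" -d '{}'
```
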
10 changes: 10 additions & 0 deletions ChatQnA/docker_compose/intel/hpu/gaudi/README.md
@@ -45,6 +45,12 @@ To set up environment variables for deploying ChatQnA services, follow these ste
docker compose up -d
```

To enable OpenTelemetry Tracing, the compose_telemetry.yaml file needs to be merged with the default compose.yaml file.

```bash
docker compose -f compose.yaml -f compose_telemetry.yaml up -d
```

Docker Compose will automatically download the required images from Docker Hub:

```bash
@@ -259,12 +265,16 @@ If using vLLM as the LLM serving backend.
docker compose -f compose.yaml up -d
# Start ChatQnA without Rerank Pipeline
docker compose -f compose_without_rerank.yaml up -d
# Start ChatQnA with Rerank Pipeline and Open Telemetry Tracing
docker compose -f compose.yaml -f compose_telemetry.yaml up -d
```

If using TGI as the LLM serving backend.

```bash
docker compose -f compose_tgi.yaml up -d
# Start ChatQnA with Open Telemetry Tracing
docker compose -f compose_tgi.yaml -f compose_tgi_telemetry.yaml up -d
```

If you want to enable the guardrails microservice in the pipeline, use the command below instead:
27 changes: 27 additions & 0 deletions ChatQnA/docker_compose/intel/hpu/gaudi/compose_telemetry.yaml
@@ -0,0 +1,27 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
tei-embedding-service:
command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
tei-reranking-service:
command: --model-id ${RERANK_MODEL_ID} --auto-truncate --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
jaeger:
image: jaegertracing/all-in-one:latest
container_name: jaeger
ports:
- "16686:16686"
- "4317:4317"
- "4318:4318"
- "9411:9411"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
COLLECTOR_ZIPKIN_HOST_PORT: 9411
restart: unless-stopped
chatqna-gaudi-backend-server:
environment:
- ENABLE_OPEA_TELEMETRY=true
- TELEMETRY_ENDPOINT=${TELEMETRY_ENDPOINT}
22 changes: 2 additions & 20 deletions ChatQnA/docker_compose/intel/hpu/gaudi/compose_tgi.yaml
@@ -25,7 +25,6 @@ services:
INDEX_NAME: ${INDEX_NAME}
TEI_ENDPOINT: http://tei-embedding-service:80
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
TELEMETRY_ENDPOINT: ${TELEMETRY_ENDPOINT}
tei-embedding-service:
image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
container_name: tei-embedding-gaudi-server
@@ -38,7 +37,7 @@ services:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate
retriever:
image: ${REGISTRY:-opea}/retriever:${TAG:-latest}
container_name: retriever-redis-server
@@ -56,7 +55,6 @@ services:
INDEX_NAME: ${INDEX_NAME}
TEI_EMBEDDING_ENDPOINT: http://tei-embedding-service:80
HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
TELEMETRY_ENDPOINT: ${TELEMETRY_ENDPOINT}
LOGFLAG: ${LOGFLAG}
RETRIEVER_COMPONENT_NAME: "OPEA_RETRIEVER_REDIS"
restart: unless-stopped
@@ -80,7 +78,7 @@ services:
HABANA_VISIBLE_DEVICES: all
OMPI_MCA_btl_vader_single_copy_mechanism: none
MAX_WARMUP_SEQUENCE_LENGTH: 512
command: --model-id ${RERANK_MODEL_ID} --auto-truncate --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
command: --model-id ${RERANK_MODEL_ID} --auto-truncate
tgi-service:
image: ghcr.io/huggingface/tgi-gaudi:2.0.6
container_name: tgi-gaudi-server
@@ -106,21 +104,6 @@ services:
- SYS_NICE
ipc: host
command: --model-id ${LLM_MODEL_ID} --max-input-length 2048 --max-total-tokens 4096 --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
jaeger:
image: jaegertracing/all-in-one:latest
container_name: jaeger
ports:
- "16686:16686"
- "4317:4317"
- "4318:4318"
- "9411:9411"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
COLLECTOR_ZIPKIN_HOST_PORT: 9411
restart: unless-stopped
chatqna-gaudi-backend-server:
image: ${REGISTRY:-opea}/chatqna:${TAG:-latest}
container_name: chatqna-gaudi-backend-server
@@ -146,7 +129,6 @@ services:
- LLM_SERVER_PORT=${LLM_SERVER_PORT:-80}
- LLM_MODEL=${LLM_MODEL_ID}
- LOGFLAG=${LOGFLAG}
- TELEMETRY_ENDPOINT=${TELEMETRY_ENDPOINT}
ipc: host
restart: always
chatqna-gaudi-ui-server:
29 changes: 29 additions & 0 deletions ChatQnA/docker_compose/intel/hpu/gaudi/compose_tgi_telemetry.yaml
@@ -0,0 +1,29 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

services:
tei-embedding-service:
command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
tei-reranking-service:
command: --model-id ${RERANK_MODEL_ID} --auto-truncate --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
tgi-service:
command: --model-id ${LLM_MODEL_ID} --max-input-length 2048 --max-total-tokens 4096 --otlp-endpoint $OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
jaeger:
image: jaegertracing/all-in-one:latest
container_name: jaeger
ports:
- "16686:16686"
- "4317:4317"
- "4318:4318"
- "9411:9411"
ipc: host
environment:
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
COLLECTOR_ZIPKIN_HOST_PORT: 9411
restart: unless-stopped
chatqna-gaudi-backend-server:
environment:
- ENABLE_OPEA_TELEMETRY=true
- TELEMETRY_ENDPOINT=${TELEMETRY_ENDPOINT}
7 changes: 5 additions & 2 deletions ChatQnA/tests/test_compose_on_gaudi.sh
@@ -48,9 +48,12 @@ function start_services() {
export INDEX_NAME="rag-redis"
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export host_ip=${ip_address}
export JAEGER_IP=$(ip route get 8.8.8.8 | grep -oP 'src \K[^ ]+')
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=grpc://$JAEGER_IP:4317
export TELEMETRY_ENDPOINT=http://$JAEGER_IP:4318/v1/traces

# Start Docker Containers
docker compose -f compose.yaml up -d > ${LOG_PATH}/start_services_with_compose.log
docker compose -f compose.yaml -f compose_telemetry.yaml up -d > ${LOG_PATH}/start_services_with_compose.log
n=0
until [[ "$n" -ge 160 ]]; do
echo "n=$n"
@@ -171,7 +174,7 @@ function validate_frontend() {

function stop_docker() {
cd $WORKPATH/docker_compose/intel/hpu/gaudi
docker compose -f compose.yaml down
docker compose -f compose.yaml -f compose_telemetry.yaml down
}

function main() {
7 changes: 5 additions & 2 deletions ChatQnA/tests/test_compose_on_xeon.sh
@@ -49,9 +49,12 @@ function start_services() {
export INDEX_NAME="rag-redis"
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export host_ip=${ip_address}
export JAEGER_IP=$(ip route get 8.8.8.8 | grep -oP 'src \K[^ ]+')
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=grpc://$JAEGER_IP:4317
export TELEMETRY_ENDPOINT=http://$JAEGER_IP:4318/v1/traces

# Start Docker Containers
docker compose -f compose.yaml up -d > ${LOG_PATH}/start_services_with_compose.log
docker compose -f compose.yaml -f compose_telemetry.yaml up -d > ${LOG_PATH}/start_services_with_compose.log
n=0
until [[ "$n" -ge 100 ]]; do
docker logs vllm-service > ${LOG_PATH}/vllm_service_start.log 2>&1
@@ -172,7 +175,7 @@ function validate_frontend() {

function stop_docker() {
cd $WORKPATH/docker_compose/intel/cpu/xeon
docker compose -f compose.yaml down
docker compose -f compose.yaml -f compose_telemetry.yaml down
}

function main() {
4 changes: 2 additions & 2 deletions ChatQnA/tests/test_compose_tgi_on_gaudi.sh
@@ -53,7 +53,7 @@ function start_services() {
export TELEMETRY_ENDPOINT=http://$JAEGER_IP:4318/v1/traces

# Start Docker Containers
docker compose -f compose_tgi.yaml up -d > ${LOG_PATH}/start_services_with_compose.log
docker compose -f compose_tgi.yaml -f compose_tgi_telemetry.yaml up -d > ${LOG_PATH}/start_services_with_compose.log

n=0
until [[ "$n" -ge 500 ]]; do
@@ -217,7 +217,7 @@ function validate_frontend() {

function stop_docker() {
cd $WORKPATH/docker_compose/intel/hpu/gaudi
docker compose -f compose_tgi.yaml down
docker compose -f compose_tgi.yaml -f compose_tgi_telemetry.yaml down
}

function main() {
7 changes: 5 additions & 2 deletions ChatQnA/tests/test_compose_tgi_on_xeon.sh
@@ -48,9 +48,12 @@ function start_services() {
export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
export INDEX_NAME="rag-redis"
export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export JAEGER_IP=$(ip route get 8.8.8.8 | grep -oP 'src \K[^ ]+')
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=grpc://$JAEGER_IP:4317
export TELEMETRY_ENDPOINT=http://$JAEGER_IP:4318/v1/traces

# Start Docker Containers
docker compose -f compose_tgi.yaml up -d > ${LOG_PATH}/start_services_with_compose.log
docker compose -f compose_tgi.yaml -f compose_tgi_telemetry.yaml up -d > ${LOG_PATH}/start_services_with_compose.log

n=0
until [[ "$n" -ge 100 ]]; do
@@ -216,7 +219,7 @@ function validate_frontend() {

function stop_docker() {
cd $WORKPATH/docker_compose/intel/cpu/xeon
docker compose -f compose_tgi.yaml down
docker compose -f compose_tgi.yaml -f compose_tgi_telemetry.yaml down
}

function main() {
