Update README.md for changing llm model based on validated model table
Signed-off-by: Tsai, Louie <[email protected]>
louie-tsai committed Feb 7, 2025
1 parent 44a689b commit 7d9291b
Showing 1 changed file with 19 additions and 0 deletions: ChatQnA/docker_compose/intel/hpu/gaudi/README.md
@@ -39,6 +39,25 @@ To set up environment variables for deploying ChatQnA services, follow these steps:
source ./set_env.sh
```

4. Change the model for LLM serving

By default, `Meta-Llama-3-8B-Instruct` is used for LLM serving. To switch to a different model, pick one from the [validated LLM models](https://github.com/opea-project/GenAIComps/tree/main/comps/llms/src/text-generation#validated-llm-models) table.
Overwrite the default model defined in `set_env.sh` either by exporting `LLM_MODEL_ID` with the new model ID or by editing `set_env.sh`, then repeat step 3.
For example, switch to `DeepSeek-R1-Distill-Qwen-32B` with the following command:

```bash
export LLM_MODEL_ID=deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
```

Please also check the [required Gaudi cards for different models](https://github.com/opea-project/GenAIComps/blob/deepseek/comps/llms/src/text-generation/README.md#system-requirements-for-llm-models) for the new model.
Some models need more Gaudi cards; set the card count by exporting `NUM_CARDS` or by editing `set_env.sh`, then repeat step 3.
For example, `DeepSeek-R1-Distill-Qwen-32B` requires four Gaudi cards:

```bash
export NUM_CARDS=4
```
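In practice the two overrides are applied together: set both variables, re-run the environment setup from step 3, and then redeploy. A minimal sketch, assuming `set_env.sh` preserves already-exported values as the steps above describe:

```bash
# Switch ChatQnA to DeepSeek-R1-Distill-Qwen-32B on 4 Gaudi cards.
export LLM_MODEL_ID=deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
export NUM_CARDS=4

# Repeat step 3 so the rest of the environment picks up the overrides.
source ./set_env.sh

# Sanity check before redeploying with Docker Compose.
echo "LLM model: ${LLM_MODEL_ID} (Gaudi cards: ${NUM_CARDS})"
```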

## Quick Start: 2. Run Docker Compose

