[Inference] Add validated models for Gaudi #225
Conversation
@Deegue, although the results showed as passing, they actually failed. Please check the run results and verify the models' responses for the tests.
…en to version 1.5
Signed-off-by: Yizhong Zhang <[email protected]>
All CI passed. Gentle ping @carsonwang for review, thanks!
@kira-lin is helping to review this. For Qwen, can you please update to use Qwen/Qwen2-7B-Instruct?
Added.
Qwen1.5-110B has not been tested.
Many models produce odd outputs, including mpt, mistral, gpt2, gemma, falcon 7b/40b, and bloom.
Compare them with CPU/GPU runs to see whether this is normal behavior. For example, CodeLlama outputs some markup language, but I think it is tuned that way. Also pay attention to temperature, which can lead to random results and may differ between models.
For falcon and qwen, @KepingYan can look into this.
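To illustrate the temperature point above: greedy decoding (temperature off) is deterministic, while sampling at a nonzero temperature draws from a softened distribution and can return a different token on each run. A minimal self-contained sketch of that behavior (the function and logit values here are illustrative, not from the repo):

```python
import math
import random

def sample_token(logits, temperature=1.0, rng=None):
    """Pick a token index from raw logits.

    temperature <= 0 falls back to greedy (argmax) decoding, which is
    deterministic; higher temperatures flatten the softmax distribution
    and make outputs increasingly random.
    """
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature-scaled, numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling.
    rng = rng or random.Random()
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

logits = [2.0, 1.0, 0.1]
# Greedy decoding always returns the argmax:
greedy = sample_token(logits, temperature=0)
# At high temperature, different seeds can pick different tokens:
picks = {sample_token(logits, temperature=2.0, rng=random.Random(s)) for s in range(50)}
```

This is why two runs of a test can disagree unless temperature is pinned to 0 (or a fixed seed is used) when comparing Gaudi output against CPU/GPU references.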
if self.model.config.model_type == "llama":
Let's modify this line according to:
https://github.com/huggingface/optimum-habana/blob/595cc3e4ec219b1ce469b323cf94e994c5c5d8f3/examples/text-generation/utils.py#L311-L312
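The suggestion is to test `model_type` against a set of model families rather than hard-coding a single `== "llama"` comparison. A minimal sketch of that pattern (the family set and function name here are illustrative assumptions, not the exact contents of the linked utils.py):

```python
from types import SimpleNamespace

# Hypothetical family set; the real list lives in optimum-habana's
# examples/text-generation/utils.py at the linked lines.
_SPECIAL_MODEL_TYPES = {"llama", "falcon"}

def needs_special_handling(model):
    """Membership test on the HF config's model_type, so adding a new
    family means extending the set instead of chaining == comparisons."""
    return getattr(model.config, "model_type", None) in _SPECIAL_MODEL_TYPES

# Stand-ins for loaded HF models, for demonstration only.
llama_model = SimpleNamespace(config=SimpleNamespace(model_type="llama"))
gpt2_model = SimpleNamespace(config=SimpleNamespace(model_type="gpt2"))
```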
Updated, thanks for the comment. Btw, are there any other places that need to be changed? I found some other places that are handled specially.
Model list:

| Model | Cards | Template |
| --- | --- | --- |
| bloom-7b1 | single card | without template |
| Falcon-7b | single card | without template |
| Falcon-40b | multiple cards | without template |
| Gemma-2b | single card | without template |
| Llama3-7b | single card | unknown |
| Llama3-70b | multiple cards | unknown |
| Mistral-7b | single card | without template |
| Mixtral-8x7B-Instruct-v0.1 | single card | with template |
| llama-2-7b | single card | unknown |
| llama-2-70b | multiple cards | unknown |
| CodeLlama | single card | unknown |
| GPT2 | single card | without template |
| GPT-J | single card | without template |
| MPT-7b | single card | without template |
| Qwen1.5-110B | single card | with template |