Generation problem after / before instruction fine-tuning #51
Comments
I have the same issue.
I have the same issue, and the saved model is 26 GB.
@Xuan-ZW Do you see any errors during training?
@puyuanliu Do you have the full code snippet for loading the model and running generation? I suspect something odd may have happened when loading the weights from the fine-tuned model.
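If it helps, a quick sanity check along these lines can show whether a checkpoint loaded cleanly. This is only a minimal sketch assuming a Hugging Face-format checkpoint; the path is a placeholder, not taken from this thread:

```python
# Sketch: inspect a few parameter statistics of the loaded model to spot
# obviously corrupted weights (NaNs, all-zero tensors). The checkpoint
# path is a placeholder.
import torch
from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained("path/to/converted-llama-7b")
for name, param in list(model.named_parameters())[:5]:
    stats = (param.float().mean().item(), param.float().std().item())
    print(name, tuple(param.shape), stats)
    assert not torch.isnan(param).any(), f"NaNs found in {name}"
```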
@helloeve Yeah, I do. It's mentioned in #48 (comment).
@puyuanliu I was able to run prediction without an issue. The only difference compared to your method is that I didn't use
@helloeve Thanks a lot! I found the issue was in model saving. For some reason, the script runs into a CUDA OOM error when saving the model, and the saved checkpoint ends up corrupted. I fixed the issue and my model is now working.
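One common way to avoid an OOM while writing a checkpoint is to move the weights to CPU before saving. This is only a hedged sketch of that idea, not necessarily the fix applied above; `model` and the output path are placeholders:

```python
# Sketch: copy the state dict to CPU before saving so the checkpoint write
# does not allocate additional GPU memory. `model` and the output path are
# placeholders, not taken from this thread.
import torch

cpu_state_dict = {k: v.detach().cpu() for k, v in model.state_dict().items()}
torch.save(cpu_state_dict, "finetuned-llama-7b/pytorch_model.bin")
```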
I have the same issue. @puyuanliu did you figure out why the OOM error is happening?
I mentioned the solution in #81 (comment).
I finally managed to resolve the issue. The problem was an inconsistency in the scripts: I had used a newer version of the LLaMA conversion script, which is incompatible with the current version of LLaMA. After reverting to the correct commit, the issue went away.
@puyuanliu When generating, I loaded the model with `model = model.to("cuda")`. I have 8 A100 GPUs but still ran into an OOM error; PyTorch seems to load the whole model onto a single A100. How did you fix this issue?
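A common way to spread the weights across several GPUs at load time is the `device_map="auto"` option in transformers (which relies on accelerate). This is only a hedged sketch of that approach, not necessarily what others in this thread did; the checkpoint path is a placeholder:

```python
# Sketch: shard the model across all visible GPUs instead of placing it on
# a single device. Requires the `accelerate` package; the checkpoint path
# is a placeholder.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

path = "path/to/converted-llama-7b"
tokenizer = LlamaTokenizer.from_pretrained(path)
model = LlamaForCausalLM.from_pretrained(
    path,
    torch_dtype=torch.float16,
    device_map="auto",  # let accelerate place layers on the available GPUs
)
```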
Environment: 6× A6000 48 GB, Ubuntu 22.04, PyTorch 1.13.0
I ran into a generation problem after following your instructions to convert the LLaMA-7B weights using the attached script.
I simply used the following snippet to test generation directly after loading the converted LLaMA-7B model:
tokenizer.batch_decode(model.generate(**tokenizer('I want to ', return_tensors="pt")))
The output of the above code is:
'I want to acoérницschutzirectorioieckťDEX threshold släktetolasĭüttpiel'
The problem happens both before and after following your README for instruction fine-tuning. (Note that the loss decreases over time during fine-tuning, which seems OK.)
I have no problem running generation with the original LLaMA code. Could you share your generation script so I can track down what caused the problem? Thanks.
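For reference, a generation call along the following lines works with a correctly converted checkpoint. This is only a hedged sketch, not the repository's official generation script; the checkpoint path, device handling, and generation settings are assumptions:

```python
# Sketch of a basic generation run with a converted LLaMA-7B checkpoint.
# The checkpoint path and generation parameters are placeholders.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

path = "path/to/converted-llama-7b"
tokenizer = LlamaTokenizer.from_pretrained(path)
model = LlamaForCausalLM.from_pretrained(path, torch_dtype=torch.float16).to("cuda")

inputs = tokenizer("I want to ", return_tensors="pt").to("cuda")
output_ids = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0])
```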