Adapter small fix #356
base: main
Conversation
Thanks for this!!
Despite seeing all the green lights for merging, don't do it just yet.
Thanks. Before we land this, I'd like to run the finetuning to make sure it is still training as expected. I'll do that in the next day or so.
I don't have a GPU (yeah, I know 😄 ), so I want to excuse myself in advance for any stupid questions/suggestions. Basically, the problem is that I wasn't able to test my suspicions against the checkpoints for this repo.
I feel like you already knew/discussed this; nevertheless I wanted to mention it. By the way: padding up to the nearest multiple of 64 is, in my opinion, useful only for
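For what it's worth, here is a minimal sketch of what rounding the vocabulary size up to the nearest multiple of 64 typically looks like. The helper name `pad_vocab_size` and the example numbers are illustrative assumptions, not lit-llama's own code:

```python
import math

def pad_vocab_size(vocab_size: int, multiple: int = 64) -> int:
    """Round the vocabulary size up to the nearest multiple of `multiple`,
    so the embedding and output-projection dimensions are tensor-core friendly."""
    return int(math.ceil(vocab_size / multiple)) * multiple

# LLaMA's 32000-token vocabulary is already a multiple of 64, so padding is a no-op;
# an odd size such as 32017 would be rounded up to 32064.
assert pad_vocab_size(32000) == 32000
assert pad_vocab_size(32017) == 32064
```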
Hello @awaelchli
Any luck with this?
Hi there 👋
As @carmocca mentioned in PR #352, some code changes need to be done:
- `self.n_embd` --> `C`, since this value is extracted from the shape of the input variable `x` right at the beginning of the forward method.
- `vocab_size` --> `padded_vocab_size`, to align it with `lit_llama/model.py`.

I assume the checkpoints won't go south after this, since it is just an expansion in size for better performance (I believe up to 25%). With shrinkage it would be a whole other story.
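Not the repository code, just a rough sketch of how the two renames would play out in a GPT-style forward pass; the class and attribute names (`TinyGPT`, `wte`, `lm_head`) are illustrative assumptions:

```python
import torch
import torch.nn as nn

class TinyGPT(nn.Module):
    """Toy GPT-like module illustrating the two renames discussed above."""

    def __init__(self, padded_vocab_size: int, n_embd: int) -> None:
        super().__init__()
        # Both the embedding and the output head use the *padded* vocab size,
        # mirroring lit_llama/model.py.
        self.wte = nn.Embedding(padded_vocab_size, n_embd)
        self.lm_head = nn.Linear(n_embd, padded_vocab_size, bias=False)

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        x = self.wte(idx)        # (B, T, C)
        # C is read off the hidden states, so self.n_embd is not needed here.
        B, T, C = x.size()
        # ... transformer blocks (with adapter prompts) would go here ...
        return self.lm_head(x)   # (B, T, padded_vocab_size)
```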