How do you compute the gradient projection? #1

Open
aztec1900 opened this issue Feb 27, 2024 · 4 comments

Comments

@aztec1900

Impressive work on the innovative data selection method!
I recently finished reading your paper, and I'm particularly curious about how the gradient projection is computed. The paper mentions using a 125M-parameter model and reducing the gradient dimension to 16384. Does this imply storing a 125M x 16384 projection matrix, i.e. roughly 2048G entries? That seems impractical given memory constraints. Even if the random projection matrix were generated on the fly, the computational cost of the projection would still be substantial. However, the paper suggests that the projection costs only 1% of the forward-backward pass. I find this confusing. Could you share some details on how this works? Thank you very much!
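
To make the memory concern concrete, here is a minimal sketch of the arithmetic and of the on-the-fly alternative raised in the question. It is not the authors' implementation, which had not been released at the time of this thread. The function names, the Rademacher (+1/-1) projection, the per-block seeding scheme, and the `chunk_size` parameter are all assumptions made for illustration. The sketch only addresses memory: it does the same number of multiply-adds as a dense projection, so it says nothing about the 1% cost figure.

```python
import numpy as np


def dense_projection_bytes(num_params, proj_dim, bytes_per_entry=4):
    """Bytes needed to store a dense num_params x proj_dim projection matrix."""
    return num_params * proj_dim * bytes_per_entry


def project_on_the_fly(grad, proj_dim, seed=0, chunk_size=4096):
    """Project a flat gradient of shape [p] down to shape [proj_dim].

    Hypothetical helper: the random +1/-1 matrix is regenerated block by
    block, so at most a chunk_size x proj_dim block is held in memory.
    chunk_size must stay fixed across calls, otherwise the implied
    projection matrix changes.
    """
    grad = np.asarray(grad, dtype=np.float32)
    out = np.zeros(proj_dim, dtype=np.float32)
    for block_idx, start in enumerate(range(0, grad.shape[0], chunk_size)):
        chunk = grad[start:start + chunk_size]
        # Seeding with (seed, block_idx) reproduces the same block for
        # every gradient that is projected.
        rng = np.random.default_rng([seed, block_idx])
        signs = rng.integers(0, 2, size=(chunk.shape[0], proj_dim), dtype=np.int8) * 2 - 1
        out += chunk @ signs.astype(np.float32)
    # Scale so that squared norms are preserved in expectation.
    return out / np.sqrt(proj_dim)
```

With these assumptions, `dense_projection_bytes(125_000_000, 16384)` comes to about 8.2 TB in fp32, while the chunked version with `chunk_size=4096` and `proj_dim=16384` holds one int8 block of 64 MiB plus its 256 MiB float32 copy at a time.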

@yuzc19

yuzc19 commented Mar 15, 2024

Hi @aztec1900, have you made any progress on this issue? I am also very interested in it, but I couldn't find the code to estimate datamodels in this repo.

@lengstrom
Member

Hi, sorry about the late response. I will release the implementation code soon!

@RohanAhluwalia

Is there any update on this? I would love to know how this works as well, and what optimizations could be done.

@311dada

311dada commented Jan 9, 2025

Is there any update on this?
