
Issue with NaN error during fine-tuning SHAP-E #184

Open
blueangel1313 opened this issue Jun 27, 2023 · 2 comments

Comments

@blueangel1313

Hi ThreeStudio team,

We are currently facing an issue while fine-tuning SHAP-E for our fast text-to-3D model. We have been using the dataset from the Cap3D repo and the training code at https://github.com/crockwell/Cap3D/blob/main/text-to-3D/finetune_shapE.py. However, we consistently encounter a NaN error during training.

We have attempted to address the issue by reducing the learning rate, but unfortunately, it has not resolved the problem. As experts in this field, we would greatly appreciate any insights or assistance you can provide to help us overcome this NaN error.
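If it helps the discussion: beyond lowering the learning rate, a common first line of defense is to guard each training step against non-finite losses and clip gradient norms. A minimal PyTorch-style sketch (the `model`/`optimizer` usage is generic, not the actual Cap3D training code):

```python
import torch

def guarded_step(model, optimizer, loss, max_grad_norm=1.0):
    """Skip the parameter update when the loss is non-finite;
    otherwise backprop, clip gradients, and step."""
    if not torch.isfinite(loss):
        # Drop this batch entirely so NaNs never reach the weights.
        optimizer.zero_grad(set_to_none=True)
        return False
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    # Clipping bounds the update size even if a few gradients blow up.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    return True
```

For tracking down *where* the NaN first appears, `torch.autograd.set_detect_anomaly(True)` during a short debug run can also point at the offending op, at the cost of much slower training.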

Our goal is to create an exceptional open-sourced fast text-to-3D model by fine-tuning SHAP-E, and we believe that resolving this issue will enable us to achieve that.

Thank you for your time and support. If there is any additional information or details we can provide to facilitate the diagnosis and resolution of this problem, please let us know.

@DSaurus
Collaborator

DSaurus commented Jun 27, 2023

Hi, @blueangel1313. Would you mind briefly introducing what SHAP-E is and how you perform fine-tuning with ThreeStudio? Additionally, I'm curious to know whether the fine-tuning process is similar to current methods like DreamFusion or ProlificDreamer.

@blueangel1313
Author

Thank you for your response @DSaurus. SHAP-E is a text-to-3D model trained and publicly released by OpenAI. The standout feature of this model is its ability to generate a 3D model in a mere 1 minute and 30 seconds.

Regarding your query about fine-tuning, another researcher has attempted to fine-tune this model; the details are in the codebase linked above. From my understanding, their fine-tuning process is distinct from methods like DreamFusion or ProlificDreamer.

That researcher informed us that training took approximately 3 days on the full dataset and 1 day on the smaller human dataset. They used the AdamW optimizer and a CosineAnnealingLR scheduler with an initial learning rate of 1e-5 for fine-tuning SHAP-E. The batch sizes were 64 for SHAP-E and 256 for Point-E.
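Assuming standard PyTorch APIs, that optimizer/scheduler configuration would look roughly like the following (the `model` and `total_steps` names are placeholders, not the actual Cap3D code):

```python
import torch

def build_optimizer(model, total_steps, lr=1e-5):
    """AdamW + cosine annealing, matching the reported fine-tuning setup."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    # T_max is the number of scheduler steps over which the learning rate
    # decays from `lr` down to `eta_min` along a half-cosine curve.
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer, T_max=total_steps, eta_min=0.0
    )
    return optimizer, scheduler
```

Note that with a cosine schedule the effective learning rate early in training stays close to the initial 1e-5, so if NaNs appear in the first epochs, the schedule itself is unlikely to be the cause.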

However, an issue they encountered was that SHAP-E often produced NaN outputs, necessitating a restart from saved checkpoints. This could be one of the potential reasons why their fine-tuning process didn't result in significant improvements.
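One pragmatic way to automate that restart-from-checkpoint workaround is to snapshot training state periodically and roll back whenever the loss goes non-finite. A framework-agnostic sketch (all names here are hypothetical; it assumes batches are stochastic, so a retry from the checkpoint can take a different trajectory):

```python
import copy
import math

def train_with_rollback(step_fn, state, num_steps, snapshot_every=100):
    """Run step_fn(state) -> (new_state, loss) for num_steps successful
    steps, rolling back to the last snapshot whenever loss is NaN/inf."""
    snapshot = copy.deepcopy(state)
    restarts = 0
    step = 0
    while step < num_steps:
        state, loss = step_fn(state)
        if not math.isfinite(loss):
            # Discard the diverged updates and retry from the checkpoint.
            state = copy.deepcopy(snapshot)
            restarts += 1
            continue
        step += 1
        if step % snapshot_every == 0:
            snapshot = copy.deepcopy(state)
    return state, restarts
```

If the restarts happen at the same point every time regardless of data order, that would instead suggest a deterministic numerical issue (e.g. an fp16 overflow in a specific layer) rather than a bad batch.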

I hope this provides some clarity on your questions. Don't hesitate to reach out if you need further information.
