Consultations regarding DeepSeekR1 #249

Open
baibaiyun opened this issue Feb 3, 2025 · 0 comments

baibaiyun commented Feb 3, 2025

Hi DeepSeekR1 Team,

Thank you for publishing the DeepSeekR1 paper. It was an insightful read, and I would like to ask a few questions about some aspects of the work:

  1. Would it be possible to share the training data, including the detailed distillation data used for DeepSeekR1 SFT (even a few sample query and answer pairs would be wonderful), as well as detailed training data samples and intermediate responses for DeepSeekR1 RL?
  2. Could you share more about how the aha moment (“Wait, wait. Wait. That’s an aha moment I can flag here”) was incorporated into the DeepSeekR1-Zero RL training process?
  3. Regarding DeepSeekR1 RL, would you mind sharing the techniques used to decompose the query and to generate multiple groups of outputs/trajectories per query, as well as the granularity of each trajectory (token level or sentence level)?
  4. Will the detailed DeepSeekR1 RL learning process be disclosed so that we can learn from it?

Thank you in advance for your response!
Yun

baibaiyun changed the title from "Training data details" to "Consultations regarding DeepSeekR1" on Feb 3, 2025