Hi DeepSeek-R1 Team,
I am truly grateful for the opportunity to read the DeepSeek-R1 paper. It has been an insightful experience, and I would like to ask a few questions about some aspects of the work.
My questions are below:
Is it possible to share the training data, including the detailed distillation data (a few sample query-answer pairs would be wonderful) for DeepSeek-R1 SFT, as well as detailed training data samples and intermediate responses for DeepSeek-R1 RL?
Is it possible to share more about how the aha moment “Wait, wait. Wait. That’s an aha moment I can flag here” was incorporated into the DeepSeek-R1-Zero RL training process?
Regarding DeepSeek-R1 RL, would you mind sharing the techniques used to decompose the query and to generate multiple groups of outputs/trajectories per query, as well as the granularity of each trajectory (token level or sentence level)?
Will the detailed DeepSeek-R1 RL training process be disclosed so that we can learn from it?
Thank you in advance for your response!
Yun
baibaiyun changed the title from "Training data details" to "Consultations regarding DeepSeekR1" on Feb 3, 2025