Consultations regarding DeepSeekR1 #249

baibaiyun · 2025-02-03T16:40:39Z

Hi DeepSeekR1 Team,

I am truly grateful for the opportunity to read the DeepSeekR1 paper. It was an insightful read, and I would like to ask a few questions about some aspects of the work.
My questions are below:

Is it possible to share the training data for DeepSeekR1 SFT, including the detailed distillation data (even a few sample query-answer pairs would be wonderful), as well as detailed training data samples and intermediate responses for DeepSeekR1 RL?
Is it possible to share more about how the aha moment “Wait, wait. Wait. That’s an aha moment I can flag here” was incorporated into the DeepSeekR1-Zero RL training process?
Regarding DeepSeekR1 RL, would you mind sharing the techniques used to decompose the query and generate multiple groups of outputs/trajectories per query, and the granularity of each trajectory (token or sentence level)?
Will the detailed DeepSeekR1 RL learning process be disclosed for us to learn from?
Thank you in advance for your response!
Yun

baibaiyun changed the title ~~Training data details~~ Consultations regarding DeepSeekR1 Feb 3, 2025
