A curated list of recent robot learning papers that apply diffusion models to manipulation, navigation, planning, and more.
- Benchmarks
- Diffusion Policy
- Diffusion Generation Models in Robot Learning
- Robot Learning Utilizing Diffusion Model Properties
## Benchmarks

- Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations (RSS 2018)
- Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning (CoRL 2020)
- Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets (RSS 2022)
- DexMV: Imitation Learning for Dexterous Manipulation from Human Videos (ECCV 2022)
- Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning (NeurIPS 2022 Datasets and Benchmarks Track)
- DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects (CVPR 2023)
- BridgeData V2: A Dataset for Robot Learning at Scale (CoRL 2023)
- DexDeform: Dexterous Deformable Object Manipulation with Human Demonstrations and Differentiable Physics (To be checked)
## Diffusion Policy

- Imitating Human Behaviour with Diffusion Models (ICLR 2023)
- SE(3)-DiffusionFields: Learning Cost Functions for Joint Grasp and Motion Optimization through Diffusion (ICRA 2023)
- Diffusion Policy: Visuomotor Policy Learning via Action Diffusion (RSS 2023)
- Goal-Conditioned Imitation Learning Using Score-Based Diffusion Policies (RSS 2023)
- Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition (CoRL 2023)
- ChainedDiffuser: Unifying Trajectory Diffusion and Keypose Prediction for Robotic Manipulation (CoRL 2023)
- Memory-Consistent Neural Networks for Imitation Learning (ICLR 2024)
- EDMP: Ensemble-of-Costs-Guided Diffusion for Motion Planning (ICRA 2024)
- DiffSkill: Improving Reinforcement Learning Through Diffusion-Based Skill Denoiser for Robotic Manipulation (Knowledge-Based Systems 2024)
- Consistency Policy: Accelerated Visuomotor Policies via Consistency Distillation (RSS 2024)
- Track2Act: Predicting Point Tracks from Internet Videos Enables Generalizable Robot Manipulation (ECCV 2024)
- Differentiable Robot Rendering (CoRL 2024)
- Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning (CoRL 2024)
- Equivariant Diffusion Policy (CoRL 2024)
- GenDP: 3D Semantic Fields for Category-Level Generalizable Diffusion Policy (CoRL 2024)
- EquiBot: SIM(3)-Equivariant Diffusion Policy for Generalizable and Data-Efficient Learning (CoRL 2024)
- 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations (Mar 2024)
- Vision-Language-Affordance-Based Robot Manipulation with Flow Matching (Sep 2024)
- Planning-Guided Diffusion Policy Learning for Generalizable Contact-Rich Bimanual Manipulation (Sep 2024)
- Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression (Dec 2024)
- PlayFusion: Skill Acquisition via Diffusion from Language-Annotated Play
- Generative Skill Chaining: Long-Horizon Skill Planning with Diffusion Models
- XSkill: Cross Embodiment Skill Discovery
- Waypoint-Based Imitation Learning for Robotic Manipulation
- UMI on Legs: Making Manipulation Policies Mobile with Manipulation-Centric Whole-Body Controllers
- Provable Guarantees for Generative Behavior Cloning: Bridging Low-Level Stability and High-Level Behavior (theory-focused)
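Most policies in this section share the same core recipe popularized by Diffusion Policy: sample an action sequence from Gaussian noise and iteratively denoise it with a learned, observation-conditioned noise-prediction network. A minimal numpy sketch of that DDPM-style reverse process, with a hand-rolled stand-in in place of the trained network (the `dummy_eps` function and all parameter choices here are illustrative assumptions, not any paper's actual settings):

```python
import numpy as np

def ddpm_sample_actions(eps_model, obs, horizon=16, action_dim=2,
                        n_steps=50, rng=None):
    """Toy DDPM-style reverse process over an action sequence.

    eps_model(obs, noisy_actions, t) predicts the noise injected at
    diffusion step t; it stands in for a trained denoising network.
    """
    rng = rng or np.random.default_rng(0)
    betas = np.linspace(1e-4, 0.02, n_steps)   # linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    # Start from pure Gaussian noise and denoise step by step.
    x = rng.standard_normal((horizon, action_dim))
    for t in reversed(range(n_steps)):
        eps = eps_model(obs, x, t)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / np.sqrt(alphas[t])
        if t > 0:  # add fresh noise on every step except the last
            x += np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return x

# Stand-in "network"; a real policy conditions on images/proprioception
# carried in `obs` instead of ignoring it.
dummy_eps = lambda obs, x, t: 0.1 * x
actions = ddpm_sample_actions(dummy_eps, obs=None)
```

In the real systems the predicted sequence is executed receding-horizon style: only the first few actions are applied before re-planning from the new observation.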
## Robot Learning Utilizing Diffusion Model Properties

- Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-Modal Rearrangement (CoRL 2023)
- Learning Score-Based Grasping Primitive for Human-Assisting Dexterous Grasping (NeurIPS 2023)
- ReorientDiff: Diffusion Model Based Reorientation for Object Manipulation (ICRA 2024)
- DexDiffuser: Generating Dexterous Grasps with Diffusion Models (Feb 2024)
- NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration
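Several entries above (and SE(3)-DiffusionFields earlier) treat grasp or pose generation as sampling from a learned score or energy field, which naturally captures the multi-modality of valid grasps. A minimal sketch of the underlying sampler, unadjusted Langevin dynamics, where a quadratic well around a hypothetical "good pose" stands in for the learned score function (the target location and step sizes are illustrative assumptions):

```python
import numpy as np

def langevin_sample(score_fn, dim=2, n_steps=200, step=1e-2, rng=None):
    """Unadjusted Langevin dynamics:
    x <- x + (step/2) * score(x) + sqrt(step) * noise."""
    rng = rng or np.random.default_rng(0)
    x = rng.standard_normal(dim)
    for _ in range(n_steps):
        x = (x + 0.5 * step * score_fn(x)
             + np.sqrt(step) * rng.standard_normal(dim))
    return x

# Stand-in score: gradient of the log-density of a Gaussian centred on a
# hypothetical "good grasp" pose; a learned score network replaces this.
target = np.array([1.0, -1.0])
score = lambda x: -(x - target)
pose = langevin_sample(score)
```

Running the chain from different random initializations yields different samples, which is exactly why these methods can propose diverse grasps instead of regressing a single one.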
## Diffusion Generation Models in Robot Learning

- DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics (RA-L 2023)
- UniPi: Learning Universal Policies via Text-Guided Video Generation (NeurIPS 2023)
- AVDC: Learning to Act from Actionless Videos through Dense Correspondences (ICLR 2024)
- UniSim: Learning Interactive Real-World Simulators (ICLR 2024)
- HiP: Compositional Foundation Models for Hierarchical Planning (NeurIPS 2023)
- DMD: Diffusion Meets DAgger: Supercharging Eye-in-Hand Imitation Learning (Feb 2024)
- VLP: Video Language Planning (ICLR 2024)
- Dreamitate: Real-World Visuomotor Policy Learning via Video Generation (CoRL 2024)
- ARDuP: Active Region Video Diffusion for Universal Policies (Jun 2024)
- This&That: Language-Gesture Controlled Video Generation for Robot Planning (Jul 2024)
- RoboDreamer: Learning Compositional World Models for Robot Imagination (ICML 2024)
- CLOVER: Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation (NeurIPS 2024)
- CACTI: A Framework for Scalable Multi-Task Multi-Scene Visual Imitation Learning (CoRL 2022 Workshop PRL)
- GenAug: Retargeting Behaviors to Unseen Situations via Generative Augmentation
- Scaling Robot Learning with Semantically Imagined Experience
- SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution
- Large-Scale Actionless Video Pre-Training via Discrete Diffusion for Efficient Policy Learning
- Diffusion Model Is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning
- IRASim: Learning Interactive Real-Robot Action Simulators (arXiv, Jun 2024)
- Structured World Models from Human Videos (RSS 2023)
- HARP: Autoregressive Latent Video Prediction with High-Fidelity Image Generator (ICIP 2022)
- DayDreamer: World Models for Physical Robot Learning (CoRL 2022)
- MimicGen: A Data Generation System for Scalable Robot Learning Using Human Demonstrations
- PoCo: Policy Composition from and for Heterogeneous Robot Learning (To be checked)
- Diffusion Forcing: Next-Token Prediction Meets Full-Sequence Diffusion (To be checked)
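A recurring pattern in this section (UniPi, AVDC, VLP, Dreamitate) is to use the generative model as a planner: sample candidate futures, score them against the goal, and execute the best one (typically via an inverse-dynamics model). A toy generate-and-select sketch, where random-walk arrays stand in for generated video trajectories and a distance-to-goal score replaces the learned scorer (all names and shapes here are illustrative assumptions):

```python
import numpy as np

def plan_by_generation(generate, score, n_candidates=32, rng=None):
    """Sample candidate future trajectories from a generative model and
    keep the one that best matches the goal -- the selection step behind
    'generate video, then act' planning pipelines."""
    rng = rng or np.random.default_rng(0)
    candidates = [generate(rng) for _ in range(n_candidates)]
    scores = [score(c) for c in candidates]
    return candidates[int(np.argmax(scores))]

# Stand-ins: a "trajectory" is an 8-step random walk in 2D, and the
# goal is to end near (1, 1); a video diffusion model and a learned
# goal-matching score replace these in the real systems.
goal = np.array([1.0, 1.0])
gen = lambda rng: np.cumsum(0.3 * rng.standard_normal((8, 2)), axis=0)
sc = lambda traj: -np.linalg.norm(traj[-1] - goal)
best = plan_by_generation(gen, sc)
```

The closed-loop variants (CLOVER, UniSim-style simulators) repeat this loop after every few executed actions, re-generating futures from the latest observation.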