Overview
This release adds two features to our PPO implementation, both based on our research:
- Random Network Distillation (RND) - Encourages exploration by adding a curiosity-driven intrinsic reward.
- Symmetry-based Augmentation - Makes the learned behaviors more symmetrical.
We thank the authors of these works for their help in adding these valuable contributions to the library.
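
For readers new to the two techniques, here is a minimal NumPy sketch of the ideas behind them. It is illustrative only and is not the library's API: `LinearRND`, `mirror_augment`, and the sign vectors are hypothetical names, and the linear "networks" stand in for the neural networks a real implementation would use. RND rewards states where a trained predictor fails to match a fixed random target network. The symmetry sketch assumes mirroring can be written as per-dimension sign flips; a real robot would typically also need to swap left and right indices.

```python
import numpy as np


class LinearRND:
    """Toy RND module with linear maps in place of neural networks (assumption)."""

    def __init__(self, obs_dim, embed_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed, randomly initialised target: never trained.
        self.target = rng.normal(size=(obs_dim, embed_dim))
        # Predictor: trained to imitate the target on visited observations.
        self.predictor = np.zeros((obs_dim, embed_dim))

    def intrinsic_reward(self, obs):
        # Per-sample mean squared prediction error; large for unfamiliar observations.
        err = obs @ self.predictor - obs @ self.target
        return np.mean(err ** 2, axis=-1)

    def update(self, obs, lr=0.05):
        # One gradient-descent step on the batch-averaged squared error.
        err = obs @ self.predictor - obs @ self.target
        self.predictor -= lr * obs.T @ err / len(obs)


def mirror_augment(obs, actions, obs_sign, act_sign):
    """Append a mirrored copy of every (observation, action) pair to the batch."""
    return (
        np.concatenate([obs, obs * obs_sign]),
        np.concatenate([actions, actions * act_sign]),
    )
```

In a PPO loop, the intrinsic reward would typically be scaled and added to the environment reward, so that it shrinks as the predictor learns the visited states. The mirrored samples would be appended to each training batch so the policy sees both the original and the reflected experience.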
Full Changelog: v2.1.2...v2.2.0