[Feature] Terminated/truncated support and Gymnasium wrapper #143
Hey Lukas!!
Thanks a million for this, it is very cool to have it and I think it will help many users. It is about time VMAS allowed the option of truncated/terminated.
A few high-level tenets we should keep in mind:
- vmas currently depends on `gym` only for the specs. I think those are fine even if the library is unmaintained. I would like to avoid adding a core gymnasium dependency and keep the old specs. Gymnasium can be an optional dependency and its wrapper can handle the spec conversion
- I would like to keep the vmas environment separated from the gym/gymnasium way of handling things. The only change I think we need in the vmas env interface is the terminated/truncated one, I would keep the rest as it was. The flag to get terminated and truncated instead of done can be called `terminated_truncated` instead of `legacy_gym` and be false by default
- it would be cool if we could support vectorization in the gymnasium wrapper (maybe using numpy). Do they have no way of doing this?
requirements.txt
Outdated
@@ -2,4 +2,5 @@ numpy | |||
torch | |||
pyglet<=1.5.27 | |||
gym | |||
gymnasium |
gymnasium |
@@ -45,12 +44,12 @@ def __init__( | |||
multidiscrete_actions: bool = False, | |||
clamp_actions: bool = False, | |||
grad_enabled: bool = False, | |||
legacy_gym: bool = True, | |||
render_mode: str = "human", |
render_mode: str = "human", |
if not self.legacy_gym: | ||
# for gymnasium compatibility, return info | ||
return_info = True |
if not self.legacy_gym: | |
# for gymnasium compatibility, return info | |
return_info = True |
The functionality of vmas `reset` should remain the same; if users want the info they can request it from the args. The gymnasium wrapper can do this (torchrl also does this)
@@ -77,6 +80,7 @@ def __init__( | |||
self.headless = None | |||
self.visible_display = None | |||
self.text_lines = None | |||
self.render_mode = render_mode |
self.render_mode = render_mode |
The render mode of vmas envs is decided dynamically at each render call
if self.legacy_gym: | ||
observations = self.reset(seed=seed) | ||
else: | ||
observations, _ = self.reset(seed=seed) |
if self.legacy_gym: | |
observations = self.reset(seed=seed) | |
else: | |
observations, _ = self.reset(seed=seed) | |
observations = self.reset(seed=seed) |
README.md
Outdated
@@ -154,6 +153,8 @@ Here is an example: | |||
``` | |||
A further example that you can run is contained in `use_vmas_env.py` in the `examples` directory. | |||
|
|||
To use an environment with the Gymnasium interface, give the additional `legacy_gym=False` argument. |
We can explain above what the `terminated_truncated` option does
README.md
Outdated
@@ -133,8 +133,7 @@ pip install pytest pyyaml pytest-instafail tqdm | |||
|
|||
To use the simulator, simply create an environment by passing the name of the scenario | |||
you want (from the `scenarios` folder) to the `make_env` function. | |||
The function arguments are explained in the documentation. The function returns an environment | |||
object with the OpenAI gym interface: | |||
The function arguments are explained in the documentation. The function returns an environment object with the OpenAI Gym interface: |
The function arguments are explained in the documentation. The function returns an environment object with the OpenAI Gym interface: | |
The function arguments are explained in the documentation. The function returns an environment object with the VMAS interface: |
README.md
Outdated
@@ -143,7 +142,7 @@ Here is an example: | |||
num_envs=32, | |||
device="cpu", # Or "cuda" for GPU | |||
continuous_actions=True, | |||
wrapper=None, # One of: None, vmas.Wrapper.RLLIB, and vmas.Wrapper.GYM | |||
wrapper=None, # One of: None, vmas.Wrapper.RLLIB or "rllib", and vmas.Wrapper.GYM or "gym", and vmas.Wrapper.GYMNASIUM or "gymnasium" |
wrapper=None, # One of: None, vmas.Wrapper.RLLIB or "rllib", and vmas.Wrapper.GYM or "gym", and vmas.Wrapper.GYMNASIUM or "gymnasium" | |
wrapper=None, # One of: None, "rllib", "gym", "gymnasium" |
README.md
Outdated
@@ -176,7 +177,7 @@ on how to run MAPPO-IPPO-MADDPG-QMIX-VDN using the [VMAS wrapper](https://github | |||
|
|||
### Input and output spaces | |||
|
|||
VMAS uses gym spaces for input and output spaces. | |||
VMAS uses gym (or gymnasium if `legacy_gym=False`) spaces for input and output spaces. |
VMAS uses gym (or gymnasium if `legacy_gym=False`) spaces for input and output spaces. | |
VMAS uses gym spaces for input and output spaces. |
setup.py
Outdated
@@ -29,6 +29,6 @@ def get_version(): | |||
author="Matteo Bettini", | |||
author_email="[email protected]", | |||
packages=find_packages(), | |||
install_requires=["numpy", "torch", "pyglet<=1.5.27", "gym", "six"], | |||
install_requires=["numpy", "torch", "pyglet<=1.5.27", "gym", "gymnasium", "six"], |
we can instead add options for `vmas[gymnasium]`, `vmas[rllib]` and so on
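As a rough illustration of the optional-dependency idea in this comment, the groups could be declared as a plain dictionary and handed to `setuptools.setup` through its `extras_require` parameter. The group contents, the version pin, and the `all` convenience group below are assumptions made for the sketch, not the contents of the actual `setup.py`.

```python
# Hypothetical optional-dependency groups; package names and pins are assumptions
extras_require = {
    "gymnasium": ["gymnasium", "shimmy"],
    "rllib": ["ray[rllib]<=2.2"],
}
# Hypothetical convenience group that pulls in every optional dependency
extras_require["all"] = sorted(
    {dep for deps in extras_require.values() for dep in deps}
)

# This dict would be passed as `extras_require=extras_require` to `setuptools.setup(...)`
```

With groups like these, `pip install vmas[gymnasium]` would install only the extra packages listed under that key, while a plain `pip install vmas` keeps the core dependency set unchanged.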
Just poking around it seems they support a vector env https://gymnasium.farama.org/api/vector/ interface maybe we can use this? or have 2 wrappers "gymnasium" and "gymnasium_vector"? |
Hi Matteo, Thanks for coming back quickly with comments. Some follow-ups so I best understand how this should look like:
Currently, VMAS uses
So you'd want the
Yes, Gymnasium has wrappers to run multiple instances of environments in vectorised fashion, either synchronously or asynchronously. However, I think it might even be more efficient to write a vectorised gymnasium wrapper that uses a vectorised VMAS environment instance underneath and converts things to numpy arrays e.g. instead of having multiple gymnasium environments each of which holds a VMAS instance with a single environment only underneath. The latter would likely be notably slower. Let me know what you think and I'll have a go at this later today/ this week! |
exactly
Nono, I would like to handle it like you have already done it in the PR. The only differences from the PR I am suggesting is that the The way you implemented
Yes that is what i am referring to: wrap a vmas env (which has multiple subenvs) into a gymnasium vector and then just call If we implement it as a vector of single vmas envs, I would personally resign from my PhD ahaahahah The question here is if we should still have the single gymnasium env wrapper or not |
Just a further clarification on this. I would like exactly the opposite:
Gotcha, I think I understood what you mean 👍 I'll rename the environment argument flag as suggested. Just to make sure, for the base VMAS environment class, you'd want the only change induced by the flag to be the different in I'll also have a go at the Gymnasium wrapper. Since gymnasium separates singleton and vectorised environments, I'm tempted to keep these things separate here as well and have separate wrappers for a singleton Gymnasium and vectorised Gymnasium environments. |
Exactly, and then the gymnasium wrapper can call reset with All good on all fronts |
cc @Giovannibriglia since we wanted to implement a StableBaselines3 wrapper, maybe the Gymnasium Vector we will work on here will make it easier to bootstrap the SB3 one |
- base VMAS environment uses OpenAI gym spaces - base VMAS environment has new flag `terminated_truncated` (default: False) that determines whether `done()` and `step()` return the default `done` value or separate values for `terminated` and `truncated` - update `gymnasium` wrapper to convert gym spaces of base environment to gymnasium spaces - add `gymnasium_vec` wrapper that can wrap vectorized VMAS environment as gymnasium environment - add new installation options of VMAS for optional dependencies (used for features like rllib, torchrl, gymnasium, rendering, testing) - add `return_numpy` flag in gymnasium wrappers (default: True) to determine whether to convert torch tensors to numpy --> passed through by `make_env` function - add `render_mode` flag in gymnasium wrappers (default: "human") to determine mode to render --> passed through by `make_env` function
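To make the `terminated_truncated` behaviour listed in this commit summary concrete, here is a minimal sketch using a toy stand-in class. `ToyEnv` is hypothetical and is not the VMAS `Environment` (which returns batched tensors); the sketch assumes that the legacy `done` flag is the logical OR of `terminated` and `truncated`, and that truncation is triggered by reaching `max_steps`.

```python
from typing import Optional, Tuple, Union


class ToyEnv:
    """Hypothetical stand-in for an environment with a `terminated_truncated` flag."""

    def __init__(self, max_steps: Optional[int] = None, terminated_truncated: bool = False):
        self.max_steps = max_steps
        self.terminated_truncated = terminated_truncated
        self.steps = 0
        self.goal_reached = False

    def done(self) -> Union[bool, Tuple[bool, bool]]:
        # terminated: the task itself ended; truncated: the step horizon was hit
        terminated = self.goal_reached
        truncated = self.max_steps is not None and self.steps >= self.max_steps
        if self.terminated_truncated:
            return terminated, truncated
        # Legacy behaviour (assumed): a single flag that merges both conditions
        return terminated or truncated

    def step(self, reach_goal: bool = False) -> Union[bool, Tuple[bool, bool]]:
        self.steps += 1
        self.goal_reached = self.goal_reached or reach_goal
        return self.done()


legacy = ToyEnv(max_steps=2)
split = ToyEnv(max_steps=2, terminated_truncated=True)
legacy_out = [legacy.step() for _ in range(2)]
split_out = [split.step() for _ in range(2)]
```

Keeping the flag off by default means existing callers that unpack a single `done` value are unaffected, while a Gymnasium-style wrapper can switch it on to obtain both flags.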
@matteobettini I pushed the updated integration including a vectorized Gymnasium wrapper. I tested things via the provided Please let me know if there are any further changes you would like to see! As a note, I slightly modified the |
Love this and so cool to see the vec wrapper!
I left some comments on the vmas stuff, once all that is settled i will read and test the new wrappers
README.md
Outdated
# install wandb logging dependencies | ||
pip install vmas[wandb] | ||
|
||
# install torchrl dependencies for training with BenchMARL | ||
pip install vmas[torchrl] | ||
|
# install wandb logging dependencies | |
pip install vmas[wandb] | |
# install torchrl dependencies for training with BenchMARL | |
pip install vmas[torchrl] |
I think we can remove these as they are not actual dependencies, same in setup.py
requirements.txt
Outdated
@@ -3,3 +3,4 @@ torch | |||
pyglet<=1.5.27 | |||
gym | |||
six | |||
cvxpylayers |
cvxpylayers |
I definitely do not want to depend on this library, let's not add it for now, I'll fix the navigation heuristic later. Same in setup
vmas/make_env.py
Outdated
return_numpy: bool = False, | ||
render_mode: str = "human", |
We should not have them as individual args otherwise in a future with 10+ wrappers we will go crazy. What we can consider is wrapper_kwargs: Optional[Dict] = None
vmas/simulator/environment/gym.py
Outdated
|
||
def unwrapped(self) -> Environment: | ||
return self._env | ||
|
||
def _ensure_obs_type(self, obs): | ||
return obs.detach().cpu().numpy() if self.return_numpy else obs |
You can use `def to_numpy(data: Union[Tensor, Dict[str, Tensor], List[Tensor]]):` but here `.item()` should be fine and better.
EDIT: maybe not actually, because the obs and the info are arrays. OK, then the first thing above should do it.
`.item()` would not work here afaik since these values might not be scalars but tensors.
from typing import List, Optional | ||
|
||
import gym | ||
import gymnasium |
import gymnasium | |
_has_gymnasium = importlib.util.find_spec("gymnasium") is not None | |
if _has_gymnasium: | |
import gymnasium |
maybe we need to do this also in the rllib file
from vmas.simulator.utils import extract_nested_with_index | ||
|
||
|
||
def _convert_space(space: gym.Space) -> gymnasium.Space: |
I really would like to avoid maintaining this function, does gymnasium not have a conversion tool in their library?
vmas/make_env.py
Outdated
max_steps: Optional[int] = None, | ||
seed: Optional[int] = None, | ||
dict_spaces: bool = False, | ||
multidiscrete_actions: bool = False, | ||
clamp_actions: bool = False, | ||
grad_enabled: bool = False, | ||
terminated_truncated: bool = False, | ||
wrapper_kwargs: dict = {}, |
wrapper_kwargs: dict = {}, | |
wrapper_kwargs: Optional[Dict] = None, |
let's not use mutables as defaults, as in Python they cause a lot of trouble
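The trouble with mutable defaults is that Python evaluates the default once, at function definition time, so every call shares the same object. A minimal sketch of the difference, using hypothetical function names rather than the real `make_env`:

```python
from typing import Dict, Optional


def make_env_bad(wrapper_kwargs: dict = {}) -> dict:
    # The same dict object is reused on every call, so mutations leak between calls
    wrapper_kwargs.setdefault("calls", 0)
    wrapper_kwargs["calls"] += 1
    return wrapper_kwargs


def make_env_good(wrapper_kwargs: Optional[Dict] = None) -> dict:
    # A fresh dict is created on each call when none is given
    if wrapper_kwargs is None:
        wrapper_kwargs = {}
    wrapper_kwargs.setdefault("calls", 0)
    wrapper_kwargs["calls"] += 1
    return wrapper_kwargs


make_env_bad()
bad_second = make_env_bad()["calls"]    # state from the first call is still there
make_env_good()
good_second = make_env_good()["calls"]  # each call starts from a clean dict
```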
vmas/simulator/environment/gym.py
Outdated
@@ -17,18 +17,25 @@ class GymWrapper(gym.Env): | |||
def __init__( | |||
self, | |||
env: Environment, | |||
return_numpy: bool = False, | |||
**kwargs, |
**kwargs, |
Here and in the other wrappers it is better to consume all args so that it results in error if users pass the wrong args
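A short sketch of why dropping `**kwargs` helps: with a catch-all, a misspelled keyword is silently ignored, whereas an explicit signature raises `TypeError` immediately. Both classes below are hypothetical stand-ins, not the VMAS wrappers.

```python
class LooseWrapper:
    def __init__(self, env, return_numpy: bool = True, **kwargs):
        # Unknown keywords are swallowed silently, so typos go unnoticed
        self.env = env
        self.return_numpy = return_numpy


class StrictWrapper:
    def __init__(self, env, return_numpy: bool = True):
        # No **kwargs: an unexpected keyword raises TypeError at call time
        self.env = env
        self.return_numpy = return_numpy


# The misspelled `return_nunpy` is accepted and the default silently stays in place
loose = LooseWrapper(env=None, return_nunpy=False)

try:
    StrictWrapper(env=None, return_nunpy=False)
    strict_error = None
except TypeError as err:
    strict_error = str(err)
```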
vmas/simulator/environment/gym.py
Outdated
@@ -17,18 +17,25 @@ class GymWrapper(gym.Env): | |||
def __init__( | |||
self, | |||
env: Environment, | |||
return_numpy: bool = False, |
return_numpy: bool = False, | |
return_numpy: bool = True, |
I think we can change this to true, I know it is slightly bc-breaking but it might be better aligned with gym, wdyt?
vmas/simulator/environment/gym.py
Outdated
): | ||
assert ( | ||
env.num_envs == 1 | ||
), f"GymEnv wrapper is not vectorised, got env.num_envs: {env.num_envs}" | ||
|
||
self._env = env | ||
assert not self._env.terminated_truncated, "GymWrapper is not only compatible with termination and truncation flags. Please set `terminated_truncated=False` in the VMAS environment." |
assert not self._env.terminated_truncated, "GymWrapper is not only compatible with termination and truncation flags. Please set `terminated_truncated=False` in the VMAS environment." | |
assert not self._env.terminated_truncated, "GymWrapper is not compatible with termination and truncation flags. Please set `terminated_truncated=False` in the VMAS environment." |
vmas/simulator/environment/gym.py
Outdated
|
||
def unwrapped(self) -> Environment: | ||
return self._env | ||
|
||
def _ensure_obs_type(self, obs): |
should we do it for info too? they are also tensors
return self._env | ||
|
||
def _ensure_obs_type(self, obs): | ||
return obs.detach().cpu().numpy() if self.return_numpy else obs |
use vmas util
def _action_list_to_tensor(self, list_in: List) -> List: | ||
assert ( | ||
len(list_in) == self._env.n_agents | ||
), f"Expecting actions for {self._env.n_agents} agents, got {len(list_in)} actions" | ||
actions = [] | ||
for agent in self._env.agents: | ||
actions.append( | ||
torch.zeros( | ||
1, | ||
self._env.get_agent_action_size(agent), | ||
device=self._env.device, | ||
dtype=torch.float32, | ||
) | ||
) | ||
|
||
for i in range(self._env.n_agents): | ||
act = torch.tensor(list_in[i], dtype=torch.float32, device=self._env.device) | ||
if len(act.shape) == 0: | ||
assert ( | ||
self._env.get_agent_action_size(self._env.agents[i]) == 1 | ||
), f"Action of agent {i} is supposed to be an scalar int" | ||
else: | ||
assert len(act.shape) == 1 and act.shape[ | ||
0 | ||
] == self._env.get_agent_action_size(self._env.agents[i]), ( | ||
f"Action of agent {i} hase wrong shape: " | ||
f"expected {self._env.get_agent_action_size(self._env.agents[i])}, got {act.shape[0]}" | ||
) | ||
actions[i][0] = act | ||
return actions |
For the shared functions, would it make sense to write them once?
- add base VMAS wrapper class for type conversion between tensors and np for singleton and vectorized envs - change default of gym wrapper to return np data - update interactive rendering to be compatible with non gym wrapper class (to preserve tensor types) - add error messages for gymnasium and rllib wrappers without installing first
@matteobettini Added a new base VMAS wrapper class from which the gym, gymnasium, and vectorized gymnasium wrappers inherit that implements a lot of shared functionality including type conversions before and after feeding data to the environment. Also made other smaller notifications as per your suggestions (gymnasium/ rllib import warnings, removing kwargs of wrappers and making wrapper kwargs optional to avoid mutable |
Really like this, I think we are almost done
vmas/simulator/environment/rllib.py
Outdated
from ray.rllib.utils.typing import EnvActionType, EnvInfoDict, EnvObsType | ||
else: | ||
raise ImportError( | ||
"RLLib is not installed. Please install it with `pip install ray[rllib]`." |
"RLLib is not installed. Please install it with `pip install ray[rllib]`." | |
"RLLib is not installed. Please install it with `pip install ray[rllib]<=2.2`." |
vmas/simulator/environment/base.py
Outdated
terminated[0].cpu().item() | ||
if not self.vectorized | ||
else self._ensure_tensor_type(terminated) | ||
) |
couldn't we have all of these be the same function?
def _convert_output(self, data, item: bool):
    if not self.vectorized:
        data = extract_nested_with_index(data, index=0)
        if item:
            return data.item()
    return self._ensure_tensor_type(data)
vmas/simulator/environment/base.py
Outdated
def env(self): | ||
return self._env | ||
|
||
def _ensure_tensor_type(self, tensor): |
Maybe rename to `_maybe_to_numpy`?
vmas/simulator/environment/base.py
Outdated
for agent in self._env.agents: | ||
actions.append( | ||
torch.zeros( | ||
self._env.num_envs, | ||
self._env.get_agent_action_size(agent), | ||
device=self._env.device, | ||
dtype=torch.float32, | ||
) | ||
) |
we do not need to do this in case of vectorized
vmas/simulator/environment/base.py
Outdated
1 | ||
] == self._env.get_agent_action_size(self._env.agents[i]), ( | ||
f"Action of agent {i} hase wrong shape: " | ||
f"expected {self._env.get_agent_action_size(self._env.agents[i])}, got {act.shape[0]}" |
f"expected {self._env.get_agent_action_size(self._env.agents[i])}, got {act.shape[0]}" | |
f"expected {self._env.get_agent_action_size(self._env.agents[i])}, got {act.shape[1]}" |
vmas/simulator/environment/base.py
Outdated
if len(act.shape) == 1: | ||
assert ( | ||
self._env.get_agent_action_size(self._env.agents[i]) == 1 | ||
), f"Action of agent {i} is supposed to be an vector of shape ({self.n_num_envs},)." |
), f"Action of agent {i} is supposed to be an vector of shape ({self.n_num_envs},)." | |
), f"Action of agent {i} is supposed to be an vector of shape ({self._env.num_envs},)." |
vmas/simulator/environment/base.py
Outdated
else: | ||
assert ( | ||
act.shape[0] == self._env.num_envs | ||
), f"Action of agent {i} is supposed to be a vector of shape ({self._num_envs}, ...)" |
), f"Action of agent {i} is supposed to be a vector of shape ({self._num_envs}, ...)" | |
), f"Action of agent {i} is supposed to be a vector of shape ({self._env.n_num_envs}, ...)" |
vmas/simulator/environment/base.py
Outdated
Since this is mostly the base of gym things, what do you think of creating a folder `gym` containing `base.py`, `gym.py`, `gymnasium.py` and `gymnasium_vec.py`, and renaming the base to `BaseGymWrapper`?
vmas/interactive_rendering.py
Outdated
for act, agent in zip(action_list, self.env.agents) | ||
] | ||
|
||
obs, rew, done, info = self.env.step(action_tensors) |
After the removal of the wrapper you need to add the vmas nested indexing util here for it to run i think
@matteobettini Updated the base VMAS wrapper class for gym-style wrappers with simplified shared functions, and added unit tests for all gym-style wrappers now :) |
Wow thanks so much for the tests. I never had a user add them without asking and they look amazing.
I left a few comments
from vmas.simulator.environment import Environment | ||
|
||
|
||
def scenario_names(): |
Isn't this function already defined somewhere else?
let's avoid redefining because it makes maintainability hard
assert o.shape == shape, f"Expected shape {shape}, got {o.shape}" | ||
|
||
|
||
@pytest.mark.parametrize("scenario", scenario_names()) |
these tests should not be too different between scenarios, maybe select a few representative scenarios and use only those to lighten the burden on github CI
vmas/interactive_rendering.py
Outdated
@@ -38,7 +40,7 @@ class InteractiveEnv: | |||
|
|||
def __init__( | |||
self, | |||
env: GymWrapper, | |||
env: Environment, |
it seems that this change is not needed now
vmas/make_env.py
Outdated
Wrapper | ||
] = None, # One of: None, vmas.Wrapper.RLLIB, and vmas.Wrapper.GYM | ||
Union[Wrapper, str] | ||
] = None, # One of: None, vmas.Wrapper.RLLIB, vmas.Wrapper.GYM, vmas.Wrapper.Gymnasium |
maybe add str examples
) | ||
|
||
for base_key, info in zip(base_keys, values): | ||
for k, v in info.items(): |
what is the reason behind this info compressing? This might be bc-breaking for the gymwrapper users
also, why not `compressed_info[base_key] = info`? Because info might be a nested dict and we would be having a / for the first level and dicts at all other levels.
if the goal is to add agent names to infos then the above suggestion should do it and we can avoid /
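The two layouts under discussion can be compared side by side. This is only a sketch: the function names and sample data are hypothetical, and the flattening shown is an assumption about what the reviewed code did based on the visible loop, not a copy of it.

```python
from typing import Any, Dict, List


def flatten_info(agent_names: List[str], infos: List[Dict[str, Any]]) -> Dict[str, Any]:
    # Joins only the first level with "/"; deeper dicts stay nested (a mixed layout)
    return {
        f"{name}/{key}": value
        for name, info in zip(agent_names, infos)
        for key, value in info.items()
    }


def nest_info(agent_names: List[str], infos: List[Dict[str, Any]]) -> Dict[str, Any]:
    # One entry per agent; each agent's info dict is kept exactly as it was
    return {name: info for name, info in zip(agent_names, infos)}


names = ["agent_0", "agent_1"]
infos = [
    {"pos_rew": 0.5, "extra": {"collisions": 1}},
    {"pos_rew": 0.1, "extra": {"collisions": 0}},
]
flat = flatten_info(names, infos)
nested = nest_info(names, infos)
```

In the nested form every level is an ordinary dict lookup, e.g. `nested["agent_0"]["extra"]["collisions"]`, whereas the flattened form mixes `"agent_0/extra"` keys with inner dicts.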
), f"Expecting actions for {self._env.n_agents} agents, got {len(list_in)} actions" | ||
|
||
return [ | ||
torch.tensor(act, device=self._env.device, dtype=torch.float32).reshape( |
are actions only floats?
elif self is self.GYM: | ||
from vmas.simulator.environment.gym import GymWrapper | ||
from vmas.simulator.environment.gym.gym import GymWrapper |
we need to avoid changing the import structure for bc-compatibility
what we can do to make sure `from vmas.simulator.environment.gym import GymWrapper` still runs is:
add `__init__.py` to the `gym` folder containing:
from .gym import GymWrapper
from .gymnasium import GymnasiumWrapper
...
that way `from vmas.simulator.environment.gym import GymWrapper` still works
and also `from vmas.simulator.environment.gym import GymnasiumWrapper` will work
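The re-export trick can be tried out on a throwaway package. The sketch below builds a hypothetical `demo_env/gym/` folder in a temporary directory (the package name and the empty classes are placeholders, not VMAS code) and shows that both the old, shorter import path and the new file location resolve to the same class.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package `demo_env/gym/` (hypothetical names) on disk
root = Path(tempfile.mkdtemp())
pkg = root / "demo_env" / "gym"
pkg.mkdir(parents=True)
(root / "demo_env" / "__init__.py").write_text("")
(pkg / "gym.py").write_text("class GymWrapper:\n    pass\n")
(pkg / "gymnasium.py").write_text("class GymnasiumWrapper:\n    pass\n")
# Re-export from the package so the old import path keeps working
(pkg / "__init__.py").write_text(
    "from .gym import GymWrapper\n"
    "from .gymnasium import GymnasiumWrapper\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_env.gym import GymWrapper, GymnasiumWrapper  # old-style path
from demo_env.gym.gym import GymWrapper as DeepGymWrapper  # new file location
```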
README.md
Outdated
max_steps=None, # Defines the horizon. None is infinite horizon. | ||
seed=None, # Seed of the environment | ||
dict_spaces=False, # By default tuple spaces are used with each element in the tuple being an agent. | ||
# If dict_spaces=True, the spaces will become Dict with each key being the agent's name | ||
grad_enabled=False, # If grad_enabled the simulator is differentiable and gradients can flow from output to input | ||
terminated_truncated=False, # If terminated_truncated the simulator will return separate `terminated` and `truncated` flags in the `done()` and `step()` functions instead of a single `done` flag |
we might want to update the docs on running with this info as well
Also if you could pls follow this for pre-commit chores https://github.com/proroklab/VectorizedMultiAgentSimulator/blob/main/CONTRIBUTING.md
for the tests we need to add gymnasium and shimmy to |
- update github dependency installation - unify get scenario test function and limit wrapper tests to fewer scenarios - allow import of all gym wrappers from `vmas.simulator.environment.gym` - consider env continuous_actions for action type conversion in wrappers - compress info to single nested info if needed rather than combining keys
I followed the pre-commit chore besides updating the sphinx documentation and updated according to your comments. I want to wrap this up soonish since I've already spent more time on it than I originally intended. For the documentation, I am not familiar with sphinx. Should I just modify the |
Ok all good! I'll take it on from here and do a few commits to doc as well as solve an import problem |
Sounds great, thanks for taking it over. And let me know if there is a bigger issue that I introduced and you'd want help with! |
I think we should be good to go! Last thing we need to sort out is that I added an mpe task to the tests and they are currently failing. Any idea of the cause? |
Just had a look and found the issue. When checking shapes in the case of dict spaces, I compared them in order of a list but the dict -> list conversion I did does not guarantee that those spaces are then in the right order so sometimes it would fail. I'm just writing a solution and will push in a bit |
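The failure mode described here, comparing per-agent shapes position by position after a dict-to-list conversion, can be reproduced in a few lines. This is an illustration of the general pitfall with hypothetical helper names and data; the actual fix in the commit is not shown in this thread.

```python
from typing import Dict, Tuple

Shape = Tuple[int, ...]


def shapes_match_by_position(obs: Dict[str, Shape], spaces: Dict[str, Shape]) -> bool:
    # Fragile: relies on both dicts listing agents in the same order
    return list(obs.values()) == list(spaces.values())


def shapes_match_by_key(obs: Dict[str, Shape], spaces: Dict[str, Shape]) -> bool:
    # Robust: looks up each agent's expected shape by name
    return obs.keys() == spaces.keys() and all(obs[k] == spaces[k] for k in obs)


spaces = {"agent_0": (4,), "agent_1": (6,)}
obs = {"agent_1": (6,), "agent_0": (4,)}  # same content, different insertion order

by_position = shapes_match_by_position(obs, spaces)
by_key = shapes_match_by_key(obs, spaces)
```

Matching by key makes the check independent of insertion order, which is the property the positional comparison silently depended on.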
This commit seems to resolve the issue on my side. Please have a look but I think this should fix it! |
Merged! Thanks a mil Lukas! I think that this will make a LOT of users happy. I owe you a beer when you come to Cambridge :) |
I'm glad if it will turn out useful! I'll write a simple wrapper to integrate VMAS with its new gymnasium wrapper into EPyMARL, probably next week. And I'll take you up on that offer once I'm properly moved! |
Oh very cool! I'll make a release in the meantime then. Do you think we will be able to use the vector env in EPyMARL and keep the data on the torch device? |
As I see it right now, it might be tricky to use that unfortunately since the parallel rollouts in EPyMARL use multithreading with a different interface than standard vectorised environments (see EPyMARL's parallel runners for more info). To be able to use the vectorised environment of VMAS as an alternative to this parallelisation/ vectorisation, I'd need to fundamentally change data collection in EPyMARL but I don't think that's a change I'd like to do as of right now. While this will cost some performance, I think the loss won't be too bad when using CPUs anyway but might be more noticeable when using a GPU since we won't be able to keep all data constantly on the GPU (it will be moved back and forth). To get the most out of the latter case though, a JAX framework might be better suited in the first place judging by the current landscape but that would be a different discussion altogether. |
@matteobettini Just wanted to ping you that I added a VMAS wrapper to EPyMARL now! See the docs here that integrates all VMAS tasks with the I tried only MAPPO training in balance and transport and with just 4 CPU cores (no GPUs), it took 6 1/2h to train 10M timesteps in transport and 10h in balance which seems reasonable, even though I'm sure it could be notably faster when using and keeping all on the GPU. |
Amazing! This will be so helpful as now users have the epymarl/benchmarl/rllib triad to triple check their results when in doubt. I love this so much. PS If one day you will want to add a |
Thanks for the offer! I'll ping you should I spend some time on this, even though I have to admit that it's unlikely I'll work on this (in the near future) given I start a new job soon. |
The current VMAS implementation supports the OpenAI Gym interface but not the new and still maintained Gymnasium interface. This was already raised in issue #61.
This PR adds both a wrapper that implements the Gymnasium interface for VMAS, and a native Gymnasium interface for the VMAS environment via the `legacy_gym=False` argument. By default, the previous Gym interface is maintained for backwards compatibility.
Small quality of life function to allow the `make_env` function to receive the wrapper name (`Gymnasium`, `Gym`, `RLLib`) as a string argument instead of a wrapper object only.
I have tested the interactive environment interface and ensured that (by default with `legacy_gym=True`) VMAS training of BenchMARL still runs as documented.
I'm happy to do any further changes as requested to make sure all works fine so let me know if you have any feedback!
fixes #61
bc-breaking changes: `env.unwrapped()` -> `env.unwrapped` in gym wrapper