Stable multi-agent & multi-world MAPPO on example scene with road lines #89
+471
−204
This PR adds support for the 3 types of collision behavior in the gym environment (`base_environment.py`).

Tested: 100% success rate in the following settings:
- `example_scenario.json` with road lines (using the `"remove"` option for collision behavior).

[…] `selfObservationTensor`, `agentMapObservationsTensor`, and `partnerObservationsTensor`

Gym environment logic
- […] `kMaxAgentCount` controlled agents are padded with `nan` values.
- […] `kMaxAgentCount` to five (in `consts.hpp`) and `max_cont_agents` to three: […] `(num_worlds, kMaxAgentCount)`, where we control at most `max_cont_agents` per environment.

Using the current base environment, the done tensor may look like:

[…]

The above tensor is interpreted as follows:
- […] (`1`)
- […] (`0`)
- […] (`nan`)

Note
[…] `controlledStateTensor`, it is possible that I'm still controlling a few invalid agents. This will be tested later.
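
As a rough illustration of the padding scheme described above, here is a minimal sketch that reads a done tensor of shape `(num_worlds, kMaxAgentCount)`. It is not taken from `base_environment.py`. The values in `done` are made up, plain Python lists stand in for the real tensor, and the reading of the three values (`1` = controlled agent is done, `0` = controlled agent is still running, `nan` = padded slot with no controlled agent) is an assumption. The helper `summarize_world` is hypothetical.

```python
import math

K_MAX_AGENT_COUNT = 5   # assumed to mirror kMaxAgentCount in consts.hpp
MAX_CONT_AGENTS = 3     # assumed value of max_cont_agents
NAN = float("nan")

# Hypothetical done tensor for num_worlds = 2, shape (num_worlds, kMaxAgentCount).
# Assumed meaning: 1.0 = done, 0.0 = not done, nan = padding (slot not controlled).
done = [
    [1.0, 0.0, 0.0, NAN, NAN],
    [1.0, NAN, 1.0, NAN, 1.0],
]


def summarize_world(row):
    """Count controlled and finished agents in one world, skipping nan padding."""
    controlled = [v for v in row if not math.isnan(v)]
    n_done = sum(1 for v in controlled if v == 1.0)
    return {
        "controlled": len(controlled),
        "done": n_done,
        # A world counts as finished only when every controlled agent is done.
        "all_done": len(controlled) > 0 and n_done == len(controlled),
    }


summaries = [summarize_world(row) for row in done]
for world_idx, summary in enumerate(summaries):
    print(world_idx, summary)
```

Filtering out the `nan` entries before aggregating keeps padded slots from being counted as either done or not done, which matters when computing per-world termination or masking rewards.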