Stable multi-agent & multi-world MAPPO on example scene with road lines #89

daphne-cornelisse · 2024-04-19T21:08:43Z

This PR adds support for the 3 types of collision behavior in the gym environment (base_environment.py).

Tested: 100% success rate in the following settings

multi-agent and multi-world in example_scenario.json with road lines (using the "remove" option for collision behavior).
multi-agent and multi-world in $W = 2$ worlds with different maps. The maps contain different numbers of agents.
used selfObservationTensor, agentMapObservationsTensor, and partnerObservationsTensor

Gym environment logic

We use "nan" values to indicate invalid agents in environments.
Since the number of valid agents varies per scenario (map), scenarios with fewer than kMaxAgentCount controlled agents are padded with nan values.
Example: Suppose we have two scenes, one with two and the other with three valid agents. We set kMaxAgentCount to five (in consts.hpp) and max_cont_agents to three:

env = Env(
    config=config,
    num_worlds=2,
    max_cont_agents=3,
    data_dir="waymo_data",
    device="cuda",
)
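The nan padding described above can be sketched roughly as follows. This is only an illustration, not the code in base_environment.py: the helper `pad_with_nan` and the constant `K_MAX_AGENT_COUNT` are hypothetical names, and the sketch assumes valid agents occupy the leading slots of each row, which the real environment does not guarantee.

```python
import torch

# Mirrors kMaxAgentCount from consts.hpp; the value 5 is taken from the example.
K_MAX_AGENT_COUNT = 5


def pad_with_nan(per_world_values, max_agents=K_MAX_AGENT_COUNT):
    """Stack variable-length per-world lists into a (num_worlds, max_agents) tensor."""
    padded = torch.full((len(per_world_values), max_agents), float("nan"))
    for world_idx, values in enumerate(per_world_values):
        # Assumption for this sketch: valid agents fill the leading slots.
        padded[world_idx, : len(values)] = torch.tensor(values, dtype=torch.float32)
    return padded


# One world with two valid agents, one with three.
done = pad_with_nan([[0.0, 0.0], [0.0, 1.0, 0.0]])
print(done.shape)
```

Every world ends up with the same row length, so tensors from scenes with different agent counts can be batched together.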

When we step the environment, we get tensors of shape (num_worlds, kMaxAgentCount), where we control at most max_cont_agents per environment. Using the current base environment, the done tensor may look like:

import torch

nan = float("nan")  # marks invalid agents
done = torch.tensor([
    [0, 1, nan, nan, 0],
    [0, 0, nan, 0, nan],
])

The above tensor is interpreted as follows:

We have one valid agent that is done (as marked by the value 1)
We have five valid agents that are not done (marked by 0)
We have four invalid agents (as marked by the value nan)
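The counts above can be recovered with boolean masks. The snippet below is a minimal sketch using standard PyTorch calls; the variable names (`valid_mask`, `done_mask`, `active_mask`) are illustrative and not taken from the environment code.

```python
import torch

nan = float("nan")
done = torch.tensor([
    [0, 1, nan, nan, 0],
    [0, 0, nan, 0, nan],
])

# nan marks invalid agents, so everything that is not nan is valid.
valid_mask = ~torch.isnan(done)
# Comparisons with nan evaluate to False, so invalid slots drop out here.
done_mask = valid_mask & (done == 1)
active_mask = valid_mask & (done == 0)

num_done = int(done_mask.sum())
num_active = int(active_mask.sum())
num_invalid = int((~valid_mask).sum())
print(num_done, num_active, num_invalid)  # 1 5 4
```

Masks like `valid_mask` can then be used to index rewards or observations, so that padded slots never enter a loss or a success-rate computation.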

Note

Due to garbage values in controlledStateTensor, it is possible that I'm still controlling a few invalid agents. This will be tested later.

…r of agents.

Add collision behavior.

7ed1ada

daphne-cornelisse requested review from eugenevinitsky, aaravpandya and SamanKazemkhani April 19, 2024 21:08

daphne-cornelisse added 8 commits April 20, 2024 16:33

Add video method to callback.

2eb7c12

Restructure

9a14e93

Make SB3 wrapper compatible with rendering options.

f501fdc

Environment updates.

8f05825

Environment updates.

1cb53c3

Stable MAPPO with all observation types.

cb630b5

WIP: Stable MAPPO when max_cont_agents are the first N agents

74bcb99

Updated base_env and wrapper to support worlds with a different number of agents.

40fe7bd

eugenevinitsky approved these changes Apr 24, 2024

SamanKazemkhani approved these changes Apr 25, 2024

Merge branch 'main' into dc/end_to_end

f45af0b

daphne-cornelisse merged commit d168a18 into main Apr 25, 2024
1 check passed

daphne-cornelisse deleted the dc/end_to_end branch August 28, 2024 22:43
