GPUDrive

GPUDrive is a GPU-accelerated, multi-agent driving simulator that runs at 1 million FPS. The simulator is written in C++, built on top of the Madrona Game Engine. We provide Python bindings and gymnasium wrappers in torch and jax, allowing you to interface with the simulator in Python using your preferred framework.

For more details, see our paper 📜 and the 👉 introduction tutorials, which guide you through the basic usage.

...

Agents in GPUDrive can be controlled by any user-specified actor.

⚙️ Integrations

| What | References | README | End-to-end training throughput (agent steps per second) |
|---|---|---|---|
| IPPO implementation Stable Baselines | IPPO | Use | 25 - 50K |
| IPPO implementation PufferLib 🐑 | IPPO, PufferLib | Use, Implementation | 200 - 500K |

🛠️ Installation

To build GPUDrive, ensure you have all the dependencies listed here. Briefly, you'll need

  1. CMake >= 3.24
  2. Python >= 3.11
  3. CUDA Toolkit >= 12.2 and <= 12.4 (CUDA versions 12.5+ are currently not supported. Please check the output of nvcc --version to make sure you are using a supported CUDA version.)
  4. On macOS and Windows, you need to install all the dependencies of Xcode and the Visual Studio C++ tools, respectively.
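
The CUDA version check can also be scripted. The sketch below is not part of GPUDrive; the helper names are hypothetical. It parses the "release X.Y" string that nvcc --version prints and compares it against the supported range stated in the list above.

```python
import re
import subprocess

# Supported range taken from the requirements list above.
MIN_CUDA = (12, 2)
MAX_CUDA = (12, 4)


def parse_cuda_version(nvcc_output: str) -> tuple[int, int]:
    """Extract (major, minor) from text such as '... release 12.4, V12.4.131'."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("could not find a CUDA release number in the nvcc output")
    return int(match.group(1)), int(match.group(2))


def is_supported(version: tuple[int, int]) -> bool:
    """True if the version lies in the inclusive range [12.2, 12.4]."""
    return MIN_CUDA <= version <= MAX_CUDA


def check_local_cuda() -> bool:
    """Run `nvcc --version` and report whether the installed toolkit is in range."""
    output = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True, check=True
    ).stdout
    return is_supported(parse_cuda_version(output))
```

Calling check_local_cuda() requires nvcc to be on your PATH; the two pure functions can be used on any captured output.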

Once you have the required dependencies, clone the repository (don't forget --recursive!):

git clone --recursive https://github.com/Emerge-Lab/gpudrive.git
cd gpudrive

Optional: The following is only needed if you want to use the Madrona viewer in C++ (it is not needed to render with pygame).

Extra dependencies to use Madrona viewer

To build the simulator with visualization support on Linux (build/viewer), you will need to install X11 and OpenGL development libraries. Equivalent dependencies are already installed by Xcode on macOS. For example, on Ubuntu:

  sudo apt install libx11-dev libxrandr-dev libxinerama-dev libxcursor-dev libxi-dev mesa-common-dev libc++1

Then, you can choose between 3 options for building the simulator:


Option 1️⃣: Manual install

For Linux and macOS, use the following commands:

mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j 32 # number of cores to build with; adjust to your machine
cd ..

For Windows, open the cloned repository in Visual Studio and build the project using the integrated cmake functionality.

Next, set up the Python components of the repository with pip:

pip install -e . # Add -Cpackages.madrona_escape_room.ext-out-dir=PATH_TO_YOUR_BUILD_DIR on Windows


Option 2️⃣: Poetry install

First create a conda environment using environment.yml:

conda env create -f environment.yml

Activate the environment:

conda activate gpudrive

Run:

poetry install


Option 3️⃣: Docker (GPU Only)

Nvidia docker dependency

To run the Docker image with GPU support, ensure that you have the NVIDIA Container Toolkit installed. Detailed installation instructions can be found here - https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html.

Pull the image and run the container

To pull our pre-built Docker image and begin using GPUDrive, execute the following command (you may need to prepend sudo, depending on your Docker setup):

  docker pull ghcr.io/emerge-lab/gpudrive:latest

After pulling the image, you can create and run a new container using the --gpus all flag (the CPU version currently does not work in Docker; this is to be fixed soon). The following command creates a new container named gpudrive_container:

  docker run --gpus all -it --name gpudrive_container ghcr.io/emerge-lab/gpudrive:latest

If you created the container but exited it, you can restart and re-enter the same container with:

docker start gpudrive_container # make sure the container is started
docker exec -it gpudrive_container /bin/bash

Once inside the container, the prompt will look like this:

(gpudrive) root@8caf35a97e4f:/gpudrive#

The Docker image includes all necessary dependencies, along with Conda and Poetry. However, a compilation step is still required. Once inside the container, run:

 poetry install

Build the image from scratch

If you want to build the image from scratch, ensure that Docker is installed with the Buildx plugin (though classic builds will still work, they are soon to be deprecated). In the GPUDrive repository, run:

docker buildx build -t gpudrive .

The subsequent steps to run and manage the container remain the same as outlined above.


Test whether the installation was successful by importing the simulator:

import gpudrive

To avoid recompiling in GPU mode every time, you can set the following environment variable to any custom path. For example, you can store the compiled program in a cache called gpudrive_cache:

export MADRONA_MWGPU_KERNEL_CACHE=./gpudrive_cache

Please remember that if you make any changes in C++, you need to delete the cache and recompile.
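
If you prefer to manage the cache from Python, a minimal sketch could look like the following. The helper names are hypothetical and not part of GPUDrive. It assumes the variable is set before gpudrive is imported and that the cache may be either a single file or a directory.

```python
import os
import shutil
from pathlib import Path

CACHE_ENV_VAR = "MADRONA_MWGPU_KERNEL_CACHE"


def set_kernel_cache(path: str) -> Path:
    """Point the simulator at a kernel cache location (call before importing gpudrive)."""
    cache = Path(path).resolve()  # absolute path, so it survives a change of working directory
    os.environ[CACHE_ENV_VAR] = str(cache)
    return cache


def clear_kernel_cache() -> bool:
    """Delete the cache so the next GPU run recompiles. Returns True if something was removed."""
    value = os.environ.get(CACHE_ENV_VAR)
    if not value:
        return False
    cache = Path(value)
    if cache.is_dir():
        shutil.rmtree(cache)
    elif cache.exists():
        cache.unlink()
    else:
        return False
    return True
```

For example, call set_kernel_cache("./gpudrive_cache") at the top of a training script, and clear_kernel_cache() after changing any C++ code.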

🚀 Getting started

To get started, see these entry points:

  • Our intro tutorials. These tutorials take approximately 30-60 minutes to complete and will guide you through the dataset, simulator, and how to populate the simulator with different types of actors.
  • The environment docs provide detailed info on environment settings and supported features.

📈 Tests

To further test the setup, you can run the pytests in the root directory:

pytest

To test whether the simulator itself compiled correctly (even if the Python library did not), try running the headless program from the build directory.

cd build
./headless CPU 1 # Run on CPU, 1 step

πŸ‹πŸΌβ€β™€οΈ Pre-trained policy

We are open-sourcing a policy trained on 1,000 randomly sampled scenarios. You can download the pre-trained policy here. You can store the policy in models.

📂 Dataset

Download the dataset

  • Two versions of the dataset are available: a mini version with 1,000 training files and 300 test/validation files, and a large dataset with 100k unique scenes.
  • Replace 'GPUDrive_mini' with 'GPUDrive' below if you wish to download the full dataset.
  • To download the dataset you need the huggingface_hub library (if you initialized from environment.yml then you can skip this step):
pip install huggingface_hub

Then you can download the dataset using Python or the huggingface-cli.

  • Option 1: Using Python
>>> from huggingface_hub import snapshot_download
>>> snapshot_download(repo_id="EMERGE-lab/GPUDrive_mini", repo_type="dataset", local_dir="data/processed")
  • Option 2: Use the huggingface-cli
  1. Log in to your Hugging Face account:
huggingface-cli login
  2. Download the dataset:
huggingface-cli download EMERGE-lab/GPUDrive_mini --local-dir data/processed --repo-type "dataset"
  • Option 3: Manual Download
  1. Visit https://huggingface.co/datasets/EMERGE-lab/GPUDrive_mini
  2. Navigate to the Files and versions tab.
  3. Download the desired files/directories.

NOTE: If you downloaded the full-sized dataset, it is grouped into subdirectories of 10k files each (due to Hugging Face constraints). For the path to work with GPUDrive, you need to run

python data_utils/extract_groups.py #use --help if you've used a custom download path
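
To sanity-check a download, or the folder layout after running the script above, one option is to count the files in each folder. The helper below is a hypothetical sketch, not part of GPUDrive. It makes no assumption about file names or formats.

```python
from collections import Counter
from pathlib import Path


def count_files_by_folder(root: str) -> Counter:
    """Count regular files under `root`, keyed by their parent folder relative to `root`."""
    root_path = Path(root)
    counts = Counter()
    for path in root_path.rglob("*"):
        if path.is_file():
            # Files directly inside `root` are counted under the key ".".
            counts[str(path.parent.relative_to(root_path))] += 1
    return counts
```

For instance, count_files_by_folder("data/processed") returns a mapping from folder name to file count, assuming the default download path used above.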

Re-build the dataset

If you wish to manually generate the dataset, GPUDrive is compatible with the complete Waymo Open Motion Dataset, which contains well over 100,000 scenarios. To download new files and create scenarios for the simulator, follow the steps below.

Re-build the dataset in 3 steps
  1. First, head to https://waymo.com/open/ and click on the "download" button at the top. After registering, click on the files from v1.2.1 March 2024, the newest version of the dataset at the time of writing (10/2024). This will lead you to a Google Cloud page. From here, you should see a folder structure like this:
waymo_open_dataset_motion_v_1_2_1/
│
├── uncompressed/
│   ├── lidar_and_camera/
│   ├── scenario/
│   │   ├── testing_interactive/
│   │   ├── testing/
│   │   ├── training_20s/
│   │   ├── training/
│   │   ├── validation_interactive/
│   │   └── validation/
│   └── tf_example/
  2. Now, download files from testing, training and/or validation in the scenario folder. An easy way to do this is through gsutil. First authenticate using:
gcloud auth login

...then run the command below to download the dataset you prefer. For example, to download the validation dataset:

gsutil -m cp -r gs://waymo_open_dataset_motion_v_1_2_1/uncompressed/scenario/validation/ data/raw

where data/raw is your local storage folder. Note that this can take a while, depending on the size of the dataset you're downloading.

  3. Finally, convert the raw data to a format that is compatible with the simulator using:
python data_utils/process_waymo_files.py '<raw-data-path>' '<storage-path>' '<dataset>'

Note: Due to an open issue, installation of waymo-open-dataset-tf-2-12-0 fails for Python 3.11. To use the script, in a separate Python 3.10 environment, run

pip install waymo-open-dataset-tf-2-12-0 trimesh[easy] python-fcl

Then for example, if you want to process the validation data, run:

python data_utils/process_waymo_files.py 'data/raw/' 'data/processed/' 'validation'
>>>
Processing Waymo files: 100%|████████████████████████████████████████████████████████████████| 150/150 [00:05<00:00, 28.18it/s]
INFO:root:Done!

and that's it!

🧐 Caveat: A single Waymo tfrecord file contains approximately 500 traffic scenarios. Processing speed is about 250 scenes/min on a 16-core CPU. Processing the entire validation set (150 tfrecords), for example, takes a LOT of time.
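
As a rough back-of-the-envelope check, the figures in the caveat can be turned into a time estimate. This sketch simply multiplies the approximate numbers quoted above; actual throughput will vary with hardware and scenario size.

```python
SCENARIOS_PER_TFRECORD = 500  # approximate, from the caveat above
SCENES_PER_MINUTE = 250       # approximate, on a 16-core CPU


def estimated_hours(num_tfrecords: int) -> float:
    """Rough wall-clock estimate in hours for processing `num_tfrecords` files."""
    total_scenes = num_tfrecords * SCENARIOS_PER_TFRECORD
    return total_scenes / SCENES_PER_MINUTE / 60


print(estimated_hours(150))  # validation set, 150 tfrecords: prints 5.0
```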

📜 Citations

If you use GPUDrive in your work, please cite us:

@misc{kazemkhani2024gpudrivedatadrivenmultiagentdriving,
      title={GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS},
      author={Saman Kazemkhani and Aarav Pandya and Daphne Cornelisse and Brennan Shacklett and Eugene Vinitsky},
      year={2024},
      eprint={2408.01584},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2408.01584},
}

Contributing and learning benchmark

If you find a bug or are missing features, please feel free to create an issue or start contributing! That link also points to a learning benchmark complete with training logs and videos of agent behaviors via wandb.
