Upgrade to CUDA 12 #3943

Closed
6 tasks done
xmfcx opened this issue Oct 30, 2023 · 10 comments · Fixed by #3956
@xmfcx
Contributor

xmfcx commented Oct 30, 2023

Checklist

  • I've read the contribution guidelines.
  • I've searched other issues and no duplicate issues were found.
  • I've agreed with the maintainers that I can plan this task.

Description

In Autoware, we are currently using a CUDA version from Ubuntu 20.04 with a workaround.

Related PR:

Related Links:

We haven't updated the version for a while because NVIDIA didn't release TensorRT for ARM until September 2023.

Now that it is released, we can test our CUDA-dependent packages and finally update.

Purpose

To use the latest recommended NVIDIA packages.

Possible approaches

We can run each CUDA-dependent package on both x86_64 and ARM64. If they all work, we can update the Ansible script here to install the latest CUDA, TensorRT and cuDNN:

Definition of done

  • CUDA-dependent packages are listed
    • cuda_utils
    • image_projection_based_fusion
    • lidar_apollo_instance_segmentation
    • lidar_centerpoint
    • tensorrt_classifier
    • tensorrt_common
    • tensorrt_yolo
    • tensorrt_yolox
    • traffic_light_classifier
    • traffic_light_fine_detector
    • traffic_light_ssd_fine_detector
    • trtexec_vendor
  • CUDA-dependent packages are tested
  • Ansible scripts are updated (build: update to CUDA 12.3 #3956)
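
As a rough illustration of the "packages are tested" step, the listed packages could be built together in one pass. This is only a sketch: the variable and function names are hypothetical, it assumes a colcon workspace that already contains Autoware, and it prints the command instead of running it.

```shell
# Sketch only: CUDA_PACKAGES and cuda_build_command are hypothetical names,
# not part of Autoware's tooling. The package names come from the checklist.
CUDA_PACKAGES="cuda_utils
image_projection_based_fusion
lidar_apollo_instance_segmentation
lidar_centerpoint
tensorrt_classifier
tensorrt_common
tensorrt_yolo
tensorrt_yolox
traffic_light_classifier
traffic_light_fine_detector
traffic_light_ssd_fine_detector
trtexec_vendor"

# Print (do not run) a colcon command that builds the listed packages and
# their dependencies. The unquoted expansion is intentional: word splitting
# turns the newline-separated list into space-separated arguments.
cuda_build_command() {
  echo colcon build --packages-up-to $CUDA_PACKAGES
}

cuda_build_command
```

Printing the command keeps the sketch safe to try on a machine without a ROS 2 workspace; piping the output to `sh` inside a sourced workspace would run the actual build.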
@xmfcx
Contributor Author

xmfcx commented Oct 30, 2023

I followed:

cd /tmp

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb

sudo dpkg -i cuda-keyring_1.1-1_all.deb

sudo apt update

sudo apt install cuda tensorrt-dev libcudnn8-dev

to install the latest CUDA on my machine, and I was able to compile Autoware without errors.
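
The keyring URL above is specific to x86_64. A minimal sketch of choosing the repository directory by architecture might look like the following. The function names are hypothetical, and the `sbsa` directory name for arm64 servers is an assumption that should be checked against NVIDIA's repository listing before use.

```shell
# Sketch only: cuda_repo_arch and cuda_keyring_url are hypothetical helpers.

# Map `uname -m` output to a repository directory name.
cuda_repo_arch() {
  case "$1" in
    x86_64)  echo "x86_64" ;;
    aarch64) echo "sbsa" ;;  # assumption: directory name used for arm64 servers
    *)       return 1 ;;     # unknown architecture
  esac
}

# Build the keyring URL for Ubuntu 22.04, reusing the file name from the
# x86_64 command in this comment.
cuda_keyring_url() {
  arch_dir="$(cuda_repo_arch "$1")" || return 1
  echo "https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/${arch_dir}/cuda-keyring_1.1-1_all.deb"
}

# Print the URL for the current machine, or a message on stderr if unsupported.
cuda_keyring_url "$(uname -m)" || echo "unsupported architecture: $(uname -m)" >&2
```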

I will test some of the packages (starting from the ones required for AWSIM testing) and report my findings.

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Sep__8_19:17:24_PDT_2023
Cuda compilation tools, release 12.3, V12.3.52
Build cuda_12.3.r12.3/compiler.33281558_0

Some relevant package versions, from dpkg -l | grep cuDNN and dpkg -l | grep TensorRT:

libcudnn8                                         8.9.5.29-1+cuda12.2                     amd64
libcudnn8-dev                                     8.9.5.29-1+cuda12.2                     amd64
libnvinfer-dev                                    8.6.1.6-1+cuda12.0                      amd64
libnvinfer8                                       8.6.1.6-1+cuda12.0                      amd64
tensorrt-dev                                      8.6.1.6-1+cuda12.0                      amd64
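
The version strings above can be compared mechanically. The sketch below pulls the toolkit release out of `nvcc -V` output and the `+cudaX.Y` suffix out of a Debian package version, then compares major versions. The helper names are hypothetical, and comparing only the major version is an assumption about what matters here, not a real compatibility check.

```shell
# Sketch only: cuda_release, pkg_cuda_version and same_major are hypothetical.

# Read `nvcc -V` output on stdin and print the toolkit release, e.g. "12.3".
cuda_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*/\1/p'
}

# Read a package version such as "8.9.5.29-1+cuda12.2" on stdin and print
# the CUDA version in its suffix, e.g. "12.2".
pkg_cuda_version() {
  sed -n 's/.*+cuda\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed when two "major.minor" versions share the same major number.
same_major() {
  [ "${1%%.*}" = "${2%%.*}" ]
}

# Demo using the strings reported in this comment.
toolkit="$(echo 'Cuda compilation tools, release 12.3, V12.3.52' | cuda_release)"
cudnn="$(echo '8.9.5.29-1+cuda12.2' | pkg_cuda_version)"
tensorrt="$(echo '8.6.1.6-1+cuda12.0' | pkg_cuda_version)"
echo "toolkit=$toolkit cudnn=$cudnn tensorrt=$tensorrt"
same_major "$toolkit" "$cudnn" && same_major "$toolkit" "$tensorrt" \
  && echo "all packages target the same CUDA major version"
```

On a live system the same helpers could be fed from `nvcc -V | cuda_release` and `dpkg-query -W -f='${Version}\n' libcudnn8 | pkg_cuda_version`.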

@xmfcx
Contributor Author

xmfcx commented Oct 30, 2023

Screenshot from 2023-10-30 19-52-32

It works!

centerpoint and traffic_light_classifier work.

@xmfcx
Contributor Author

xmfcx commented Oct 30, 2023

@mitsudome-r do you think we should also test this on ARM, or can we start upgrading the Ansible scripts right away?

@xmfcx xmfcx added the component:perception Advanced sensor data processing and environment understanding. label Oct 30, 2023
@xmfcx
Contributor Author

xmfcx commented Oct 31, 2023

@oguzkaganozt have you tested Autoware on the AVA developer kit?
If so, could you test this upgrade on it too?
It's not urgent; we can use CI to check that it builds and just update the Ansible scripts too.

@esteve
Contributor

esteve commented Oct 31, 2023

@xmfcx I'm targeting CUDA 12 for the Debian packages. The process for installing the previous version was a bit clunky, and hopefully by the time Autoware works with CUDA 12, the old steps will be obsolete.

@oguzkaganozt
Contributor

@oguzkaganozt have you tested Autoware on the AVA developer kit? If so, could you test this upgrade on it too? It's not urgent; we can use CI to check that it builds and just update the Ansible scripts too.

I have tested Autoware on AVA and it works. I think this update should work on it too, but give me some time to validate.

@esteve esteve mentioned this issue Nov 2, 2023
7 tasks
@esteve
Contributor

esteve commented Nov 2, 2023

@xmfcx I've updated the Ansible scripts for CUDA in #3956, but I don't have a GPU powerful enough to test it.

@esteve
Contributor

esteve commented Nov 6, 2023

I've updated the ticket with the list of packages that depend on CUDA. I'd appreciate it if someone could test #3956 with a GPU on x86_64 and ARM, thanks.

@miursh miursh moved this from Todo to In Progress in Sensing & Perception Working Group Nov 7, 2023
@oguzkaganozt
Contributor

I have tested on both arm64 (AADP-AVA) and amd64 platforms using the rosbag replay simulator with a GeForce RTX 3070. It is working fine; I didn't notice any difference from older versions of CUDA and TensorRT.


@xmfcx
Contributor Author

xmfcx commented Jul 2, 2024

CUDA 12.2 downgrade discussion:
