Document how to enable GPU and PAYG for CRN operator (#139)
* Document how to enable GPU and PAYG for CRN operator

* List supported chain in faq
olethanh authored Feb 19, 2025
1 parent 945ce6a commit cacef0a
Showing 5 changed files with 152 additions and 3 deletions.
2 changes: 1 addition & 1 deletion docs/faq.md
@@ -104,7 +104,7 @@ Twentysix Cloud is a cross-chain cloud solution that provides scalable, high-per
Purchase ALEPH v2 on platforms such as Coinbase, KuCoin, Gate, LATOKEN, MEXC, Uniswap (Ethereum), Pancakeswap (BNB Chain), TraderJoe (Avalanche), and Raydium (Solana).

### What is Pay-As-You-Go?
Pay-As-You-Go allows you to pay for resources as you use them on the Aleph.im network, eliminating the need to hold or stake large amounts of $ALEPH. This is currently available only on the Avalanche c-chain.
Pay-As-You-Go allows you to pay for resources as you use them on the Aleph.im network, eliminating the need to hold or stake large amounts of $ALEPH. This is currently available on BASE and Avalanche C-Chain. More chains will be supported in the future; refer to the [supported chains](protocol/chains.md) table.

## III. Node Operators

2 changes: 1 addition & 1 deletion docs/nodes/compute/advanced/enable-confidential.md
@@ -1,4 +1,4 @@
# Confidential computing
# Enable Confidential computing

This guide outlines how to enable [Confidential Computing](../../../computing/confidential/index.md) on a CRN.

125 changes: 125 additions & 0 deletions docs/nodes/compute/advanced/enable-gpu.md
@@ -0,0 +1,125 @@
# Enabling GPU support

This guide outlines how to enable GPU support on your CRN, so that users can deploy instances with GPUs.

Enabling GPU support for virtualization requires changing which kernel module (driver) handles the GPU card, as the card must be bound to the special `vfio` module to work with QEMU. Note that this makes the GPU unavailable for other purposes.

To activate this feature, you must also enable IPv6 and PAYG support on your CRN, as PAYG is the required payment method for GPU instances. Follow the steps in [Enable PAYG](enable-payg.md).

## Devices compatible with the feature

At the time of writing, the GPUs listed below are compatible with the feature. More will be added as they are tested and validated.

### Standard GPUs

| GPU Model | vCPU | RAM | vRAM | Price approx |
|--------------|------|-------|-------|-----------------|
| L40S | 12 | 72 GB | 48 GB | 3.33 ALEPH/hour |
| RTX 5090 | 8 | 48 GB | 36 GB | 2.24 ALEPH/hour |
| RTX 4090 | 6 | 36 GB | 24 GB | 1.68 ALEPH/hour |
| RTX 3090 | 4 | 24 GB | 24 GB | 1.12 ALEPH/hour |
| RTX 4000 ADA | 3 | 18 GB | 20 GB | 0.84 ALEPH/hour |

Each GPU must be connected via PCIe 4.0 x16.

### Premium GPUs

Datacenter-grade GPUs.

| GPU Model | vCPU | RAM | vRAM | Price approx |
|-----------|------|--------|-------|------------------|
| H100 | 24 | 144 GB | 80 GB | 13.33 ALEPH/hour |
| A100 | 16 | 96 GB | 80 GB | 8.89 ALEPH/hour |
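
The prices in the tables above are hourly. As a rough illustration of how they add up over a longer period, here is a minimal sketch that multiplies an hourly rate by a number of hours. It assumes the rate stays constant and counts a month as 30 days (720 hours); `estimate_cost` is a hypothetical helper name, not an aleph.im tool.

```shell
#!/bin/sh
# Sketch: estimate the ALEPH cost of running one GPU instance for some hours.
# Assumptions: constant hourly rate; a month is counted as 30 days (720 hours).
# `estimate_cost` is a hypothetical helper name used only for this example.

# Usage: estimate_cost <rate in ALEPH/hour> <hours>
estimate_cost() {
    awk -v rate="$1" -v hours="$2" 'BEGIN { printf "%.2f\n", rate * hours }'
}

# Example: one RTX 4090 (1.68 ALEPH/hour) for a 30-day month.
estimate_cost 1.68 720
```

Actual charges depend on the approximate prices listed above and on how long the instance really runs, so treat the result as an order of magnitude.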

## Switch kernel modules for the GPU
It is possible to enable multiple GPUs on one CRN.

Execute these commands as root, or prefix them with `sudo` on Ubuntu systems.

1. Edit the initramfs module list to attach the GPU to vfio:
`vi /etc/initramfs-tools/modules` and set the content to
```
attach : vfio vfio_iommu_type1 vfio_virqfd vfio_pci ids=10de:27b0,10de:22bc
```

Replace the IDs `10de:27b0,10de:22bc` with the IDs of your own card. There should be two: the video device and the GPU audio device.
You can get them by running `lspci -nvv` as root.

2. Edit `/etc/modules` to ensure that the GPU is attached to vfio at boot:
`vi /etc/modules` and add
```
attach: vfio vfio_iommu_type1 vfio_pci ids=10de:27b0,10de:22bc
```

3. Configure the NVIDIA drivers to load after vfio:
`vi /etc/modprobe.d/nvidia.conf`

and set
```
softdep nouveau pre: vfio-pci
softdep nvidia pre: vfio-pci
softdep nvidia* pre: vfio-pci
```

4. Get the module alias of the PCI device:
`cat /sys/bus/pci/devices/0000:01:00.0/modalias`
Replace the PCI address `0000:01:00.0` with the one for your GPU, which you can get using `lspci`.

5. Configure the vfio module to use that device, using the alias from step 4 and the IDs from step 1:
`vi /etc/modprobe.d/vfio.conf`

```
blacklist nouveau
blacklist snd_hda_intel
alias pci:v000010DEd000027B0sv000010DEsd000016FAbc03sc00i00 vfio-pci
options vfio-pci ids=10de:27b0,10de:22bc
```

6. Load the `vfio-pci` module:
```
modprobe vfio-pci
```

7. Update the initramfs image with your changes:
```shell
update-initramfs -u -k all
```

8. Finally, reboot the server:
```shell
reboot
```

9. Confirm that the `vfio-pci` kernel module is used for the card:
```shell
lspci -k
```

The output should look similar to:
```
[...]
01:00.0 VGA compatible controller: NVIDIA Corporation AD104GL [RTX 4000 SFF Ada Generation] (rev a1)
Subsystem: NVIDIA Corporation AD104GL [RTX 4000 SFF Ada Generation]
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
01:00.1 Audio device: NVIDIA Corporation Device 22bc (rev a1)
Subsystem: NVIDIA Corporation Device 16fa
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
[...]
```

10. Confirm that the GPUs are listed as supported on the CRN index page.
No additional modification is needed in the aleph-vm configuration, apart from [enabling PAYG](./enable-payg.md).

Start your aleph-vm supervisor and open the index page; it should list your GPU:
```
GPUs
• NVIDIA | AD104GL [RTX 4000 SFF Ada Generation] is compatible ✅
```
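
The ID lookup in step 1 and the driver check in step 9 can be scripted. The following is a minimal sketch, not part of aleph-vm: the helper names `nvidia_vfio_ids` and `all_nvidia_on_vfio` are hypothetical, and it assumes that every NVIDIA device on the host (PCI vendor ID `10de`) should be handed to `vfio-pci`. Both helpers only parse text from standard input, so they change nothing on the system.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and not part of aleph-vm.
# Assumption: all NVIDIA devices (PCI vendor ID 10de) are meant for vfio-pci.

# Reads `lspci -nn` output on stdin and prints the unique NVIDIA
# vendor:device IDs as a comma-separated list, e.g. for `ids=...`.
nvidia_vfio_ids() {
    grep -oiE '\[10de:[0-9a-f]{4}\]' \
        | tr -d '[]' \
        | tr 'A-F' 'a-f' \
        | sort -u \
        | paste -sd, -
}

# Reads `lspci -k` output on stdin. Succeeds only if at least one NVIDIA
# device is present and each one reports `Kernel driver in use: vfio-pci`.
all_nvidia_on_vfio() {
    awk '
        # Device header lines start with the PCI address (hex digits).
        /^[0-9a-f]/ { in_nvidia = ($0 ~ /NVIDIA/); if (in_nvidia) total++ }
        in_nvidia && /Kernel driver in use: vfio-pci/ { bound++ }
        END { exit !(total > 0 && total == bound) }
    '
}

# Example usage on the CRN (as root):
#   lspci -nn | nvidia_vfio_ids
#   lspci -k | all_nvidia_on_vfio && echo "all NVIDIA devices use vfio-pci"
```

Review the printed IDs before pasting them into the files above. If the host has an NVIDIA device that should stay available to the host itself, remove its IDs by hand.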


## Disabling GPU support
To disable GPU support, revert the modifications made above to attach the `vfio-pci` kernel module to the GPU, then update the initramfs image and reboot as in steps 7 and 8.
22 changes: 22 additions & 0 deletions docs/nodes/compute/advanced/enable-payg.md
@@ -0,0 +1,22 @@
## Enable PAYG (Pay-As-You-Go)

Pay-As-You-Go allows users to pay for resources as they use them on the Aleph.im network, eliminating the need to hold or
stake large amounts of $ALEPH.

The feature is currently available on BASE and Avalanche C-Chain.

This guide outlines how to enable PAYG on a CRN.

#### Configure the stream reward address

1. Create an Avalanche (AVAX) or BASE wallet.
2. Open your CRN's details on the [aleph.im account page](https://account.aleph.im/) and enter the wallet address in
   the section named STREAM REWARD ADDRESS.

3. Add the reward address to the CRN configuration file `/etc/aleph-vm/supervisor.env`, in the form:
```
ALEPH_VM_PAYMENT_RECEIVER_ADDRESS="0x0000000000000000000000000000000000000000"
```
Where `0x0000000000000000000000000000000000000000` is the address of your wallet.
4. Restart the node with `systemctl restart aleph-vm-supervisor.service`.
5. Confirm that the address appears at the path `/status/config` on the CRN's URL.
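
A typo in the address is easy to miss, so it can help to validate it before editing the file. This is a minimal sketch under the assumption that the wallet address is an EVM-style address (`0x` followed by 40 hexadecimal digits), as in the placeholder above. `is_evm_address` and `payment_receiver_line` are hypothetical helper names, not aleph-vm commands, and `crn.example.org` stands in for your CRN's URL.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and not part of aleph-vm.
# Assumption: the reward address is EVM-style (0x + 40 hex digits).

# Succeeds if $1 looks like an EVM-style address.
is_evm_address() {
    printf '%s' "$1" | grep -Eq '^0x[0-9a-fA-F]{40}$'
}

# Prints the supervisor.env line for the given address, or fails with
# a message on stderr if the address does not look valid.
payment_receiver_line() {
    if ! is_evm_address "$1"; then
        echo "invalid address: $1" >&2
        return 1
    fi
    printf 'ALEPH_VM_PAYMENT_RECEIVER_ADDRESS="%s"\n' "$1"
}

# Example usage (as root; replace the address and host with your own):
#   payment_receiver_line "0x0000000000000000000000000000000000000000" \
#       >> /etc/aleph-vm/supervisor.env
#   systemctl restart aleph-vm-supervisor.service
#   curl -s "https://crn.example.org/status/config" \
#       | grep -i "0x0000000000000000000000000000000000000000"
```

The final `grep` only checks that the address string appears somewhere in the response; it makes no assumption about the exact layout of `/status/config`.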
4 changes: 3 additions & 1 deletion mkdocs.yml
@@ -75,7 +75,9 @@ nav:
- 'Ubuntu 22.04': nodes/compute/installation/ubuntu-22.04.md
- 'Ubuntu 24.04': nodes/compute/installation/ubuntu-24.04.md
- 'Troubleshooting': nodes/compute/troubleshooting.md
- 'Confidential computing': nodes/compute/advanced/enable-confidential.md
- 'Enable PAYG': nodes/compute/advanced/enable-payg.md
- 'Enable Confidential computing': nodes/compute/advanced/enable-confidential.md
- 'Enable GPU support': nodes/compute/advanced/enable-gpu.md
- 'Releases': nodes/compute/releases.md
- 'Reliability':
- 'Introduction': nodes/reliability/index.md