These instructions cover the usage of the latest stable release of SHARK. For a more bleeding-edge release, please install the nightly releases.
Tip
While we prepare the next stable release, please use the nightly releases.
Our current user guide requires that you have:
- Access to a computer with an installed AMD Instinct™ MI300X Series Accelerator
- Installed a compatible version of Linux and ROCm on the computer (see the ROCm compatibility matrix); a quick way to verify the installation is sketched below
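If you want to confirm that ROCm is installed and the accelerator is visible before continuing, a check along the following lines can help. This assumes ROCm's command-line tools (`rocminfo`, `rocm-smi`) are on your `PATH`; it is an optional sanity check, not part of the official steps:

```bash
# Show the installed ROCm version (the file path may vary by ROCm release).
cat /opt/rocm/.info/version

# List the GPU targets ROCm can see; an MI300X typically reports a gfx94x target.
rocminfo | grep -i "gfx"

# Summarize accelerator status (temperature, memory, utilization).
rocm-smi
```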
This section will help you install Python and set up a Python environment with venv.
Officially we support Python versions: 3.11, 3.12, 3.13
The rest of this guide assumes you are using Python 3.11.
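If you are not sure which Python interpreters are already installed, a quick check such as the following will tell you whether 3.11 is available (assuming a standard Ubuntu layout where interpreters live in `/usr/bin`):

```bash
# List any CPython 3.x interpreters already on the system.
ls /usr/bin/python3.* 2>/dev/null

# Print the default python3 version.
python3 --version
```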
To install Python 3.11 on Ubuntu:
```bash
sudo apt install python3.11 python3.11-dev python3.11-venv

which python3.11
# /usr/bin/python3.11
```
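If `python3.11` is not available in your Ubuntu release's default repositories, the deadsnakes PPA is one common way to get it. Whether to add a third-party PPA is your call; this is a hedged sketch, not an official instruction:

```bash
# Add the deadsnakes PPA (third-party) and install Python 3.11 from it.
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.11 python3.11-dev python3.11-venv
```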
Set up your Python environment with the following commands:

```bash
# Set up a virtual environment to isolate packages from other envs.
python3.11 -m venv 3.11.venv
source 3.11.venv/bin/activate
```
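Once the environment is active, it can be worth confirming that the shell is using the venv's interpreter and that pip is up to date. This is a small optional check, not part of the official steps:

```bash
# The interpreter should now resolve inside the virtual environment.
which python
# .../3.11.venv/bin/python

# Upgrade pip inside the venv before installing packages.
python -m pip install --upgrade pip
```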
First install a torch version that fulfills your needs:
```bash
# Fast installation of torch with just CPU support.
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
```
For other options, see https://pytorch.org/get-started/locally/.
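After the install finishes, a one-liner like the following can confirm which torch build you ended up with. This is an illustrative smoke test, not part of the official guide; with the CPU-only wheel, `torch.cuda.is_available()` is expected to print `False`:

```bash
# Confirm the installed torch version and whether a GPU backend is visible.
python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"
```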
Next install shark-ai:
```bash
pip install shark-ai[apps]
```
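To see which SHARK packages ended up in the environment, a quick listing like this can help (assuming the distribution names contain "shark" or "shortfin", which matches the current package naming):

```bash
# List the installed SHARK-related packages and their versions.
pip list | grep -iE "shark|shortfin"
```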
Tip
To switch from the stable release channel to the nightly release channel, see nightly_releases.md.
To test the installation, check that the SDXL server's help output is available:

```bash
python -m shortfin_apps.sd.server --help
```
As part of our current release we support serving SDXL and Llama 3.1 variants, as well as an initial release of sharktank, SHARK's model development toolkit, which is used to compile these models for high performance.
To get started with SDXL, please follow the SDXL User Guide.
To get started with Llama 3.1, please follow the Llama User Guide.
- Once you've set up the Llama server in the guide above, we recommend that you use the SGLang frontend by following the Using `shortfin` with `sglang` guide.
- If you would like to deploy Llama on a Kubernetes cluster, we also provide a simple set of instructions and deployment configuration to do so here.
- Finally, if you'd like to leverage the instructions above to run against a different variant of Llama 3.1, it's supported. However, you will need to generate a GGUF dataset for that variant (explained in the user guide; a hedged download sketch follows below). In future releases, we plan to streamline these instructions to make it easier for users to compile their own models from Hugging Face.
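As a rough illustration of the first step for a different variant, a GGUF checkpoint can be pulled from Hugging Face with `huggingface-cli`. The repository and file names below are placeholders, so substitute the community GGUF repo for the variant you actually want; the conversion and compilation steps themselves are covered in the Llama user guide:

```bash
# Download a GGUF file for a Llama 3.1 variant (placeholder repo and file names).
huggingface-cli download \
  <gguf-repo-for-your-variant> \
  <model-file>.gguf \
  --local-dir /tmp/llama3.1_gguf
```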