Expand the accelerator metadata format #3324

achimnol · 2024-12-29T07:26:43Z

Currently AcceleratorMetadata (a typed dict) has the following fields:

class AcceleratorMetadata(TypedDict):
    slot_name: str
    description: str
    human_readable_name: str
    display_unit: str
    number_format: AcceleratorNumberFormat
    display_icon: str

Let’s expand the metadata to provide additional information like:

The supported floating-point and integer number formats for matrix calculation (e.g., FP8, INT8, BF16, etc.), expressed in a standardized way
Vendor-specific versioning schemes, e.g., NVIDIA: cuda_compute_capability, host_cuda_driver_version
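
A minimal sketch of how the two additions listed above could look as fields. The names ExpandedAcceleratorMetadata, MatrixNumberFormat, supported_matrix_formats, and vendor_versions are hypothetical and not part of the current codebase; the example values are illustrative only.

```python
from typing import Literal, TypedDict

# Hypothetical standardized names for matrix-calculation number formats.
MatrixNumberFormat = Literal["FP32", "TF32", "FP16", "BF16", "FP8", "INT8"]


class ExpandedAcceleratorMetadata(TypedDict, total=False):
    # Existing fields, copied from AcceleratorMetadata.
    slot_name: str
    description: str
    human_readable_name: str
    display_unit: str
    # Stand-in type: the definition of AcceleratorNumberFormat is not shown here.
    number_format: dict[str, object]
    display_icon: str
    # Proposed additions (hypothetical field names).
    supported_matrix_formats: list[MatrixNumberFormat]
    # Vendor-specific version strings keyed by scheme name,
    # e.g. "cuda_compute_capability" and "host_cuda_driver_version" for NVIDIA.
    vendor_versions: dict[str, str]


# Illustrative values only; not taken from a real device report.
example: ExpandedAcceleratorMetadata = {
    "slot_name": "cuda.device",
    "supported_matrix_formats": ["FP16", "BF16", "TF32", "INT8"],
    "vendor_versions": {
        "cuda_compute_capability": "8.0",
        "host_cuda_driver_version": "12.4",
    },
}
```

Keeping vendor_versions as a free-form string mapping avoids hard-coding NVIDIA-only keys into the shared type, at the cost of weaker type checking.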

This information could be used to choose the appropriate models and their runtimes in future model player implementations.
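
As a rough illustration of that use case, a model player could walk a preference-ordered list of number formats and take the first one the accelerator reports. The function pick_number_format and its parameters are hypothetical; this is a sketch, not a proposed API.

```python
from typing import Optional


def pick_number_format(
    supported_formats: set[str],
    preferred_order: list[str],
) -> Optional[str]:
    """Return the first preferred format that the accelerator supports, or None."""
    for fmt in preferred_order:
        if fmt in supported_formats:
            return fmt
    return None
```

For example, a model published in FP8, BF16, and FP16 variants would fall back to FP16 on a device that reports only FP16 and INT8.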

Reference

PyTorch

https://github.com/pytorch/pytorch/blob/7101b8ca3574bd9d1997c3539bc0ad7a6df3ba6e/torch/cuda/__init__.py#L132
Checks the CUDA version and the device’s compute capability version.
ref) https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#compute-capabilities
7.0: FP16, INT8 (Volta architecture)
8.0: FP16, BF16, INT8, TF32 (Ampere architecture)
8.9: (8.0) + FP8 (Ada Lovelace architecture)
9.0: FP8 (E4M3 or E5M2) (Hopper architecture)
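
A capability-to-format table like this could be encoded as a lookup. The sketch below is an assumption-laden illustration: the names FORMATS_BY_COMPUTE_CAPABILITY and supported_formats are hypothetical, the entries should be verified against NVIDIA's documentation, and the fallback to the nearest lower listed capability is a simplification.

```python
# Hypothetical table; entries are assumptions to verify against NVIDIA's docs.
FORMATS_BY_COMPUTE_CAPABILITY: dict[tuple[int, int], frozenset[str]] = {
    (7, 0): frozenset({"FP16", "INT8"}),                          # Volta
    (8, 0): frozenset({"FP16", "BF16", "TF32", "INT8"}),          # Ampere
    (8, 9): frozenset({"FP16", "BF16", "TF32", "INT8", "FP8"}),   # Ada Lovelace
    (9, 0): frozenset({"FP16", "BF16", "TF32", "INT8", "FP8"}),   # Hopper
}


def parse_compute_capability(value: str) -> tuple[int, int]:
    """Parse a 'major.minor' string such as '8.6' into (8, 6)."""
    major, minor = value.split(".")
    return int(major), int(minor)


def supported_formats(compute_capability: str) -> frozenset[str]:
    """Return the formats of the newest listed capability not above the given one."""
    target = parse_compute_capability(compute_capability)
    candidates = [cc for cc in FORMATS_BY_COMPUTE_CAPABILITY if cc <= target]
    if not candidates:
        # Older than every listed capability: report nothing rather than guess.
        return frozenset()
    return FORMATS_BY_COMPUTE_CAPABILITY[max(candidates)]
```

Comparing (major, minor) tuples instead of raw strings keeps the ordering correct once a major version reaches two digits.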


achimnol self-assigned this Jan 2, 2025
