GPUs and Accelerators#
SkyPilot supports a wide range of GPUs, TPUs, and other accelerators.
Supported accelerators#
$ sky show-gpus -a
COMMON_GPU        AVAILABLE_QUANTITIES
A10               1, 2, 4
A10G              1, 4, 8
A100              1, 2, 4, 8, 16
A100-80GB         1, 2, 4, 8
H100              1, 2, 4, 8, 12
L4                1, 2, 4, 8
L40S              1, 2, 4, 8
P100              1, 2, 4
T4                1, 2, 4, 8
V100              1, 2, 4, 8
V100-32GB         1, 2, 4, 8

GOOGLE_TPU        AVAILABLE_QUANTITIES
tpu-v2-8          1
tpu-v3-8          1
tpu-v4-8          1
tpu-v4-16         1
tpu-v4-32         1
tpu-v5litepod-1   1
tpu-v5litepod-4   1
tpu-v5litepod-8   1
tpu-v5p-8         1
tpu-v5p-16        1
tpu-v5p-32        1
tpu-v6e-1         1
tpu-v6e-4         1
tpu-v6e-8         1

OTHER_GPU         AVAILABLE_QUANTITIES
A100-80GB-SXM     1, 2, 4, 8
A40               1, 2, 4, 8
A4000             1, 2, 4
A6000             1, 2, 4
GH200             1
Gaudi HL-205      8
H100-MEGA         8
H100-SXM          1, 2, 4, 8
H200              8
K80               1, 2, 4, 8, 16
L40               1, 2, 4, 8
M4000             1
M60               1, 2, 4
P4                1, 2, 4
P4000             1, 2
RTX3060           1, 2
RTX3080           1
RTX3090           1, 2, 4, 8
RTX4000-Ada       1, 2, 4, 8
RTX4090           1, 2, 3, 4, 6, 8, 12
RTX6000           1
RTX6000-Ada       1, 2, 4, 8
RTXA4000          1, 2, 4, 8
RTXA4500          1, 2, 4, 8
RTXA5000          1, 2, 4, 8
RTXA6000          1, 2, 4, 8
Radeon MI25       1
Radeon Pro V520   1, 2, 4
T4g               1, 2

... [omitted long outputs] ...
Behind the scenes, these details are encoded in the SkyPilot Catalog: skypilot-org/skypilot-catalog.
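To restrict the listing to a single cloud, pass --cloud to the same command (the same flag is used for Kubernetes below); for example, to show only GCP's offerings:
$ sky show-gpus --cloud gcp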
Accelerators in Kubernetes#
Your Kubernetes clusters may contain only certain accelerators. To query which accelerators are available in them:
$ sky show-gpus --cloud k8s
Kubernetes GPUs
GPU   REQUESTABLE_QTY_PER_NODE  TOTAL_GPUS  TOTAL_FREE_GPUS
L4    1, 2, 4                   12          12
H100  1, 2, 4, 8                16          16

Kubernetes per node GPU availability
NODE_NAME     GPU_NAME  TOTAL_GPUS  FREE_GPUS
my-cluster-0  L4        4           4
my-cluster-1  L4        4           4
my-cluster-2  L4        2           2
my-cluster-3  L4        2           2
my-cluster-4  H100      8           8
my-cluster-5  H100      8           8
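Given the example cluster above, a request that fits on a single node can be sent straight to Kubernetes; a minimal sketch (the cluster name l4-test is illustrative):
$ sky launch -c l4-test --cloud kubernetes --gpus L4:4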
Querying accelerator details#
You can query the details of a supported accelerator config, accelerator:count:
$ sky show-gpus H100:8
GPU        QTY  CLOUD       INSTANCE_TYPE          DEVICE_MEM  vCPUs  HOST_MEM  HOURLY_PRICE  HOURLY_SPOT_PRICE  REGION
H100       8    Vast        8x-H100_NVL-32-65536   749GB       32     64GB      $ 16.000      $ 16.000           Australia, AU, OC
H100       8    Vast        8x-H100_SXM-32-65536   637GB       32     64GB      $ 21.000      $ 10.670           Iceland, IS, EU
H100       8    Lambda      gpu_8x_h100_sxm5       80GB        208    1800GB    $ 23.920      -                  europe-central-1
H100       8    Fluidstack  H100_NVLINK_80GB::8    80GB        252    1440GB    $ 23.920      -                  FINLAND
H100       8    RunPod      8x_H100_SECURE         -           128    640GB     $ 35.920      -                  CA
H100       8    GCP         a3-highgpu-8g          80GB        208    1872GB    $ 46.021      $ 35.133           us-central1
H100       8    Paperspace  H100x8                 -           128    640GB     $ 47.600      -                  East Coast (NY2)
H100       8    DO          gpu-h100x8-640gb       80GB        160    1920GB    $ 47.600      -                  tor1
H100       8    OCI         BM.GPU.H100.8          80GB        224    2048GB    $ 80.000      -                  eu-amsterdam-1
H100       8    AWS         p5.48xlarge            80GB        192    2048GB    $ 98.320      $ 9.832            us-east-1

GPU        QTY  CLOUD       INSTANCE_TYPE          DEVICE_MEM  vCPUs  HOST_MEM  HOURLY_PRICE  HOURLY_SPOT_PRICE  REGION
H100-MEGA  8    GCP         a3-megagpu-8g          80GB        208    1872GB    $ 92.214      $ 36.886           us-central1

GPU        QTY  CLOUD       INSTANCE_TYPE          DEVICE_MEM  vCPUs  HOST_MEM  HOURLY_PRICE  HOURLY_SPOT_PRICE  REGION
H100-SXM   8    RunPod      8x_H100-SXM_SECURE     -           208    640GB     $ 37.520      -                  CA
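The HOURLY_SPOT_PRICE column pairs with sky launch's --use-spot flag; for example, a sketch of requesting the same configuration on spot capacity (the cluster name is illustrative):
$ sky launch -c h100-spot --gpus H100:8 --use-spot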
Requesting accelerators#
You can use accelerator:count in various places that accept accelerator specifications.
$ sky launch --gpus H100:8
$ sky launch --gpus H100  # If count is omitted, it defaults to 1.
$ sky exec my-h100-8-cluster --gpus H100:0.5 job.yaml
# In SkyPilot YAML:
resources:
  accelerators: H100:8

# Set: ask SkyPilot to auto-choose the cheapest and available option.
resources:
  accelerators: {H100:8, A100:8}

# List: ask SkyPilot to try each one in order.
resources:
  accelerators: [L4:8, L40S:8, A10G:8, A10:8]
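Putting the snippets above together, a minimal task YAML might look like the following sketch (the file name and run command are illustrative):
# task.yaml
resources:
  accelerators: H100:8

run: |
  nvidia-smi

$ sky launch -c my-h100-8-cluster task.yaml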
See Provisioning Compute for more examples.
Google TPUs#
See Cloud TPU.
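The TPU names listed above are requested through the same accelerators field; a minimal sketch (TPU-specific options such as runtime versions are covered in Cloud TPU):
resources:
  accelerators: tpu-v2-8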