Welcome to SkyPilot!
Run LLMs and AI on Any Cloud
SkyPilot is a framework for running LLMs, AI, and batch jobs on any cloud, offering maximum cost savings, highest GPU availability, and managed execution.
SkyPilot abstracts away cloud infra burdens:
Launch jobs & clusters on any cloud
Easy scale-out: queue and run many jobs, automatically managed
Easy access to object stores (S3, GCS, Azure, R2, IBM)
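As a sketch of this workflow (the cluster name, bucket, and file paths below are illustrative placeholders, not from this page), a task is declared in a short YAML file and launched with one command:

```yaml
# task.yaml — a minimal, hypothetical SkyPilot task (names/paths are illustrative).
resources:
  accelerators: A100:1   # request one A100 GPU on whichever cloud can provide it

file_mounts:
  /data: s3://my-bucket  # mount an object store bucket (bucket name is a placeholder)

setup: |
  pip install -r requirements.txt

run: |
  python train.py --data-dir /data
```

Launching is then `sky launch -c mycluster task.yaml`: SkyPilot provisions a VM, syncs the working directory, runs `setup` once, and executes `run` on the cluster.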
SkyPilot maximizes GPU availability for your jobs:
Provision in all zones/regions/clouds you have access to (the Sky), with automatic failover
SkyPilot cuts your cloud costs:
Managed Spot: 3-6x cost savings using spot VMs, with auto-recovery from preemptions
Optimizer: 2x cost savings by auto-picking the cheapest VM/zone/region/cloud
Autostop: hands-free cleanup of idle clusters
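The cost levers above can be sketched on the CLI as follows (cluster and file names are illustrative; `task.yaml` is assumed to exist):

```console
# Optimizer: auto-pick the cheapest cloud/region/zone that offers the requested GPU
sky launch -c mycluster --gpus A100 task.yaml

# Managed Spot: run on spot VMs with automatic recovery from preemptions
sky spot launch task.yaml

# Autostop: tear down the cluster after 10 idle minutes
sky autostop -i 10 mycluster
```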
SkyPilot supports your existing GPU, TPU, and CPU workloads, with no code changes.
Currently supported providers: AWS, GCP, Azure, OCI, Lambda Cloud, RunPod, Fluidstack, Cudo, IBM, Samsung, Cloudflare, VMware vSphere, and any Kubernetes cluster.
More Information
Tutorials: SkyPilot Tutorials
Runnable examples:
LLMs on SkyPilot
Mixtral 8x7B; Mistral 7B (from official Mistral team)
vLLM: Serving LLM 24x Faster On the Cloud (from official vLLM team)
SGLang: Fast and Expressive LLM Serving On the Cloud (from official SGLang team)
Vicuna chatbots: Training & Serving (from official Vicuna team)
Add yours here & see more in llm/!
Framework examples: PyTorch DDP, DeepSpeed, JAX/Flax on TPU, Stable Diffusion, Detectron2, Distributed TensorFlow, NeMo, programmatic grid search, Docker, Cog, Unsloth, Ollama, llm.c and many more.
Read the research:
SkyPilot paper and talk (NSDI 2023)
Sky Computing vision paper (HotOS 2021)