Installation#
Note
On Macs, macOS >= 10.15 is required to install SkyPilot. On Apple Silicon devices (e.g., Apple M1), run pip uninstall grpcio; conda install -c conda-forge grpcio=1.43.0 before installing SkyPilot.
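Install SkyPilot using pip. The three command sets below cover, in order, the nightly build (skypilot-nightly), the latest release (skypilot), and an editable install from source: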
# Recommended: use a new conda env to avoid package conflicts.
# SkyPilot requires 3.7 <= python <= 3.10.
conda create -y -n sky python=3.10
conda activate sky
# Choose your cloud:
pip install "skypilot-nightly[aws]"
pip install "skypilot-nightly[gcp]"
pip install "skypilot-nightly[azure]"
pip install "skypilot-nightly[oci]"
pip install "skypilot-nightly[lambda]"
pip install "skypilot-nightly[runpod]"
pip install "skypilot-nightly[fluidstack]"
pip install "skypilot-nightly[cudo]"
pip install "skypilot-nightly[ibm]"
pip install "skypilot-nightly[scp]"
pip install "skypilot-nightly[vsphere]"
pip install "skypilot-nightly[kubernetes]"
pip install "skypilot-nightly[all]"
# Recommended: use a new conda env to avoid package conflicts.
# SkyPilot requires 3.7 <= python <= 3.10.
conda create -y -n sky python=3.10
conda activate sky
# Choose your cloud:
pip install "skypilot[aws]"
pip install "skypilot[gcp]"
pip install "skypilot[azure]"
pip install "skypilot[oci]"
pip install "skypilot[lambda]"
pip install "skypilot[runpod]"
pip install "skypilot[fluidstack]"
pip install "skypilot[cudo]"
pip install "skypilot[ibm]"
pip install "skypilot[scp]"
pip install "skypilot[vsphere]"
pip install "skypilot[kubernetes]"
pip install "skypilot[all]"
# Recommended: use a new conda env to avoid package conflicts.
# SkyPilot requires 3.7 <= python <= 3.10.
conda create -y -n sky python=3.10
conda activate sky
git clone https://github.com/skypilot-org/skypilot.git
cd skypilot
# Choose your cloud:
pip install -e ".[aws]"
pip install -e ".[gcp]"
pip install -e ".[azure]"
pip install -e ".[oci]"
pip install -e ".[lambda]"
pip install -e ".[runpod]"
pip install -e ".[fluidstack]"
pip install -e ".[cudo]"
pip install -e ".[ibm]"
pip install -e ".[scp]"
pip install -e ".[vsphere]"
pip install -e ".[kubernetes]"
pip install -e ".[all]"
To use more than one cloud, combine the pip extras (shown for the nightly build, the latest release, and a source install, respectively):
pip install -U "skypilot-nightly[aws,gcp]"
pip install -U "skypilot[aws,gcp]"
pip install -e ".[aws,gcp]"
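The version bound and the extras syntax above can be checked before running pip. The following is a minimal sketch that assumes only what this page states (Python 3.7 through 3.10, extras joined by commas). The function names are hypothetical helpers, not part of SkyPilot.

```python
import sys

# Bounds taken from the comment in the install commands above.
MIN_PY = (3, 7)
MAX_PY = (3, 10)


def python_supported(version=None):
    """Return True if (major, minor) falls within the documented range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_PY <= (major, minor) <= MAX_PY


def requirement(package, clouds):
    """Build a pip requirement string such as 'skypilot[aws,gcp]'."""
    if not clouds:
        raise ValueError("choose at least one cloud extra")
    return "{}[{}]".format(package, ",".join(clouds))
```

For example, requirement("skypilot-nightly", ["aws", "gcp"]) produces the same string that is passed to pip install -U above.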
Alternatively, we also provide a Docker image as a quick way to try out SkyPilot.
Verifying cloud access#
After installation, run sky check to verify that credentials are correctly set up:
sky check
This will produce a summary like:
Checking credentials to enable clouds for SkyPilot.
AWS: enabled
GCP: enabled
Azure: enabled
OCI: enabled
Lambda: enabled
RunPod: enabled
Fluidstack: enabled
Cudo: enabled
IBM: enabled
SCP: enabled
vSphere: enabled
Cloudflare (for R2 object store): enabled
Kubernetes: enabled
SkyPilot will use only the enabled clouds to run tasks. To change this, configure cloud credentials, and run sky check.
If any cloud’s credentials or dependencies are missing, sky check will output hints on how to resolve them. You can also refer to the cloud setup section below.
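When scripting around this step, the summary can be reduced to a list of enabled clouds. The sketch below assumes plain-text lines in the "Name: enabled" shape shown in the sample summary above. Real output may carry extra characters or a different layout, and enabled_clouds is a hypothetical helper rather than a SkyPilot API.

```python
def enabled_clouds(summary):
    """Return the names of clouds reported as 'enabled' in a sky check summary."""
    enabled = []
    for line in summary.splitlines():
        # Split on the last colon so names containing parentheses stay intact.
        name, sep, status = line.rpartition(":")
        if sep and status.strip() == "enabled":
            enabled.append(name.strip())
    return enabled
```

Any status other than the literal word enabled is treated as not enabled.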
Tip
If your clouds show enabled: 🎉 🎉 Congratulations! 🎉 🎉 You can now head over to Quickstart to get started with SkyPilot.
Cloud account setup#
SkyPilot currently supports these cloud providers: AWS, GCP, Azure, OCI, Lambda Cloud, RunPod, Fluidstack, Cudo, IBM, SCP, VMware vSphere and Cloudflare (for R2 object store).
If you already have cloud access set up on your local machine, run sky check to verify that SkyPilot can properly access your enabled clouds.
Otherwise, configure access to at least one cloud using the following guides.
Amazon Web Services (AWS)#
To get the AWS access key required by aws configure, go to the AWS IAM Management Console and click on the “Access keys” dropdown (detailed instructions here). The Default region name [None]: and Default output format [None]: fields are optional and can be left blank to choose defaults.
# Install boto
pip install boto3
# Configure your AWS credentials
aws configure
To use AWS IAM Identity Center (AWS SSO), see here for instructions.
Optional: To create a new AWS user with minimal permissions for SkyPilot, see AWS User Creation.
Google Cloud Platform (GCP)#
conda install -c conda-forge google-cloud-sdk
gcloud init
# Run this if you don't have a credentials file.
# This will generate ~/.config/gcloud/application_default_credentials.json.
gcloud auth application-default login
Tip
If you are using multiple GCP projects, list all the projects with gcloud projects list and activate one with gcloud config set project <PROJECT_ID> (see GCP docs).
Common GCP installation errors
Here are some commonly encountered errors and their fixes:
RemoveError: 'requests' is a dependency of conda and cannot be removed from conda's operating environment when running conda install -c conda-forge google-cloud-sdk: run conda update --force conda first and rerun the command.
Authorization Error (Error 400: invalid_request) with the URL generated by gcloud auth login: install the latest version of the Google Cloud SDK (e.g., with conda install -c conda-forge google-cloud-sdk) on your local machine (which opened the browser) and rerun the command.
Optional: To create and use a long-lived service account on your local machine, see here.
Optional: To create a new GCP user with minimal permissions for SkyPilot, see GCP User Creation.
Azure#
# Login
az login
# Set the subscription to use
az account set -s <subscription_id>
Hint: run az account subscription list to get a list of subscription IDs under your account.
Oracle Cloud Infrastructure (OCI)#
To access Oracle Cloud Infrastructure (OCI), set up the credentials by following this guide. After completing the steps in the guide, the ~/.oci folder should contain the following files:
~/.oci/config
~/.oci/oci_api_key.pem
The ~/.oci/config file should contain the following fields:
[DEFAULT]
user=ocid1.user.oc1..aaaaaaaa
fingerprint=aa:bb:cc:dd:ee:ff:gg:hh:ii:jj:kk:ll:mm:nn:oo:pp
tenancy=ocid1.tenancy.oc1..aaaaaaaa
region=us-sanjose-1
key_file=~/.oci/oci_api_key.pem
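A quick way to confirm that the config file carries every field listed above is to parse it as an INI file. This is a minimal sketch using Python's standard configparser. The set of required fields is taken from the example above, and missing_oci_fields is a hypothetical helper, not something SkyPilot or the OCI SDK provides.

```python
import configparser

# Field names copied from the example ~/.oci/config above.
REQUIRED_OCI_FIELDS = ("user", "fingerprint", "tenancy", "region", "key_file")


def missing_oci_fields(config_text):
    """Return the required fields that are absent or empty in [DEFAULT]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    defaults = parser.defaults()  # key/value pairs of the [DEFAULT] section
    return [f for f in REQUIRED_OCI_FIELDS if not defaults.get(f, "").strip()]
```

Passing the contents of ~/.oci/config should return an empty list when all five fields are present. The sketch does not check that the key file exists or that the values are valid.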
Lambda Cloud#
Lambda Cloud is a cloud provider offering low-cost GPUs. To configure Lambda Cloud access, go to the API Keys page on your Lambda console to generate a key and then add it to ~/.lambda_cloud/lambda_keys:
mkdir -p ~/.lambda_cloud
echo "api_key = <your_api_key_here>" > ~/.lambda_cloud/lambda_keys
RunPod#
RunPod is a specialized AI cloud provider that offers low-cost GPUs. To configure RunPod access, go to the Settings page on your RunPod console and generate an API key. Then, run:
pip install "runpod>=1.5.1"
runpod config
Fluidstack#
Fluidstack is a cloud provider offering low-cost GPUs. To configure Fluidstack access, go to the Home page on your Fluidstack console to generate an API key and then add the API key to ~/.fluidstack/api_key and the API token to ~/.fluidstack/api_token:
mkdir -p ~/.fluidstack
echo "your_api_key_here" > ~/.fluidstack/api_key
echo "your_api_token_here" > ~/.fluidstack/api_token
Cudo Compute#
Cudo Compute GPU cloud provides low-cost GPUs powered by green energy.
Create an API Key by following this guide.
Download and install the cudoctl command line tool.
Run cudoctl init:
cudoctl init
✔ api key: my-api-key
✔ project: my-project
✔ billing account: my-billing-account
✔ context: default
config file saved ~/.config/cudo/cudo.yml
pip install "cudocompute>=0.1.8"
To use SkyPilot with a different Cudo Compute account or project, run cudoctl init again.
IBM#
To access IBM’s VPC service, store the following fields in ~/.ibm/credentials.yaml:
iam_api_key: <user_personal_api_key>
resource_group_id: <resource_group_user_is_a_member_of>
Create a new API key by following this guide.
Obtain a resource group’s ID from the web console.
Note
Stock images do not currently provide ML tools out of the box. Create private images with the necessary tools (e.g., CUDA) by following the IBM segment in this documentation.
To access IBM’s Cloud Object Storage (COS), append the following fields to the credentials file:
access_key_id: <access_key_id>
secret_access_key: <secret_key_id>
To get access_key_id and secret_access_key, use the IBM web console:
Create/Select a COS instance from the web console.
From the “Service Credentials” tab, click “New Credential” and toggle “Include HMAC Credential”.
Copy “secret_access_key” and “access_key_id” to the credentials file.
Finally, install rclone via: curl https://rclone.org/install.sh | sudo bash
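Since the VPC and COS fields share one credentials file, it can help to see which group is still incomplete. The sketch below uses a naive line-based parse in place of a real YAML parser. It assumes the file contains only flat key: value lines like those shown above. The helper names are hypothetical.

```python
VPC_KEYS = ("iam_api_key", "resource_group_id")
COS_KEYS = ("access_key_id", "secret_access_key")


def parse_flat_yaml(text):
    """Naively parse top-level 'key: value' lines; not a general YAML parser."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key.strip()] = value.strip().strip("'\"")
    return fields


def ibm_credential_report(text):
    """Return (missing VPC keys, missing COS keys) for a credentials file body."""
    fields = parse_flat_yaml(text)
    missing_vpc = [k for k in VPC_KEYS if not fields.get(k)]
    missing_cos = [k for k in COS_KEYS if not fields.get(k)]
    return missing_vpc, missing_cos
```

An empty first list means the VPC fields are present. The second list only matters if you plan to use IBM COS.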
Note
sky check does not reflect IBM COS’s enabled status. IBM: enabled only guarantees that IBM VM instances are enabled.
Samsung Cloud Platform (SCP)#
Samsung Cloud Platform (SCP) provides cloud services optimized for enterprise customers. You can learn more about SCP here.
To configure SCP access, you need access keys and the ID of the project in which your tasks will run. Go to the Access Key Management page on your SCP console to generate the access keys, and the Project Overview page for the project ID. Then, add them to ~/.scp/scp_credential by running:
# Create directory if required
mkdir -p ~/.scp
# Add the lines for "access_key", "secret_key", and "project_id" to scp_credential file
echo "access_key = <your_access_key>" >> ~/.scp/scp_credential
echo "secret_key = <your_secret_key>" >> ~/.scp/scp_credential
echo "project_id = <your_project_id>" >> ~/.scp/scp_credential
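Because the commands above append with >>, running them a second time leaves duplicate lines in the file. This page does not say how SkyPilot treats duplicate keys, so the sketch below only reports them. It assumes the "key = value" layout written by the echo commands, and scp_credential_problems is a hypothetical helper.

```python
from collections import Counter

# Key names written by the echo commands above.
SCP_KEYS = ("access_key", "secret_key", "project_id")


def scp_credential_problems(text):
    """Return human-readable problems found in an scp_credential file body."""
    keys = []
    for line in text.splitlines():
        if "=" in line:
            keys.append(line.split("=", 1)[0].strip())
    counts = Counter(keys)
    problems = ["missing: " + k for k in SCP_KEYS if counts[k] == 0]
    problems += ["duplicated: " + k for k in SCP_KEYS if counts[k] > 1]
    return problems
```

If duplicates are reported, one option is to delete ~/.scp/scp_credential and rerun the three echo commands once.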
Note
Multi-node clusters are currently not supported on SCP.
VMware vSphere#
To configure VMware vSphere access, store the vSphere credentials in ~/.vsphere/credential.yaml:
mkdir -p ~/.vsphere
touch ~/.vsphere/credential.yaml
Here is an example of configuration within the credential file:
vcenters:
  - name: <your_vsphere_server_ip_01>
    username: <your_vsphere_user_name>
    password: <your_vsphere_user_passwd>
    skip_verification: true # If your vCenter has a valid certificate, change this to 'false'
    # Clusters that can be used by SkyPilot:
    #   [] means all the clusters in the vSphere can be used by SkyPilot
    #   Instead, you can specify the clusters in a list:
    #   clusters:
    #     - name: <your_vsphere_cluster_name1>
    #     - name: <your_vsphere_cluster_name2>
    clusters: []
  # If you are configuring only one vSphere instance, omit the following entry.
  - name: <your_vsphere_server_ip_02>
    username: <your_vsphere_user_name>
    password: <your_vsphere_user_passwd>
    skip_verification: true
    clusters: []
After configuring the vSphere credentials, ensure that the necessary preparations for vSphere are completed. Please refer to this guide for more information: Cloud Preparation for vSphere
Cloudflare R2#
Cloudflare offers R2, an S3-compatible object store without any egress charges. SkyPilot can download/upload data to R2 buckets and mount them as a local filesystem on clusters launched by SkyPilot. To set up R2 support, run:
# Install boto
pip install boto3
# Configure your R2 credentials
AWS_SHARED_CREDENTIALS_FILE=~/.cloudflare/r2.credentials aws configure --profile r2
In the prompt, enter your R2 Access Key ID and Secret Access Key (see instructions to generate R2 credentials). Select auto for the default region and json for the default output format.
AWS Access Key ID [None]: <access_key_id>
AWS Secret Access Key [None]: <access_key_secret>
Default region name [None]: auto
Default output format [None]: json
Next, get your Account ID from your R2 dashboard and store it in ~/.cloudflare/accountid with:
mkdir -p ~/.cloudflare
echo <YOUR_ACCOUNT_ID_HERE> > ~/.cloudflare/accountid
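The R2 setup spans two files, so a small check can confirm both are in place. This sketch assumes that aws configure --profile r2 wrote an [r2] section with aws_access_key_id and aws_secret_access_key into the credentials file, which is the standard AWS credentials layout. r2_setup_problems is a hypothetical helper, and it does not verify that the keys are accepted by Cloudflare.

```python
import configparser
import os


def r2_setup_problems(cloudflare_dir):
    """Check r2.credentials and accountid inside a ~/.cloudflare-style directory."""
    problems = []

    cred_path = os.path.join(cloudflare_dir, "r2.credentials")
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(cred_path)  # a missing file is skipped without raising
    if not parser.has_section("r2"):
        problems.append("no [r2] profile in r2.credentials")
    else:
        for key in ("aws_access_key_id", "aws_secret_access_key"):
            if not parser.get("r2", key, fallback="").strip():
                problems.append("r2 profile lacks " + key)

    id_path = os.path.join(cloudflare_dir, "accountid")
    if not os.path.isfile(id_path):
        problems.append("accountid file not found")
    else:
        with open(id_path) as f:
            if not f.read().strip():
                problems.append("accountid file is empty")
    return problems
```

Calling r2_setup_problems(os.path.expanduser("~/.cloudflare")) returns an empty list when both files look complete.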
Kubernetes#
SkyPilot can also run tasks on on-premises or cloud-hosted Kubernetes clusters (e.g., EKS, GKE). The only requirement is a valid kubeconfig at ~/.kube/config.
# Place your kubeconfig at ~/.kube/config
mkdir -p ~/.kube
cp /path/to/kubeconfig ~/.kube/config
See SkyPilot on Kubernetes for more.
Requesting quotas for first time users#
If your cloud account has not been used to launch instances before, the respective quotas are likely set to zero or a low limit. This is especially true for GPU instances.
Please follow Requesting Quota Increase to check quotas and request quota increases before proceeding.
Quick alternative: trying in Docker#
As a quick alternative to installing SkyPilot on your laptop, we also provide a Docker image with the SkyPilot main branch automatically cloned. You can simply run:
# NOTE: '--platform linux/amd64' is needed for Apple silicon Macs
docker run --platform linux/amd64 \
-td --rm --name sky \
-v "$HOME/.sky:/root/.sky:rw" \
-v "$HOME/.aws:/root/.aws:rw" \
-v "$HOME/.config/gcloud:/root/.config/gcloud:rw" \
berkeleyskypilot/skypilot-nightly
docker exec -it sky /bin/bash
If your cloud CLIs are already set up, your credentials (AWS and GCP) will be mounted to the container and you can proceed to Quickstart. Otherwise, you can follow the instructions in Cloud account setup inside the container to set up your cloud accounts.
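With -v, Docker typically creates a missing host directory, so the command above can leave an empty ~/.aws or ~/.config/gcloud behind on a machine that has neither. The sketch below builds the same docker run invocation but mounts the credential directories only when they exist. docker_run_command is a hypothetical helper, and keeping ~/.sky mounted unconditionally is an assumption about why that mount is there.

```python
import os


def docker_run_command(home, image="berkeleyskypilot/skypilot-nightly"):
    """Build the docker run argument list, skipping absent credential dirs."""
    args = ["docker", "run", "--platform", "linux/amd64",
            "-td", "--rm", "--name", "sky"]
    # Always mount ~/.sky (assumed to hold SkyPilot state worth persisting).
    args += ["-v", "{}:/root/.sky:rw".format(os.path.join(home, ".sky"))]
    # Credential directories from the command above, mounted only if present.
    optional = [(".aws", "/root/.aws"),
                (os.path.join(".config", "gcloud"), "/root/.config/gcloud")]
    for rel, target in optional:
        host = os.path.join(home, rel)
        if os.path.isdir(host):
            args += ["-v", "{}:{}:rw".format(host, target)]
    args.append(image)
    return args
```

The returned list can be passed to subprocess.run(..., check=True), with os.path.expanduser("~") as the home argument.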
Once you are done experimenting with SkyPilot, remember to delete any clusters and storage resources you may have created using the following commands:
# Run inside the container:
sky down -a -y
sky storage delete -a -y
Finally, you can stop the container with:
docker stop sky
See more details about the dev container image berkeleyskypilot/skypilot-nightly here.
Enabling shell completion#
SkyPilot supports shell completion for Bash (version 4.4 and up), Zsh, and Fish. This is only available for click versions 8.0 and up (use pip install click==8.0.4 to install).
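The click requirement can be checked before enabling completion. This is a minimal sketch that compares only the major version, which is enough to express "8.0 and up". click_supports_completion is a hypothetical helper.

```python
def click_supports_completion(version):
    """Return True if a click version string such as '8.0.4' is 8.0 or newer."""
    major = int(version.split(".")[0])
    return major >= 8
```

On Python 3.8 or newer, the installed version string can be obtained with importlib.metadata.version("click").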
To enable shell completion after installing SkyPilot, you will need to modify your shell configuration.
SkyPilot automates this process using the --install-shell-completion option, which you should call using the appropriate shell name or auto:
sky --install-shell-completion auto
# sky --install-shell-completion zsh
# sky --install-shell-completion bash
# sky --install-shell-completion fish
Shell completion may perform poorly on certain shells and machines.
If you experience any issues after installation, you can use the --uninstall-shell-completion option to uninstall it, which you should similarly call using the appropriate shell name or auto:
sky --uninstall-shell-completion auto
# sky --uninstall-shell-completion zsh
# sky --uninstall-shell-completion bash
# sky --uninstall-shell-completion fish