Deploying SkyPilot API Server#

The SkyPilot API server is packaged as a Helm chart which deploys a Kubernetes ingress controller and the API server.

Tip

This guide is for admins to deploy the API server. If you are a user looking to connect to the API server, refer to Connecting to an API server.

Tip

Deploying the API server to a Kubernetes cluster using Helm provides the best reliability and enables additional features such as OAuth2 authentication and metrics. However, there are two alternative options available for special cases:

Deploying the API server on cloud VMs if you do not have a Kubernetes cluster and do not need the additional features.
Sharing an API server for multiple users on a single machine if you want to share the API server with multiple users on a single machine instead of exposing it to the public internet (e.g. a shared bastion machine).

Prerequisites#

A Kubernetes cluster with LoadBalancer or NodePort service support
Helm
kubectl

Tip

If you do not have a Kubernetes cluster, refer to Kubernetes Deployment Guides to set one up.
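
As a quick preflight, the sketch below reports which of the required command-line tools are missing from PATH. The helper name `missing_tools` is hypothetical; `htpasswd` is included only because Step 1 uses it to build the basic auth string.

```shell
# Print the name of every listed tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# helm and kubectl are prerequisites; htpasswd is used in Step 1.
missing=$(missing_tools helm kubectl htpasswd)
if [ -n "$missing" ]; then
  echo "Missing required tools:" $missing >&2
fi
```

If nothing is printed, all three tools were found.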

Step 1: Deploy the API server Helm chart#

Install the SkyPilot Helm chart with the following command:

# Ensure the helm repository is added and up to date
helm repo add skypilot https://helm.skypilot.co
helm repo update

# The following variables will be used throughout the guide
# NAMESPACE is the namespace to deploy the API server in
NAMESPACE=skypilot
# RELEASE_NAME is the name of the helm release, must be unique within the namespace
RELEASE_NAME=skypilot
# Set up basic username/password HTTP auth, or use OAuth2 proxy
WEB_USERNAME=skypilot
WEB_PASSWORD=yourpassword
AUTH_STRING=$(htpasswd -nb $WEB_USERNAME $WEB_PASSWORD)
# Deploy the API server
helm upgrade --install $RELEASE_NAME skypilot/skypilot-nightly --devel \
  --namespace $NAMESPACE \
  --create-namespace \
  --set ingress.authCredentials=$AUTH_STRING

The above command installs a SkyPilot API server and an ingress-nginx controller in the given namespace. By default, this ingress-nginx controller conflicts with other ingress-nginx installations in the same cluster. To deploy multiple API servers, refer to Reusing ingress controller for API server. To use a different ingress controller, refer to Use custom ingress controller.

Tip

The API server deployed will be configured to use the hosting Kubernetes cluster to launch tasks by default. Refer to Optional: Configure cloud accounts to configure credentials for more clouds and Kubernetes clusters.

After the API server is deployed, you can inspect the API server pod status with:

kubectl get pods --namespace $NAMESPACE -l app=${RELEASE_NAME}-api --watch

The pod should first show as initializing and then become running and ready. If it does not, refer to Helm deployment troubleshooting to diagnose the issue.
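
For scripted deployments, watching the pod interactively can be replaced with a blocking wait. The following is a minimal sketch that assumes the same `app=${RELEASE_NAME}-api` label as the command above. The function name, the 300s default timeout, and the `KUBECTL` override (useful for previewing the command) are illustrative assumptions, not part of the chart.

```shell
NAMESPACE=${NAMESPACE:-skypilot}
RELEASE_NAME=${RELEASE_NAME:-skypilot}

# Block until the API server pod reports Ready, or fail after the timeout.
# Set KUBECTL=echo to print the kubectl arguments instead of running them.
wait_for_api_server() {
  ${KUBECTL:-kubectl} wait pod \
    --namespace "$NAMESPACE" \
    --selector "app=${RELEASE_NAME}-api" \
    --for=condition=Ready \
    --timeout="${1:-300s}"
}
```

Calling `wait_for_api_server 120s` returns a non-zero status if the pod is not ready within two minutes, which makes it suitable for use in CI scripts.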

The API server above is deployed with basic auth provided by Nginx. To use advanced OAuth2 authentication, refer to Using OAuth for API server.

Step 2: Get the API server URL#

Once the API server is deployed, we can fetch the API server URL. The chart uses an nginx ingress to expose the API server and, by default, exposes the ingress to the internet using a LoadBalancer service. If you are using a Kubernetes cluster without LoadBalancer support, you can use the NodePort option below instead.

LoadBalancer (Default)

Fetch the ingress controller URL:

$ HOST=$(kubectl get svc ${RELEASE_NAME}-ingress-nginx-controller --namespace $NAMESPACE -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
$ ENDPOINT=http://${WEB_USERNAME}:${WEB_PASSWORD}@${HOST}
$ echo $ENDPOINT
http://skypilot:yourpassword@a.b.c.d

Tip

If you’re using a Kubernetes cluster without LoadBalancer support, you may get an empty IP address in the output above. In that case, use the NodePort option instead.
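
An empty IP does not always mean LoadBalancer is unsupported: some providers report a DNS hostname instead of an IP for the load balancer (AWS load balancers are a common example). A minimal sketch of a fallback, where the helper name `pick_lb_host` is hypothetical:

```shell
# Return the load balancer IP if it is non-empty, otherwise the hostname.
pick_lb_host() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  else
    printf '%s\n' "$2"
  fi
}

# Usage against a live cluster (needs kubectl and the variables from Step 1):
#   SVC=${RELEASE_NAME}-ingress-nginx-controller
#   LB_IP=$(kubectl get svc "$SVC" --namespace "$NAMESPACE" \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
#   LB_NAME=$(kubectl get svc "$SVC" --namespace "$NAMESPACE" \
#     -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
#   HOST=$(pick_lb_host "$LB_IP" "$LB_NAME")
```

If both fields are empty, `HOST` stays empty and the NodePort option is the remaining choice.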

Tip

For fine-grained control over the LoadBalancer service, refer to the helm values of ingress-nginx. Note that all values should be put under ingress-nginx. prefix since the ingress-nginx chart is installed as a subchart.

NodePort

Select two ports on your nodes that are not in use and allow network inbound traffic to them. 30050 and 30051 will be used in this example.
Upgrade the API server to use NodePort, and set the node ports to the selected ports:

$ helm upgrade --namespace $NAMESPACE $RELEASE_NAME skypilot/skypilot-nightly --devel \
  --reuse-values \
  --set ingress-nginx.controller.service.type=NodePort \
  --set ingress-nginx.controller.service.nodePorts.http=30050 \
  --set ingress-nginx.controller.service.nodePorts.https=30051

Fetch the ingress controller URL with:

$ NODE_PORT=$(kubectl get svc ${RELEASE_NAME}-ingress-controller-np --namespace $NAMESPACE -o jsonpath='{.spec.ports[?(@.name=="http")].nodePort}')
$ NODE_IP=$(kubectl get nodes -o jsonpath='{ $.items[0].status.addresses[?(@.type=="ExternalIP")].address }')
$ HOST=${NODE_IP}:${NODE_PORT}
$ ENDPOINT=http://${WEB_USERNAME}:${WEB_PASSWORD}@${HOST}
$ echo $ENDPOINT
http://skypilot:yourpassword@a.b.c.d:30050

Tip

You can also omit ingress-nginx.controller.service.nodePorts.http and ingress-nginx.controller.service.nodePorts.https to use random ports in the NodePort range (default 30000-32767). Make sure these ports are open on your nodes if you do so.
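
When choosing explicit ports, it can help to confirm they fall inside the default NodePort range before running the upgrade. The check below is a small sketch; the function name is hypothetical, and clusters configured with a custom service node port range would need different bounds.

```shell
# Succeed if the argument is an integer within the default NodePort range.
in_default_nodeport_range() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$1" -ge 30000 ] && [ "$1" -le 32767 ]
}

# Check the two ports used in this example.
for port in 30050 30051; do
  in_default_nodeport_range "$port" || echo "port $port is outside 30000-32767" >&2
done
```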

Tip

To avoid frequent IP address changes on nodes by your cloud provider, you can attach a static IP address to your nodes (instructions for GKE) and use it as the NODE_IP in the command above.

Step 3: Test the API server#

Test the API server by curling the health endpoint:

$ curl ${ENDPOINT}/api/health
{"status":"healthy","api_version":"1","commit":"ba7542c6dcd08484d83145d3e63ec9966d5909f3-dirty","version":"1.0.0-dev0"}

If all looks good, you can now start using the API server. Refer to Connecting to an API server to connect your local SkyPilot client to the API server.
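
Right after deployment, the load balancer or ingress may take a short while to start routing traffic, so a single curl can fail spuriously. The sketch below polls the same health endpoint until it reports `"status":"healthy"`. The function names, retry count, and delay are assumptions chosen for illustration.

```shell
# Succeed if a health response body reports status "healthy".
is_healthy() {
  printf '%s' "$1" | grep -q '"status"[[:space:]]*:[[:space:]]*"healthy"'
}

# Poll ${ENDPOINT}/api/health up to $1 times, sleeping $2 seconds between tries.
wait_until_healthy() {
  tries=${1:-30}
  delay=${2:-5}
  while [ "$tries" -gt 0 ]; do
    body=$(curl -fsS "${ENDPOINT}/api/health" 2>/dev/null) && is_healthy "$body" && return 0
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}
```

With `ENDPOINT` set as in Step 2, `wait_until_healthy 12 5` waits up to roughly one minute and returns non-zero if the server never reports healthy.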

Optional: Set up OAuth#

In addition to basic HTTP authentication, SkyPilot also supports using OAuth2 to securely authenticate users.

Refer to Setup OAuth for SkyPilot API Server for detailed instructions on common OAuth2 providers, such as Okta or Google Workspace.

Optional: Back the API server with a persistent database#

The API server can optionally be configured with a PostgreSQL database to persist state. It can be an externally managed database.

If a persistent DB is not specified, the API server uses a Kubernetes persistent volume to persist state.

Note

Database configuration must be set in the Helm deployment.

Optional: Setting the SkyPilot config#

To modify your SkyPilot config, use the SkyPilot dashboard: http://<api-server-url>/dashboard/config.

Optional: Set up GPU monitoring and metrics#

The SkyPilot dashboard can optionally be configured to expose GPU metrics and API server metrics.

API Server Metrics Dashboard

GPU Metrics Dashboard

To enable metrics, set apiService.metrics.enabled=true, prometheus.enabled=true and grafana.enabled=true in the Helm chart.

helm upgrade --install $RELEASE_NAME skypilot/skypilot-nightly --devel \
  --namespace $NAMESPACE \
  --reuse-values \
  --set apiService.metrics.enabled=true \
  --set prometheus.enabled=true \
  --set grafana.enabled=true

For detailed setup instructions (including how to set up external Prometheus and Grafana), see:

Upgrade the API server#

Refer to Upgrading SkyPilot API Server for how to upgrade the API server.

Uninstall#

To uninstall the API server, run:

helm uninstall $RELEASE_NAME --namespace $NAMESPACE --wait

This will delete the API server and all associated resources. --wait ensures that all the resources of SkyPilot API server are deleted before the command returns.

Other notes#

Fault tolerance and state persistence#

The SkyPilot API server is designed to be fault tolerant. If the API server pod is terminated, Kubernetes will automatically create a new pod to replace it.

To retain state during pod termination, we use a persistent volume claim. The persistent volume claim is backed by a PersistentVolume that is created by the Helm chart.

You can customize the storage settings using the following values by creating a values.yaml file:

storage:
  # Enable/disable persistent storage
  enabled: true
  # Storage class name - leave empty to use cluster default
  storageClassName: ""
  # Access modes - ReadWriteOnce or ReadWriteMany depending on storage class support
  accessMode: ReadWriteOnce
  # Storage size
  size: 10Gi
  # Optional selector for matching specific PVs
  selector: {}
    # matchLabels:
    #   environment: prod
  # Optional volume name for binding to specific PV
  volumeName: ""
  # Optional annotations
  annotations: {}

For example, to use a specific storage class and increase the storage size:

# values.yaml
storage:
  enabled: true
  storageClassName: "standard"
  size: 20Gi

Apply the configuration using:

helm upgrade --install $RELEASE_NAME skypilot/skypilot-nightly --devel \
  --namespace $NAMESPACE \
  -f values.yaml
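
After applying storage settings, one way to confirm that persistence is working is to check that the persistent volume claims in the namespace have reached the Bound phase. The helper below is a sketch: its name is hypothetical, and it checks every claim in the namespace rather than assuming a specific claim name.

```shell
# Read "name<TAB>phase" lines on stdin. Succeed only if at least one claim
# is listed and every listed claim is in the Bound phase.
all_pvcs_bound() {
  awk -F '\t' '
    NF { total++; if ($2 != "Bound") pending++ }
    END { exit !(total > 0 && pending == 0) }
  '
}

# Usage against a live cluster (needs kubectl and NAMESPACE from Step 1):
#   kubectl get pvc --namespace "$NAMESPACE" \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}' \
#     | all_pvcs_bound && echo "all claims bound"
```

A claim stuck in Pending usually points at a missing or misnamed storage class.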

Additional setup for EKS#

To support persistent storage for the API server’s state, we need a storage class that supports persistent volumes. If you already have a storage class that supports persistent volumes, you can skip the following steps.

We will use the Amazon EBS CSI driver to create a storage class that supports persistent volumes backed by Amazon EBS. You can also use other storage classes that support persistent volumes, such as EFS.

The steps below are based on the official documentation. Please follow the official documentation to adapt the steps to your cluster.

1. Make sure OIDC is enabled for your cluster. Follow the steps here.
2. You will need to create and bind an IAM role which has permissions to create EBS volumes. See instructions here.
3. Install the Amazon EBS CSI driver. The recommended method is through creating an EKS add-on.

Once the EBS CSI driver is installed, the default gp2 storage class will be backed by EBS volumes.

Setting an admin policy#

The Helm chart supports installing an admin policy before the API server starts.

To do so, set apiService.preDeployHook to the commands you want to run. For example, to install an admin policy, create a values.yaml file with the following:

# values.yaml
apiService:
  preDeployHook: |
    echo "Installing admin policy"
    pip install git+https://github.com/michaelvll/admin-policy-examples

  config: |
    admin_policy: admin_policy_examples.AddLabelsPolicy

Then apply the values.yaml file using the -f flag when running the helm upgrade command:

helm upgrade --install $RELEASE_NAME skypilot/skypilot-nightly --devel \
  --namespace $NAMESPACE \
  -f values.yaml

Setting minimum permissions in helm deployment#

In a Helm deployment, a set of default permissions is granted to the API server to access the hosting Kubernetes cluster. You can customize the permissions in the following cases:

Reduce the RBAC permissions by using kubernetes.remote_identity: by default, the API server creates a service account and RBAC roles to grant permissions to SkyPilot task Pods. This in turn requires the API server to have permissions to manipulate RBAC roles and service accounts. You can disable this by the following steps:
1. Refer to Setting the SkyPilot config to set kubernetes.remote_identity to the service account of API server, which already has the necessary permissions:
```
# TODO: replace ${RELEASE_NAME} with the actual release name in deployment step
kubernetes:
  remote_identity: ${RELEASE_NAME}-api-sa
```
  Note
  
  If you also grant external Kubernetes cluster permissions to the API server via kubernetesCredentials.useKubeconfig, the same service account with enough permissions must be prepared in these Kubernetes clusters manually.
2. Set rbac.manageRbacPolicies=false in the Helm values to disable the RBAC policies:
```
helm upgrade --install $RELEASE_NAME skypilot/skypilot-nightly --devel \
  --namespace $NAMESPACE \
  --reuse-values \
  --set rbac.manageRbacPolicies=false
```
If your use case does not require object storage mounting, you can disable the permissions to manage SkyPilot system components by setting rbac.manageSystemComponents=false:
```
helm upgrade --install $RELEASE_NAME skypilot/skypilot-nightly --devel \
  --namespace $NAMESPACE \
  --reuse-values \
  --set rbac.manageSystemComponents=false
```

If you want to use an existing service account and permissions that meet the minimum permissions required for SkyPilot instead of the one managed by Helm, you can disable the creation of RBAC policies and specify the service account name to use:

helm upgrade --install $RELEASE_NAME skypilot/skypilot-nightly --devel \
  --namespace $NAMESPACE \
  --reuse-values \
  --set rbac.create=false \
  --set rbac.serviceAccountName=my-existing-service-account
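
After tightening permissions, `kubectl auth can-i` with impersonation offers a way to spot-check what the API server's service account is still allowed to do. The sketch below assumes the `${RELEASE_NAME}-api-sa` service account name mentioned above and that your own user is allowed to impersonate service accounts. The function names and the `KUBECTL` override are illustrative.

```shell
NAMESPACE=${NAMESPACE:-skypilot}
RELEASE_NAME=${RELEASE_NAME:-skypilot}

# Build the impersonation subject for a service account.
sa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Ask whether the API server's service account may perform "<verb> <resource>".
# Set KUBECTL=echo to print the kubectl arguments instead of running them.
can_i() {
  ${KUBECTL:-kubectl} auth can-i "$@" \
    --namespace "$NAMESPACE" \
    --as "$(sa_subject "$NAMESPACE" "${RELEASE_NAME}-api-sa")"
}
```

For example, `can_i create pods` should answer `yes`, while `can_i create roles` is expected to answer `no` once RBAC policy management has been disabled. Which verbs and resources are worth checking depends on your setup.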

Reusing ingress controller for API server#

By default, the SkyPilot helm chart will deploy a new ingress-nginx controller when installing the API server. However, the ingress-nginx controller has some cluster-scope resources that will cause conflicts between multiple installations by default. It is recommended to reuse an existing ingress controller if you want to deploy multiple API servers in the same Kubernetes cluster.

To reuse an existing ingress controller, you can set ingress-nginx.enabled to false and set ingress.path to a unique path for the deploying API server. For example:

# The first API server, with the ingress-nginx controller deployed
# It is assumed that the first API server is already deployed. If it is not deployed yet,
# add necessary values instead of specifying --reuse-values
helm upgrade --install $RELEASE_NAME skypilot/skypilot-nightly --devel \
    --namespace $NAMESPACE \
    --reuse-values \
    --set ingress.path=/first-server

# The second API server, reusing the existing ingress controller and using a different path
ANOTHER_RELEASE_NAME=skypilot2
ANOTHER_NAMESPACE=skypilot2
# Replace with your username and password to configure the basic auth credentials for the second API server
ANOTHER_WEB_USERNAME=skypilot
ANOTHER_WEB_PASSWORD=yourpassword2
ANOTHER_AUTH_STRING=$(htpasswd -nb $ANOTHER_WEB_USERNAME $ANOTHER_WEB_PASSWORD)
# Deploy the API server, either in the same namespace or a different namespace
helm upgrade --install $ANOTHER_RELEASE_NAME skypilot/skypilot-nightly --devel \
    --namespace $ANOTHER_NAMESPACE \
    --set ingress-nginx.enabled=false \
    --set ingress.path=/second-server \
    --set ingress.authCredentials=$ANOTHER_AUTH_STRING

With the above commands, the two API servers share the same ingress controller and serve under different paths of the same host. To get the endpoints, follow Step 2: Get the API server URL to get the host from the helm release that has the ingress-nginx controller deployed, and then append the basic auth and path to the host:

# HOST was the ingress host we got from Step 2
$ FIRST_PATH=$(kubectl get ingress ${RELEASE_NAME}-ingress --namespace $NAMESPACE -o jsonpath='{.metadata.annotations.skypilot\.co\/ingress-path}')
$ FIRST_ENDPOINT=http://${WEB_USERNAME}:${WEB_PASSWORD}@${HOST}${FIRST_PATH}
$ SECOND_PATH=$(kubectl get ingress ${ANOTHER_RELEASE_NAME}-ingress --namespace $ANOTHER_NAMESPACE -o jsonpath='{.metadata.annotations.skypilot\.co\/ingress-path}')
$ SECOND_ENDPOINT=http://${ANOTHER_WEB_USERNAME}:${ANOTHER_WEB_PASSWORD}@${HOST}${SECOND_PATH}
$ echo $FIRST_ENDPOINT
http://skypilot:yourpassword@a.b.c.d/first-server
$ echo $SECOND_ENDPOINT
http://skypilot:yourpassword2@a.b.c.d/second-server
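
The endpoint composition above can be wrapped in a small helper so that each server's URL is built the same way. This is a sketch with a hypothetical function name. It normalizes the path to have a leading slash and no trailing slash, and it does not percent-encode credentials, so passwords containing characters such as `@` or `/` would need encoding first.

```shell
# Compose http://user:password@host[/path] from its parts.
build_endpoint() {
  user=$1
  pass=$2
  host=$3
  path=${4:-}
  path=${path%/}                 # drop a single trailing slash, if any
  case $path in
    ''|/*) ;;                    # empty or already starts with a slash
    *) path="/$path" ;;          # add the missing leading slash
  esac
  printf 'http://%s:%s@%s%s\n' "$user" "$pass" "$host" "$path"
}
```

For example, `build_endpoint "$WEB_USERNAME" "$WEB_PASSWORD" "$HOST" "$FIRST_PATH"` yields the same value as `FIRST_ENDPOINT` above.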

The same approach also applies when you have an ingress-nginx controller deployed before installing the SkyPilot API server:

# The first API server, disabling the ingress-nginx controller to reuse the existing one
helm upgrade --install $RELEASE_NAME skypilot/skypilot-nightly --devel \
    --namespace $NAMESPACE \
    --set ingress-nginx.enabled=false \
    --set ingress.path=/skypilot

It is a good practice to specify a unique ingress.path too in this case, to avoid conflicts with other backends hosted on the same ingress controller.

Use custom ingress controller#

By default, the SkyPilot helm chart will deploy a new ingress-nginx controller when installing the API server. However, you can use a custom ingress controller by disabling the creation of the ingress-nginx controller and setting ingress.ingressClassName to the ingress class name of your controller. In addition, most ingress controllers support customizing behavior through annotations on the ingress resource. You can set ingress.annotations in the helm values to pass annotations to the ingress resource. Here is an example of using a custom ingress controller:

helm upgrade --install $RELEASE_NAME skypilot/skypilot-nightly --devel \
    --namespace $NAMESPACE \
    --reuse-values \
    --set ingress-nginx.enabled=false \
    --set ingress.ingressClassName=custom-ingress-class \
    --set ingress.annotations.custom-ingress-annotation=custom-ingress-annotation-value
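
Real annotation keys usually contain dots (for example a domain-style prefix), and Helm's `--set` syntax treats unescaped dots as nesting separators, so they must be escaped with a backslash. The helper below is a minimal sketch of that escaping; the function name and the `example.com/timeout` annotation are hypothetical.

```shell
# Escape dots so that --set treats the annotation key as a single map key.
escape_helm_key() {
  printf '%s\n' "$1" | sed 's/\./\\./g'
}

# Usage (annotation key and value are placeholders):
#   helm upgrade --install $RELEASE_NAME skypilot/skypilot-nightly --devel \
#     --namespace $NAMESPACE \
#     --reuse-values \
#     --set "ingress.annotations.$(escape_helm_key example.com/timeout)=60"
```

For more than a couple of annotations, putting them under ingress.annotations in a values.yaml file avoids the escaping altogether.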

Note

Basic auth on ingress is only supported when using ingress-nginx controller. Consider using OAuth2 to protect your API server instead.

Alternative: Deploy on cloud VMs#

Note

VM deployment does not support failover or graceful upgrades. We recommend using the Helm deployment (Deploying SkyPilot API Server) in production environments.

You can also deploy the API server directly on cloud VMs using an existing SkyPilot installation.

Step 1: Use SkyPilot to deploy the API server on a cloud VM#

Write the SkyPilot API server YAML file and use sky launch to deploy the API server:

# Write the YAML to a file
cat <<EOF > skypilot-api-server.yaml
resources:
  cpus: 8+
  memory: 16+
  ports: 46580
  image_id: docker:berkeleyskypilot/skypilot-nightly:latest

run: |
  sky api start --deploy
EOF

# Deploy the API server
sky launch -c api-server skypilot-api-server.yaml

Step 2: Get the API server URL#

Once the API server is deployed, you can fetch the API server URL with:

$ ENDPOINT=$(sky status --endpoint 46580 api-server)
$ echo $ENDPOINT
http://a.b.c.d:46580

Test the API server by curling the health endpoint:

$ curl ${ENDPOINT}/health
SkyPilot API Server: Healthy

If all looks good, you can now start using the API server. Refer to Connecting to an API server to connect your local SkyPilot client to the API server.

Note

API server deployment using the above YAML does not have any authentication by default. We recommend adding an authentication layer (e.g., nginx reverse proxy) or using the SkyPilot helm chart on a Kubernetes cluster for a more secure deployment.

Tip

If you are installing the SkyPilot API client in the same environment, we recommend using a different Python environment (venv, conda, etc.) to avoid conflicts with the SkyPilot installation used to deploy the API server.
