Upgrading SkyPilot API Server#
This page describes the steps to follow when upgrading a remote SkyPilot API server.
Upgrade API server deployed with Helm#
With a Helm deployment, the SkyPilot API server can be upgraded gracefully, without causing client-side errors, by following the steps below.
Step 1: Prepare an upgrade#
Find the version to use in SkyPilot nightly build.
Update the SkyPilot Helm repository to the latest version:
helm repo update skypilot
Prepare versioning environment variables.
NAMESPACE and RELEASE_NAME should be set to the currently installed namespace and release:
NAMESPACE=skypilot # TODO: change to your installed namespace
RELEASE_NAME=skypilot # TODO: change to your installed release name
VERSION=1.0.0-dev20250410 # TODO: change to the version you want to upgrade to
IMAGE_REPO=berkeleyskypilot/skypilot-nightly
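Before running the upgrade, it can help to confirm that all four variables are set in the current shell. The following is a minimal sketch; require_vars is a hypothetical helper, not part of SkyPilot or Helm:

```shell
# Hypothetical helper (not part of SkyPilot or Helm): report every named
# variable that is unset or empty, and return non-zero if any is missing.
require_vars() {
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage:
#   require_vars NAMESPACE RELEASE_NAME VERSION IMAGE_REPO || echo "set the variables first"
```

The helper only checks that the variables are non-empty; it does not verify that the namespace, release, or version actually exist.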
Step 2: Upgrade the API server and clients#
Upgrade the clients:
pip install -U skypilot-nightly==${VERSION}
Upgrade the API server:
# --reuse-values is critical to keep the values set in the previous installation steps.
helm upgrade -n $NAMESPACE $RELEASE_NAME skypilot/skypilot-nightly --devel --reuse-values \
--set apiService.image=${IMAGE_REPO}:${VERSION}
While the API server is being upgraded, the SkyPilot CLI and Python SDK automatically retry requests until the new version of the API server has started, so the upgrade is graceful as long as the new version does not break API compatibility. For more details, refer to Graceful upgrade.
Optionally, you can watch the upgrade progress with:
$ kubectl get pod -n $NAMESPACE -l app=${RELEASE_NAME}-api --watch
NAME READY STATUS RESTARTS AGE
skypilot-demo-api-server-cf4896bdf-62c96 0/1 Init:0/2 0 7s
skypilot-demo-api-server-cf4896bdf-62c96 0/1 Init:1/2 0 24s
skypilot-demo-api-server-cf4896bdf-62c96 0/1 PodInitializing 0 26s
skypilot-demo-api-server-cf4896bdf-62c96 0/1 Running 0 27s
skypilot-demo-api-server-cf4896bdf-62c96 1/1 Running 0 50s
The upgraded API server is ready to serve requests once the pod is running and the READY column shows 1/1.
Note
apiService.config will be IGNORED during an upgrade. To update your SkyPilot config, see here.
Step 3: Verify the upgrade#
Verify that the API server can serve requests and that its version matches the version you upgraded to:
$ sky api info
Using SkyPilot API server: <ENDPOINT>
├── Status: healthy, commit: 022a5c3ffe258f365764b03cb20fac70934f5a60, version: 1.0.0.dev20250410
└── User: aclice (abcd1234)
If possible, also trigger the pipelines that depend on the API server to verify that the upgrade introduced no compatibility issues.
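The version check can also be scripted. The sketch below assumes that sky api info prints plain text to stdout in the format shown above, and that the reported version uses .dev where the image tag uses -dev (as in the 1.0.0-dev20250410 example). server_version and normalize_version are hypothetical helper names:

```shell
# Hypothetical helpers for comparing the reported server version to $VERSION.

# Read `sky api info` output on stdin and print the value after "version: ".
server_version() {
  sed -n 's/.*version: \([^ ,]*\).*/\1/p' | head -n 1
}

# Rewrite "-dev" as ".dev" so 1.0.0-dev20250410 compares equal to
# 1.0.0.dev20250410 (assumed to be the only difference between the two forms).
normalize_version() {
  printf '%s\n' "$1" | sed 's/-dev/.dev/'
}

# Example usage (requires a reachable API server):
#   [ "$(sky api info | server_version)" = "$(normalize_version "$VERSION")" ] \
#     && echo "version matches" || echo "version mismatch"
```

If the output format differs in your installation, adjust the sed pattern accordingly.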
Upgrade the API server deployed on VM#
Note
VM deployment does not offer graceful upgrade. We recommend the Helm deployment described in Deploying SkyPilot API Server for production environments. The following is a workaround for upgrading the SkyPilot API server in VM deployments.
Suppose the cluster name of the API server is api-server (the name used in the Alternative: Deploy on cloud VMs guide). You can then upgrade the API server with the following steps:
Get the version to upgrade to from SkyPilot nightly build.
Switch to the original API server endpoint used to launch the cloud VM for the API server. This is usually the local API server that was started when you ran sky launch -c api-server skypilot-api-server.yaml in the Alternative: Deploy on cloud VMs guide:
# Replace http://localhost:46580 with the real API server endpoint if you were not using the local API server to launch the API server VM instance.
sky api login -e http://localhost:46580
Check that the API server VM instance is UP:
$ sky status api-server
Clusters
NAME LAUNCHED RESOURCES STATUS AUTOSTOP COMMAND
api-server 41 mins ago 1x AWS(c6i.2xlarge, image_id={'us-east-1': 'docker:berkeleyskypilot/sk... UP - sky exec api-server pip i...
Upgrade the clients:
pip install -U skypilot-nightly==${VERSION}
Note
After upgrading the clients, they should not be used until the API server is upgraded to the new version.
Upgrade SkyPilot on the VM and restart the API server:
Note
Upgrading and restarting the API server will interrupt all pending and running requests.
sky exec api-server "pip install -U skypilot-nightly[all] && sky api stop && sky api start --deploy"
# Alternatively, you can also upgrade to a specific version with:
sky exec api-server "pip install -U skypilot-nightly[all]==${VERSION} && sky api stop && sky api start --deploy"
Switch back to the remote API server:
ENDPOINT=$(sky status --endpoint api-server)
sky api login -e $ENDPOINT
Verify that the API server is running and that its version matches the version you upgraded to:
$ sky api info
Using SkyPilot API server: <ENDPOINT>
├── Status: healthy, commit: 022a5c3ffe258f365764b03cb20fac70934f5a60, version: 1.0.0.dev20250410
└── User: aclice (abcd1234)
Graceful upgrade#
A server can be gracefully upgraded when the following conditions are met:
Helm deployment is used;
The versions before and after the upgrade are compatible.
Behavior when the API server is being upgraded:
For critical ongoing requests (e.g., launching a cluster), the server waits for them to finish, up to a timeout.
For non-critical ongoing requests (e.g., log tailing), the server cancels them and returns an error asking the client to retry.
For new requests, the server returns an error asking the client to retry.
The SkyPilot Python SDK and CLI automatically retry until the new version of the API server starts, and ongoing requests (e.g., log tailing) resume automatically.
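The client-side behavior can be pictured as a simple retry loop. The sketch below only illustrates the idea and is not the SDK's actual implementation; retry_until_ok, its arguments, and my_probe are hypothetical:

```shell
# Illustration only (not SkyPilot's implementation): run a command repeatedly
# until it succeeds, giving up after a fixed number of attempts.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = the command to run
retry_until_ok() {
  max_attempts=$1
  delay_seconds=$2
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1  # every attempt failed
    fi
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
  return 0
}

# Example usage, with my_probe standing in for any command that fails
# while the server is restarting:
#   retry_until_ok 30 2 my_probe
```

A fixed delay keeps the sketch short; a real client might back off or distinguish retryable errors from permanent ones.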

To ensure that all the regular critical requests can complete within the timeout, you can adjust the timeout by setting apiService.terminationGracePeriodSeconds in helm values based on your workload, e.g.:
helm upgrade -n $NAMESPACE $RELEASE_NAME skypilot/skypilot-nightly --devel --reuse-values \
--set apiService.terminationGracePeriodSeconds=300
API compatibility#
SkyPilot maintains an internal API version, which is bumped whenever an incompatible API change is introduced. A client and a server can only communicate when they run the same API version.
SkyPilot's versioning strategy provides the following API compatibility guarantees:
The API version will not be bumped within a minor version, i.e., upgrading to a new patch version is guaranteed to be compatible;
The API version might be bumped between minor versions, i.e., upgrading to a new minor version should be treated as an operation that breaks API compatibility;
There is no guarantee about the API version in nightly builds.
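These guarantees can be encoded as a small pre-upgrade check. The sketch below assumes MAJOR.MINOR.PATCH version strings and that nightly builds contain dev in the version (as in 1.0.0-dev20250410); upgrade_risk and its output labels are hypothetical:

```shell
# Hypothetical helper: classify an upgrade from version $1 to version $2
# according to the compatibility guarantees listed above.
upgrade_risk() {
  old=$1
  new=$2
  # Nightly builds (assumed to contain "dev") carry no API version guarantee.
  case "$old $new" in
    *dev*) echo "no-guarantee"; return 0 ;;
  esac
  # Compare the MAJOR.MINOR prefix of both versions.
  old_minor=$(printf '%s' "$old" | cut -d. -f1,2)
  new_minor=$(printf '%s' "$new" | cut -d. -f1,2)
  if [ "$old_minor" = "$new_minor" ]; then
    echo "compatible"          # patch upgrade within the same minor version
  else
    echo "possibly-breaking"   # minor version changed; treat as breaking
  fi
}

# Example usage:
#   upgrade_risk 0.9.1 0.9.3
```

A result other than compatible suggests upgrading clients and server together and verifying dependent pipelines afterwards.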