Updating a Service#
SkyServe supports updating a deployed service, which can be used to change:
Replica code (e.g.,
run
/setup
; useful for debugging)Replica resource spec in
resources
(e.g., accelerator or instance type)Service spec in
service
(e.g., number of replicas or autoscaling spec)
During an update, the service will remain accessible with no downtime and its endpoint will remain the same. By default, rolling update is applied, while you can also specify a blue-green update.
Rolling Update#
To update an existing service, use sky serve update
:
$ sky serve update service-name new_service.yaml
SkyServe will launch new replicas described by new_service.yaml
with the following behavior:
An update is initiated, and traffic will continue to be redirected to existing (old) replicas.
New replicas (with new settings) are brought up in the background.
Whenever the total number of old and new replicas exceeds the expected number of replicas (based on autoscaler’s decision), extra old replicas will be scaled down.
Traffic will be redirected to both old and new replicas until all new replicas are ready.
Hint
When only the service
field is updated and no workdir
or file_mounts
is specified in the service task, SkyServe will reuse the old replicas
by applying the new service spec and bumping its version (See sky serve status
for the versions). This will significantly reduce the time to
update the service and avoid potential quota issues.
Example#
We first launch a simple HTTP service:
$ sky serve up examples/serve/http_server/task.yaml -n http-server
We can use sky serve status http-server
to check the status of the service:
$ sky serve status http-server
Services
NAME VERSION UPTIME STATUS REPLICAS ENDPOINT
http-server 1 1m 41s READY 2/2 44.206.240.249:30002
Service Replicas
SERVICE_NAME ID VERSION IP LAUNCHED RESOURCES STATUS REGION
http-server 1 1 54.173.203.169 2 mins ago 1x AWS(vCPU=2) READY us-east-1
http-server 2 1 52.87.241.103 2 mins ago 1x AWS(vCPU=2) READY us-east-1
Service http-server
has an initial version of 1.
Suppose we want to update the service to have 3 replicas instead of 2. We can update
the task yaml examples/serve/http_server/task.yaml
, by changing the replicas
field:
# examples/serve/http_server/task.yaml
service:
readiness_probe:
path: /health
initial_delay_seconds: 20
replicas: 3
resources:
ports: 8081
cpus: 2+
workdir: examples/serve/http_server
run: python3 server.py
We can then use sky serve update
to update the service:
$ sky serve update http-server examples/serve/http_server/task.yaml
SkyServe will trigger launching three new replicas.
$ sky serve status http-server
Services
NAME VERSION UPTIME STATUS REPLICAS ENDPOINT
http-server 2 6m 15s READY 2/5 44.206.240.249:30002
Service Replicas
SERVICE_NAME ID VERSION IP LAUNCHED RESOURCES STATUS REGION
http-server 1 1 54.173.203.169 6 mins ago 1x AWS(vCPU=2) READY us-east-1
http-server 2 1 52.87.241.103 6 mins ago 1x AWS(vCPU=2) READY us-east-1
http-server 3 2 - 21 secs ago 1x AWS(vCPU=2) PROVISIONING us-east-1
http-server 4 2 - 21 secs ago 1x AWS(vCPU=2) PROVISIONING us-east-1
http-server 5 2 - 21 secs ago 1x AWS(vCPU=2) PROVISIONING us-east-1
Whenever a new replica is ready, the traffic will be redirected to both old and new replicas.
$ sky serve status http-server
Services
NAME VERSION UPTIME STATUS REPLICAS ENDPOINT
http-server 1,2 10m 4s READY 3/5 44.206.240.249:30002
Service Replicas
SERVICE_NAME ID VERSION IP LAUNCHED RESOURCES STATUS REGION
http-server 1 1 54.173.203.169 10 mins ago 1x AWS(vCPU=2) READY us-east-1
http-server 2 1 52.87.241.103 10 mins ago 1x AWS(vCPU=2) READY us-east-1
http-server 3 2 3.93.241.163 1 min ago 1x AWS(vCPU=2) READY us-east-1
http-server 4 2 - 1 min ago 1x AWS(vCPU=2) PROVISIONING us-east-1
http-server 5 2 - 1 min ago 1x AWS(vCPU=2) PROVISIONING us-east-1
Once the total number of both old and new replicas exceeds the requested number, old replicas will be scaled down.
$ sky serve status http-server
Services
NAME VERSION UPTIME STATUS REPLICAS ENDPOINT
http-server 1,2 10m 4s READY 3/5 44.206.240.249:30002
Service Replicas
SERVICE_NAME ID VERSION IP LAUNCHED RESOURCES STATUS REGION
http-server 1 1 54.173.203.169 10 mins ago 1x AWS(vCPU=2) SHUTTING_DOWN us-east-1
http-server 2 1 52.87.241.103 10 mins ago 1x AWS(vCPU=2) READY us-east-1
http-server 3 2 3.93.241.163 1 min ago 1x AWS(vCPU=2) READY us-east-1
http-server 4 2 18.206.226.82 1 min ago 1x AWS(vCPU=2) READY us-east-1
http-server 5 2 - 1 min ago 1x AWS(vCPU=2) PROVISIONING us-east-1
Eventually, we will only have new replicas ready to serve user requests.
$ sky serve status http-server
Services
NAME VERSION UPTIME STATUS REPLICAS ENDPOINT
http-server 2 11m 42s READY 3/3 44.206.240.249:30002
Service Replicas
SERVICE_NAME ID VERSION IP LAUNCHED RESOURCES STATUS REGION
http-server 3 2 3.93.241.163 3 mins ago 1x AWS(vCPU=2) READY us-east-1
http-server 4 2 18.206.226.82 3 mins ago 1x AWS(vCPU=2) READY us-east-1
http-server 5 2 3.26.232.31 1 min ago 1x AWS(vCPU=2) READY us-east-1
Blue-Green Update#
SkyServe also supports blue-green updates, by the following command:
$ sky serve update --mode blue_green service-name new_service.yaml
In this update mode, SkyServe will launch new replicas described by new_service.yaml
with the following behavior:
An update is initiated, and traffic will continue to be redirected to existing (old) replicas.
New replicas (with new settings) are brought up in the background.
Traffic will be redirected to new replicas only when all new replicas are ready.
Old replicas are scaled down after all new replicas are ready.
During an update, traffic is entirely serviced by either old-versioned or
new-versioned replicas. sky serve status
shows the latest service
version and each replica’s version.
Example#
We use the same service http-server
as an example. We can then use sky serve update --mode blue_green
to update the service:
$ sky serve update http-server --mode blue_green examples/serve/http_server/task.yaml
SkyServe will trigger launching three new replicas.
$ sky serve status http-server
Services
NAME VERSION UPTIME STATUS REPLICAS ENDPOINT
http-server 2 6m 15s READY 2/5 44.206.240.249:30002
Service Replicas
SERVICE_NAME ID VERSION IP LAUNCHED RESOURCES STATUS REGION
http-server 1 1 54.173.203.169 6 mins ago 1x AWS(vCPU=2) READY us-east-1
http-server 2 1 52.87.241.103 6 mins ago 1x AWS(vCPU=2) READY us-east-1
http-server 3 2 - 21 secs ago 1x AWS(vCPU=2) PROVISIONING us-east-1
http-server 4 2 - 21 secs ago 1x AWS(vCPU=2) PROVISIONING us-east-1
http-server 5 2 - 21 secs ago 1x AWS(vCPU=2) PROVISIONING us-east-1
When a new replica is ready, the traffic will still be redirected to old replicas.
$ sky serve status http-server
Services
NAME VERSION UPTIME STATUS REPLICAS ENDPOINT
http-server 1 10m 4s READY 3/5 44.206.240.249:30002
Service Replicas
SERVICE_NAME ID VERSION IP LAUNCHED RESOURCES STATUS REGION
http-server 1 1 54.173.203.169 10 mins ago 1x AWS(vCPU=2) READY us-east-1
http-server 2 1 52.87.241.103 10 mins ago 1x AWS(vCPU=2) READY us-east-1
http-server 3 2 3.93.241.163 1 min ago 1x AWS(vCPU=4) READY us-east-1
http-server 4 2 - 1 min ago 1x AWS(vCPU=4) PROVISIONING us-east-1
http-server 5 2 - 1 min ago 1x AWS(vCPU=4) PROVISIONING us-east-1
Once the total number of new replicas satisfies the requirements, traffics will be redirected to new replicas and old replicas will be scaled down.
$ sky serve status http-server
Services
NAME VERSION UPTIME STATUS REPLICAS ENDPOINT
http-server 2 10m 4s READY 3/5 44.206.240.249:30002
Service Replicas
SERVICE_NAME ID VERSION IP LAUNCHED RESOURCES STATUS REGION
http-server 1 1 54.173.203.169 10 mins ago 1x AWS(vCPU=2) SHUTTING_DOWN us-east-1
http-server 2 1 52.87.241.103 10 mins ago 1x AWS(vCPU=2) SHUTTING_DOWN us-east-1
http-server 3 2 3.93.241.163 1 min ago 1x AWS(vCPU=4) READY us-east-1
http-server 4 2 18.206.226.82 1 min ago 1x AWS(vCPU=4) READY us-east-1
http-server 5 2 3.26.232.31 1 min ago 1x AWS(vCPU=4) READY us-east-1
Eventually, same as the rolling update, we will only have new replicas ready to serve user requests.
$ sky serve status http-server
Services
NAME VERSION UPTIME STATUS REPLICAS ENDPOINT
http-server 2 11m 42s READY 3/3 44.206.240.249:30002
Service Replicas
SERVICE_NAME ID VERSION IP LAUNCHED RESOURCES STATUS REGION
http-server 3 2 3.93.241.163 3 mins ago 1x AWS(vCPU=4) READY us-east-1
http-server 4 2 18.206.226.82 3 mins ago 1x AWS(vCPU=4) READY us-east-1
http-server 5 2 3.26.232.31 1 min ago 1x AWS(vCPU=4) READY us-east-1