Updating a Service

SkyServe supports updating a deployed service, which can be used to change:

  • Replica code (e.g., run/setup; useful for debugging)

  • Replica resource spec in resources (e.g., accelerator or instance type)

  • Service spec in service (e.g., number of replicas or autoscaling spec)

During an update, the service remains accessible with no downtime and its endpoint stays the same. A rolling update is applied by default; you can request a blue-green update instead.
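As a rough illustration of how the two modes differ on the command line, the sketch below builds the argument list for sky serve update. The helper name update_command is hypothetical; the only CLI pieces it uses are the command and the --mode blue_green flag shown on this page, and it only builds the command rather than running it.

```python
import shlex


def update_command(service_name, task_yaml, blue_green=False):
    """Build the argv for `sky serve update` (hypothetical helper).

    Rolling update is the default, so no flag is added for it.
    """
    argv = ["sky", "serve", "update"]
    if blue_green:
        argv += ["--mode", "blue_green"]
    argv += [service_name, task_yaml]
    return argv


# Print the two command lines for a placeholder service and YAML file.
print(shlex.join(update_command("service-name", "new_service.yaml")))
print(shlex.join(update_command("service-name", "new_service.yaml", blue_green=True)))
```

The returned list can be passed to subprocess.run if you want to trigger the update from a script.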

Rolling update

To update an existing service, use sky serve update:

$ sky serve update service-name new_service.yaml

SkyServe will launch new replicas described by new_service.yaml with the following behavior:

  • An update is initiated, and traffic will continue to be redirected to existing (old) replicas.

  • New replicas (with new settings) are brought up in the background.

  • Whenever the total number of old and new replicas exceeds the expected number of replicas (based on autoscaler’s decision), extra old replicas will be scaled down.

  • Traffic will be redirected to both old and new replicas until all new replicas are ready.
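The behavior above can be modeled with a small sketch. This is not SkyServe's implementation: the Replica class and both function names are hypothetical, and the sketch assumes that only READY replicas count toward the expected number, which this page does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    replica_id: int
    version: int
    ready: bool


def rolling_traffic_targets(replicas):
    """During a rolling update, every READY replica (old or new) gets traffic."""
    return [r.replica_id for r in replicas if r.ready]


def old_replicas_to_scale_down(replicas, new_version, target):
    """Pick surplus old replicas once READY replicas exceed the target.

    Assumption: only READY replicas are counted against the target.
    """
    ready_old = [r for r in replicas if r.ready and r.version < new_version]
    ready_new = [r for r in replicas if r.ready and r.version == new_version]
    surplus = max(0, len(ready_old) + len(ready_new) - target)
    return [r.replica_id for r in ready_old[:surplus]]


# Two old replicas, two new replicas ready, one new replica still provisioning.
replicas = [
    Replica(1, 1, True),
    Replica(2, 1, True),
    Replica(3, 2, True),
    Replica(4, 2, True),
    Replica(5, 2, False),
]
print(rolling_traffic_targets(replicas))                            # [1, 2, 3, 4]
print(old_replicas_to_scale_down(replicas, new_version=2, target=3))  # [1]
```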

Hint

When only the service field is updated and no workdir or file_mounts is specified in the service task, SkyServe reuses the old replicas: it applies the new service spec and bumps the version (see sky serve status for the versions). This significantly reduces the time needed to update the service and avoids potential quota issues.
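The reuse condition in this hint can be written as a simple check over two task specs loaded as dictionaries. This is a sketch under that assumption; the function name can_reuse_replicas is hypothetical and SkyServe's actual check may differ.

```python
def can_reuse_replicas(old_task, new_task):
    """Return True if, per the hint, old replicas could be reused.

    Both arguments are task specs as plain dicts (e.g. parsed from YAML).
    """
    def without_service(task):
        return {key: value for key, value in task.items() if key != "service"}

    # Anything outside the `service` field must be unchanged.
    if without_service(old_task) != without_service(new_task):
        return False
    # The task must not specify workdir or file_mounts.
    return "workdir" not in new_task and "file_mounts" not in new_task


old = {"service": {"replicas": 2}, "resources": {"ports": 8081, "cpus": "2+"},
       "run": "python3 server.py"}
new = {"service": {"replicas": 3}, "resources": {"ports": 8081, "cpus": "2+"},
       "run": "python3 server.py"}
print(can_reuse_replicas(old, new))  # True

# Adding a workdir to the task rules out reuse.
new_with_workdir = dict(new, workdir="examples/serve/http_server")
old_with_workdir = dict(old, workdir="examples/serve/http_server")
print(can_reuse_replicas(old_with_workdir, new_with_workdir))  # False
```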

Example

We first launch a simple HTTP service:

$ sky serve up examples/serve/http_server/task.yaml -n http-server

We can use sky serve status http-server to check the status of the service:

$ sky serve status http-server

Services
NAME         VERSION  UPTIME  STATUS  REPLICAS  ENDPOINT
http-server  1        1m 41s  READY   2/2       44.206.240.249:30002

Service Replicas
SERVICE_NAME  ID  VERSION  ENDPOINT                    LAUNCHED    INFRA                RESOURCES                                 STATUS
http-server   1   1        http://54.173.203.169:8081  2 mins ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   2   1        http://52.87.241.103:8081   2 mins ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY

Service http-server has an initial version of 1.

Suppose we want to update the service to have 3 replicas instead of 2. We edit the task YAML, examples/serve/http_server/task.yaml, and change the replicas field:

# examples/serve/http_server/task.yaml
service:
  readiness_probe:
    path: /health
    initial_delay_seconds: 20
  replicas: 3

resources:
  ports: 8081
  cpus: 2+

workdir: examples/serve/http_server

run: python3 server.py

We can then use sky serve update to update the service:

$ sky serve update http-server examples/serve/http_server/task.yaml

SkyServe will trigger launching three new replicas.

$ sky serve status http-server

Services
NAME         VERSION  UPTIME  STATUS  REPLICAS  ENDPOINT
http-server  2        6m 15s  READY   2/5       44.206.240.249:30002

Service Replicas
SERVICE_NAME  ID  VERSION  ENDPOINT                    LAUNCHED     INFRA                RESOURCES                                 STATUS
http-server   1   1        http://54.173.203.169:8081  6 mins ago   AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   2   1        http://52.87.241.103:8081   6 mins ago   AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   3   2        -                           21 secs ago  AWS (us-east-1b)     1x(cpus=2, mem=8, m5.large, ...)          PROVISIONING
http-server   4   2        -                           21 secs ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          PROVISIONING
http-server   5   2        -                           21 secs ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          PROVISIONING

As soon as a new replica is ready, traffic is routed to both old and new replicas.

$ sky serve status http-server

Services
NAME         VERSION  UPTIME  STATUS  REPLICAS  ENDPOINT
http-server  1,2      10m 4s  READY   3/5       44.206.240.249:30002

Service Replicas
SERVICE_NAME  ID  VERSION  ENDPOINT                    LAUNCHED     INFRA                RESOURCES                                 STATUS
http-server   1   1        http://54.173.203.169:8081  10 mins ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   2   1        http://52.87.241.103:8081   10 mins ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   3   2        http://3.93.241.163:8081    1 min ago    AWS (us-east-1b)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   4   2        -                           1 min ago    AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          PROVISIONING
http-server   5   2        -                           1 min ago    AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          PROVISIONING

Once the total number of both old and new replicas exceeds the requested number, old replicas will be scaled down.

$ sky serve status http-server

Services
NAME         VERSION  UPTIME  STATUS  REPLICAS  ENDPOINT
http-server  1,2      10m 4s  READY   3/5       44.206.240.249:30002

Service Replicas
SERVICE_NAME  ID  VERSION  ENDPOINT                    LAUNCHED     INFRA                RESOURCES                                 STATUS
http-server   1   1        http://54.173.203.169:8081  10 mins ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   2   1        http://52.87.241.103:8081   10 mins ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   3   2        http://3.93.241.163:8081    1 min ago    AWS (us-east-1b)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   4   2        -                           1 min ago    AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          PROVISIONING
http-server   5   2        -                           1 min ago    AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          PROVISIONING

Eventually, only new replicas remain, all ready to serve user requests.

$ sky serve status http-server

Services
NAME         VERSION  UPTIME   STATUS  REPLICAS  ENDPOINT
http-server  2        11m 42s  READY   3/3       44.206.240.249:30002

Service Replicas
SERVICE_NAME  ID  VERSION  ENDPOINT                    LAUNCHED    INFRA                RESOURCES                                 STATUS
http-server   3   2        http://3.93.241.163:8081    3 mins ago  AWS (us-east-1b)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   4   2        http://18.206.226.82:8081   3 mins ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   5   2        http://3.26.232.31:8081     1 min ago   AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY
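If you want to detect from a script that an update has finished, one option is to inspect the Services row printed by sky serve status. The sketch below assumes the column layout shown in the outputs above (columns separated by two or more spaces); that layout is not a stable interface, and the function name update_finished is hypothetical.

```python
import re


def update_finished(service_row):
    """Return True if a Services row shows a single version and all replicas ready.

    Expects one data row such as:
    http-server  2        11m 42s  READY   3/3       44.206.240.249:30002
    """
    # Columns are separated by at least two spaces; "11m 42s" stays one field.
    fields = re.split(r"\s{2,}", service_row.strip())
    version, replicas = fields[1], fields[4]
    ready, total = (int(n) for n in replicas.split("/"))
    return "," not in version and ready == total


mid_update = "http-server  1,2      10m 4s  READY   3/5       44.206.240.249:30002"
done = "http-server  2        11m 42s  READY   3/3       44.206.240.249:30002"
print(update_finished(mid_update))  # False
print(update_finished(done))        # True
```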

Blue-green update

SkyServe also supports blue-green updates. To use one, pass --mode blue_green to sky serve update:

$ sky serve update --mode blue_green service-name new_service.yaml

In this update mode, SkyServe will launch new replicas described by new_service.yaml with the following behavior:

  • An update is initiated, and traffic will continue to be redirected to existing (old) replicas.

  • New replicas (with new settings) are brought up in the background.

  • Traffic will be redirected to new replicas only when all new replicas are ready.

  • Old replicas are scaled down after all new replicas are ready.

During a blue-green update, traffic is served entirely by old-version replicas or entirely by new-version replicas, never by a mix of the two. sky serve status shows the latest service version and each replica’s version.
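The all-or-nothing traffic switch can be modeled in a few lines. This sketch is not SkyServe's implementation; the function name and the tuple layout are hypothetical, and "all new replicas are ready" is interpreted as the number of READY new replicas reaching the target count.

```python
def blue_green_traffic_targets(replicas, new_version, target):
    """Return the replica IDs that receive traffic under a blue-green update.

    replicas: list of (replica_id, version, ready) tuples.
    Traffic stays on old replicas until `target` new replicas are READY,
    then switches entirely to the new ones.
    """
    ready_new = [rid for rid, ver, ready in replicas if ready and ver == new_version]
    ready_old = [rid for rid, ver, ready in replicas if ready and ver < new_version]
    if len(ready_new) >= target:
        return ready_new
    return ready_old


# Two of three new replicas are ready: traffic still goes to the old ones.
partial = [(1, 1, True), (2, 1, True), (3, 2, True), (4, 2, True), (5, 2, False)]
print(blue_green_traffic_targets(partial, new_version=2, target=3))  # [1, 2]

# All three new replicas are ready: traffic switches to them.
complete = [(1, 1, True), (2, 1, True), (3, 2, True), (4, 2, True), (5, 2, True)]
print(blue_green_traffic_targets(complete, new_version=2, target=3))  # [3, 4, 5]
```

Compared with the rolling update, the only difference in this model is that new replicas are withheld from traffic until the full set is ready.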

Example

We use the same service http-server as an example. We can then use sky serve update --mode blue_green to update the service:

$ sky serve update http-server --mode blue_green examples/serve/http_server/task.yaml

SkyServe will trigger launching three new replicas.

$ sky serve status http-server

Services
NAME         VERSION  UPTIME  STATUS  REPLICAS  ENDPOINT
http-server  2        6m 15s  READY   2/5       44.206.240.249:30002

Service Replicas
SERVICE_NAME  ID  VERSION  ENDPOINT                    LAUNCHED     INFRA                RESOURCES                                 STATUS
http-server   1   1        http://54.173.203.169:8081  6 mins ago   AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   2   1        http://52.87.241.103:8081   6 mins ago   AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   3   2        -                           21 secs ago  AWS (us-east-1b)     1x(cpus=2, mem=8, m5.large, ...)          PROVISIONING
http-server   4   2        -                           21 secs ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          PROVISIONING
http-server   5   2        -                           21 secs ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          PROVISIONING

Even after some new replicas become ready, traffic is still routed only to the old replicas.

$ sky serve status http-server

Services
NAME         VERSION  UPTIME  STATUS  REPLICAS  ENDPOINT
http-server  1        10m 4s  READY   3/5       44.206.240.249:30002

Service Replicas
SERVICE_NAME  ID  VERSION  ENDPOINT                    LAUNCHED     INFRA                RESOURCES                                 STATUS
http-server   1   1        http://54.173.203.169:8081  10 mins ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          SHUTTING_DOWN
http-server   2   1        http://52.87.241.103:8081   10 mins ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          READY
http-server   3   2        http://3.93.241.163:8081    1 min ago    AWS (us-east-1b)     1x(cpus=4, mem=16, m5.xlarge, ...)        READY
http-server   4   2        http://18.206.226.82:8081   1 min ago    AWS (us-east-1a)     1x(cpus=4, mem=16, m5.xlarge, ...)        READY
http-server   5   2        -                           1 min ago    AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          PROVISIONING

Once the number of ready new replicas meets the requested count, traffic will be redirected to the new replicas and the old replicas will be scaled down.

$ sky serve status http-server

Services
NAME         VERSION  UPTIME  STATUS  REPLICAS  ENDPOINT
http-server  2        10m 4s  READY   3/5       44.206.240.249:30002

Service Replicas
SERVICE_NAME  ID  VERSION  ENDPOINT                    LAUNCHED     INFRA                RESOURCES                                 STATUS
http-server   1   1        http://54.173.203.169:8081  10 mins ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          SHUTTING_DOWN
http-server   2   1        http://52.87.241.103:8081   10 mins ago  AWS (us-east-1a)     1x(cpus=2, mem=8, m5.large, ...)          SHUTTING_DOWN
http-server   3   2        http://3.93.241.163:8081    1 min ago    AWS (us-east-1b)     1x(cpus=4, mem=16, m5.xlarge, ...)        READY
http-server   4   2        http://18.206.226.82:8081   1 min ago    AWS (us-east-1a)     1x(cpus=4, mem=16, m5.xlarge, ...)        READY
http-server   5   2        http://3.26.232.31:8081     1 min ago    AWS (us-east-1a)     1x(cpus=4, mem=16, m5.xlarge, ...)        READY

Eventually, as with the rolling update, only new replicas remain, all ready to serve user requests.

$ sky serve status http-server

Services
NAME         VERSION  UPTIME   STATUS  REPLICAS  ENDPOINT
http-server  2        11m 42s  READY   3/3       44.206.240.249:30002

Service Replicas
SERVICE_NAME  ID  VERSION  ENDPOINT                    LAUNCHED     INFRA                RESOURCES                                 STATUS
http-server   3   2        http://3.93.241.163:8081    3 mins ago   AWS (us-east-1b)     1x(cpus=4, mem=16, m5.xlarge, ...)        READY
http-server   4   2        http://18.206.226.82:8081   3 mins ago   AWS (us-east-1a)     1x(cpus=4, mem=16, m5.xlarge, ...)        READY
http-server   5   2        http://3.26.232.31:8081     1 min ago    AWS (us-east-1a)     1x(cpus=4, mem=16, m5.xlarge, ...)        READY