Contributing to SkyPilot

Thank you for your interest in contributing to SkyPilot! We welcome and value all contributions to the project, including but not limited to:

  • Bug reports and discussions

  • Pull requests for bug fixes and new features

  • Test cases to make the codebase more robust

  • Examples

  • Documentation

  • Tutorials, blog posts and talks on SkyPilot

Contributing code

We use GitHub to track issues and features. For new contributors, we recommend looking at issues labeled “good first issue”.

Installing SkyPilot for development

Follow the steps below to set up a local development environment for contributing to SkyPilot.

Create a conda environment

To avoid package conflicts, create and activate a clean conda environment:

# SkyPilot requires 3.7 <= python <= 3.11.
conda create -y -n sky python=3.10
conda activate sky

Install SkyPilot

To install SkyPilot, please fork skypilot-org/skypilot to your GitHub account and run:

# Clone your forked repo
git clone https://github.com/<your-github-username>/skypilot.git

# Set upstream to keep in sync with the official repo
cd skypilot
git remote add upstream https://github.com/skypilot-org/skypilot.git

# Install SkyPilot in editable mode
pip install -e ".[all]"
# Alternatively, install specific cloud support only:
# pip install -e ".[aws,azure,gcp,lambda]"

# Install development dependencies
pip install -r requirements-dev.txt

(Optional) Install pre-commit

You can also install pre-commit hooks to help automatically format your code on commit:

pip install pre-commit
pre-commit install

Testing

To run smoke tests (NOTE: Running all smoke tests launches ~20 clusters):

# Run all tests on AWS and Azure (default smoke test clouds)
pytest tests/test_smoke.py

# Terminate a test's cluster even if the test failed (default is to keep it around for debugging)
pytest tests/test_smoke.py --terminate-on-failure

# Re-run last failed tests
pytest --lf

# Run one of the smoke tests
pytest tests/test_smoke.py::test_minimal

# Only run managed spot tests
pytest tests/test_smoke.py --managed-spot

# Only run the GCP tests plus the generic tests
pytest tests/test_smoke.py --gcp

# Change cloud for generic tests to Azure
pytest tests/test_smoke.py --generic-cloud azure

For profiling code, use:

pip install py-spy # py-spy is a sampling profiler for Python programs
py-spy record -t -o sky.svg -- python -m sky.cli status # Or some other command
py-spy top -- python -m sky.cli status # Get a live top view
py-spy -h # For more options

Testing in a container

It is often useful to test your changes in a clean environment set up from scratch, and a container is a good way to do this. We provide a dev container image, berkeleyskypilot/skypilot-debug, for debugging SkyPilot inside a container. Use this image by running:

docker run -it --rm --name skypilot-debug berkeleyskypilot/skypilot-debug /bin/bash
# On Apple silicon Macs:
docker run --platform linux/amd64 -it --rm --name skypilot-debug berkeleyskypilot/skypilot-debug /bin/bash

It has some convenience features which you might find helpful (see Dockerfile):

  • Common dependencies and some utilities (rsync, screen, vim, nano, etc.) are pre-installed

  • requirements-dev.txt is pre-installed

  • Environment variables for dev/debug are set correctly

  • The latest master is automatically cloned to /sky_repo/skypilot when the container is launched.

    • Note that SkyPilot itself is not pre-installed; you still have to run pip install -e ".[all]" manually to install it.

    • If your branch is on the SkyPilot repo, you can run git checkout <your_branch> to switch to your branch.

Submitting pull requests

  • Fork the SkyPilot repository and create a new branch for your changes.

  • If relevant, add tests for your changes. For changes that touch the core system, run the smoke tests and ensure they pass.

  • Follow the Google style guide.

  • Ensure code is properly formatted by running format.sh.

    • [Optional] You can also install pre-commit hooks by running pre-commit install to automatically format your code on commit.

  • Push your changes to your fork and open a pull request in the SkyPilot repository.

  • In the PR description, write a Tested: section to describe relevant tests performed.

General engineering practice suggestions

These are suggestions, not strict rules to follow. When in doubt, follow the style guide.

  • Use TODO(author_name)/FIXME(author_name) instead of a blank TODO/FIXME. This is critical for tracking down issues. If the issue belongs to someone else, you can write the TODO with your own name and assign it to that person on GitHub.

  • Delete your branch after merging it. This keeps the repo clean and makes it faster to sync.

  • Raise an exception for error conditions. Use assert only for debugging or proof-checking purposes, because exception messages usually contain more information.

  • Use LazyImport for third-party modules that meet these criteria:

    • The module is imported during import sky

    • The module has a significant import time (e.g. > 100ms)

  • To measure import time:

    • Basic check: python -X importtime -c "import sky"

    • Detailed visualization: use tuna to analyze import time:

      python -X importtime -c "import sky" 2> import.log
      tuna import.log
      
  • Use modern Python features and styles that increase code quality.

    • Use f-strings instead of .format() for short expressions to increase readability.

    • Use class MyClass: instead of class MyClass(object):. The latter was a workaround for Python 2.x.

    • Use the abc module for abstract classes to ensure all abstract methods are implemented.

    • Use Python type hints, but do not import external objects just for typing. Instead, import typing-only external objects under if typing.TYPE_CHECKING:.
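
The suggestions above can be sketched in one small module. This is only an illustration: LazyModule is a simplified stand-in written for this example and is not SkyPilot's own LazyImport, and Serializer, JsonSerializer, and save are hypothetical names that do not exist in the codebase.

```python
import abc
import importlib
import typing
from typing import Any, Optional

if typing.TYPE_CHECKING:
    # Needed only by type checkers, so it is never imported at runtime.
    import pathlib


class LazyModule:
    """Imports a module on first attribute access (illustrative stand-in)."""

    def __init__(self, module_name: str) -> None:
        self._module_name = module_name
        self._module: Optional[Any] = None

    def __getattr__(self, name: str) -> Any:
        # Called only when normal lookup fails, i.e. for the module's members.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return getattr(self._module, name)


# This line does not import json; it loads on the first lazy_json.<attr> use.
lazy_json = LazyModule('json')


class Serializer(abc.ABC):
    """Abstract base class: subclasses must implement dumps()."""

    @abc.abstractmethod
    def dumps(self, record: dict) -> str:
        """Returns `record` serialized as a string."""


class JsonSerializer(Serializer):

    def dumps(self, record: dict) -> str:
        return lazy_json.dumps(record, sort_keys=True)


def save(record: dict, path: 'pathlib.Path', serializer: Serializer) -> None:
    """Writes `record` to `path`, raising on invalid input."""
    if not record:
        # Raise for real errors: the message explains what went wrong, and
        # unlike assert, the check still runs under `python -O`.
        raise ValueError(f'Refusing to write an empty record to {path}')
    path.write_text(serializer.dumps(record))
```

Instantiating Serializer directly, or a subclass that does not define dumps(), raises TypeError, which is how abc enforces that abstract methods are implemented.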

Environment variables for developers

  • export SKYPILOT_DISABLE_USAGE_COLLECTION=1 to disable usage logging.

  • export SKYPILOT_DEBUG=1 to show debugging logs (uses the logging.DEBUG level).

  • export SKYPILOT_MINIMIZE_LOGGING=1 to minimize logging. Useful when trying to avoid multiple lines of output, such as for demos.
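
When driving SkyPilot commands from a Python script rather than an interactive shell, the same variables can be passed through a subprocess environment. The helper below is a minimal sketch: dev_env is a hypothetical name, and removing a variable to switch a flag off (instead of setting it to 0) is an assumption made here, since this guide only documents the =1 form.

```python
import os
from typing import Dict, Mapping, Optional


def dev_env(debug: bool = True,
            disable_usage_collection: bool = True,
            minimize_logging: bool = False,
            base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Returns a copy of `base` (default: os.environ) with dev flags applied."""
    env = dict(os.environ if base is None else base)
    flags = {
        'SKYPILOT_DEBUG': debug,
        'SKYPILOT_DISABLE_USAGE_COLLECTION': disable_usage_collection,
        'SKYPILOT_MINIMIZE_LOGGING': minimize_logging,
    }
    for name, enabled in flags.items():
        if enabled:
            env[name] = '1'
        else:
            # Assumption: an unset variable means "off"; only "=1" is documented.
            env.pop(name, None)
    return env


# Example (requires SkyPilot to be installed):
#   import subprocess
#   subprocess.run(['sky', 'status'], env=dev_env(), check=True)
```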

Testing the API server on a Helm chart deployment

By default, the Helm chart deployment uses the latest released API server. To test local changes to the API server, follow the steps below.

First, start the API Server:

# Ensure the helm repository is added and up to date
helm repo add skypilot https://helm.skypilot.co
helm repo update

# The following variables will be used throughout the guide
# NAMESPACE is the namespace to deploy the API server in
NAMESPACE=skypilot
# RELEASE_NAME is the name of the helm release, must be unique within the namespace
RELEASE_NAME=skypilot
# Set up basic username/password HTTP auth, or use OAuth2 proxy
WEB_USERNAME=skypilot
WEB_PASSWORD=yourpassword
AUTH_STRING=$(htpasswd -nb "$WEB_USERNAME" "$WEB_PASSWORD")
# Deploy the API server
helm upgrade --install $RELEASE_NAME skypilot/skypilot-nightly --devel \
  --namespace $NAMESPACE \
  --create-namespace \
  --set ingress.authCredentials=$AUTH_STRING

Then build an image from your local changes and deploy it to the API server:

# Change the tag on every build so that the new image is deployed
DOCKER_IMAGE=my-docker-repo/image-name:v1
docker buildx build --push --platform linux/amd64 -t $DOCKER_IMAGE -f Dockerfile_local .

# Build the chart dependencies
helm dependency build ./charts/skypilot

# Deploy the new changes to the API Server
helm upgrade --install $RELEASE_NAME ./charts/skypilot --devel \
  --namespace $NAMESPACE \
  --reuse-values \
  --set apiService.image=$DOCKER_IMAGE

Note that the image tag must change every time you rebuild your local changes.
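
One way to avoid reusing a tag by accident is to derive it from the build time. The helper below is a sketch under assumptions: unique_image and the dev- prefix are hypothetical names, not part of SkyPilot or its chart, and my-docker-repo/image-name is the placeholder from the commands above.

```python
import datetime
from typing import Optional


def unique_image(repo: str, now: Optional[datetime.datetime] = None) -> str:
    """Returns `repo` with a tag derived from the build time (UTC by default)."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    # Digits and dashes only, so the result is a valid Docker tag.
    return f'{repo}:dev-{now:%Y%m%d-%H%M%S}'


if __name__ == '__main__':
    print(unique_image('my-docker-repo/image-name'))
```

If this is saved as a script, its output can be assigned to DOCKER_IMAGE before running the build. The tag has one-second resolution, so two builds within the same second would still collide.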

Then, watch the pod status until READY shows 1/1 and STATUS shows Running:

$ kubectl get pods --namespace $NAMESPACE -l app=${RELEASE_NAME}-api --watch
NAME                                  READY   STATUS              RESTARTS   AGE
skypilot-api-server-866b9b64c-ckxfm   0/1     ContainerCreating   0          25s
skypilot-api-server-866b9b64c-ckxfm   0/1     Running             0          44s
skypilot-api-server-866b9b64c-ckxfm   1/1     Running             0          75s

Finally, fetch the updated API Server endpoint:

HOST=$(kubectl get svc ${RELEASE_NAME}-ingress-nginx-controller --namespace $NAMESPACE -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
ENDPOINT=http://${WEB_USERNAME}:${WEB_PASSWORD}@${HOST}
echo $ENDPOINT
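
The shell snippet above interpolates the username and password into the URL as-is, which produces an ambiguous URL if either contains characters such as @, : or /. A minimal Python sketch of building the same endpoint with standard percent-encoding follows; build_endpoint is a hypothetical helper, and whether a given client decodes percent-encoded credentials is worth verifying before relying on it.

```python
from urllib.parse import quote


def build_endpoint(host: str,
                   username: str,
                   password: str,
                   scheme: str = 'http') -> str:
    """Returns scheme://user:password@host with the credentials escaped."""
    # safe='' also escapes '/', which quote() leaves untouched by default.
    user = quote(username, safe='')
    secret = quote(password, safe='')
    return f'{scheme}://{user}:{secret}@{host}'


if __name__ == '__main__':
    # Placeholder values matching the variables used in this guide.
    print(build_endpoint('203.0.113.10', 'skypilot', 'yourpassword'))
```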