Using Environment Variables#
User-specified environment variables#
You can specify environment variables to be made available to a task in two ways:
- The envs field (dict) in a task YAML
- The --env flag in the sky launch/exec CLI (takes precedence over the above)
Tip
If an environment variable must be specified with --env during sky launch/exec, set it to null in the task YAML so that an error is raised when it is forgotten. For example, WANDB_API_KEY and HF_TOKEN in the following task YAML:
envs:
  WANDB_API_KEY:
  HF_TOKEN: null
  MYVAR: val
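The precedence rule and the null check can be pictured with a small Python sketch. It is illustrative only and is not SkyPilot's implementation; the helper name `resolve_envs` and the error message are hypothetical.

```python
from typing import Dict, Optional


def resolve_envs(yaml_envs: Dict[str, Optional[str]],
                 cli_envs: Dict[str, str]) -> Dict[str, str]:
    """Merge task-YAML envs with --env overrides (hypothetical helper)."""
    # Values given on the CLI take precedence over values from the task YAML.
    merged = {**yaml_envs, **cli_envs}
    # A variable left as null (None) in the YAML and not supplied via --env
    # is treated as a required variable that was forgotten.
    missing = sorted(name for name, value in merged.items() if value is None)
    if missing:
        raise ValueError(f"Environment variables not set: {', '.join(missing)}")
    return merged


# Mirrors the task YAML above: two required variables and one default.
yaml_envs = {"WANDB_API_KEY": None, "HF_TOKEN": None, "MYVAR": "val"}
resolved = resolve_envs(yaml_envs, {"WANDB_API_KEY": "abc", "HF_TOKEN": "xyz"})
print(resolved["MYVAR"])
```

Calling `resolve_envs(yaml_envs, {})` in this sketch raises a `ValueError` that names both unset variables, which is the kind of early failure the null convention is meant to give.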
The file_mounts, setup, and run sections of a task YAML can access the variables via the ${MYVAR} syntax.
Using in file_mounts#
# Sets default values for some variables; can be overridden by --env.
envs:
  MY_BUCKET: skypilot-temp-gcs-test
  MY_LOCAL_PATH: tmp-workdir
  MODEL_SIZE: 13b

file_mounts:
  /mydir:
    name: ${MY_BUCKET}  # Name of the bucket.
    mode: MOUNT

  /another-dir2:
    name: ${MY_BUCKET}-2
    source: ["~/${MY_LOCAL_PATH}"]

  /checkpoint/${MODEL_SIZE}: ~/${MY_LOCAL_PATH}
The values of these variables are filled in by SkyPilot at task YAML parse time.
Read more at examples/using_file_mounts_with_env_vars.yaml.
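As a rough mental model of parse-time substitution, the following Python sketch fills ${VAR} references in an already-parsed file_mounts dictionary. It uses the standard library's `string.Template` and is an assumption about the effect, not SkyPilot's actual code; `fill_vars` is a hypothetical helper.

```python
from string import Template
from typing import Any, Dict


def fill_vars(node: Any, envs: Dict[str, str]) -> Any:
    """Recursively substitute ${VAR} in every string of a parsed YAML tree."""
    if isinstance(node, str):
        # Raises KeyError if a referenced variable is not defined.
        return Template(node).substitute(envs)
    if isinstance(node, list):
        return [fill_vars(item, envs) for item in node]
    if isinstance(node, dict):
        # Keys can contain variables too, e.g. /checkpoint/${MODEL_SIZE}.
        return {fill_vars(k, envs): fill_vars(v, envs) for k, v in node.items()}
    return node


envs = {
    "MY_BUCKET": "skypilot-temp-gcs-test",
    "MY_LOCAL_PATH": "tmp-workdir",
    "MODEL_SIZE": "13b",
}
file_mounts = {
    "/mydir": {"name": "${MY_BUCKET}", "mode": "MOUNT"},
    "/another-dir2": {"name": "${MY_BUCKET}-2", "source": ["~/${MY_LOCAL_PATH}"]},
    "/checkpoint/${MODEL_SIZE}": "~/${MY_LOCAL_PATH}",
}
filled = fill_vars(file_mounts, envs)
print(filled["/checkpoint/13b"])  # prints ~/tmp-workdir
```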
Using in setup and run#
All user-specified environment variables are exported to a task’s setup and run commands (i.e., accessible when they are being run).
For example, this is useful for passing secrets (see below) or passing configurations:
# Sets default values for some variables; can be overridden by --env.
envs:
  MODEL_NAME: decapoda-research/llama-65b-hf

run: |
  python train.py --model_name ${MODEL_NAME} <other args>

$ sky launch --env MODEL_NAME=decapoda-research/llama-7b-hf task.yaml # Override.
See complete examples at llm/vllm/serve.yaml and llm/vicuna/train.yaml.
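On the receiving side, a training script can accept the value either from the command line or directly from the exported variable. The sketch below shows one way a hypothetical train.py might do this; only the --model_name flag and MODEL_NAME come from the example above, the rest is an assumption.

```python
import argparse
import os
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse arguments for a hypothetical train.py."""
    parser = argparse.ArgumentParser()
    # Fall back to the exported MODEL_NAME when the flag is not given.
    parser.add_argument("--model_name", default=os.environ.get("MODEL_NAME"))
    return parser.parse_args(argv)


# In a real script, call parse_args() with no arguments to read sys.argv.
args = parse_args(["--model_name", "decapoda-research/llama-7b-hf"])
print(f"Training with model: {args.model_name}")
```

Because the run command is a Bash script, ${MODEL_NAME} is expanded by the shell before train.py starts, so the flag and the environment variable carry the same value in the example above.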
Passing secrets#
We recommend passing secrets to any node(s) executing your task by first making them available in your current shell, then using --env to pass them to SkyPilot:
$ sky launch -c mycluster --env WANDB_API_KEY task.yaml
$ sky exec mycluster --env WANDB_API_KEY task.yaml
Tip
In other words, you do not need to pass the value directly, such as --env WANDB_API_KEY=1234.
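The two forms of --env can be illustrated with a short Python sketch: a bare name is looked up in the local shell environment, while NAME=value carries the value explicitly. This is a hypothetical helper written for illustration, not SkyPilot's CLI code, and the error raised for an unset variable is an assumption.

```python
import os
from typing import Mapping, Optional, Tuple


def parse_env_flag(arg: str,
                   local_env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Interpret one --env argument (hypothetical helper)."""
    local_env = os.environ if local_env is None else local_env
    if "=" in arg:
        # --env NAME=value: the value is given explicitly.
        name, value = arg.split("=", 1)
        return name, value
    # --env NAME: the value is taken from the local shell environment.
    if arg not in local_env:
        raise KeyError(f"{arg} is not set in the local environment")
    return arg, local_env[arg]


# Stand-in for the current shell after `export WANDB_API_KEY=1234`.
shell = {"WANDB_API_KEY": "1234"}
print(parse_env_flag("WANDB_API_KEY", shell))
print(parse_env_flag("MODEL_NAME=decapoda-research/llama-7b-hf", shell))
```

Passing only the name keeps the secret out of the command line itself, which is why the bare form is the one recommended above.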
SkyPilot environment variables#
SkyPilot exports these environment variables for a task’s execution. The setup and run stages have different environment variables available.
Environment variables for setup#
Name | Definition | Example |
---|---|---|
| Rank (an integer ID from 0 to | 0 |
| A string of IP addresses of the nodes in the cluster with the same order as the node ranks, where each line contains one IP address. | 1.2.3.4 |
| A unique ID assigned to each task. This environment variable is available only when the task is submitted with Refer to the description in the environment variables for run. | sky-2023-07-06-21-18-31-563597_myclus_1 For managed spot jobs: sky-managed-2023-07-06-21-18-31-563597_my-job-name_1-0 |
| A JSON string containing information about the cluster. To access the information, you could parse the JSON string in bash | {"cluster_name": "my-cluster-name", "cloud": "GCP", "region": "us-central1", "zone": "us-central1-a"} |
| The ID of a replica within the service (starting from 1). Available only for a service’s replica task. | 1 |
Since setup commands always run on all nodes of a cluster, SkyPilot ensures both of these environment variables (the ranks and the IP list) never change across multiple setups on the same cluster.
Environment variables for run#
Name | Definition | Example |
---|---|---|
| Rank (an integer ID from 0 to | 0 |
| A string of IP addresses of the nodes reserved to execute the task, where each line contains one IP address. Read more here. | 1.2.3.4 |
| Number of GPUs reserved on each node to execute the task; the same as the count in | 0 |
| A unique ID assigned to each task in the format "sky-<timestamp>_<cluster-name>_<task-id>". Useful for logging purposes: e.g., use a unique output path on the cluster; pass to Weights & Biases; etc. Each task’s logs are stored on the cluster at If a task is run as a managed spot job, then all recoveries of that job will have the same ID value. The ID is in the format "sky-managed-<timestamp>_<job-name>(_<task-name>)_<job-id>-<task-id>", where | sky-2023-07-06-21-18-31-563597_myclus_1 For managed spot jobs: sky-managed-2023-07-06-21-18-31-563597_my-job-name_1-0 |
| A JSON string containing information about the cluster. To access the information, you could parse the JSON string in bash | {"cluster_name": "my-cluster-name", "cloud": "GCP", "region": "us-central1", "zone": "us-central1-a"} |
| The ID of a replica within the service (starting from 1). Available only for a service’s replica task. | 1 |
The values of these variables are filled in by SkyPilot at task execution time.
You can access these variables in the following ways:
- In the task YAML’s setup/run commands (a Bash script), access them using the ${MYVAR} syntax;
- In the program(s) launched in setup/run, access them using the language’s standard method (e.g., os.environ for Python).
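For the second access method, a program might read these variables as in the Python sketch below. The Name column is not present in the tables above, so the variable names used here (SKYPILOT_NODE_RANK, SKYPILOT_NODE_IPS, SKYPILOT_CLUSTER_INFO) are assumptions and should be checked against your SkyPilot version; `read_sky_env` is a hypothetical helper.

```python
import json
import os
from typing import Any, Dict, Mapping, Optional

# ASSUMED variable names: the Name column is missing from the tables above.
RANK_VAR = "SKYPILOT_NODE_RANK"
IPS_VAR = "SKYPILOT_NODE_IPS"
CLUSTER_INFO_VAR = "SKYPILOT_CLUSTER_INFO"


def read_sky_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Collect rank, node IPs, and cluster info from the environment."""
    env = os.environ if env is None else env
    return {
        # The rank is an integer ID starting from 0.
        "rank": int(env.get(RANK_VAR, "0")),
        # The IP list holds one address per line.
        "node_ips": env.get(IPS_VAR, "").splitlines(),
        # The cluster information is a JSON string.
        "cluster_info": json.loads(env.get(CLUSTER_INFO_VAR, "{}")),
    }


# Sample values taken from the Example column of the tables above.
sample = {
    RANK_VAR: "0",
    IPS_VAR: "1.2.3.4",
    CLUSTER_INFO_VAR: '{"cluster_name": "my-cluster-name", "cloud": "GCP", '
                      '"region": "us-central1", "zone": "us-central1-a"}',
}
info = read_sky_env(sample)
print(info["rank"], info["node_ips"], info["cluster_info"]["region"])
```

Inside a real task, calling `read_sky_env()` with no argument reads from `os.environ` instead of the sample mapping.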