Using Environment Variables

User-specified environment variables

You can specify environment variables to be made available to a task in two ways:

- The `envs` field (dict) in a task YAML
- The `--env` flag in the `sky launch`/`sky exec` CLI (takes precedence over the above)

The file_mounts, setup, and run sections of a task YAML can access the variables via the ${MYVAR} syntax.

Using in `file_mounts`

# Sets default values for some variables; can be overridden by --env.
envs:
  MY_BUCKET: skypilot-temp-gcs-test
  MY_LOCAL_PATH: tmp-workdir
  MODEL_SIZE: 13b

file_mounts:
    /mydir:
        name: ${MY_BUCKET}  # Name of the bucket.
        mode: MOUNT

    /another-dir2:
        name: ${MY_BUCKET}-2
        source: ["~/${MY_LOCAL_PATH}"]

    /checkpoint/${MODEL_SIZE}: ~/${MY_LOCAL_PATH}

The values of these variables are filled in by SkyPilot at task YAML parse time.
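The substitution and precedence rules described above can be illustrated with a short Python sketch. This is not SkyPilot's implementation: `resolve_envs` and `substitute` are hypothetical helpers, and `string.Template` stands in for whatever mechanism SkyPilot uses at parse time. The override value `7b` is likewise an assumed example of what `--env MODEL_SIZE=7b` might supply.

```python
from string import Template


def resolve_envs(yaml_envs, cli_envs):
    """Merge variables; values from --env take precedence over the envs field."""
    merged = dict(yaml_envs)
    merged.update(cli_envs)
    return merged


def substitute(value, envs):
    """Recursively replace ${VAR} references in strings, lists, and dicts."""
    if isinstance(value, str):
        # safe_substitute leaves unknown ${VAR} references untouched.
        return Template(value).safe_substitute(envs)
    if isinstance(value, list):
        return [substitute(v, envs) for v in value]
    if isinstance(value, dict):
        return {substitute(k, envs): substitute(v, envs) for k, v in value.items()}
    return value


# Defaults from the envs field of the task YAML.
yaml_envs = {
    "MY_BUCKET": "skypilot-temp-gcs-test",
    "MY_LOCAL_PATH": "tmp-workdir",
    "MODEL_SIZE": "13b",
}
# Hypothetical override, as if passed with --env MODEL_SIZE=7b.
cli_envs = {"MODEL_SIZE": "7b"}

file_mounts = {
    "/mydir": {"name": "${MY_BUCKET}", "mode": "MOUNT"},
    "/another-dir2": {"name": "${MY_BUCKET}-2", "source": ["~/${MY_LOCAL_PATH}"]},
    "/checkpoint/${MODEL_SIZE}": "~/${MY_LOCAL_PATH}",
}

resolved = substitute(file_mounts, resolve_envs(yaml_envs, cli_envs))
for path, spec in resolved.items():
    print(path, "->", spec)
```

Note that the mount path itself (`/checkpoint/${MODEL_SIZE}`) is substituted, not only the values, which mirrors the YAML example above.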

Using in `setup` and `run`

All user-specified environment variables are exported to a task’s setup and run commands (i.e., accessible when they are being run).

For example, this is useful for passing secrets (see below) or passing configurations:

# Sets default values for some variables; can be overridden by --env.
envs:
  MODEL_NAME: decapoda-research/llama-65b-hf

run: |
  python train.py --model_name ${MODEL_NAME} <other args>

$ sky launch --env MODEL_NAME=decapoda-research/llama-7b-hf task.yaml  # Override.

See complete examples at llm/vllm/serve.yaml and llm/vicuna/train.yaml.
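On the program side, a script such as `train.py` might accept the value both ways: as the `--model_name` argument shown in the `run` command, or directly from the exported `MODEL_NAME` variable. The sketch below is a hypothetical entry point, not code from the referenced examples; `parse_args` and its fallback behavior are assumptions made for illustration.

```python
import argparse
import os


def parse_args(argv=None, environ=None):
    """Parse --model_name, falling back to the MODEL_NAME environment variable."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="Hypothetical train.py entry point")
    parser.add_argument(
        "--model_name",
        default=environ.get("MODEL_NAME"),
        help="Model to train; defaults to the MODEL_NAME environment variable.",
    )
    args = parser.parse_args(argv)
    if not args.model_name:
        parser.error("pass --model_name or set MODEL_NAME")
    return args


if __name__ == "__main__":
    # Explicit argv so the sketch runs without real command-line arguments.
    args = parse_args(["--model_name", "decapoda-research/llama-7b-hf"])
    print(f"Training {args.model_name}")
```

With this layout, an explicit command-line argument wins over the environment variable, which in turn is set by the `envs` default or the `--env` override.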

Passing secrets

We recommend passing a secret to the node(s) executing your task by first making it available in your current shell, then using --env to pass it to SkyPilot:

$ sky launch -c mycluster --env WANDB_API_KEY task.yaml
$ sky exec mycluster --env WANDB_API_KEY task.yaml

Tip

You do not need to pass the value directly, as in --env WANDB_API_KEY=1234; giving only the variable name passes the value from your current shell.
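Inside the task, a program can read the forwarded secret from its environment and fail early if it is missing. The following is a minimal sketch under that assumption; `require_secret` and `mask` are hypothetical helpers, and the key value is a made-up stand-in used so the example runs without a real secret.

```python
import os


def require_secret(name, environ=None):
    """Return the secret's value, or raise with a hint if it was not forwarded."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it locally and pass --env {name} "
            "to sky launch/exec."
        )
    return value


def mask(secret, visible=4):
    """Mask a secret for logging, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


# Stand-in mapping instead of the real environment (the value is fabricated).
demo_env = {"WANDB_API_KEY": "abcd1234efgh5678"}
key = require_secret("WANDB_API_KEY", demo_env)
print("WANDB_API_KEY loaded:", mask(key))
```

Masking before printing keeps the full value out of task logs.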

SkyPilot environment variables

SkyPilot exports these environment variables for a task’s execution. The setup and run stages have different environment variables available.

Environment variables for `setup`

| Name | Definition | Example |
| --- | --- | --- |
| `SKYPILOT_SETUP_NODE_RANK` | Rank (an integer ID from 0 to `num_nodes-1`) of the node being set up. | `0` |
| `SKYPILOT_SETUP_NODE_IPS` | A string of IP addresses of the nodes in the cluster with the same order as the node ranks, where each line contains one IP address. | `1.2.3.4` |
| `SKYPILOT_TASK_ID` | A unique ID assigned to each task. This environment variable is available only when the task is submitted with `sky launch --detach-setup`, or run as a managed spot job. Refer to the description in the environment variables for run. | `sky-2023-07-06-21-18-31-563597_myclus_1`. For managed spot jobs: `sky-managed-2023-07-06-21-18-31-563597_my-job-name_1-0` |
| `SKYPILOT_CLUSTER_INFO` | A JSON string containing information about the cluster. To access the information, you could parse the JSON string in bash `echo $SKYPILOT_CLUSTER_INFO \| jq .cloud` or in Python `json.loads(os.environ['SKYPILOT_CLUSTER_INFO'])['cloud']`. | `{"cluster_name": "my-cluster-name", "cloud": "GCP", "region": "us-central1", "zone": "us-central1-a"}` |
| `SKYPILOT_SERVE_REPLICA_ID` | The ID of a replica within the service (starting from 1). Available only for a service’s replica task. | `1` |

Since setup commands always run on all nodes of a cluster, SkyPilot ensures both of these environment variables (the ranks and the IP list) never change across multiple setups on the same cluster.
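Because the IP list is ordered by node rank, a program started from `setup` can look up its own address and decide whether it is the first node. The sketch below assumes only the two variables from the table above; `setup_topology` is a hypothetical helper, and the second IP address in the demo is invented for illustration.

```python
import os


def setup_topology(environ=None):
    """Derive this node's position in the cluster from the setup variables."""
    environ = os.environ if environ is None else environ
    rank = int(environ["SKYPILOT_SETUP_NODE_RANK"])
    # One IP address per line, in the same order as the node ranks.
    ips = environ["SKYPILOT_SETUP_NODE_IPS"].split()
    return {
        "rank": rank,
        "num_nodes": len(ips),
        "my_ip": ips[rank],
        "is_first": rank == 0,
    }


# Stand-in values for a two-node cluster, seen from the node with rank 1.
demo_env = {
    "SKYPILOT_SETUP_NODE_RANK": "1",
    "SKYPILOT_SETUP_NODE_IPS": "1.2.3.4\n1.2.3.5",
}
topology = setup_topology(demo_env)
print(topology)
```

A typical use is to run one-time steps, such as downloading a shared dataset, only where `is_first` is true.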

Environment variables for `run`

| Name | Definition | Example |
| --- | --- | --- |
| `SKYPILOT_NODE_RANK` | Rank (an integer ID from 0 to `num_nodes-1`) of the node executing the task. Read more here. | `0` |
| `SKYPILOT_NODE_IPS` | A string of IP addresses of the nodes reserved to execute the task, where each line contains one IP address. Read more here. | `1.2.3.4` |
| `SKYPILOT_NUM_GPUS_PER_NODE` | Number of GPUs reserved on each node to execute the task; the same as the count in `accelerators: <name>:<count>` (rounded up if a fraction). Read more here. | `0` |
| `SKYPILOT_TASK_ID` | A unique ID assigned to each task in the format `sky-<timestamp>_<cluster-name>_<task-id>`. Useful for logging purposes: e.g., use a unique output path on the cluster; pass to Weights & Biases; etc. Each task’s logs are stored on the cluster at `~/sky_logs/${SKYPILOT_TASK_ID%%_}/tasks/.log`. If a task is run as a managed spot job, then all recoveries of that job will have the same ID value. The ID is in the format `sky-managed-<timestamp>_<job-name>(_<task-name>)_<job-id>-<task-id>`, where `<task-name>` will appear when a pipeline is used, i.e., more than one task in a managed spot job. Read more here. | `sky-2023-07-06-21-18-31-563597_myclus_1`. For managed spot jobs: `sky-managed-2023-07-06-21-18-31-563597_my-job-name_1-0` |
| `SKYPILOT_CLUSTER_INFO` | A JSON string containing information about the cluster. To access the information, you could parse the JSON string in bash `echo $SKYPILOT_CLUSTER_INFO \| jq .cloud` or in Python `json.loads(os.environ['SKYPILOT_CLUSTER_INFO'])['cloud']`. | `{"cluster_name": "my-cluster-name", "cloud": "GCP", "region": "us-central1", "zone": "us-central1-a"}` |
| `SKYPILOT_SERVE_REPLICA_ID` | The ID of a replica within the service (starting from 1). Available only for a service’s replica task. | `1` |

The values of these variables are filled in by SkyPilot at task execution time.
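One common use of the rank, IP list, and GPU count is to assemble the arguments for a multi-node launcher. The sketch below builds a `torchrun` command line as an illustration. Treating the first IP as the rendezvous address, the port number, and launching a single process when no GPUs are reserved are all assumptions of this example, and `torchrun_args` is a hypothetical helper.

```python
import os


def torchrun_args(script, environ=None, port=29500):
    """Build a torchrun argument list from the SkyPilot run variables."""
    environ = os.environ if environ is None else environ
    # One IP address per line; assume the first one hosts the rendezvous.
    ips = environ["SKYPILOT_NODE_IPS"].split()
    rank = int(environ["SKYPILOT_NODE_RANK"])
    gpus = int(environ.get("SKYPILOT_NUM_GPUS_PER_NODE", "0"))
    return [
        "torchrun",
        f"--nnodes={len(ips)}",
        # Assumption: fall back to one process per node when no GPUs are reserved.
        f"--nproc_per_node={max(gpus, 1)}",
        f"--node_rank={rank}",
        f"--master_addr={ips[0]}",
        f"--master_port={port}",
        script,
    ]


# Stand-in values for a two-node task with 8 GPUs per node, seen from rank 1.
demo_env = {
    "SKYPILOT_NODE_RANK": "1",
    "SKYPILOT_NODE_IPS": "1.2.3.4\n1.2.3.5",
    "SKYPILOT_NUM_GPUS_PER_NODE": "8",
}
print(" ".join(torchrun_args("train.py", demo_env)))
```

The resulting list can be passed to `subprocess.run` on each node; every node computes the same rendezvous address because the IP list is identical across nodes.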

You can access these variables in the following ways:

- In the task YAML’s setup/run commands (a Bash script), access them using the `${MYVAR}` syntax;
- In the program(s) launched in setup/run, access them using the language’s standard method (e.g., `os.environ` for Python).
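As an example of the second approach, the following Python sketch reads `SKYPILOT_CLUSTER_INFO` and `SKYPILOT_TASK_ID` to build a unique output path per task. The directory layout and the `output_dir` helper are hypothetical; the demo values are the examples from the tables above.

```python
import json
import os


def output_dir(base, environ=None):
    """Build a per-task output path from the cluster info and the task ID."""
    environ = os.environ if environ is None else environ
    info = json.loads(environ["SKYPILOT_CLUSTER_INFO"])
    task_id = environ["SKYPILOT_TASK_ID"]
    # Hypothetical layout: <base>/<cloud>/<region>/<task-id>
    return "/".join([base.rstrip("/"), info["cloud"].lower(), info["region"], task_id])


# Stand-in environment using the example values from the tables.
demo_env = {
    "SKYPILOT_CLUSTER_INFO": json.dumps({
        "cluster_name": "my-cluster-name",
        "cloud": "GCP",
        "region": "us-central1",
        "zone": "us-central1-a",
    }),
    "SKYPILOT_TASK_ID": "sky-2023-07-06-21-18-31-563597_myclus_1",
}
print(output_dir("/outputs", demo_env))
```

Since all recoveries of a managed spot job share the same task ID, a path built this way stays stable across recoveries of that job.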
