You are viewing the latest developer preview docs, not the docs for the latest stable release.

External Links

External links are URLs associated with managed jobs and displayed in the SkyPilot dashboard. They are useful for linking to external dashboards, experiment trackers, or other relevant resources.

SkyPilot automatically detects and displays two types of links:

  1. Instance links: For jobs running on AWS, GCP, or Azure, SkyPilot automatically adds links to the cloud console for the underlying instance.

  2. Log-detected links: The dashboard automatically parses job logs to detect URLs from supported services and displays them as external links.

[Figure: Managed jobs external links]

Supported services

SkyPilot automatically detects URLs from the following services in your job logs:

  • Weights & Biases (W&B): Run URLs (e.g., https://wandb.ai/<entity>/<project>/runs/<run_id>)

When your job prints a URL from a supported service to stdout or stderr, the dashboard will automatically extract it and display it in the “External Links” section.
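To make the idea of log-based detection concrete, here is a minimal sketch of how W&B run URLs could be picked out of log text with a regular expression. It is an illustration only, not SkyPilot's actual implementation: the pattern, the function name `find_wandb_run_urls`, and the sample log lines are all assumptions made for this example.

```python
import re

# Hypothetical pattern for URLs shaped like
# https://wandb.ai/<entity>/<project>/runs/<run_id>
WANDB_RUN_URL = re.compile(
    r"https://wandb\.ai/[^/\s]+/[^/\s]+/runs/[A-Za-z0-9_-]+"
)


def find_wandb_run_urls(log_text: str) -> list[str]:
    """Return the unique W&B run URLs in log_text, in order of first appearance."""
    seen = {}  # dict keys preserve insertion order and drop duplicates
    for match in WANDB_RUN_URL.finditer(log_text):
        seen.setdefault(match.group(0), None)
    return list(seen)


# Illustrative log text (made up for this example).
sample_log = (
    "wandb: Syncing run demo-run\n"
    "wandb: View project at https://wandb.ai/my-team/example\n"
    "wandb: View run at https://wandb.ai/my-team/example/runs/abc123xy\n"
    "step 1 loss 1.0\n"
    "wandb: View run at https://wandb.ai/my-team/example/runs/abc123xy\n"
)

print(find_wandb_run_urls(sample_log))
```

In this sketch the project URL is ignored because it lacks the `/runs/<run_id>` suffix, and the repeated run URL is reported once.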

Example: Using Weights & Biases

When using W&B for experiment tracking, the W&B library automatically prints the run URL to stdout when you initialize a run. SkyPilot detects this and displays it in the dashboard.

Here’s an example training job:

# wandb_training.yaml
name: wandb-training

envs:
  WANDB_API_KEY: null # Set via --env at launch

setup: |
  pip install wandb torch

run: |
  python train.py

# train.py
import wandb

# wandb.init() prints the run URL, which SkyPilot detects in the job logs.
run = wandb.init(project='example', name='demo-run')
run.log({'loss': 1.0})
run.finish()
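If the run URL might not reach the logs on its own (for example, if W&B's console output is suppressed), a script could print it explicitly. The following is a minimal sketch under that assumption: `announce_run_url` is a hypothetical helper, not part of SkyPilot or W&B, and the message prefix is arbitrary.

```python
import sys
from typing import TextIO


def announce_run_url(url: str, stream: TextIO = sys.stdout) -> None:
    """Write the run URL on its own line and flush so it shows up in the logs promptly."""
    stream.write(f"W&B run: {url}\n")
    stream.flush()


# In train.py, this could be called right after wandb.init(), e.g.:
#   run = wandb.init(project='example', name='demo-run')
#   announce_run_url(run.url)
```

Flushing right away keeps the line from sitting in an output buffer while a long training step runs.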

Launch the job:

$ sky jobs launch -n wandb-example-job --env WANDB_API_KEY=$WANDB_API_KEY wandb_training.yaml

Once the job starts and W&B prints the run URL to the logs, you’ll see the link appear in the dashboard:

[Figure: Job detail page showing W&B external link]

Clicking the link takes you directly to the W&B run page, where you can view the run's metrics and artifacts.

[Figure: W&B run page]

© Copyright 2025, SkyPilot Team.