Run nanochat on any cloud or Kubernetes with SkyPilot


This demo shows how to train and serve nanochat on any cloud provider or Kubernetes cluster with SkyPilot. The same task YAMLs run on AWS, GCP, Azure, Lambda Labs, Nebius, and other providers, or on your own Kubernetes infrastructure.

What is nanochat?

The best ChatGPT that $100 can buy.

nanochat by Andrej Karpathy is a full-stack LLM training pipeline that runs the complete flow (tokenization -> pretraining -> finetuning -> evaluation -> inference -> web UI) on a single 8×H100 node in ~4 hours for ~$100.

Training: Running the Speedrun Pipeline

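Once SkyPilot is set up (see Appendix: Preparation), launch the speedrun training pipeline and pick a provider with --infra. Run the command from the root of your nanochat checkout: the YAML sets workdir to the current directory and calls speedrun.sh from it. Adjust the YAML path if you saved the file elsewhere.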

sky launch -c nanochat-speedrun speedrun.sky.yaml --infra <k8s|aws|gcp|nebius|lambda|etc>

This will:

Provision an 8×H100 GPU node
Set up the environment
Run the complete training pipeline via speedrun.sh
Save trained model checkpoints to s3://nanochat-data (change this to your own bucket)
Complete in approximately 4 hours (~$100 on most providers)
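
The figures above imply a rate of roughly $100 / (8 GPUs × 4 hours) ≈ $3.1 per GPU-hour. The following is a minimal sketch for plugging in your own provider's rate. The function name estimate_cost and the $3.125 rate are illustrative assumptions, not quoted prices.

```shell
# estimate_cost GPUS HOURS PRICE_PER_GPU_HOUR
# Prints GPUS * HOURS * PRICE_PER_GPU_HOUR in dollars, rounded to two decimals.
estimate_cost() {
  awk -v g="$1" -v h="$2" -v p="$3" 'BEGIN { printf "%.2f\n", g * h * p }'
}

# Speedrun: 8 GPUs for 4 hours at a hypothetical $3.125 per GPU-hour.
estimate_cost 8 4 3.125   # prints 100.00
```

To compare actual prices across the clouds you have enabled, sky show-gpus H100:8 lists the matching offerings.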

Monitoring Training Progress

# View logs
sky logs nanochat-speedrun

# SSH in and read the report card with evaluation metrics
# (SkyPilot syncs the workdir to ~/sky_workdir, which is where speedrun.sh runs)
ssh nanochat-speedrun
cat ~/sky_workdir/report.md
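
A minimal sketch for wrapping up a run: copy the report card to your machine, then tear the cluster down so the 8×H100 node stops billing. It assumes the job ran in SkyPilot's default remote working directory (~/sky_workdir) and left report.md there. The helper names fetch_report and teardown are illustrative, not part of nanochat or SkyPilot.

```shell
# fetch_report CLUSTER [DEST_DIR]
# Copies report.md from the cluster into DEST_DIR (default: current directory).
# Assumption: the report sits in ~/sky_workdir on the cluster.
fetch_report() {
  mkdir -p "${2:-.}"
  rsync -Pavz "$1:sky_workdir/report.md" "${2:-.}/"
}

# teardown CLUSTER
# Terminates the cluster without an interactive confirmation prompt.
teardown() {
  sky down --yes "$1"
}

# Usage (after training finishes):
#   fetch_report nanochat-speedrun ./results
#   teardown nanochat-speedrun
```

Checkpoints written under /tmp/nanochat live in the mounted bucket, so they remain available to the serving step after the cluster is gone.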

Serving: Deploy Your Trained Model

Once training is complete, serve your trained model with the web UI:

sky launch -c nanochat-serve serve.sky.yaml --infra <k8s|aws|gcp|nebius|lambda|etc>

This will:

Provision a 1×H100 GPU node (much cheaper than the 8×H100 VM used for training)
Load model weights from the same s3://nanochat-data bucket used during training
Serve the web chat interface on port 8000
Cost ~$2-3/hour on most providers

Accessing the Web UI

Get the endpoint URL to access the chat interface:

sky status --endpoint 8000 nanochat-serve

Open the displayed URL in your browser to chat with your trained model!
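
Before opening a browser, you can check from a script that the UI is answering. This is a hedged sketch: it assumes the endpoint may be printed either as a bare host:port or as a full URL, so it normalizes both. The helper names to_url and ui_is_up are illustrative.

```shell
# to_url ENDPOINT
# Prefixes a bare host:port with http:// and leaves full URLs untouched.
to_url() {
  case "$1" in
    http://*|https://*) printf '%s\n' "$1" ;;
    *)                  printf 'http://%s\n' "$1" ;;
  esac
}

# ui_is_up ENDPOINT
# Succeeds once the web UI returns a non-error HTTP response within 10 seconds.
ui_is_up() {
  curl --silent --fail --max-time 10 --output /dev/null "$(to_url "$1")"
}

# Usage:
#   ENDPOINT="$(sky status --endpoint 8000 nanochat-serve)"
#   ui_is_up "$ENDPOINT" && echo "Chat UI ready at $(to_url "$ENDPOINT")"
```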

[Screenshot: nanochat web UI]

Customizing Your Training

Both YAML files are ordinary SkyPilot task definitions, so you can edit fields such as the accelerators, disk size, or storage bucket to fit your setup.

Custom Storage Bucket

To use your own bucket for storing the model weights, replace s3://nanochat-data in both YAML files, so that serving reads from the same bucket that training wrote to:

file_mounts:
  /tmp/nanochat:
    source: s3://your-bucket-name  # or gs://your-bucket, r2://, cos://<region>/<bucket>, oci://<bucket_name>
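
If you would rather not edit the files by hand, the swap can be scripted. This is a minimal sketch, assuming the bucket URI contains no `|` or `&` characters (both are special in the sed expression). The helper name set_bucket is illustrative.

```shell
# set_bucket NEW_URI FILE...
# Rewrites every s3://nanochat-data reference in the given files in place,
# keeping a .bak copy of each original. The -i.bak form works with both
# GNU and BSD sed.
set_bucket() {
  new_uri="$1"
  shift
  for f in "$@"; do
    sed -i.bak "s|s3://nanochat-data|${new_uri}|g" "$f"
  done
}

# Usage:
#   set_bucket s3://your-bucket-name speedrun.sky.yaml serve.sky.yaml
```

Passing both files in one call keeps training and serving pointed at the same bucket.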

Appendix: Preparation

1. Install SkyPilot

# Quote the requirement so shells such as zsh do not expand the brackets
pip install "skypilot-nightly[aws,gcp,nebius,lambda,kubernetes]"
# or other clouds (17+ clouds and kubernetes are supported)
# See: https://docs.skypilot.co/en/latest/getting-started/installation.html

2. Check your infra setup

sky check

🎉 Enabled clouds 🎉
    ✔ AWS
    ✔ GCP
    ...
    ✔ Kubernetes

3. Configure storage access

Make sure your cloud credentials have read/write access to the bucket specified in the YAML files. See SkyPilot’s Cloud Buckets docs page for more details.
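
One way to confirm read/write access before spending GPU hours is to write, list, and delete a small probe object. This sketch assumes an S3 bucket and an installed, configured AWS CLI; other stores need their own CLI. The helper name check_bucket_rw and the probe key are illustrative.

```shell
# check_bucket_rw BUCKET_URI
# Uploads a tiny object from stdin, lists it, then deletes it.
# Stops at the first failing step and returns its exit status.
check_bucket_rw() {
  probe="${1%/}/.skypilot-access-probe"
  echo "probe" | aws s3 cp - "$probe" &&
    aws s3 ls "$probe" &&
    aws s3 rm "$probe"
}

# Usage:
#   check_bucket_rw s3://your-bucket-name && echo "bucket is readable and writable"
```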

Learn More

nanochat GitHub - Project repository and full documentation
Announcement tweet - Andrej Karpathy’s announcement
SkyPilot Docs - SkyPilot documentation

Included files

serve.sky.yaml

# Serve a trained nanochat model with the web UI
#
# Launch:
#   sky launch -c nanochat-serve cloud/serve.sky.yaml --infra <aws|gcp|nebius|lambda|etc>
#
# Access the web UI:
#   sky status --endpoint 8000 nanochat-serve
#
# Then open the URL in your browser to chat with your model!

name: nanochat-serve

resources:
  accelerators: H100:1  # Single GPU sufficient for inference
  ports: 8000  # Expose port 8000 for the web UI
  disk_size: 100

file_mounts:
  /tmp/nanochat:
    source: s3://nanochat-data

workdir: .

setup: |
  uv sync
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
  source "$HOME/.cargo/env"
  source .venv/bin/activate
  unset CONDA_PREFIX
  uv run maturin develop --release --manifest-path rustbpe/Cargo.toml

run: |
  export NANOCHAT_BASE_DIR=/tmp/nanochat
  source .venv/bin/activate
  python -m scripts.chat_web --host 0.0.0.0 --port 8000

speedrun.sky.yaml

# Run the full nanochat training speedrun
#
# Launch:
#   sky launch -c nanochat-speedrun cloud/speedrun.sky.yaml --infra <aws|gcp|nebius|lambda|etc>
#
# Monitor progress:
#   sky logs nanochat-speedrun
#
# This will train the model using 8x H100 GPUs and save results to S3.

name: nanochat-speedrun

resources:
  accelerators: H100:8
  disk_size: 512

file_mounts:
  /tmp/nanochat:
    source: s3://nanochat-data

workdir: .

setup: |
  sudo apt-get update
  sudo apt-get install -y unzip

run: |
  export NANOCHAT_BASE_DIR=/tmp/nanochat
  bash speedrun.sh