Retrieval Augmented Generation with DeepSeek R1

The full blog post is available here.

Step 0: Set Up The Environment

Install the following prerequisites:

Set up bucket names for storing embeddings and vector database:

export EMBEDDINGS_BUCKET_NAME=sky-rag-embeddings
export VECTORDB_BUCKET_NAME=sky-rag-vectordb

Note that these bucket names must be unique across the entire SkyPilot community, so choose your own names rather than reusing the example values above.
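One way to get names that are unlikely to collide is to append a personal suffix to the base names. The snippet below is a minimal sketch, not part of the original example: the helper `unique_bucket_name` and the choice of suffix (user name plus date) are assumptions for illustration. It also lowercases the result and replaces characters outside `a-z`, `0-9`, and `-`, since many object stores reject other characters in bucket names.

```shell
# Hypothetical helper (not part of SkyPilot): join a base name and a suffix,
# lowercase the result, and replace unsupported characters with "-".
unique_bucket_name() {
  printf '%s-%s\n' "$1" "$2" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -c 'a-z0-9\n-' '-'
}

# Assumption: user name plus today's date is distinctive enough.
SUFFIX="${USER:-user}-$(date +%Y%m%d)"

export EMBEDDINGS_BUCKET_NAME="$(unique_bucket_name sky-rag-embeddings "$SUFFIX")"
export VECTORDB_BUCKET_NAME="$(unique_bucket_name sky-rag-vectordb "$SUFFIX")"

echo "$EMBEDDINGS_BUCKET_NAME"
echo "$VECTORDB_BUCKET_NAME"
```

If a name is still taken, changing the suffix and re-exporting both variables should be enough, since the later steps read the names from the environment.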

Step 2: Build RAG with Vector Database

After computing embeddings, construct a ChromaDB vector database for efficient similarity search:

sky launch build_rag.yaml --env EMBEDDINGS_BUCKET_NAME=$EMBEDDINGS_BUCKET_NAME --env VECTORDB_BUCKET_NAME=$VECTORDB_BUCKET_NAME
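Because the command above forwards both bucket names through `--env`, an unset variable in the current shell would be passed along as an empty value. A small guard can catch that before anything is launched. This is a sketch under that assumption; `require_env` is a hypothetical helper, not a SkyPilot feature.

```shell
# Hypothetical helper: return non-zero if any named variable is unset or empty.
# The names are passed as literals, so the eval below only expands "${NAME:-}".
require_env() {
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: $name is not set" >&2
      return 1
    fi
  done
}

if require_env EMBEDDINGS_BUCKET_NAME VECTORDB_BUCKET_NAME; then
  echo "bucket variables are set"
  # Safe to run the launch command from this step here, for example:
  # sky launch build_rag.yaml --env EMBEDDINGS_BUCKET_NAME=$EMBEDDINGS_BUCKET_NAME --env VECTORDB_BUCKET_NAME=$VECTORDB_BUCKET_NAME
fi
```

The same guard can be reused before the serving step with only `VECTORDB_BUCKET_NAME` as its argument.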

The process builds the database in batches:

Loading embeddings from: embeddings_0_1000.parquet
Adding vectors to ChromaDB: 100%|██████████| 1000/1000 [00:12<00:00, 81.97it/s]
...

Step 3: Serve the RAG

Deploy the RAG service to handle queries and generate answers:

sky launch -c legal-rag serve_rag.yaml --env VECTORDB_BUCKET_NAME=$VECTORDB_BUCKET_NAME

Or use Sky Serve for managed deployment:

sky serve up -n legal-rag serve_rag.yaml --env VECTORDB_BUCKET_NAME=$VECTORDB_BUCKET_NAME

To query the system, get the endpoint:

sky serve status legal-rag --endpoint
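The endpoint can also be captured in a variable and turned into a URL for a browser or a script. The sketch below assumes the command may print either a bare `host:port` pair or a full URL, so the hypothetical helper `endpoint_url` adds `http://` only when no scheme is present. The address in the example is a placeholder from a documentation-only IP range, not a real service.

```shell
# Hypothetical helper: prefix "http://" unless the endpoint already has a scheme.
endpoint_url() {
  case "$1" in
    http://*|https://*) printf '%s\n' "$1" ;;
    *) printf 'http://%s\n' "$1" ;;
  esac
}

# Placeholder endpoint for illustration only.
endpoint_url "203.0.113.10:30001"   # prints http://203.0.113.10:30001

# With the service running, the real endpoint could be used instead:
# ENDPOINT="$(sky serve status legal-rag --endpoint)"
# echo "Open $(endpoint_url "$ENDPOINT") in a browser"
```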

Open the endpoint in a browser and enter your query there. A few queries to try out:

I want to break my lease. My landlord doesn’t allow me to do that.
My employer has not provided the final paycheck after termination.

Disclaimer

This document provides instructions for building a RAG system with SkyPilot. The system and its outputs should not be considered legal advice. Please consult qualified legal professionals for any legal matters.