What is RAG?
RAG (Retrieval-Augmented Generation) = Search + LLM.
Instead of relying only on the LLM's training data (which may be outdated and does not include your own documents), a RAG system:
1. Indexes your documents (PDFs, web pages, manuals)
2. Retrieves the most relevant chunks when a user asks a question
3. Generates an answer using the DeepSeek API with the retrieved context
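To make the three steps concrete before bringing in real libraries, here is a toy sketch that uses only the Python standard library. It is an illustration under stated assumptions, not the pipeline built later in this article: the bag-of-words `embed` function is a stand-in for a real embedding model, and the names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical helpers invented for this example.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, index, k=2):
    """Step 2: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(range(len(chunks)), key=lambda i: cosine(q, index[i]), reverse=True)
    return [chunks[i] for i in ranked[:k]]

def build_prompt(query, retrieved):
    """Step 3 (input side): place the retrieved chunks in front of the question."""
    context = "\n".join(retrieved)
    return f"Context: {context}\n\nQuestion: {query}"

chunks = [
    "The warranty covers parts and labor for two years.",
    "To reset the router, hold the reset button for ten seconds.",
    "Support is available by email on weekdays.",
]
index = [embed(c) for c in chunks]  # Step 1: index the chunks
question = "How do I reset the router?"
top = retrieve(question, chunks, index, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

The printed prompt is what would be sent to the LLM in step 3. A real system swaps the word counts for dense embeddings and the sorted list for a vector index, but the flow stays the same.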
Step 1: Set Up DeepSeek API Client
import openai

# OpenAI-compatible client pointed at the aicreditsapi.com endpoint
client = openai.OpenAI(
    api_key="YOUR_AICREDITS_API_KEY",
    base_url="https://api.aicreditsapi.com/v1",
)
Get your API key from aicreditsapi.com/buy. No Chinese phone number is required.
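If you would rather not paste the key into source code, one common approach is to read it from an environment variable. The sketch below assumes a variable named `AICREDITS_API_KEY`; that name and the `load_api_key` helper are hypothetical choices for this example, not something the provider prescribes.

```python
import os

def load_api_key(env=None, var_name="AICREDITS_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `var_name` is a hypothetical variable name chosen for this example.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key

# Usage (assumes the variable was exported in your shell beforehand):
# client = openai.OpenAI(
#     api_key=load_api_key(),
#     base_url="https://api.aicreditsapi.com/v1",
# )
```

Failing early with a readable message tends to be easier to debug than an authentication error from the first API call.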
Step 2: Create the Vector Database (FAISS)
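from sentence_transformers import SentenceTransformer
import faiss
import numpy as np

# Small general-purpose embedding model (384-dimensional vectors)
embedder = SentenceTransformer('all-MiniLM-L6-v2')

def create_vector_db(documents):
    # Embed every document; the result has shape (n_documents, dim)
    embeddings = embedder.encode(documents)
    # Exact (brute-force) L2 index; FAISS expects float32 input
    index = faiss.IndexFlatL2(embeddings.shape[1])
    index.add(embeddings.astype(np.float32))
    return index, documents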
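The function above embeds each entry of `documents` as a single unit, so long texts are usually split into chunks first. Here is a minimal word-based splitter as a sketch. The `chunk_text` name and the default sizes are assumptions for illustration and have not been tuned for any particular model.

```python
def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    boundary still appears intact in at least one chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks

sample = " ".join(str(i) for i in range(10))
print(chunk_text(sample, chunk_size=4, overlap=1))
```

The resulting list can be passed directly as the `documents` argument of `create_vector_db`. Splitting on sentences or paragraphs instead of raw word counts is a reasonable alternative when the source text has clear structure.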
Complete Code (Copy-Paste Ready)
import openai
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

client = openai.OpenAI(
    api_key="YOUR_AICREDITS_API_KEY",
    base_url="https://api.aicreditsapi.com/v1",
)
embedder = SentenceTransformer('all-MiniLM-L6-v2')

# Build the index once at startup
docs = ["Your documents here..."]
embeddings = embedder.encode(docs)
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings.astype(np.float32))

def rag(query):
    # Embed the query and fetch the nearest chunks (never more than we have,
    # because FAISS pads missing results with index -1)
    q_emb = embedder.encode([query])
    k = min(3, len(docs))
    _, I = index.search(q_emb.astype(np.float32), k)
    context = "\n".join(docs[i] for i in I[0])
    # Ask the model to answer with the retrieved context
    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": f"Context: {context}\n\nQuestion: {query}"}],
        max_tokens=500,
    )
    return response.choices[0].message.content

print(rag("What is DeepSeek?"))
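The `rag` function above sends the context and the question together in a single user message. One possible variation is to number the retrieved chunks and add a system message asking the model to stay within them. The sketch below shows only the message construction. The `build_messages` helper and the instruction wording are assumptions for illustration, not a format taken from the DeepSeek documentation.

```python
def build_messages(query, retrieved_chunks):
    """Build chat messages that ask the model to answer from numbered context."""
    numbered = "\n".join(
        f"[{n}] {chunk}" for n, chunk in enumerate(retrieved_chunks, start=1)
    )
    system = (
        "Answer using only the context below. "
        "If the context does not contain the answer, say you don't know."
    )
    user = f"Context:\n{numbered}\n\nQuestion: {query}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_messages("What is DeepSeek?", ["First chunk.", "Second chunk."])
print(messages[1]["content"])
```

The returned list can be passed as the `messages` argument of `client.chat.completions.create`. Numbering the chunks also makes it easier to ask the model which chunk supported its answer.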