RAG & embeddings cost
A retrieval pipeline has two costs: a one-time charge to embed your documents, then a per-query charge to feed retrieved context through an LLM. Embeddings are cheap; the generation step usually dominates. Here's the split.
Estimated cost
One-time indexing
Per query
Monthly (queries)
Monthly with caching
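The four figures above can be estimated with simple arithmetic. The following is a minimal sketch, not a definitive pricing model. Every name in it (`RagCostInputs`, `estimate`, the field names) is hypothetical, and the prices in the usage example are made-up placeholders, not any vendor's actual rates. It also assumes "caching" means that a share of the input tokens is billed at a discount, which may differ from how a given provider or cache layer works.

```python
from dataclasses import dataclass


@dataclass
class RagCostInputs:
    """Inputs for a rough RAG cost estimate. All fields are assumptions."""
    corpus_tokens: int            # total tokens across the documents to embed
    embed_price_per_mtok: float   # USD per 1M embedding tokens
    context_tokens: int           # prompt + retrieved context tokens per query
    output_tokens: int            # generated tokens per query
    input_price_per_mtok: float   # USD per 1M LLM input tokens
    output_price_per_mtok: float  # USD per 1M LLM output tokens
    queries_per_month: int
    cached_fraction: float = 0.0  # share of input tokens served from cache (0..1)
    cached_discount: float = 0.0  # price reduction on cached input tokens (0..1)


def estimate(c: RagCostInputs) -> dict:
    """Return the four line items: indexing, per query, monthly, monthly cached."""
    # One-time: embed the whole corpus once.
    indexing = c.corpus_tokens / 1e6 * c.embed_price_per_mtok

    # Per query: input (context) tokens plus generated output tokens.
    input_cost = c.context_tokens / 1e6 * c.input_price_per_mtok
    output_cost = c.output_tokens / 1e6 * c.output_price_per_mtok
    per_query = input_cost + output_cost

    # Monthly: per-query cost scaled by query volume.
    monthly = per_query * c.queries_per_month

    # Monthly with caching: only the cached share of input tokens is discounted.
    cached_input_cost = input_cost * (1 - c.cached_fraction * c.cached_discount)
    monthly_cached = (cached_input_cost + output_cost) * c.queries_per_month

    return {
        "one_time_indexing": indexing,
        "per_query": per_query,
        "monthly": monthly,
        "monthly_with_caching": monthly_cached,
    }


if __name__ == "__main__":
    # Placeholder numbers chosen only to make the arithmetic easy to follow.
    example = RagCostInputs(
        corpus_tokens=10_000_000,
        embed_price_per_mtok=0.10,
        context_tokens=4_000,
        output_tokens=500,
        input_price_per_mtok=1.00,
        output_price_per_mtok=2.00,
        queries_per_month=10_000,
        cached_fraction=0.5,
        cached_discount=0.5,
    )
    for name, usd in estimate(example).items():
        print(f"{name}: ${usd:,.4f}")
```

With these placeholder inputs, indexing comes to about $1 once, while generation comes to about $50 per month, which illustrates why the generation step usually dominates. Swap in your own token counts and current prices before relying on the result.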