RAG & Vector Databases

Build retrieval-augmented AI systems that answer questions from your own documents using semantic search and vector stores. Retrieval-Augmented Generation (RAG) connects LLMs to your data (documents, databases, codebases) so that answers are grounded in retrieved sources, which reduces hallucination rather than eliminating it.

Level: Intermediate · Category: AI Applications · Estimated time: 6 hours

Prerequisites

Machine Learning Basics
Python for AI

Lessons

Why RAG? LLM Limitations & the Solution — Knowledge cutoffs, hallucinations, context limits — and how retrieval-augmented generation solves them.
Text Embeddings & Semantic Similarity — What embeddings are, how embedding models work, cosine similarity, and choosing the right embedding model.
Vector Databases Deep Dive — Chroma, Pinecone, Weaviate, and Qdrant — how they index and search, and how to pick the right one.
Document Processing & Chunking Strategies — Splitting documents intelligently — fixed-size, recursive, semantic, and metadata-aware chunking.
Hybrid Search & Result Reranking — Combining dense semantic and sparse keyword retrieval, then re-ranking with a cross-encoder for precision.
Advanced RAG Architectures — HyDE, multi-hop RAG, corrective RAG, self-RAG, and contextual compression — moving beyond naive retrieve-then-generate.
RAG Evaluation & Observability — RAGAS metrics, hallucination detection, end-to-end RAG evaluation, and monitoring in production.
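
As a rough preview of the retrieve-then-generate pattern these lessons build on, here is a minimal sketch in plain Python. It assumes hand-written three-dimensional vectors in place of real embeddings, and the names `cosine_similarity`, `retrieve`, `build_prompt`, and the sample `store` are hypothetical, chosen only for illustration. A real system would obtain vectors from an embedding model and keep them in a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


def retrieve(query_vec, store, k=2):
    """Return the k chunk texts whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


def build_prompt(question, chunks):
    """Assemble a prompt that asks the model to answer from the retrieved chunks."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Toy "vector store": (chunk text, made-up embedding). Assumption: real
# embeddings have hundreds of dimensions and come from an embedding model.
store = [
    ("Refunds are processed within 5 business days.", [0.9, 0.1, 0.0]),
    ("Our office is closed on public holidays.", [0.1, 0.8, 0.2]),
    ("Refund requests require an order number.", [0.8, 0.2, 0.1]),
]

query_vec = [1.0, 0.0, 0.0]  # made-up embedding of the question
top_chunks = retrieve(query_vec, store, k=2)
prompt = build_prompt("How long do refunds take?", top_chunks)
print(prompt)
```

The resulting prompt would then be sent to an LLM. The brute-force scan over every stored vector is only practical for small collections, which is one reason vector databases use approximate indexes.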

Topics covered

rag, vector-databases, embeddings, semantic-search, pinecone, langchain, information-retrieval, llm
