RAG Document Assistant

AI-powered document Q&A app that uses Retrieval-Augmented Generation (RAG) to answer questions from uploaded PDFs.

Overview

This project lets users upload a PDF and ask natural-language questions about its content. The app runs semantic retrieval over the document's chunks and passes the most relevant ones to an LLM as context, so answers are grounded in the document and carry source attribution.
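
As a rough illustration of how retrieved context and source attribution can fit together, the sketch below assembles a prompt in which each chunk is numbered and tagged with its page. The `Chunk` class, the `build_prompt` function, and the prompt wording are hypothetical names chosen for this example; they are not taken from the app's code.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical container for one retrieved piece of the PDF."""
    text: str
    page: int  # assumed metadata: the page the chunk was extracted from


def build_prompt(question: str, chunks: list[Chunk]) -> str:
    """Label each chunk with a number and page so the answer can cite it."""
    context = "\n\n".join(
        f"[{i}] (page {c.page}) {c.text}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite sources by their bracketed number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    retrieved = [
        Chunk("Refunds are issued within 30 days.", page=4),
        Chunk("Shipping is free on orders over $50.", page=7),
    ]
    print(build_prompt("What is the refund window?", retrieved))
```

Numbering the chunks gives the model a compact way to point back at its sources, and the page tag lets the UI show where an answer came from.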

Features

Upload a PDF and ask questions in plain language.
Semantic search with vector embeddings for relevant retrieval.
Context-aware answers with source attribution.

Tech Stack

Python, Streamlit, LangChain, FAISS, OpenAI, Docker.

How it works

PDF ingestion: the uploaded PDF is parsed and split into text chunks.
Embedding: chunks are converted to vectors and indexed in FAISS.
Retrieval: user queries are embedded and matched against the vector store.
Generation: retrieved context is passed to the LLM for grounded answers.
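
The first three steps above can be sketched with the standard library alone. This is a minimal stand-in, not the app's implementation: a word-count vector replaces the real embedding model, a linear scan replaces the FAISS index, and the function names (`chunk_text`, `embed`, `cosine`, `retrieve`) and the default chunk sizes are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows of `size` that overlap by `overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the current window already reaches the end of the text
        start += size - overlap
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: a sparse word-count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (linear scan, no index)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    docs = [
        "FAISS stores vectors for fast search.",
        "Streamlit renders the upload form.",
        "Docker packages the app.",
    ]
    print(retrieve("How are vectors stored for search?", docs, k=1))
```

Overlapping windows reduce the chance that a sentence relevant to the query is cut in half at a chunk boundary. In the real pipeline, dense embeddings capture meaning rather than exact word overlap, and a vector index avoids scanning every chunk on each query.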

Run locally

pip install -r requirements.txt
streamlit run app.py

Docker

docker build -t rag-app .
docker run -p 8501:8501 rag-app