Agentic Adaptive RAG is a production-ready framework for building self-correcting, reasoning-based LLM systems that dynamically choose between retrieval, web search, and generation.
Build intelligent RAG systems that know when to retrieve documents, search the web, or generate responses directly
An advanced Retrieval-Augmented Generation (RAG) system that intelligently integrates dynamic query analysis with self-correcting mechanisms to optimize response accuracy. Unlike traditional RAG approaches, this system adapts its strategy based on query complexity and context.
- Intelligent Query Routing: Automatically determines whether to use local documents, web search, or direct LLM generation
- Multi-Stage Quality Assurance: Document relevance assessment, hallucination detection, and answer quality evaluation
- Self-Correcting Mechanisms: Automatically triggers additional retrieval or regeneration when quality thresholds aren't met
- Hybrid Knowledge Sources: Seamlessly combines a local vector store with real-time web search
- Production-Ready: Built with LangGraph for robust state management and workflow orchestration
The system implements three different retrieval strategies based on query complexity:
- No Retrieval: For queries answerable from parametric knowledge
- Single-Step Retrieval: For simple queries requiring document lookup
- Multi-Hop Retrieval: For complex queries requiring reasoning across multiple sources
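The strategy choice can be illustrated with a minimal sketch. The real system routes with an LLM; the keyword heuristic below is purely a hypothetical stand-in to show the three-way branching:

```python
from enum import Enum

class Strategy(Enum):
    NO_RETRIEVAL = "no_retrieval"   # answer from parametric knowledge
    SINGLE_STEP = "single_step"     # one document lookup
    MULTI_HOP = "multi_hop"         # reason across multiple sources

def choose_strategy(question: str) -> Strategy:
    # Hypothetical heuristic: conversational queries skip retrieval;
    # queries that compare or chain facts get multi-hop treatment.
    q = question.lower()
    if any(w in q for w in ("hello", "thanks", "who are you")):
        return Strategy.NO_RETRIEVAL
    if any(w in q for w in ("compare", "versus", " and ")):
        return Strategy.MULTI_HOP
    return Strategy.SINGLE_STEP
```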
- Python 3.10+
- UV package manager (recommended) or pip
- Clone the repository

```bash
git clone https://github.com/your-username/Agentic-Adaptive-RAG.git
cd Agentic-Adaptive-RAG
```

- Set up a virtual environment with UV

```bash
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create and activate the virtual environment
uv venv --python 3.10
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```

- Install dependencies

```bash
uv pip install -r requirements.txt
```

- Configure environment variables
Create a `.env` file in the root directory:

```env
GOOGLE_API_KEY=your_google_api_key_here
TAVILY_API_KEY=your_tavily_api_key_here
LANGCHAIN_API_KEY=your_langchain_api_key_here
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
LANGCHAIN_PROJECT=agentic-rag
```

- Google AI API Key: Visit Google AI Studio
- Tavily API Key: Sign up at Tavily
- LangChain API Key: Get it from LangSmith
```bash
python ingestion.py
```

This creates a local Chroma vector store with documents about AI agents, prompt engineering, and adversarial attacks.
```bash
python main.py
```

Example session:

```
🤖 Advanced RAG Chatbot
Welcome! Ask me anything or type 'quit', 'exit', or 'bye' to leave.

💬 You: what is agent memory?
🤖 Bot: Thinking...
---ROUTE QUESTION---
---ROUTE QUESTION TO RAG---
---RETRIEVE---
---CHECK DOCUMENT RELEVANCE TO QUESTION---
---GRADE: DOCUMENT RELEVANT---
---ASSESS GRADED DOCUMENTS---
---DECISION: GENERATE---
---GENERATE---
---CHECK HALLUCINATIONS---
---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---
---GRADE GENERATION vs QUESTION---
---DECISION: GENERATION ADDRESSES QUESTION---
🤖 Bot: Agent memory is a key component of AI systems that enables agents to store, retrieve, and utilize information across interactions...
```
Run the comprehensive test suite:
```bash
python -m pytest . -s -v
```

The test suite validates:
- Document relevance grading
- Hallucination detection
- Query routing logic
- Generation quality
- End-to-end workflow
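The tests follow the usual pytest convention of plain `test_*` functions with assertions. A sketch in that style (the real `test_chains.py` exercises the LLM chains; the keyword-based `route` stub below is hypothetical):

```python
def route(question: str) -> str:
    """Stub router: the real router.py asks the LLM to pick a datasource."""
    topics = ("agent", "prompt engineering", "adversarial")
    return "vectorstore" if any(t in question.lower() for t in topics) else "websearch"

def test_route_to_vectorstore():
    # In-domain questions should stay on the local vector store.
    assert route("what is agent memory?") == "vectorstore"

def test_route_to_websearch():
    # Out-of-domain questions should fall through to web search.
    assert route("latest NBA scores") == "websearch"
```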
```
building-adaptive-rag/
├── graph/
│   ├── chains/                      # LLM processing chains
│   │   ├── tests/
│   │   │   ├── __init__.py
│   │   │   └── test_chains.py
│   │   ├── __init__.py
│   │   ├── answer_grader.py         # Answer quality evaluation
│   │   ├── generation.py            # Response generation
│   │   ├── hallucination_grader.py  # Hallucination detection
│   │   ├── retrieval_grader.py      # Document relevance scoring
│   │   └── router.py                # Query routing logic
│   ├── nodes/                       # Workflow nodes
│   │   ├── __init__.py
│   │   ├── generate.py              # Generation node
│   │   ├── grade_documents.py       # Document grading node
│   │   ├── retrieve.py              # Retrieval node
│   │   └── web_search.py            # Web search node
│   ├── __init__.py
│   ├── consts.py                    # System constants
│   ├── graph.py                     # Main workflow orchestration
│   └── state.py                     # State management
├── static/                          # Assets and diagrams
├── .env                             # Environment variables
├── .gitignore
├── ingestion.py                     # Document ingestion pipeline
├── main.py                          # Application entry point
├── model.py                         # Model configurations
├── README.md
└── requirements.txt
```
The system uses a GraphState TypedDict that flows through all workflow nodes:
- question: The user's input query
- generation: The LLM's generated response
- web_search: Boolean flag indicating whether a web search is needed
- documents: Retrieved documents from local and web sources
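A minimal sketch of that state and how a node consumes it, using only the standard library (the field names match the description above; the `retrieve` body is a placeholder for the real Chroma query in `retrieve.py`):

```python
from typing import List, TypedDict

class GraphState(TypedDict):
    """State dictionary shared by all workflow nodes."""
    question: str         # user's input query
    generation: str       # LLM's generated response
    web_search: bool      # whether a web search is needed
    documents: List[str]  # retrieved documents from local and web sources

def retrieve(state: GraphState) -> dict:
    # Nodes receive the full state and return only the fields they update.
    # The real node queries the vector store; this static list is illustrative.
    return {"documents": ["doc about agent memory"], "question": state["question"]}
```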
- Query Router: Determines optimal information source (vectorstore vs. web search)
- Document Retriever: Fetches relevant documents from local vector store
- Document Grader: Evaluates document relevance and triggers web search if needed
- Web Search: Queries external sources for additional information
- Generator: Creates responses using retrieved context
- Quality Graders: Assess hallucinations and answer relevance
The system implements intelligent decision-making at multiple points:
- Routes queries based on content domain
- Grades document relevance and triggers web search for insufficient results
- Detects hallucinations and regenerates responses when needed
- Evaluates answer quality and seeks additional information if required
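In LangGraph terms, each decision point is a conditional-edge function that inspects the state and returns the name of the next node. A sketch of the post-grading decision (the actual logic lives in `graph.py`; node names here are illustrative):

```python
def decide_to_generate(state: dict) -> str:
    """Conditional edge after document grading: when the grader found the
    retrieved documents insufficient it sets web_search, and the workflow
    augments its context with a web search before generating."""
    if state["web_search"]:
        return "websearch"  # insufficient relevant docs -> fetch more context
    return "generate"       # relevant docs found -> produce an answer
```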
Edit `model.py` to customize your language and embedding models:

```python
# Language Model Options
from langchain_aws import ChatBedrock
from langchain_google_genai import GoogleGenerativeAIEmbeddings

llm_model = ChatBedrock(model_id="us.anthropic.claude-3-5-sonnet-20241022-v2:0")
embed_model = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")
```

Customize the knowledge base by editing the URLs in `ingestion.py`:
```python
urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
]
```

- Chunk Size: Optimized at 250 tokens for better embedding quality
- Retrieval Limit: Configurable number of documents retrieved
- Web Search Results: Limited to 3 results for efficiency
- Caching: Persistent Chroma vector store for faster subsequent queries
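The effect of the 250-token chunk size can be seen with a simplified splitter. The real `ingestion.py` presumably uses a LangChain text splitter; this word-window version is only an approximation (words stand in for tokens):

```python
def split_into_chunks(text: str, chunk_size: int = 250, chunk_overlap: int = 0) -> list:
    """Slide a window of roughly chunk_size tokens (words here) over the text."""
    words = text.split()
    step = max(1, chunk_size - chunk_overlap)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]
```

Smaller chunks like these keep each embedding focused on a single idea, which tends to sharpen retrieval precision at the cost of more vectors to store.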
- Document Relevance Scoring: Binary classification of document relevance
- Hallucination Detection: Verification that responses are grounded in evidence
- Answer Quality Assessment: Evaluation of response completeness and relevance
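Each grader asks the model for a constrained binary verdict rather than free text. A stdlib sketch of the relevance grade's shape (the real chains presumably use structured-output models; this dataclass is a stand-in):

```python
from dataclasses import dataclass

@dataclass
class GradeDocuments:
    """Binary relevance verdict produced by the retrieval grader."""
    binary_score: str  # "yes" if the document is relevant to the question, else "no"

def is_relevant(grade: GradeDocuments) -> bool:
    # Normalizing the score makes downstream branching robust to casing.
    return grade.binary_score.strip().lower() == "yes"
```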
The system intelligently routes queries based on:
- Content domain analysis
- Query complexity assessment
- Available knowledge sources
- Previous retrieval success rates
- LLM Fallback State: Direct LLM responses for conversational queries
- Enhanced Router: Three-way routing (vectorstore/websearch/llm_fallback)
- Multi-Modal Support: Image and document understanding
- Conversation Memory: Context preservation across interactions
- Custom Evaluation Metrics: Domain-specific quality assessment
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License.
- LangChain for the foundational RAG framework
- LangGraph for stateful workflow orchestration
- Mistral AI for inspiration and research contributions
- Research paper: "Adaptive RAG" by Soyeong Jeong et al., 2024
If you use this project in your research, please cite:
```bibtex
@software{agentic_adaptive_rag,
  title={Agentic Adaptive RAG with LangGraph},
  author={Mohamed Shaad},
  year={2025},
  url={https://github.com/shaadclt/Agentic-Adaptive-RAG}
}
```

- Author: Mohamed Shaad
- LinkedIn: Connect on LinkedIn
⭐ Star this repository if you find it helpful!