πŸ€– Agentic Adaptive RAG with LangGraph

Python 3.10+ · LangGraph · License: MIT

Agentic Adaptive RAG is a production-ready framework for building self-correcting, reasoning-based LLM systems that dynamically choose between retrieval, web search, and generation.

Build intelligent RAG systems that know when to retrieve documents, search the web, or generate responses directly

An advanced Retrieval-Augmented Generation (RAG) system that intelligently integrates dynamic query analysis with self-correcting mechanisms to optimize response accuracy. Unlike traditional RAG approaches, this system adapts its strategy based on query complexity and context.

🌟 Key Features

  • 🧠 Intelligent Query Routing: Automatically determines whether to use local documents, web search, or direct LLM generation
  • πŸ“Š Multi-Stage Quality Assurance: Document relevance assessment, hallucination detection, and answer quality evaluation
  • πŸ”„ Self-Correcting Mechanisms: Automatically triggers additional retrieval or regeneration when quality thresholds aren't met
  • 🌐 Hybrid Knowledge Sources: Seamlessly combines local vector store with real-time web search
  • ⚑ Production-Ready: Built with LangGraph for robust state management and workflow orchestration

πŸ—οΈ System Architecture

The system implements three different retrieval strategies based on query complexity:

  • No Retrieval: For queries answerable from parametric knowledge
  • Single-Step Retrieval: For simple queries requiring document lookup
  • Multi-Hop Retrieval: For complex queries requiring reasoning across multiple sources

(Architecture diagram: RAG query pipeline)

πŸš€ Quick Start

Prerequisites

  • Python 3.10+
  • UV package manager (recommended) or pip

Installation

  1. Clone the repository
git clone https://github.com/your-username/Agentic-Adaptive-RAG.git
cd Agentic-Adaptive-RAG
  2. Set up a virtual environment with UV
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create and activate virtual environment
uv venv --python 3.10
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  3. Install dependencies
uv pip install -r requirements.txt
  4. Configure environment variables
Create a .env file in the root directory:
GOOGLE_API_KEY=your_google_api_key_here
TAVILY_API_KEY=your_tavily_api_key_here
LANGCHAIN_API_KEY=your_langchain_api_key_here
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
LANGCHAIN_PROJECT=agentic-rag
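
These variables need to be loaded into the process environment at startup. The repo most likely uses python-dotenv's `load_dotenv()` for this; the hand-rolled loader below is only an illustration of what that call does (read `KEY=VALUE` lines, skip comments, and never override variables already set in the shell):

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal stand-in for python-dotenv's load_dotenv():
    export KEY=VALUE pairs from a file without overriding
    variables already present in the environment."""
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blanks, comments, and malformed lines.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```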

Getting Your API Keys

  • GOOGLE_API_KEY: Google AI Studio (https://aistudio.google.com)
  • TAVILY_API_KEY: Tavily (https://tavily.com)
  • LANGCHAIN_API_KEY: LangSmith (https://smith.langchain.com)

🎯 Usage

1. Initialize the Vector Database

python ingestion.py

This creates a local Chroma vector store with documents about AI agents, prompt engineering, and adversarial attacks.

2. Run the Interactive Chatbot

python main.py

3. Example Interaction

πŸ€– Advanced RAG Chatbot
Welcome! Ask me anything or type 'quit', 'exit', or 'bye' to leave.

πŸ’¬ You: what is agent memory?
πŸ€” Bot: Thinking...
---ROUTE QUESTION---
---ROUTE QUESTION TO RAG---
---RETRIEVE---
---CHECK DOCUMENT RELEVANCE TO QUESTION---
---GRADE: DOCUMENT RELEVANT---
---ASSESS GRADED DOCUMENTS---
---DECISION: GENERATE---
---GENERATE---
---CHECK HALLUCINATIONS---
---DECISION: GENERATION IS GROUNDED IN DOCUMENTS---
---GRADE GENERATION vs QUESTION---
---DECISION: GENERATION ADDRESSES QUESTION---

πŸ€– Bot: Agent memory is a key component of AI systems that enables agents to store, retrieve, and utilize information across interactions...

πŸ§ͺ Testing

Run the comprehensive test suite:

python -m pytest . -s -v

The test suite validates:

  • Document relevance grading
  • Hallucination detection
  • Query routing logic
  • Generation quality
  • End-to-end workflow

πŸ“‚ Project Structure

Agentic-Adaptive-RAG/
β”œβ”€β”€ graph/
β”‚   β”œβ”€β”€ chains/                 # LLM processing chains
β”‚   β”‚   β”œβ”€β”€ tests/
β”‚   β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   β”‚   └── test_chains.py
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   β”œβ”€β”€ answer_grader.py    # Answer quality evaluation
β”‚   β”‚   β”œβ”€β”€ generation.py       # Response generation
β”‚   β”‚   β”œβ”€β”€ hallucination_grader.py  # Hallucination detection
β”‚   β”‚   β”œβ”€β”€ retrieval_grader.py # Document relevance scoring
β”‚   β”‚   └── router.py           # Query routing logic
β”‚   β”œβ”€β”€ nodes/                  # Workflow nodes
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   β”œβ”€β”€ generate.py         # Generation node
β”‚   β”‚   β”œβ”€β”€ grade_documents.py  # Document grading node
β”‚   β”‚   β”œβ”€β”€ retrieve.py         # Retrieval node
β”‚   β”‚   └── web_search.py       # Web search node
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ consts.py              # System constants
β”‚   β”œβ”€β”€ graph.py               # Main workflow orchestration
β”‚   └── state.py               # State management
β”œβ”€β”€ static/                     # Assets and diagrams
β”œβ”€β”€ .env                       # Environment variables
β”œβ”€β”€ .gitignore
β”œβ”€β”€ ingestion.py               # Document ingestion pipeline
β”œβ”€β”€ main.py                    # Application entry point
β”œβ”€β”€ model.py                   # Model configurations
β”œβ”€β”€ README.md
└── requirements.txt

πŸ”§ Key Components

State Management

The system uses a GraphState TypedDict that flows through all workflow nodes:

  • question: User's input query
  • generation: LLM's response
  • web_search: Boolean flag for web search necessity
  • documents: Retrieved documents from local and web sources
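
The state described above can be sketched as a plain `TypedDict`; the field names follow the bullets, though the exact type annotations in the repo's `graph/state.py` may differ:

```python
from typing import List, TypedDict

class GraphState(TypedDict):
    """State dictionary passed between all workflow nodes."""
    question: str          # user's input query
    generation: str        # LLM's current response
    web_search: bool       # whether web search is needed
    documents: List[str]   # retrieved document contents
```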

Workflow Nodes

  1. Query Router: Determines optimal information source (vectorstore vs. web search)
  2. Document Retriever: Fetches relevant documents from local vector store
  3. Document Grader: Evaluates document relevance and triggers web search if needed
  4. Web Search: Queries external sources for additional information
  5. Generator: Creates responses using retrieved context
  6. Quality Graders: Assess hallucinations and answer relevance
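
Independent of LangGraph, the node order above reduces to a simple control flow. The sketch below is a framework-free approximation (node functions are passed in as callables, so the names and signatures are illustrative, not the repo's):

```python
def run_pipeline(question, retrieve, grade, web_search, generate):
    """Minimal control flow mirroring the node order above:
    retrieve -> grade -> (optional web search) -> generate."""
    state = {"question": question, "documents": [],
             "web_search": False, "generation": ""}
    state["documents"] = retrieve(question)
    # Grading returns the relevant documents plus a flag that
    # requests web search when local retrieval was insufficient.
    state["documents"], state["web_search"] = grade(question, state["documents"])
    if state["web_search"]:
        state["documents"] += web_search(question)
    state["generation"] = generate(question, state["documents"])
    return state
```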

Decision Logic

The system implements intelligent decision-making at multiple points:

  • Routes queries based on content domain
  • Grades document relevance and triggers web search for insufficient results
  • Detects hallucinations and regenerates responses when needed
  • Evaluates answer quality and seeks additional information if required
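
The decision points above boil down to small routing functions. A sketch of the two main ones (names and return labels are illustrative; the repo's conditional-edge functions may differ):

```python
def decide_to_generate(state: dict) -> str:
    """After document grading: fall back to web search if any
    retrieved document was judged irrelevant, else generate."""
    return "websearch" if state["web_search"] else "generate"

def grade_generation(grounded: bool, addresses_question: bool) -> str:
    """After generation: regenerate on hallucination, search for
    more context if the answer misses the question, else finish."""
    if not grounded:
        return "regenerate"
    if not addresses_question:
        return "websearch"
    return "useful"
```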

πŸ› οΈ Configuration

Model Configuration

Edit model.py to customize your language and embedding models:

# Language Model Options
from langchain_aws import ChatBedrock
from langchain_google_genai import GoogleGenerativeAIEmbeddings

llm_model = ChatBedrock(model_id="us.anthropic.claude-3-5-sonnet-20241022-v2:0")
embed_model = GoogleGenerativeAIEmbeddings(model="models/text-embedding-004")

Document Sources

Customize the knowledge base by editing the URLs in ingestion.py:

urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
]

πŸ“ˆ Performance Optimization

  • Chunk Size: Optimized at 250 tokens for better embedding quality
  • Retrieval Limit: Configurable number of documents retrieved
  • Web Search Results: Limited to 3 results for efficiency
  • Caching: Persistent Chroma vector store for faster subsequent queries
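
To make the 250-token chunking concrete: the repo presumably uses a LangChain text splitter, but the windowing it performs looks roughly like this word-level approximation (words stand in for tokens here):

```python
def chunk_text(text: str, chunk_size: int = 250, overlap: int = 0) -> list:
    """Naive sliding-window chunker: split text into windows of
    chunk_size words, advancing by chunk_size - overlap each step."""
    tokens = text.split()
    step = max(chunk_size - overlap, 1)
    return [" ".join(tokens[i:i + chunk_size])
            for i in range(0, len(tokens), step)]
```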

πŸ”¬ Advanced Features

Quality Assurance Pipeline

  1. Document Relevance Scoring: Binary classification of document relevance
  2. Hallucination Detection: Verification that responses are grounded in evidence
  3. Answer Quality Assessment: Evaluation of response completeness and relevance
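
The binary grading step can be pictured with a small stand-in. The repo's graders ask an LLM for a structured yes/no verdict; the heuristic below (keyword overlap) is purely illustrative of the interface, not the actual grading logic:

```python
from dataclasses import dataclass

@dataclass
class GradeDocuments:
    """Binary relevance verdict, mirroring the structured
    output the grader chain requests from the LLM."""
    binary_score: str  # "yes" or "no"

def grade_stub(question: str, document: str) -> GradeDocuments:
    """Toy stand-in for the LLM grader: relevant if the
    document shares at least one word with the question."""
    q_words = set(question.lower().split())
    d_words = set(document.lower().split())
    return GradeDocuments("yes" if q_words & d_words else "no")
```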

Adaptive Routing

The system intelligently routes queries based on:

  • Content domain analysis
  • Query complexity assessment
  • Available knowledge sources
  • Previous retrieval success rates

🚧 Future Enhancements

  • LLM Fallback State: Direct LLM responses for conversational queries
  • Enhanced Router: Three-way routing (vectorstore/websearch/llm_fallback)
  • Multi-Modal Support: Image and document understanding
  • Conversation Memory: Context preservation across interactions
  • Custom Evaluation Metrics: Domain-specific quality assessment

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

πŸ“ License

This project is licensed under the MIT License.

πŸ™ Acknowledgments

  • LangChain for the foundational RAG framework
  • LangGraph for stateful workflow orchestration
  • Mistral AI for inspiration and research contributions
  • Research paper: "Adaptive RAG" by Soyeong Jeong et al., 2024

πŸ“Š Citation

If you use this project in your research, please cite:

@software{agentic_adaptive_rag,
  title={Agentic Adaptive RAG with LangGraph},
  author={Mohamed Shaad},
  year={2025},
  url={https://github.com/shaadclt/Agentic-Adaptive-RAG}
}

πŸ“§ Contact


⭐ Star this repository if you find it helpful!
