A multilingual (Hindi + English) AI assistant designed for rural development governance, enabling Gram Panchayats to access scheme information, analyze village deficits, retrieve government documents, and generate actionable development insights using FastAPI, Qdrant Vector DB, Neo4j Knowledge Graph, LLMs, and a modern React + Tailwind frontend.
Demo Video: https://youtu.be/SvaaQusU9nU?si=S9WfEV2PBO07HTdO
- Overview
- Features
- Folder Structure
- How to Run Locally
- Architecture & Design Decisions
- Data Sources & Preprocessing
- RAG + Knowledge Graph Pipeline
- LLM Reasoning Flow
- Challenges & Trade-Offs
Panchayat-Sahayika is an AI-powered digital assistant designed specifically for rural governance in India, helping Gram Panchayat officials, government workers, and citizens to:
- Understand development deficits
- Discover government schemes
- Ask questions about rural indicators
- Receive data-driven recommendations
- Access extended insights using RAG + LLM reasoning
The system integrates:
- FastAPI backend
- Qdrant vector embeddings for semantic search
- LLMs (Groq / OpenAI) for query interpretation
- CSV datasets of Uttarakhand village-level deficits
- React + Tailwind frontend for clean interaction
Real-world application: We tested the prototype in Pawo Malla village (Uttarakhand) with the Gram Pradhan, gathering real on-ground feedback about water issues, connectivity problems, and village priorities.
Ask in Hindi or English; the system automatically interprets the meaning.
The system analyzes infrastructure deficits (roads, water, health, education).
Users can ask:
"Hamare gaon ke liye paani yojana kaun si hai?" ("Which water scheme is available for our village?")
Retrieves relevant paragraphs, government documents, schemes, and datasets.
Graph nodes:
- Development Themes
- Government Schemes
- Panchayat Needs
- Infrastructure deficits
Relationships enrich LLM outputs.
LLM synthesizes scheme suggestions, cluster insights, and action strategies.
Built with React + Tailwind for lightweight rural-friendly UX.
Panchayat-Sahayika/
│
├── backend/
│   ├── data/                          # Scheme texts, rural indicators, reference docs
│   ├── qdrant_data/                   # Preprocessed embeddings or vector payloads
│   ├── services/                      # RAG, embeddings, KG, query handlers
│   ├── utils/                         # Helper functions, preprocessors
│   ├── FinderScreen.py                # Infra deficit inference logic
│   ├── gram.py                        # Panchayat-specific retrieval logic
│   ├── app.py                         # FastAPI entry
│   ├── main.py                        # API routing + server startup
│   ├── requirements.txt
│   └── uttarakhand_infra_deficits.csv
│
├── public/
│
├── src/
│   ├── components/                    # Chat UI, cards, loaders
│   ├── pages/                         # Main dashboard, chat page
│   ├── styles/                        # Tailwind configs
│   └── utils/                         # Frontend helpers
│
├── index.html
├── package.json
├── tailwind.config.js
└── README.md
cd backend
pip install -r requirements.txt
Create .env:
# --- Qdrant Vector DB ---
QDRANT_URL=your_qdrant_url
QDRANT_API_KEY=your_qdrant_key
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
# --- LLM Keys ---
GROQ_API_KEY=your_groq_api
OPENAI_API_KEY=your_openai_api
# --- (Optional) Model Settings ---
MODEL_NAME=llama-3-8b
TEMPERATURE=0.2
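The backend presumably reads these values at startup. The following is a minimal sketch of one way to do that, assuming the variables have already been loaded into the process environment (for example by the shell or a dotenv loader). `Settings` and `load_settings` are hypothetical names, not taken from the repository.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    qdrant_url: str
    qdrant_api_key: str
    embedding_model: str
    model_name: str
    temperature: float


def load_settings(env=None) -> Settings:
    """Build Settings from an environment mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        qdrant_url=env["QDRANT_URL"],  # required: raises KeyError if missing
        qdrant_api_key=env.get("QDRANT_API_KEY", ""),
        embedding_model=env.get(
            "EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"
        ),
        model_name=env.get("MODEL_NAME", "llama-3-8b"),
        temperature=float(env.get("TEMPERATURE", "0.2")),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests without touching real credentials.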
uvicorn main:app --reload
cd frontend
npm install
npm run dev
Set .env:
VITE_API_URL=http://127.0.0.1:8000
App opens at:
http://localhost:5173
FastAPI: lightweight, modular, async-ready.
Qdrant stores embeddings for:
- schemes
- documents
- development indicators
The Neo4j knowledge graph captures structured relationships between:
- Themes
- Schemes
- Infrastructure deficits
- Gram Panchayat needs
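To illustrate how such relationships can be queried, here is a small in-memory stand-in for the graph. It is only a sketch: the project uses Neo4j, whereas this example uses a plain Python edge list, and the node names and relationship labels (`ADDRESSES`, `BELONGS_TO`) are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical edge list: (source node, relationship, target node).
EDGES = [
    ("water_supply_deficit", "BELONGS_TO", "Drinking Water"),
    ("Jal Jeevan Mission", "ADDRESSES", "water_supply_deficit"),
    ("road_access_deficit", "BELONGS_TO", "Rural Connectivity"),
    ("Pradhan Mantri Gram Sadak Yojana", "ADDRESSES", "road_access_deficit"),
]


def build_index(edges):
    """Index sources by (relationship, target) so lookups by deficit are cheap."""
    index = defaultdict(list)
    for source, relation, target in edges:
        index[(relation, target)].append(source)
    return index


def schemes_for_deficit(index, deficit):
    """Return schemes linked to a deficit through an ADDRESSES edge."""
    return sorted(index.get(("ADDRESSES", deficit), []))


index = build_index(EDGES)
print(schemes_for_deficit(index, "water_supply_deficit"))  # ['Jal Jeevan Mission']
```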
Uses uttarakhand_infra_deficits.csv to compute village-level gaps.
Groq: fast inference.
OpenAI: fallback and improved quality.
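A primary-then-fallback arrangement like this can be sketched without committing to either vendor's SDK. In the example below the providers are plain callables. `generate_with_fallback`, `flaky_groq`, and `stub_openai` are hypothetical stand-ins, and the real client calls would go inside those callables.

```python
from typing import Callable, Sequence, Tuple


def generate_with_fallback(
    prompt: str, providers: Sequence[Tuple[str, Callable[[str], str]]]
) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, answer)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def flaky_groq(prompt: str) -> str:
    # Stand-in that simulates a failing primary provider.
    raise TimeoutError("simulated timeout")


def stub_openai(prompt: str) -> str:
    # Stand-in for the fallback provider.
    return "answer for: " + prompt


result = generate_with_fallback(
    "paani yojana?", [("groq", flaky_groq), ("openai", stub_openai)]
)
```

Keeping the provider order in a list means the priority can be changed from configuration rather than code.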
flowchart TD
A[User Query<br>Hindi/English] --> B[Frontend Chat UI]
B --> C[FastAPI Backend]
C --> D[Preprocessing<br>Language Detection, Normalization]
D --> E[Qdrant Semantic Search]
D --> F[Neo4j Knowledge Graph]
D --> G[Uttarakhand Infra Deficit CSV]
E --> H[RAG Context Builder]
F --> H
G --> H
H --> I[LLM Reasoning Layer<br>Groq / OpenAI]
I --> B
File: uttarakhand_infra_deficits.csv
Contains metrics like:
- Water supply status
- Road access
- Healthcare centers
- Digital connectivity
- Education infrastructure
Stored in /backend/data/.
- Text cleaning
- Stopword handling
- Semantic chunking
- Embedding generation
- KG node + edge creation
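The cleaning and chunking steps above can be sketched as follows. This is a simplified approximation: it splits on sentence boundaries and packs sentences greedily, which is cruder than true semantic chunking. The function names and the `max_chars` limit are assumptions.

```python
import re


def clean_text(text: str) -> str:
    """Collapse whitespace runs and strip the ends."""
    return re.sub(r"\s+", " ", text).strip()


def chunk_sentences(text: str, max_chars: int = 200) -> list:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars becomes its own chunk.
    """
    # Split after ., ?, ! or the Devanagari danda when followed by whitespace.
    sentences = re.split(r"(?<=[.?!।])\s+", clean_text(text))
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each resulting chunk would then be passed to the embedding model and stored with its source metadata.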
- Detect language (Hindi/English)
- Identify keywords (water, roads, health)
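Both steps can be approximated with simple rules. The sketch below treats any Devanagari character, or at least two romanized-Hindi hint words, as a signal for Hindi. The word lists and the threshold are illustrative assumptions, not the project's actual rules.

```python
import re

DEVANAGARI = re.compile(r"[\u0900-\u097F]")

# Illustrative romanized-Hindi hints; a real list would be much longer.
HINGLISH_HINTS = {"hai", "gaon", "paani", "yojana", "kaun", "hamare", "liye"}

# Illustrative mapping from surface words to infrastructure themes.
THEME_KEYWORDS = {
    "water": {"water", "paani", "pani", "jal", "पानी"},
    "roads": {"road", "roads", "sadak", "सड़क"},
    "health": {"health", "hospital", "swasthya"},
}


def tokenize(query: str) -> list:
    """Lowercase, split on whitespace, and trim common punctuation."""
    return [t.strip("?.!,।\"'") for t in query.lower().split()]


def detect_language(query: str) -> str:
    """Return 'hi' for Devanagari or romanized Hindi, else 'en'."""
    if DEVANAGARI.search(query):
        return "hi"
    hits = sum(token in HINGLISH_HINTS for token in tokenize(query))
    # A threshold of two hint words is an arbitrary assumption.
    return "hi" if hits >= 2 else "en"


def extract_themes(query: str) -> list:
    """Return the sorted themes whose keyword sets overlap the query tokens."""
    tokens = set(tokenize(query))
    return sorted(t for t, words in THEME_KEYWORDS.items() if tokens & words)
```

Splitting on whitespace rather than `\w+` keeps Devanagari words intact, since Python's `\w` does not match the combining vowel signs inside them.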
Qdrant semantic search returns the top N relevant chunks.
Infrastructure deficit inference runs via FinderScreen.py.
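In the project, the vector database performs this ranking. The pure-Python sketch below only illustrates the underlying idea: score each stored chunk by cosine similarity to the query vector and keep the N best. The two-dimensional vectors are toy values, whereas real all-MiniLM-L6-v2 embeddings have 384 dimensions.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n(query_vec, chunks, n=3):
    """chunks: iterable of (text, vector). Returns the n best (score, text) pairs."""
    scored = ((cosine(query_vec, vec), text) for text, vec in chunks)
    return heapq.nlargest(n, scored)


# Toy data, made up for illustration.
toy_chunks = [
    ("water scheme", [1.0, 0.0]),
    ("road scheme", [0.0, 1.0]),
    ("mixed", [1.0, 1.0]),
]
best = top_n([1.0, 0.0], toy_chunks, n=2)
```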
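The contents of FinderScreen.py are not shown here, so the following is only a sketch of what a deficit lookup over the CSV might look like. The column names, the 0/1 encoding, and the sample rows are all made up for illustration and do not reflect the real dataset.

```python
import csv
import io

# Hypothetical sample; the real CSV's columns and values are not shown in this README.
SAMPLE_CSV = """village,water_supply,road_access,health_centre,internet
Village A,0,1,0,0
Village B,1,1,1,0
"""


def load_deficits(csv_file):
    """Map village name -> sorted list of indicators flagged 0 (missing)."""
    deficits = {}
    for row in csv.DictReader(csv_file):
        village = row.pop("village")
        deficits[village] = sorted(k for k, v in row.items() if v.strip() == "0")
    return deficits


deficits = load_deficits(io.StringIO(SAMPLE_CSV))
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an open file handle.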
LLM merges:
- semantic context
- graph knowledge
- village deficit data
To generate a final actionable recommendation.
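One way to merge the three sources is to assemble them into a single prompt before calling the LLM. This is a minimal sketch under that assumption. `build_prompt`, the section titles, and the instruction wording are hypothetical.

```python
def build_prompt(question, chunks, graph_facts, deficits, max_chunks=3):
    """Assemble one prompt string from the three context sources."""
    sections = [
        ("Retrieved passages", chunks[:max_chunks]),
        ("Knowledge graph facts", graph_facts),
        ("Village deficits", deficits),
    ]
    lines = ["Answer using only the context below. Reply in the user's language."]
    for title, items in sections:
        if not items:
            continue  # skip empty sources instead of printing a bare header
        lines.append(f"\n{title}:")
        lines.extend(f"- {item}" for item in items)
    lines.append(f"\nQuestion: {question}")
    return "\n".join(lines)
```

Capping the number of retrieved passages keeps the prompt within the model's context budget.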
User Query → Parse Intent → Retrieve Relevant Schemes → Fetch Village Deficits
→ Expand using Knowledge Graph → LLM synthesis → Final Recommendation
Example:
"Hamare gaon me paani ki dikkat hai. Kya sujhav hai?" ("Our village has a water problem. What do you suggest?")
LLM Output includes:
- identified deficits from CSV
- related schemes like Jal Jeevan Mission
- local insights
- actionable steps
Solution: Hand-curated CSV for Uttarakhand.
Trade-off: Store compact MiniLM embeddings.
Trade-off: Skip KG rebuild on every startup to save memory.
Trade-off: Simple rules + embeddings instead of a dedicated NLU model.
Solution: fuzzy matching + manual correction list.
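The text does not say what is being matched. Assuming it is spelling variation in village names, fuzzy matching backed by a manual correction list could be sketched with the standard library's `difflib`. The correction entry, the cutoff, and `resolve_village` are illustrative assumptions.

```python
import difflib

# Hypothetical manual corrections for spellings that fuzzy matching may miss.
MANUAL_CORRECTIONS = {"pavo mala": "Pawo Malla"}


def resolve_village(name, known_villages, cutoff=0.8):
    """Return the canonical village name, or None if nothing is close enough."""
    key = " ".join(name.lower().split())
    if key in MANUAL_CORRECTIONS:
        return MANUAL_CORRECTIONS[key]  # manual list takes priority
    by_lower = {v.lower(): v for v in known_villages}
    matches = difflib.get_close_matches(key, list(by_lower), n=1, cutoff=cutoff)
    return by_lower[matches[0]] if matches else None
```

Checking the manual list first lets known bad spellings be fixed deterministically, while the cutoff prevents unrelated names from being matched.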