raphaelguye/rag_rock_reg

rag_rock_reg

A lightweight Retrieval-Augmented Generation (RAG) app to query competition rules using semantic search and a local Mistral LLM via Ollama.

What it does

  • 📄 Extracts text from rulebook PDFs
  • ✂️ Splits text into semantic chunks (~200 words)
  • 🧠 Embeds the chunks using MiniLM via sentence-transformers
  • 🗂️ Stores them in a FAISS vector index
  • 💬 Retrieves relevant chunks and sends them to a local Mistral LLM for answering
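The chunking step can be illustrated with a minimal sketch. It assumes plain fixed-size word windows; the project's actual splitting logic is not shown in this README and may differ (for example by respecting sentence or section boundaries). The function name chunk_words is hypothetical.

```python
def chunk_words(text, size=200):
    """Split text into consecutive chunks of at most `size` words.

    Assumption: a plain word-count window, not the project's real splitter.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    words = text.split()
    # Step through the word list in strides of `size`
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


chunks = chunk_words("word " * 450)
print(len(chunks))  # 3 chunks: 200, 200 and 50 words
```

Smaller chunks tend to give more precise retrieval hits, while larger ones carry more context per hit; ~200 words is the size this project states.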

Quickstart

1. Set up the Python environment

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

2. Add your regulation PDFs

Place your WRRC (World Rock'n'Roll Confederation) rulebook PDFs in a folder named regulations_pdfs in the project root:

mkdir regulations_pdfs
# add your PDFs here

3. Build the index

python index_pdfs.py

This script will:

  • Extract and chunk text
  • Embed chunks using MiniLM
  • Create faiss_index.idx and index_metadata.pkl
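As a rough sketch of what the embedding and indexing steps might look like (not the actual contents of index_pdfs.py), the following assumes the all-MiniLM-L6-v2 checkpoint, a flat L2 FAISS index, and metadata stored as a pickled list of chunk strings. The README only says "MiniLM" and names the two output files, so all three choices are assumptions, and the function names are hypothetical.

```python
import pickle


def save_metadata(chunks, meta_path="index_metadata.pkl"):
    """Persist chunk texts so that FAISS row ids can be mapped back to text."""
    with open(meta_path, "wb") as f:
        pickle.dump(list(chunks), f)


def load_metadata(meta_path="index_metadata.pkl"):
    """Load the chunk texts written by save_metadata."""
    with open(meta_path, "rb") as f:
        return pickle.load(f)


def build_index(chunks, index_path="faiss_index.idx",
                meta_path="index_metadata.pkl",
                model_name="all-MiniLM-L6-v2"):  # assumed MiniLM variant
    # Third-party imports stay local so the helpers above work without them
    import faiss
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    # FAISS expects a 2-D float32 array of shape (n_chunks, dim)
    embeddings = model.encode(chunks, convert_to_numpy=True).astype("float32")

    index = faiss.IndexFlatL2(embeddings.shape[1])  # exact L2 search, assumed
    index.add(embeddings)

    faiss.write_index(index, index_path)
    save_metadata(chunks, meta_path)
    return index
```

A flat index does exhaustive search, which is typically fine for a handful of rulebooks; approximate index types only start to matter at much larger scale.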

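4. Run Ollama with Mistral

In a separate terminal:

ollama serve
ollama run mistral

ollama serve starts the Ollama server, and ollama run mistral loads the Mistral model, downloading it first if needed. Because ollama serve keeps running in the foreground, run the second command from another terminal.

5. Ask questions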

python query_rag.py

Example:

Ask a question (or type 'exit'): How many categories are defined in the competition rules?
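The retrieve-then-answer flow behind such a query could be sketched as below. This is an illustration under assumptions, not the actual query_rag.py: it assumes the index and metadata layout of a pickled list of chunk strings, the all-MiniLM-L6-v2 checkpoint, Ollama listening on its default local port 11434, and a hypothetical prompt template. The function names are illustrative.

```python
import json
import pickle
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def retrieve(question, k=3, index_path="faiss_index.idx",
             meta_path="index_metadata.pkl",
             model_name="all-MiniLM-L6-v2"):  # assumed MiniLM variant
    """Return the k chunks closest to the question in embedding space."""
    import faiss
    from sentence_transformers import SentenceTransformer

    index = faiss.read_index(index_path)
    with open(meta_path, "rb") as f:
        chunks = pickle.load(f)  # assumed: list of chunk strings

    model = SentenceTransformer(model_name)
    query = model.encode([question], convert_to_numpy=True).astype("float32")
    _, ids = index.search(query, k)
    # FAISS pads with -1 when the index holds fewer than k vectors
    return [chunks[i] for i in ids[0] if i >= 0]


def build_prompt(question, chunks):
    """Assemble a grounded prompt (hypothetical template)."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def ask_ollama(prompt, model="mistral"):
    """Send the prompt to the local Ollama server and return its answer."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# Usage, once the index is built and Ollama is running:
#   q = "How many categories are defined in the competition rules?"
#   print(ask_ollama(build_prompt(q, retrieve(q))))
```

Setting "stream" to False makes Ollama return a single JSON object instead of a stream of partial responses, which keeps the client code short.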

🧰 Tech stack
