John-777-tech/CFT-RAG-2025

 
 


CFT-RAG-2025

CFT-RAG: An Entity Tree Based Retrieval Augmented Generation Algorithm With Cuckoo Filter

1. Code Usage

Configuration

pip install uv
uv sync
pip install lab-1806-vec-db==0.2.3 python-dotenv sentence-transformers openai pybloom_live
export HF_ENDPOINT=https://hf-mirror.com
python -m spacy download zh_core_web_sm

Complete Process of CFT-RAG

Arguments:

  • vec-db-key: The key for the vector database.
  • tree-num-max: The maximum number of trees to build.
  • entities-file-name: The name of the entities file.
  • search-method: The search method to use:
    • 0 for Vector Database Only
    • 1 for Naive Tree-RAG
    • 2 for Bloom Filter Search in Tree-RAG
    • 5 for Improved Bloom Filter Search in Tree-RAG
    • 7 for Cuckoo Filter in Tree-RAG
    • 8 for Approximate Nearest Neighbors in Tree-RAG
    • 9 for Approximate Nearest Neighbors in Graph-RAG
  • node-num-max: The maximum number of nodes to build.

Example:

python main.py --tree-num-max 50 --search-method 7
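
As an illustration of how the arguments listed above fit together, here is a minimal sketch of a parser for them using Python's standard argparse module. It is not the repository's main.py: the argument types, the defaults, and the helper names build_parser and describe are assumptions made for this example. Only the flag names and the search-method codes come from the list above.

```python
import argparse

# Search-method codes as listed in the arguments above.
SEARCH_METHODS = {
    0: "Vector Database Only",
    1: "Naive Tree-RAG",
    2: "Bloom Filter Search in Tree-RAG",
    5: "Improved Bloom Filter Search in Tree-RAG",
    7: "Cuckoo Filter in Tree-RAG",
    8: "Approximate Nearest Neighbors in Tree-RAG",
    9: "Approximate Nearest Neighbors in Graph-RAG",
}


def build_parser():
    """Build a parser for the documented flags (types and defaults are assumed)."""
    parser = argparse.ArgumentParser(description="CFT-RAG argument sketch")
    parser.add_argument("--vec-db-key", type=str, default=None,
                        help="key for the vector database")
    parser.add_argument("--tree-num-max", type=int, default=None,
                        help="maximum number of trees to build")
    parser.add_argument("--entities-file-name", type=str, default=None,
                        help="name of the entities file")
    parser.add_argument("--search-method", type=int, default=7,
                        choices=sorted(SEARCH_METHODS),
                        help="search method code")
    parser.add_argument("--node-num-max", type=int, default=None,
                        help="maximum number of nodes to build")
    return parser


def describe(argv):
    """Parse argv and return the namespace plus a readable search-method label."""
    args = build_parser().parse_args(argv)
    return args, SEARCH_METHODS[args.search_method]


if __name__ == "__main__":
    args, label = describe(["--tree-num-max", "50", "--search-method", "7"])
    print(args.tree_num_max, label)
```

Restricting --search-method with choices makes an unsupported code such as 3 fail at parse time instead of selecting no method at all.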

Test Cuckoo Filter

To test the performance of the improved cuckoo filter and the sorting results on their own, run:

python test_tree.py
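
For readers unfamiliar with the data structure, the following is a minimal textbook-style cuckoo filter in Python. It is a sketch for illustration only and does not reproduce the improved filter used in this repository. The class name, parameter defaults, the SHA-256 based hashing, and the sample entity strings are all assumptions made for the example.

```python
import hashlib
import random


class CuckooFilter:
    """Generic cuckoo filter sketch: stores short fingerprints in two candidate buckets."""

    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500):
        # A power-of-two bucket count keeps the XOR-based alternate index reversible.
        if num_buckets <= 0 or num_buckets & (num_buckets - 1):
            raise ValueError("num_buckets must be a power of two")
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]

    @staticmethod
    def _hash(data):
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item):
        # 16-bit fingerprint in the range 1..65535 (never zero).
        return self._hash(b"fp:" + item.encode()) % 0xFFFF + 1

    def _index(self, item):
        return self._hash(item.encode()) % self.num_buckets

    def _alt_index(self, index, fp):
        # Partial-key cuckoo hashing: the other bucket depends only on index and fingerprint.
        return (index ^ self._hash(fp.to_bytes(2, "big"))) % self.num_buckets

    def _candidates(self, item):
        fp = self._fingerprint(item)
        i1 = self._index(item)
        return fp, i1, self._alt_index(i1, fp)

    def insert(self, item):
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = random.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = random.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Simplification: the last evicted fingerprint is dropped when the filter is full.
        return False

    def contains(self, item):
        fp, i1, i2 = self._candidates(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item):
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


if __name__ == "__main__":
    cf = CuckooFilter()
    for entity in ("hospital", "cardiology", "aspirin"):  # hypothetical entities
        cf.insert(entity)
    print(cf.contains("aspirin"))
    print(cf.delete("aspirin"))
```

Unlike a Bloom filter, this structure supports deletion, because each stored fingerprint occupies an identifiable slot that can be removed. Lookups can still return false positives when two items share a fingerprint and a bucket.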

2. Reference

TRAG-cuckoofilter is based on https://github.com/efficient/cuckoofilter.

Datasets used:

| Dataset | Scale | Source |
| --- | --- | --- |
| MedQA | Large | https://github.com/jind11/MedQA |
| AESLC | Medium | https://huggingface.co/datasets/Yale-LILY/aeslc |
| DART | Medium | https://github.com/Yale-LILY/dart |
| Rui'an People's Hospital | Small | https://www.rahos.gov.cn |
