An End-to-End Evaluation Framework for Entity Resolution Systems (Python, updated Dec 3, 2023)
Model Context Protocol Benchmark Runner
A metrics library for evaluating vision-language models within the PyTorch ecosystem.
An open-source Streamlit web app to generate beautiful confusion matrices for multi-class machine learning models. Supports numeric and string labels, CSV upload, manual label entry, custom color maps, and displays evaluation metrics like Accuracy, Precision, Recall, and F1-score. Users can download the confusion matrix as an image.
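As a rough illustration of the arithmetic behind such an app (the function names below are illustrative, not the app's API), a multi-class confusion matrix and the per-class metrics it displays can be computed in a few lines:

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count (true, predicted) label pairs into a labels x labels matrix."""
    index = {lab: i for i, lab in enumerate(labels)}
    m = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        m[index[t]][index[p]] += 1
    return m

def per_class_metrics(m, labels):
    """Precision, recall, and F1 per class, derived from the matrix."""
    out = {}
    for i, lab in enumerate(labels):
        tp = m[i][i]                                   # diagonal: correct predictions
        fp = sum(m[r][i] for r in range(len(labels))) - tp  # column minus diagonal
        fn = sum(m[i]) - tp                            # row minus diagonal
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        out[lab] = {"precision": prec, "recall": rec, "f1": f1}
    return out
```

Both numeric and string labels work unchanged, since labels are only used as dictionary keys.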
Safety-first legal NLP system with hierarchical long-document processing, deterministic inference, clause extraction, and rule-based risk engine — built for traceability and deployment constraints.
Enterprise-grade machine learning observability platform that detects data drift, concept drift, and performance degradation in production models. Features statistical drift detection (KS test, PSI), real-time alerting, Redis caching, and FastAPI backend.
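Of the two statistical drift tests mentioned, PSI is simple enough to sketch in plain Python. The binning scheme and the 0.1/0.25 thresholds in the comment are common conventions, not necessarily what this platform uses:

```python
import math

def psi(expected, actual, breakpoints):
    """Population Stability Index between a baseline and a production sample.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 major drift.
    """
    def fractions(values):
        counts = [0] * (len(breakpoints) + 1)
        for v in values:
            counts[sum(1 for b in breakpoints if v > b)] += 1
        eps = 1e-6  # floor to avoid log(0) on empty bins
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Identical distributions score 0; a shifted production sample pushes the index well past the drift thresholds.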
A small, educational project showing how to build a **minimal RAG pipeline** with a **simple evaluation loop**.
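One way such an evaluation loop might look, with a toy word-overlap retriever and a top-k hit-rate metric (both purely illustrative, not taken from the project):

```python
def retrieve(query, docs, k=2):
    """Rank documents by word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)[:k]

def hit_rate(eval_set, docs, k=2):
    """Fraction of (query, gold_doc) pairs whose gold document is retrieved in the top k."""
    hits = sum(1 for query, gold in eval_set if gold in retrieve(query, docs, k))
    return hits / len(eval_set)
```

A real pipeline would swap in embedding-based retrieval and a generation step, but the loop structure — run each eval query through the pipeline, score it against a gold answer, aggregate — stays the same.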