Stars
A blazingly fast PDF table extraction library with a Python API, powered by Rust
Faster Whisper transcription with CTranslate2
censusdis is a Python package for discovering, loading, and analyzing U.S. Census demographic, economic, and geographic data and metadata. It is designed to be intuitive and Pythonic, giving users …
Scrapes an ESRI MapServer REST endpoint to spit out more generally-usable geodata.
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
A vector search SQLite extension that runs anywhere!
jq, but with many interoperable configuration format transcodings and interactive querying.
An editor for spoken-word audio with automatic transcription
Code and analysis for CBS News Reports documentary on sheriff misconduct.
sqlite3 in ur indexeddb (hopefully a better backend soon)
CLI tool and Python library that converts the output of popular command-line tools, file types, and common strings to JSON, YAML, or dictionaries. This allows piping of output to tools like jq and …
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
A SQLite extension for efficient vector search, based on Faiss!
Python package for easily interfacing with chat apps, with robust features and minimal code complexity.
Quickly and accurately render even the largest data.
Access large language models from the command-line
This repository will house any one-off side projects I take on on behalf of others, or small scripts I write to accomplish x or y task.
System font stack CSS organized by typeface classification for every modern operating system
Fraud-detection data and scripts to share with partners.
A minimal webapp for converting Google Docs to Markdown