🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
Updated Dec 19, 2025 - TypeScript
Crawlee: a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Supports both headful and headless modes, with proxy rotation.
Turn any website into clean data pipelines & structured APIs in minutes!
Self-hosted web scraper.
A JavaScript library for generating random user agents with data that's updated daily.
AI-powered web monitoring platform. Create automated scouts that search the web and send email alerts when they find what you're looking for.
A Playwright-based Node.js tool that bypasses search engine anti-scraping mechanisms to execute Google searches. Local alternative to SERP APIs with MCP server integration.
Model Context Protocol (MCP) Server for Graphlit Platform
⚡ Ayakashi.io - The next generation web scraping framework
An n8n node for interacting with a Browserless instance
estela, an elastic web scraping cluster 🕸
Turn topics, links, and files into AI-generated research notebooks: summarize, explore, and ask anything.
House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.
Send webhook messages from your browser (and so much more)
A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
CLI tool for agents to quickly access browser telemetry (DOM, network, console) via Chrome DevTools Protocol.
📥 Bot for downloading any media from Instagram and Twitter, and videos from TikTok and YouTube
A VS Code extension that generates markdown documentation from web pages and GitHub repositories.
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Paperback - Vietnamese - Extensions