    Repositories list

    • scrapy

      Public
      Scrapy, a fast high-level web crawling & scraping framework for Python.
      Python
      11k60k458195 Updated Feb 23, 2026
    • itemadapter

      Public
      Common interface for data container classes
      Python
      1368102 Updated Feb 23, 2026
    • w3lib

      Public
      Python library of web-related functions
      Python
      107414105 Updated Feb 19, 2026
    • Library to populate items using XPath and CSS with a convenient API
      Python
      1647184 Updated Jan 29, 2026
    • queuelib

      Public
      Collection of persistent (disk-based) and non-persistent (memory-based) queues for Python
      Python
      5529242 Updated Jan 29, 2026
    • protego

      Public
      A pure-Python robots.txt parser with support for modern conventions.
      DIGITAL Command Language
      298070 Updated Jan 29, 2026
    • parsel

      Public
      Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
      Python
      1541.3k3112 Updated Jan 29, 2026
    • cssselect

      Public
      CSS Selectors for Python
      Python
      62307184 Updated Jan 29, 2026
    • scrapy-lint

      Public
      A linter for Scrapy projects.
      Python
      421420 Updated Jan 27, 2026
    • scrapyd

      Public
      A service daemon to run Scrapy spiders
      Python
      5773.1k60 Updated Jan 16, 2026
    • Command line client for Scrapyd server
      Python
      14577850 Updated Dec 15, 2025
    • Sphinx extension for documentation in the Scrapy ecosystem
      Python
      1100 Updated Sep 16, 2025
    • A CLI for benchmarking Scrapy.
      Python
      153261 Updated Jun 28, 2025
    • The scrapy.org website
      HTML
      1456511 Updated May 8, 2025
    • Python 3.8+ library to build HTTP requests out of HTML forms
      Python
      3420 Updated Mar 21, 2025
    • loginform

      Public
      Fill HTML login forms automatically
      Python
      84278113 Updated Apr 24, 2024
    • https://mimesniff.spec.whatwg.org/ implementation for Python
      Python
      21300 Updated Jan 16, 2024
    • quotesbot

      Public
      This is a sample Scrapy project for educational purposes
      Python
      7811.4k27 Updated Nov 29, 2023
    • booksbot

      Public
      A crawler for http://books.toscrape.com
      Python
      9734204 Updated Aug 8, 2023
    • scrapely

      Public
      A pure-python HTML screen-scraping library
      HTML
      2741.9k266 Updated Apr 4, 2022
    • scrapy-itemloader

      Public archive
      [Archived] Library to populate Scrapy items using XPath and CSS with a convenient API
      Python
      7620 Updated May 5, 2020
    • scurl

      Public
      Performance-focused replacement for Python urllib
      Python
      621100 Updated Oct 2, 2018
    • url component from Chromium source code, forked from https://chromium.googlesource.com/chromium/src/url
      C++
      2300 Updated Aug 7, 2018
    • base component forked from Chromium source https://chromium.googlesource.com/chromium/src/base/
      C++
      3700 Updated Jul 31, 2018
    • dirbot

      Public
      Scrapy project to scrape public web directories (educational) [DEPRECATED]
      Python
      1.1k1.6k00 Updated Oct 27, 2017
    • Codespeed for scrapy-bench
      Python
      2200 Updated Aug 28, 2017
    • A fork of http://pydispatcher.sourceforge.net/ with PyPy support
      Python
      41610 Updated Jul 3, 2017
    • slybot

      Public
      5722450 Updated Apr 27, 2015
    • GSoC2014 - Scrapy Integration tests project
      Shell
      3300 Updated Mar 18, 2014