som002/gfg-downloader

GeeksforGeeks Course Downloader (down2.js)

Why this project exists

Anyone who studies online knows how frustrating 🤯 it is when you are learning something 🧐 and your internet connection drops or becomes unstable. It breaks your flow of study 📚.

This project is an automated tool that downloads ⬇️ course videos (Live and Recorded) directly from GeeksforGeeks.

[Image: Initial screen of the app]

🚀 Features

  • Session Caching: Saves your login status via auth.json to prevent repeated logins.
  • User Inputs Caching: Remembers your previous preferences for convenience (user_inputs.json).
  • Headless Browser Navigation: Uses Playwright (Firefox) to simulate real human clicks.
  • Network Interception: Captures master M3U8 playlists triggered by "Play" buttons.
  • Multi-threaded Downloading: Offloads the heavy downloading and decrypting process to a background Worker Thread.
  • Concurrent Segments: Downloads multiple .ts video chunks simultaneously using p-queue to speed up the process.
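
To illustrate the Concurrent Segments idea, here is a minimal, dependency-free sketch of a concurrency cap. The project itself uses p-queue; `runWithConcurrency` is a hypothetical stand-in written only to show the technique, not the project's code.

```javascript
// Dependency-free stand-in for a concurrency-limited queue (the project uses p-queue).
// Runs async task functions with at most `limit` in flight; results keep input order.
async function runWithConcurrency(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each runner claims the next unclaimed task index until none remain.
  async function runner() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

With a Concurrency setting of 20, each task would fetch one .ts segment and the call would look like `runWithConcurrency(segmentTasks, 20)`, where `segmentTasks` is a hypothetical array of functions that each return a promise.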

🛠️ Prerequisites

  • Node.js installed
  • Dependencies installed: npm install
  • Playwright browsers installed: npx playwright install firefox

💻 Usage

Run the main script:

node down2.js

Interactive Prompts

Upon running, the script will ask you for:

  1. Quality: Choose from highest, secondHighest, medium, lowest.
  2. Simultaneous Downloads: Number of videos to download at the exact same time (e.g., 4).
  3. Concurrency: Number of .ts segments to download concurrently per video (e.g., 20).
  4. Course Name: The exact name of your course listed in your "My Courses" tab.
  5. Primary Tab: "Live" or "Chapters".
  6. Chapter Selection: Choose whether to download "All Chapters" or specify a range.
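
The answers to these prompts are what user_inputs.json remembers between runs. A minimal sketch of how such a cache could be read and written is shown below. The function names and the field names (`quality`, `simultaneousDownloads`, `concurrency`) are assumptions for illustration; the real file layout may differ.

```javascript
const fs = require('fs');

// Hypothetical defaults; the real field names in user_inputs.json may differ.
const DEFAULT_INPUTS = {
  quality: 'highest',
  simultaneousDownloads: 4,
  concurrency: 20,
};

// Read cached answers if the file exists and parses; otherwise fall back to defaults.
function loadCachedInputs(file = 'user_inputs.json') {
  try {
    const cached = JSON.parse(fs.readFileSync(file, 'utf8'));
    return { ...DEFAULT_INPUTS, ...cached };
  } catch {
    return { ...DEFAULT_INPUTS };
  }
}

// Persist the latest answers so the next run can offer them as defaults.
function saveInputs(inputs, file = 'user_inputs.json') {
  fs.writeFileSync(file, JSON.stringify(inputs, null, 2));
}
```

The loaded values would then be passed as the default answer of each prompt, so pressing Enter repeats the previous choice.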

🧠 How it Works (Under the Hood)

The general architecture flows in 3 phases: Setup & Navigation, Scraping & Interception, and Queueing & Downloading.

1. Setup & Navigation (down2.js)

  • Initialization: Loads cached inputs and asks for user preferences via inquirer.
  • Authentication: Launches a Playwright Firefox browser. If auth.json is present, it uses that session. Otherwise, it prompts for a username and password, logs in, and saves the new session.
  • Navigation: Opens "My Courses", searches for your course, navigates to the specified tab, and determines how many chapters exist.
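
The session reuse described above maps onto Playwright's `storageState` feature. The sketch below shows one way it could be wired up; `contextOptions`, `openSession`, and the `login` callback are hypothetical names, and the actual login steps in down2.js are not shown here.

```javascript
const fs = require('fs');

// Build Playwright context options: reuse the saved session only if the file exists.
function contextOptions(authFile = 'auth.json') {
  return fs.existsSync(authFile) ? { storageState: authFile } : {};
}

// `login` is a hypothetical callback that fills in the credential form on `page`.
async function openSession(login, authFile = 'auth.json') {
  // Required lazily so contextOptions() works without Playwright installed.
  const { firefox } = require('playwright');
  const browser = await firefox.launch({ headless: true });
  const hadSession = fs.existsSync(authFile);
  const context = await browser.newContext(contextOptions(authFile));
  const page = await context.newPage();
  if (!hadSession) {
    await login(page);
    // Save cookies and local storage for the next run.
    await context.storageState({ path: authFile });
  }
  return { browser, context, page };
}
```

This sketch does not check whether a saved session has expired; a real script would need to fall back to a fresh login in that case.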

2. Scraping & Interception (utils/video_link_intercepter.js)

  • Iterates over the selected range of chapters and recordings.
  • For each recording, the script clicks the "Play" button.
  • It attaches a network listener to the page to scan outgoing requests for .m3u8 or .mp4 URLs.
  • It captures the master playlist URL and bundles it with the recording's title and folder structure into a single metadata record.
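
The interception step can be sketched with Playwright's `request` event. This is an illustration under assumptions, not the contents of utils/video_link_intercepter.js: `isMediaUrl` and `captureMediaUrl` are hypothetical helpers, and the 15-second timeout is an arbitrary choice.

```javascript
// True when the request URL's path ends in .m3u8 or .mp4 (query strings are ignored).
function isMediaUrl(url) {
  try {
    return /\.(m3u8|mp4)$/i.test(new URL(url).pathname);
  } catch {
    return false; // not an absolute URL
  }
}

// Resolve with the first media URL requested after `trigger` runs.
// `page` is a Playwright Page; any emitter of 'request' events whose payload
// has a url() method works the same way.
function captureMediaUrl(page, trigger, timeoutMs = 15000) {
  return new Promise((resolve, reject) => {
    const cleanup = () => {
      clearTimeout(timer);
      page.off('request', onRequest);
    };
    const timer = setTimeout(() => {
      cleanup();
      reject(new Error('No media request seen before timeout'));
    }, timeoutMs);

    function onRequest(request) {
      if (!isMediaUrl(request.url())) return;
      cleanup();
      resolve(request.url());
    }

    // In this sketch the listener is attached before the trigger so no request is missed.
    page.on('request', onRequest);
    Promise.resolve().then(trigger).catch((err) => {
      cleanup();
      reject(err);
    });
  });
}
```

A call might look like `await captureMediaUrl(page, () => page.click('text=Play'))`, where the `text=Play` selector is a guess at how the button could be located.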

3. Queueing & Multi-threaded Downloads (Worker/worker.js)

  • As URLs are successfully intercepted, they are added to an ObservableQueue.
  • Worker Thread: A background worker is spun up from Worker/worker.js.
  • The worker uses p-queue to cap the number of active video downloads.
  • Processing (utils/m3u8Downloader.js):
    • MP4: If the captured link is .mp4, it streams it directly.
    • M3U8 (HLS): It fetches the playlist file to find the available resolutions, selects the one matching your Quality preference, and parses the .ts segment list.
    • It fetches the AES decryption key (if applicable).
    • It downloads the .ts segments concurrently based on your Concurrency setting, decrypts them via crypto, and merges them into a final .ts (or .mp4) file on completion.
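
Two of the steps above, choosing a resolution and decrypting a segment, can be sketched with Node's standard library alone. The helper names are hypothetical, and the mapping from the Quality prompt to a playlist entry is an assumption; utils/m3u8Downloader.js may do this differently. The decryption follows the standard HLS AES-128 scheme (AES-128-CBC, with the media sequence number as the IV when the playlist gives none).

```javascript
const crypto = require('crypto');

// Parse a master playlist into variants sorted by bandwidth, highest first.
function parseVariants(masterText, baseUrl) {
  const lines = masterText.split(/\r?\n/).map((l) => l.trim());
  const variants = [];
  for (let i = 0; i < lines.length; i++) {
    if (!lines[i].startsWith('#EXT-X-STREAM-INF:')) continue;
    // Match BANDWIDTH but not AVERAGE-BANDWIDTH.
    const match = /(?:^|[:,])BANDWIDTH=(\d+)/.exec(lines[i]);
    // The variant URI is the next non-empty, non-tag line.
    const uri = lines.slice(i + 1).find((l) => l && !l.startsWith('#'));
    if (match && uri) {
      variants.push({ bandwidth: Number(match[1]), url: new URL(uri, baseUrl).href });
    }
  }
  return variants.sort((a, b) => b.bandwidth - a.bandwidth);
}

// Assumed mapping from the Quality prompt to a position in the sorted list.
function pickVariant(variants, quality) {
  const last = variants.length - 1;
  const index = {
    highest: 0,
    secondHighest: Math.min(1, last),
    medium: Math.floor(last / 2),
    lowest: last,
  }[quality];
  return variants[index ?? 0];
}

// Decrypt one segment. Without an explicit IV, the 16-byte big-endian
// media sequence number is used, as the HLS specification prescribes.
function decryptSegment(encrypted, key, sequenceNumber, iv) {
  const ivBytes = iv ?? Buffer.alloc(16);
  if (!iv) ivBytes.writeUInt32BE(sequenceNumber, 12);
  const decipher = crypto.createDecipheriv('aes-128-cbc', key, ivBytes);
  return Buffer.concat([decipher.update(encrypted), decipher.final()]);
}
```

Decrypted segments would then be appended to the output file in playlist order, regardless of the order in which their downloads finish.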

⚠️ Important Security Note

  • The generated auth.json file contains your active GeeksforGeeks login session and live JWT tokens. It is listed in .gitignore so it is not committed, but never share this file publicly.

About

Lets you download the entire archive of videos from your subscribed courses.
