
rootkiller6788/PRForge

PRForge

AI diff evaluation and merge gate for code written by AI coding agents.

PRForge is not a general coding chatbot. It is a review board that evaluates whether an AI-generated patch should be approved, revised, or blocked before merge.

Requirement + acceptance criteria + AI diff
  -> Intent-Match Agent
  -> Diff-Scope Agent
  -> Fake Test Detector
  -> Security Regression Gate
  -> System Test Planner
  -> GitHub PR Bot output
  -> AI Coder Scoreboard
  -> merge verdict

Current Status

PRForge has completed the AI-code-evaluation MVP through the scoreboard layer.

  • Phase 1: AI Diff Evaluation Core complete.
  • Phase 2: Fake Test Detector complete.
  • Phase 3: Security Regression Gate complete.
  • Phase 4: System Test Planner complete.
  • Phase 6: GitHub PR Bot / CI Gate output complete.
  • Phase 7: AI Coder Scoreboard complete.

The recommended product route is still:

AI diff evaluation
  -> fake test detection
  -> security regression gate
  -> real command evidence
  -> GitHub CI gate
  -> long-term AI coder scoreboard

What PRForge Evaluates

Inputs:

  • original requirement,
  • acceptance criteria,
  • AI-generated diff or patch,
  • changed files,
  • repo path,
  • AI tool/source,
  • requirement type.

Outputs:

  • intent_match_score
  • diff_scope_risk
  • ai_code_score
  • merge_verdict
  • required_fixes
  • test_truth_score
  • fake_test_risks
  • missing_test_cases
  • required_test_fixes
  • security_score
  • security_findings
  • risk_level
  • safe_validation_plan
  • block_merge_reason
  • system_test_matrix
  • must_run_checks
  • missing_system_tests
  • release_risk
  • pr_review_comment
  • check_run_summary
  • inline_annotations
  • merge_gate_result
  • ai_coder_scoreboard

Agent Modules

AI Diff Evaluation Core

  • Intent-Match Agent: checks whether the patch actually solves the request.
  • Diff-Scope Agent: flags unrelated or risky file changes.
  • Merge-Judge Agent: produces approve, request_changes, or block_merge.
  • Evidence Board: stores structured findings used by the verdict.
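As a rough illustration of how the Diff-Scope and Merge-Judge steps could fit together, the sketch below extracts the touched paths from a unified diff, compares them with the declared `changed_files`, and maps the evidence to one of the three verdicts. The thresholds, the `Evidence` class, and all function names are hypothetical; PRForge's actual scoring rules are not described in this README.

```python
import re
from dataclasses import dataclass, field

# Hypothetical thresholds, chosen only for illustration.
APPROVE_MIN_SCORE = 0.8
BLOCK_MAX_SCORE = 0.4


@dataclass
class Evidence:
    """Minimal stand-in for the structured findings the verdict relies on."""
    intent_match_score: float  # assumed to be in the range 0.0 .. 1.0
    undeclared_files: list[str] = field(default_factory=list)


def files_in_diff(diff_text: str) -> list[str]:
    """Collect target paths from 'diff --git a/<path> b/<path>' headers."""
    return re.findall(r"^diff --git a/\S+ b/(\S+)$", diff_text, flags=re.MULTILINE)


def undeclared_files(diff_text: str, declared: list[str]) -> list[str]:
    """Return paths touched by the diff that were not declared as changed."""
    declared_set = set(declared)
    return [path for path in files_in_diff(diff_text) if path not in declared_set]


def merge_verdict(evidence: Evidence) -> str:
    """Map evidence to approve, request_changes, or block_merge."""
    if evidence.intent_match_score < BLOCK_MAX_SCORE:
        return "block_merge"
    if evidence.undeclared_files or evidence.intent_match_score < APPROVE_MIN_SCORE:
        return "request_changes"
    return "approve"
```

For example, a diff that touches `setup.py` while only `backend/app/engine.py` is declared would yield `request_changes` under these assumed rules, even with a high intent-match score.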

Fake Test Detector

  • Test-Truth Agent: identifies tests without meaningful assertions.
  • Coverage-Gap Agent: detects missing tests for new logic.
  • Regression-Test Agent: flags happy-path-only, over-mocked, snapshot-heavy, or misleading tests.

Security Regression Gate

  • Security-Regression Agent: catches secrets, injection risks, unsafe files, and logging leaks.
  • Dependency-Risk Agent: flags dependency changes and package risk.
  • Auth/Permission Agent: catches auth bypass and permission regressions.
  • Safe-Validation Agent: outputs defensive validation steps only.

High or critical security regressions can directly block merge.
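The blocking rule above can be sketched as a severity threshold applied to findings from a scan of the added lines in a diff. The two patterns, the severity scale, and the function names below are illustrative assumptions only; a real gate would need a far broader rule set.

```python
import re

# Assumed severity scale, lowest to highest.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]

# Illustrative patterns only, not a complete secret-detection rule set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}


def scan_added_lines(diff_text: str) -> list[dict]:
    """Scan only lines added by the diff ('+' prefix, excluding '+++' file headers)."""
    findings = []
    for line in diff_text.splitlines():
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append({"rule": rule, "severity": "critical"})
    return findings


def blocks_merge(findings: list[dict]) -> bool:
    """High or critical findings block the merge outright."""
    threshold = SEVERITY_ORDER.index("high")
    return any(SEVERITY_ORDER.index(f["severity"]) >= threshold for f in findings)
```

Scanning only added lines keeps the gate focused on regressions introduced by the patch rather than on issues already present in the repository.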

System Test Planner

Generates merge-readiness checks:

  • API contract test,
  • E2E happy path,
  • E2E negative path,
  • database migration check,
  • backward compatibility check,
  • permission/auth regression test,
  • performance smoke test,
  • rollback readiness check,
  • observability/logging check.
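A planner of this kind could select checks from the requirement type and the changed file paths. The selection rules below are hypothetical and exist only to show the shape of such a mapping; the check names are taken from the list above.

```python
def plan_checks(requirement_type: str, changed_files: list[str]) -> list[str]:
    """Pick merge-readiness checks using simple, assumed path and type heuristics."""
    paths = [path.lower() for path in changed_files]
    checks = ["E2E happy path", "E2E negative path"]  # assumed to always apply

    if requirement_type == "api":
        checks += ["API contract test", "backward compatibility check"]
    if any("migration" in path for path in paths):
        checks += ["database migration check", "rollback readiness check"]
    if any("auth" in path or "permission" in path for path in paths):
        checks.append("permission/auth regression test")
    return checks
```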

GitHub PR Bot / CI Gate

Produces:

  • PR review comment,
  • check run summary,
  • inline annotations,
  • merge gate result.
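To make the outputs concrete, here is a minimal sketch of how a review comment and a gate result might be derived from a verdict. The comment layout, the 0 to 1 score scale, and the function names are assumptions; the README does not specify the real output format.

```python
def render_pr_comment(verdict: str, ai_code_score: float, required_fixes: list[str]) -> str:
    """Build a plain-text review comment (layout is hypothetical)."""
    lines = [f"PRForge verdict: {verdict}", f"AI code score: {ai_code_score:.2f}"]
    if required_fixes:
        lines.append("Required fixes:")
        lines += [f"- {fix}" for fix in required_fixes]
    return "\n".join(lines)


def merge_gate_passed(verdict: str) -> bool:
    """Assumed rule: the CI gate passes only on an approve verdict."""
    return verdict == "approve"
```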

AI Coder Scoreboard

Tracks quality by AI source:

  • Claude Code,
  • Codex,
  • Cursor Agent,
  • OpenHands,
  • unknown/custom tools.

Tracked dimensions include pass rate, rework rate, security findings, fake-test rate, average fix time, and final merge rate.
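Some of these dimensions can be computed by grouping evaluation records by AI source, as in the sketch below. The record fields (`ai_tool_source`, `verdict`, `fake_test_flagged`, `security_findings`) are hypothetical names chosen for the example, not the stored schema.

```python
from collections import defaultdict


def summarize(records: list[dict]) -> dict[str, dict]:
    """Aggregate per-source quality metrics from evaluation records."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["ai_tool_source"]].append(record)

    summary = {}
    for source, rows in grouped.items():
        total = len(rows)
        summary[source] = {
            "evaluations": total,
            # Share of evaluations that ended in an approve verdict.
            "pass_rate": sum(r["verdict"] == "approve" for r in rows) / total,
            # Share of evaluations where the fake-test detector raised a flag.
            "fake_test_rate": sum(bool(r["fake_test_flagged"]) for r in rows) / total,
            # Total count of security findings across evaluations.
            "security_findings": sum(r["security_findings"] for r in rows),
        }
    return summary
```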

Backend

Run from the repository root. The path below is the author's local checkout; substitute your own:

cd D:\ir\PRForge-main\PRForge-main

Create a virtual environment and install the dependencies if they are not already available (PowerShell):

python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install fastapi uvicorn pydantic

Start the API:

$env:PYTHONPATH="backend"
python -m uvicorn app.main:app --reload --host 127.0.0.1 --port 8000

Frontend

cd D:\ir\PRForge-main\PRForge-main\frontend
npm install
npm run dev

The UI is a review workbench for pasting AI diffs, criteria, changed files, GitHub PR metadata, and AI tool source.

API

  • GET /health
  • POST /api/v1/run
  • POST /api/v1/github/pr-review
  • GET /api/v1/scoreboard

Example Payload

{
  "source_type": "ai_diff",
  "title": "Evaluate AI generated diff",
  "description": "Judge whether an AI patch matches the request and is safe to merge.",
  "original_requirement": "Add replay timeline to council output.",
  "acceptance_criteria": ["Return timeline", "Return evidence"],
  "changed_files": ["backend/app/engine.py"],
  "repo_path": ".",
  "ai_tool_source": "Codex",
  "requirement_type": "api",
  "ai_diff": "diff --git a/backend/app/engine.py b/backend/app/engine.py ..."
}
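A payload like the one above could be submitted with the standard library alone. The sketch assumes the backend is running on 127.0.0.1:8000 as started in the Backend section and that `POST /api/v1/run` accepts and returns JSON; the response shape is not documented here, so the result is returned as a plain dictionary. The helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # matches the uvicorn host and port shown earlier


def build_run_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare a JSON POST to /api/v1/run without sending it."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_evaluation(payload: dict) -> dict:
    """Send the request and decode the JSON response (requires a running backend)."""
    with urllib.request.urlopen(build_run_request(payload), timeout=30) as response:
        return json.load(response)
```

Calling `run_evaluation(payload)` with the example payload should return the evaluation result, assuming the endpoint behaves as described.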

Persistence

The scoreboard module keeps historical AI-coder quality records under backend data storage when evaluations are run.

Validation

Verified locally by compiling the backend and running the council manually. pytest was not installed in the local environment during validation, so running the unit tests requires installing the test dependencies first.

