
fix: use webhook payload SHAs in list_changed_files to avoid race condition #1107

Merged
myakove merged 6 commits into main from fix/issue-1096-webhook-payload-shas
Jun 10, 2026
Conversation

@myakove

@myakove myakove commented Jun 9, 2026

Collaborator

PR Summary by Qodo

Fix changed-files diff race by using webhook payload base/head SHAs
🐞 Bug fix 🧪 Tests 🕐 20-40 Minutes


Walkthroughs

User Description

Summary

Replace live PyGithub API calls with webhook payload SHAs in list_changed_files() to eliminate a race condition in which the base branch receives new commits between the clone and the API call.

  • Prefer webhook payload SHAs for pull_request events (no race condition)
  • Fall back to PullRequest API object for non-PR events (issue_comment, check_run, etc.)
  • Store pr_base_sha/pr_head_sha on GithubWebhook instance during process()
  • Remove pull_request parameter from initialize() and list_changed_files()
  • Add symmetric guards for both base and head SHA validation

Closes #1096

AI Description
• Persist PR base/head SHAs from webhook payload to prevent base-branch race conditions.
• Fall back to PullRequest API SHAs for non-PR webhook event types.
• Simplify OWNERS handler initialization by removing PullRequest parameter plumbing.
Diagram
graph TD
  A["GitHub webhook payload"] --> B["GithubWebhook.process"] --> C["Store PR base/head SHAs"] --> D[("Local clone")] --> E["OwnersFileHandler.list_changed_files"] --> F["git diff --name-only"]
  B --> G["PullRequest API (fallback)"] --> C
High-Level Assessment

Using webhook payload SHAs for pull_request events is the most reliable way to keep the local clone and diff base/head aligned and eliminate the observed race. Alternatives like always querying live PR/base refs or diffing against the current base branch would reintroduce timing drift; passing SHAs through additional parameters instead of storing on the per-request GithubWebhook instance would add plumbing without changing the core correctness.


File Changes

Bug fix (2)
github_api.py Persist PR base/head SHAs from payload with API fallback +21/-5


• Stores pr_base_sha/pr_head_sha on the GithubWebhook instance during process(), preferring pull_request payload SHAs to avoid timing drift. Falls back to PullRequest base.sha/head.sha for non-pull_request event payloads and updates OwnersFileHandler initialization calls to the new signature.

webhook_server/libs/github_api.py


owners_files_handler.py Read diff SHAs from GithubWebhook; remove PullRequest parameter +10/-11


• Removes the PullRequest parameter from initialize() and list_changed_files(). list_changed_files() now reads base/head SHAs from github_webhook.pr_base_sha/pr_head_sha and documents the race-condition rationale.

webhook_server/libs/handlers/owners_files_handler.py


Tests (2)
test_github_api.py Add tests for SHA storage from payload and API fallback +107/-0


• Introduces coverage verifying payload SHAs are preferred for pull_request events and that non-PR events fall back to PullRequest API SHAs. Mocks cloning/handler initialization to focus assertions on SHA persistence behavior.

webhook_server/tests/test_github_api.py


test_owners_files_handler.py Update tests for new handler signatures and SHA source +8/-10


• Updates initialize() and list_changed_files() tests to match the removed PullRequest parameter. Adjusts list_changed_files test setup to provide SHAs via the mocked GithubWebhook instance.

webhook_server/tests/test_owners_files_handler.py



@qodo-code-review

qodo-code-review Bot commented Jun 9, 2026


Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (2) 📎 Requirement gaps (0)



Action required

1. Ignored SHA fetch failure ✓ Resolved 🐞 Bug ☼ Reliability
Description
In _clone_repository(), when a payload SHA is missing locally the code runs `git fetch origin {sha}` but does not check the returned success flag, so a failed fetch is silently ignored and later git operations can still fail with less actionable errors.
Code

webhook_server/libs/github_api.py[R408-413]

+                            await run_command(
+                                command=f"{git_cmd} fetch origin {sha}",
+                                log_prefix=self.log_prefix,
+                                redact_secrets=[github_token],
+                                mask_sensitive=self.mask_sensitive,
+                            )
Evidence
The new code path explicitly fetches missing SHAs but drops the fetch result, even though
run_command() provides a boolean success signal that the rest of _clone_repository() uses to
fail fast on git errors.

webhook_server/libs/github_api.py[392-413]
webhook_server/utils/helpers.py[301-403]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`_clone_repository()` attempts to recover missing payload SHAs by fetching them, but it ignores the `run_command()` return value for that fetch. If the fetch fails, the clone step continues and downstream code may still fail due to missing objects.

### Issue Context
`run_command()` returns `(success, stdout, stderr)` and indicates failure via `success=False` when return code is non-zero.

### Fix Focus Areas
- webhook_server/libs/github_api.py[392-413]

### Suggested fix
- Capture `(rc_fetch, out, err)` from the explicit `git fetch origin {sha}` call.
- If `not rc_fetch`, log an error (redacting as needed) and raise `RuntimeError` immediately so failures are surfaced at the correct step (the recovery fetch), rather than later during diffing.
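
A minimal sketch of that fail-fast shape, assuming a helper that returns `(success, stdout, stderr)` as described above. The `run_command` stand-in and the `fetch_sha_or_fail` name are hypothetical and much simpler than the project's helper (no redaction or masking):

```python
import asyncio
import shlex


async def run_command(command: str) -> tuple[bool, str, str]:
    """Hypothetical stand-in: run a command and return (success, stdout, stderr)."""
    proc = await asyncio.create_subprocess_exec(
        *shlex.split(command),
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out, err = await proc.communicate()
    return proc.returncode == 0, out.decode(), err.decode()


async def fetch_sha_or_fail(git_cmd: str, sha: str, runner=run_command) -> None:
    """Fetch one commit and raise at the fetch step if the fetch fails."""
    rc_fetch, _, err = await runner(f"{git_cmd} fetch origin {sha}")
    if not rc_fetch:
        # Surface the failure here instead of letting a later git diff fail.
        raise RuntimeError(f"git fetch of {sha[:7]} failed: {err.strip()}")
```

Passing the runner as a parameter is only a convenience for exercising the failure path without a real repository.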



2. pr_base_sha/pr_head_sha typing incomplete ✓ Resolved 📘 Rule violation ⚙ Maintainability
Description
GithubWebhook sets pr_base_sha/pr_head_sha only inside process() (and only one branch uses
inline annotations), but the attributes are not declared on the class or in __init__, which breaks
strict mypy expectations and can cause attr-defined errors where these fields are accessed (e.g.,
in OwnersFileHandler). This reduces type safety and can fail CI type checks.
Code

webhook_server/libs/github_api.py[R599-614]

+            # Store PR SHAs: prefer webhook payload (avoids race condition with live API)
+            # Fall back to PullRequest object for non-pull_request events (issue_comment, check_run, etc.)
+            pr_payload = self.hook_data.get("pull_request")
+            if (
+                isinstance(pr_payload, dict)
+                and isinstance(pr_payload.get("base"), dict)
+                and isinstance(pr_payload.get("head"), dict)
+            ):
+                # pull_request event — base.sha and head.sha guaranteed by GitHub webhook spec
+                self.pr_base_sha: str = pr_payload["base"]["sha"]
+                self.pr_head_sha: str = pr_payload["head"]["sha"]
+            else:
+                self.pr_base_sha, self.pr_head_sha = await asyncio.gather(
+                    github_api_call(lambda: pull_request.base.sha, logger=self.logger, log_prefix=self.log_prefix),
+                    github_api_call(lambda: pull_request.head.sha, logger=self.logger, log_prefix=self.log_prefix),
+                )
Evidence
Compliance rule 9 requires complete type hints under strict mypy. The PR introduces new instance
attributes (pr_base_sha/pr_head_sha) but does not declare them in __init__ or as class-level
annotations; instead they are conditionally created in process(), and only the payload branch uses
inline annotations, which is not sufficient for strict attribute typing.

CLAUDE.md: Use Complete Type Hints (Mypy Strict Mode)
webhook_server/libs/github_api.py[599-614]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GithubWebhook.pr_base_sha`/`GithubWebhook.pr_head_sha` are introduced but not declared as attributes on the class (or in `__init__`), and only one assignment branch includes an inline annotation. In strict mypy mode this commonly triggers `attr-defined`/incomplete attribute typing issues.

## Issue Context
These attributes are later read by `OwnersFileHandler.list_changed_files()`, so the class should explicitly declare them (preferably via class-level annotations without fake default values).

## Fix Focus Areas
- webhook_server/libs/github_api.py[109-134]
- webhook_server/libs/github_api.py[599-614]
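
As an illustration of "class-level annotations without fake default values", here is a small sketch. `WebhookSketch` and its simplified `process()` signature are hypothetical; only the attribute names come from the PR:

```python
class WebhookSketch:
    # Annotations declare the attributes for the type checker but assign
    # nothing at runtime, so no placeholder value can leak into later code.
    pr_base_sha: str
    pr_head_sha: str

    def process(self, base_sha: str, head_sha: str) -> None:
        # Hypothetical simplified signature, used here only to set the fields.
        self.pr_base_sha = base_sha
        self.pr_head_sha = head_sha
```

With this shape, reading `pr_base_sha` before `process()` has run raises `AttributeError` rather than returning a placeholder.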



3. Defensive pr_payload.get checks 📘 Rule violation ≡ Correctness
Description
The new SHA selection logic uses .get()/isinstance()/"sha" in ... guards and silently falls
back to live PR API SHAs when webhook payload fields are missing. This hides malformed
pull_request payloads instead of failing fast per the webhook spec expectations.
Code

webhook_server/libs/github_api.py[R601-613]

+            pr_payload = self.hook_data.get("pull_request", {})
+            if (
+                isinstance(pr_payload, dict)
+                and "sha" in pr_payload.get("base", {})
+                and "sha" in pr_payload.get("head", {})
+            ):
+                self.pr_base_sha: str = pr_payload["base"]["sha"]
+                self.pr_head_sha: str = pr_payload["head"]["sha"]
+            else:
+                self.pr_base_sha, self.pr_head_sha = await asyncio.gather(
+                    github_api_call(lambda: pull_request.base.sha, logger=self.logger, log_prefix=self.log_prefix),
+                    github_api_call(lambda: pull_request.head.sha, logger=self.logger, log_prefix=self.log_prefix),
+                )
Evidence
The checklist requires failing fast on malformed webhook payloads and discourages unnecessary
defensive programming for required webhook fields. The added logic uses .get() and key-existence
checks and then falls back to API-derived SHAs, which prevents malformed payloads from surfacing as
errors.

CLAUDE.md: Webhook Payload Handling Must Follow the GitHub Webhook Specification (Fail on Malformed Payloads)
CLAUDE.md: Avoid Unnecessary Defensive Programming (Fail-Fast by Default)
webhook_server/libs/github_api.py[601-613]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The code defensively checks for `pull_request.base.sha`/`pull_request.head.sha` in the webhook payload and falls back to live API values when missing. For `pull_request` events, these fields are required and malformed payloads should fail fast (raise `KeyError`/`TypeError`) rather than being silently masked.

## Issue Context
This defensive fallback can hide webhook payload/schema problems and undermines the guarantee that required webhook fields are present per spec.

## Fix Focus Areas
- webhook_server/libs/github_api.py[601-613]
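
One way to read this guidance, sketched under the assumption that the event type is known at the point where the SHAs are chosen. The `payload_shas` function and its `event_type` parameter are hypothetical and do not appear in the PR:

```python
from typing import Any, Optional


def payload_shas(event_type: str, hook_data: dict[str, Any]) -> Optional[tuple[str, str]]:
    """Return (base_sha, head_sha) from the payload, or None for non-PR events."""
    if event_type != "pull_request":
        # Other events carry no PR SHAs; the caller uses the API fallback.
        return None
    # Direct indexing: a malformed pull_request payload raises KeyError or
    # TypeError here instead of silently taking the fallback path.
    pr = hook_data["pull_request"]
    return pr["base"]["sha"], pr["head"]["sha"]
```

The fallback then depends on the event type, not on which fields happen to be present in the payload.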



4. Payload SHAs not fetched 🐞 Bug ☼ Reliability
Description
OwnersFileHandler.list_changed_files() now diffs using GithubWebhook.pr_base_sha/pr_head_sha from
the webhook payload, but _clone_repository() still fetches only the current base ref and current PR
ref from the live PullRequest object. If the payload SHAs aren’t present in the local clone (e.g.,
PR retarget/force-push before processing), git diff fails and list_changed_files raises
RuntimeError, aborting initialization/processing.
Code

webhook_server/libs/handlers/owners_files_handler.py[R102-109]

+        # SHAs are stored on the GithubWebhook instance during process():
+        # - From webhook payload for pull_request events (avoids race condition with live API)
+        # - From PullRequest object for other event types (issue_comment, check_run, etc.)
+        base_sha = self.github_webhook.pr_base_sha
+        head_sha = self.github_webhook.pr_head_sha

        # Run git diff command on cloned repository
        # Quote clone_repo_dir to handle paths with spaces or special characters
Evidence
The PR changes list_changed_files to use stored webhook SHAs, but the clone/fetch logic still
fetches refs based on the live PullRequest object (base ref + PR head ref). Since git diff requires
both SHAs to resolve locally and the code raises RuntimeError on failure, a mismatch between fetched
refs and payload SHAs breaks processing.

webhook_server/libs/github_api.py[595-614]
webhook_server/libs/handlers/owners_files_handler.py[102-147]
webhook_server/libs/github_api.py[360-388]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`OwnersFileHandler.list_changed_files()` uses `github_webhook.pr_base_sha/pr_head_sha` (often sourced from webhook payload) to run `git diff`. However, `_clone_repository()` does not ensure those exact SHAs exist in the local clone; it fetches the current base branch ref and current PR ref from the live PR object.

When the payload SHAs aren’t reachable from the fetched refs, `git diff {base}...{head}` fails and the handler raises `RuntimeError`, stopping webhook processing.

### Issue Context
- SHAs are now stored from webhook payload in `GithubWebhook.process()`.
- Cloning/fetching still uses `pull_request.base.ref` and `refs/pull/<n>/head`, which may not contain the payload SHAs.

### Fix Focus Areas
- webhook_server/libs/github_api.py[595-620]
- webhook_server/libs/github_api.py[360-388]
- webhook_server/libs/handlers/owners_files_handler.py[102-147]

### Implementation sketch
1. After setting `self.pr_base_sha/self.pr_head_sha` (and before invoking `OwnersFileHandler.initialize()`), ensure the clone contains these commits:
  - Option A (preferred): update `_clone_repository()` to fetch by SHA in addition to refs, e.g. `git fetch origin <base_sha> <head_sha>` (or `git fetch origin <sha>` for each) when `self.pr_base_sha/pr_head_sha` are set.
  - Option B: in `list_changed_files()`, verify both SHAs exist via `git cat-file -e <sha>^{commit}`; if missing, `git fetch origin <sha>` and retry diff.
2. Keep the existing webhook-payload preference, but make local availability deterministic so `git diff` doesn’t fail due to missing objects.
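
The check-then-fetch idea in the options above can be sketched as follows. The `ensure_commits` name and the `Runner` callable (a coroutine that takes a command string and returns a success flag) are assumptions made for illustration, not the project's API:

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical runner type: executes a command string, returns True on success.
Runner = Callable[[str], Awaitable[bool]]


async def ensure_commits(git_cmd: str, shas: list[str], run: Runner) -> None:
    """Make sure every SHA resolves to a commit in the local clone."""
    for sha in shas:
        # cat-file -e exits non-zero when the object is missing or not a commit.
        if await run(f"{git_cmd} cat-file -e {sha}^{{commit}}"):
            continue
        # Missing locally: fetch that exact commit and fail if that is impossible.
        if not await run(f"{git_cmd} fetch origin {sha}"):
            raise RuntimeError(f"commit {sha[:7]} could not be fetched")
```

Running this before the diff would let a missing commit fail at the fetch step, not inside `git diff`.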




Remediation recommended

5. Invalid SHA skipped silently ✓ Resolved 📘 Rule violation ☼ Reliability
Description
In _clone_repository(), payload SHAs are validated with _SHA_PATTERN.match(sha) but invalid
values only trigger a warning and processing continues, allowing bad pr_base_sha/pr_head_sha to
reach later git operations and fail in harder-to-debug ways. Additionally, because the regex check
assumes sha is a string, malformed webhook payloads that provide non-string SHAs can raise
TypeError and abruptly abort processing instead of failing fast with a clear, meaningful
exception.
Code

webhook_server/libs/github_api.py[R399-401]

+                        if not _SHA_PATTERN.match(sha):
+                            self.logger.warning(f"{self.log_prefix} Invalid SHA format: {sha[:20]}, skipping fetch")
+                            continue
Evidence
PR Compliance ID 6 requires raising meaningful exceptions when required data is missing/invalid
rather than masking it. The code path in _clone_repository() explicitly checks the stored SHAs
against _SHA_PATTERN yet chooses to only log and continue, which defers discovery of invalid
required inputs until later operations (e.g., git diff); moreover, _SHA_PATTERN.match(sha) will
throw TypeError if sha is not a str, and since process() assigns pr_payload["base"]["sha"]
/ pr_payload["head"]["sha"] directly from the webhook payload without validating type/format,
malformed payload data can propagate into this regex call and crash processing.

CLAUDE.md: Fail Fast: Do Not Return Fake Default Values to Hide Missing or Invalid Data
webhook_server/libs/github_api.py[399-401]
webhook_server/libs/github_api.py[394-424]
webhook_server/libs/github_api.py[635-650]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`_clone_repository()` validates `self.pr_base_sha`/`self.pr_head_sha` but (a) only logs a warning and continues when the SHA format is invalid, allowing invalid required data to be used later and causing harder-to-debug failures, and (b) assumes the values are strings, so malformed webhook payloads that provide non-string SHAs can cause `re.Pattern.match()` to raise `TypeError`. Update the flow to fail fast with a clear exception for missing/invalid SHAs and ensure SHA validation is robust to non-string inputs.

## Issue Context
`pr_base_sha`/`pr_head_sha` are required for later diff computation, so invalid values should be rejected immediately per the fail-fast requirement (PR Compliance ID 6) rather than tolerated. Currently, `process()` stores SHAs directly from `hook_data["pull_request"]["base"]["sha"]` and `["head"]["sha"]` without validating type/format before `_clone_repository()` uses them; `_clone_repository()` then calls `_SHA_PATTERN.match(sha)` and, on invalid format, logs and continues, which can hide the true root cause until later git operations.

## Fix Focus Areas
- webhook_server/libs/github_api.py[394-402]
- webhook_server/libs/github_api.py[399-401]
- webhook_server/libs/github_api.py[635-650]



6. Unvalidated SHA in git ✓ Resolved 🐞 Bug ⛨ Security
Description
_clone_repository() interpolates pr_base_sha/pr_head_sha directly into `git cat-file` and `git fetch` arguments without validating they are commit SHAs; when webhook signature verification is not configured, a crafted webhook payload could supply a non-SHA value (e.g., refspec/option-like) and trigger unexpected fetch behavior or avoidable failures.
Code

webhook_server/libs/github_api.py[R398-413]

+                        rc_check, _, _ = await run_command(
+                            command=f"{git_cmd} cat-file -e {sha}^{{commit}}",
+                            log_prefix=self.log_prefix,
+                            verify_stderr=False,
+                            mask_sensitive=self.mask_sensitive,
+                        )
+                        if not rc_check:
+                            self.logger.debug(
+                                f"{self.log_prefix} Payload SHA {sha[:7]} not in clone, fetching explicitly"
+                            )
+                            await run_command(
+                                command=f"{git_cmd} fetch origin {sha}",
+                                log_prefix=self.log_prefix,
+                                redact_secrets=[github_token],
+                                mask_sensitive=self.mask_sensitive,
+                            )
Evidence
The new code uses sha directly inside git command strings, and the request handler only enforces
GitHub signature verification when a secret is configured, meaning deployments without
webhook-secret can accept arbitrary payload values for these fields.

webhook_server/libs/github_api.py[392-413]
webhook_server/app.py[384-393]
webhook_server/utils/helpers.py[349-355]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The payload-provided SHA values are used verbatim in git commands. If the webhook endpoint is running without signature verification configured, these fields are not guaranteed to be trustworthy and may not be valid commit hashes.

### Issue Context
The app only verifies webhook signatures when `webhook-secret` is configured; otherwise it proceeds without verification.

### Fix Focus Areas
- webhook_server/libs/github_api.py[392-413]

### Suggested fix
- Before running `cat-file` / `fetch`, validate each `sha` with a strict pattern (e.g., `re.fullmatch(r"[0-9a-f]{7,40}", sha)`), and raise a clear error if invalid.
- Optionally add an explicit `--` before the object/ref argument in git invocations (where supported) to prevent option-like strings being interpreted as flags.
- Consider `shlex.quote(sha)` when building the command string to ensure whitespace/special characters can’t be re-tokenized unexpectedly by `shlex.split()`.
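
A short sketch of the validation and quoting suggested above. The `validated_sha` and `fetch_command` names are hypothetical; the pattern is the one quoted in the suggestion:

```python
import re
import shlex

# Same strict pattern as in the suggestion: 7 to 40 lowercase hex characters.
_SHA_RE = re.compile(r"[0-9a-f]{7,40}")


def validated_sha(value: object) -> str:
    """Return value unchanged if it looks like a commit SHA, else raise."""
    # The isinstance check comes first so non-string payload values raise a
    # clear ValueError here and never reach the regex.
    if not isinstance(value, str) or not _SHA_RE.fullmatch(value):
        raise ValueError(f"not a commit SHA: {str(value)[:20]!r}")
    return value


def fetch_command(git_cmd: str, sha: object) -> str:
    """Build the fetch command only from a validated, quoted SHA."""
    return f"{git_cmd} fetch origin {shlex.quote(validated_sha(sha))}"
```

For a value that passes the pattern, `shlex.quote` leaves it unchanged; the quoting is a second layer in case the pattern is ever loosened.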



7. Empty default pr_*_sha 📘 Rule violation ⚙ Maintainability
Description
pr_base_sha/pr_head_sha are initialized to empty strings, which are placeholder defaults that
can mask missing/invalid state if these fields are used before being set. This weakens fail-fast
behavior and can lead to confusing downstream errors (e.g., git diff against empty SHAs).
Code

webhook_server/libs/github_api.py[R118-119]

+        self.pr_base_sha: str = ""
+        self.pr_head_sha: str = ""
Evidence
The compliance rule forbids fabricated defaults such as empty strings for required/architecturally
guaranteed data; the new code initializes both SHA fields to "", which can conceal missing
initialization and delay failures to later stages.

CLAUDE.md: Eliminate unnecessary defensive programming; use fail-fast errors instead of fake defaults
webhook_server/libs/github_api.py[118-119]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GithubWebhook.pr_base_sha`/`pr_head_sha` are initialized with fake defaults (`""`). Per compliance, missing required data should fail fast rather than being masked by placeholder values.

## Issue Context
These SHAs are intended to be populated during `process()`. Initializing them to `None` (and validating before use) prevents accidental use of empty SHAs and makes failures explicit.

## Fix Focus Areas
- webhook_server/libs/github_api.py[118-119]
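
If `None` is used as the initial value, one possible way to keep the validation in a single place is a property, so callers still see a plain `str`. This is a sketch under that assumption; `ShaHolder` is a hypothetical class, not code from the PR:

```python
from typing import Optional


class ShaHolder:
    def __init__(self) -> None:
        # None means "not populated yet"; no fake SHA value is stored.
        self._pr_base_sha: Optional[str] = None

    @property
    def pr_base_sha(self) -> str:
        # The only None check: reading before assignment fails loudly.
        if self._pr_base_sha is None:
            raise RuntimeError("pr_base_sha was read before process() set it")
        return self._pr_base_sha

    @pr_base_sha.setter
    def pr_base_sha(self, value: str) -> None:
        self._pr_base_sha = value
```

Readers of the attribute get a `str` type and need no `None` handling of their own.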



8. Missing sha key validation 🐞 Bug ☼ Reliability
Description
In GithubWebhook.process(), the branch that prefers webhook payload SHAs only checks that base/head
are dicts, then directly indexes pr_payload["base"]["sha"] and pr_payload["head"]["sha"]. If a
non-conforming payload includes base/head dicts but omits "sha", webhook processing will raise
KeyError and abort instead of falling back to the PullRequest API SHAs.
Code

webhook_server/libs/github_api.py[R610-611]

+                self.pr_base_sha = pr_payload["base"]["sha"]
+                self.pr_head_sha = pr_payload["head"]["sha"]
Evidence
The code path checks base/head are dicts but does not verify the presence of the sha key
before direct indexing, which can raise KeyError when sha is absent.

webhook_server/libs/github_api.py[601-616]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`GithubWebhook.process()` directly indexes `pr_payload["base"]["sha"]` / `pr_payload["head"]["sha"]` after only checking that `base` and `head` are dicts. If `sha` is missing in either dict (malformed/partial payload), this raises `KeyError` and aborts processing.

## Issue Context
The code intends to prefer webhook payload SHAs to avoid a race, while falling back to PullRequest API SHAs for non-PR/partial payloads.

## Fix Focus Areas
- webhook_server/libs/github_api.py[603-616]

## Suggested fix
- Extend the `if` condition to also validate that:
 - `"sha" in pr_payload["base"]` and `"sha" in pr_payload["head"]`
 - and that both values are non-empty strings.
- Otherwise, take the existing fallback path that fetches SHAs from the `PullRequest` object (or raise a clear, logged error if you prefer fail-fast).



9. Fragile payload SHA check 🐞 Bug ☼ Reliability
Description
GithubWebhook.process uses membership tests like 'sha' in pr_payload.get('base', {}) without
validating that base/head are dicts; if those fields are present but non-mapping (e.g., None),
the membership test can raise TypeError and abort processing. Because webhook signature verification
is conditional, malformed payload shapes can reach this code path and crash early.
Code

webhook_server/libs/github_api.py[R601-609]

+            pr_payload = self.hook_data.get("pull_request", {})
+            if (
+                isinstance(pr_payload, dict)
+                and "sha" in pr_payload.get("base", {})
+                and "sha" in pr_payload.get("head", {})
+            ):
+                self.pr_base_sha: str = pr_payload["base"]["sha"]
+                self.pr_head_sha: str = pr_payload["head"]["sha"]
+            else:
Evidence
The new conditional uses "sha" in pr_payload.get("base", {}) / ...get("head", {}) and then
indexes into pr_payload["base"]["sha"], which can throw if base/head aren’t dicts. The webhook
endpoint only verifies signatures when a secret is configured, so malformed payloads can reach this
logic.

webhook_server/libs/github_api.py[599-609]
webhook_server/app.py[384-393]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The new PR-SHA selection logic assumes `pull_request.base` and `pull_request.head` are dict-like when doing membership tests and indexing. If either is present but not a dict (e.g., `None`), `"sha" in ...` can raise `TypeError`, or later indexing can fail.

### Issue Context
Webhook signature verification is optional (only when `webhook-secret` is configured), so the service should be resilient to malformed payload shapes.

### Fix Focus Areas
- webhook_server/libs/github_api.py[599-614]
- webhook_server/app.py[384-393]

### Implementation sketch
Replace the membership tests with explicit type-safe extraction:
```py
pr_payload = self.hook_data.get("pull_request")
base = pr_payload.get("base") if isinstance(pr_payload, dict) else None
head = pr_payload.get("head") if isinstance(pr_payload, dict) else None
base_sha = base.get("sha") if isinstance(base, dict) else None
head_sha = head.get("sha") if isinstance(head, dict) else None

if isinstance(base_sha, str) and base_sha and isinstance(head_sha, str) and head_sha:
    self.pr_base_sha = base_sha
    self.pr_head_sha = head_sha
else:
    ...  # keep the current API fallback here
```
Optionally: validate SHA format (hex) before accepting it.




@myakove-bot

Collaborator

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: Disabled for this repository
  • Pre-commit Checks: pre-commit runs automatically if .pre-commit-config.yaml exists
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: All label categories are enabled (default configuration)

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /automerge - Enable automatic merging when all requirements are met (maintainers and approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest python-module-install - Test Python package installation
  • /retest pre-commit - Run pre-commit hooks and checks
  • /retest conventional-title - Validate commit message format
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 1 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)
  5. Verified: PR must be marked as verified

📊 Review Process

Approvers and Reviewers

Approvers:

  • myakove
  • rnetser

Reviewers:

  • myakove
  • rnetser
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
  • automerge
AI Features
  • Conventional Title: Mode: fix (claude/claude-opus-4-6[1m])
  • Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6[1m])
  • Test Oracle: Triggers: approved (claude/claude-opus-4-6[1m]); /test-oracle can be used anytime

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

Comment thread webhook_server/libs/github_api.py Outdated
Comment thread webhook_server/libs/handlers/owners_files_handler.py
Comment thread webhook_server/libs/github_api.py Outdated
@myakove myakove force-pushed the fix/issue-1096-webhook-payload-shas branch from 7ff7d45 to 92438ea Compare June 9, 2026 10:36
@qodo-code-review

qodo-code-review Bot commented Jun 9, 2026


Code review by qodo was updated up to the latest commit 92438ea

@myakove

myakove commented Jun 9, 2026

Collaborator Author

@qodo-code-review[bot]

The following review comments were reviewed and a decision was made:

webhook_server/libs/github_api.py:601 (qodo rule violation) — Defensive pr_payload.get checks

Addressed: Fixed in commit 92438ea — replaced .get()/"sha" in guards with isinstance() dict checks (defensive only for non-PR events) followed by direct ["sha"] access (fail-fast for PR events where sha is guaranteed by GitHub webhook spec). This aligns with AGENTS.md anti-defensive programming: defensive checks only for genuinely optional parameters (non-PR event types), fail-fast for guaranteed fields.

webhook_server/libs/handlers/owners_files_handler.py:102 (qodo bug) — Payload SHAs not fetched

Addressed: This is a pre-existing race condition that existed before this PR — the old code also used SHAs that could become stale on force-push. The _clone_repository() fetches refs/pull/N/head which contains the head SHA, and the base SHA is reachable via the fetched base ref. The existing git diff error handling already catches and raises RuntimeError if SHAs aren't reachable. Added a docstring note documenting this as a known limitation. Updated issue #1096 spec.

webhook_server/libs/github_api.py:601 (qodo bug) — Fragile payload SHA check

Addressed: Fixed in the same commit 92438ea — the isinstance(pr_payload.get("base"), dict) and isinstance(pr_payload.get("head"), dict) checks handle the case where base/head are non-dict (including None). Only when both are dicts do we access ["sha"] directly.
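The guard pattern described in this comment can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function name `payload_shas` and the `None` return for non-PR events are assumptions made for the example.

```python
from typing import Any, Dict, Optional, Tuple


def payload_shas(hook_data: Dict[str, Any]) -> Optional[Tuple[str, str]]:
    """Return (base_sha, head_sha) from a webhook payload, or None.

    Hypothetical helper: None signals "no usable pull_request payload",
    in which case the caller would fall back to the PullRequest API object.
    """
    pr_payload = hook_data.get("pull_request")
    if not isinstance(pr_payload, dict):
        # Non-PR events legitimately lack this dict.
        return None

    base = pr_payload.get("base")
    head = pr_payload.get("head")
    if not (isinstance(base, dict) and isinstance(head, dict)):
        # Covers missing keys and explicit None values alike.
        return None

    # Fail-fast: once both are dicts, "sha" is expected to exist,
    # so a missing key raises KeyError instead of being masked.
    return base["sha"], head["sha"]
```

A payload whose `base` is an empty dict raises `KeyError` here rather than silently falling back, which is the fail-fast behaviour the comment argues for.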

Comment thread webhook_server/libs/github_api.py
@qodo-code-review

qodo-code-review Bot commented Jun 9, 2026

Code review by qodo was updated up to the latest commit 5a40c0f

Comment thread webhook_server/libs/github_api.py
Comment thread webhook_server/libs/github_api.py Outdated
@myakove

myakove commented Jun 9, 2026

Collaborator Author

@qodo-code-review[bot]

The following review comments were reviewed and a decision was made:

webhook_server/libs/github_api.py:118 (qodo rule violation) — Empty default pr_*_sha

Addressed: By design — empty string defaults provide mypy strict type safety. Using None would require Optional[str] and None-checks everywhere, violating anti-defensive programming. Attributes are always overwritten in process() before any handler reads them.

webhook_server/libs/github_api.py:610 (qodo bug) — Missing sha key validation

Addressed: Already addressed — isinstance(pr_payload.get('base'), dict) guards validate structure, then direct ['sha'] access is fail-fast per AGENTS.md. GitHub webhook spec guarantees these fields for PR events.
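The empty-string default argued for above might look like the sketch below. The class name `WebhookSketch` and the `short_head()` consumer are hypothetical. Only the attribute names `pr_base_sha` / `pr_head_sha` and the `process()` method name come from this thread.

```python
from typing import Any, Dict


class WebhookSketch:
    """Hypothetical stand-in for the per-request webhook object."""

    def __init__(self, hook_data: Dict[str, Any]) -> None:
        self.hook_data = hook_data
        # Declared as plain str so type checkers never see Optional[str].
        self.pr_base_sha: str = ""
        self.pr_head_sha: str = ""

    def process(self) -> None:
        # Overwrites the defaults before any consumer reads them.
        pr = self.hook_data["pull_request"]
        self.pr_base_sha = pr["base"]["sha"]
        self.pr_head_sha = pr["head"]["sha"]

    def short_head(self) -> str:
        # A consumer can use str methods directly, with no None check.
        return self.pr_head_sha[:7]
```

The trade-off is that a consumer running before `process()` would see `""` rather than an error, so the design relies on the call order stated in the comment.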

@myakove

myakove commented Jun 10, 2026

Collaborator Author

/agentic_review

@qodo-code-review

qodo-code-review Bot commented Jun 10, 2026

Code review by qodo was updated up to the latest commit 5a40c0f

@qodo-code-review

qodo-code-review Bot commented Jun 10, 2026

Code review by qodo was updated up to the latest commit 0afdbf7

Comment thread webhook_server/libs/github_api.py Outdated
Comment thread webhook_server/libs/github_api.py
@myakove

myakove commented Jun 10, 2026

Collaborator Author

@qodo-code-review[bot]

The following review comments were reviewed and a decision was made:

webhook_server/libs/github_api.py:408 (qodo bug) — Ignored SHA fetch failure

Addressed: Fixing — adding return code check on git fetch with warning log when SHA fetch fails.

webhook_server/libs/github_api.py:398 (qodo bug) — Unvalidated SHA in git

Addressed: Fixing — adding SHA format validation (hex 40-char pattern) before using in git commands.

@qodo-code-review

qodo-code-review Bot commented Jun 10, 2026

Code review by qodo was updated up to the latest commit 31f6655

Comment thread webhook_server/libs/github_api.py Outdated
@myakove

myakove commented Jun 10, 2026

Collaborator Author

@qodo-code-review[bot]

The following review comments were reviewed and a decision was made:

webhook_server/libs/github_api.py:399 (qodo rule violation) — Invalid SHA skipped silently

Addressed: Fixing — invalid SHA format now resets to empty string so API fallback path is used. Added isinstance check for non-string values.
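A minimal sketch of the validate-or-reset behaviour described here, using the 40-character lowercase hex pattern quoted in this thread as `_SHA_PATTERN`. The function name `sanitize_sha` is an assumption, and the real code also logs a warning, which is omitted here.

```python
import re

# 40 lowercase hex characters, as quoted in this thread.
_SHA_PATTERN = re.compile(r"^[0-9a-f]{40}$")


def sanitize_sha(value: object) -> str:
    """Return value if it is a well-formed SHA string, else an empty string.

    Hypothetical helper: the empty string is the signal that the caller
    should use the API-sourced SHA instead.
    """
    # The isinstance check comes first so non-string payload values
    # never reach the regex (which would raise TypeError).
    if isinstance(value, str) and _SHA_PATTERN.fullmatch(value):
        return value
    return ""
```

`fullmatch` is used in this sketch because `$` alone also matches just before a trailing newline. Whether the project uses `match` or `fullmatch` is not shown in the thread.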

@qodo-code-review

qodo-code-review Bot commented Jun 10, 2026

Code review by qodo was updated up to the latest commit c3a56ab

myakove added 5 commits June 10, 2026 19:19
…dition

- Replace live PyGithub API calls with webhook payload SHAs for pull_request events
- Fall back to API for non-PR events (issue_comment, check_run, etc.)
- Store pr_base_sha/pr_head_sha on GithubWebhook instance during process()
- Remove pull_request parameter from initialize() and list_changed_files()
- Add symmetric guards for both base and head SHA validation

Closes #1096

Declare pr_base_sha and pr_head_sha in __init__() with empty string defaults
so mypy strict mode has clear type declarations. Remove redundant inline
annotations in process() since the class-level ones cover typing.

…h safety

After fetching the PR ref, check if payload SHAs exist in the clone.
If not (force-push race condition), fetch them explicitly from origin.
This ensures git diff in list_changed_files() always has valid SHAs.

- Validate payload SHAs match 40-char hex format before using in git commands
- Check return code of git fetch for missing SHAs and log warning on failure
- Helps diagnose downstream git diff failures from unreachable SHAs

Validate SHA format and type before use in git commands. Invalid SHAs
are reset to empty string so the cat-file/fetch block is skipped and
list_changed_files() falls back to API-sourced SHAs. Prevents TypeError
on non-string payloads and unclear git diff errors from malformed SHAs.
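The check-then-fetch step from these commit messages could be sketched as below. It is an illustration under assumptions: `ensure_sha_present`, `run_git` and the injectable `run` parameter are hypothetical names, and the project's actual `_clone_repository()` is not shown in this thread. Only `git cat-file -e` and `git fetch origin <sha>` are taken from the discussion.

```python
import logging
import subprocess
from typing import Callable, List

logger = logging.getLogger(__name__)

# A runner takes an argv list and returns the process exit code.
Runner = Callable[[List[str]], int]


def run_git(argv: List[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(argv, capture_output=True, check=False).returncode


def ensure_sha_present(sha: str, clone_dir: str, run: Runner = run_git) -> bool:
    """Return True if sha is available in the clone, fetching it if needed."""
    if not sha:
        # Empty SHA means "invalid or absent": skip the git calls.
        return False

    git = ["git", "-C", clone_dir]
    # `git cat-file -e` exits 0 when the object exists in the repository.
    if run(git + ["cat-file", "-e", sha]) == 0:
        return True

    # Object missing (for example after a force-push): fetch it explicitly.
    if run(git + ["fetch", "origin", sha]) != 0:
        logger.warning("Could not fetch %s; git diff may fail later", sha)
        return False
    return True
```

Passing the runner as a parameter is only a convenience for exercising the logic without a real repository.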
@myakove myakove force-pushed the fix/issue-1096-webhook-payload-shas branch from c3a56ab to 4c96c95 Compare June 10, 2026 16:19
@qodo-code-review

Code Review by Qodo

Sorry, something went wrong

We weren't able to complete the code review on our side. Please try again

@myakove

myakove commented Jun 10, 2026

Collaborator Author

/retest build-container

@myakove

myakove commented Jun 10, 2026

Collaborator Author

@qodo-code-review[bot]

The following review comments were reviewed and a decision was made:

webhook_server/libs/github_api.py:601 (qodo rule violation) — Defensive pr_payload.get checks

Addressed: Fixed in commits 92438ea + c3a56ab. The .get('base', {})/"sha" in pattern was replaced with isinstance(pr_payload.get('base'), dict) which is justified — non-PR events (issue_comment, check_run) legitimately lack the pull_request dict. For PR events where base/head are dicts, ['sha'] is accessed directly (fail-fast). SHA format validated with _SHA_PATTERN regex, invalid SHAs reset to empty string triggering API fallback.

webhook_server/libs/handlers/owners_files_handler.py:102 (qodo bug) — Payload SHAs not fetched

Addressed: Fixed in commit 0afdbf7. _clone_repository() now explicitly checks if payload SHAs exist in the clone via git cat-file -e and fetches them with git fetch origin if missing. Handles force-push race condition.

webhook_server/libs/github_api.py:118 (qodo rule violation) — Empty default pr_*_sha

Addressed: By design per updated issue #1096 spec. Empty string defaults provide mypy strict type safety (str, not Optional[str]). Attributes are always overwritten in process() before any handler reads them. Using None would require Optional[str] annotations and None-checks everywhere, violating anti-defensive programming per AGENTS.md.

webhook_server/libs/github_api.py:610 (qodo bug) — Missing sha key validation

Addressed: Fixed in commit 92438ea. isinstance(pr_payload.get('base'), dict) and isinstance(pr_payload.get('head'), dict) validate structure. If either is non-dict (including None), falls back to API. When both are dicts, ['sha'] access is direct/fail-fast.

webhook_server/libs/github_api.py:601 (qodo bug) — Fragile payload SHA check

Addressed: Fixed in commit c3a56ab. Added _SHA_PATTERN = re.compile(r'^[0-9a-f]{40}$') validation. Non-string SHAs caught by isinstance check. Invalid format resets to empty string with warning log. Prevents TypeError on re.match and rejects malformed values.

Replace defensive isinstance checks on pull_request payload with a direct
event type check (self.github_event == 'pull_request'). For pull_request
events, base.sha and head.sha are guaranteed by the GitHub webhook spec,
so no defensive checks are needed. For other events (issue_comment,
check_run), fall back to the PullRequest API object as before.
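The event-type branch described in this commit message can be sketched as follows. The function name `resolve_pr_shas` and the `api_shas` callable are hypothetical. In the real code the fallback reads from the PullRequest API object, which is stubbed out here.

```python
from typing import Any, Callable, Dict, Tuple


def resolve_pr_shas(
    github_event: str,
    hook_data: Dict[str, Any],
    api_shas: Callable[[], Tuple[str, str]],
) -> Tuple[str, str]:
    """Return (base_sha, head_sha) for the current webhook delivery."""
    if github_event == "pull_request":
        # Direct, unguarded access: a malformed payload raises KeyError.
        pr = hook_data["pull_request"]
        return pr["base"]["sha"], pr["head"]["sha"]

    # Other events (issue_comment, check_run, ...) carry no PR SHAs,
    # so the live API lookup is used instead.
    return api_shas()
```

Taking the fallback as a callable means the API is only queried when the payload cannot supply the SHAs.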
@qodo-code-review

qodo-code-review Bot commented Jun 10, 2026

Code review by qodo was updated up to the latest commit 14ccf9a

@myakove

myakove commented Jun 10, 2026

Collaborator Author

@qodo-code-review[bot]

The following review comments were reviewed and a decision was made:

webhook_server/libs/github_api.py:601 (qodo rule violation) — Defensive pr_payload.get checks

Addressed: Fixed in commit 14ccf9a — replaced defensive isinstance checks with a direct event type check (self.github_event == "pull_request"). For pull_request events, base.sha/head.sha are guaranteed by the GitHub webhook spec — no defensive guards needed. For other events (issue_comment, check_run), the API fallback is unchanged.

webhook_server/libs/handlers/owners_files_handler.py:102 (qodo bug) — Payload SHAs not fetched

Not addressed: The code already handles this case. _clone_repository() (lines 393-428) explicitly fetches payload SHAs when they are not present in the clone — see the pr_base_sha/pr_head_sha fetch loop with git cat-file -e check and git fetch origin {sha} fallback. This was added specifically for the force-push scenario described in this finding.

webhook_server/libs/github_api.py:118 (qodo rule violation) — Empty default pr_*_sha

Not addressed: These are class-level type annotations initialized in __init__(), always set in process() before any handler instantiation. The empty string default is never observed by downstream code — process() either sets them from the webhook payload or from the PullRequest API object. This follows the same pattern as parent_committer: str = "" on line 119 which is also set in process().

webhook_server/libs/github_api.py:610 (qodo bug) — Missing sha key validation

Not addressed: The code diff in this finding shows pr_payload["base","sha"] (tuple key) which is not the actual code. The actual code uses pr_payload["base"]["sha"] (nested dict access). After the fix in commit 14ccf9a, this is now self.hook_data["pull_request"]["base"]["sha"] — direct access without any guards, exactly as the fail-fast philosophy requires.

webhook_server/libs/github_api.py:601 (qodo bug) — Fragile payload SHA check

Not addressed: This finding is moot after the fix in commit 14ccf9a. The defensive isinstance/.get() checks have been replaced with a direct event type check (self.github_event == "pull_request"). The code now directly accesses self.hook_data["pull_request"]["base"]["sha"] — no membership tests, no .get() calls. For non-pull_request events, the API fallback path is used.

@myakove

myakove commented Jun 10, 2026

Collaborator Author

@qodo-code-review[bot]

The following review comments were reviewed and a decision was made:

webhook_server/libs/github_api.py:601 (qodo rule violation) — Defensive pr_payload.get checks

Addressed: Already fixed in commit 14ccf9a — replaced defensive isinstance checks with direct event type check (self.github_event == "pull_request"). No defensive guards for guaranteed webhook spec fields.

webhook_server/libs/handlers/owners_files_handler.py:102 (qodo bug) — Payload SHAs not fetched

Not addressed: Already addressed — _clone_repository() explicitly fetches payload SHAs when not present in clone (lines 393-428). See the git cat-file -e check and git fetch origin {sha} fallback.

webhook_server/libs/github_api.py:118 (qodo rule violation) — Empty default pr_*_sha

Not addressed: Architecture guarantee — pr_base_sha/pr_head_sha are always set in process() before any handler. Same pattern as parent_committer: str = "" on line 119.

webhook_server/libs/github_api.py:610 (qodo bug) — Missing sha key validation

Not addressed: Stale code diff — actual code uses pr_payload["base"]["sha"] (nested dict), not pr_payload["base","sha"] (tuple key). After commit 14ccf9a, uses self.hook_data["pull_request"]["base"]["sha"] directly.

webhook_server/libs/github_api.py:601 (qodo bug) — Fragile payload SHA check

Not addressed: Moot after commit 14ccf9a — defensive isinstance/.get() checks replaced with direct event type check. No membership tests remain.

@myakove myakove merged commit b8c21fc into main Jun 10, 2026
6 of 8 checks passed
@myakove myakove deleted the fix/issue-1096-webhook-payload-shas branch June 10, 2026 17:23
@myakove-bot

Collaborator

New container for ghcr.io/myk-org/github-webhook-server:latest published

Development

Successfully merging this pull request may close these issues.

fix: list_changed_files uses live API SHAs instead of webhook payload SHAs

2 participants