Skip to content

tests(oauth-draft): expand coverage 62%→99%; add CI tests workflow#6

Merged
potiuk merged 5 commits into
mainfrom
add-python-tests
Apr 29, 2026
Merged

tests(oauth-draft): expand coverage 62%→99%; add CI tests workflow#6
potiuk merged 5 commits into
mainfrom
add-python-tests

Conversation

@potiuk

@potiuk potiuk commented Apr 29, 2026

Copy link
Copy Markdown
Member

Summary

Two related changes:

  1. More tests for oauth-draft — pushes Python coverage from 62% → 99% (22 → 58 unit tests).
  2. A dedicated tests.yml CI workflow — runs the per-tool pytest suites as a parallel matrix job, so test results show up as separate CI checks rather than buried inside the prek workflow's bundled pytest lines.

Tests have been re-run locally; all 158 tests across both projects pass.

Test additions (oauth-draft)

File Added Covers
tests/test_credentials.py 3 tests refresh_access_token — success path, HTTP error, missing access_token in payload
tests/test_create_draft.py 13 tests read_body, api_get/api_post shapes + error path, latest_reply_headers delegation, create_draft payload (with/without threadId), main end-to-end (mocked network), main --no-reply-headers path
tests/test_mark_threads_read.py 9 tests list_thread_ids pagination + HTTP error, modify_thread payloads + error, main dry-run / --execute / --max / failure-rc-1 paths
tests/test_setup_creds.py 11 tests (new file) detect_from_address (env / git / errors / empty), parse_args, main error paths, main writes credentials atomically with mode 600 in mode-700 parent, --rm-client-secrets, installed vs web client_secrets shape

All network-touching code paths use unittest.mock.patch against urllib.request.urlopen; no test hits the real Gmail API.

Coverage report on src/oauth_draft/ after:

__init__.py             100%
create_draft.py          99%
credentials.py          100%
mark_threads_read.py     99%
setup_creds.py           89%
TOTAL                    99%

The remaining ~1% is the if __name__ == "__main__": shim in each module and a handful of chmod-warning branches in setup_creds that are awkward to provoke deterministically.

CI workflow (.github/workflows/tests.yml)

  • Triggers on PR + push to main.
  • Per-project pytest matrix with fail-fast: false — both projects' pass/fail visible independently.
  • Uses uv run --directory <project> --group dev pytest (not --project) — pytest's testpaths = ["tests"] is cwd-relative, so we need to actually cd in.
  • The existing prek workflow's pytest hooks still run (defense in depth); this workflow is the visible signal.

What's not in this PR

generate-cve-json coverage stays at 79% — the gap is 207 lines of cve_json.py, mostly error/edge branches in the 1700-line module. Pushing it higher would balloon the PR substantially; happy to follow up in a separate one if you want it pushed.

🤖 Generated with Claude Code

potiuk added 5 commits April 29, 2026 02:43
Adds 36 new unit tests for the oauth-draft Python project (22 → 58
total) and a dedicated `.github/workflows/tests.yml` CI workflow
that runs the per-tool pytest suites as a parallel matrix job.

oauth-draft test additions:

- `tests/test_credentials.py` — 3 tests for `refresh_access_token`:
  successful POST + access-token return, HTTP error path, payload
  with no `access_token` field. Mocks `urllib.request.urlopen`
  with a small `_FakeResponse` context-manager stand-in (used in
  every network-mocked test below as well).
- `tests/test_create_draft.py` — 13 tests covering:
  - `read_body` (file path, stdin via "-", stdin via None);
  - `api_get` / `api_post` (request shape + response parsing);
  - `api_post` HTTP error path;
  - `latest_reply_headers` (delegates to `api_get` then to
    `headers_from_thread`);
  - `create_draft` payload shape with and without `threadId`;
  - `main` end-to-end with mocked `refresh_access_token`,
    `latest_reply_headers`, and `api_post` — verifies the MIME
    body posted to `/drafts` carries the right From/To/Subject/
    In-Reply-To headers and the body content;
  - `main --no-reply-headers` skips the thread-headers lookup.
- `tests/test_mark_threads_read.py` — 9 tests covering:
  - `list_thread_ids` pagination (multi-page response stitched
    correctly; second-page request includes `pageToken`);
  - `list_thread_ids` HTTP error path;
  - `modify_thread` payload shape (with both label lists, with
    only one label list);
  - `modify_thread` HTTP error path;
  - `main` dry-run (lists IDs, never calls modify);
  - `main --execute` calls `modify_thread` per thread with the
    default `removeLabelIds=[UNREAD]`;
  - `main --max` truncates the modify count;
  - `main` returns 1 when any modify call fails.
- `tests/test_setup_creds.py` — new file, 11 tests covering the
  whole `setup_creds` module (was 22% covered → ~89%):
  - `detect_from_address` (env var, git-config fallback, missing
    git, git error, empty git output);
  - `parse_args` (defaults + overrides);
  - `main` error paths (no from-address, missing client_secrets,
    flow returns no refresh_token);
  - `main` success path: writes credentials JSON atomically with
    mode 600 in a 700 parent dir;
  - `main --rm-client-secrets` deletes the input file;
  - `main` handles both `installed` and `web` shapes of the
    Google client_secrets.json.

Coverage on `src/oauth_draft/` after the additions:

```
src/oauth_draft/__init__.py             100%
src/oauth_draft/create_draft.py          99%
src/oauth_draft/credentials.py          100%
src/oauth_draft/mark_threads_read.py     99%
src/oauth_draft/setup_creds.py           89%
TOTAL                                    99%
```

The remaining ~1% is the `if __name__ == "__main__":` shim in each
module and a handful of chmod-error warning paths in `setup_creds`
that are awkward to provoke deterministically.

CI workflow (`.github/workflows/tests.yml`):

- Runs on every PR and push to main.
- Per-project pytest matrix (`fail-fast: false`) so the two
  projects' results are visible as separate CI checks rather than
  buried in the bundled `pytest` lines of the `prek` workflow.
- Uses `uv run --directory <project>` (not `--project`) to move
  cwd into the project root — pytest needs it because each
  project's `[tool.pytest.ini_options] testpaths = ["tests"]`
  resolves relative to cwd.
- The `prek` workflow's pytest hooks still run (defense in
  depth); this workflow is the visible signal, not the gate.

Other:

- `.gitignore` adds `.coverage` to both Python projects so locally-
  run `coverage` doesn't dirty the working tree.

Generated-by: Claude Code (Claude Opus 4.7)
GitHub Actions does not allocate a TTY for workflow steps, so by
default `pytest`, `ruff`, `mypy`, and uv all fall back to monochrome
output — making test failures harder to scan in the CI log viewer
(which itself does render ANSI escapes).

Two fixes, belt-and-braces:

- Set `FORCE_COLOR=1` and `PY_COLORS=1` at the job level. These are
  the de-facto signals that uv, ruff, mypy, pytest, and most other
  Python tooling honour to opt back into colour without a TTY.
- Pass `--color=yes` explicitly to the pytest invocation, for tools
  that read the CLI flag rather than the env var.

Generated-by: Claude Code (Claude Opus 4.7)
…attach)

Adds 35 unit tests in a new `tests/test_cli.py` covering the parts
of `generate_cve_json.cve_json` that the existing 100-test suite
didn't exercise: the CLI surface (`parse_args`, `main`), the `gh`
subprocess wrappers (`fetch_issue`, `_gh_api_json`, `_fetch_issue`),
and the issue-body attachment helpers (`_splice_attachment_into_body`,
`attach_to_issue`).

Coverage on `src/generate_cve_json/cve_json.py`: 65% → 97%
(stmts 590, miss 18). The remaining ~3% is a handful of defensive
branches that are awkward to provoke deterministically. Test count:
100 → 135.

Test groupings (each a TestX class for grep-ability):

- TestParseArgs (4) — minimal positional, --stdin, repeatable
  flags accumulate, full override matrix.
- TestSpliceAttachment (4) — replace existing block, legacy
  single-marker fallback, append after `### CVE tool link` when
  no existing attachment, append at end when no CVE-tool-link
  field.
- TestFetchIssue (3) — happy path returning (title, body), gh
  missing → RuntimeError, gh non-zero → RuntimeError.
- TestGhApiJson (5) — parsed-JSON return, empty-stdout returns
  `{}`, body_payload writes a temp JSON file passed via `--input`,
  gh missing, gh non-zero.
- TestFetchIssueRest (2) — happy path, defensive path when
  `_gh_api_json` returns a non-dict.
- TestAttachToIssue (3) — appends when no existing marker, replaces
  when existing marker present, skips PATCH when the spliced body is
  byte-identical to the existing body (idempotent).
- TestMainErrors (7) — every `return 2` path: --attach with --stdin,
  --attach without issue, missing issue without --stdin, gh failure
  surfaces as `return 1`, --product-for malformed (no `=`),
  --product-for with empty value, --config not found.
- TestMainHappyPath (7) — --stdin emits full envelope, --no-envelope
  emits CNA only, --output writes a file, fetch path uses gh,
  --attach embeds (was_update=False), --attach replaces
  (was_update=True), attach failure surfaces as `return 1`.

All network-touching code paths use `unittest.mock.patch` against
`subprocess.run` or the framework's own `_gh_api_json` /
`fetch_issue` / `attach_to_issue`. No test hits a real `gh` CLI or
the GitHub API.

A `_issue_body()` helper fixture builds a minimal tracker issue body
with the standard heading-delimited fields the tool consumes.

Generated-by: Claude Code (Claude Opus 4.7)
Three CodeQL findings — two warnings on
`generate_cve_json.cve_json.classify_reference` and two errors on
`oauth_draft.setup_creds.main` — addressed in one commit.

generate-cve-json — `py/incomplete-url-substring-sanitization`
(warning, two instances on the same line):

  if "lists.apache.org" in url or "security.apache.org" in url:

A substring `in url` test would also flag attacker-controlled URLs
like `https://evil.example/?q=lists.apache.org` as
`vendor-advisory`. Fixed by parsing the URL with
`urllib.parse.urlparse` and checking the hostname exactly:

  host = (urllib.parse.urlparse(url).hostname or "").lower()
  if host in ("lists.apache.org", "security.apache.org"):

Added 4 regression tests in `TestClassifyReference`:
  - `security.apache.org` is tagged (parity with `lists.apache.org`);
  - `https://evil.example/?q=lists.apache.org` is NOT tagged;
  - `https://evil.example/security.apache.org` is NOT tagged;
  - `https://lists.apache.org.evil.example/x` is NOT tagged
    (subdomain trick);
  - malformed URL string returns no tags.

oauth-draft setup_creds — `py/clear-text-logging-sensitive-data`
(error, two instances):

  print(f"Running OAuth flow against {client_secrets} ...")
  print(f"Removed {client_secrets}.")

The variable held a `Path` object (the filesystem path to
`client_secrets.json`), not the secret content. CodeQL's data-flow
analysis flags it because the variable name matches its
sensitive-data heuristic. Renamed `client_secrets` →
`client_secrets_path` so the print sites are clearly logging a
path, not a secret. A short `# Variable named `_path` …` comment
above the rename documents the rationale so a future reader (or
PR reviewer) can see why the name diverges from the CLI argument
name `client_secrets`.

The existing tests in `tests/test_setup_creds.py` already cover
both branches that touch this variable; no test changes needed.

Generated-by: Claude Code (Claude Opus 4.7)
Pin down the two `print(... {client_secrets_path} ...)` log lines
that were renamed in the previous commit to address the CodeQL
`py/clear-text-logging-sensitive-data` findings. Without these
assertions, a future refactor that drops the path from the log
output would silently regress the user-facing UX without any test
catching it.

Both `test_main_writes_credentials_file` and
`test_main_with_rm_client_secrets_deletes_input` already exercised
the relevant code paths; the additions just take a `capsys`
snapshot and assert the path appears in the expected line:

  - Startup banner: "Running OAuth flow against <path> ..."
  - On --rm-client-secrets:    "Removed <path>."

Generated-by: Claude Code (Claude Opus 4.7)
@potiuk potiuk merged commit 53b6d54 into main Apr 29, 2026
7 checks passed
@potiuk potiuk deleted the add-python-tests branch April 29, 2026 01:13
@andreahlert andreahlert added the mode:platform Substrate / infra — not a mode (sandbox, CI, validators) label May 7, 2026
andreahlert referenced this pull request in andreahlert/magpie May 15, 2026
- Replace SPDX with full ASF v2 license header (jbonofre)
- Clarify binding audience: contributors, committers, PMC, unmodified adopters (jbonofre)
- Extend #5 with deterministic-first execution to save tokens (potiuk)
- Extend #6 with explicit human sign-off for outbound human communication (RussellSpitzer)
- Rework #9 around capability floor instead of "same code on all backends", add justified-and-minimized clause, add end-to-end single-machine config requirement (RussellSpitzer)
- Standardize on US English (analyze, artifact, behavior, catalog, license, specialized)
potiuk referenced this pull request in andreahlert/magpie May 24, 2026
- Replace SPDX with full ASF v2 license header (jbonofre)
- Clarify binding audience: contributors, committers, PMC, unmodified adopters (jbonofre)
- Extend #5 with deterministic-first execution to save tokens (potiuk)
- Extend #6 with explicit human sign-off for outbound human communication (RussellSpitzer)
- Rework #9 around capability floor instead of "same code on all backends", add justified-and-minimized clause, add end-to-end single-machine config requirement (RussellSpitzer)
- Standardize on US English (analyze, artifact, behavior, catalog, license, specialized)
potiuk referenced this pull request in andreahlert/magpie May 27, 2026
- Replace SPDX with full ASF v2 license header (jbonofre)
- Clarify binding audience: contributors, committers, PMC, unmodified adopters (jbonofre)
- Extend #5 with deterministic-first execution to save tokens (potiuk)
- Extend #6 with explicit human sign-off for outbound human communication (RussellSpitzer)
- Rework #9 around capability floor instead of "same code on all backends", add justified-and-minimized clause, add end-to-end single-machine config requirement (RussellSpitzer)
- Standardize on US English (analyze, artifact, behavior, catalog, license, specialized)
potiuk pushed a commit that referenced this pull request Jun 2, 2026
* docs(principles): add operational principles document

PRINCIPLES.md restates RFC-AI-0004's six baseline principles in their
operational shape and adds the project-internal commitments the RFC
deliberately defers: eval as release blocker, contributor-sentiment
gating, no default telemetry, reproducibility from signed source,
maintainer education shipped with the platform.

19 ordered principles. Earlier outranks later when they collide.
Amendment process matches the release-vote process (>=3 binding +1,
no binding -1, 72h window, no lazy consensus).

Positioned as project-internal operating contract, not a competing RFC.

* docs(principles): address review feedback on PRINCIPLES.md

- Replace SPDX with full ASF v2 license header (jbonofre)
- Clarify binding audience: contributors, committers, PMC, unmodified adopters (jbonofre)
- Extend #5 with deterministic-first execution to save tokens (potiuk)
- Extend #6 with explicit human sign-off for outbound human communication (RussellSpitzer)
- Rework #9 around capability floor instead of "same code on all backends", add justified-and-minimized clause, add end-to-end single-machine config requirement (RussellSpitzer)
- Standardize on US English (analyze, artifact, behavior, catalog, license, specialized)

* docs(principles): disambiguate 'language-independent' as 'programming-language independent' (RussellSpitzer)

* docs(principles): qualify P6 merge rule as 'unilaterally' to resolve auto-merge tension (justinmclean)

* docs(principles): scope P3 'first-class' as adopter, clarify amendment proposal path (justinmclean)

* docs(principles): add PMC adjudication path for disputed committer blocks (justinmclean)

* docs(principles): scope P6 impersonation claim to messages read as maintainer-authored (justinmclean)

* docs(principles): replace dangling 'same family' clause with single-principle interpretation rule (justinmclean)

* docs(principles): add generated TOC

* docs(principles): align P14 with upstream Skills contract

A skill is always a directory with SKILL.md as entrypoint, even
for one-file workflows. SKILL.md stays under 500 lines; longer
reference material moves into sibling markdown linked one level
deep. Matches the runtime contract documented at
https://code.claude.com/docs/en/skills and
https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices,
and reflects how skills in this repo (contributor-nomination,
pr-management-code-review, pr-management-mentor) are already
authored.

* docs(principles): make P6 merge clause explicit on subject and close auto-merge gap (justinmclean)

* docs(principles): resolve disputed blocks via PMC consensus first, vote as last resort (justinmclean)

* docs(principles): soften P11 reproducibility requirement

Addresses review feedback that 'bytes are identical' is too strong
for a project-agnostic framework. Toolchains vary in their ability
to produce byte-identical output; some have known divergence sources
(timestamps, file ordering, path embedding).

P11 now requires byte-identical builds where achievable, and where
the toolchain makes that impractical, the release process must
document the divergence and provide an alternative local verification
mechanism. The 'no code without reviewed PR' guard stays absolute.

Refs: PR #147 review

* docs(principles): move ASF license header to top of file

The doctoc-generated TOC was placed above the Apache license header,
which breaks tooling that expects the license notice in the first
few lines of the file. Move the license block to line 1, followed
by the TOC.

Refs: PR #147 review

* docs(principles): align amendment process and blocking rules with ASF policy

Three fixes from PR #147 review by @justinmclean:

1. Amendment vote model: 'release vote' -> 'code-modification vote'
   The encoded rule (>=3 binding +1, any binding -1 vetoes) matches
   ASF consensus approval for code modifications, not majority
   approval for releases.

2. Veto-justification requirement: A binding -1 must now include a
   technical justification. Without one the veto is invalid and has
   no weight, matching ASF voting policy.

3. Generative tooling disclosure: P17 now requires a
   'Generated-by: <tool>' token in commit messages for AI-authored
   contributions, per ASF Generative Tooling Guidance.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

mode:platform Substrate / infra — not a mode (sandbox, CI, validators)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants