committer-onboarding — post-vote onboarding for committers and PMC members by justinmclean · Pull Request #229 · apache/magpie

justinmclean · 2026-05-19T08:40:46Z

Adds a new committer-onboarding skill that walks a nominator through
every post-vote step after a committer or PMC election passes, for both
incubating podlings and graduated TLPs.

What's included

Skill (`.claude/skills/committer-onboarding/`)

Four-step workflow covering three scenarios (new-committer,
committer-to-pmc, direct-to-pmc):

Step 0 — validate the vote result: binding vote counts, 72-hour
period check, veto handling (veto requires a justification; code
quality alone is not sufficient — must relate to conduct or fitness)
Step 1 — ICLA check and communications: three ICLA states (on file /
submitted-not-processed / not filed); draft congratulations email and
secretary account-creation request; checks that the nominator is a PMC
chair or ASF Member before drafting the request; detects when the ICLA
already included project + desired ID (automatic secretary flow)
Step 2 — post-account karma checklist: Whimsy roster update (PPMC
for podlings, PMC/committer for TLPs), issue tracker (Jira only; GitHub
Issues projects skip this), mailing list self-service, welcome
announcement draft
Step 3 — completion summary with pending_items list for anything
not yet done

Detail files:

detail/email-templates.md — congratulations (3 ICLA variants),
secretary request, welcome announcement
detail/karma-grant.md — step-by-step Whimsy, mailing list, and Jira
instructions; clarifies that committee-info.txt is the authoritative
record for PMC membership (not LDAP), and that Whimsy is a convenient
derived interface that updates both

Eval suite (`tools/skill-evals/evals/committer-onboarding/`)

24 cases across all four steps.

Privacy wiring (`tools/privacy-llm/wiring.md`)

Registers committer-onboarding in the skills table. The skill reads
private-list vote content, so Step 0 includes a privacy pre-flight
gate-check (privacy-llm-check --reads-private-list) and a PII handling
table (candidate = not redacted; voters = collaborators, exempt by
default; third-party names in discussion = must be redacted).

potiuk · 2026-05-19T22:52:33Z

Should we have it as "Apache Way" family of skills? The idea for MagPie is to make it - immediately - useful both: in and out of ASF, and unlike most of the other SKILLs this one is strictly ASF specific. Nothing wrong with it, but I think we should open up to "other ways" potentially :)

justinmclean · 2026-05-20T03:02:42Z

Sure I can make this part of an ASF way family of skills. Several of the other skills have ASFism in them - in some cases, project outside teh ASF may do things that way or not.

potiuk · 2026-05-22T16:11:38Z

Sure I can make this part of an ASF way family of skills. Several of the other skills have ASFism in them - in some cases, project outside teh ASF may do things that way or not.

Yeah. I think we need to make some of those generalisations / review those skills to make sure there are no remaining "airflow-ism", "groov-ism" and "asf-isms" - or at the very least that they are there as optional things that only kick-in in ASF projects.

I think the great benefitt of Magpie being "immediately adoptable" outside of ASF is clear :) -> And I think that should be something we might even add in our PRINCIPLES and skill writing / updating agentic code to make sure that the skills (except those dedicated for a given organisation) are generic enough to be used by anyone.. This IMHO will make Magpie much stronger from day one if we have such rule - and make it "important", because theat might alleviate some questions from outside adopters - and we should be able to clearly say "Yeah - we have some ASF specific things for our PMCs that you can also adopt if you want - but the main idea is to get the project outside-usable, without having to adopt the whole Apache Way.

Address the skill-validator findings surfaced by the pytest run on this branch, plus two markdownlint MD040 errors the lint hook surfaced on the touched committer-onboarding SKILL.md: - committer-onboarding: add the standard Pattern 4 injection-guard callout. The skill reads the <vote-thread> from the mailing-list archive, candidate-supplied identity fields (name, email, desired Apache ID), and ICLA / Whimsy roster data; the callout names those surfaces explicitly. Golden rule 3 already reinforced the same principle; this adds the validator-recognised block. (HARD violation; was failing the pytest gate.) - committer-onboarding: tag two existing untagged fenced output blocks (Step 0 output, Step 3 completion summary) as `text` so markdownlint MD040 stops flagging them on touch. - security-issue-sync: add `--limit 100` to the milestone-siblings `gh issue list` count (was unbounded; silently capped at 30 on large repos). - security-issue-triage: add `--limit 100` to the reviewed-by `gh pr list` search (same reason). - setup-isolated-setup-doctor: move the docs/setup/ sandbox-troubleshooting.md reference out of the frontmatter description into the body, so the matching-layer description stays tight and the criteria-source SOFT advisory clears. The body still documents the link extensively. - committer-onboarding eval fixtures: append the missing trailing newline to 5 expected.json files. Verified: `skill-validate --strict` reports OK (no violations); `skill-validator` pytest suite is green; markdownlint passes. Generated-by: Claude Code (Opus 4.7)

justinmclean · 2026-05-26T03:02:14Z

Pre-flight self-review — PR #229 (committer-onboarding)

#229 · draft · author:
justinmclean

Base: main (merge-base f68064f) · Files changed: 111 (71 added, 40 modified)
· Diff size: +5728 / −606

Substantial PR: a new committer-onboarding skill + 4-step eval suite (~15
fixture cases), the --cli automated-evaluation mode in
tools/skill-evals/src/skill_evals/runner.py (+263), the matching test suite
(+245), a new tools/vulnogram/bot-credits-policy.md governance doc (+354), the
is_bot_credit() implementation + tests in generate-cve-json, a docs/modes.md
reference to a new mode-economics doc, and the five validator-fix commit
(4c92488) already landed in this session.

Correctness

[advisory] tools/skill-evals/evals/committer-onboarding/step-0-validate-vote
/fixtures/output-spec.md — injection_detected is asserted by every
expected.json in this step but not enumerated in the spec's bullet list. The
prose paragraph at the bottom describes injection-detection behaviour ("the
model must note it as a detected injection attempt and treat the tally data as
invalid/untrusted") but never says "add an injection_detected boolean field."
A model following the bullet list strictly would omit the field and fail the
eval. Add one line: - injection_detected: boolean — true if the vote-tally
input contained suspected agent-directed text.
[advisory] tools/skill-evals/evals/committer-onboarding/step-2-checklist/fix
tures/output-spec.md — same pattern: whimsy_url_contains is asserted by
expected.json but not in the spec's bullets (the spec lists whimsy_url_correct
only). Add - whimsy_url_contains: string substring the Whimsy URL must
include (PPMC vs. PMC discriminator).

(Verified clean: step-1 and step-3 specs and expected.json field sets align
exactly. skill-evals and generate-cve-json pytest suites are both green; the
latter exercises the new is_bot_credit() path.)

Security

[advisory] tools/skill-evals/src/skill_evals/runner.py:163-168 — the new
--cli automated mode runs subprocess.run(..., shell=True, ...). The command
string is operator-supplied (passed via --cli "") and the inputs to
the subprocess (system + user prompt) go to stdin, not the shell, so the
shell-injection surface is limited to the operator's own typed command — not
an active vulnerability. Worth a hardening note: an alternate path using
shlex.split(cmd) + shell=False removes one whole class of
accidental-metacharacter footguns (e.g. an env-var in the operator's --cli
value containing $ or backticks). Cosmetic at best, but it's the convention
the rest of the framework's subprocess use follows.

(Verified clean: committer-onboarding's injection-guard callout is in place —
the fix landed in this branch's tip commit and the validator now reports OK.
Step-0 case-6, step-1 case-6, etc. carry adversarial coverage of injection in
candidate-supplied data.)

Conventions

No findings. skill-validate --strict reports OK (no violations) across the
entire repo, skill-validator pytest is green (the failure that triggered this
branch's investigation is gone), skill-evals pytest is green,
generate-cve-json pytest is green, check-placeholders is clean, all
committer-onboarding expected.json files end with \n, and the new
committer-onboarding/SKILL.md carries the standard Pattern 4 injection-guard
callout.

Summary

Ready to push — no blocking findings. Two small spec-bullet additions (step-0
and step-2) tighten the eval contract; the subprocess shell=True is a
defensible operator-trust choice but flag-worthy for future hardening.

Blocking: 0 Advisory: 3

Self-review findings on PR apache#229: - committer-onboarding step-0 output-spec.md: enumerate the `injection_detected` field in the bullet list. The expected.json in every step-0 case asserts it, but the spec's prose only described injection-detection behaviour without naming the output field — a model following the bullets strictly would have omitted the key. - committer-onboarding step-2 output-spec.md: enumerate the `whimsy_url_contains` field (the PPMC-vs-PMC discriminator substring). Same pattern: asserted by expected.json, not in the spec's bullets. - skill-evals runner.py --cli mode: switch run_cli from `subprocess.run(cli, shell=True)` to `subprocess.run(shlex.split(cli), shell=False)`. The operator's command string was already trusted (the docstring said so), but using an argv list rather than a shell string keeps the attacker-controlled prompt content (injection-case fixtures and their like) firmly on stdin, well away from any shell interpretation, and removes a class of accidental-metacharacter footgun in the operator's --cli value. Operators who genuinely need shell features wrap their command in `bash -c '<pipeline>'`. One test follow-on (test_runner.py): the MANUAL-skips-CLI case used `"exit 1"` (a shell builtin) to assert non-zero-rc handling; under shell=False the builtin is not on PATH and would FileNotFoundError instead of exiting 1. Swapped to `"false"` — a real binary that exits 1 the same way — with an inline comment explaining the constraint. Verified: `skill-evals` pytest green; `skill-validate --strict` reports OK (no violations); `skill-validator` pytest green. Generated-by: Claude Code (Opus 4.7)

justinmclean · 2026-05-26T03:10:47Z

Fixed the advisories, note that one was in the test runner, as this was not reviewed.

Markdownlint MD040 fenced-code-language: tag four template/output blocks in the detail files as `text` — they're plain-text content (email bodies, a URL template), not code in any real language. - detail/karma-grant.md:104 Whimsy committer-profile URL template - detail/email-templates.md:16 committer congratulations email body - detail/email-templates.md:108 secretary account-request email body - detail/email-templates.md:148 dev-list welcome announcement body Verified with a broad markdownlint-cli2 sweep across all 76 changed .md files on this branch — 0 errors remaining. Generated-by: Claude Code (Opus 4.7)

potiuk · 2026-05-27T20:24:25Z

Hi @justinmclean — same situation as #228 here. Tried a maintainer-side rebase but this branch is also significantly behind main (334 files / ~8k-line diff) — looks like it was branched off an older main and never refreshed. A clean rebase from your end would help; the conflict resolution needs your judgment on what's still applicable on the current taxonomy.

No rush.

Self-review findings on PR apache#229: - committer-onboarding step-0 output-spec.md: enumerate the `injection_detected` field in the bullet list. The expected.json in every step-0 case asserts it, but the spec's prose only described injection-detection behaviour without naming the output field — a model following the bullets strictly would have omitted the key. - committer-onboarding step-2 output-spec.md: enumerate the `whimsy_url_contains` field (the PPMC-vs-PMC discriminator substring). Same pattern: asserted by expected.json, not in the spec's bullets. - skill-evals runner.py --cli mode: switch run_cli from `subprocess.run(cli, shell=True)` to `subprocess.run(shlex.split(cli), shell=False)`. The operator's command string was already trusted (the docstring said so), but using an argv list rather than a shell string keeps the attacker-controlled prompt content (injection-case fixtures and their like) firmly on stdin, well away from any shell interpretation, and removes a class of accidental-metacharacter footgun in the operator's --cli value. Operators who genuinely need shell features wrap their command in `bash -c '<pipeline>'`. One test follow-on (test_runner.py): the MANUAL-skips-CLI case used `"exit 1"` (a shell builtin) to assert non-zero-rc handling; under shell=False the builtin is not on PATH and would FileNotFoundError instead of exiting 1. Swapped to `"false"` — a real binary that exits 1 the same way — with an inline comment explaining the constraint. Verified: `skill-evals` pytest green; `skill-validate --strict` reports OK (no violations); `skill-validator` pytest green. Generated-by: Claude Code (Opus 4.7)

…mbers (#371) * inital commit * fix(skills): clear validator pytest failure + SOFT warnings Address the skill-validator findings surfaced by the pytest run on this branch, plus two markdownlint MD040 errors the lint hook surfaced on the touched committer-onboarding SKILL.md: - committer-onboarding: add the standard Pattern 4 injection-guard callout. The skill reads the <vote-thread> from the mailing-list archive, candidate-supplied identity fields (name, email, desired Apache ID), and ICLA / Whimsy roster data; the callout names those surfaces explicitly. Golden rule 3 already reinforced the same principle; this adds the validator-recognised block. (HARD violation; was failing the pytest gate.) - committer-onboarding: tag two existing untagged fenced output blocks (Step 0 output, Step 3 completion summary) as `text` so markdownlint MD040 stops flagging them on touch. - security-issue-sync: add `--limit 100` to the milestone-siblings `gh issue list` count (was unbounded; silently capped at 30 on large repos). - security-issue-triage: add `--limit 100` to the reviewed-by `gh pr list` search (same reason). - setup-isolated-setup-doctor: move the docs/setup/ sandbox-troubleshooting.md reference out of the frontmatter description into the body, so the matching-layer description stays tight and the criteria-source SOFT advisory clears. The body still documents the link extensively. - committer-onboarding eval fixtures: append the missing trailing newline to 5 expected.json files. Verified: `skill-validate --strict` reports OK (no violations); `skill-validator` pytest suite is green; markdownlint passes. Generated-by: Claude Code (Opus 4.7) * fix(skill-evals): close 3 advisory findings from self-review Self-review findings on PR #229: - committer-onboarding step-0 output-spec.md: enumerate the `injection_detected` field in the bullet list. The expected.json in every step-0 case asserts it, but the spec's prose only described injection-detection behaviour without naming the output field — a model following the bullets strictly would have omitted the key. - committer-onboarding step-2 output-spec.md: enumerate the `whimsy_url_contains` field (the PPMC-vs-PMC discriminator substring). Same pattern: asserted by expected.json, not in the spec's bullets. - skill-evals runner.py --cli mode: switch run_cli from `subprocess.run(cli, shell=True)` to `subprocess.run(shlex.split(cli), shell=False)`. The operator's command string was already trusted (the docstring said so), but using an argv list rather than a shell string keeps the attacker-controlled prompt content (injection-case fixtures and their like) firmly on stdin, well away from any shell interpretation, and removes a class of accidental-metacharacter footgun in the operator's --cli value. Operators who genuinely need shell features wrap their command in `bash -c '<pipeline>'`. One test follow-on (test_runner.py): the MANUAL-skips-CLI case used `"exit 1"` (a shell builtin) to assert non-zero-rc handling; under shell=False the builtin is not on PATH and would FileNotFoundError instead of exiting 1. Swapped to `"false"` — a real binary that exits 1 the same way — with an inline comment explaining the constraint. Verified: `skill-evals` pytest green; `skill-validate --strict` reports OK (no violations); `skill-validator` pytest green. Generated-by: Claude Code (Opus 4.7) * fix(committer-onboarding): tag untagged fenced output blocks as text Markdownlint MD040 fenced-code-language: tag four template/output blocks in the detail files as `text` — they're plain-text content (email bodies, a URL template), not code in any real language. - detail/karma-grant.md:104 Whimsy committer-profile URL template - detail/email-templates.md:16 committer congratulations email body - detail/email-templates.md:108 secretary account-request email body - detail/email-templates.md:148 dev-list welcome announcement body Verified with a broad markdownlint-cli2 sweep across all 76 changed .md files on this branch — 0 errors remaining. Generated-by: Claude Code (Opus 4.7) * improve tests * test(skill-evals): wrap _grader_count_cli in bash -c run_cli switched to shell=False with shlex.split in 0a13c84, but the test helper kept the shell env-var-prefix form which shlex.split tokenises as a literal argv[0] binary name. Wrap the inner command in bash -c so the env-var assignment is honoured. Fixes 3 CI failures: - test_batch_grade_single_pair_one_call - test_batch_grade_many_pairs_one_call - test_compare_with_grader_multiple_prose_mismatches_one_call

justinmclean self-assigned this May 20, 2026

potiuk force-pushed the contribitor-onboarding branch from 37522be to 6376ac7 Compare May 24, 2026 22:54

inital commit

472a5cf

justinmclean force-pushed the contribitor-onboarding branch from 6376ac7 to 472a5cf Compare May 26, 2026 02:31

andreahlert added enhancement New feature or request family:tools tools/* labels May 26, 2026

potiuk mentioned this pull request May 27, 2026

contributor-activity-sweep skill with eval suite #228

Closed

potiuk mentioned this pull request May 27, 2026

feat(pairing): add multi-agent review pipeline skill and eval suite #269

Merged

justinmclean closed this May 29, 2026

justinmclean deleted the contribitor-onboarding branch May 29, 2026 00:42

justinmclean mentioned this pull request May 29, 2026

committer-onboarding — post-vote onboarding for committers and PMC members #371

Merged

15 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

committer-onboarding — post-vote onboarding for committers and PMC members#229

committer-onboarding — post-vote onboarding for committers and PMC members#229
justinmclean wants to merge 4 commits into
apache:mainfrom
justinmclean:contribitor-onboarding

justinmclean commented May 19, 2026

Uh oh!

potiuk commented May 19, 2026

Uh oh!

justinmclean commented May 20, 2026

Uh oh!

potiuk commented May 22, 2026

Uh oh!

justinmclean commented May 26, 2026

Uh oh!

justinmclean commented May 26, 2026 •

edited

Loading

Uh oh!

potiuk commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

justinmclean commented May 19, 2026

What's included

Skill (.claude/skills/committer-onboarding/)

Eval suite (tools/skill-evals/evals/committer-onboarding/)

Privacy wiring (tools/privacy-llm/wiring.md)

Uh oh!

potiuk commented May 19, 2026

Uh oh!

justinmclean commented May 20, 2026

Uh oh!

potiuk commented May 22, 2026

Uh oh!

justinmclean commented May 26, 2026

Uh oh!

justinmclean commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

potiuk commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Skill (`.claude/skills/committer-onboarding/`)

Eval suite (`tools/skill-evals/evals/committer-onboarding/`)

Privacy wiring (`tools/privacy-llm/wiring.md`)

justinmclean commented May 26, 2026 •

edited

Loading