committer-onboarding — post-vote onboarding for committers and PMC members#229
committer-onboarding — post-vote onboarding for committers and PMC members#229justinmclean wants to merge 4 commits into
Conversation
|
Should we have it as "Apache Way" family of skills? The idea for MagPie is to make it - immediately - useful both: in and out of ASF, and unlike most of the other SKILLs this one is strictly ASF specific. Nothing wrong with it, but I think we should open up to "other ways" potentially :) |
|
Sure I can make this part of an ASF way family of skills. Several of the other skills have ASFism in them - in some cases, project outside teh ASF may do things that way or not. |
Yeah. I think we need to make some of those generalisations / review those skills to make sure there are no remaining "airflow-ism", "groov-ism" and "asf-isms" - or at the very least that they are there as optional things that only kick-in in ASF projects. I think the great benefitt of Magpie being "immediately adoptable" outside of ASF is clear :) -> And I think that should be something we might even add in our PRINCIPLES and skill writing / updating agentic code to make sure that the skills (except those dedicated for a given organisation) are generic enough to be used by anyone.. This IMHO will make Magpie much stronger from day one if we have such rule - and make it "important", because theat might alleviate some questions from outside adopters - and we should be able to clearly say "Yeah - we have some ASF specific things for our PMCs that you can also adopt if you want - but the main idea is to get the project outside-usable, without having to adopt the whole Apache Way. |
37522be to
6376ac7
Compare
6376ac7 to
472a5cf
Compare
Address the skill-validator findings surfaced by the pytest run on this branch, plus two markdownlint MD040 errors the lint hook surfaced on the touched committer-onboarding SKILL.md: - committer-onboarding: add the standard Pattern 4 injection-guard callout. The skill reads the <vote-thread> from the mailing-list archive, candidate-supplied identity fields (name, email, desired Apache ID), and ICLA / Whimsy roster data; the callout names those surfaces explicitly. Golden rule 3 already reinforced the same principle; this adds the validator-recognised block. (HARD violation; was failing the pytest gate.) - committer-onboarding: tag two existing untagged fenced output blocks (Step 0 output, Step 3 completion summary) as `text` so markdownlint MD040 stops flagging them on touch. - security-issue-sync: add `--limit 100` to the milestone-siblings `gh issue list` count (was unbounded; silently capped at 30 on large repos). - security-issue-triage: add `--limit 100` to the reviewed-by `gh pr list` search (same reason). - setup-isolated-setup-doctor: move the docs/setup/ sandbox-troubleshooting.md reference out of the frontmatter description into the body, so the matching-layer description stays tight and the criteria-source SOFT advisory clears. The body still documents the link extensively. - committer-onboarding eval fixtures: append the missing trailing newline to 5 expected.json files. Verified: `skill-validate --strict` reports OK (no violations); `skill-validator` pytest suite is green; markdownlint passes. Generated-by: Claude Code (Opus 4.7)
|
Pre-flight self-review — PR #229 (committer-onboarding) #229 · draft · author: Base: main (merge-base f68064f) · Files changed: 111 (71 added, 40 modified) Substantial PR: a new committer-onboarding skill + 4-step eval suite (~15 Correctness
(Verified clean: step-1 and step-3 specs and expected.json field sets align Security
(Verified clean: committer-onboarding's injection-guard callout is in place — Conventions No findings. skill-validate --strict reports OK (no violations) across the Summary Ready to push — no blocking findings. Two small spec-bullet additions (step-0 Blocking: 0 Advisory: 3 |
Self-review findings on PR apache#229: - committer-onboarding step-0 output-spec.md: enumerate the `injection_detected` field in the bullet list. The expected.json in every step-0 case asserts it, but the spec's prose only described injection-detection behaviour without naming the output field — a model following the bullets strictly would have omitted the key. - committer-onboarding step-2 output-spec.md: enumerate the `whimsy_url_contains` field (the PPMC-vs-PMC discriminator substring). Same pattern: asserted by expected.json, not in the spec's bullets. - skill-evals runner.py --cli mode: switch run_cli from `subprocess.run(cli, shell=True)` to `subprocess.run(shlex.split(cli), shell=False)`. The operator's command string was already trusted (the docstring said so), but using an argv list rather than a shell string keeps the attacker-controlled prompt content (injection-case fixtures and their like) firmly on stdin, well away from any shell interpretation, and removes a class of accidental-metacharacter footgun in the operator's --cli value. Operators who genuinely need shell features wrap their command in `bash -c '<pipeline>'`. One test follow-on (test_runner.py): the MANUAL-skips-CLI case used `"exit 1"` (a shell builtin) to assert non-zero-rc handling; under shell=False the builtin is not on PATH and would FileNotFoundError instead of exiting 1. Swapped to `"false"` — a real binary that exits 1 the same way — with an inline comment explaining the constraint. Verified: `skill-evals` pytest green; `skill-validate --strict` reports OK (no violations); `skill-validator` pytest green. Generated-by: Claude Code (Opus 4.7)
|
Fixed the advisories, note that one was in the test runner, as this was not reviewed. |
Markdownlint MD040 fenced-code-language: tag four template/output blocks in the detail files as `text` — they're plain-text content (email bodies, a URL template), not code in any real language. - detail/karma-grant.md:104 Whimsy committer-profile URL template - detail/email-templates.md:16 committer congratulations email body - detail/email-templates.md:108 secretary account-request email body - detail/email-templates.md:148 dev-list welcome announcement body Verified with a broad markdownlint-cli2 sweep across all 76 changed .md files on this branch — 0 errors remaining. Generated-by: Claude Code (Opus 4.7)
|
Hi @justinmclean — same situation as #228 here. Tried a maintainer-side rebase but this branch is also significantly behind No rush. |
Self-review findings on PR apache#229: - committer-onboarding step-0 output-spec.md: enumerate the `injection_detected` field in the bullet list. The expected.json in every step-0 case asserts it, but the spec's prose only described injection-detection behaviour without naming the output field — a model following the bullets strictly would have omitted the key. - committer-onboarding step-2 output-spec.md: enumerate the `whimsy_url_contains` field (the PPMC-vs-PMC discriminator substring). Same pattern: asserted by expected.json, not in the spec's bullets. - skill-evals runner.py --cli mode: switch run_cli from `subprocess.run(cli, shell=True)` to `subprocess.run(shlex.split(cli), shell=False)`. The operator's command string was already trusted (the docstring said so), but using an argv list rather than a shell string keeps the attacker-controlled prompt content (injection-case fixtures and their like) firmly on stdin, well away from any shell interpretation, and removes a class of accidental-metacharacter footgun in the operator's --cli value. Operators who genuinely need shell features wrap their command in `bash -c '<pipeline>'`. One test follow-on (test_runner.py): the MANUAL-skips-CLI case used `"exit 1"` (a shell builtin) to assert non-zero-rc handling; under shell=False the builtin is not on PATH and would FileNotFoundError instead of exiting 1. Swapped to `"false"` — a real binary that exits 1 the same way — with an inline comment explaining the constraint. Verified: `skill-evals` pytest green; `skill-validate --strict` reports OK (no violations); `skill-validator` pytest green. Generated-by: Claude Code (Opus 4.7)
Self-review findings on PR apache#229: - committer-onboarding step-0 output-spec.md: enumerate the `injection_detected` field in the bullet list. The expected.json in every step-0 case asserts it, but the spec's prose only described injection-detection behaviour without naming the output field — a model following the bullets strictly would have omitted the key. - committer-onboarding step-2 output-spec.md: enumerate the `whimsy_url_contains` field (the PPMC-vs-PMC discriminator substring). Same pattern: asserted by expected.json, not in the spec's bullets. - skill-evals runner.py --cli mode: switch run_cli from `subprocess.run(cli, shell=True)` to `subprocess.run(shlex.split(cli), shell=False)`. The operator's command string was already trusted (the docstring said so), but using an argv list rather than a shell string keeps the attacker-controlled prompt content (injection-case fixtures and their like) firmly on stdin, well away from any shell interpretation, and removes a class of accidental-metacharacter footgun in the operator's --cli value. Operators who genuinely need shell features wrap their command in `bash -c '<pipeline>'`. One test follow-on (test_runner.py): the MANUAL-skips-CLI case used `"exit 1"` (a shell builtin) to assert non-zero-rc handling; under shell=False the builtin is not on PATH and would FileNotFoundError instead of exiting 1. Swapped to `"false"` — a real binary that exits 1 the same way — with an inline comment explaining the constraint. Verified: `skill-evals` pytest green; `skill-validate --strict` reports OK (no violations); `skill-validator` pytest green. Generated-by: Claude Code (Opus 4.7)
…mbers (#371) * inital commit * fix(skills): clear validator pytest failure + SOFT warnings Address the skill-validator findings surfaced by the pytest run on this branch, plus two markdownlint MD040 errors the lint hook surfaced on the touched committer-onboarding SKILL.md: - committer-onboarding: add the standard Pattern 4 injection-guard callout. The skill reads the <vote-thread> from the mailing-list archive, candidate-supplied identity fields (name, email, desired Apache ID), and ICLA / Whimsy roster data; the callout names those surfaces explicitly. Golden rule 3 already reinforced the same principle; this adds the validator-recognised block. (HARD violation; was failing the pytest gate.) - committer-onboarding: tag two existing untagged fenced output blocks (Step 0 output, Step 3 completion summary) as `text` so markdownlint MD040 stops flagging them on touch. - security-issue-sync: add `--limit 100` to the milestone-siblings `gh issue list` count (was unbounded; silently capped at 30 on large repos). - security-issue-triage: add `--limit 100` to the reviewed-by `gh pr list` search (same reason). - setup-isolated-setup-doctor: move the docs/setup/ sandbox-troubleshooting.md reference out of the frontmatter description into the body, so the matching-layer description stays tight and the criteria-source SOFT advisory clears. The body still documents the link extensively. - committer-onboarding eval fixtures: append the missing trailing newline to 5 expected.json files. Verified: `skill-validate --strict` reports OK (no violations); `skill-validator` pytest suite is green; markdownlint passes. Generated-by: Claude Code (Opus 4.7) * fix(skill-evals): close 3 advisory findings from self-review Self-review findings on PR #229: - committer-onboarding step-0 output-spec.md: enumerate the `injection_detected` field in the bullet list. The expected.json in every step-0 case asserts it, but the spec's prose only described injection-detection behaviour without naming the output field — a model following the bullets strictly would have omitted the key. - committer-onboarding step-2 output-spec.md: enumerate the `whimsy_url_contains` field (the PPMC-vs-PMC discriminator substring). Same pattern: asserted by expected.json, not in the spec's bullets. - skill-evals runner.py --cli mode: switch run_cli from `subprocess.run(cli, shell=True)` to `subprocess.run(shlex.split(cli), shell=False)`. The operator's command string was already trusted (the docstring said so), but using an argv list rather than a shell string keeps the attacker-controlled prompt content (injection-case fixtures and their like) firmly on stdin, well away from any shell interpretation, and removes a class of accidental-metacharacter footgun in the operator's --cli value. Operators who genuinely need shell features wrap their command in `bash -c '<pipeline>'`. One test follow-on (test_runner.py): the MANUAL-skips-CLI case used `"exit 1"` (a shell builtin) to assert non-zero-rc handling; under shell=False the builtin is not on PATH and would FileNotFoundError instead of exiting 1. Swapped to `"false"` — a real binary that exits 1 the same way — with an inline comment explaining the constraint. Verified: `skill-evals` pytest green; `skill-validate --strict` reports OK (no violations); `skill-validator` pytest green. Generated-by: Claude Code (Opus 4.7) * fix(committer-onboarding): tag untagged fenced output blocks as text Markdownlint MD040 fenced-code-language: tag four template/output blocks in the detail files as `text` — they're plain-text content (email bodies, a URL template), not code in any real language. - detail/karma-grant.md:104 Whimsy committer-profile URL template - detail/email-templates.md:16 committer congratulations email body - detail/email-templates.md:108 secretary account-request email body - detail/email-templates.md:148 dev-list welcome announcement body Verified with a broad markdownlint-cli2 sweep across all 76 changed .md files on this branch — 0 errors remaining. Generated-by: Claude Code (Opus 4.7) * improve tests * test(skill-evals): wrap _grader_count_cli in bash -c run_cli switched to shell=False with shlex.split in 0a13c84, but the test helper kept the shell env-var-prefix form which shlex.split tokenises as a literal argv[0] binary name. Wrap the inner command in bash -c so the env-var assignment is honoured. Fixes 3 CI failures: - test_batch_grade_single_pair_one_call - test_batch_grade_many_pairs_one_call - test_compare_with_grader_multiple_prose_mismatches_one_call
Adds a new
committer-onboardingskill that walks a nominator throughevery post-vote step after a committer or PMC election passes, for both
incubating podlings and graduated TLPs.
What's included
Skill (
.claude/skills/committer-onboarding/)Four-step workflow covering three scenarios (
new-committer,committer-to-pmc,direct-to-pmc):period check, veto handling (veto requires a justification; code
quality alone is not sufficient — must relate to conduct or fitness)
submitted-not-processed / not filed); draft congratulations email and
secretary account-creation request; checks that the nominator is a PMC
chair or ASF Member before drafting the request; detects when the ICLA
already included project + desired ID (automatic secretary flow)
for podlings, PMC/committer for TLPs), issue tracker (Jira only; GitHub
Issues projects skip this), mailing list self-service, welcome
announcement draft
pending_itemslist for anythingnot yet done
Detail files:
detail/email-templates.md— congratulations (3 ICLA variants),secretary request, welcome announcement
detail/karma-grant.md— step-by-step Whimsy, mailing list, and Jirainstructions; clarifies that
committee-info.txtis the authoritativerecord for PMC membership (not LDAP), and that Whimsy is a convenient
derived interface that updates both
Eval suite (
tools/skill-evals/evals/committer-onboarding/)24 cases across all four steps.
Privacy wiring (
tools/privacy-llm/wiring.md)Registers
committer-onboardingin the skills table. The skill readsprivate-list vote content, so Step 0 includes a privacy pre-flight
gate-check (
privacy-llm-check --reads-private-list) and a PII handlingtable (candidate = not redacted; voters = collaborators, exempt by
default; third-party names in discussion = must be redacted).