You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Comet's docs are growing — user guide, contributor guide, compatibility tables, operator references, plus per-release changelogs. Quality varies, terminology drifts, and once the nomenclature style guide from #4419 lands, we need a way to keep new and existing docs aligned with it. PR reviewers shouldn't have to be the last line of defense for prose quality, and contributors shouldn't have to internalize a multi-page style guide before opening a docs PR.
A sibling skill to review-comet-pr, focused on documentation, would let reviewers (and contributors during self-review) get a docs-expert pass over a PR or an existing doc. The skill produces advisory feedback — it does not edit files or post comments directly, mirroring review-comet-pr.
Bare "native" used as a vague adjective (e.g. "runs natively", "the native path") — flag and suggest the specific axis: Rust-implemented, Arrow-native, Comet pipeline.
Permitted compounds (native Rust, Arrow-native, native shuffle in shuffle-pair context) recognized and not flagged.
"Vectorized" used as a synonym for columnar — flag.
"Falls back to Spark" / "Spark fallback" used consistently for the same concept.
Operator names match plan output exactly (e.g. CometProject, not CometProjectExec or Comet Project).
Information architecture
Audience is identifiable from the first paragraph — user, contributor, or operator.
Heading hierarchy is correct (no jumps from H2 to H4, no multiple H1s).
Tables used for structured comparisons (operators, configs, support matrices) rather than bulleted prose.
Long pages have a brief overview / table of contents at the top.
Completeness
New user-guide pages have: overview, prerequisites or version applicability, at least one runnable example, links to related docs.
New contributor-guide pages have: scope/audience, prerequisites, the procedure, how to verify it worked.
New configs in prose are also added to configs.md with key, default, type, version added.
New operators in prose are also added to the operator reference in understanding-comet-plans.md and operators.md.
New expressions are reflected in spark_expressions_support.md.
Accuracy
Config keys mentioned exist in code (grep spark.comet.* against the repo).
Operator names mentioned exist as classes.
Spark version claims ("Spark 3.4+", "Spark 4.0 only") match the version profiles the surrounding code lives in.
Code samples are syntactically valid for the language they claim (Scala / Python / SQL / shell).
Present tense, active voice, second person where appropriate.
No marketing fluff ("blazing fast", "seamlessly", "powerful").
No future-tense aspirations in user-facing docs ("will support", "is planned to") — those belong in the roadmap, not the user guide.
No stale TODOs or XXX markers in user-visible files.
Apache project hygiene
New .md files have the ASF license header.
No copyrighted third-party content without attribution.
Diagrams/images referenced exist in the repo.
Comet-specific conventions
User-guide content lives under docs/source/user-guide/latest/, contributor-guide content under docs/source/contributor-guide/. Flag misplacements.
Spark version-specific content notes the version profile clearly.
Plan-output examples use real plan formatting (tree-form with +- connectors), not invented prose.
Cross-references use relative links within docs/source/.
Scope
The skill should support two modes:
PR review mode — given a PR number or branch, surface docs changes (and prose changes in code comments / config descriptions / Scaladoc) and audit them.
Standalone audit mode — given a file or directory under docs/source/, audit it as-is.
In both modes the skill returns a prioritized list (must-fix / should-fix / nit) suitable for a reviewer to paste or paraphrase into review comments.
Output format
Mirror review-comet-pr: produce structured advisory feedback for the human, not direct edits. Categorize findings by severity. For each finding:
File and line reference.
The rule that was violated (link to the style guide section).
A concrete suggested rewrite when one is obvious.
The skill must not edit files, post PR comments, or push branches.
Open questions
Should the skill enforce stricter rules on user-guide content (lower tolerance for jargon, mandatory examples) than contributor-guide content (where domain-specific vocabulary is fine)?
Should the skill check generated docs (e.g. expression support tables generated by make) or only hand-written prose?
Motivation
Comet's docs are growing — user guide, contributor guide, compatibility tables, operator references, plus per-release changelogs. Quality varies, terminology drifts, and once the nomenclature style guide from #4419 lands, we need a way to keep new and existing docs aligned with it. PR reviewers shouldn't have to be the last line of defense for prose quality, and contributors shouldn't have to internalize a multi-page style guide before opening a docs PR.
A sibling skill to
review-comet-pr, focused on documentation, would let reviewers (and contributors during self-review) get a docs-expert pass over a PR or an existing doc. The skill produces advisory feedback — it does not edit files or post comments directly, mirroringreview-comet-pr.What the skill should check
Style guide adherence (post #4419)
CometProject, notCometProjectExecorComet Project).Information architecture
Completeness
configs.mdwith key, default, type, version added.understanding-comet-plans.mdandoperators.md.spark_expressions_support.md.Accuracy
spark.comet.*against the repo).Voice and tone
XXXmarkers in user-visible files.Apache project hygiene
.mdfiles have the ASF license header.Comet-specific conventions
docs/source/user-guide/latest/, contributor-guide content underdocs/source/contributor-guide/. Flag misplacements.+-connectors), not invented prose.docs/source/.Scope
The skill should support two modes:
docs/source/, audit it as-is.In both modes the skill returns a prioritized list (must-fix / should-fix / nit) suitable for a reviewer to paste or paraphrase into review comments.
Output format
Mirror
review-comet-pr: produce structured advisory feedback for the human, not direct edits. Categorize findings by severity. For each finding:The skill must not edit files, post PR comments, or push branches.
Open questions
make) or only hand-written prose?docs/source/contributor-guide/style_guide.mdonce Establish nomenclature style guide and audit operator names for clarity #4419 lands), or read the latest committed version on each invocation?audit-comet-docsskill for full-site audits versus a single skill that handles both PR review and standalone audit modes?Dependencies