
D-047: test(fuzz): signature extraction fuzzers for non-Rust languages #10

Open
Sephyi wants to merge 1 commit into development from audit/d-047-non-rust-fuzz-signature

Conversation

Sephyi (Owner) commented Apr 22, 2026

Summary

test(fuzz): signature extraction fuzzers for non-Rust languages.

Audit context

Closes audit entry D-047 from #3.

Verification

  • cargo fmt --check
  • cargo clippy --all-targets --all-features -- -D warnings
  • cargo test --all-targets

Note: one pre-existing test porcelain_exits_within_timeout_with_no_staged_changes is a known macOS cold-start flake that reproduces on unmodified development — unrelated to this change.

Extend the signature-extraction fuzz coverage from Rust-only to all
ten supported grammars (Rust, TypeScript, JavaScript, Python, Go,
Java, C, C++, Ruby, C#). A new unified `fuzz_signature_multilang`
target dispatches on `data[0] % 10` to pick a language, then feeds
the remaining bytes to the matching `extract_<lang>_signature`
helper. This mirrors the byte-dispatch pattern already used by
`fuzz_classify_span` and keeps boilerplate minimal.
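
The byte-dispatch step can be sketched in isolation as follows. This is a minimal illustration, not the code from the PR: the `Lang` enum, the `LANGS` table, and the `dispatch` function are hypothetical names, and the ordering simply follows the ten grammars as listed above.

```rust
/// Hypothetical stand-in for the ten supported grammars, in selector order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lang {
    Rust,
    TypeScript,
    JavaScript,
    Python,
    Go,
    Java,
    C,
    Cpp,
    Ruby,
    CSharp,
}

const LANGS: [Lang; 10] = [
    Lang::Rust,
    Lang::TypeScript,
    Lang::JavaScript,
    Lang::Python,
    Lang::Go,
    Lang::Java,
    Lang::C,
    Lang::Cpp,
    Lang::Ruby,
    Lang::CSharp,
];

/// Split raw fuzz input into (language, payload).
/// The first byte selects the language via `% 10`; the rest is the source.
/// Returns `None` for empty input, which has no selector byte.
pub fn dispatch(data: &[u8]) -> Option<(Lang, &[u8])> {
    let (&selector, payload) = data.split_first()?;
    Some((LANGS[usize::from(selector) % LANGS.len()], payload))
}
```

Because 256 is not a multiple of 10, selectors 0 through 5 are reached by 26 byte values each and 6 through 9 by 25, a skew that is unlikely to matter for a coverage-guided fuzzer.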

To expose the dispatcher, `lib.rs` grows one public wrapper per
language, each delegating to a small private
`extract_signature_for_language(source, language)` helper that
centralises the `Parser::new() -> set_language -> parse ->
root.child(0) -> AnalyzerService::extract_signature` pipeline. Each
wrapper is gated by its language feature so the crate still builds
with arbitrary subsets of `lang-*` features.

`fuzz/Cargo.toml` now pulls `commitbee` with `default-features =
false` plus every `lang-*` feature, and registers the new
`fuzz_signature_multilang` binary. Turning off default features
also drops the keyring transitive dependency from the fuzz build,
which is pure build-time savings for a workload that never touches
secure storage.

Verified via `cargo check --manifest-path fuzz/Cargo.toml` plus the
standard `cargo fmt --check`, `cargo clippy --all-targets
--all-features -- -D warnings`, and `cargo test --all-targets`.
The fuzzer itself does not need to run to completion — the guarantee
is "never panic on any input," and `cargo-fuzz` will exercise that
as part of the normal fuzzing workflow.

Closes audit entry D-047 from #3.
Copilot AI review requested due to automatic review settings April 22, 2026 19:50
@Sephyi Sephyi added the audit Codebase audit cleanup (issue #3) label Apr 22, 2026
@Sephyi Sephyi self-assigned this Apr 22, 2026

Copilot AI left a comment

Pull request overview

Adds multi-language fuzz coverage for CommitBee’s signature extraction by exposing language-specific signature wrappers and introducing a new fuzz target that dispatches across supported tree-sitter grammars.

Changes:

  • Refactors signature extraction into a shared extract_signature_for_language helper in src/lib.rs and adds per-language public wrapper functions behind lang-* feature flags.
  • Introduces a new fuzz_signature_multilang fuzz target that selects among 10 languages based on the first input byte.
  • Updates fuzz crate configuration to enable all language features (while disabling default features) and refreshes fuzz/Cargo.lock accordingly.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/lib.rs | Adds a language-parameterized signature extraction helper and new extract_*_signature wrappers for non-Rust languages. |
| fuzz/fuzz_targets/fuzz_signature_multilang.rs | New fuzz target that dispatches input to the appropriate language-specific signature extractor. |
| fuzz/Cargo.toml | Enables all language features for fuzzing (and disables default features) and registers the new fuzz binary target. |
| fuzz/Cargo.lock | Lockfile updates reflecting changed feature/dependency resolution (e.g., dropping secure-storage/keyring from fuzz). |

Comment thread src/lib.rs
Comment on lines 59 to 73
@@ -76,6 +72,66 @@ pub fn extract_rust_signature(source: &str) -> Option<String> {
    services::analyzer::AnalyzerService::extract_signature(first_child, source)
}

Copilot AI Apr 22, 2026

extract_signature_for_language is always compiled, but it’s only referenced from the language-specific wrappers that are all #[cfg(feature = "lang-*")]. In a --no-default-features build with no language features enabled, this helper becomes unused and will trigger a dead_code warning. Consider gating the helper with the same cfg(any(feature = "lang-rust", ...)) set, or explicitly allowing dead_code with a short rationale so minimal-feature builds stay warning-free.
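
A minimal sketch of the second option, assuming feature names of the form `lang-python` and `lang-go` (the PR only states `lang-*`). The helper body here is a stand-in that returns the first non-empty line; it is not the tree-sitter pipeline from `src/lib.rs`. `cfg_attr` applies `allow(dead_code)` only when no language feature is enabled, so the lint stays active in every other configuration.

```rust
// Stand-in helper (assumption): the real one drives tree-sitter. It is always
// compiled, and dead_code is silenced only when no wrapper can reference it.
#[cfg_attr(
    not(any(feature = "lang-python", feature = "lang-go")),
    allow(dead_code)
)]
fn extract_signature_for_language(source: &str, _language: &str) -> Option<String> {
    source
        .lines()
        .map(str::trim)
        .find(|line| !line.is_empty())
        .map(str::to_owned)
}

// Each public wrapper exists only when its language feature is enabled.
#[cfg(feature = "lang-python")]
pub fn extract_python_signature(source: &str) -> Option<String> {
    extract_signature_for_language(source, "python")
}

#[cfg(feature = "lang-go")]
pub fn extract_go_signature(source: &str) -> Option<String> {
    extract_signature_for_language(source, "go")
}
```

The first option would instead put `#[cfg(any(feature = "lang-python", feature = "lang-go"))]` directly on the helper, which removes it from minimal builds entirely at the cost of keeping the feature list in sync in two places.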

Comment on lines +29 to +61
    let Ok(source) = std::str::from_utf8(&data[1..]) else {
        return;
    };
    match selector {
        0 => {
            let _ = commitbee::extract_rust_signature(source);
        }
        1 => {
            let _ = commitbee::extract_typescript_signature(source);
        }
        2 => {
            let _ = commitbee::extract_javascript_signature(source);
        }
        3 => {
            let _ = commitbee::extract_python_signature(source);
        }
        4 => {
            let _ = commitbee::extract_go_signature(source);
        }
        5 => {
            let _ = commitbee::extract_java_signature(source);
        }
        6 => {
            let _ = commitbee::extract_c_signature(source);
        }
        7 => {
            let _ = commitbee::extract_cpp_signature(source);
        }
        8 => {
            let _ = commitbee::extract_ruby_signature(source);
        }
        9 => {
            let _ = commitbee::extract_csharp_signature(source);

Copilot AI Apr 22, 2026

The fuzzer drops any input where data[1..] is not valid UTF-8, which can significantly reduce fuzz coverage since many byte sequences will exit early. To exercise the parsers more thoroughly, consider a lossy conversion such as String::from_utf8_lossy(&data[1..]), which replaces invalid sequences with U+FFFD (or switch the target input type to a structured fuzzer input such as (&str, u8)), so every input still drives the extractor while staying within the &str API contract.

Suggested change
-    let Ok(source) = std::str::from_utf8(&data[1..]) else {
-        return;
-    };
-    match selector {
-        0 => {
-            let _ = commitbee::extract_rust_signature(source);
-        }
-        1 => {
-            let _ = commitbee::extract_typescript_signature(source);
-        }
-        2 => {
-            let _ = commitbee::extract_javascript_signature(source);
-        }
-        3 => {
-            let _ = commitbee::extract_python_signature(source);
-        }
-        4 => {
-            let _ = commitbee::extract_go_signature(source);
-        }
-        5 => {
-            let _ = commitbee::extract_java_signature(source);
-        }
-        6 => {
-            let _ = commitbee::extract_c_signature(source);
-        }
-        7 => {
-            let _ = commitbee::extract_cpp_signature(source);
-        }
-        8 => {
-            let _ = commitbee::extract_ruby_signature(source);
-        }
-        9 => {
-            let _ = commitbee::extract_csharp_signature(source);
+    let source = String::from_utf8_lossy(&data[1..]);
+    match selector {
+        0 => {
+            let _ = commitbee::extract_rust_signature(source.as_ref());
+        }
+        1 => {
+            let _ = commitbee::extract_typescript_signature(source.as_ref());
+        }
+        2 => {
+            let _ = commitbee::extract_javascript_signature(source.as_ref());
+        }
+        3 => {
+            let _ = commitbee::extract_python_signature(source.as_ref());
+        }
+        4 => {
+            let _ = commitbee::extract_go_signature(source.as_ref());
+        }
+        5 => {
+            let _ = commitbee::extract_java_signature(source.as_ref());
+        }
+        6 => {
+            let _ = commitbee::extract_c_signature(source.as_ref());
+        }
+        7 => {
+            let _ = commitbee::extract_cpp_signature(source.as_ref());
+        }
+        8 => {
+            let _ = commitbee::extract_ruby_signature(source.as_ref());
+        }
+        9 => {
+            let _ = commitbee::extract_csharp_signature(source.as_ref());
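
The difference between the two decoding strategies in the suggestion can be seen in a small standalone sketch. The function names `decode_strict` and `decode_lossy` are illustrative only; the standard-library calls are the same ones used in the original and suggested code.

```rust
use std::borrow::Cow;

/// Strict decoding: a single invalid byte rejects the whole payload,
/// which is what makes the original target return early.
pub fn decode_strict(payload: &[u8]) -> Option<&str> {
    std::str::from_utf8(payload).ok()
}

/// Lossy decoding: each invalid sequence becomes U+FFFD, so every payload
/// yields text. Valid input is borrowed; invalid input allocates a new String.
pub fn decode_lossy(payload: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(payload)
}
```

One trade-off to weigh: with lossy decoding the text the extractor sees no longer matches the corpus bytes exactly, and many distinct invalid inputs collapse to the same string.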
