add fuzzing infrastructure, expand fuzz coverage, fix CI integration #1
Open
osyniakov wants to merge 15 commits into
Adds cargo-fuzz targets and ClusterFuzzLite CI to satisfy the OSSF Scorecard Fuzzing check.

Four fuzz targets cover the main parsing paths:
- fuzz_query_dsl: Elasticsearch query DSL JSON deserialization
- fuzz_query_string: Lucene query string parser (UserInputQuery)
- fuzz_datetime: datetime string parsing across all supported formats
- fuzz_doc_mapper: JSON document ingestion via DocMapper

Also fixes a panic in parse_timestamp_str discovered during fuzzing: subsecond_digits_str.len().min(9) is a byte count, so slicing at that index can cut mid-way through a multi-byte UTF-8 character. The fix uses str::find to locate the first non-ASCII-digit, ensuring the slice boundary is always on a valid char boundary.

Building fuzz targets requires:
RUSTFLAGS="--cfg tokio_unstable" cargo +nightly fuzz build

The ClusterFuzzLite workflow will appear in GitHub Actions and satisfies the OSSF Scorecard Fuzzing check once merged to main.

https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
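To make the failure mode concrete, here is a minimal sketch using only the standard library. The function name `byte_capped_prefix` is hypothetical and is not taken from quickwit-datetime. It mirrors the `len().min(9)` pattern but slices with `str::get`, which returns `None` where plain indexing would panic.

```rust
/// Hypothetical stand-in for the old slicing pattern: cap the *byte* length
/// at `max_len` and try to slice there. `str::get` returns `None` where
/// `&s[..cut]` would panic because `cut` is not a char boundary.
fn byte_capped_prefix(s: &str, max_len: usize) -> Option<&str> {
    let cut = s.len().min(max_len);
    s.get(..cut)
}
```

For the input `"12345678\u{e9}"` (8 ASCII digits followed by a two-byte `é`, 10 bytes in total) the cut lands at byte 9, inside the `é`, so the function returns `None`. Indexing with `&s[..9]` would panic at the same spot.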
The fuzz crate is part of the parent workspace so the workspace-level Cargo.lock is authoritative. Add Cargo.lock to fuzz/.gitignore. https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
- workflow: drop unused `id: build` and `id: run` step IDs
- fuzz_doc_mapper: remove "what" comment line; the LazyLock purpose is self-evident from the "why" comment that remains
- date_time_parsing: replace `str::find` char-closure with `as_bytes().iter().position`. This avoids UTF-8 decoding overhead since subsecond digits are always ASCII. The byte-level position remains a valid str char boundary (the first non-ASCII-digit byte is either ASCII or a multi-byte leading byte, never a continuation byte)

https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
The existing **/.* rule excluded all hidden directories, including .clusterfuzzlite/. ClusterFuzzLite's Dockerfile COPYs build.sh from that directory, so the Docker build would fail with "not found". Add an explicit exception matching the existing pattern for .git/. https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
The package was already named quickwit-fuzz; rename the directory to match. Also correct the binary output path in build.sh: as a workspace member the fuzz targets are built into the workspace-level target/, not a package-local target/. https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
Add path filters to push/pull_request triggers matching the pattern used by ci.yml: run on quickwit/** (excluding quickwit-ui/) plus the ClusterFuzzLite infra files themselves. schedule and workflow_dispatch are left unfiltered as GitHub Actions does not support paths: on those triggers. https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
…time format

- fuzz_otlp_spans / fuzz_otlp_logs: exercise parse_otlp_spans/logs_protobuf and parse_otlp_spans/logs_json from quickwit-opentelemetry. Both accept untrusted bytes from external tracing/logging agents (OTEL Collector, Fluentd) at the gRPC boundary. The recursive AnyValue processing inside has no depth limit, making this the highest-priority fuzzing surface.
- fuzz_doc_mapper_config: fuzz the DocMapperBuilder config deserialization and try_build() validation, i.e. the schema-definition path arriving via the index-creation REST API. Orthogonal to the existing fuzz_doc_mapper, which tests document ingestion against a fixed schema.
- fuzz_java_datetime_format: fuzz StrptimeParser::from_java_datetime_format and from_strptime, the format-string parsing side of datetime handling. Reachable from Elasticsearch range queries via the untrusted "format" field.

Also adds --fuzz-dir quickwit-fuzz to build.sh (needed after renaming fuzz/ to quickwit-fuzz/) and expands the binary loop to include the four new targets.

https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
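As a rough illustration of how a target that tests inputs "against a fixed schema" can be shaped, the sketch below builds a fixture once with `std::sync::LazyLock` and reuses it for every input. Everything here is hypothetical: `ALLOWED_FIELDS`, `fuzz_one`, and the toy `field:value` parser are stand-ins, not Quickwit code. In a real cargo-fuzz target the body would sit inside `libfuzzer_sys::fuzz_target!` and would not return a value; the `bool` exists only so the sketch can be checked.

```rust
use std::sync::LazyLock;

/// Stand-in for an expensive, fixed fixture (for example a schema). It is
/// initialized on first access and then shared by every fuzz iteration.
static ALLOWED_FIELDS: LazyLock<Vec<&'static str>> =
    LazyLock::new(|| vec!["timestamp", "body", "severity"]);

/// Hypothetical harness body. Returns true when the toy parser accepts the
/// input; malformed input is rejected rather than causing a panic.
fn fuzz_one(data: &[u8]) -> bool {
    // The toy parser takes &str, so non-UTF-8 input is rejected early.
    let Ok(text) = std::str::from_utf8(data) else {
        return false;
    };
    // Toy "parser": accept `field:value` when `field` is in the fixture.
    match text.split_once(':') {
        Some((field, _value)) => ALLOWED_FIELDS.iter().any(|known| *known == field),
        None => false,
    }
}
```

The point of the pattern is that fixture construction cost is paid once, so the fuzzer spends its time budget on the parsing path rather than on setup.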
- quickwit-fuzz/Cargo.toml: add license.workspace = true (required for cargo deny's unlicensed check on workspace members)
- deny.toml: add NCSA exception for libfuzzer-sys, which carries (MIT OR Apache-2.0) AND NCSA; NCSA is OSI/FSF-approved but was not in the workspace allow-list; libfuzzer-sys is never shipped in production
- fuzz_doc_mapper_config.rs: collapse nested if-let into a let-chain to satisfy clippy's collapsible_if lint (stable since Rust 1.88); also apply rustfmt formatting
- fuzz_doc_mapper.rs: apply rustfmt formatting (long line in LazyLock)

https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
libfuzzer-sys and its transitive dep arbitrary were introduced when quickwit-fuzz became a workspace member. Regenerated with dd-rust-license-tool (git version) using the existing openssl-macros override in license-tool.toml. https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
quickwit-proto's build.rs runs protobuf codegen via prost-build, which requires protoc at compile time. The base-builder-rust image does not include it. ci.yml already installs protobuf-compiler for the same reason; mirror that here. https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
The base-builder-rust image is Ubuntu 20.04 where apt-get installs protoc 3.6.1, which predates --experimental_allow_proto3_optional (added in 3.12, required by prost-build for proto3 optional fields). Replace with protoc 3.20.3 from the official GitHub releases, which fully supports the flag. curl is already present in the base image; add unzip for the zip archive. https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
Input `*\x0c...\x0c*` triggers a known tantivy bug in
UserInputLeaf::set_field ("Exist query without a field isn't
allowed"). libfuzzer-sys converts panics to abort(), which ASAN
reports as a crash. Use catch_unwind to suppress the panic so the
fuzzer can continue exploring the input space.
https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
catch_unwind alone is insufficient: libfuzzer-sys installs a panic
hook during initialization that calls std::process::abort() before
stack unwinding begins, so the unwind never reaches catch_unwind.
Fix by temporarily swapping the hook for a no-op around the call,
then restoring the original hook, so the known upstream tantivy bug
("Exist query without a field isn't allowed") is caught rather than
aborting the fuzzer process.
https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
- code-change (clusterfuzzlite.yml): runs on push/PR with path filters; 30s run to replay the crash corpus and catch regressions.
- batch (clusterfuzzlite-batch.yml): runs on a nightly schedule; 1-hour run to grow the corpus and find new bugs.

Previously both modes shared one workflow with a schedule trigger, causing code-change to run nightly (wasteful) and making it impossible to tune each mode independently.

https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB
Summary
Adds cargo-fuzz targets and ClusterFuzzLite CI to satisfy the OSSF Scorecard
Fuzzing check (was scoring 0). Also fixes a real panic discovered during
fuzzing, adds four more fuzz targets covering the highest-priority security
surfaces, and mitigates a panic found in tantivy's query grammar parser.
Changes
Fuzzing infrastructure
- `quickwit/quickwit-fuzz/`: cargo-fuzz crate, member of the parent workspace
- `.clusterfuzzlite/Dockerfile`: extends `gcr.io/oss-fuzz-base/base-builder-rust`; installs protoc 3.20.3 (Ubuntu 20.04's bundled 3.6.x predates `--experimental_allow_proto3_optional`)
- `.clusterfuzzlite/build.sh`: builds all fuzz targets with `RUSTFLAGS="--cfg tokio_unstable" cargo +nightly fuzz build --fuzz-dir quickwit-fuzz`
- `.github/workflows/clusterfuzzlite.yml`: code-change mode (push/PR); path filters skip frontend-only changes
- `.github/workflows/clusterfuzzlite-batch.yml`: batch mode on nightly schedule (1 hour); grows the corpus and finds new bugs independently of PR CI
Bug fix found by fuzzer
`quickwit-datetime/src/date_time_parsing.rs`: `parse_timestamp_str` panicked on inputs containing multi-byte UTF-8 characters in the subsecond position. The old code used `str.len().min(9)` (a byte count) to slice, which can land mid-character. Fixed with a byte-level scan that always stops at a valid char boundary.
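A byte-level scan of this kind could look roughly like the sketch below. The helper name `leading_ascii_digits` and its signature are assumptions made for illustration; the actual change in `date_time_parsing.rs` may be structured differently.

```rust
/// Returns the leading run of ASCII digits in `s`, truncated to at most
/// `max_digits` bytes. Hypothetical helper, not the quickwit-datetime code.
fn leading_ascii_digits(s: &str, max_digits: usize) -> &str {
    // Index of the first byte that is not an ASCII digit, or the full
    // length when the string consists of digits only.
    let digits_end = s
        .as_bytes()
        .iter()
        .position(|byte| !byte.is_ascii_digit())
        .unwrap_or(s.len());
    // Every byte before `digits_end` is a one-byte ASCII char, so any index
    // up to and including `digits_end` is a valid char boundary.
    &s[..digits_end.min(max_digits)]
}
```

Because the cut is computed from the digit run rather than from the total byte length, a trailing multi-byte character can no longer pull the slice index into the middle of a code point.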
Tantivy query grammar panic (mitigation)
`fuzz_query_string` found a panic in tantivy's `UserInputLeaf::set_field` (`query-grammar/src/user_input_ast.rs:51`: "Exist query without a field isn't allowed"), triggered by bare exist-style inputs such as `*:`. This is a bug in the upstream tantivy query grammar parser with no tracked issue yet.

Standard `catch_unwind` is insufficient here because libfuzzer-sys installs a panic hook that calls `abort()` before stack unwinding begins. The workaround temporarily replaces the hook with a no-op around the call, then restores it, so the fuzzer continues rather than aborting.
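The hook swap described above can be sketched with the standard library's `std::panic::take_hook`, `set_hook`, and `catch_unwind`. The helper name `call_with_silenced_hook` is hypothetical, and the sketch assumes the default unwinding panic strategy. The panic hook is process-wide, so it also assumes no other thread panics or changes the hook during the call.

```rust
use std::panic::{self, UnwindSafe};

/// Runs `f`, returning `None` if it panicked. The process-wide panic hook is
/// replaced by a no-op for the duration of the call and restored afterwards.
/// Hypothetical helper; not the code from the fuzz target itself.
fn call_with_silenced_hook<F, T>(f: F) -> Option<T>
where
    F: FnOnce() -> T + UnwindSafe,
{
    // Remove whatever hook is installed (under libfuzzer-sys, the aborting one).
    let original_hook = panic::take_hook();
    // Install a hook that does nothing, so unwinding can actually start.
    panic::set_hook(Box::new(|_| {}));
    let outcome = panic::catch_unwind(f);
    // Restore the original hook so unexpected panics elsewhere still abort.
    panic::set_hook(original_hook);
    outcome.ok()
}
```

Restoring the original hook straight after the call keeps the suppression narrow: only the wrapped call is silenced, and any panic elsewhere in the target is still reported as a crash.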
Fuzz targets (8 total)
- `fuzz_query_dsl`: `ElasticQueryDsl` JSON deserialization
- `fuzz_query_string`: `UserInputQuery` parser (found the upstream tantivy panic on inputs such as `*:`)
- `fuzz_datetime`: `parse_date_time_str` across all formats
- `fuzz_doc_mapper`: `DocMapper::doc_from_json_str`
- `fuzz_doc_mapper_config`: `DocMapperBuilder` deserialization + `try_build()`
- `fuzz_java_datetime_format`: `StrptimeParser::from_java_datetime_format` / `from_strptime` (reachable via the untrusted `"format"` field)
- `fuzz_otlp_spans`: `parse_otlp_spans_protobuf` + `parse_otlp_spans_json` (recursive `AnyValue` has no depth limit)
- `fuzz_otlp_logs`: `parse_otlp_logs_protobuf` + `parse_otlp_logs_json`
Other
- `.dockerignore`: added `!.clusterfuzzlite/` exception (was excluded by `**/.*`)
- `quickwit/Cargo.toml`: added `quickwit-fuzz` to workspace members
- `quickwit/deny.toml`: NCSA license exception for `libfuzzer-sys`
- `LICENSE-3rdparty.csv`: added `arbitrary` and `libfuzzer-sys` entries
Test plan
- `RUSTFLAGS="--cfg tokio_unstable" cargo +nightly fuzz build --fuzz-dir quickwit-fuzz`
- `fuzz_datetime` no longer panics on multi-byte UTF-8 subsecond inputs
- `fuzz_query_string` no longer aborts on `*:` and similar tantivy-panicking inputs
- `code-change` CI passes end-to-end
- `google/clusterfuzzlite` action → Fuzzing score > 0

https://claude.ai/code/session_01PKpEBTpgHSndurjdPbJodB