Support fuzz_targets property in clusterfuzz_manifest.json by notvictorl · Pull Request #5305 · google/clusterfuzz

notvictorl · 2026-06-01T17:19:34Z

Adds support for fuzz_targets property in clusterfuzz_manifest.json. This allows coverage-guided fuzzing jobs with ChromeBuildArchives to skip the unpacking and fuzz target discovery phases.

Problem:
Currently, ClusterFuzz unpacks the complete archive and then traverses the resulting directory to discover fuzz targets, which wastes time. Furthermore, on certain large build types, unpacking frequently triggers out-of-disk-space errors because the zip files are so large.

Solution:
By introducing fuzz_targets — a list of fuzz target paths relative to clusterfuzz_manifest.json — we can bypass the discovery step entirely. ClusterFuzz will now map this list so that each fuzz target has a normalized name (key) and its corresponding path (value).
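
As a rough illustration of the mapping described above, the sketch below builds a name-to-path dict from a manifest's fuzz_targets list. The normalize_target_name helper here is a hypothetical stand-in that just takes the basename; it is not ClusterFuzz's fuzzer_utils.normalize_target_name, and the manifest contents are made up for the example.

```python
import json
import posixpath


def normalize_target_name(path):
  """Hypothetical stand-in: use the file's basename as the target name."""
  return posixpath.basename(path)


def map_fuzz_targets(manifest_text):
  """Returns {normalized name: path} for the manifest's fuzz_targets list."""
  manifest = json.loads(manifest_text)
  paths = manifest.get('fuzz_targets', [])
  return {normalize_target_name(path): path for path in paths}


# Made-up manifest contents, for illustration only.
example_manifest = json.dumps({
    'archive_schema_version': 1,
    'fuzz_targets': ['out/build/my_fuzzer', 'out/build/other_fuzzer'],
})
targets = map_fuzz_targets(example_manifest)
```

With a mapping like this, looking up a target's path becomes a dict access and no longer requires walking an unpacked directory.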

More Info: crbug.com/508214240

letitz · 2026-06-02T08:34:55Z

      self.list_fuzz_targets()

-    if not fuzz_target in self._fuzz_targets:
+    if not self._fuzz_targets or fuzz_target not in self._fuzz_targets:


I think we should assume and document in DefaultBuildArchive that list_fuzz_targets() always sets self._fuzz_targets to a dict value. Then we can skip this second check for not self._fuzz_targets. This ties in to my comment below about making list_fuzz_targets() a concrete method.

Also, we should use return self._fuzz_targets.get(fuzz_target) to avoid the double lookup.
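
To make the double-lookup point concrete, here is a minimal sketch comparing the two forms on a plain dict. The function names are hypothetical and the real method lives on the build archive class; this only assumes that self._fuzz_targets is always a dict by the time it is read.

```python
def get_path_double_lookup(fuzz_targets, fuzz_target):
  """Membership test followed by indexing: two lookups for a hit."""
  if fuzz_target not in fuzz_targets:
    return None
  return fuzz_targets[fuzz_target]


def get_path_single_lookup(fuzz_targets, fuzz_target):
  """dict.get does one lookup and returns None when the key is absent."""
  return fuzz_targets.get(fuzz_target)


example_targets = {'my_fuzzer': 'out/build/my_fuzzer'}
```

Both functions return the same value for present and absent keys, so the change is purely a simplification.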

letitz · 2026-06-02T08:40:34Z

    # case, default to schema version 0.
    manifest_path = 'clusterfuzz_manifest.json'
    if self.file_exists(manifest_path):
      with self.open(manifest_path) as f:


This branch has grown enough, let's invert the branch with an early return for the schema v0 case to reduce nesting.

    if not self.file_exists(manifest_path):
      self._archive_schema_version = default_archive_schema_version
      return

    with self.open(manifest_path) as f:
      ...

letitz · 2026-06-02T08:43:58Z

        self._archive_schema_version = default_archive_schema_version
+
+      fuzz_targets_data = manifest.get('fuzz_targets')
+      if isinstance(fuzz_targets_data, list):


Again we can reduce nesting with an early return instead in the error case. Also a nit, let's name the variable fuzz_target_paths:

    if not isinstance(fuzz_target_paths, list):
      logs.error("not a list")
      return
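
Putting the two early-return suggestions together, the manifest handling might look roughly like the standalone sketch below. FakeReader, read_manifest, and the returned tuple are hypothetical and exist only to make the example runnable; the real code sets attributes on the archive object and logs through ClusterFuzz's logs module instead of the standard logging module.

```python
import io
import json
import logging

MANIFEST_PATH = 'clusterfuzz_manifest.json'
DEFAULT_SCHEMA_VERSION = 0


class FakeReader:
  """Hypothetical in-memory stand-in for an archive reader."""

  def __init__(self, files):
    self._files = files

  def file_exists(self, path):
    return path in self._files

  def open(self, path):
    return io.BytesIO(self._files[path])


def read_manifest(reader):
  """Returns (schema_version, fuzz_target_paths or None)."""
  # Early return for the schema v0 case: no manifest at all.
  if not reader.file_exists(MANIFEST_PATH):
    return DEFAULT_SCHEMA_VERSION, None

  with reader.open(MANIFEST_PATH) as f:
    manifest = json.load(f)

  version = manifest.get('archive_schema_version', DEFAULT_SCHEMA_VERSION)

  # Early return for the error case: fuzz_targets missing or not a list.
  fuzz_target_paths = manifest.get('fuzz_targets')
  if not isinstance(fuzz_target_paths, list):
    logging.error('fuzz_targets is not a list')
    return version, None

  return version, fuzz_target_paths
```

Each failure mode exits as soon as it is detected, so the successful path stays at a single indentation level.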

letitz · 2026-06-02T09:28:00Z



 class BuildArchive(archive.ArchiveReader):
  """Abstract class for representing a build archive. This is mostly an


Thinking aloud here. If you enjoy refactoring code and feel motivated to take this on, do feel free to, but it's straying off the path a fair bit so please also feel free to just ignore this.

As I review this, I'm coming to the conclusion that this abstraction is a bit wonky.

It really does not make much sense for it to inherit from ArchiveReader. At most it could expose self._reader through a getter, but from skimming the code that instantiates BuildArchive objects, it does not seem we need to reach down into the reader anyway.

unpacked_size() and unpack() don't need to be abstract, they are only overridden once in the class hierarchy by DefaultBuildArchive. All that's left apart from that is list_fuzz_targets() and get_target_dependencies().

The design would make more sense to me if we had:

ArchiveReader: interface that abstracts over different file formats (tar vs zip), allows operations on files

FuzzTargetFinder: interface that abstracts over how to find fuzz targets and their dependencies, given an archive reader

BuildArchive: concrete class that abstracts over different directory layouts inside the archive, allows operations on the whole build and fuzz targets, using an injected FuzzTargetFinder.

Something like:

    class FuzzTargetFinder(abc.ABC):

      @abc.abstractmethod
      def find_targets(self, reader: ArchiveReader) -> list[os.PathLike]:
        pass

      @abc.abstractmethod
      def get_target_dependencies(self, reader: ArchiveReader, target: os.PathLike) -> list[os.PathLike]:
        pass

    class DefaultFuzzTargetFinder(FuzzTargetFinder):
      pass

    class ChromeFuzzTargetFinder(FuzzTargetFinder):
      pass

    def make_fuzz_target_finder(archive_reader: ArchiveReader) -> FuzzTargetFinder:
      # check for args.gn, return either Finder class

    class BuildArchive:
      def __init__(self, fuzz_target_finder: FuzzTargetFinder):
        self._fuzz_target_finder = fuzz_target_finder

      def archive_reader(self) -> ArchiveReader:
        return self._fuzz_target_finder.archive_reader

letitz · 2026-06-02T09:51:06Z

@@ -185,8 +185,10 @@
    return to_extract

  def list_fuzz_targets(self) -> List[str]:


We're using inheritance here and below to override the initialization behavior of self._fuzz_targets, which works but is a bit messy and duplicates a bit of logic. See my topmost comment for a completely optional suggestion on a refactor.

A simpler way to make things a bit better here without reworking the class hierarchy entirely is maybe to define an abstract pure method that just handles finding fuzz targets, and share the implementation of list_fuzz_targets() in BuildArchive above.

    class BuildArchive:
      def __init__(self, reader):
        self._reader = reader
        self._fuzz_targets = None

      def list_fuzz_targets(self):
        if self._fuzz_targets is None:
          self._fuzz_targets = {
              fuzzer_utils.normalize_target_name(path): path
              for path in self.find_fuzz_targets()
          }
        return list(self._fuzz_targets.keys())

      @abc.abstractmethod
      def find_fuzz_targets(self) -> list[os.PathLike]:
        pass

    class DefaultBuildArchive(BuildArchive):
      def find_fuzz_targets(self):
        # use `is_fuzz_target()` to filter files in the archive
        ...

    class ChromeBuildArchive(DefaultBuildArchive):
      def __init__(self, reader):
        super().__init__(reader)
        self._fuzz_target_paths = ...  # from the manifest

      def find_fuzz_targets(self):
        if self._fuzz_target_paths:
          return self._fuzz_target_paths
        return super().find_fuzz_targets()

letitz · 2026-06-02T10:32:45Z

    test_archive = build_archive.ChromeBuildArchive(self.mock.open.return_value)
    self.assertEqual(test_archive.archive_schema_version(), 1)
+
+  def _generate_manifest_with_fuzz_targets(self, archive_schema_version,


Let's collapse all these _generate_manifest() functions into a single one that expects a dict. The format is simple enough that we can repeat it in most tests:

    def _generate_manifest(self, contents: any):
      json_contents = json.dumps(contents).encode()

      def _mock_open(_):
        buffer = io.BytesIO(b'')
        buffer.write(json_contents)
        buffer.seek(0)
        return buffer

      self.mock.open.return_value.open.side_effect = _mock_open

    # and call it like
    def test_foo(self):
      self._generate_manifest({ 'archive_schema_version': 1 })

letitz · 2026-06-02T10:35:09Z

+    test_archive.list_fuzz_targets()
+    self.mock.open.return_value.list_members.assert_called_once()
+
+  def test_manifest_fuzz_targets_missing(self):


We should have a test for what happens when only some of the entries are invalid.
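
One possible shape for such a test is sketched below. It assumes the intended behavior is to skip invalid entries and keep the valid ones, which is an assumption rather than anything shown in this PR, and filter_valid_target_paths is a hypothetical helper, not a ClusterFuzz function.

```python
import unittest


def filter_valid_target_paths(entries):
  """Hypothetical helper: keeps non-empty strings, drops everything else."""
  return [entry for entry in entries if isinstance(entry, str) and entry]


class PartiallyInvalidEntriesTest(unittest.TestCase):
  """Sketch of a test where only some manifest entries are invalid."""

  def test_some_entries_invalid(self):
    entries = ['out/build/my_fuzzer', 42, None, '', 'out/build/other_fuzzer']
    self.assertEqual(
        filter_valid_target_paths(entries),
        ['out/build/my_fuzzer', 'out/build/other_fuzzer'])
```

If the desired behavior is instead to reject the whole list when any entry is invalid, the assertion would change accordingly; either way the test pins the choice down.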

letitz · 2026-06-02T10:40:07Z

+    self.mock.file_exists.return_value = True
+    self._generate_manifest_with_fuzz_targets(
+        1, ['out/build/my_fuzzer', 'out/build/other_fuzzer'])
+    test_archive = build_archive.ChromeBuildArchive(self.mock.open.return_value)


[Optional] self.mock.open.return_value is mentioned many times, and its name is quite opaque. It may be worth defining either a local variable mock_archive_reader or even a test fixture class member self.mock_archive_reader.
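
A minimal sketch of the fixture-member variant, combined with the single _generate_manifest() helper suggested earlier. The class and test names are hypothetical, and the mock is created directly with unittest.mock instead of the test helpers the real test file uses.

```python
import io
import json
import unittest
from unittest import mock


class ManifestTestSketch(unittest.TestCase):
  """Hypothetical test case; not the PR's actual test class."""

  def setUp(self):
    # One descriptive name instead of repeating self.mock.open.return_value.
    self.mock_archive_reader = mock.MagicMock()
    self.mock_archive_reader.file_exists.return_value = True

  def _generate_manifest(self, contents):
    json_contents = json.dumps(contents).encode()
    # Hand back a fresh buffer on every open() call.
    self.mock_archive_reader.open.side_effect = (
        lambda _: io.BytesIO(json_contents))

  def test_manifest_round_trip(self):
    self._generate_manifest({'archive_schema_version': 1})
    with self.mock_archive_reader.open('clusterfuzz_manifest.json') as f:
      manifest = json.load(f)
    self.assertEqual(manifest['archive_schema_version'], 1)
```

In the real tests, self.mock_archive_reader would be passed to build_archive.ChromeBuildArchive(...) in place of self.mock.open.return_value.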

letitz · 2026-06-02T10:44:07Z

    self.mock.unzip_over_http_compatible.return_value = True
    self.mock.time.return_value = 1000.0
-    build = build_manager.setup_regular_build(2)
+    fuzz_target = build_manager.pick_random_fuzz_target(


This test may have been exercising the code path for blackbox fuzzing, and no longer is. Is there another test that checks what happens when fuzz_target is None? If not, we should duplicate the old test to conserve coverage.


notvictorl requested a review from a team as a code owner June 1, 2026 17:19

notvictorl force-pushed the liuvic/fuzz_targets branch 4 times, most recently from 828343c to a4fc5d3 on June 1, 2026 18:47

Support fuzz_targets property in clusterfuzz_manifest.json

2647ce6

notvictorl force-pushed the liuvic/fuzz_targets branch from a4fc5d3 to 2647ce6 on June 1, 2026 19:39

notvictorl requested a review from letitz June 1, 2026 20:14

letitz reviewed Jun 2, 2026

notvictorl wants to merge 1 commit into google:master from notvictorl:liuvic/fuzz_targets
