Skip to content

feat(core): require TrackerResult from collectors and migrate in-repo trackers#279

Merged
wpak-ai merged 6 commits into
cppalliance:developfrom
leostar0412:feat/collector-tracker-result-protocol
Jun 10, 2026
Merged

feat(core): require TrackerResult from collectors and migrate in-repo trackers#279
wpak-ai merged 6 commits into
cppalliance:developfrom
leostar0412:feat/collector-tracker-result-protocol

Conversation

@leostar0412

@leostar0412 leostar0412 commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Summary

Extend AbstractCollector.collect() to return TrackerResult instead of None. run() validates the result, backfills duration_seconds, and exposes last_result.
Add shared DTOs (GenericTrackerResult, GenericIncrementalState) and optional incremental checkpoint hooks (load_incremental_state / persist_incremental_state).
Migrate all in-monorepo collectors to frozen protocol_impl.py DTOs and log structured outcomes from BaseCollectorCommand.
Update collector docs (Core_public_API.md, tutorial, how-to) and add protocol conformance tests.
Blocks the companion PR in wg21-reflector-collector (overlay CI depends on this core contract).

Apps touched

  • core
  • boost_library_docs_tracker
  • boost_library_tracker
  • boost_library_usage_dashboard
  • boost_mailing_list_tracker
  • boost_usage_tracker
  • clang_github_tracker
  • cppa_pinecone_sync
  • cppa_slack_tracker
  • cppa_user_tracker
  • cppa_youtube_script_tracker
  • discord_activity_tracker
  • github_activity_tracker
  • wg21_paper_tracker
  • docs

Test plan

  • python -m pytest core/tests/test_collectors_base.py core/tests/test_collector_protocol_conformance.py core/tests/test_protocols.py
  • python -m pytest */tests/test_protocol_impl.py
  • uv run pyright (if typed code changed)
  • lint-imports (if imports or cross-app coupling changed)
  • App command smoke-tested (if collector/command changed):

Docs / coupling

  • cross-app-dependencies.md updated (if FKs or cross-app imports changed)
  • python scripts/generate_service_docs.py run (if services.py or core/protocols.py changed)
  • App README or docs/ updated (if behavior or ops changed)

Closes #275

Summary by CodeRabbit

  • New Features

    • Collectors now return structured TrackerResult objects and many trackers support incremental checkpoint state.
  • Improvements

    • Standardized, immutable result types across apps (counts, errors, duration), dry‑run reporting, per-run merging/aggregation, and enhanced command logging including last_result semantics.
  • Tests & Documentation

    • Added protocol conformance and result/state tests; updated docs and tutorials to reflect the new collector lifecycle and result contract.

@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6669ec4a-61d0-4bd6-bae0-bafdf7b4c30d

📥 Commits

Reviewing files that changed from the base of the PR and between 44cfb18 and aaf5ff7.

📒 Files selected for processing (1)
  • reddit_activity_tracker/management/commands/run_reddit_activity_tracker.py

📝 Walkthrough

Walkthrough

Collectors and commands now return protocol-conformant TrackerResult (and optional IncrementalState) objects; core lifecycle enforces result validation, backfills duration, persists incremental state, app-specific frozen DTOs added/expanded, tests updated for runtime conformance, and docs/scaffold adjusted.

Changes

Core Protocol & Framework Integration

Layer / File(s) Summary
Protocol extensions and generic type implementations
core/protocols.py, core/tracker_result.py, core/incremental_state.py, core/collectors/__init__.py
TrackerResult gains errors and duration_seconds. Added GenericTrackerResult with ok()/failed() and duration helpers. GenericIncrementalState added and re-exported. Runtime helper require_incremental_state() added.
AbstractCollector lifecycle enforcing TrackerResult return
core/collectors/base_collector.py
CollectorRunnable.run()/AbstractCollector.run() now return TrackerResult. Lifecycle loads incremental state, calls collect(), enforces protocol conformance, backfills duration, sets last_result, persists incremental out state in post_collect, and preserves on_error semantics.
BaseCollectorCommand result logging and structured output
core/collectors/command_base.py
Added helpers to compute records_collected and log standardized "Collector finished" structured messages using collector.last_result when present.

Tracker App Protocol Implementations

Layer / File(s) Summary
Simple collectors and DTOs
boost_library_docs_tracker/protocol_impl.py, boost_library_docs_tracker/management/commands/run_boost_library_docs_tracker.py, cppa_user_tracker/management/commands/run_cppa_user_tracker.py, boost_usage_tracker/management/commands/run_boost_usage_tracker.py, cppa_youtube_script_tracker/protocol_impl.py, cppa_youtube_script_tracker/management/commands/run_cppa_youtube_script_tracker.py, boost_library_usage_dashboard/protocol_impl.py, boost_library_usage_dashboard/collectors.py
Added/fixed per-app tracker DTOs and changed collect/command flows to return TrackerResult instances (e.g., LibraryDocsTrackerResult, YoutubeScriptTrackerResult, UsageDashboardTrackerResult).
Stateful collectors with incremental state
boost_library_tracker/protocol_impl.py, boost_library_tracker/management/commands/collect_boost_libraries.py, boost_mailing_list_tracker/protocol_impl.py, boost_mailing_list_tracker/management/commands/run_boost_mailing_list_tracker.py, discord_activity_tracker/protocol_impl.py, discord_activity_tracker/management/commands/run_discord_activity_tracker.py, cppa_slack_tracker/protocol_impl.py, cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py, clang_github_tracker/protocol_impl.py, clang_github_tracker/collectors.py
Collectors implement load_incremental_state() where applicable and return concrete result/state DTOs. Command flows capture and return counts/errors/duration via factory constructors.
Multi-repo and sync result merging
boost_library_tracker/management/commands/run_boost_github_activity_tracker.py, github_activity_tracker/protocol_impl.py, github_activity_tracker/tests/test_protocol_impl.py, cppa_pinecone_sync/management/commands/run_cppa_pinecone_sync.py, cppa_pinecone_sync/protocol_impl.py, wg21_paper_tracker/collectors.py, wg21_paper_tracker/protocol_impl.py
Per-repo sync dicts are normalized into TrackerResult DTOs and merged (e.g., GitHubSyncTrackerResult.merge, PineconeSyncTrackerResult.from_sync_dict). WG21 pipeline mapped to Wg21PaperTrackerResult.

Test Infrastructure & Documentation

Layer / File(s) Summary
Core protocol conformance and lifecycle tests
core/tests/test_collector_protocol_conformance.py, core/tests/test_collectors_base.py, core/tests/test_protocols.py
New/updated tests assert runtime isinstance() conformance to TrackerResult and IncrementalState, enforce collect() return contract, duration/backfill behavior, and logging of result fields.
App protocol tests and scaffold updates
core/management/commands/startcollector.py, boost_library_tracker/tests/test_protocol_impl.py, boost_mailing_list_tracker/tests/test_protocol_impl.py, clang_github_tracker/tests/test_protocol_impl.py, cppa_pinecone_sync/tests/test_protocol_impl.py, cppa_slack_tracker/tests/test_protocol_impl.py, wg21_paper_tracker/tests/test_protocol_impl.py, github_activity_tracker/tests/test_protocol_impl.py, boost_library_tracker/tests/test_collect_boost_libraries_command.py, discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py
Generated collector stubs now return GenericTrackerResult.ok(). Each app adds protocol_impl tests ensuring factory constructors return TrackerResult/IncrementalState. Existing command tests updated for new return patterns.
Docs: collector contract and tutorials
core/collectors/README.md, docs/service_api/core_protocols.md, docs/Core_public_API.md, docs/How_to_add_a_collector.md, docs/Tutorial_building_a_collector.md, core/pyright_samples/protocol_assignment_positive.py
Documentation and tutorials updated to require collect() return TrackerResult, show incremental-state hook, update skeletons to mention optional protocol_impl.py, and include require_incremental_state and new protocol fields.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • jonathanMLDev
  • snowfox1003
  • wpak-ai

Poem

🐰 I hopped through code with tidy paws,

made results immutable, fixed all the laws.
Counters and markers in neat little rows,
tests nod and the runtime now knows.
Hop on, reviewer — the structured trail glows.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
discord_activity_tracker/management/commands/run_discord_activity_tracker.py (1)

170-180: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

channels count is wrong in full-sync mode.

When no --channels allowlist is set, the command can process many channels but returns "channels": 0. This makes TrackerResult.counts inaccurate.

Suggested fix
 def task_discord_sync(
@@
-) -> int:
+) -> tuple[int, int]:
@@
-    processed_total = 0
+    processed_total = 0
+    processed_channels = 0
@@
             count = asyncio.run(
                 collector._persist_channel(guild_info, channel_info, messages)
             )
             processed_total += count
+            processed_channels += 1
@@
-    return processed_total
+    return processed_total, processed_channels
@@
-            messages_synced = task_discord_sync(
+            messages_synced, channels_synced = task_discord_sync(
@@
                 counts={
                     "messages": messages_synced,
-                    "channels": (
-                        len(collector.channel_ids) if collector.channel_ids else 0
-                    ),
+                    "channels": channels_synced,
                 },
             )

Also applies to: 224-277, 641-672

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@discord_activity_tracker/management/commands/run_discord_activity_tracker.py`
around lines 170 - 180, The reported bug is that the "channels" count in the
result is set to 0 when no --channels allowlist is provided (full-sync); inside
task_discord_sync you must compute and return the actual number of channels
processed rather than using len(channel_ids) which is 0 for full-sync. Fix by
deriving the count from the actual channels iterated/processed (e.g. use the
collector's resolved channel list or the processed_channels collection produced
by collector.run) and assign that value to the "channels" field returned in
TrackerResult.counts; update the same logic in the other affected blocks (the
other task_discord_sync usages at the noted ranges) so they all use the real
processed channel list instead of channel_ids.
discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py (1)

136-150: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Propagate per-file failures into TrackerResult instead of always returning success.

collect() catches file-level exceptions, but the final DTO still reports success=True with no errors. That misreports failed runs as successful and weakens the new structured outcome contract.

💡 Suggested fix
@@
-        processed_total = 0
+        processed_total = 0
+        failures: list[str] = []
@@
             except Exception as exc:
                 rel = _json_display_path(import_dir, json_path)
                 logger.error("Failed to process %s: %s", rel, exc)
                 self.stdout.write(self.style.ERROR(f"    Failed {rel}: {exc}"))
+                failures.append(f"{rel}: {exc}")
@@
-        return DiscordCollectionTrackerResult(
-            success=True,
-            counts={"messages": processed_total, "files": len(json_files)},
-        )
+        return DiscordCollectionTrackerResult(
+            success=not failures,
+            counts={
+                "messages": processed_total,
+                "files": len(json_files),
+                "failed_files": len(failures),
+            },
+            errors=tuple(failures),
+        )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py`
around lines 136 - 150, The current collect() implementation logs per-file
exceptions but always returns DiscordCollectionTrackerResult(success=True,
counts=...), which hides failures; modify collect() to accumulate per-file
errors (e.g., create an errors = [] list and append a structured entry
containing rel (from _json_display_path(import_dir, json_path)) and exc inside
the except block where logger.error and stdout.write are called), then set
success = len(errors) == 0 and include errors in the returned
DiscordCollectionTrackerResult (e.g.,
DiscordCollectionTrackerResult(success=success, counts={"messages":
processed_total, "files": len(json_files)}, errors=errors)); keep existing
logging/stdout behavior but ensure the DTO reflects any failures.
🧹 Nitpick comments (1)
cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py (1)

219-234: 💤 Low value

Missing error count tracking for channel memberships.

sync_channel_users returns (success_count, error_count) (line 223), but unlike _sync_users, _sync_channels, and _sync_messages, the error_count is not accumulated into self._counts. This results in inconsistent error tracking in the final SlackTrackerResult.

Suggested fix
         self._counts["channel_memberships"] = (
             self._counts.get("channel_memberships", 0) + success_count
         )
+        self._counts["channel_membership_errors"] = (
+            self._counts.get("channel_membership_errors", 0) + error_count
+        )
         logger.info(
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py` around
lines 219 - 234, The _sync_channel_users method is not accumulating error_count
into self._counts for "channel_memberships"; update _sync_channel_users (which
calls sync_channel_users) to increment
self._counts["channel_memberships_errors"] (or follow the existing error key
pattern used elsewhere) by error_count after the call, mirroring how
success_count is added to self._counts so the final SlackTrackerResult includes
channel membership errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@boost_library_docs_tracker/protocol_impl.py`:
- Around line 13-16: LibraryDocsTrackerResult is frozen but its counts field is
a plain dict created in from_run, allowing mutation; update the from_run
constructor to defensively copy counts and wrap them in an immutable mapping
(e.g., types.MappingProxyType(dict(copy)) or an equivalent immutable/frozendict)
before assigning to the counts field so the frozen dataclass is deep-immutable
for that attribute, and keep the declared type Mapping[str,int] unchanged;
reference LibraryDocsTrackerResult and its from_run factory to locate and modify
the creation/assignment of counts.

In `@boost_library_tracker/protocol_impl.py`:
- Around line 13-17: CollectBoostLibrariesResult is frozen but still accepts
mutable dicts for counts via from_totals() and empty(), allowing external
mutation; fix by making counts truly immutable by wrapping any dict passed into
counts with types.MappingProxyType. Update from_totals() and empty() to import
types.MappingProxyType and construct counts as MappingProxyType(totals_dict) (or
MappingProxyType({}) for empty) so the frozen dataclass cannot be mutated at
runtime; adjust any other places that construct CollectBoostLibrariesResult to
perform the same wrapping.

In `@core/collectors/base_collector.py`:
- Around line 94-97: The code currently trusts the value in
self._incremental_state_out and calls self.persist_incremental_state(state_out)
without validation; update the lifecycle boundary after load_incremental_state
(the place where _incremental_state_out is read) to validate the
incremental-state shape/type before persisting: call the same validation logic
you use for TrackerResult (or implement a small validator) to ensure required
keys/types/serializability, and if validation fails either raise a clear error
or log and skip persisting. Specifically modify the block reading
self._incremental_state_out (and the analogous spot around line 237) to run
validate_incremental_state(state_out) (or inline checks) and only call
persist_incremental_state(state_out) when valid, otherwise handle the error path
consistently.
- Around line 34-36: The run method on CollectorRunnable currently has a
nullable return (def run(self) -> TrackerResult | None) which allows
implementations to return None and undermines the new contract; change the
signature to def run(self) -> TrackerResult (remove | None) in base_collector.py
(CollectorRunnable / AbstractCollector), update any subclass implementations of
CollectorRunnable.run (and helper functions that construct or call run) to
always return a TrackerResult instead of None, and update import/type hints
accordingly so callers and implementations type-check against the non-nullable
TrackerResult contract.

In `@core/incremental_state.py`:
- Around line 9-15: GenericIncrementalState currently accepts a plain dict via
the field(default_factory=dict) so extras remains mutable; change it to an
immutable mapping by wrapping the provided mapping in types.MappingProxyType in
a __post_init__ (use object.__setattr__ because the dataclass is frozen).
Specifically, add import MappingProxyType, implement
GenericIncrementalState.__post_init__ that replaces self.extras with
MappingProxyType(dict(self.extras)) (or MappingProxyType(self.extras) after
copying) to ensure extras is read-only after construction and preserve the
frozen DTO contract.

In `@core/tests/test_collector_protocol_conformance.py`:
- Around line 34-37: The test module unconditionally imports
Wg21ReflectorIncrementalState and Wg21ReflectorTrackerResult causing pytest
collection to fail when wg21_reflector_collector is absent; wrap the import in a
try/except ImportError and set a boolean flag (e.g. HAS_WG21_REFLECTOR) so the
rest of the file still loads, then make the reflector-specific parametrized
cases conditional (either skip them with pytest.mark.skipif(not
HAS_WG21_REFLECTOR) or only append those params when HAS_WG21_REFLECTOR is True)
so references to Wg21ReflectorIncrementalState and Wg21ReflectorTrackerResult
are only used when the package is available.

In `@core/tracker_result.py`:
- Around line 12-19: GenericTrackerResult is frozen but its counts field can
still hold a mutable Mapping; make counts truly immutable by converting/copying
it into an immutable mapping in the dataclass post-init. Add a __post_init__ on
GenericTrackerResult that uses types.MappingProxyType(dict(self.counts)) (or
similar defensive copy-to-immutable) and assign it with object.__setattr__(self,
"counts", <mapping-proxy>) so callers passing mutable dicts cannot mutate the
stored counts; ensure you import types.MappingProxyType (or chosen immutable
wrapper). This change preserves the frozen dataclass semantics and fixes the
mutability for GenericTrackerResult.counts (also apply same pattern where the
same pattern is used).
- Around line 46-47: with_duration_if_missing currently unconditionally calls
dataclasses.replace(result, duration_seconds=...), which fails for dataclass
fields that are not init=True or are property-backed; update the function to
first inspect dataclasses.fields(result) for a field named "duration_seconds"
and only call replace if that field exists and field.init is True, otherwise
attempt to set the attribute directly via setattr(result, "duration_seconds",
duration_seconds) if the instance is mutable (i.e., not frozen), and if neither
approach works return the original result unchanged; reference the function
with_duration_if_missing, the variable result, and
dataclasses.replace/dataclasses.fields in your change.

In `@cppa_pinecone_sync/management/commands/run_cppa_pinecone_sync.py`:
- Around line 89-93: The completion log is using the wrong metric key: the
logging call that formats "CPPA Pinecone Sync completed: upserted=%s, total=%s,
failed_count=%s" is reading result.counts.get("errors", 0) instead of the
intended failed_count metric; update the logging call to use
result.counts.get("failed_count", 0) (locate the logger call that references
result.counts and change the key), ensuring the upserted/total/failed_count
values reflect the correct keys.

In `@cppa_slack_tracker/protocol_impl.py`:
- Around line 13-16: SlackTrackerResult.counts and SlackIncrementalState.extras
are currently mutable dicts despite dataclass(frozen=True); update construction
sites (SlackTrackerResult.from_counts, SlackTrackerResult.dry_run,
SlackIncrementalState.from_team and any default_factory=dict) to wrap/replace
dicts with an immutable mapping (e.g., types.MappingProxyType or a
frozendict-equivalent) before assigning so their contents cannot be mutated
post-instantiation; ensure the dataclass fields remain typed as Mapping[str,int]
/ Mapping[str,Any] and replace any default_factory=dict with an immutable
default (or None plus a wrapper) to preserve frozen semantics.

In `@cppa_youtube_script_tracker/protocol_impl.py`:
- Around line 13-17: YoutubeScriptTrackerResult is frozen but its counts field
can be a mutable dict; wrap counts in an immutable mapping to prevent mutation
by callers. In the YoutubeScriptTrackerResult dataclass add a __post_init__ that
imports types.MappingProxyType and uses object.__setattr__(self, "counts",
MappingProxyType(dict(self.counts))) (or MappingProxyType(self.counts) after
copying) so counts becomes an immutable mapping; reference the
YoutubeScriptTrackerResult class and the counts attribute when making this
change.

In `@wg21_paper_tracker/protocol_impl.py`:
- Around line 16-19: Wg21PaperTrackerResult's counts is currently a mutable dict
allowing post-creation mutation; wrap all dict literals assigned to counts in an
immutable mapping (e.g., types.MappingProxyType) and ensure the default factory
produces an immutable mapping as well. Concretely, import types and replace any
assignments in from_pipeline() and dry_run() like {"new_papers": n} or {} with
types.MappingProxyType({"new_papers": n}) / types.MappingProxyType({}), and
change the dataclass field default_factory for counts to a lambda returning
types.MappingProxyType({}) so counts is deeply immutable after construction.

---

Outside diff comments:
In
`@discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py`:
- Around line 136-150: The current collect() implementation logs per-file
exceptions but always returns DiscordCollectionTrackerResult(success=True,
counts=...), which hides failures; modify collect() to accumulate per-file
errors (e.g., create an errors = [] list and append a structured entry
containing rel (from _json_display_path(import_dir, json_path)) and exc inside
the except block where logger.error and stdout.write are called), then set
success = len(errors) == 0 and include errors in the returned
DiscordCollectionTrackerResult (e.g.,
DiscordCollectionTrackerResult(success=success, counts={"messages":
processed_total, "files": len(json_files)}, errors=errors)); keep existing
logging/stdout behavior but ensure the DTO reflects any failures.

In
`@discord_activity_tracker/management/commands/run_discord_activity_tracker.py`:
- Around line 170-180: The reported bug is that the "channels" count in the
result is set to 0 when no --channels allowlist is provided (full-sync); inside
task_discord_sync you must compute and return the actual number of channels
processed rather than using len(channel_ids) which is 0 for full-sync. Fix by
deriving the count from the actual channels iterated/processed (e.g. use the
collector's resolved channel list or the processed_channels collection produced
by collector.run) and assign that value to the "channels" field returned in
TrackerResult.counts; update the same logic in the other affected blocks (the
other task_discord_sync usages at the noted ranges) so they all use the real
processed channel list instead of channel_ids.

---

Nitpick comments:
In `@cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py`:
- Around line 219-234: The _sync_channel_users method is not accumulating
error_count into self._counts for "channel_memberships"; update
_sync_channel_users (which calls sync_channel_users) to increment
self._counts["channel_memberships_errors"] (or follow the existing error key
pattern used elsewhere) by error_count after the call, mirroring how
success_count is added to self._counts so the final SlackTrackerResult includes
channel membership errors.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f9f2033b-5987-49ac-85ed-8f3fcb6b3274

📥 Commits

Reviewing files that changed from the base of the PR and between 8b4cba2 and 17590e0.

📒 Files selected for processing (50)
  • boost_library_docs_tracker/management/commands/run_boost_library_docs_tracker.py
  • boost_library_docs_tracker/protocol_impl.py
  • boost_library_docs_tracker/tests/test_run_boost_library_docs_tracker_command.py
  • boost_library_tracker/management/commands/collect_boost_libraries.py
  • boost_library_tracker/management/commands/run_boost_github_activity_tracker.py
  • boost_library_tracker/protocol_impl.py
  • boost_library_tracker/tests/test_collect_boost_libraries_command.py
  • boost_library_tracker/tests/test_protocol_impl.py
  • boost_library_usage_dashboard/collectors.py
  • boost_library_usage_dashboard/protocol_impl.py
  • boost_mailing_list_tracker/management/commands/run_boost_mailing_list_tracker.py
  • boost_mailing_list_tracker/protocol_impl.py
  • boost_mailing_list_tracker/tests/test_protocol_impl.py
  • boost_usage_tracker/management/commands/run_boost_usage_tracker.py
  • clang_github_tracker/collectors.py
  • clang_github_tracker/protocol_impl.py
  • clang_github_tracker/tests/test_protocol_impl.py
  • core/collectors/README.md
  • core/collectors/__init__.py
  • core/collectors/base_collector.py
  • core/collectors/command_base.py
  • core/incremental_state.py
  • core/management/commands/startcollector.py
  • core/protocols.py
  • core/pyright_samples/protocol_assignment_positive.py
  • core/tests/test_collector_protocol_conformance.py
  • core/tests/test_collectors_base.py
  • core/tests/test_protocols.py
  • core/tracker_result.py
  • cppa_pinecone_sync/management/commands/run_cppa_pinecone_sync.py
  • cppa_pinecone_sync/protocol_impl.py
  • cppa_pinecone_sync/tests/test_protocol_impl.py
  • cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py
  • cppa_slack_tracker/protocol_impl.py
  • cppa_slack_tracker/tests/test_protocol_impl.py
  • cppa_user_tracker/management/commands/run_cppa_user_tracker.py
  • cppa_youtube_script_tracker/management/commands/run_cppa_youtube_script_tracker.py
  • cppa_youtube_script_tracker/protocol_impl.py
  • discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py
  • discord_activity_tracker/management/commands/run_discord_activity_tracker.py
  • discord_activity_tracker/protocol_impl.py
  • docs/Core_public_API.md
  • docs/How_to_add_a_collector.md
  • docs/Tutorial_building_a_collector.md
  • docs/service_api/core_protocols.md
  • github_activity_tracker/protocol_impl.py
  • github_activity_tracker/tests/test_protocol_impl.py
  • wg21_paper_tracker/collectors.py
  • wg21_paper_tracker/protocol_impl.py
  • wg21_paper_tracker/tests/test_protocol_impl.py

Comment thread boost_library_docs_tracker/protocol_impl.py
Comment thread boost_library_tracker/protocol_impl.py
Comment thread core/collectors/base_collector.py Outdated
Comment thread core/collectors/base_collector.py
Comment thread core/incremental_state.py
Comment thread core/tracker_result.py Outdated
Comment thread cppa_pinecone_sync/management/commands/run_cppa_pinecone_sync.py
Comment thread cppa_slack_tracker/protocol_impl.py
Comment thread cppa_youtube_script_tracker/protocol_impl.py
Comment thread wg21_paper_tracker/protocol_impl.py
@leostar0412 leostar0412 self-assigned this Jun 9, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/collectors/base_collector.py (1)

250-253: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Defer committing last_result until the run fully succeeds.

If post_collect() or persist_incremental_state() raises, run() fails but last_result has already been updated to the new result. That contradicts the property docstring and can expose a failed run as the “most recent successful” one.

💡 Minimal fix
 def run(self) -> TrackerResult:
+    previous_result = getattr(self, "_last_result", None)
     self._incremental_state_in = None
     self._incremental_state_out = None
     started = time.monotonic()
     try:
         self.pre_collect()
@@
         raw_result = self.collect()
         result = require_tracker_result(raw_result)
         elapsed = time.monotonic() - started
         result = with_duration_if_missing(result, elapsed)
         self._last_result = result
         self.post_collect()
         return result
     except Exception as exc:
+        self._last_result = previous_result
         try:
             self.on_error(exc)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/collectors/base_collector.py` around lines 250 - 253, The code currently
assigns self._last_result before calling post_collect() (and before any
persist_incremental_state() inside run()), which can mark a failed run as “most
recent successful”; change the sequence in run()/collect flow so that result =
with_duration_if_missing(...) is computed but self._last_result is only updated
after post_collect() and after any call to persist_incremental_state() completes
without raising—i.e., move the assignment to self._last_result to the end of the
successful run path (after post_collect() and persistence), leaving
elapsed/with_duration_if_missing intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@core/collectors/base_collector.py`:
- Around line 250-253: The code currently assigns self._last_result before
calling post_collect() (and before any persist_incremental_state() inside
run()), which can mark a failed run as “most recent successful”; change the
sequence in run()/collect flow so that result = with_duration_if_missing(...) is
computed but self._last_result is only updated after post_collect() and after
any call to persist_incremental_state() completes without raising—i.e., move the
assignment to self._last_result to the end of the successful run path (after
post_collect() and persistence), leaving elapsed/with_duration_if_missing
intact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 684e1b60-7f1b-48e4-83ee-ad771c0beeb9

📥 Commits

Reviewing files that changed from the base of the PR and between efbbfbb and dc1233f.

📒 Files selected for processing (12)
  • boost_library_docs_tracker/protocol_impl.py
  • boost_library_tracker/protocol_impl.py
  • core/collectors/base_collector.py
  • core/collectors/command_base.py
  • core/incremental_state.py
  • core/tracker_result.py
  • cppa_pinecone_sync/management/commands/run_cppa_pinecone_sync.py
  • cppa_pinecone_sync/protocol_impl.py
  • cppa_pinecone_sync/tests/test_protocol_impl.py
  • cppa_slack_tracker/protocol_impl.py
  • cppa_youtube_script_tracker/protocol_impl.py
  • wg21_paper_tracker/protocol_impl.py
🚧 Files skipped from review as they are similar to previous changes (11)
  • cppa_youtube_script_tracker/protocol_impl.py
  • cppa_slack_tracker/protocol_impl.py
  • core/incremental_state.py
  • wg21_paper_tracker/protocol_impl.py
  • boost_library_tracker/protocol_impl.py
  • core/tracker_result.py
  • core/collectors/command_base.py
  • boost_library_docs_tracker/protocol_impl.py
  • cppa_pinecone_sync/tests/test_protocol_impl.py
  • cppa_pinecone_sync/management/commands/run_cppa_pinecone_sync.py
  • cppa_pinecone_sync/protocol_impl.py

@jonathanMLDev jonathanMLDev left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also update CHANGELOG.md, STABILITY.md

Comment thread core/collectors/base_collector.py
@wpak-ai wpak-ai merged commit e0e4dac into cppalliance:develop Jun 10, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

IncrementalState/TrackerResult protocol wiring across collectors

3 participants