feat(core): require TrackerResult from collectors and migrate in-repo trackers by leostar0412 · Pull Request #279 · cppalliance/boost-data-collector

leostar0412 · 2026-06-09T19:06:26Z

Summary

- Extend AbstractCollector.collect() to return TrackerResult instead of None. run() validates the result, backfills duration_seconds, and exposes last_result.
- Add shared DTOs (GenericTrackerResult, GenericIncrementalState) and optional incremental checkpoint hooks (load_incremental_state / persist_incremental_state).
- Migrate all in-monorepo collectors to frozen protocol_impl.py DTOs and log structured outcomes from BaseCollectorCommand.
- Update collector docs (Core_public_API.md, tutorial, how-to) and add protocol conformance tests.
- Blocks the companion PR in wg21-reflector-collector (overlay CI depends on this core contract).
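
The contract in the summary above can be sketched roughly as follows. This is a minimal illustration, not the code in `core/collectors/base_collector.py`: the protocol fields are the ones named in this PR (`success`, `counts`, `errors`, `duration_seconds`), while `SketchResult`, `SketchCollector`, and the exact validation logic are hypothetical stand-ins.

```python
import time
from dataclasses import dataclass, field, replace
from typing import Mapping, Optional, Protocol, Tuple, runtime_checkable


@runtime_checkable
class TrackerResult(Protocol):
    """Assumed shape of the result protocol (field names taken from this PR)."""

    success: bool
    counts: Mapping[str, int]
    errors: Tuple[str, ...]
    duration_seconds: Optional[float]


@dataclass(frozen=True)
class SketchResult:
    """Hypothetical frozen DTO that structurally satisfies TrackerResult."""

    success: bool
    counts: Mapping[str, int] = field(default_factory=dict)
    errors: Tuple[str, ...] = ()
    duration_seconds: Optional[float] = None


class SketchCollector:
    """Hypothetical collector: collect() must return a TrackerResult."""

    def __init__(self) -> None:
        self.last_result: Optional[TrackerResult] = None

    def collect(self) -> TrackerResult:
        return SketchResult(success=True, counts={"items": 3})

    def run(self) -> TrackerResult:
        started = time.monotonic()
        result = self.collect()
        # Validate: anything that is not a TrackerResult (e.g. None) is rejected.
        if not isinstance(result, TrackerResult):
            raise TypeError(f"collect() returned {type(result).__name__}")
        # Backfill the duration only when the collector did not set one.
        if result.duration_seconds is None:
            result = replace(result, duration_seconds=time.monotonic() - started)
        self.last_result = result
        return result
```

A collector whose `collect()` still returns `None` fails fast with `TypeError` in this sketch, which is the behavior change the summary describes for `run()`.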

Apps touched

core
boost_library_docs_tracker
boost_library_tracker
boost_library_usage_dashboard
boost_mailing_list_tracker
boost_usage_tracker
clang_github_tracker
cppa_pinecone_sync
cppa_slack_tracker
cppa_user_tracker
cppa_youtube_script_tracker
discord_activity_tracker
github_activity_tracker
wg21_paper_tracker
docs

Test plan

python -m pytest core/tests/test_collectors_base.py core/tests/test_collector_protocol_conformance.py core/tests/test_protocols.py
python -m pytest */tests/test_protocol_impl.py
uv run pyright (if typed code changed)
lint-imports (if imports or cross-app coupling changed)
App command smoke-tested (if collector/command changed):

Docs / coupling

cross-app-dependencies.md updated (if FKs or cross-app imports changed)
python scripts/generate_service_docs.py run (if services.py or core/protocols.py changed)
App README or docs/ updated (if behavior or ops changed)

Closes #275

Summary by CodeRabbit

New Features
- Collectors now return structured TrackerResult objects and many trackers support incremental checkpoint state.
Improvements
- Standardized, immutable result types across apps (counts, errors, duration), dry‑run reporting, per-run merging/aggregation, and enhanced command logging including last_result semantics.
Tests & Documentation
- Added protocol conformance and result/state tests; updated docs and tutorials to reflect the new collector lifecycle and result contract.

coderabbitai · 2026-06-09T19:06:49Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6669ec4a-61d0-4bd6-bae0-bafdf7b4c30d

📥 Commits

Reviewing files that changed from the base of the PR and between 44cfb18 and aaf5ff7.

📒 Files selected for processing (1)

reddit_activity_tracker/management/commands/run_reddit_activity_tracker.py

📝 Walkthrough

Collectors and commands now return protocol-conformant TrackerResult (and optional IncrementalState) objects. The core lifecycle validates the result, backfills duration, and persists incremental state. App-specific frozen DTOs are added or expanded, tests are updated for runtime conformance, and the docs and scaffold are adjusted.

Changes

Core Protocol & Framework Integration

| Layer / File(s) | Summary |
| --- | --- |
| Protocol extensions and generic type implementations `core/protocols.py`, `core/tracker_result.py`, `core/incremental_state.py`, `core/collectors/__init__.py` | `TrackerResult` gains `errors` and `duration_seconds`. Added `GenericTrackerResult` with `ok()`/`failed()` and duration helpers. `GenericIncrementalState` added and re-exported. Runtime helper `require_incremental_state()` added. |
| AbstractCollector lifecycle enforcing TrackerResult return `core/collectors/base_collector.py` | `CollectorRunnable.run()`/`AbstractCollector.run()` now return `TrackerResult`. Lifecycle loads incremental state, calls `collect()`, enforces protocol conformance, backfills duration, sets `last_result`, persists incremental out state in `post_collect`, and preserves on_error semantics. |
| BaseCollectorCommand result logging and structured output `core/collectors/command_base.py` | Added helpers to compute records_collected and log standardized "Collector finished" structured messages using `collector.last_result` when present. |
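
As a rough illustration of the incremental checkpoint hooks described in the table above: the hook names `load_incremental_state` and `persist_incremental_state` come from this PR, but their signatures, the `CheckpointState` DTO, and the dict-backed store are assumptions made for this sketch only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CheckpointState:
    """Hypothetical incremental state: an opaque cursor marking progress."""

    cursor: Optional[str] = None


class CheckpointingCollector:
    """Hypothetical collector that resumes from the last persisted cursor."""

    BATCH = 3  # illustrative batch size

    def __init__(self, store: Dict[str, str]) -> None:
        self._store = store  # stand-in for a database row or settings table
        self._state_in: Optional[CheckpointState] = None
        self._state_out: Optional[CheckpointState] = None

    def load_incremental_state(self) -> CheckpointState:
        return CheckpointState(cursor=self._store.get("cursor"))

    def persist_incremental_state(self, state: CheckpointState) -> None:
        if state.cursor is not None:
            self._store["cursor"] = state.cursor

    def collect(self) -> int:
        start = int(self._state_in.cursor or 0)
        items = list(range(start, start + self.BATCH))  # pretend fetch
        # Record where the next run should resume.
        self._state_out = CheckpointState(cursor=str(start + self.BATCH))
        return len(items)

    def run(self) -> int:
        self._state_in = self.load_incremental_state()
        self._state_out = None
        collected = self.collect()
        # Persist only after collect() finished without raising.
        if self._state_out is not None:
            self.persist_incremental_state(self._state_out)
        return collected
```

Persisting after `collect()` returns means a run that raises midway leaves the stored cursor untouched, so the next run retries the same window.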

Tracker App Protocol Implementations

| Layer / File(s) | Summary |
| --- | --- |
| Simple collectors and DTOs `boost_library_docs_tracker/protocol_impl.py`, `boost_library_docs_tracker/management/commands/run_boost_library_docs_tracker.py`, `cppa_user_tracker/management/commands/run_cppa_user_tracker.py`, `boost_usage_tracker/management/commands/run_boost_usage_tracker.py`, `cppa_youtube_script_tracker/protocol_impl.py`, `cppa_youtube_script_tracker/management/commands/run_cppa_youtube_script_tracker.py`, `boost_library_usage_dashboard/protocol_impl.py`, `boost_library_usage_dashboard/collectors.py` | Added/fixed per-app tracker DTOs and changed collect/command flows to return `TrackerResult` instances (e.g., LibraryDocsTrackerResult, YoutubeScriptTrackerResult, UsageDashboardTrackerResult). |
| Stateful collectors with incremental state `boost_library_tracker/protocol_impl.py`, `boost_library_tracker/management/commands/collect_boost_libraries.py`, `boost_mailing_list_tracker/protocol_impl.py`, `boost_mailing_list_tracker/management/commands/run_boost_mailing_list_tracker.py`, `discord_activity_tracker/protocol_impl.py`, `discord_activity_tracker/management/commands/run_discord_activity_tracker.py`, `cppa_slack_tracker/protocol_impl.py`, `cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py`, `clang_github_tracker/protocol_impl.py`, `clang_github_tracker/collectors.py` | Collectors implement `load_incremental_state()` where applicable and return concrete result/state DTOs. Command flows capture and return counts/errors/duration via factory constructors. |
| Multi-repo and sync result merging `boost_library_tracker/management/commands/run_boost_github_activity_tracker.py`, `github_activity_tracker/protocol_impl.py`, `github_activity_tracker/tests/test_protocol_impl.py`, `cppa_pinecone_sync/management/commands/run_cppa_pinecone_sync.py`, `cppa_pinecone_sync/protocol_impl.py`, `wg21_paper_tracker/collectors.py`, `wg21_paper_tracker/protocol_impl.py` | Per-repo sync dicts are normalized into TrackerResult DTOs and merged (e.g., GitHubSyncTrackerResult.merge, PineconeSyncTrackerResult.from_sync_dict). WG21 pipeline mapped to Wg21PaperTrackerResult. |
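
The per-repo merging mentioned in the last row might look something like the sketch below. `RepoSyncResult` and its `merge` classmethod are hypothetical; the actual `GitHubSyncTrackerResult.merge` signature and aggregation rules are not shown in this thread, so summing counts, concatenating errors, and and-ing `success` are assumptions.

```python
from collections import Counter
from dataclasses import dataclass
from types import MappingProxyType
from typing import Iterable, List, Mapping, Tuple


@dataclass(frozen=True)
class RepoSyncResult:
    """Hypothetical per-repo result DTO."""

    success: bool
    counts: Mapping[str, int]
    errors: Tuple[str, ...] = ()

    @classmethod
    def merge(cls, results: Iterable["RepoSyncResult"]) -> "RepoSyncResult":
        totals: Counter = Counter()
        errors: List[str] = []
        success = True
        for result in results:
            totals.update(result.counts)   # adds counts key by key
            errors.extend(result.errors)   # keeps errors in run order
            success = success and result.success
        return cls(
            success=success,
            counts=MappingProxyType(dict(totals)),  # read-only view of a copy
            errors=tuple(errors),
        )
```

With this rule, one failed repo marks the merged run as failed while still reporting the counts from the repos that succeeded.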

Test Infrastructure & Documentation

| Layer / File(s) | Summary |
| --- | --- |
| Core protocol conformance and lifecycle tests `core/tests/test_collector_protocol_conformance.py`, `core/tests/test_collectors_base.py`, `core/tests/test_protocols.py` | New/updated tests assert runtime `isinstance()` conformance to `TrackerResult` and `IncrementalState`, enforce collect() return contract, duration/backfill behavior, and logging of result fields. |
| App protocol tests and scaffold updates `core/management/commands/startcollector.py`, `boost_library_tracker/tests/test_protocol_impl.py`, `boost_mailing_list_tracker/tests/test_protocol_impl.py`, `clang_github_tracker/tests/test_protocol_impl.py`, `cppa_pinecone_sync/tests/test_protocol_impl.py`, `cppa_slack_tracker/tests/test_protocol_impl.py`, `wg21_paper_tracker/tests/test_protocol_impl.py`, `github_activity_tracker/tests/test_protocol_impl.py`, `boost_library_tracker/tests/test_collect_boost_libraries_command.py`, `discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py` | Generated collector stubs now return `GenericTrackerResult.ok()`. Each app adds protocol_impl tests ensuring factory constructors return `TrackerResult`/`IncrementalState`. Existing command tests updated for new return patterns. |
| Docs: collector contract and tutorials `core/collectors/README.md`, `docs/service_api/core_protocols.md`, `docs/Core_public_API.md`, `docs/How_to_add_a_collector.md`, `docs/Tutorial_building_a_collector.md`, `core/pyright_samples/protocol_assignment_positive.py` | Documentation and tutorials updated to require `collect()` return `TrackerResult`, show incremental-state hook, update skeletons to mention optional `protocol_impl.py`, and include `require_incremental_state` and new protocol fields. |

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

cppalliance/boost-data-collector#218: Prior run_boost_library_docs_tracker wiring to AbstractCollector; related migration history.
cppalliance/boost-data-collector#179: Documentation updates to collector onboarding; overlaps tutorial/skeleton edits.
cppalliance/boost-data-collector#202: Earlier protocol/DTO work that this PR extends.

Suggested reviewers

jonathanMLDev
snowfox1003
wpak-ai

Poem

🐰 I hopped through code with tidy paws,

made results immutable, fixed all the laws.
Counters and markers in neat little rows,
tests nod and the runtime now knows.
Hop on, reviewer — the structured trail glows.

coderabbitai

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

discord_activity_tracker/management/commands/run_discord_activity_tracker.py (1)

170-180: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

channels count is wrong in full-sync mode.

When no --channels allowlist is set, the command can process many channels but returns "channels": 0. This makes TrackerResult.counts inaccurate.

Suggested fix

 def task_discord_sync(
@@
-) -> int:
+) -> tuple[int, int]:
@@
-    processed_total = 0
+    processed_total = 0
+    processed_channels = 0
@@
             count = asyncio.run(
                 collector._persist_channel(guild_info, channel_info, messages)
             )
             processed_total += count
+            processed_channels += 1
@@
-    return processed_total
+    return processed_total, processed_channels
@@
-            messages_synced = task_discord_sync(
+            messages_synced, channels_synced = task_discord_sync(
@@
                 counts={
                     "messages": messages_synced,
-                    "channels": (
-                        len(collector.channel_ids) if collector.channel_ids else 0
-                    ),
+                    "channels": channels_synced,
                 },
             )

Also applies to: 224-277, 641-672

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@discord_activity_tracker/management/commands/run_discord_activity_tracker.py`
around lines 170 - 180, The reported bug is that the "channels" count in the
result is set to 0 when no --channels allowlist is provided (full-sync); inside
task_discord_sync you must compute and return the actual number of channels
processed rather than using len(channel_ids) which is 0 for full-sync. Fix by
deriving the count from the actual channels iterated/processed (e.g. use the
collector's resolved channel list or the processed_channels collection produced
by collector.run) and assign that value to the "channels" field returned in
TrackerResult.counts; update the same logic in the other affected blocks (the
other task_discord_sync usages at the noted ranges) so they all use the real
processed channel list instead of channel_ids.

discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py (1)

136-150: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Propagate per-file failures into TrackerResult instead of always returning success.

collect() catches file-level exceptions, but the final DTO still reports success=True with no errors. That misreports failed runs as successful and weakens the new structured outcome contract.

💡 Suggested fix

@@
-        processed_total = 0
+        processed_total = 0
+        failures: list[str] = []
@@
             except Exception as exc:
                 rel = _json_display_path(import_dir, json_path)
                 logger.error("Failed to process %s: %s", rel, exc)
                 self.stdout.write(self.style.ERROR(f"    Failed {rel}: {exc}"))
+                failures.append(f"{rel}: {exc}")
@@
-        return DiscordCollectionTrackerResult(
-            success=True,
-            counts={"messages": processed_total, "files": len(json_files)},
-        )
+        return DiscordCollectionTrackerResult(
+            success=not failures,
+            counts={
+                "messages": processed_total,
+                "files": len(json_files),
+                "failed_files": len(failures),
+            },
+            errors=tuple(failures),
+        )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py`
around lines 136 - 150, The current collect() implementation logs per-file
exceptions but always returns DiscordCollectionTrackerResult(success=True,
counts=...), which hides failures; modify collect() to accumulate per-file
errors (e.g., create an errors = [] list and append a structured entry
containing rel (from _json_display_path(import_dir, json_path)) and exc inside
the except block where logger.error and stdout.write are called), then set
success = len(errors) == 0 and include errors in the returned
DiscordCollectionTrackerResult (e.g.,
DiscordCollectionTrackerResult(success=success, counts={"messages":
processed_total, "files": len(json_files)}, errors=errors)); keep existing
logging/stdout behavior but ensure the DTO reflects any failures.

🧹 Nitpick comments (1)

cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py (1)

219-234: 💤 Low value

Missing error count tracking for channel memberships.

sync_channel_users returns (success_count, error_count) (line 223), but unlike _sync_users, _sync_channels, and _sync_messages, the error_count is not accumulated into self._counts. This results in inconsistent error tracking in the final SlackTrackerResult.

Suggested fix

         self._counts["channel_memberships"] = (
             self._counts.get("channel_memberships", 0) + success_count
         )
+        self._counts["channel_membership_errors"] = (
+            self._counts.get("channel_membership_errors", 0) + error_count
+        )
         logger.info(

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py` around
lines 219 - 234, The _sync_channel_users method does not accumulate the
error_count returned by sync_channel_users into self._counts; update
_sync_channel_users to increment self._counts["channel_membership_errors"] (or
follow the existing error key pattern used elsewhere) by error_count after the
call, mirroring how success_count is added to
self._counts["channel_memberships"], so the final SlackTrackerResult includes
channel membership errors.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@boost_library_docs_tracker/protocol_impl.py`:
- Around line 13-16: LibraryDocsTrackerResult is frozen but its counts field is
a plain dict created in from_run, allowing mutation; update the from_run
constructor to defensively copy counts and wrap them in an immutable mapping
(e.g., types.MappingProxyType(dict(copy)) or an equivalent immutable/frozendict)
before assigning to the counts field so the frozen dataclass is deep-immutable
for that attribute, and keep the declared type Mapping[str,int] unchanged;
reference LibraryDocsTrackerResult and its from_run factory to locate and modify
the creation/assignment of counts.

In `@boost_library_tracker/protocol_impl.py`:
- Around line 13-17: CollectBoostLibrariesResult is frozen but still accepts
mutable dicts for counts via from_totals() and empty(), allowing external
mutation; fix by making counts truly immutable by wrapping any dict passed into
counts with types.MappingProxyType. Update from_totals() and empty() to import
types.MappingProxyType and construct counts as MappingProxyType(totals_dict) (or
MappingProxyType({}) for empty) so the frozen dataclass cannot be mutated at
runtime; adjust any other places that construct CollectBoostLibrariesResult to
perform the same wrapping.

In `@core/collectors/base_collector.py`:
- Around line 94-97: The code currently trusts the value in
self._incremental_state_out and calls self.persist_incremental_state(state_out)
without validation; update the lifecycle boundary after load_incremental_state
(the place where _incremental_state_out is read) to validate the
incremental-state shape/type before persisting: call the same validation logic
you use for TrackerResult (or implement a small validator) to ensure required
keys/types/serializability, and if validation fails either raise a clear error
or log and skip persisting. Specifically modify the block reading
self._incremental_state_out (and the analogous spot around line 237) to run
validate_incremental_state(state_out) (or inline checks) and only call
persist_incremental_state(state_out) when valid, otherwise handle the error path
consistently.
- Around line 34-36: The run method on CollectorRunnable currently has a
nullable return (def run(self) -> TrackerResult | None) which allows
implementations to return None and undermines the new contract; change the
signature to def run(self) -> TrackerResult (remove | None) in base_collector.py
(CollectorRunnable / AbstractCollector), update any subclass implementations of
CollectorRunnable.run (and helper functions that construct or call run) to
always return a TrackerResult instead of None, and update import/type hints
accordingly so callers and implementations type-check against the non-nullable
TrackerResult contract.

In `@core/incremental_state.py`:
- Around line 9-15: GenericIncrementalState currently accepts a plain dict via
the field(default_factory=dict) so extras remains mutable; change it to an
immutable mapping by wrapping the provided mapping in types.MappingProxyType in
a __post_init__ (use object.__setattr__ because the dataclass is frozen).
Specifically, add import MappingProxyType, implement
GenericIncrementalState.__post_init__ that replaces self.extras with
MappingProxyType(dict(self.extras)) (or MappingProxyType(self.extras) after
copying) to ensure extras is read-only after construction and preserve the
frozen DTO contract.
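
A minimal sketch of the pattern this comment asks for, using a hypothetical `FrozenState` class rather than the real `GenericIncrementalState`: a frozen dataclass copies the incoming mapping and wraps it in `types.MappingProxyType` inside `__post_init__`.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Any, Mapping


@dataclass(frozen=True)
class FrozenState:
    """Hypothetical frozen DTO whose mapping field is read-only after init."""

    extras: Mapping[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # frozen=True blocks normal assignment, so use object.__setattr__.
        # dict(...) makes a defensive copy; the proxy makes it read-only.
        object.__setattr__(self, "extras", MappingProxyType(dict(self.extras)))
```

The `dict(...)` copy matters as much as the proxy: a `MappingProxyType` is only a read-only view, so without the copy a caller who kept the original dict could still change what the DTO reports.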

In `@core/tests/test_collector_protocol_conformance.py`:
- Around line 34-37: The test module unconditionally imports
Wg21ReflectorIncrementalState and Wg21ReflectorTrackerResult causing pytest
collection to fail when wg21_reflector_collector is absent; wrap the import in a
try/except ImportError and set a boolean flag (e.g. HAS_WG21_REFLECTOR) so the
rest of the file still loads, then make the reflector-specific parametrized
cases conditional (either skip them with pytest.mark.skipif(not
HAS_WG21_REFLECTOR) or only append those params when HAS_WG21_REFLECTOR is True)
so references to Wg21ReflectorIncrementalState and Wg21ReflectorTrackerResult
are only used when the package is available.
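
One way to express the guarded import described above, sketched without any test-framework dependency. The helper `optional_import`, the module path `wg21_reflector_collector.protocol_impl`, and the `RESULT_CLASS_NAMES` list are all assumptions for illustration; only the `HAS_WG21_REFLECTOR` flag name comes from the comment.

```python
import importlib
from types import ModuleType
from typing import List, Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Return the module if it can be imported, otherwise None."""
    try:
        return importlib.import_module(name)
    except ImportError:  # also covers ModuleNotFoundError
        return None


# Assumed module path for the overlay package; adjust to the real layout.
_reflector = optional_import("wg21_reflector_collector.protocol_impl")
HAS_WG21_REFLECTOR = _reflector is not None

# Build the parametrization list so the overlay cases are added only
# when the package is actually installed.
RESULT_CLASS_NAMES: List[str] = ["GenericTrackerResult"]
if HAS_WG21_REFLECTOR:
    RESULT_CLASS_NAMES.append("Wg21ReflectorTrackerResult")
```

In a pytest module the same flag could drive `pytest.mark.skipif(not HAS_WG21_REFLECTOR, ...)` on the reflector-specific cases, as the comment suggests.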

In `@core/tracker_result.py`:
- Around line 12-19: GenericTrackerResult is frozen but its counts field can
still hold a mutable Mapping; make counts truly immutable by converting/copying
it into an immutable mapping in the dataclass post-init. Add a __post_init__ on
GenericTrackerResult that uses types.MappingProxyType(dict(self.counts)) (or
similar defensive copy-to-immutable) and assign it with object.__setattr__(self,
"counts", <mapping-proxy>) so callers passing mutable dicts cannot mutate the
stored counts; ensure you import types.MappingProxyType (or chosen immutable
wrapper). This change preserves the frozen dataclass semantics and fixes the
mutability for GenericTrackerResult.counts (also apply same pattern where the
same pattern is used).
- Around line 46-47: with_duration_if_missing currently unconditionally calls
dataclasses.replace(result, duration_seconds=...), which fails for dataclass
fields that are not init=True or are property-backed; update the function to
first inspect dataclasses.fields(result) for a field named "duration_seconds"
and only call replace if that field exists and field.init is True, otherwise
attempt to set the attribute directly via setattr(result, "duration_seconds",
duration_seconds) if the instance is mutable (i.e., not frozen), and if neither
approach works return the original result unchanged; reference the function
with_duration_if_missing, the variable result, and
dataclasses.replace/dataclasses.fields in your change.
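
The fallback order described for `with_duration_if_missing` could be sketched like this. It is an illustration of the suggestion, not the function in `core/tracker_result.py`; in particular, returning early when a duration is already set is inferred from the function name.

```python
import dataclasses
from typing import Any


def with_duration_if_missing(result: Any, duration_seconds: float) -> Any:
    """Backfill duration_seconds without assuming how the result stores it."""
    if getattr(result, "duration_seconds", None) is not None:
        return result  # the collector already reported a duration

    # 1) Dataclass instance with an init=True field: build a copy via replace().
    if dataclasses.is_dataclass(result) and not isinstance(result, type):
        for f in dataclasses.fields(result):
            if f.name == "duration_seconds" and f.init:
                return dataclasses.replace(
                    result, duration_seconds=duration_seconds
                )

    # 2) Mutable object: set the attribute in place.
    try:
        setattr(result, "duration_seconds", duration_seconds)
    except (AttributeError, TypeError):
        # 3) Frozen instance or read-only property: leave the result unchanged.
        # dataclasses.FrozenInstanceError is a subclass of AttributeError.
        pass
    return result
```

Checking `field.init` first avoids the `ValueError` that `dataclasses.replace()` raises when asked to set a field declared with `init=False`.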

In `@cppa_pinecone_sync/management/commands/run_cppa_pinecone_sync.py`:
- Around line 89-93: The completion log is using the wrong metric key: the
logging call that formats "CPPA Pinecone Sync completed: upserted=%s, total=%s,
failed_count=%s" is reading result.counts.get("errors", 0) instead of the
intended failed_count metric; update the logging call to use
result.counts.get("failed_count", 0) (locate the logger call that references
result.counts and change the key), ensuring the upserted/total/failed_count
values reflect the correct keys.

In `@cppa_slack_tracker/protocol_impl.py`:
- Around line 13-16: SlackTrackerResult.counts and SlackIncrementalState.extras
are currently mutable dicts despite dataclass(frozen=True); update construction
sites (SlackTrackerResult.from_counts, SlackTrackerResult.dry_run,
SlackIncrementalState.from_team and any default_factory=dict) to wrap/replace
dicts with an immutable mapping (e.g., types.MappingProxyType or a
frozendict-equivalent) before assigning so their contents cannot be mutated
post-instantiation; ensure the dataclass fields remain typed as Mapping[str,int]
/ Mapping[str,Any] and replace any default_factory=dict with an immutable
default (or None plus a wrapper) to preserve frozen semantics.

In `@cppa_youtube_script_tracker/protocol_impl.py`:
- Around line 13-17: YoutubeScriptTrackerResult is frozen but its counts field
can be a mutable dict; wrap counts in an immutable mapping to prevent mutation
by callers. In the YoutubeScriptTrackerResult dataclass add a __post_init__ that
imports types.MappingProxyType and uses object.__setattr__(self, "counts",
MappingProxyType(dict(self.counts))) (or MappingProxyType(self.counts) after
copying) so counts becomes an immutable mapping; reference the
YoutubeScriptTrackerResult class and the counts attribute when making this
change.

In `@wg21_paper_tracker/protocol_impl.py`:
- Around line 16-19: Wg21PaperTrackerResult's counts is currently a mutable dict
allowing post-creation mutation; wrap all dict literals assigned to counts in an
immutable mapping (e.g., types.MappingProxyType) and ensure the default factory
produces an immutable mapping as well. Concretely, import types and replace any
assignments in from_pipeline() and dry_run() like {"new_papers": n} or {} with
types.MappingProxyType({"new_papers": n}) / types.MappingProxyType({}), and
change the dataclass field default_factory for counts to a lambda returning
types.MappingProxyType({}) so counts is deeply immutable after construction.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f9f2033b-5987-49ac-85ed-8f3fcb6b3274

📥 Commits

Reviewing files that changed from the base of the PR and between 8b4cba2 and 17590e0.

📒 Files selected for processing (50)

boost_library_docs_tracker/management/commands/run_boost_library_docs_tracker.py
boost_library_docs_tracker/protocol_impl.py
boost_library_docs_tracker/tests/test_run_boost_library_docs_tracker_command.py
boost_library_tracker/management/commands/collect_boost_libraries.py
boost_library_tracker/management/commands/run_boost_github_activity_tracker.py
boost_library_tracker/protocol_impl.py
boost_library_tracker/tests/test_collect_boost_libraries_command.py
boost_library_tracker/tests/test_protocol_impl.py
boost_library_usage_dashboard/collectors.py
boost_library_usage_dashboard/protocol_impl.py
boost_mailing_list_tracker/management/commands/run_boost_mailing_list_tracker.py
boost_mailing_list_tracker/protocol_impl.py
boost_mailing_list_tracker/tests/test_protocol_impl.py
boost_usage_tracker/management/commands/run_boost_usage_tracker.py
clang_github_tracker/collectors.py
clang_github_tracker/protocol_impl.py
clang_github_tracker/tests/test_protocol_impl.py
core/collectors/README.md
core/collectors/__init__.py
core/collectors/base_collector.py
core/collectors/command_base.py
core/incremental_state.py
core/management/commands/startcollector.py
core/protocols.py
core/pyright_samples/protocol_assignment_positive.py
core/tests/test_collector_protocol_conformance.py
core/tests/test_collectors_base.py
core/tests/test_protocols.py
core/tracker_result.py
cppa_pinecone_sync/management/commands/run_cppa_pinecone_sync.py
cppa_pinecone_sync/protocol_impl.py
cppa_pinecone_sync/tests/test_protocol_impl.py
cppa_slack_tracker/management/commands/run_cppa_slack_tracker.py
cppa_slack_tracker/protocol_impl.py
cppa_slack_tracker/tests/test_protocol_impl.py
cppa_user_tracker/management/commands/run_cppa_user_tracker.py
cppa_youtube_script_tracker/management/commands/run_cppa_youtube_script_tracker.py
cppa_youtube_script_tracker/protocol_impl.py
discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py
discord_activity_tracker/management/commands/run_discord_activity_tracker.py
discord_activity_tracker/protocol_impl.py
docs/Core_public_API.md
docs/How_to_add_a_collector.md
docs/Tutorial_building_a_collector.md
docs/service_api/core_protocols.md
github_activity_tracker/protocol_impl.py
github_activity_tracker/tests/test_protocol_impl.py
wg21_paper_tracker/collectors.py
wg21_paper_tracker/protocol_impl.py
wg21_paper_tracker/tests/test_protocol_impl.py

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

core/collectors/base_collector.py (1)

250-253: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Defer committing last_result until the run fully succeeds.

If post_collect() or persist_incremental_state() raises, run() fails but last_result has already been updated to the new result. That contradicts the property docstring and can expose a failed run as the “most recent successful” one.

💡 Minimal fix

 def run(self) -> TrackerResult:
+    previous_result = getattr(self, "_last_result", None)
     self._incremental_state_in = None
     self._incremental_state_out = None
     started = time.monotonic()
     try:
         self.pre_collect()
@@
         raw_result = self.collect()
         result = require_tracker_result(raw_result)
         elapsed = time.monotonic() - started
         result = with_duration_if_missing(result, elapsed)
         self._last_result = result
         self.post_collect()
         return result
     except Exception as exc:
+        self._last_result = previous_result
         try:
             self.on_error(exc)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/collectors/base_collector.py` around lines 250 - 253, The code currently
assigns self._last_result before calling post_collect() (and before any
persist_incremental_state() inside run()), which can mark a failed run as “most
recent successful”; change the sequence in run()/collect flow so that result =
with_duration_if_missing(...) is computed but self._last_result is only updated
after post_collect() and after any call to persist_incremental_state() completes
without raising—i.e., move the assignment to self._last_result to the end of the
successful run path (after post_collect() and persistence), leaving
elapsed/with_duration_if_missing intact.
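
The reordering described above can be illustrated with a stripped-down lifecycle. `DeferredCommitCollector` is a hypothetical stand-in for `AbstractCollector`, and plain dicts stand in for `TrackerResult`; the point is only where the `_last_result` assignment sits.

```python
from typing import Any, Optional


class DeferredCommitCollector:
    """Hypothetical lifecycle: last_result changes only after full success."""

    def __init__(self) -> None:
        self._last_result: Optional[Any] = None

    @property
    def last_result(self) -> Optional[Any]:
        """Result of the most recent run that completed without raising."""
        return self._last_result

    def collect(self) -> Any:
        return {"success": True}

    def post_collect(self) -> None:
        """Hook where incremental state would be persisted."""

    def on_error(self, exc: Exception) -> None:
        """Hook for error reporting."""

    def run(self) -> Any:
        try:
            result = self.collect()
            self.post_collect()  # may raise, e.g. while persisting state
        except Exception as exc:
            self.on_error(exc)
            raise
        # Commit only after every step above has succeeded.
        self._last_result = result
        return result
```

Compared with saving and restoring a `previous_result` in the `except` branch, moving the assignment to the end of the success path leaves no window in which `last_result` points at a run that later fails.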

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 684e1b60-7f1b-48e4-83ee-ad771c0beeb9

📥 Commits

Reviewing files that changed from the base of the PR and between efbbfbb and dc1233f.

📒 Files selected for processing (12)

boost_library_docs_tracker/protocol_impl.py
boost_library_tracker/protocol_impl.py
core/collectors/base_collector.py
core/collectors/command_base.py
core/incremental_state.py
core/tracker_result.py
cppa_pinecone_sync/management/commands/run_cppa_pinecone_sync.py
cppa_pinecone_sync/protocol_impl.py
cppa_pinecone_sync/tests/test_protocol_impl.py
cppa_slack_tracker/protocol_impl.py
cppa_youtube_script_tracker/protocol_impl.py
wg21_paper_tracker/protocol_impl.py

🚧 Files skipped from review as they are similar to previous changes (11)

cppa_youtube_script_tracker/protocol_impl.py
cppa_slack_tracker/protocol_impl.py
core/incremental_state.py
wg21_paper_tracker/protocol_impl.py
boost_library_tracker/protocol_impl.py
core/tracker_result.py
core/collectors/command_base.py
boost_library_docs_tracker/protocol_impl.py
cppa_pinecone_sync/tests/test_protocol_impl.py
cppa_pinecone_sync/management/commands/run_cppa_pinecone_sync.py
cppa_pinecone_sync/protocol_impl.py

jonathanMLDev

CHANGELOG.md and STABILITY.md should also be updated.

…racker-result-protocol

feat(protocols): introduce LibraryDocsTrackerResult and CollectBoostLibrariesResult DTOs

17590e0

leostar0412 requested review from jonathanMLDev, snowfox1003 and wpak-ai as code owners June 9, 2026 19:06

refactor(tests): remove unused Wg21Reflector classes from test_collector_protocol_conformance.py

efbbfbb

coderabbitai Bot reviewed Jun 9, 2026

feat(protocols): add MappingProxyType for counts and extras in various tracker result classes

dc1233f

leostar0412 self-assigned this Jun 9, 2026

coderabbitai Bot reviewed Jun 9, 2026

jonathanMLDev reviewed Jun 10, 2026

Comment thread discord_activity_tracker/management/commands/backfill_discord_activity_tracker.py

Comment thread core/collectors/base_collector.py

leostar0412 added 3 commits June 10, 2026 11:50

feat(collectors): enhance collector behavior and error reporting

44cfb18

Merge remote-tracking branch 'upstream/develop' into feat/collector-tracker-result-protocol

d7cccec

feat(collectors): update collect method to return TrackerResult

aaf5ff7

leostar0412 requested a review from jonathanMLDev June 10, 2026 19:06

jonathanMLDev approved these changes Jun 10, 2026

wpak-ai approved these changes Jun 10, 2026

wpak-ai merged commit e0e4dac into cppalliance:develop Jun 10, 2026
6 checks passed

coderabbitai Bot mentioned this pull request Jun 11, 2026

feat(protocol_dto): add canonical serialization for protocol DTOs #281

Merged

8 tasks

feat(core): require TrackerResult from collectors and migrate in-repo trackers #279
wpak-ai merged 6 commits into cppalliance:develop from leostar0412:feat/collector-tracker-result-protocol
