fix(prospecting-overview): propose next steps only when useful, memory-aware, via native widget#105
Conversation
The prospecting-overview prompt orients the user but never explicitly required a next-step proposal, so an overview could end on bare status — leaving the user stranded (WF7 contract requires 'proposed a concrete next step'). Add an explicit instruction to close with a concrete next step matched to the user's state (discovery batch / follow-ups due / unblock quota+auth). Verified live: 5/5/5/5 across 3 consecutive eval runs (vs IA:3 without the instruction); next step proposed every round, no fabrication on the live AUTH_EXPIRED quota state. Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Auto self-review (
|
milstan
left a comment
There was a problem hiding this comment.
I am not sure we want to force the agent to always do it - sometimes it does not make sens at all, and we're forcing it to pick next steps to show.
I think we may want to be more subtle here - leverage our concent of memory - to remember if the user is dismissing next step proposals or accepting them (if always dismissing, then not propose)
Also we may want to actually teach the agent to decide when next steps make sens, and when they are just "why not" - I am personally very annoyed that ChatGPT always proposes next steps, even when clearly the work is done.
On the other hand I think we want, when next steps are proposed, to make sure they are presented in the form of a native UI choice dialog.
…y-aware, via native widget Address Milan's review: don't force a next step on every overview (the ChatGPT "why not" annoyance). Teach the agent to decide WHEN a next step is genuinely useful vs. reflexive filler, lean on agent memory to suppress proposals for users who routinely dismiss them, and render any proposal through the native choice dialog (ask_user_input_v0 / AskUserQuestion) rather than prose. Soften the WF7 contract criterion to match — proposing none is acceptable when the status read is a complete answer. Co-Authored-By: Claude <noreply@anthropic.com>
…ippet references The ask-user-input-routing snippet ends with "pick rows from the (Observation, Suggest, Calls) table below" and "call the matching Calls tool" — but the overview had no such table, so in the no-next_steps case (e.g. leadbay_account_status) the required widget path was under-specified and the agent could hallucinate options or skip the proposal. Add an overview-specific table (discovery batch / follow-ups / quota / auth blocker / lens mismatch / healthy → propose none) so the reference resolves deterministically. Addresses review P2. Co-Authored-By: Claude <noreply@anthropic.com>
…ate audience adjust Two user-visible bugs in the added NEXT STEPS table (review P2 ×2): - leadbay_daily_check_in / leadbay_followup_check_in are MCP PROMPTS, not agent-callable tools; a host that can't invoke a prompt from a turn would stall or call a non-existent tool. Route to the underlying tools instead (leadbay_pull_leads / leadbay_pull_followups), with a note that the Calls column must always be a callable leadbay_* tool, never a prompt name. - The "Adjust the lens audience" option called leadbay_adjust_audience directly, but the option carries no sectors/sizes/exclusions and the tool has no required inputs — an empty call writes the current filter / may clone the default lens (no-op or unwanted change). Gate it: ASK for the ICP details first, then call with those params; never call with no args. Co-Authored-By: Claude <noreply@anthropic.com>
|
why does it say "The live eval account's quota endpoint returns AUTH_EXPIRED (401)"? Do we need to infer that in fact the results it obtained were considered passing becuase it could not uthenticate so it just thaught it's OK? |
To verify, would require #109 to be merged |
What
leadbay_prospecting_overviewwas changed (in the first cut of this PR) to always end on a concrete next step. Milan pushed back: forcing a next step on every turn is the ChatGPT "why not?" annoyance — sometimes the status read is the complete answer and a manufactured next step erodes trust.This revision reworks the instruction to address all three of Milan's points.
Fix
Replaced the unconditional "always propose" rule with a three-part rule in the prompt template:
_meta.agent_memory.summaryfor how this user reacts to next-step offers; default to not proposing if they routinely dismiss them, and capture the dismiss/accept signal vialeadbay_agent_memory_captureso the preference compounds across sessions.ask_user_input_v0/AskUserQuestion). Pulled in thenext-steps/ask-user-input-routingsnippet.WORKFLOWS.mdWF7 success criterion softened to match: proposing none is acceptable when the status read is complete; if proposed, it must be a concrete widget-routed action, not reflexive filler.Files:
leadbay_prospecting_overview.md.tmpl(source) + regeneratedprompts.generated.tsand the pluginSKILL.md;WORKFLOWS.mdcontract.Eval proof (live
/eval --workflow 7)Workflow: 7 — Prospecting overview (
leadbay_prospecting_overview), single-turn, routing prompt body injected. Live Leadbay API, 3 consecutive runs.ask_user_input_v0widget, 4 concrete optionsAUTH_EXPIRED— reported honestly, no fabricationask_user_input_v0widget, 4 concrete optionsAUTH_EXPIRED— reported honestly, no fabricationask_user_input_v0widget, 4 concrete optionsAUTH_EXPIRED— reported honestly, no fabricationleadbay_account_statuscalled;leadbay_report_outreach/ mutating tools absent).leadbay_pull_leadscall the agent made to enrich the overview — sensible, not a contract violation. Runs 2 and 3 judged the same call 5/5/5/5. Not a clean 3×5/5/5/5 and I'm not claiming one.ask_user_input_v0native widget (e.g.[Refine the lens, Check audience filters, Reconnect first, Triage board]), not prose.Not covered / caveats
AUTH_EXPIRED (401), so the agent's honest handling of a failed quota read is what's exercised — quota-figure rendering against real numbers was not tested (same caveat as the prior revision).