feat: implement indirect prompt injection protection and expanded sec…#3

Merged
sdwolf4103 merged 1 commit into sdwolf4103:main from StevenChoo:feat/security-hardening
Apr 27, 2026

Conversation

@StevenChoo
Contributor

…ret redaction

@sdwolf4103
Owner

Thanks for the contribution! This is a useful security hardening direction.

I’m going to merge your branch locally into a review branch first so your author credit is preserved, then run the full test suite and add any small follow-up fixes if needed before merging to main.

@sdwolf4103 merged commit e071095 into sdwolf4103:main on Apr 27, 2026
sdwolf4103 added a commit that referenced this pull request May 2, 2026
…nce, and validation baseline

Wave 1 — Compaction prompt improvement:
- Add three wording-reuse bullets to buildCompactionPrompt() under
  CRITICAL MEMORY RULES: do not create rephrased duplicates, reuse
  existing wording exactly when re-emitting, only emit new memories
  when the fact is new, materially corrected, or more specific.
- This attacks the root cause of zero reinforcement: compaction
  generating variant text for the same durable fact.
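
A minimal sketch of how the three wording-reuse rules might be rendered into the prompt. The real buildCompactionPrompt() signature and prompt text are not shown in this thread, so buildCompactionPromptSketch, WORDING_REUSE_RULES, and the section headings below are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch: the real buildCompactionPrompt() signature and
// prompt text are not shown in this thread.
const WORDING_REUSE_RULES: readonly string[] = [
  "Do not create rephrased duplicates of existing memories.",
  "When re-emitting an existing memory, reuse its wording exactly.",
  "Only emit a new memory when the fact is new, materially corrected, or more specific.",
];

// Assumed shape: takes the existing memory texts and returns the prompt string.
function buildCompactionPromptSketch(existingMemories: readonly string[]): string {
  const lines: string[] = ["CRITICAL MEMORY RULES:"];
  for (const rule of WORDING_REUSE_RULES) {
    lines.push(`- ${rule}`);
  }
  // Showing the existing wording gives the model something to reuse verbatim.
  lines.push("", "EXISTING MEMORIES (reuse verbatim when still accurate):");
  for (const memory of existingMemories) {
    lines.push(`- ${memory}`);
  }
  return lines.join("\n");
}
```

Listing the existing memories next to the rules is a design assumption here: exact-match dedupe can only count a reinforcement if the model can see, and copy, the stored wording.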

Wave 2 — Bug fixes:
- Bug #2: Add placeholder comment to superseded_existing branch in
  decision dedupe (unreachable until v1.5.4 numbered refs). Preserve
  the "as const" type assertions.
- Bug #3: Add memory_migration_superseded evidence event type. Both
  P0 and quality cleanup migrations now produce evidence events for
  superseded entries. loadWorkspaceMemory appends migration evidence
  on first-load migrations only (idempotent via migration IDs). No
  historical backfill.
- Bug #4: Add documentation comment explaining that feedback identity
  key returns exact key (absorbed_identity currently impossible for
  feedback). Add test verifying this behavior.
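
The idempotency described for Bug #3 could be sketched as follows. Only the event type name memory_migration_superseded comes from the commit message; the event fields, the appendMigrationEvidence helper, and the in-memory log are assumptions, not the project's actual loadWorkspaceMemory code.

```typescript
// Hypothetical event shape; only the type name appears in the commit message.
interface MigrationEvidenceEvent {
  type: "memory_migration_superseded";
  migrationId: string;
  memoryId: string;
}

// Appends one evidence event per superseded memory, but only the first time
// a given migration ID is seen. Returns the number of events appended.
function appendMigrationEvidence(
  log: MigrationEvidenceEvent[],
  migrationId: string,
  supersededIds: readonly string[],
): number {
  const alreadyRecorded = log.some((event) => event.migrationId === migrationId);
  if (alreadyRecorded) {
    return 0; // Later loads are no-ops, so evidence is never duplicated.
  }
  for (const memoryId of supersededIds) {
    log.push({ type: "memory_migration_superseded", migrationId, memoryId });
  }
  return supersededIds.length;
}
```

Keying the check on the migration ID rather than on individual memory IDs is one way to get "first-load only" behavior without any historical backfill.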

Wave 3 — Validation baseline script:
- Add scripts/dev/validate-identity-keys.ts: read-only script that
  scans workspace memory stores, computes exact/identity key
  collisions, and reports reinforcement statistics. Baseline matches
  audit: 0 exact collisions, 0 identity collisions, 0 reinforcement
  events across 123 active memories.
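
A rough sketch of the kind of read-only scan such a script might perform. The contents of validate-identity-keys.ts are not shown here, so the MemoryEntry shape and both key functions are assumptions: exactKey uses the raw text and identityKey uses a lowercased, punctuation-stripped form purely for illustration.

```typescript
// Hypothetical memory shape; the real store format is not shown in this thread.
interface MemoryEntry {
  id: string;
  text: string;
  reinforcementCount: number;
}

// Assumed key functions, for illustration only.
const exactKey = (m: MemoryEntry): string => m.text;
const identityKey = (m: MemoryEntry): string =>
  m.text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();

// Counts entries that share a key with an earlier entry (group size minus one).
function countCollisions(
  memories: readonly MemoryEntry[],
  keyOf: (m: MemoryEntry) => string,
): number {
  const groupSizes = new Map<string, number>();
  for (const memory of memories) {
    const key = keyOf(memory);
    groupSizes.set(key, (groupSizes.get(key) ?? 0) + 1);
  }
  let collisions = 0;
  groupSizes.forEach((size) => {
    if (size > 1) collisions += size - 1;
  });
  return collisions;
}

// Read-only report: nothing in the input is mutated.
function buildBaselineReport(memories: readonly MemoryEntry[]) {
  return {
    activeMemories: memories.length,
    exactCollisions: countCollisions(memories, exactKey),
    identityCollisions: countCollisions(memories, identityKey),
    reinforcementEvents: memories.reduce((sum, m) => sum + m.reinforcementCount, 0),
  };
}
```

Comparing reinforcementEvents before and after the prompt change is the measurement the identity-extension decision is gated on.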

Identity extension is gated on measurement: if the prompt change
produces measurable reinforcement (reinforcementCount > 0), identity
extension may be unnecessary. Decision dedupe stays exact-only
(Wave 4 deferred).
sdwolf4103 added a commit that referenced this pull request May 8, 2026