Skip to content

fix(authz): symmetric SpiceDB cleanup on policy delete#1624

Merged
whoAbhishekSah merged 1 commit into
mainfrom
fix/policy-delete-symmetric-rolebinding-cleanup
May 20, 2026
Merged

fix(authz): symmetric SpiceDB cleanup on policy delete#1624
whoAbhishekSah merged 1 commit into
mainfrom
fix/policy-delete-symmetric-rolebinding-cleanup

Conversation

@whoAbhishekSah
Copy link
Copy Markdown
Member

@whoAbhishekSah whoAbhishekSah commented May 19, 2026

Problem

policy.AssignRole writes three SpiceDB tuples per policy:

T1: rolebinding:<polID>#role_bearer          @ <principal>
T2: rolebinding:<polID>#role                 @ role:<roleID>
T3: <resource>#<grant_relation>              @ rolebinding:<polID>

policy.Delete (and DeleteWithMinRoleGuard) only swept by Object on the rolebinding, catching T1+T2. T3 has the rolebinding as Subject, so it leaked on every per-member removal, ownership swap, and cross-resource principal cleanup.

Full-resource teardown (group.DeleteModel, project.DeleteModel) papered over it with a wildcard sweep on the resource — but only for T3 tied to the deleted resource. Per-member removals leaked forever.

Fix

Both delete paths now route through a deleteRoleBindingRelations helper that issues two relationService.Delete calls: one by Object (T1, T2), one by Subject (T3). The Subject-side delete tolerates relation.ErrNotExist so resource-level sweeps that already removed T3 don't fail the policy delete.

Test plan

  • core/policy unit tests cover happy path, ErrNotExist tolerance, non-ErrNotExist failure semantics for both Delete and DeleteWithMinRoleGuard.
  • Caller-side tests pass: organization, group, project, membership, userpat, deleter, v1beta1connect policy handler.

Follow-ups

Closes #1617.

🤖 Generated with Claude Code

policy.AssignRole writes three tuples per policy but policy.Delete and
DeleteWithMinRoleGuard only swept the two where the rolebinding is the
Object (role_bearer, role). The resource-grant tuple — where the
rolebinding is the Subject — leaked on every per-member removal,
ownership swap, or cross-resource principal cleanup.

Add a second relation.Delete keyed by Subject on both delete paths via
a shared helper. The Subject-side delete tolerates relation.ErrNotExist
so resource-level wildcard sweeps in group.DeleteModel /
project.DeleteModel (which currently paper over the bug at full-resource
teardown) don't cause failures during the overlap period; those sweeps
become redundant for this tuple and can be removed in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel Bot commented May 19, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
frontier Ready Ready Preview, Comment May 19, 2026 9:50am

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 19, 2026

Review Change Stack

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Improved policy deletion process to ensure complete cleanup of all role-binding relationships, gracefully handling scenarios where concurrent deletion operations may have already removed tuples.
  • Tests

    • Expanded test coverage for policy deletion functionality with new scenarios validating behavior during concurrent operations and edge cases.

Walkthrough

The PR refactors role-binding cleanup in the policy service by extracting dual-tuple deletion logic into a helper function. Service.Delete now delegates to deleteRoleBindingRelations, which removes both the object-side and subject-side SpiceDB tuples and tolerates concurrent deletions via errors.Is(relation.ErrNotExist). Tests validate both successful deletion paths and error scenarios.

Changes

Role-Binding Cleanup Refactoring

Layer / File(s) Summary
Dual-tuple deletion with error tolerance
core/policy/service.go
Service.Delete calls new deleteRoleBindingRelations helper that removes both rolebinding-as-Object and resource-grant tuples sequentially. The subject-side delete tolerates relation.ErrNotExist via errors.Is to handle concurrent/sweeping resource deletions, while other failures are propagated. errors package is imported to support conditional error handling.
Test coverage for deletion behavior
core/policy/service_test.go
TestService_Delete cases validate that both relation tuples are deleted before repository delete and that non-ErrNotExist failures block the repository operation. New TestService_DeleteWithMinRoleGuard test function covers three scenarios: successful deletion with post-delete relation deletes, tolerated ErrNotExist during relation delete, and repository guard failure that skips relation deletes and returns an error.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Suggested reviewers

  • rohilsurana
  • AmanGIT07
🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/policy/service_test.go (1)

33-51: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Test names claim call ordering that isn’t asserted.

Line 33 and Line 133 describe strict sequencing, but these cases currently only verify invocation, not order. Please enforce order explicitly so these tests fail on regressions in delete sequencing.

Also applies to: 133-152

🧹 Nitpick comments (1)
core/policy/service_test.go (1)

155-188: ⚡ Quick win

Add the guarded-path non-ErrNotExist subject-delete failure case.

TestService_DeleteWithMinRoleGuard should also cover subject-side delete returning a non-ErrNotExist error, to lock in the helper’s error semantics on this path.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2049107e-41d1-40af-8ebc-20092ea7982c

📥 Commits

Reviewing files that changed from the base of the PR and between 996912f and 2c244b8.

📒 Files selected for processing (2)
  • core/policy/service.go
  • core/policy/service_test.go

@coveralls
Copy link
Copy Markdown

Coverage Report for CI Build 26089607399

Coverage increased (+0.04%) to 42.379%

Details

  • Coverage increased (+0.04%) from the base build.
  • Patch coverage: 16 of 16 lines across 1 file are fully covered (100%).
  • No coverage regressions found.

Uncovered Changes

No uncovered changes found.

Coverage Regressions

No coverage regressions found.


Coverage Stats

Coverage Status
Relevant Lines: 37724
Covered Lines: 15987
Line Coverage: 42.38%
Coverage Strength: 11.9 hits per line

💛 - Coveralls

@whoAbhishekSah
Copy link
Copy Markdown
Member Author

whoAbhishekSah commented May 20, 2026

Manual verification — direct SpiceDB tuple inspection

Tested against a local Frontier + SpiceDB (Postgres backend). For each case I snapshot the relation_tuple table before and after the operation and confirm that all three tuples written by AssignRole (T1 bearer, T2 role, T3 resource-grant) are gone — i.e. no T3 leak. Permission checks are also exercised end-to-end.

# Scenario Tuples after delete Permission after delete Result
1 DeletePolicy — user principal, org resource 0 denied pass
2 DeletePolicy — user principal, project resource 0 denied pass
3 Recreate identical (role, resource, principal) after delete clean 3-tuple set, new polID restored pass
4 Idempotent delete — DeletePolicy called twice on the same id 0 after first call n/a pass (second call returns not_found from repo layer, relations side is clean)
5 RemoveOrganizationMember — per-member removal path 0 denied pass
6 Cross-resource: app/group principal on app/projectDeletePolicy 0 n/a pass
7 Cross-resource: app/group principal on app/projectDeleteGroup 0 n/a pass
8 CheckResourcePermission round-trip (false → true → false → true) n/a correct at every step pass
9 Upsert idempotency — same (role, resource, principal) created twice same polID, 3 tuples n/a pass
10 Service-user principal on org — DeletePolicy 0 n/a pass

Each check was done by SQL against the SpiceDB relation_tuple table, e.g.:

SELECT namespace, object_id, relation, userset_namespace, userset_object_id
FROM relation_tuple
WHERE deleted_xid = '9223372036854775807'::xid8
  AND ((namespace = 'app/rolebinding' AND object_id = '<polID>')
    OR (userset_namespace = 'app/rolebinding' AND userset_object_id = '<polID>'));

Cases 5 and 7 are the ones that previously leaked T3 forever (per-member removal and group-as-principal-on-other-resource) — both clean now.

Note on the relation.ErrNotExist tolerance

The Subject-side delete's errors.Is(err, relation.ErrNotExist) branch is currently defensive — relation.Service.Delete returns nil for empty matches today, so the path isn't reachable from normal flows. I'd keep it for forward-compat: it costs nothing and protects against future relation.Delete implementations that surface ErrNotExist.

Out of scope (filed separately)

While testing I found an unrelated leak: app/serviceuser:<su>#org@app/organization:<org> (the SU↔org identity-link tuple) leaks on DeleteServiceUser when the SU has no remaining org policies at delete time. This is not a rolebinding tuple, so not in scope for this PR — filed as #1629.

@whoAbhishekSah whoAbhishekSah merged commit 4603d2c into main May 20, 2026
8 checks passed
@whoAbhishekSah whoAbhishekSah deleted the fix/policy-delete-symmetric-rolebinding-cleanup branch May 20, 2026 05:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

policy.Delete is asymmetric with AssignRole: resource-grant tuple leaks on every policy delete

3 participants