feat(gdpr): add document expiration and automatic cleanup#208
Conversation
- Enable experimental.tasks in nuxt.config.ts - server/tasks/cleanup-old-documents.ts: Nitro task definition - server/utils/cleanup-old-documents.ts: reusable utility function Signed-off-by: Florian Chab <f.chabreiron@gmail.com>
- Add expirationDate column + index to document table - Migration for existing databases Signed-off-by: Florian Chab <fchabreiron@gmail.com>
- GET /api/documents: list org documents - PATCH /api/documents/:id/expiration: update expiration date Signed-off-by: Florian Chab <fchabreiron@gmail.com>
- Admin components - Expiration date editing interface - Timeline view updates Signed-off-by: Florian Chab <fchabreiron@gmail.com>
- GDPR_CLEANUP_ENABLED env variable Signed-off-by: Florian Chab <fchabreiron@gmail.com>
- Comprehensive unit tests for cleanup function Signed-off-by: Florian Chab <fchabreiron@gmail.com>
- Update CHANGELOG, PRODUCT and ARCHITECTURE with new feature" Signed-off-by: Florian Chab <fchabreiron@gmail.com>
- All locale files updated with GDPR-related strings Signed-off-by: Florian Chab <fchabreiron@gmail.com>
Signed-off-by: Florian Chab <fchabreiron@gmail.com>
Signed-off-by: Florian Chab <fchabreiron@gmail.com>
📝 WalkthroughWalkthroughImplements GDPR document retention across the full stack: adds an ChangesGDPR Document Retention
Sequence Diagram(s)sequenceDiagram
participant Cron as Nitro Cron (daily)
participant Task as cleanup-old-documents
participant Util as cleanupOldDocuments()
participant DB as PostgreSQL
participant S3 as MinIO/S3
Cron->>Task: run()
Task->>Util: cleanupOldDocuments()
Util->>Util: check GDPR_CLEANUP_ENABLED
Util->>DB: findMany (expirationDate < now)
DB-->>Util: expired documents[]
loop per expired document
Util->>S3: deleteFromS3(storageKey)
S3-->>Util: success / error
alt S3 success
Util->>DB: delete row (id + organizationId)
else S3 error
Util->>Util: record error, continue
end
end
Util-->>Task: { deletedCount, errors }
Task-->>Cron: { success, deletedCount, errors, timestamp }
sequenceDiagram
participant Admin as Admin User
participant Page as /admin/documents
participant Modal as DocumentExpirationModal
participant API as PATCH /api/documents/:id/expiration
participant DB as PostgreSQL
participant Timeline as Activity Timeline
Admin->>Page: load page
Page->>Page: GET /api/documents
Page-->>Admin: document table (color-coded expiration)
Admin->>Page: click edit on a row
Page->>Modal: open (documentId, currentExpiration)
Admin->>Modal: set new datetime-local value, save
Modal->>API: PATCH expirationDate
API->>DB: update expirationDate
API->>DB: insert updated activity (from/to metadata)
API-->>Modal: { success, documentId, expirationDate }
Modal-->>Page: emit updated
Page->>Page: reload document list
Timeline->>DB: fetch activity events
DB-->>Timeline: updated document event with metadata
Timeline-->>Admin: from/to badge + custom description
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Signed-off-by: Florian Chab <fchabreiron@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 12
🧹 Nitpick comments (1)
server/utils/schemas/document.ts (1)
17-18: 🏗️ Heavy liftUse calendar-year arithmetic for retention deadlines.
Line 18 encodes “2 years” as a fixed 730-day duration, while the DB contract uses calendar
INTERVAL '2 years'. These can diverge around leap years and cause earlier-than-expected expiration. Prefer a shared calendar-year helper (e.g.,setUTCFullYear(+2)) and reuse it at insert call sites.♻️ Proposed direction
-/** Default document retention period in milliseconds (2 years for GDPR compliance) */ -export const DEFAULT_DOCUMENT_RETENTION_MS = 2 * 365 * 24 * 60 * 60 * 1000 +/** Default document retention period in years (GDPR compliance) */ +export const DEFAULT_DOCUMENT_RETENTION_YEARS = 2 + +/** Calendar-safe expiration date calculation (avoids leap-year drift). */ +export function getDefaultDocumentExpirationDate(from = new Date()): Date { + const expiration = new Date(from) + expiration.setUTCFullYear(expiration.getUTCFullYear() + DEFAULT_DOCUMENT_RETENTION_YEARS) + return expiration +}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/utils/schemas/document.ts` around lines 17 - 18, The DEFAULT_DOCUMENT_RETENTION_MS constant uses a fixed 730-day calculation that does not account for leap years, while the database uses calendar INTERVAL '2 years' which properly handles them, causing potential misalignment. Replace this fixed millisecond constant with a shared calendar-year helper function that uses setUTCFullYear(+2) to calculate the retention deadline, and update all insert call sites to use this helper instead of the constant to ensure consistency with the database contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.env.example:
- Around line 47-49: The GDPR_CLEANUP_ENABLED configuration in the .env.example
file is set to true, which increases the risk of accidental data deletion when
teams bootstrap from this template. Change the GDPR_CLEANUP_ENABLED value from
true to false so that automatic document cleanup is disabled by default and
requires explicit opt-in by teams that understand the implications.
In `@app/components/admin/DocumentExpirationModal.vue`:
- Around line 19-23: The getFormattedDate() function uses toISOString() which
returns UTC time, but datetime-local input fields expect local time values. This
causes timezone drift when values are re-saved. Replace the UTC-based
toISOString() approach in getFormattedDate() with local timezone formatting by
extracting local date and time components (getFullYear, getMonth, getDate,
getHours, getMinutes) and constructing the datetime-local format string from
those local values. Apply the same timezone-aware local formatting approach to
the other locations that use toISOString() for datetime-local values, including
the :min binding at line 88 and the related code at line 49.
In `@app/pages/admin/documents/index.vue`:
- Around line 9-10: Replace the hardcoded fr-FR locale string used for time
formatting on line 58 with a dynamic locale reference from the useI18n() hook
that is already initialized at the top of the component. Extract the current
locale from the i18n instance and pass it to the date formatting function
instead of the static fr-FR string to ensure time displays in the user's active
application locale.
In `@app/pages/dashboard/timeline.vue`:
- Around line 375-379: The return statement string in the Document update
condition contains awkward punctuation (colon after "of") and trailing filler
characters (dash and spaces) that make the message read poorly. Revise the
template literal string to remove the unnecessary colon after "Expiration date
of", clean up the grammar, and remove the trailing dash and spaces to produce a
natural, complete sentence that displays properly to users.
In `@server/api/documents/`[id]/expiration.patch.ts:
- Around line 26-30: After parsing and creating the expirationDate Date object
from expirationDateStr in the readValidatedBody call, add validation to ensure
the expiration date is in the future. Compare the expirationDate with the
current date and time, and if it is in the past, throw or return an error
response that will prevent the API from accepting the request. This validation
must occur at the API boundary before any cleanup or deletion logic is triggered
based on the expiration date.
- Around line 79-92: The condition on line 88 that compares `expirationDate !==
doc.expirationDate` is comparing Date objects by reference which will always
evaluate as true for newly parsed dates. Fix this by comparing the
already-formatted date strings instead: change the condition to compare
oldExpirationDate with newExpirationDate. Additionally, the recordActivity
function call is not awaited, which can cause audit log entries to be dropped if
the request lifecycle ends early. Add await before the recordActivity call to
ensure the audit logging completes before the request concludes.
In `@server/api/documents/index.get.ts`:
- Around line 17-25: The columns object in the document list endpoint is
returning more fields than necessary, which increases unnecessary data exposure.
Remove the fields `storageKey`, `organizationId`, and `candidateId` from the
columns selection in the index.get.ts file, keeping only `id`,
`originalFilename`, `createdAt`, and `expirationDate` to match the minimal data
requirements of the admin page.
- Around line 11-16: The defineEventHandler function checks authentication using
requireAuth but does not verify the user has permission to list documents. Add a
permission check after obtaining the session from requireAuth to ensure only
users with appropriate permissions (such as admin role) can access the document
listing. Implement this by checking the user's role or permissions from the
session object before executing the database query in the findMany call.
In `@server/database/migrations/0029_tearful_magik.sql`:
- Around line 2-8: The migration for the document table's expiration_date column
is missing a DEFAULT clause that should match the schema definition in
server/database/schema/app.ts. After the UPDATE statement that sets
expiration_date values for existing documents and before the ALTER COLUMN NOT
NULL statement, add an ALTER TABLE statement to SET DEFAULT with a value of
NOW() plus INTERVAL '2 years' to ensure new document records automatically
receive an expiration date and maintain consistency with the schema definition.
In `@server/database/schema/app.ts`:
- Line 142: Replace the `.$defaultFn` call with `.default` on the
`expirationDate` column definition. Change `.$defaultFn(() => sql\`(NOW() +
INTERVAL '2 years')\`)` to `.default(sql\`(NOW() + INTERVAL '2 years')\`)`. This
ensures the default value is enforced at the database level rather than only at
the application runtime, and allows Drizzle Kit to properly capture this
constraint in migration files.
In `@server/utils/cleanup-old-documents.ts`:
- Around line 19-24: The feature-gate logic in the cleanup-old-documents
function is inverted due to the negation operator in the condition check.
Currently when GDPR_CLEANUP_ENABLED is set to "true", the function returns early
and skips cleanup, which is backwards. Remove the negation operator (!) from the
condition at line 19 so that cleanup is skipped only when GDPR_CLEANUP_ENABLED
is not set to "true", allowing cleanup to execute when the flag is explicitly
enabled.
In `@tests/unit/cleanup-old-documents.test.ts`:
- Around line 33-42: The environment variable GDPR_CLEANUP_ENABLED is being set
in the beforeEach hook, but the cleanupOldDocuments module is imported before
this hook runs, which means any feature-flag checks in that module execute with
the unset environment variable. Move the environment variable setup to before
the imports, either at the top of the test file or within a setup block that
runs before module imports, so that when cleanupOldDocuments is imported and
evaluated, it sees the correct GDPR_CLEANUP_ENABLED flag value.
---
Nitpick comments:
In `@server/utils/schemas/document.ts`:
- Around line 17-18: The DEFAULT_DOCUMENT_RETENTION_MS constant uses a fixed
730-day calculation that does not account for leap years, while the database
uses calendar INTERVAL '2 years' which properly handles them, causing potential
misalignment. Replace this fixed millisecond constant with a shared
calendar-year helper function that uses setUTCFullYear(+2) to calculate the
retention deadline, and update all insert call sites to use this helper instead
of the constant to ensure consistency with the database contract.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b416e9c0-b5f6-4139-9d88-e4e59391df5c
📒 Files selected for processing (27)
.env.exampleARCHITECTURE.mdCHANGELOG.mdPRODUCT.mdapp/components/admin/DocumentExpirationModal.vueapp/pages/admin/documents/index.vueapp/pages/dashboard/timeline.vuei18n/locales/de.jsoni18n/locales/en.jsoni18n/locales/es.jsoni18n/locales/fr.jsoni18n/locales/nb.jsoni18n/locales/vi.jsonnuxt.config.tsserver/api/candidates/[id]/documents/index.post.tsserver/api/documents/[id]/expiration.patch.tsserver/api/documents/index.get.tsserver/api/public/jobs/[slug]/apply.post.tsserver/database/migrations/0029_tearful_magik.sqlserver/database/migrations/meta/0029_snapshot.jsonserver/database/migrations/meta/_journal.jsonserver/database/schema/app.tsserver/tasks/cleanup-old-documents.tsserver/utils/cleanup-old-documents.tsserver/utils/env.tsserver/utils/schemas/document.tstests/unit/cleanup-old-documents.test.ts
| import { cleanupOldDocuments } from '../../server/utils/cleanup-old-documents' | ||
| import { deleteFromS3 } from '../../server/utils/s3' | ||
| import { db } from '../../server/utils/db' | ||
|
|
||
| describe('cleanupOldDocuments', () => { | ||
| beforeEach(() => { | ||
| vi.clearAllMocks() | ||
| // Enable GDPR cleanup for tests | ||
| process.env.GDPR_CLEANUP_ENABLED = 'true' | ||
| }) |
There was a problem hiding this comment.
Env flag setup happens too late to reliably test feature-flag behavior.
cleanupOldDocuments is imported on Line 33 before process.env.GDPR_CLEANUP_ENABLED is set on Line 41. That can freeze env-derived config before each test runs and miss real gating bugs.
💡 Suggested fix pattern
-import { cleanupOldDocuments } from '../../server/utils/cleanup-old-documents'
+let cleanupOldDocuments: typeof import('../../server/utils/cleanup-old-documents').cleanupOldDocuments
describe('cleanupOldDocuments', () => {
- beforeEach(() => {
+ beforeEach(async () => {
vi.clearAllMocks()
- // Enable GDPR cleanup for tests
+ vi.resetModules()
process.env.GDPR_CLEANUP_ENABLED = 'true'
+ ;({ cleanupOldDocuments } = await import('../../server/utils/cleanup-old-documents'))
})🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/unit/cleanup-old-documents.test.ts` around lines 33 - 42, The
environment variable GDPR_CLEANUP_ENABLED is being set in the beforeEach hook,
but the cleanupOldDocuments module is imported before this hook runs, which
means any feature-flag checks in that module execute with the unset environment
variable. Move the environment variable setup to before the imports, either at
the top of the test file or within a setup block that runs before module
imports, so that when cleanupOldDocuments is imported and evaluated, it sees the
correct GDPR_CLEANUP_ENABLED flag value.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Florian Chab <fchabreiron@gmail.com>
- Update the input so that only the date is changed, not the time - Update the date display Signed-off-by: Florian Chab <fchabreiron@gmail.com>
…pdate - Verify database update success via rowsAffected before recording activity - Throw explicit error if document expiration update fails - Prevents false activity logs for failed operations Signed-off-by: Florian Chab <fchabreiron@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…ctices - The approach was causing inconsistencies with the generated migration SQL. Switching to default(sql...) ensures the same expression is used in both the schema definition and the migration file. Signed-off-by: Florian Chab <fchabreiron@gmail.com>
…ab/reqcore into feat/gdpr-document-cleanup
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@server/database/migrations/0029_tearful_magik.sql`:
- Line 1: The migration in file 0029_tearful_magik.sql incorrectly applies the
expiration_date default at migration time (NOW()) for all existing document
rows, rather than calculating it based on each document's created_at timestamp.
Fix this by splitting the migration into three steps: first ADD the column
without NOT NULL and with a nullable default, then UPDATE all existing document
rows to set expiration_date to their created_at plus 2 years interval, and
finally ALTER the column to add the NOT NULL constraint with the NOW() plus 2
years default for future inserts.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b7b219d2-504c-44e3-8747-f1aa0f6cd78e
📒 Files selected for processing (7)
app/components/admin/DocumentExpirationModal.vueapp/pages/admin/documents/index.vueserver/api/documents/[id]/expiration.patch.tsserver/database/migrations/0029_tearful_magik.sqlserver/database/migrations/meta/0029_snapshot.jsonserver/database/migrations/meta/_journal.jsonserver/database/schema/app.ts
✅ Files skipped from review due to trivial changes (1)
- server/database/migrations/meta/_journal.json
🚧 Files skipped from review as they are similar to previous changes (4)
- server/database/schema/app.ts
- app/components/admin/DocumentExpirationModal.vue
- app/pages/admin/documents/index.vue
- server/api/documents/[id]/expiration.patch.ts
| @@ -0,0 +1,2 @@ | |||
| ALTER TABLE "document" ADD COLUMN "expiration_date" timestamp DEFAULT (NOW() + INTERVAL '2 years') NOT NULL;--> statement-breakpoint | |||
There was a problem hiding this comment.
Backfill semantics are incorrect for existing documents.
Line 1 sets NOT NULL with a NOW() + INTERVAL '2 years' default directly on add, so pre-existing rows get the migration-time expiry instead of created_at + INTERVAL '2 years'. That violates the retention contract and can retain already-old personal data longer than intended.
Proposed migration shape
-ALTER TABLE "document" ADD COLUMN "expiration_date" timestamp DEFAULT (NOW() + INTERVAL '2 years') NOT NULL;--> statement-breakpoint
+ALTER TABLE "document" ADD COLUMN "expiration_date" timestamp;
+UPDATE "document"
+SET "expiration_date" = "created_at" + INTERVAL '2 years'
+WHERE "expiration_date" IS NULL;
+ALTER TABLE "document" ALTER COLUMN "expiration_date" SET DEFAULT (NOW() + INTERVAL '2 years');
+ALTER TABLE "document" ALTER COLUMN "expiration_date" SET NOT NULL;--> statement-breakpoint🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@server/database/migrations/0029_tearful_magik.sql` at line 1, The migration
in file 0029_tearful_magik.sql incorrectly applies the expiration_date default
at migration time (NOW()) for all existing document rows, rather than
calculating it based on each document's created_at timestamp. Fix this by
splitting the migration into three steps: first ADD the column without NOT NULL
and with a nullable default, then UPDATE all existing document rows to set
expiration_date to their created_at plus 2 years interval, and finally ALTER the
column to add the NOT NULL constraint with the NOW() plus 2 years default for
future inserts.
|
Hey @FloChab Thank you for taking the time to tackle this, really appreciate it. I'm going to close this in favor of #211, but I genuinely appreciate you putting the work in and helping put this on the radar. Would love your eyes on #211 just let me know if there's something you wan't to improve, and I'll be on it, See you around |
Summary
✨ Key Changes
🗃️ Database
0029: Addsexpiration_datecolumn todocumenttablecreated_at + INTERVAL '2 years'⚡ Nitro Scheduled Task
server/utils/cleanup-old-documents.ts→ Reusable utility function (testable without Nitro dependencies)server/tasks/cleanup-old-documents.ts→ Nitro task definition usingdefineTasknitro.experimental.tasks: trueinnuxt.config.tsnitro.scheduledTasks(default: daily at 00:00 UTC)GDPR_CLEANUP_ENABLEDenvironment variable (default:falsefor safety)🔌 API Endpoints
GET /api/documents→ List all organization documentsPATCH /api/documents/:id/expiration→ Extend retention period for specific documents (legal hold)POST /api/candidates/[id]/documents→ Setsexpiration_dateon uploadPOST /api/public/jobs/[slug]/apply→ Enforces document limits🖥️ Admin UI
app/components/admin/for expiration managementapp/pages/admin/for document retention settingsapp/pages/dashboard/timeline.vue🌍 Internationalization
de.json,en.json,es.json,fr.json,nb.json,vi.json) updated with GDPR-related strings📚 Documentation
ARCHITECTURE.mdwith new task structurePRODUCT.mdwith technical detailsCHANGELOG.mdwith feature description🎯 GDPR Compliance Features
expiration_datefieldcleanup_old_documents.*eventsGDPR_CLEANUP_ENABLED=false📋 Checklist
git commit -sorganizationId🚀 Deployment Notes
Before merging to main:
tasks). Ensure your deployment platform supports it.GDPR_CLEANUP_ENABLED=false). Enable only after testing in staging.expiration_dateset tocreated_at + 2 yearsvia migration.❗ Important Notice: Documents Older Than 2 Years
Migration
0029automatically setsexpiration_date = created_at + INTERVAL '2 years'for all existing documents in your database.🚨 Immediate Impact
If your instance contains documents created more than 2 years ago:
created_at + 2 yearsDeletion is permanent and irreversible:
🔗 Related Issues / Context
Type of change
Validation
DCO
Signed-off-by) viagit commit -sSummary by CodeRabbit
Release Notes
GDPR_CLEANUP_ENABLEDtoggle.