Conversation

@telekosmos
Contributor

@telekosmos telekosmos commented Mar 24, 2025

Addresses #69.

As noted in that issue, this check overlaps with noSensitiveInfoInRepositories but extends it to cover commits as well.

Summary by CodeRabbit

  • New Features

    • Introduced a compliance check for scanning commits for sensitive information, evaluating secret scanning settings at both organization and repository levels.
    • Added alerts and remediation tasks for projects that do not meet secret scanning requirements.
  • Bug Fixes

    • Improved database schema to ensure related records are automatically updated or deleted when compliance checks or resources are changed.
  • Tests

    • Added comprehensive integration and validation tests covering multiple scenarios for secret scanning compliance.
  • Chores

    • Updated database configuration to support specifying a custom port.
    • Enhanced compliance check metadata and documentation references.

@coderabbitai
coderabbitai bot commented Jun 30, 2025

Walkthrough

This update introduces a new compliance check for scanning GitHub commits for sensitive information. It adds the check's implementation, validation logic, integration tests, and schema/migration changes to support the feature. The database configuration is updated, and foreign key constraints are enhanced to include cascading behavior for improved data integrity.

Changes

| File(s) | Change Summary |
| --- | --- |
| __tests__/checks/scanCommitsForSensitiveInfo.test.js | Added integration tests for the new scanCommitsForSensitiveInfo compliance check, covering multiple pass/fail scenarios. |
| __tests__/checks/validators/scanCommitsForSensitiveInfo.test.js | Added comprehensive tests for the scanCommitsForSensitiveInfo validator, simulating various org/repo secret scanning setups. |
| src/checks/complianceChecks/scanCommitsForSensitiveInfo.js | Introduced the main compliance check logic for scanning commits for sensitive info, with database interactions and logging. |
| src/checks/validators/scanCommitsForSensitiveInfo.js | Added a validator function to assess secret scanning status across orgs and repos, producing results, alerts, and tasks. |
| src/checks/validators/index.js | Registered the new scanCommitsForSensitiveInfo validator in the validators export. |
| src/config/index.js | Updated DB config to include a port property, using the DB_PORT env variable when set and defaulting to 5432 otherwise. |
| src/database/migrations/1742403845916_update_check_scanCommitsForSensitiveInfo.js.js | Migration to update compliance_checks table for scanCommitsForSensitiveInfo check status, type, and reference URL. |
| src/database/schema/schema.sql | Modified foreign key constraints on resources_for_compliance_checks table to add cascade on update/delete. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant ComplianceCheck as ComplianceCheck (scanCommitsForSensitiveInfo)
    participant DataStore as DataStore (DB)
    participant Validator

    User->>ComplianceCheck: Trigger check (with optional projects)
    ComplianceCheck->>DataStore: Fetch check metadata and projects
    ComplianceCheck->>DataStore: Fetch GitHub orgs and repos for projects
    ComplianceCheck->>Validator: Validate org/repo secret scanning settings
    Validator-->>ComplianceCheck: Return results, alerts, tasks
    ComplianceCheck->>DataStore: Delete existing alerts/tasks for this check
    ComplianceCheck->>DataStore: Upsert results, insert alerts/tasks
    ComplianceCheck-->>User: Completion (results stored)
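
The sequence above can be sketched in JavaScript roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the store methods (`getCheck`, `getOrgsAndRepos`, `deleteAlerts`, `deleteTasks`, `upsertResult`, `addAlert`, `addTask`), the record fields, and the validator output shape are all hypothetical, and an in-memory object stands in for the database.

```javascript
// Hypothetical in-memory stand-in for the database layer (not the project's real store API).
function createMemoryStore () {
  const state = { results: new Map(), alerts: [], tasks: [] }
  return {
    state,
    getCheck: async (codeName) => ({ id: 1, code_name: codeName }),
    getOrgsAndRepos: async () => [{ project_id: 7, orgs: [], repos: [] }],
    deleteAlerts: async (checkId) => {
      state.alerts = state.alerts.filter(a => a.compliance_check_id !== checkId)
    },
    deleteTasks: async (checkId) => {
      state.tasks = state.tasks.filter(t => t.compliance_check_id !== checkId)
    },
    // Keyed by check and project, so a rerun overwrites instead of duplicating.
    upsertResult: async (result) => {
      state.results.set(`${result.compliance_check_id}:${result.project_id}`, result)
    },
    addAlert: async (alert) => { state.alerts.push(alert) },
    addTask: async (task) => { state.tasks.push(task) }
  }
}

// Mirrors the diagram: fetch, validate, clear stale alerts/tasks, then persist.
async function runCheck (store, validator, codeName) {
  const check = await store.getCheck(codeName)
  const data = await store.getOrgsAndRepos()
  const analysis = validator({ check, data })

  // Delete first so reruns never accumulate stale alerts or tasks.
  await store.deleteAlerts(check.id)
  await store.deleteTasks(check.id)

  await Promise.all(analysis.results.map(r => store.upsertResult(r)))
  await Promise.all(analysis.alerts.map(a => store.addAlert(a)))
  await Promise.all(analysis.tasks.map(t => store.addTask(t)))
  return analysis
}

// Hypothetical validator that reports every project as failing.
const failingValidator = ({ check, data }) => ({
  results: data.map(p => ({ compliance_check_id: check.id, project_id: p.project_id, status: 'failed' })),
  alerts: data.map(p => ({ compliance_check_id: check.id, project_id: p.project_id })),
  tasks: data.map(p => ({ compliance_check_id: check.id, project_id: p.project_id }))
})
```

Running `runCheck` twice against the same store leaves one result, one alert, and one task per failing project, which is the property the delete-before-insert step is meant to guarantee.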

Poem

In the warren where code bunnies dwell,
We sniff for secrets, and sniff them well!
With checks and tests, our paws are swift—
Guarding your repos, we’re security’s gift.
With cascading keys and configs anew,
This rabbit ensures your secrets stay few!
🐇🔍✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 27ee196 and 341eb76.

📒 Files selected for processing (1)
  • __tests__/checks/validators/scanCommitsForSensitiveInfo.test.js (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • __tests__/checks/validators/scanCommitsForSensitiveInfo.test.js
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: build
  • GitHub Check: Playwright Tests
  • GitHub Check: Analyze

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/database/migrations/1742403845916_update_check_scanCommitsForSensitiveInfo.js.js (1)

1-19: Well-structured migration with proper rollback support.

The migration correctly updates the compliance check metadata and includes a proper down migration. The reference URL aligns with the PR objectives (issue #69).

Consider adding error handling for cases where the compliance check doesn't exist:

exports.up = async (knex) => {
-  await knex('compliance_checks')
+  const updated = await knex('compliance_checks')
    .where({ code_name: 'scanCommitsForSensitiveInfo' })
    .update({
      implementation_status: 'completed',
      implementation_type: 'computed',
      implementation_details_reference: 'https://github.com/OpenPathfinder/visionBoard/issues/69'
    })
+  if (updated === 0) {
+    throw new Error('scanCommitsForSensitiveInfo compliance check not found')
+  }
}
src/checks/complianceChecks/scanCommitsForSensitiveInfo.js (2)

13-16: Consider adding validation for check existence.

While the code handles the projects parameter well, consider adding validation to ensure the compliance check exists:

const check = await getCheckByCodeName('scanCommitsForSensitiveInfo')
+if (!check) {
+  throw new Error('scanCommitsForSensitiveInfo compliance check not found')
+}

26-31: Efficient parallel processing of database operations.

Using Promise.all for bulk operations is a good performance optimization. Consider adding error handling to provide better debugging information if individual operations fail:

-await Promise.all(analysis.results.map(result => upsertComplianceCheckResult(result)))
+try {
+  await Promise.all(analysis.results.map(result => upsertComplianceCheckResult(result)))
+} catch (error) {
+  debug('Error upserting results:', error)
+  throw error
+}
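
If it matters which individual upserts failed, one alternative (not what this PR does) is `Promise.allSettled`, which waits for every operation and reports each outcome. A minimal sketch, where `upsertAll`, the `upsert` callback, and the `project_id` field are hypothetical names chosen for illustration:

```javascript
// Hypothetical helper: run every upsert, then report all failures together.
async function upsertAll (results, upsert) {
  const outcomes = await Promise.allSettled(results.map(result => upsert(result)))

  // Pair each outcome with the input that produced it, keeping only rejections.
  const failures = outcomes
    .map((outcome, index) => ({ outcome, result: results[index] }))
    .filter(({ outcome }) => outcome.status === 'rejected')

  if (failures.length > 0) {
    const details = failures
      .map(({ outcome, result }) => {
        const reason = outcome.reason instanceof Error ? outcome.reason.message : String(outcome.reason)
        return `project ${result.project_id}: ${reason}`
      })
      .join('; ')
    throw new Error(`${failures.length} of ${results.length} upserts failed (${details})`)
  }
}
```

Unlike `Promise.all`, which rejects on the first failure while the remaining operations continue unobserved, this surfaces every failure in a single error.
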
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 05747a5 and 27ee196.

📒 Files selected for processing (8)
  • __tests__/checks/scanCommitsForSensitiveInfo.test.js (1 hunks)
  • __tests__/checks/validators/scanCommitsForSensitiveInfo.test.js (1 hunks)
  • src/checks/complianceChecks/scanCommitsForSensitiveInfo.js (1 hunks)
  • src/checks/validators/index.js (1 hunks)
  • src/checks/validators/scanCommitsForSensitiveInfo.js (1 hunks)
  • src/config/index.js (1 hunks)
  • src/database/migrations/1742403845916_update_check_scanCommitsForSensitiveInfo.js.js (1 hunks)
  • src/database/schema/schema.sql (1 hunks)
🧰 Additional context used
🧠 Learnings (4)
src/config/index.js (2)
Learnt from: UlisesGascon
PR: OpenPathfinder/visionBoard#233
File: playwright.config.js:59-59
Timestamp: 2025-05-03T07:11:12.719Z
Learning: In the visionBoard repository, the team has deliberately chosen to run Playwright tests without explicitly setting NODE_ENV=test in the webServer command, as part of a change that implements automatic database migrations on server start.
Learnt from: UlisesGascon
PR: OpenPathfinder/visionBoard#229
File: .github/workflows/e2e-tests.yml:16-29
Timestamp: 2025-05-03T05:39:42.677Z
Learning: For the VisionBoard project, database testing credentials (username: visionBoard, password: password) can be hardcoded in GitHub workflow files as they are the same default values used throughout the application for testing purposes.
src/database/migrations/1742403845916_update_check_scanCommitsForSensitiveInfo.js.js (1)
Learnt from: UlisesGascon
PR: OpenPathfinder/visionBoard#233
File: playwright.config.js:59-59
Timestamp: 2025-05-03T07:11:12.719Z
Learning: In the visionBoard repository, the team has deliberately chosen to run Playwright tests without explicitly setting NODE_ENV=test in the webServer command, as part of a change that implements automatic database migrations on server start.
__tests__/checks/validators/scanCommitsForSensitiveInfo.test.js (1)
Learnt from: UlisesGascon
PR: OpenPathfinder/visionBoard#229
File: e2e/website.spec.js:9-10
Timestamp: 2025-05-03T05:38:39.703Z
Learning: In the e2e tests for the VisionBoard repository, the testProjectId is currently hardcoded to match the test data created in the global-setup.js file. Even though there is a TODO comment about loading it from fixtures, the fixtures structure doesn't exist yet, so suggestions to implement this TODO should verify the fixtures structure first.
__tests__/checks/scanCommitsForSensitiveInfo.test.js (2)
Learnt from: UlisesGascon
PR: OpenPathfinder/visionBoard#229
File: e2e/website.spec.js:9-10
Timestamp: 2025-05-03T05:38:39.703Z
Learning: In the e2e tests for the VisionBoard repository, the testProjectId is currently hardcoded to match the test data created in the global-setup.js file. Even though there is a TODO comment about loading it from fixtures, the fixtures structure doesn't exist yet, so suggestions to implement this TODO should verify the fixtures structure first.
Learnt from: UlisesGascon
PR: OpenPathfinder/visionBoard#233
File: playwright.config.js:59-59
Timestamp: 2025-05-03T07:11:12.719Z
Learning: In the visionBoard repository, the team has deliberately chosen to run Playwright tests without explicitly setting NODE_ENV=test in the webServer command, as part of a change that implements automatic database migrations on server start.
🧬 Code Graph Analysis (4)
src/checks/validators/index.js (1)
src/checks/validators/scanCommitsForSensitiveInfo.js (1)
  • require (2-6)
__tests__/checks/validators/scanCommitsForSensitiveInfo.test.js (2)
src/checks/validators/scanCommitsForSensitiveInfo.js (1)
  • require (2-6)
src/checks/validators/index.js (1)
  • scanCommitsForSensitiveInfo (7-7)
__tests__/checks/scanCommitsForSensitiveInfo.test.js (4)
src/checks/complianceChecks/scanCommitsForSensitiveInfo.js (3)
  • require (2-2)
  • check (13-13)
  • initializeStore (6-10)
src/store/index.js (1)
  • getCheckByCodeName (52-55)
__utils__/index.js (2)
  • resetDatabase (2-11)
  • generateGithubRepoData (13-29)
__fixtures__/index.js (1)
  • sampleGithubOrg (2-66)
src/checks/validators/scanCommitsForSensitiveInfo.js (1)
src/utils/index.js (3)
  • groupArrayItemsByCriteria (75-83)
  • generatePercentage (85-89)
  • getSeverityFromPriorityGroup (56-73)
🪛 GitHub Actions: CI
__tests__/checks/validators/scanCommitsForSensitiveInfo.test.js

[error] 1-1: ESLint: 'se' is assigned a value but never used. (no-unused-vars)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Analyze
  • GitHub Check: Playwright Tests
🔇 Additional comments (14)
src/config/index.js (1)

9-10: LGTM! Good addition for database configuration flexibility.

The port configuration follows the established pattern and uses the standard PostgreSQL port as a sensible default.
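
For illustration, a port lookup with a default might look like the sketch below. The helper name `resolveDbPort` and the range validation are assumptions added here; the PR's actual config may simply read `DB_PORT` and fall back to 5432.

```javascript
// Hypothetical helper: resolve the database port from an environment-like object.
function resolveDbPort (env = {}) {
  const parsed = Number.parseInt(env.DB_PORT, 10)
  // Fall back to the standard PostgreSQL port when DB_PORT is unset or not a valid port.
  const isValidPort = Number.isInteger(parsed) && parsed > 0 && parsed <= 65535
  return isValidPort ? parsed : 5432
}

// Example inputs; an application would typically pass process.env instead.
const defaultPort = resolveDbPort({})
const customPort = resolveDbPort({ DB_PORT: '6543' })
```

Parsing the value keeps the port numeric even though environment variables are always strings.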

src/database/schema/schema.sql (2)

1305-1305: Verify that CASCADE behavior aligns with business requirements.

The addition of ON UPDATE CASCADE ON DELETE CASCADE will automatically propagate changes and deletions from the compliance_checks table to resources_for_compliance_checks. This is generally beneficial for maintaining referential integrity, but ensure this automatic cleanup behavior matches your business logic.


1313-1313: Consistent CASCADE behavior applied.

Good consistency in applying the same CASCADE constraints to both foreign key relationships in the table.

src/checks/validators/index.js (1)

7-7: LGTM! Proper integration of the new validator.

The new scanCommitsForSensitiveInfo validator is correctly imported and added to the exports, following the established pattern.

Also applies to: 15-15

src/checks/complianceChecks/scanCommitsForSensitiveInfo.js (2)

5-10: Good initialization and dependency injection pattern.

The function signature and store initialization follow established patterns in the codebase.


22-24: Excellent cleanup strategy to prevent orphaned records.

The deletion of previous alerts and tasks before creating new ones is a best practice that prevents data inconsistency.

__tests__/checks/scanCommitsForSensitiveInfo.test.js (1)

1-259: Excellent comprehensive integration test coverage.

The test suite provides thorough coverage of the scanCommitsForSensitiveInfo compliance check with well-structured scenarios:

  • Proper database setup/teardown using beforeAll/beforeEach/afterAll hooks
  • Tests for passed checks with and without existing alerts/tasks cleanup
  • Tests for various failure scenarios (org-level, repo-level, mixed configurations)
  • Proper verification of database state changes (results, alerts, tasks)
  • Clean separation of test data setup and assertions

The integration tests effectively validate the end-to-end behavior of the compliance check including database interactions.

__tests__/checks/validators/scanCommitsForSensitiveInfo.test.js (1)

4-403: Outstanding validator test coverage.

The test suite provides comprehensive coverage of the scanCommitsForSensitiveInfo validator with 9 well-designed test cases covering:

  • All possible combinations of organization and repository secret scanning states
  • Proper handling of enabled, disabled, and unknown (null) values
  • Mixed scenarios with organizations and repositories in different states
  • Correct generation of alerts, results, and tasks for each scenario
  • Appropriate status determination (passed, failed, unknown)

The test fixtures are well-structured and the assertions verify the complete output structure including rationale messages, percentages, and task descriptions.

src/checks/validators/scanCommitsForSensitiveInfo.js (6)

10-47: Well-designed utility functions for failure detection.

The getOrgFailures and getRepoFailures functions provide clean separation of concerns for identifying organizations and repositories with disabled or unknown secret scanning configurations. The filtering logic correctly handles the different secret scanning attributes and null values.
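
The kind of filtering described here could look roughly like this. The function names `findOrgFailures` and `findRepoFailures` and both field names are assumptions for illustration; they are not taken from the PR's code.

```javascript
// Assumed field names, for illustration only:
//   org.secret_scanning_enabled_for_new_repositories -> true | false | null
//   repo.secret_scanning_status -> 'enabled' | 'disabled' | null

// Split organizations into explicitly disabled and unknown (null/undefined) groups.
function findOrgFailures (orgs) {
  return {
    disabled: orgs.filter(org => org.secret_scanning_enabled_for_new_repositories === false),
    unknown: orgs.filter(org => org.secret_scanning_enabled_for_new_repositories == null)
  }
}

// Same split for repositories, based on a status string.
function findRepoFailures (repos) {
  return {
    disabled: repos.filter(repo => repo.secret_scanning_status === 'disabled'),
    unknown: repos.filter(repo => repo.secret_scanning_status == null)
  }
}
```

Keeping disabled and unknown entries in separate lists lets later steps treat a confirmed failure differently from missing data.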


49-69: Excellent message building functions.

The buildOrgMessage and buildRepoMessage functions generate clear, descriptive messages for different failure scenarios. The percentage calculation for repositories provides valuable context about the scope of issues.
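
As a rough sketch of a repository message with a percentage: `toPercentage` and `buildRepoFailureMessage` are hypothetical names, and the wording and rounding are assumptions. The project's own `generatePercentage` and message builders may behave differently.

```javascript
// Hypothetical percentage helper; guards against division by zero.
function toPercentage (part, total) {
  if (total === 0) return '0%'
  return `${Math.round((part / total) * 100)}%`
}

// Hypothetical message builder for repositories without secret scanning.
function buildRepoFailureMessage (failedCount, totalCount) {
  const noun = totalCount === 1 ? 'repository' : 'repositories'
  const share = toPercentage(failedCount, totalCount)
  return `Secret scanning is not enabled in ${failedCount} of ${totalCount} ${noun} (${share})`
}
```

For example, `buildRepoFailureMessage(1, 4)` yields "Secret scanning is not enabled in 1 of 4 repositories (25%)".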


94-113: Smart early optimization for fully compliant projects.

The early check for allReposEnabled && allOrgDefaultsEnabled provides an efficient path for projects that are fully compliant, avoiding unnecessary processing of failure scenarios. This optimization improves performance for the common success case.


127-150: Robust status determination logic.

The status determination logic properly prioritizes failures over unknowns and handles complex rationale building for mixed scenarios. The conditional logic for combining organization and repository rationales is well-thought-out.
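
The prioritization described here can be sketched as below. The function names and the input shape are assumptions for illustration; only the three status values (passed, failed, unknown) come from the review itself.

```javascript
// Hypothetical status resolver: any confirmed failure wins over unknowns.
function resolveStatus ({ disabledCount, unknownCount }) {
  if (disabledCount > 0) return 'failed'
  if (unknownCount > 0) return 'unknown'
  return 'passed'
}

// Join the rationale parts that apply, skipping empty ones.
function combineRationale (orgRationale, repoRationale) {
  return [orgRationale, repoRationale].filter(Boolean).join('. ')
}
```

Checking for failures first means a project with one disabled repository and several unknown ones is still reported as failed rather than unknown.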


158-199: Comprehensive alert and task generation.

The logic for generating alerts and tasks correctly handles different failure combinations (org-only, repo-only, mixed) with appropriate titles and descriptions. The task title generation provides actionable guidance with specific counts and percentages.


72-210: Excellent overall validator implementation.

The main validator function demonstrates solid software engineering practices:

  • Clear structure with logical flow from data processing to result generation
  • Proper error handling and edge case coverage
  • Effective use of debug logging for troubleshooting
  • Clean separation between data analysis and output generation
  • Consistent data structure for results, alerts, and tasks

The implementation correctly addresses the PR objective of extending sensitive information detection to include commit scanning capabilities.
