
Feat: Add support for Snowflake key-pair authentication#702

Merged
Aaron ("AJ") Steers (aaronsteers) merged 5 commits into airbytehq:main from Alioune05:alioune05/feature/snowflake-keypair-auth
Jun 27, 2025

Conversation

@Alioune05

@Alioune05 Yves Alioune Amoussou (Alioune05) commented Jun 26, 2025

Contributor

Summary

Related to #654 and #681 and this Slack conversation.
Update SnowflakeConfig and SqlConfig to allow connecting with any of the following:

  • private_key
  • private_key + private_key_passphrase
  • private_key_path
  • private_key_path + private_key_passphrase

It also makes the password attribute optional.

Testing

  • Added unit tests, focusing on get_vendor_client and get_sql_alchemy_url

Summary by CodeRabbit

  • New Features

    • Added support for Snowflake key pair authentication, allowing users to connect using a private key (with optional passphrase) or a private key file, in addition to password-based authentication.
  • Documentation

    • Expanded usage examples to demonstrate connecting to Snowflake using private key authentication methods.
  • Tests

    • Introduced comprehensive unit tests covering Snowflake authentication scenarios, including password and private key methods, validation logic, and error handling.
  • Chores

    • Added a new dependency on the cryptography package to support key handling and encryption features.

@coderabbitai

coderabbitai Bot commented Jun 26, 2025

Contributor
📝 Walkthrough

Walkthrough

The SnowflakeConfig class was enhanced to support multiple authentication methods, including password and private key (via string or file path, with optional passphrase). Validation logic and key handling methods were added. Corresponding documentation, tests, and the base SqlConfig interface were updated to accommodate these authentication options.

Changes

| File(s) | Change Summary |
| --- | --- |
| airbyte/_processors/sql/snowflake.py | Extended SnowflakeConfig for password and private key authentication; added validation, key loading, and connect args logic. |
| airbyte/shared/sql_processor.py | Added get_sql_alchemy_connect_args to SqlConfig and integrated into engine creation. |
| airbyte/caches/snowflake.py | Expanded documentation with usage examples for all authentication methods; no logic changes. |
| tests/unit_tests/test_snowflake_config.py | Added comprehensive unit tests for SnowflakeConfig covering all authentication methods and validation logic. |
| pyproject.toml | Added cryptography dependency for private key handling. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant SnowflakeConfig
    participant FileSystem
    participant SnowflakeConnector

    User->>SnowflakeConfig: Initialize with auth parameters
    SnowflakeConfig->>SnowflakeConfig: _validate_authentication_config()
    alt Using password
        SnowflakeConfig->>SnowflakeConnector: Connect with password
    else Using private_key
        SnowflakeConfig->>SnowflakeConfig: _get_private_key_content()
        alt private_key_path
            SnowflakeConfig->>FileSystem: Read private key file
        else private_key string
            SnowflakeConfig->>SnowflakeConfig: Use private_key string
        end
        SnowflakeConfig->>SnowflakeConfig: _get_private_key_bytes()
        SnowflakeConfig->>SnowflakeConnector: Connect with private_key_bytes and optional passphrase
    end
    SnowflakeConnector-->>User: Connection established

Suggested reviewers

Would you like to see additional tests for edge cases around malformed private key files, or is the current coverage sufficient, wdyt?


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e09fd1 and fa517be.

📒 Files selected for processing (1)
  • airbyte/_processors/sql/snowflake.py (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
airbyte/_processors/sql/snowflake.py (3)
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#415
File: examples/run_perf_test_reads.py:117-127
Timestamp: 2024-10-09T19:21:45.994Z
Learning: In `examples/run_perf_test_reads.py`, the code for setting up Snowflake configuration in `get_cache` and `get_destination` cannot be refactored into a shared helper function because there are differences between them.
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#411
File: airbyte/cli.py:26-26
Timestamp: 2024-10-11T22:05:15.550Z
Learning: In the PyAirbyte project, when reviewing Python code and encountering a TODO comment without an issue link, I should post a friendly reminder to resolve it before merging, instead of suggesting to add an issue link.
🧬 Code Graph Analysis (1)
airbyte/_processors/sql/snowflake.py (2)
airbyte/secrets/base.py (1)
  • SecretString (34-139)
airbyte/shared/sql_processor.py (2)
  • get_sql_alchemy_connect_args (131-133)
  • get_vendor_client (147-156)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Pytest (No Creds)
  • GitHub Check: Pytest (Fast)
🔇 Additional comments (8)
airbyte/_processors/sql/snowflake.py (8)

7-7: LGTM on the new imports!

The added imports are appropriate for the private key handling functionality - Path for file operations and cryptography modules for secure key processing.

Also applies to: 12-13


41-46: Well-designed authentication field structure!

Good use of SecretString for sensitive data and making password optional to support multiple authentication methods. The field types are appropriate for their purposes.


53-83: Excellent validation logic!

The authentication validation is comprehensive and covers all the necessary scenarios:

  • Ensures exactly one primary authentication method
  • Prevents invalid combinations (like passphrase with password auth)
  • Provides clear error messages for each failure case

The logic looks solid, wdyt?
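
The rules listed above can be sketched in a few lines. This is a minimal illustration, not the PR's code: `AuthSettings` and `validate_auth` are hypothetical names, and only the four field names come from the PR description.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class AuthSettings:
    """Hypothetical stand-in for the auth-related fields on SnowflakeConfig."""

    password: str | None = None
    private_key: str | None = None
    private_key_path: str | None = None
    private_key_passphrase: str | None = None


def validate_auth(settings: AuthSettings) -> str:
    """Return the name of the single auth method in use, or raise ValueError."""
    # Collect the primary auth fields that were actually set.
    provided = [
        name
        for name in ("password", "private_key", "private_key_path")
        if getattr(settings, name)
    ]
    if not provided:
        raise ValueError("Provide one of: password, private_key, private_key_path.")
    if len(provided) > 1:
        raise ValueError(f"Provide only one authentication method, got: {provided}")
    # A passphrase only makes sense alongside a private key.
    if settings.private_key_passphrase and provided == ["password"]:
        raise ValueError("private_key_passphrase requires a private key.")
    return provided[0]
```

Calling `validate_auth(AuthSettings(password="pw", private_key="..."))` raises `ValueError`, since two primary methods are set.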


84-91: Clean private key content handling!

The method correctly handles both private key sources (string vs file) with appropriate encoding and error handling. The logic is straightforward and robust.


92-110: Solid cryptographic implementation!

The private key processing follows best practices:

  • Proper PEM loading with optional passphrase support
  • Conversion to DER format with PKCS8 (standard for Snowflake)
  • Secure handling using the cryptography library

This looks well-implemented, wdyt?
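
For readers unfamiliar with the conversion described above, here is a minimal sketch using the cryptography package. The function name `pem_to_der_pkcs8` is hypothetical and the PR's actual method may differ in details; the `private_bytes` arguments match those shown in the review diff for this PR.

```python
from __future__ import annotations

from cryptography.hazmat.primitives import serialization


def pem_to_der_pkcs8(pem_bytes: bytes, passphrase: str | None = None) -> bytes:
    """Load a PEM private key and re-serialize it as unencrypted DER/PKCS8."""
    private_key = serialization.load_pem_private_key(
        pem_bytes,
        # The passphrase must be bytes; None means the PEM is not encrypted.
        password=passphrase.encode("utf-8") if passphrase else None,
    )
    return private_key.private_bytes(
        encoding=serialization.Encoding.DER,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption(),
    )
```

A wrong or missing passphrase for an encrypted PEM makes `load_pem_private_key` raise an exception rather than return a key, which is the behavior the wrong-passphrase tests in this PR would exercise.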


111-117: Correct SQLAlchemy integration!

The method properly handles connection arguments for SQLAlchemy, only including the private key when using key-based authentication. The conditional logic ensures clean separation between auth methods.


134-149: Good URL construction with validation!

The method correctly validates authentication first and conditionally includes the password only when present. This properly supports both password and key-based authentication methods.
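
The idea of including the password only when present can be illustrated with a stdlib-only stand-in. This is not the PR's implementation: `build_snowflake_url` is a hypothetical helper, and the `snowflake://user:password@account/database/schema` layout is assumed here for illustration.

```python
from __future__ import annotations

from urllib.parse import quote, urlencode


def build_snowflake_url(
    account: str,
    username: str,
    database: str,
    schema: str,
    password: str | None = None,
    **query: str,
) -> str:
    """Build a connection URL, adding the password segment only if one is set."""
    credentials = quote(username, safe="")
    if password:
        # Percent-encode so characters like "@" or ":" cannot break the URL.
        credentials += ":" + quote(password, safe="")
    url = f"snowflake://{credentials}@{account}/{database}/{schema}"
    if query:
        url += "?" + urlencode(query)
    return url
```

With key-pair authentication the URL carries no secret at all; the key material travels separately through the connect arguments.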


151-176: Comprehensive vendor client configuration!

The method correctly handles all three authentication scenarios (password, private key file, private key string) with appropriate connection parameters for each. The conditional logic properly sets up the Snowflake connection based on the authentication method chosen.


@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🧹 Nitpick comments (3)
airbyte/_processors/sql/snowflake.py (2)

84-92: Simplify by removing unnecessary elif.

The method logic is correct, but the code can be cleaner by removing the elif after return.

 def _get_private_key_content(self) -> bytes:
     """Get the private key content from either private_key or private_key_path."""
     if self.private_key:
         return str(self.private_key).encode("utf-8")
-    elif self.private_key_path:
+    if self.private_key_path:
         return Path(self.private_key_path).read_bytes()
-    else:
-        raise ValueError("No private key provided")
+    raise ValueError("No private key provided")

93-113: Remove unnecessary assignment before return.

The private key handling is secure and correct, but we can simplify the code slightly.

-    private_key_bytes = private_key.private_bytes(
+    return private_key.private_bytes(
         encoding=serialization.Encoding.DER,
         format=serialization.PrivateFormat.PKCS8,
         encryption_algorithm=serialization.NoEncryption(),
     )
-
-    return private_key_bytes
tests/unit_tests/test_snowflake_config.py (1)

1-510: Excellent test coverage! Just needs formatting.

The test suite provides comprehensive coverage of all authentication methods, validation logic, and error scenarios. Great job on testing edge cases like wrong passphrases and multiple auth methods!

However, the pipeline indicates formatting issues that need to be fixed. Could you run ruff format on this file to resolve the formatting issues?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 68078b6 and 4deb246.

📒 Files selected for processing (4)
  • airbyte/_processors/sql/snowflake.py (3 hunks)
  • airbyte/caches/snowflake.py (2 hunks)
  • airbyte/shared/sql_processor.py (3 hunks)
  • tests/unit_tests/test_snowflake_config.py (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
airbyte/caches/snowflake.py (1)
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#415
File: examples/run_perf_test_reads.py:117-127
Timestamp: 2024-10-09T19:21:45.994Z
Learning: In `examples/run_perf_test_reads.py`, the code for setting up Snowflake configuration in `get_cache` and `get_destination` cannot be refactored into a shared helper function because there are differences between them.
tests/unit_tests/test_snowflake_config.py (1)
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#415
File: examples/run_perf_test_reads.py:117-127
Timestamp: 2024-10-09T19:21:45.994Z
Learning: In `examples/run_perf_test_reads.py`, the code for setting up Snowflake configuration in `get_cache` and `get_destination` cannot be refactored into a shared helper function because there are differences between them.
airbyte/_processors/sql/snowflake.py (1)
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#415
File: examples/run_perf_test_reads.py:117-127
Timestamp: 2024-10-09T19:21:45.994Z
Learning: In `examples/run_perf_test_reads.py`, the code for setting up Snowflake configuration in `get_cache` and `get_destination` cannot be refactored into a shared helper function because there are differences between them.
🧬 Code Graph Analysis (2)
airbyte/shared/sql_processor.py (1)
airbyte/_processors/sql/snowflake.py (1)
  • get_sql_alchemy_connect_args (115-119)
airbyte/_processors/sql/snowflake.py (2)
airbyte/secrets/base.py (1)
  • SecretString (34-139)
airbyte/shared/sql_processor.py (2)
  • get_sql_alchemy_connect_args (131-133)
  • get_vendor_client (147-156)
🪛 Flake8 (7.2.0)
tests/unit_tests/test_snowflake_config.py

[error] 466-466: expected 2 blank lines, found 1

(E302)

airbyte/_processors/sql/snowflake.py

[error] 84-84: too many blank lines (2)

(E303)

🪛 GitHub Actions: Run Linters
tests/unit_tests/test_snowflake_config.py

[error] 1-1: Ruff formatting check failed. File would be reformatted. Run 'ruff format' to fix code style issues.

airbyte/_processors/sql/snowflake.py

[error] 66-66: Ruff E501: Line too long (118 > 100).


[error] 84-84: Ruff E303: Too many blank lines (2). Remove extraneous blank line(s).


[error] 88-88: Ruff RET505: Unnecessary elif after return statement. Remove unnecessary elif.


[error] 112-112: Ruff RET504: Unnecessary assignment to private_key_bytes before return statement. Remove unnecessary assignment.

🪛 Ruff (0.11.9)
airbyte/_processors/sql/snowflake.py

66-66: Line too long (118 > 100)

(E501)


84-84: Too many blank lines (2)

Remove extraneous blank line(s)

(E303)


88-88: Unnecessary elif after return statement

Remove unnecessary elif

(RET505)


112-112: Unnecessary assignment to private_key_bytes before return statement

Remove unnecessary assignment

(RET504)

🪛 Pylint (3.3.7)
airbyte/_processors/sql/snowflake.py

[refactor] 86-91: Unnecessary "elif" after "return", remove the leading "el" from "elif"

(R1705)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (No Creds)
🔇 Additional comments (6)
airbyte/shared/sql_processor.py (2)

131-134: LGTM! Clean extension point for connection arguments.

The base implementation returning an empty dict is appropriate and provides a nice interface for subclasses to override when needed.


144-144: Correct integration of connection arguments.

The connect_args are properly passed to SQLAlchemy's create_engine, enabling database-specific authentication parameters.
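
The extension point described in these two comments can be sketched without a database. Only the method name `get_sql_alchemy_connect_args` comes from the PR; the classes and `engine_kwargs` helper below are hypothetical, and the returned dict is what would be handed to SQLAlchemy's `create_engine(url, connect_args=...)`.

```python
from __future__ import annotations

from typing import Any


class BaseSqlConfig:
    """Hypothetical stand-in for SqlConfig."""

    def get_sql_alchemy_url(self) -> str:
        raise NotImplementedError

    def get_sql_alchemy_connect_args(self) -> dict[str, Any]:
        # Default: no driver-specific arguments.
        return {}

    def engine_kwargs(self) -> dict[str, Any]:
        # In real code these would be passed on to sqlalchemy.create_engine.
        return {
            "url": self.get_sql_alchemy_url(),
            "connect_args": self.get_sql_alchemy_connect_args(),
        }


class KeyPairConfig(BaseSqlConfig):
    """Hypothetical subclass that supplies key bytes only when it has them."""

    def __init__(self, private_key_der: bytes | None = None) -> None:
        self.private_key_der = private_key_der

    def get_sql_alchemy_url(self) -> str:
        return "snowflake://alice@acct/DB/PUBLIC"

    def get_sql_alchemy_connect_args(self) -> dict[str, Any]:
        if self.private_key_der is None:
            return {}
        return {"private_key": self.private_key_der}
```

Keeping the default as an empty dict means existing password-based configs need no changes when the hook is introduced.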

airbyte/caches/snowflake.py (1)

6-57: Excellent documentation examples!

The examples clearly demonstrate all three authentication methods and follow security best practices by using ab.get_secret() for sensitive values. The inline comments for optional parameters are helpful.

airbyte/_processors/sql/snowflake.py (3)

114-120: Well-implemented connection arguments override.

The method correctly provides private key bytes only when key-based authentication is used, maintaining compatibility with password authentication.


137-152: Correct URL construction for different authentication methods.

The method properly validates the configuration and constructs the URL appropriately for both password and key-based authentication.


154-177: Comprehensive vendor client configuration for all auth methods.

The implementation correctly handles all three authentication scenarios and properly sets the JWT authenticator for key-based authentication. Nice work on the dynamic parameter building!
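
A rough sketch of the dynamic parameter building mentioned here, under stated assumptions: `build_connector_params` is a hypothetical helper, and the keyword names (`authenticator`, `private_key`, `private_key_file`, `private_key_file_pwd`) are assumed to match what the Snowflake Python connector accepts, so check them against the connector version in use.

```python
from __future__ import annotations

from typing import Any


def build_connector_params(
    account: str,
    user: str,
    *,
    password: str | None = None,
    private_key_der: bytes | None = None,
    private_key_path: str | None = None,
    private_key_passphrase: str | None = None,
) -> dict[str, Any]:
    """Build keyword arguments for a connector call, one branch per auth method."""
    params: dict[str, Any] = {"account": account, "user": user}
    if password:
        params["password"] = password
    elif private_key_path:
        # Key file on disk: let the connector read (and decrypt) it.
        params["authenticator"] = "SNOWFLAKE_JWT"
        params["private_key_file"] = private_key_path
        if private_key_passphrase:
            params["private_key_file_pwd"] = private_key_passphrase
    elif private_key_der:
        # Key already loaded and converted to DER bytes by the caller.
        params["authenticator"] = "SNOWFLAKE_JWT"
        params["private_key"] = private_key_der
    else:
        raise ValueError("No authentication method provided.")
    return params
```

Building the dict incrementally keeps unused keys out of the connector call entirely, rather than passing them as `None`.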

Comment thread airbyte/_processors/sql/snowflake.py

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4deb246 and 2e09fd1.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • airbyte/_processors/sql/snowflake.py (3 hunks)
  • airbyte/caches/snowflake.py (1 hunks)
  • pyproject.toml (1 hunks)
  • tests/unit_tests/test_snowflake_config.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (2)
  • airbyte/caches/snowflake.py
  • tests/unit_tests/test_snowflake_config.py
🧰 Additional context used
🧠 Learnings (1)
airbyte/_processors/sql/snowflake.py (1)
Learnt from: aaronsteers
PR: airbytehq/PyAirbyte#415
File: examples/run_perf_test_reads.py:117-127
Timestamp: 2024-10-09T19:21:45.994Z
Learning: In `examples/run_perf_test_reads.py`, the code for setting up Snowflake configuration in `get_cache` and `get_destination` cannot be refactored into a shared helper function because there are differences between them.
🧬 Code Graph Analysis (1)
airbyte/_processors/sql/snowflake.py (2)
airbyte/secrets/base.py (1)
  • SecretString (34-139)
airbyte/shared/sql_processor.py (2)
  • get_sql_alchemy_connect_args (131-133)
  • get_vendor_client (147-156)
🔇 Additional comments (6)
airbyte/_processors/sql/snowflake.py (6)

7-7: LGTM on the new imports!

The new imports for Path, cryptography modules, and typing updates are appropriate for the key-pair authentication functionality.

Also applies to: 9-9, 12-13


41-45: Well-structured authentication fields!

The new optional fields provide comprehensive support for different key-pair authentication methods while maintaining security with SecretString for sensitive data.


53-82: Excellent validation logic!

The authentication validation is comprehensive and handles all edge cases properly. The error messages are clear and the logic prevents conflicting authentication configurations.


84-109: Solid private key handling implementation!

The methods properly handle both direct private key strings and file paths, with correct cryptography library usage for PEM loading and DER conversion. The separation of concerns is well done.


111-116: Clean implementation of connect args!

The method correctly provides private key bytes only when using key-pair authentication, following SQLAlchemy patterns.


136-149: Good URL method updates!

The method correctly validates authentication configuration and conditionally includes password only when using password authentication.

Comment thread airbyte/_processors/sql/snowflake.py Outdated
Comment thread airbyte/_processors/sql/snowflake.py Outdated
@aaronsteers

Aaron ("AJ") Steers (aaronsteers) commented Jun 27, 2025

Member

/test-pr

PR test job started... Check job output.

✅ Tests passed.

Member

Yves Alioune Amoussou (@Alioune05) - Looks great and all tests are passing. Thanks very much for contributing!

Comment thread airbyte/shared/sql_processor.py
Comment thread airbyte/shared/sql_processor.py
Comment thread airbyte/caches/snowflake.py
Comment thread airbyte/_processors/sql/snowflake.py
@aaronsteers Aaron ("AJ") Steers (aaronsteers) merged commit 6794987 into airbytehq:main Jun 27, 2025
19 checks passed
@coderabbitai coderabbitai Bot mentioned this pull request Jul 30, 2025
2 participants