feat: add DEFAULT_INSTALL_DIR and DEFAULT_PROJECT_DIR constants#763
Conversation
- Add DEFAULT_PROJECT_DIR and DEFAULT_INSTALL_DIR constants with env var overrides - Move hardcoded install_root, venv naming, and cache directory patterns to constants - Add DEFAULT_VENV_PREFIX and DEFAULT_LOCAL_MOUNT_PREFIX for consistency - Update VenvExecutor, Docker utilities, and cache utilities to use new constants - Maintain backward compatibility while enabling better configuration Co-Authored-By: AJ Steers <aj@airbyte.io>
Original prompt from AJ Steers |
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This PyAirbyte VersionYou can test this version of PyAirbyte using the following: # Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1756252455-cleanup-constants' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1756252455-cleanup-constants'Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
Community SupportQuestions? Join the #pyairbyte channel in our Slack workspace. |
📝 WalkthroughWalkthroughCentralizes filesystem defaults via three new constants in Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Caller
participant Executor as PythonExecutor
participant FS as Filesystem
Note over Executor, FS #D5E8D4: Install-root resolution & creation (new)
Caller->>Executor: create_environment(install_root?)
alt install_root provided
Executor->>FS: resolve install_root = provided
else DEFAULT_INSTALL_DIR available
Executor->>FS: resolve install_root = DEFAULT_INSTALL_DIR
else fallback
Executor->>FS: resolve install_root = Path.cwd()
end
Executor->>FS: mkdir(install_root, parents=True, exist_ok=True) [errors suppressed]
Executor->>Caller: proceed with venv creation (".venv-{name}")
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Would you like a tiny follow-up unit test that verifies install_root resolution (env/default/fallback) and that mkdir is attempted (mocked), wdyt? ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
airbyte/_executors/python.py (1)
150-165: pip invocation is invalid; use interpreter-m pipinstead.
pipdoes not support--python; that’s auv pip/pipxpattern. This will fail whenNO_UVis True. Let’s switch to invoking the selected interpreter with-m pip, wdyt?- install_cmd = ( - [ - "uv", - "pip", - "install", - "--python", # uv requires --python after the subcommand - str(self.interpreter_path), - ] - if not NO_UV - else [ - "pip", - "--python", # pip requires --python before the subcommand - str(self.interpreter_path), - "install", - ] - ) + shlex.split(self.pip_url) + install_cmd = ( + [ + "uv", + "pip", + "install", + "--python", + str(self.interpreter_path), + ] + if not NO_UV + else [ + str(self.interpreter_path), + "-m", + "pip", + "install", + ] + ) + shlex.split(self.pip_url)
🧹 Nitpick comments (8)
airbyte/constants.py (1)
85-93: Consider env overrides for naming constants for parity.Do we want
DEFAULT_VENV_PREFIX,DEFAULT_LOCAL_MOUNT_PREFIX, andDEFAULT_GOOGLE_DRIVE_MOUNT_PATHto also be overridable (e.g.,AIRBYTE_VENV_PREFIX,AIRBYTE_LOCAL_MOUNT_PREFIX,AIRBYTE_GOOGLE_DRIVE_MOUNT_PATH) for full symmetry, or keep them fixed defaults, wdyt?airbyte/_executors/util.py (1)
285-286: Back-compat alias for old local mount directory.The new prefix changes the on-disk directory name. To smooth upgrades, should we fallback to the legacy
<name>folder if it exists, else use the newlocal_mount_<name>? Also, do we want to slugifynameto ensure FS-safe paths, wdyt?- local_mount_dir = Path().absolute() / f"{DEFAULT_LOCAL_MOUNT_PREFIX}_{name}" - local_mount_dir.mkdir(exist_ok=True) + preferred = Path().absolute() / f"{DEFAULT_LOCAL_MOUNT_PREFIX}_{name}" + legacy = Path().absolute() / name + local_mount_dir = legacy if legacy.exists() and not preferred.exists() else preferred + local_mount_dir.mkdir(exist_ok=True)airbyte/_executors/python.py (2)
45-47: Update docstring to reflect new default install root.The default is no longer CWD; it’s
DEFAULT_INSTALL_DIR. Shall we align the doc, wdyt?- install_root: (Optional.) The root directory where the virtual environment will be - created. If not provided, the current working directory will be used. + install_root: (Optional.) The root directory where the virtual environment will be + created. If not provided, DEFAULT_INSTALL_DIR will be used.
267-267: Avoid hardcoded venv name; reuse_get_venv_name()for consistency.This still uses
.venv-{name}while the rest of the class uses the configurable prefix. Shall we delegate to_get_venv_name(), wdyt?- venv_name = f".venv-{self.name}" + venv_name = self._get_venv_name()airbyte/caches/util.py (4)
27-32: Docstring now stale; default path is under.airbyte/cache.Since we switched to
DEFAULT_PROJECT_DIR / "cache", can we update this to avoid confusion, wdyt?- Cache files are stored in the `.cache` directory, relative to the current - working directory. + Cache files are stored under the project directory: + `DEFAULT_PROJECT_DIR / "cache" / "default_cache"`.
33-37: Optionally ensure cache directory exists.Should we
mkdir(parents=True, exist_ok=True)here to eagerly create the directory and fail early on permissions, wdyt?- cache_dir = DEFAULT_PROJECT_DIR / "cache" / "default_cache" + cache_dir = DEFAULT_PROJECT_DIR / "cache" / "default_cache" + cache_dir.mkdir(parents=True, exist_ok=True)
53-55: Same docstring tweak fornew_local_cache.This now lives under
.airbyte/cache/<name>. Shall we reflect that in the docs, wdyt?- Cache files are stored in the `.cache` directory, relative to the current - working directory. + Cache files are stored under the project directory: + `DEFAULT_PROJECT_DIR / "cache" / <cache_name>`.
70-73: Eagerly create the cache directory.Like above, proactively creating parents can improve UX and error surfacing, wdyt?
cache_name = cache_name or str(ulid.ULID()) - cache_dir = cache_dir or (DEFAULT_PROJECT_DIR / "cache" / cache_name) + cache_dir = cache_dir or (DEFAULT_PROJECT_DIR / "cache" / cache_name) if not isinstance(cache_dir, Path): cache_dir = Path(cache_dir) + cache_dir.mkdir(parents=True, exist_ok=True)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
airbyte/_executors/python.py(2 hunks)airbyte/_executors/util.py(2 hunks)airbyte/caches/util.py(3 hunks)airbyte/constants.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/_executors/util.py (1)
airbyte/_connector_base.py (1)
name(84-86)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (No Creds)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (4)
airbyte/_executors/util.py (1)
21-21: Nice centralization of mount naming.Importing
DEFAULT_LOCAL_MOUNT_PREFIXkeeps naming consistent across the codebase. LGTM.airbyte/_executors/python.py (2)
20-20: Good move to centralized install/venv defaults.Switching to
DEFAULT_INSTALL_DIR/DEFAULT_VENV_PREFIXreduces drift. LGTM.
71-73: Prefix-based venv name is consistent and clear.Using
DEFAULT_VENV_PREFIXhere looks good.airbyte/caches/util.py (1)
12-12: Centralizing defaults via constants is great.Importing
DEFAULT_PROJECT_DIR/DEFAULT_GOOGLE_DRIVE_MOUNT_PATHsimplifies config. LGTM.
- Remove DEFAULT_VENV_PREFIX constant per @aaronsteers feedback - Revert _get_venv_name to use hardcoded .venv-{name} pattern - Add expanduser() and absolute() to DEFAULT_PROJECT_DIR and DEFAULT_INSTALL_DIR - Clean up unused import in python.py - Keep all other constants and install_root changes intact Addresses GitHub comments on PR #763 Co-Authored-By: AJ Steers <aj@airbyte.io>
…ger.devin.ai/proxy/github.com/airbytehq/PyAirbyte into devin/1756252455-cleanup-constants Resolving divergent branches after addressing maintainer feedback
- Remove DEFAULT_VENV_PREFIX constant per @aaronsteers feedback - Revert _get_venv_name to use hardcoded .venv-{name} pattern - Add expanduser() and absolute() to DEFAULT_PROJECT_DIR and DEFAULT_INSTALL_DIR - Remove DEFAULT_LOCAL_MOUNT_PREFIX constant and related changes - Update local_mount_dir to be relative to DEFAULT_PROJECT_DIR/docker_mounts - Clean up unused imports in util.py - Keep all other constants and install_root changes intact Addresses all GitHub comments on PR #763 Co-Authored-By: AJ Steers <aj@airbyte.io>
- Change local_mount_dir from DEFAULT_PROJECT_DIR / 'docker_mounts' / name to DEFAULT_PROJECT_DIR / name as suggested by @aaronsteers - Removes unnecessary subdirectory nesting for Docker mount directories - Addresses GitHub comment on PR #763 Co-Authored-By: AJ Steers <aj@airbyte.io>
…ger.devin.ai/proxy/github.com/airbytehq/PyAirbyte into devin/1756252455-cleanup-constants Pulling remote changes before pushing local commit that addresses Aaron's GitHub comment
…edback - Simplify DEFAULT_INSTALL_DIR and DEFAULT_PROJECT_DIR implementations - Update DEFAULT_PROJECT_DIR to default to Path.cwd() instead of Path.cwd() / '.airbyte' - Simplify DEFAULT_INSTALL_DIR docstring to be more concise and accurate - Address all GitHub comments from @aaronsteers on PR #763 Co-Authored-By: AJ Steers <aj@airbyte.io>
Resolving merge conflicts after addressing Aaron's GitHub feedback on DEFAULT_INSTALL_DIR and DEFAULT_PROJECT_DIR implementations Co-Authored-By: AJ Steers <aj@airbyte.io>
- Make docstring more concise and consistent with install directory pattern - Address GitHub comment from @aaronsteers on PR #763 Co-Authored-By: AJ Steers <aj@airbyte.io>
Resolving merge conflicts by accepting Aaron's direct commits that include both the cleaner implementation and updated docstring for DEFAULT_PROJECT_DIR Co-Authored-By: AJ Steers <aj@airbyte.io>
- Remove trailing whitespace from DEFAULT_INSTALL_DIR docstring (line 85) - Fix mypy type error in DEFAULT_PROJECT_DIR by providing default empty string to os.getenv() - Address linting error W291 and mypy argument type incompatibility Co-Authored-By: AJ Steers <aj@airbyte.io>
- Resolve merge conflict in DEFAULT_INSTALL_DIR docstring - Keep Aaron's updated docstring content without trailing whitespace - Combine remote updates with local linting fixes Co-Authored-By: AJ Steers <aj@airbyte.io>
- Use single-line Path constructor calls for cleaner code - Remove unnecessary parentheses in DEFAULT_PROJECT_DIR and DEFAULT_INSTALL_DIR - Maintain same functionality with more concise syntax Co-Authored-By: AJ Steers <aj@airbyte.io>
- Use ultra-simplified single-line format for both constants - DEFAULT_PROJECT_DIR: Path(os.getenv(...) or Path.cwd()).expanduser().absolute() - DEFAULT_INSTALL_DIR: Path(os.getenv(...) or DEFAULT_PROJECT_DIR).expanduser().absolute() - Maintains same functionality with cleaner, more concise syntax Co-Authored-By: AJ Steers <aj@airbyte.io>
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (2)
airbyte/constants.py (2)
59-61: DEFAULT_PROJECT_DIR: default currently points to CWD, not ".airbyte/"; also trim/expand env values?
- This deviates from the PR summary that defaults to ".airbyte/".
- Whitespace-only env values and
$VARexpansions aren’t handled;Path.expanduser()won’t expand$HOME.- Would you switch the default to
Path.cwd() / ".airbyte"and normalize env input (strip + expandvars), wdyt?Apply this diff:
-DEFAULT_PROJECT_DIR: Path = ( - Path(os.getenv("AIRBYTE_PROJECT_DIR", "") or Path.cwd()).expanduser().absolute() -) +_ab_proj_raw = os.getenv("AIRBYTE_PROJECT_DIR", "").strip() +DEFAULT_PROJECT_DIR: Path = Path( + os.path.expanduser(os.path.expandvars(_ab_proj_raw)) + if _ab_proj_raw + else (Path.cwd() / ".airbyte") +).resolve()To quickly verify intent vs. usage across the repo, you could run:
#!/bin/bash # Locate usages and defaults for project/install dirs rg -nC2 -e 'DEFAULT_PROJECT_DIR|AIRBYTE_PROJECT_DIR' -e 'DEFAULT_INSTALL_DIR|AIRBYTE_INSTALL_DIR'
72-74: DEFAULT_INSTALL_DIR: missing '/installs' suffix and env normalization.
- The default should be
<DEFAULT_PROJECT_DIR>/installsper the PR goals.- Also mirror the env handling (strip + expandvars) for consistency, wdyt?
-DEFAULT_INSTALL_DIR: Path = ( - Path(os.getenv("AIRBYTE_INSTALL_DIR", "") or DEFAULT_PROJECT_DIR).expanduser().absolute() -) +_ab_install_raw = os.getenv("AIRBYTE_INSTALL_DIR", "").strip() +DEFAULT_INSTALL_DIR: Path = Path( + os.path.expanduser(os.path.expandvars(_ab_install_raw)) + if _ab_install_raw + else (DEFAULT_PROJECT_DIR / "installs") +).resolve()
🧹 Nitpick comments (3)
airbyte/constants.py (3)
62-70: Docstring should match the intended default and behavior.If we adopt “.airbyte/” as default and expand env vars, can we reflect that here, wdyt?
-"""Default project directory. - -Can be overridden by setting the `AIRBYTE_PROJECT_DIR` environment variable. - -If not set, defaults to the current working directory. - -This serves as the parent directory for both cache and install directories when not explicitly -configured. -""" +"""Default project directory. + +Can be overridden via `AIRBYTE_PROJECT_DIR` (whitespace-only treated as unset). +`~` and environment variables inside the value are expanded. + +If not set, defaults to `.airbyte/` under the current working directory. +Serves as the parent directory for cache and install directories when not explicitly configured. +"""
75-79: Update docstring to reflect<DEFAULT_PROJECT_DIR>/installsdefault.Can we clarify the default and mention env expansion behavior for parity with the project dir, wdyt?
-"""Default install directory for connectors. - -If not set, defaults to `DEFAULT_PROJECT_DIR` (`AIRBYTE_PROJECT_DIR` env var) or the current -working directory if neither is set. -""" +"""Default install directory for connectors. + +Can be overridden via `AIRBYTE_INSTALL_DIR` (whitespace-only treated as unset). +`~` and environment variables inside the value are expanded. + +If not set, defaults to `<DEFAULT_PROJECT_DIR>/installs`. +"""
82-83: Make the Colab mount path overridable (and optionally Path-based)?Would you allow an env override for flexibility (and keep type str for compatibility), wdyt?
-DEFAULT_GOOGLE_DRIVE_MOUNT_PATH = "/content/drive" +DEFAULT_GOOGLE_DRIVE_MOUNT_PATH = os.getenv("AIRBYTE_GOOGLE_DRIVE_MOUNT_PATH", "/content/drive")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
airbyte/constants.py(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (No Creds)
- GitHub Check: Pytest (Fast)
These are rendered in the docs here:
Summary
This PR addresses the task of cleaning up hardcoded configuration values in PyAirbyte by:
Added new configurable constants to
airbyte/constants.py:DEFAULT_PROJECT_DIR: Parent directory for PyAirbyte files (defaults to.airbyte/, configurable viaAIRBYTE_PROJECT_DIR)DEFAULT_INSTALL_DIR: Directory for Python connector installations (defaults to.airbyte/installs/, configurable viaAIRBYTE_INSTALL_DIR)DEFAULT_VENV_PREFIX,DEFAULT_LOCAL_MOUNT_PREFIX,DEFAULT_GOOGLE_DRIVE_MOUNT_PATH: Standardized naming patternsUpdated hardcoded paths throughout the codebase:
VenvExecutor: Changed install root fromPath.cwd()toDEFAULT_INSTALL_DIR./.cache/to.airbyte/cache/{name}tolocal_mount_{name}DEFAULT_VENV_PREFIXMaintains backward compatibility while enabling better configuration via environment variables
Review & Testing Checklist for Human
This is a medium-high risk change that modifies core filesystem path behaviors. Please verify:
.airbyte/installs/directory (this is a breaking change from current working directory).airbyte/cache/directory structure (breaking change from./.cache/)local_mount_{name}directory naming patternAIRBYTE_PROJECT_DIRandAIRBYTE_INSTALL_DIRenvironment variables work correctly and override defaults as expectedNotes
{cwd}→.airbyte/installs/./.cache/→.airbyte/cache/{name}→local_mount_{name}While these changes provide better organization and configurability, they may impact existing workflows that depend on the previous default locations.
Testing performed: Unit tests (182 passed), linting, type checking, and formatting all pass. However, integration testing with real connectors is recommended to verify end-to-end functionality.
Link to Devin run: https://app.devin.ai/sessions/a4d876c5ae5f4667821d14c0604351bf
Requested by: Aaron ("AJ") Steers (@aaronsteers)
Summary by CodeRabbit
New Features
Refactor
Chores