Fix: Add path sanitization for api_environment to prevent path traversal#63708
Fix: Add path sanitization for api_environment to prevent path traversal#63708K1nakoo wants to merge 4 commits into
Conversation
…t to prevent path traversalnvironment name Validate the API environment name to allow only alphanumeric characters, dashes, and underscores.
|
@K1nakoo This PR has a few issues that need to be addressed before it can be reviewed — please see our Pull Request quality criteria. Issues found:
What to do next:
There is no rush — take your time and work at your own pace. We appreciate your contribution and are happy to wait for updates. If you have questions, feel free to ask on the Airflow Slack. Note: This comment was drafted by an AI-assisted triage tool and may contain mistakes. Once you have addressed the points above, an Apache Airflow maintainer — a real person — will take the next look at your PR. We use this two-stage triage process so that our maintainers' limited time is spent where it matters most: the conversation with you. |
SameerMesiah97
left a comment
There was a problem hiding this comment.
Not a bad idea but it is too strict, which risks breaking existing deployments. Also, there are no tests at all. At minimum, I think you should cover both valid and invalid inputs.
| self.api_token = api_token | ||
| self.api_environment = os.getenv("AIRFLOW_CLI_ENVIRONMENT") or api_environment | ||
| raw_env = os.getenv("AIRFLOW_CLI_ENVIRONMENT") or api_environment | ||
| if not re.match(r'^[a-zA-Z0-9_.-]+$', raw_env): |
There was a problem hiding this comment.
This is too strict. What about existing valid api_environment values like 'team:prod', or 'prod v2'?Previously this accepted any string, but now it is limited to , so these would fail. Since this comes from CLI/env, this could break real usages.
Seems like it would be better to loosen restriction to only block / and ..
There was a problem hiding this comment.
Could you please move your comment to the original PR? #63691
|
#63691 duplicate |
What does this PR do?
This PR introduces a strict regex validation for the
api_environmentvariable (populated via CLI arguments orAIRFLOW_CLI_ENVIRONMENT) within theCredentialsclass ofairflowctl.Why is this needed?
Currently, the environment name is directly passed to
os.path.join(default_config_dir, f"{self.api_environment}.json")without any sanitization.While
airflowctlis a client tool, it is frequently executed in automated CI/CD pipelines where the environment variable might be populated dynamically (e.g., from a Git branch name or GitHub Actions runner). If an untrusted input containing directory traversal sequences (like../../../tmp/evil) is passed, it could unintentionally write.jsonfiles outside the target configuration directory, leading to potential CI pipeline configuration overrides.This patch enforces a Defense-in-Depth approach, ensuring that only valid, safe alphanumeric names (including dashes, periods, and underscores) are processed, completely mitigating the risk of path traversal.
Testing done
production,dev.env-1) work as expected.../evil) correctly raises aValueErrorand halts execution before any file system operations occur.Was generative AI tooling used to co-author this PR?
Generated-by: Gemini following the guidelines