A Terraform project for deploying Databricks workspaces on AWS, Azure, and GCP in a simplified way.
This project is not intended to implement best security practices, but rather to deploy Databricks in a simplified way.
For best practice implementations, see:
- Databricks Security Best Practices
- Security Best Practices for Databricks Data Intelligence Platform
- Databricks Security Reference Architecture
- Data Exfiltration Protection with Databricks on AWS
- Data Exfiltration Protection with Databricks on Azure
- Data Exfiltration Protection with Databricks on GCP
Important Notice: This project is provided as-is for educational and reference purposes. While these Terraform modules follow industry best practices, they are intended as starting points for your own infrastructure deployments.
Before using in production:
- Thoroughly review all configurations and adapt them to your specific requirements
- Test extensively in non-production environments
- Ensure compliance with your organization's security policies and standards
- Validate that the configurations meet your specific networking and security requirements
- Consider engaging with Databricks and your cloud provider's professional services for production deployments
Liability: The authors and contributors of this project are not responsible for any issues, costs, or damages that may arise from the use of these templates. Use at your own risk and discretion.
Support: This is a community-driven project. While we strive to maintain and improve these modules, there is no guarantee of support or maintenance. For production workloads, consider using officially supported deployment methods from Databricks.
databricks-deployer/
├── modules/                              # Reusable Terraform modules
│   ├── aws-workspace/                    # AWS Databricks workspace module
│   ├── azure-workspace/                  # Azure Databricks workspace module
│   ├── gcp-workspace/                    # GCP Databricks workspace module
│   └── gcp-sa-provisioning/              # GCP service account provisioning
├── examples/                             # Example implementations
│   ├── aws-workspace/                    # AWS workspace deployment example
│   ├── azure-workspace/                  # Azure workspace deployment example
│   ├── gcp-workspace/                    # GCP workspace deployment example
│   └── gcp-sa-provisioning/              # GCP service account example
└── tools/                                # Installation and utility scripts
    ├── install-terraform-windows.ps1     # Terraform installer for Windows
    ├── install-awscli-windows.ps1        # AWS CLI installer for Windows
    ├── install-azurecli-windows.ps1      # Azure CLI installer for Windows
    └── install-gcp-cli-windows.ps1       # Google Cloud CLI installer for Windows
- Terraform >= 1.0.0
- Cloud provider CLI tools:
- AWS CLI (for AWS deployments)
- Azure CLI (for Azure deployments)
- gcloud CLI (for GCP deployments)
- Databricks account with appropriate permissions
- Cloud provider account with sufficient permissions
This project includes PowerShell scripts for Windows users to automatically download and install the required CLI tools. These scripts provide a convenient way to set up your development environment.
| Script | Tool | Description |
|---|---|---|
| tools/install-terraform-windows.ps1 | Terraform | Infrastructure as Code tool |
| tools/install-awscli-windows.ps1 | AWS CLI v2 | AWS command-line interface |
| tools/install-azurecli-windows.ps1 | Azure CLI | Azure command-line interface |
| tools/install-gcp-cli-windows.ps1 | Google Cloud CLI | Google Cloud command-line interface |
# Install Terraform
.\tools\install-terraform-windows.ps1
# Install AWS CLI
.\tools\install-awscli-windows.ps1
# Install Azure CLI
.\tools\install-azurecli-windows.ps1
# Install Google Cloud CLI
.\tools\install-gcp-cli-windows.ps1

# Install to custom directory
.\tools\install-terraform-windows.ps1 -InstallPath "C:\Tools\Terraform"
# Install specific version
.\tools\install-terraform-windows.ps1 -Version "1.5.0"
# Install without adding to PATH
.\tools\install-terraform-windows.ps1 -AddToPath:$false
# Combined options
.\tools\install-terraform-windows.ps1 -InstallPath "C:\Tools\Terraform" -Version "1.6.0"

# Silent installation (no user prompts)
.\tools\install-awscli-windows.ps1 -Silent
.\tools\install-azurecli-windows.ps1 -Silent
.\tools\install-gcp-cli-windows.ps1 -Silent
# Skip installation verification
.\tools\install-awscli-windows.ps1 -SkipVerification
.\tools\install-azurecli-windows.ps1 -SkipVerification
.\tools\install-gcp-cli-windows.ps1 -SkipVerification
# Combined options for unattended installation
.\tools\install-awscli-windows.ps1 -Silent -SkipVerification

| Parameter | Type | Default | Description |
|---|---|---|---|
| -Silent | Switch | $false | Run installation without user prompts |
| -SkipVerification | Switch | $false | Skip post-installation verification |
| -Version | String | "latest" | Install specific version (limited support) |
| Parameter | Type | Default | Description |
|---|---|---|---|
| -InstallPath | String | $env:LOCALAPPDATA\Terraform | Installation directory |
| -AddToPath | Switch | $true | Add to user PATH environment variable |
| -Version | String | "latest" | Terraform version to install |
- ✅ Automatic Downloads: Download latest versions from official sources
- ✅ Architecture Detection: Automatically detect 32-bit vs 64-bit systems
- ✅ Existing Installation Check: Detect and handle existing installations
- ✅ Progress Indicators: Visual feedback during download and installation
- ✅ Error Handling: Comprehensive error handling with helpful messages
- ✅ PATH Management: Automatic PATH environment variable updates
- ✅ Installation Verification: Post-installation testing and validation
- ✅ Colored Output: Enhanced readability with color-coded messages
- ✅ Cleanup: Automatic removal of temporary installation files
Terraform Script:
- Downloads from HashiCorp releases with version selection
- Zip file extraction to custom directories
- GitHub API integration for latest version detection
- Portable installation (no administrator rights required by default)
AWS CLI Script:
- MSI installer with UAC elevation handling
- 64-bit requirement detection
- Comprehensive MSI error code handling
- Post-installation configuration guidance
Azure CLI Script:
- Microsoft's official installer with redirect URL handling
- Cross-architecture support (32-bit and 64-bit)
- MSI installer with progress display options
- Integration guidance for Azure authentication
Google Cloud CLI Script:
- Universal installer with automatic architecture detection
- Interactive and silent installation modes
- Component management guidance
- Comprehensive setup instructions for GCP projects
- PowerShell: Windows PowerShell 5.1 or PowerShell Core 7.x
- Internet Connection: Required to download installers
- Execution Policy: May need to allow script execution:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
After running the installation scripts, you'll need to configure each tool:
terraform --version   # Verify installation

aws --version         # Verify installation
aws configure         # Configure credentials

az --version          # Verify installation
az login              # Authenticate with Azure
az account list       # List available subscriptions

gcloud --version      # Verify installation
gcloud init           # Initialize and authenticate
gcloud auth login     # Authenticate with Google Cloud

- PowerShell Execution Policy: if you get execution policy errors, run:
  Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
- PATH Not Updated: refresh the environment variables in the current session with the command below, or restart your PowerShell/Command Prompt session:
  $env:Path = [System.Environment]::GetEnvironmentVariable("Path", "User") + ";" + [System.Environment]::GetEnvironmentVariable("Path", "Machine")
- Administrator Privileges:
  - CLI installers may prompt for elevation (this is normal)
  - For Terraform, use the default user directory to avoid admin requirements
- Download Issues:
  - Ensure internet connectivity
  - Check whether a corporate firewall is blocking downloads
  - Verify TLS 1.2 support in your environment
Each script provides built-in help and detailed error messages. For additional help:
# View script parameters
Get-Help .\tools\install-terraform-windows.ps1 -Detailed
Get-Help .\tools\install-awscli-windows.ps1 -Detailed

- Choose your cloud provider and navigate to the corresponding example:
  cd examples/aws-workspace     # For AWS
  cd examples/azure-workspace   # For Azure
  cd examples/gcp-workspace     # For GCP
- Configure your variables by copying the example file, then edit terraform.tfvars with your specific values:
  cp terraform.tfvars.example terraform.tfvars
- Deploy the infrastructure:
  terraform init
  terraform plan
  terraform apply
Location: modules/aws-workspace/
Features:
- VPC with public and private subnets
- Security groups and NACLs
- NAT gateways and Internet gateway
- Optional PrivateLink endpoints
- Databricks workspace with MWS configuration
Key Resources:
- databricks_mws_workspaces
- databricks_mws_networks
- databricks_mws_private_access_settings
- AWS VPC and networking components
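As a rough sketch, an example configuration might call this module as shown below. The `source` path follows the repository layout, and `prefix`, `aws_region`, `availability_zones`, and `enable_private_link` are the variable names this README lists. The values, the assumed types, and the `databricks_account_id` input are hypothetical; check the module's variable definitions for the real interface.

```hcl
# Hypothetical root configuration, e.g. examples/aws-workspace/main.tf.
# All values are placeholders.

variable "databricks_account_id" {
  description = "Databricks account ID (assumed input name)"
  type        = string
}

module "aws_workspace" {
  source = "../../modules/aws-workspace"

  prefix              = "demo"
  aws_region          = "us-east-1"
  availability_zones  = ["us-east-1a", "us-east-1b"] # assumed list(string)
  enable_private_link = false                        # assumed bool

  # Assumed input name; the module may expose this differently.
  databricks_account_id = var.databricks_account_id
}
```

Keeping `enable_private_link` off is the simpler starting point, since the PrivateLink endpoints are described as optional.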
Location: modules/azure-workspace/
Features:
- Azure Virtual Network (VNet) with subnets
- Network Security Groups (NSG)
- Route tables and NAT Gateway
- Optional Private Link endpoints
- Databricks workspace with custom network parameters
Key Resources:
- azurerm_databricks_workspace
- databricks_mws_private_access_settings
- Azure VNet and networking components
Location: modules/gcp-workspace/
Features:
- GCP VPC network with multiple subnets
- Cloud Router and NAT Gateway
- Optional Private Service Connect (PSC) endpoints
- Databricks workspace with network configuration
Key Resources:
- databricks_mws_workspaces
- databricks_mws_networks
- databricks_mws_private_access_settings
- GCP VPC and networking components
Location: modules/gcp-sa-provisioning/
Features:
- Service account creation for Databricks
- IAM role assignments
- Key management
| Feature | AWS | Azure | GCP |
|---|---|---|---|
| Workspace Resource | databricks_mws_workspaces | azurerm_databricks_workspace | databricks_mws_workspaces |
| Network Configuration | databricks_mws_networks | Custom parameters in workspace | databricks_mws_networks |
| Private Connectivity | VPC Endpoints | Private Link | Private Service Connect |
| Subnets Required | 2 (public/private) | 2 (public/private) | 3 (primary/pods/services) |
| NAT Gateway | AWS NAT Gateway | Azure NAT Gateway | Cloud NAT |
All modules support these common configuration patterns:
- Naming: prefix variable for consistent resource naming
- Networking: Options for new or existing network infrastructure
- Private Connectivity: Optional private endpoints/links
- Databricks Configuration: Account ID, credentials, workspace settings
Each cloud provider has specific variables:
AWS:
- aws_region
- availability_zones
- enable_private_link
Azure:
- location
- resource_group_name
- enable_private_link
GCP:
- project_id
- region
- enable_private_service_connect
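To illustrate how these variables might be set, here is a minimal `terraform.tfvars` sketch for the GCP example. Only the variable names come from this README; the values are placeholders and the types (strings and a boolean) are assumptions. The AWS and Azure examples would presumably follow the same pattern with their own variables.

```hcl
# Hypothetical examples/gcp-workspace/terraform.tfvars (placeholder values)

prefix     = "demo"           # common naming prefix
project_id = "my-gcp-project" # replace with your GCP project ID
region     = "us-central1"    # replace with your target region

# Assumed to be a boolean toggle for the optional PSC endpoints.
enable_private_service_connect = false
```

The `terraform.tfvars.example` file in each example directory remains the authoritative list of required variables.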
- All modules create isolated network environments
- Private subnets for compute resources
- Security groups/NSGs with minimal required access
- Optional private connectivity to Databricks services
- Service principal/service account authentication
- Least privilege IAM policies
- Encrypted storage and transit
- Use cloud provider secret managers
- Avoid hardcoding credentials in Terraform files
- Use environment variables or external secret stores
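One way to keep credentials out of Terraform files is to declare them as sensitive input variables and supply them through `TF_VAR_` environment variables. The sketch below assumes service principal (OAuth) authentication against the Databricks account console on AWS; the variable names are hypothetical and the examples in this project may wire credentials differently.

```hcl
# Hypothetical variable names; supply the values from the environment, e.g.
#   TF_VAR_databricks_account_id, TF_VAR_databricks_client_id,
#   TF_VAR_databricks_client_secret
# rather than writing them into terraform.tfvars.

variable "databricks_account_id" {
  type = string
}

variable "databricks_client_id" {
  type = string
}

variable "databricks_client_secret" {
  type      = string
  sensitive = true # redacts the value in plan and apply output
}

provider "databricks" {
  # Account console URL for AWS; Azure and GCP use different hosts.
  host          = "https://accounts.cloud.databricks.com"
  account_id    = var.databricks_account_id
  client_id     = var.databricks_client_id
  client_secret = var.databricks_client_secret
}
```

Note that `sensitive = true` hides the value in CLI output only; it is still written to the state file, so the state backend needs protecting as well.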
Each example in the examples/ directory provides:
- Complete working configuration
- terraform.tfvars.example with all required variables
- Provider configuration
- Output definitions
- Detailed README with deployment instructions
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
- Follow Terraform best practices
- Use consistent naming conventions
- Document all variables and outputs
- Test across all supported cloud providers
- Update README files for any changes
This project is licensed under the MIT License - see the LICENSE file for details.
For issues and questions:
- Check the module-specific README files
- Review the examples for your cloud provider
- Open an issue in the repository
- Consult the Databricks Terraform Provider documentation
| Component | Version |
|---|---|
| Terraform | >= 1.0.0 |
| Databricks Provider | >= 1.0.0 |
| AWS Provider | >= 3.0.0 |
| Azure Provider | >= 3.0.0 |
| Google Provider | >= 4.0.0 |
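For reference, these constraints could be expressed in a `terraform` block roughly as follows. This is a sketch that declares all four providers at once; a single example would likely declare only the Databricks provider and the provider for its own cloud.

```hcl
terraform {
  required_version = ">= 1.0.0"

  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = ">= 1.0.0"
    }
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.0.0"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.0.0"
    }
    google = {
      source  = "hashicorp/google"
      version = ">= 4.0.0"
    }
  }
}
```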
Note: This project provides infrastructure-as-code templates for Databricks workspace deployment. Always review and test configurations in non-production environments before applying to production workloads.