- Developer pushes code -> GitHub Repo.
- GitHub Actions workflow (`.github/workflows/ci-cd.yml`) runs.
- Workflow connects to the GCP VM via SSH.
- Repository is cloned/pulled on the VM.
- DAGs are copied into the Airflow instance (`~/airflow/dags`).
- Airflow automatically loads new/updated DAGs.
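The VM-side steps above can be sketched as a small shell helper. This is a sketch, not the actual workflow contents: the repository path, DAGs path, and function names are assumptions for illustration.

```bash
#!/bin/sh
# Sketch of the VM-side deploy steps (paths and names are assumptions).

# Pull the latest code, cloning on the first run.
update_repo() {
  repo_dir="$1"; repo_url="$2"
  if [ -d "$repo_dir/.git" ]; then
    git -C "$repo_dir" pull --ff-only
  else
    git clone "$repo_url" "$repo_dir"
  fi
}

# Copy DAGs into the Airflow instance; the scheduler picks them up on its
# next DAG-directory scan, so no restart is needed.
sync_dags() {
  repo_dir="$1"; dags_dir="$2"
  mkdir -p "$dags_dir"
  cp -r "$repo_dir/dags/." "$dags_dir/"
}
```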
## Prerequisites

- Google Cloud Platform (GCP) account.
- A running Ubuntu VM instance in GCP.
- Open firewall ports:
  - 22 -> SSH
  - 8080 -> Airflow Web UI
- GitHub repository with your DAGs and pipeline code.
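If the ports aren't open yet, a firewall rule can be created with the gcloud CLI. This is a one-time cloud-setup fragment; the rule names below are assumptions, and you should adjust `--source-ranges` rather than exposing port 8080 to the whole internet.

```bash
# Hypothetical one-time setup: open SSH and the Airflow UI port.
# Rule names are assumptions; consider restricting --source-ranges to your IP.
gcloud compute firewall-rules create allow-ssh \
  --allow=tcp:22 --direction=INGRESS
gcloud compute firewall-rules create allow-airflow-ui \
  --allow=tcp:8080 --direction=INGRESS
```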
## Install Docker & Airflow on the VM

- SSH into your VM:

  ```bash
  ssh <your-username>@<your-vm-external-ip>
  ```

- Then run the installation script:

  ```bash
  chmod +x ./scripts/setup_vm.sh
  ./scripts/setup_vm.sh
  ```
## Setup SSH Keys for CI/CD

- On your local machine:

  ```bash
  ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
  ```

- `id_rsa` -> private key (add as GitHub secret: `GCP_SSH_KEY`).
- `id_rsa.pub` -> public key (add to the VM's `~/.ssh/authorized_keys`).
- Also add the following GitHub repo secrets:
  - `GCP_VM_HOST` = `<your-vm-ip>`
  - `GCP_VM_USER` = `<your-vm-username>`
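If you'd rather not reuse your personal `id_rsa` for CI/CD, the generation step can be wrapped to produce a dedicated deploy key. The filename, comment, and helper name here are assumptions:

```bash
# Hypothetical helper: generate a dedicated CI/CD key pair with no passphrase.
# The private key goes into the GCP_SSH_KEY GitHub secret; the .pub file is
# appended to ~/.ssh/authorized_keys on the VM.
make_ci_key() {
  keyfile="$1"
  ssh-keygen -t rsa -b 4096 -N "" -C "ci-cd-deploy" -f "$keyfile" -q
}
```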
## Project Structure
```
cloud-airflow-data-pipeline/
│── dags/                 # Airflow DAGs
│── plugins/              # Airflow custom plugins
│── scripts/              # Helper scripts
│── docker-compose.yml    # Docker setup for Airflow
│── install_airflow.sh    # Setup script for Docker + Airflow
│── .github/workflows/    # CI/CD definitions
```
## Deploy New DAGs

1. Add new DAGs to the `dags/` folder.
2. Push changes to the `main` branch.
3. GitHub Actions deploys automatically.
4. Check the Airflow UI (`http://<vm-ip>:8080`) for new DAGs.
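Before pushing, a quick local syntax check of the DAG files can catch Python errors before they ever reach the VM. This helper is a sketch (its name and approach are assumptions, not part of the repo's scripts):

```bash
# Hypothetical pre-push check: byte-compile each DAG file so Python syntax
# errors are caught locally instead of surfacing as import errors in Airflow.
check_dags() {
  dir="$1"
  for f in "$dir"/*.py; do
    [ -e "$f" ] || continue
    python3 -m py_compile "$f" || return 1
  done
}
```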
