
---
title: Blueprints
description: Infrastructure as Code composition mechanism providing ready-to-deploy end-to-end solutions for edge computing environments with Azure IoT Operations
author: Edge AI Team
ms.date: 2025-06-07
ms.topic: reference
keywords:
  - blueprints
  - infrastructure as code
  - azure iot operations
  - edge computing
  - terraform
  - bicep
  - kubernetes
  - arc-enabled clusters
  - deployment templates
  - iac composition
estimated_reading_time: 7
---

Blueprints

Blueprints are the Infrastructure as Code (IaC) composition mechanism for this repository. They provide ready-to-deploy end-to-end solutions that showcase how to combine individual components into complete edge computing solutions. Blueprints can be deployed as-is, extended, modified, or layered to build complex multi-stage solutions that meet your specific requirements.

Available Blueprints

| Blueprint | Description |
| --- | --- |
| Minimum Single Cluster | Minimum deployment of Azure IoT Operations on a single-node, Arc-enabled Kubernetes cluster, omitting observability, messaging, and ACR components |
| Full Single Cluster | Complete deployment of Azure IoT Operations on a single-node, Arc-enabled Kubernetes cluster |
| Full Multi-node Cluster | Complete deployment of Azure IoT Operations on a multi-node, Arc-enabled Kubernetes cluster |
| CNCF Cluster Script Only | Generates scripts for cluster creation without deploying resources |
| Azure Fabric Environment | Provisions an Azure Fabric environment (currently Terraform only) |
| Dual Peered Single Node Cluster | Deploys two single-node clusters with peered networks to demonstrate secured communication across multiple AIO MQ instances |

More coming soon...

Bicep Architecture

Each Bicep blueprint in this repository follows a consistent structure:

  • Main Configuration: Root module that orchestrates component deployment using Azure's declarative syntax
  • Parameters: Defined with type safety and validation rules, with descriptions and default values
  • Outputs: Critical resource information returned after deployment
  • Type Definitions: Shared type definitions in types.core.bicep or component-specific types for parameter consistency
  • Reusable Modules: Leverages components from /src to ensure consistency and maintainability
  • Deployment Scope: Supports both subscription-level and resource group-level deployments

Terraform Architecture

Each Terraform blueprint in this repository follows a consistent structure:

  • Main Configuration: Root module that orchestrates component deployment
  • Variables: Defined in variables.tf with descriptions and default values
  • Outputs: Critical resource information returned after deployment in outputs.tf
  • Reusable Modules: Leverages components from /src to ensure consistency and maintainability
  • Local State: By default, state is stored locally but can be configured for remote backends

Blueprint Selection Guide

  • Full Single Cluster: Best for development, testing, and proof-of-concept deployments
  • Full Multi-node Cluster: Recommended for general purpose lab and production-grade deployments requiring high availability
  • CNCF Cluster Script Only: Ideal for environments with existing infrastructure or custom deployment processes
  • Azure Fabric Environment: For users looking to provision Azure Fabric environments with options to deploy Lakehouse, EventStream, and Fabric workspace

Testing Blueprints

Selected blueprints include comprehensive test suites for validation and quality assurance. Tests validate both infrastructure declarations (contract tests) and actual deployments (integration tests).

Available Tests:

  • Contract Tests - Fast static validation ensuring output declarations match expectations (zero cost, runs in seconds)
  • Deployment Tests - Full end-to-end validation creating real Azure resources and testing functionality

Blueprints with Test Coverage:

See individual blueprint tests/ directories for detailed testing documentation, setup instructions, and maintainer guidelines.

Using Existing Resource Groups

All blueprints support deploying to existing resource groups rather than creating new ones.

Terraform Implementation

To use an existing resource group with Terraform:

terraform apply -var="resource_group_name=your-existing-rg"

Bicep Implementation

To use an existing resource group with Bicep:

az deployment sub create --name deploy1 --location eastus \
  --template-file ./main.bicep \
  --parameters useExistingResourceGroup=true resourceGroupName=your-existing-rg

Important Considerations

When using an existing resource group:

  • Ensure it's in the same region specified in your deployment parameters
  • Verify you have appropriate permissions to deploy resources within it
  • Be aware that name conflicts may occur with existing resources
  • The existing resource group's location will be used for resources that are location-sensitive
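The region check in the first bullet can be scripted before deploying. A hedged sketch (the resource group name and region are placeholders; the `az` call is guarded so the comparison logic still runs where the CLI is unavailable):

```shell
# Verify the existing resource group's region matches the intended deployment region.
# locations_match is pure shell, so the comparison itself needs no cloud access.
locations_match() {
  if [ "$1" = "$2" ]; then
    echo "ok"
  else
    echo "MISMATCH: resource group is in '$1' but deployment targets '$2'"
  fi
}

intended_location="eastus2"            # what your tfvars/bicepparam specifies (placeholder)
if command -v az >/dev/null 2>&1; then
  rg_location="$(az group show --name "your-existing-rg" --query location -o tsv 2>/dev/null || echo unknown)"
else
  rg_location="eastus2"                # stubbed value when az is unavailable
fi
locations_match "$rg_location" "$intended_location"
```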

Required Permissions and Custom Roles

Deploying blueprints requires specific Azure permissions to provision and manage the various resources across multiple Azure services. This section outlines the minimum required permissions.

Custom Role Requirements

For environments requiring least-privilege access, create a custom role with the following permission categories:

Core Infrastructure Permissions

Resource Groups and Authorization:

  • Microsoft.Authorization/roleAssignments/read - Read role assignments
  • Microsoft.Authorization/roleAssignments/write - Create and update role assignments

Identity Management:

  • Microsoft.ManagedIdentity/userAssignedIdentities/read - Read user-assigned managed identities
  • Microsoft.ManagedIdentity/userAssignedIdentities/write - Create and update user-assigned managed identities
  • Microsoft.ManagedIdentity/userAssignedIdentities/assign/action - Assign identities to resources
  • Microsoft.ManagedIdentity/userAssignedIdentities/federatedIdentityCredentials/read - Read federated identity credentials
  • Microsoft.ManagedIdentity/userAssignedIdentities/federatedIdentityCredentials/write - Create and update federated identity credentials

Security and Secrets:

  • Microsoft.KeyVault/vaults/read - Read Key Vaults
  • Microsoft.KeyVault/vaults/write - Create and update Key Vaults
  • Microsoft.KeyVault/locations/deletedVaults/purge/action - Purge soft-deleted Key Vaults during cleanup
  • Microsoft.SecretSyncController/azureKeyVaultSecretProviderClasses/read - Read secret provider classes
  • Microsoft.SecretSyncController/azureKeyVaultSecretProviderClasses/write - Create and update secret provider classes

Storage:

  • Microsoft.Storage/storageAccounts/read - Read storage accounts
  • Microsoft.Storage/storageAccounts/write - Create and update storage accounts
  • Microsoft.Storage/storageAccounts/blobServices/containers/read - Read blob containers
  • Microsoft.Storage/storageAccounts/blobServices/containers/write - Create and update blob containers

Messaging and Eventing Permissions

Event Grid:

  • Microsoft.EventGrid/namespaces/read - Read Event Grid namespaces
  • Microsoft.EventGrid/namespaces/write - Create and update Event Grid namespaces for MQTT broker functionality
  • Microsoft.EventGrid/namespaces/topicSpaces/read - Read topic spaces
  • Microsoft.EventGrid/namespaces/topicSpaces/write - Create and update topic spaces for message routing

Event Hubs:

  • Microsoft.EventHub/namespaces/read - Read Event Hub namespaces
  • Microsoft.EventHub/namespaces/write - Create and update Event Hub namespaces
  • Microsoft.EventHub/namespaces/eventhubs/read - Read Event Hubs
  • Microsoft.EventHub/namespaces/eventhubs/write - Create and update individual Event Hubs
  • Microsoft.EventHub/namespaces/eventhubs/consumergroups/read - Read consumer groups
  • Microsoft.EventHub/namespaces/eventhubs/consumergroups/write - Create and update consumer groups
  • Microsoft.EventHub/namespaces/eventhubs/authorizationRules/read - Read authorization rules
  • Microsoft.EventHub/namespaces/eventhubs/authorizationRules/write - Create and update access policies and connection strings

IoT Operations Permissions

Azure IoT Operations Core:

  • Microsoft.IoTOperations/instances/* - Deploy and manage IoT Operations instances
  • Microsoft.IoTOperations/instances/brokers/* - Configure MQTT brokers
  • Microsoft.IoTOperations/instances/brokers/listeners/* - Set up broker listeners
  • Microsoft.IoTOperations/instances/brokers/authentications/* - Configure authentication methods
  • Microsoft.IoTOperations/instances/dataflowEndpoints/* - Define data flow endpoints
  • Microsoft.IoTOperations/instances/dataflowProfiles/* - Create data flow profiles and dataflows

Device Registry:

  • Microsoft.DeviceRegistry/schemaRegistries/* - Manage schema registries for message validation
  • Microsoft.DeviceRegistry/assets/* - Register and manage edge assets
  • Microsoft.DeviceRegistry/assetEndpointProfiles/* - Configure asset endpoint profiles

Arc-Enabled Kubernetes:

  • Microsoft.Kubernetes/connectedClusters/read - Read Arc-enabled cluster information
  • Microsoft.KubernetesConfiguration/extensions/* - Deploy and manage cluster extensions
  • Microsoft.ExtendedLocation/customLocations/resourceSyncRules/* - Configure resource synchronization

Observability Permissions

Monitoring and Logging:

  • Microsoft.OperationalInsights/workspaces/* - Create Log Analytics workspaces
  • Microsoft.Monitor/accounts/* - Manage Azure Monitor accounts for Prometheus metrics
  • Microsoft.Insights/dataCollectionRules/* - Define data collection rules
  • Microsoft.Insights/dataCollectionEndpoints/* - Configure data collection endpoints
  • Microsoft.Insights/dataCollectionRuleAssociations/* - Associate rules with resources
  • Microsoft.Insights/components/* - Create Application Insights resources

Grafana and Dashboards:

  • Microsoft.Dashboard/grafana/* - Provision managed Grafana instances for visualization
  • Microsoft.AlertsManagement/prometheusRuleGroups/* - Configure Prometheus alerting rules
  • Microsoft.OperationsManagement/solutions/* - Deploy monitoring solutions

Private Monitoring:

  • Microsoft.Insights/privateLinkScopes/* - Create private link scopes for secure monitoring
  • Microsoft.Insights/privateLinkScopes/scopedResources/* - Associate resources with private link scopes
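The permission categories above can be collected into a custom role definition. A hedged, abridged sketch: the role name, assignable scope, and file name are placeholders, and the `Actions` list shows only a representative subset — extend it with the remaining permissions in this section before real use:

```shell
# Write an abridged custom role definition covering the categories above.
# Role name and assignable scope are placeholders; extend "Actions" with the
# rest of the permissions listed in this section before using it for real.
cat > blueprint-deployer-role.json <<'EOF'
{
  "Name": "Blueprint Deployer (example)",
  "Description": "Least-privilege role for deploying edge blueprints",
  "Actions": [
    "Microsoft.Authorization/roleAssignments/read",
    "Microsoft.Authorization/roleAssignments/write",
    "Microsoft.ManagedIdentity/userAssignedIdentities/*",
    "Microsoft.KeyVault/vaults/read",
    "Microsoft.KeyVault/vaults/write",
    "Microsoft.IoTOperations/instances/*",
    "Microsoft.Kubernetes/connectedClusters/read",
    "Microsoft.KubernetesConfiguration/extensions/*"
  ],
  "NotActions": [],
  "AssignableScopes": ["/subscriptions/00000000-0000-0000-0000-000000000000"]
}
EOF

# Create the role (requires Owner or User Access Administrator rights);
# guarded for environments without the Azure CLI.
if command -v az >/dev/null 2>&1; then
  az role definition create --role-definition @blueprint-deployer-role.json
fi
```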

Detailed Deployment Workflow

Prerequisites

IMPORTANT: We strongly recommend using this project's integrated dev container to get started quickly on Windows-based systems; it also works well in nix-compatible environments.

Refer to the Environment Setup section in the Root README for detailed instructions on setting up your environment.

Ensure your Azure CLI is logged in and your subscription context is set correctly.
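The login and subscription check can be scripted. A hedged sketch (the subscription name is a placeholder, and the snippet is guarded so it degrades gracefully where `az` is not installed):

```shell
# Check that the Azure CLI is logged in and show the active subscription.
if command -v az >/dev/null 2>&1; then
  context="$(az account show --query name -o tsv 2>/dev/null || echo 'not-logged-in')"
  # Pin the target subscription explicitly (subscription name is a placeholder):
  # az account set --subscription "My Subscription"
else
  context="az-not-installed"
fi
echo "Active subscription context: $context"
```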

Getting Started and Deploying with Terraform

Note on Telemetry: If you wish to opt out of sending telemetry data to Microsoft when deploying Azure resources with Terraform, set the environment variable ARM_DISABLE_TERRAFORM_PARTNER_ID=true before running any terraform commands.

  1. Navigate to your chosen blueprint directory, as an example:

    # Navigate to the terraform directory
    cd ./full-single-node-cluster/terraform
  2. Set up required environment variables:

    • ARM_SUBSCRIPTION_ID -- The Azure Subscription ID target for this deployment (required to be set for the Terraform tasks below)
    # Dynamically get the Subscription ID or manually get and pass to ARM_SUBSCRIPTION_ID
    current_subscription_id=$(az account show --query id -o tsv)
    export ARM_SUBSCRIPTION_ID="$current_subscription_id"
  3. Generate a terraform.tfvars file using terraform-docs:

    # Generate the tfvars file
    terraform-docs tfvars hcl .

    If terraform-docs is not installed, you'll need to install it:

    # Install terraform-docs - macOS
    brew install terraform-docs
    
    # Install terraform-docs - Linux
    ./scripts/install-terraform-docs.sh

    Or visit the terraform-docs installation page for more options.

    The generated output will look similar to the following:

    # Required variables
    environment     = "dev"                 # Environment type (dev, test, prod)
    resource_prefix = "myprefix"            # Short unique prefix for resource naming
    location        = "eastus2"             # Azure region location
    # Optional (recommended) variables
    instance        = "001"                 # Deployment instance number

    Copy this output to a file named terraform.tfvars and fill in any required values. Update any optional values that you want to change as well.

    NOTE: To have Terraform automatically use your variables, you can name your tfvars file terraform.auto.tfvars. Terraform will use variables from any *.auto.tfvars files located in the same deployment folder.

  4. Initialize and apply Terraform:

    # Pulls down providers and modules, initializes state and backend
    terraform init -upgrade # Use '-reconfigure' if backend for tfstate needs to be reconfigured
    
    # Preview changes before applying
    terraform plan -var-file=terraform.tfvars  # Use -var-file if not using *.auto.tfvars file
    
    # Review resource change list, then deploy
    terraform apply -var-file=terraform.tfvars # Add '-auto-approve' to skip confirmation

    Note: To deploy to an existing resource group instead of creating a new one, add -var="resource_group_name=your-existing-rg" to your apply command.

  5. Wait for the deployment to complete. A successful run ends with a message like the following:

    Apply complete! Resources: *** added, *** changed, *** destroyed.
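Once the apply completes, the values declared in outputs.tf (see Terraform Architecture above) can be inspected with terraform output. A hedged sketch, guarded for machines without Terraform on PATH; the outputs.json file name is just an example:

```shell
# List the blueprint's declared outputs after a successful apply.
if command -v terraform >/dev/null 2>&1; then
  terraform output || true                       # human-readable list of all outputs
  terraform output -json > outputs.json || true  # machine-readable, e.g. for scripts
  status="listed"
else
  status="terraform-not-installed"
fi
echo "$status"
```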

Getting Started and Deploying with Bicep

Bicep provides an alternative Infrastructure as Code (IaC) approach that's native to Azure. Follow these steps to deploy blueprints using Bicep:

  1. Navigate to your chosen blueprint directory, as an example:

    # Navigate to the bicep directory
    cd ./full-single-node-cluster/bicep
  2. Use the Azure CLI to get the Custom Locations OID:

    # Get the custom locations OID and export it as an environment variable
    export CUSTOM_LOCATIONS_OID=$(az ad sp show --id bc313c14-388c-4e7d-a58e-70017303ee3b --query id -o tsv)
    
    # Verify the environment variable is set correctly
    echo $CUSTOM_LOCATIONS_OID
  3. Check that the Bicep CLI is installed or install it:

    # Verify Bicep installation (included in recent Azure CLI versions)
    az bicep version
    
    # If not installed:
    az bicep install
  4. Create a parameters file for your deployment:

    Generate a parameters file using the Azure CLI's Bicep parameter generation feature:

    # Generate the parameters file template
    az bicep generate-params --file main.bicep --output-format bicepparam --include-params all > main.bicepparam

    Edit the generated main.bicepparam file to customize your deployment parameters:

    // Parameters for full-single-node-cluster blueprint
    using './main.bicep'
    
    // Required parameters for the common object
    param common = {
      resourcePrefix: 'prf01a2'     // Keep short (max 8 chars) to avoid resource naming issues
      location: 'eastus2'            // Replace with your Azure region
      environment: 'dev'             // 'dev', 'test', or 'prod'
      instance: '001'                // For multiple deployments
    }
    
    // This is not optimal, to be replaced by KeyVault usage in future
    @secure()
    param adminPassword = 'YourSecurePassword123!' // Replace with a secure password
    
    // When customLocationsOid is required:
    param customLocationsOid = readEnvironmentVariable('CUSTOM_LOCATIONS_OID') // Read from environment variable
    
    // Any additional parameters with defaults, example:
    param resourceGroupName = 'rg-${common.resourcePrefix}-${common.environment}-${common.instance}'
    param shouldCreateAnonymousBrokerListener = false // Set to true only for dev/test environments
    param shouldInitAio = true // Deploy the Azure IoT Operations initial connected cluster resources
    param shouldDeployAio = true // Deploy an Azure IoT Operations Instance
    param useExistingResourceGroup = false // Set to true to use an existing resource group instead of creating a new one

    Note: When setting useExistingResourceGroup to true, ensure the resource group already exists or your deployment will fail.
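That precondition can be checked up front with az group exists. A hedged sketch (the resource group name is a placeholder; the `az` call is guarded so the script still runs where the CLI is unavailable):

```shell
# Confirm the target resource group exists before enabling useExistingResourceGroup.
rg_name="your-existing-rg"               # placeholder name
if command -v az >/dev/null 2>&1; then
  exists="$(az group exists --name "$rg_name")"   # prints 'true' or 'false'
else
  exists="false"                         # stubbed when az is unavailable
fi
if [ "$exists" = "true" ]; then
  echo "resource group '$rg_name' found"
else
  echo "resource group '$rg_name' not found; create it first or leave useExistingResourceGroup=false"
fi
```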

  5. (Optional) Determine available Azure locations:

    Navigate to the scripts directory:

    cd ../../../scripts

    Run the location-check.sh script:

    ./location-check.sh --blueprint {blueprint_name} --method bicep
  6. Deploy Resources with Bicep:

    # Deploy using the Azure CLI at the subscription level, keep deployment_name less than 8 characters:
    az deployment sub create --name {deployment_name} --location {location} --parameters ./main.bicepparam

    Note: When deploying with a customLocationsOid, ensure the CUSTOM_LOCATIONS_OID environment variable is set in your current shell session before running the deployment command.

  7. Monitor deployment progress:

    You can check the deployment status in the Azure portal or using the Azure CLI:

    # Get the resource group name (after deployment starts)
    RG_NAME="rg-{resource_prefix}-{environment}-{instance}"
    
    # List resources in the resource group
    az resource list --resource-group $RG_NAME -o table

Accessing Deployed Resources

After a successful deployment, verify that your resources were created correctly by listing everything in the resource group:

# Get the resource group name (after deployment starts)
RG_NAME="rg-{resource_prefix}-{environment}-{instance}"

# List resources in the resource group
az resource list --resource-group $RG_NAME -o table

Any Arc Connected Cluster Deployment

After a successful deployment, verify you can connect to the cluster and that there are pods:

# Get the arc connected cluster name after deployment, default looks like the following:
ARC_CONNECTED_CLUSTER_NAME="arck-{resource_prefix}-{environment}-{instance}"

# Access the Kubernetes cluster (in one prompt)
az connectedk8s proxy -n $ARC_CONNECTED_CLUSTER_NAME -g $RG_NAME

# View AIO resources (in a separate prompt)
kubectl get pods -n azure-iot-operations

# Check cluster node status
kubectl get nodes -o wide

Any Key Vault Stored Resources (Such as Scripts)

Some blueprints store resources (such as scripts) in Key Vault that you may want to retrieve locally for verification or later use. Check the deployment output for specifics on what to download.

The typical retrieval process, using cluster setup scripts as an example:

# Get the Key Vault name after deployment, default looks like the following:
KV_NAME="kv-{resource_prefix}-{environment}-{instance}"

# Retrieve scripts from Key Vault and save to local files
az keyvault secret show --name cluster-server-ubuntu-k3s --vault-name $KV_NAME --query value -o tsv > cluster-server-ubuntu-k3s.sh
az keyvault secret show --name cluster-node-ubuntu-k3s --vault-name $KV_NAME --query value -o tsv > cluster-node-ubuntu-k3s.sh

# Make scripts executable
chmod +x cluster-server-ubuntu-k3s.sh cluster-node-ubuntu-k3s.sh

Deployment Cleanup

It is recommended that you use either the Azure Portal or AZ CLI commands to delete deployed resources. If you've deployed a resource group with resources in it, the quickest way to clean up is to delete the resource group.

The following is an example using AZ CLI:

# Delete the resource group and all its resources
# (add '--yes --no-wait' to skip the confirmation prompt and return immediately)
az group delete --name "$RG_NAME"

Deployment Troubleshooting

Deployment duration for multi-node clusters will be longer than single-node deployments. Be patient during the provisioning process.

Terraform Troubleshooting

Terraform can fail if resources already exist in Azure but are not tracked in its state. Use the following commands to correct the Terraform state:

# List resources in the current state file
terraform state list

# Show details of a specific resource in state, example:
terraform state show 'module.edge_iot_ops.azurerm_arc_kubernetes_cluster_extension.iot_operations'

# Remove a resource from state (doesn't delete the actual resource), example:
terraform state rm 'module.edge_iot_ops.azurerm_arc_kubernetes_cluster_extension.iot_operations'

# Import an existing Azure resource into your Terraform state
# For an Arc-enabled K8s cluster (azurerm_arc_kubernetes_cluster), example:
terraform import 'module.edge_cncf_cluster.azurerm_arc_kubernetes_cluster.arc_cluster' \
  /subscriptions/{subscription_id}/resourceGroups/{resource_group}/providers/Microsoft.Kubernetes/connectedClusters/{cluster_name}

# Update state to match the real infrastructure without making changes
# (deprecated in newer Terraform versions in favor of 'terraform apply -refresh-only')
terraform refresh

# Remove all resources from state (useful when you want to start fresh without destroying resources)
terraform state rm $(terraform state list)

# Move a resource within your state (useful for refactoring), example:
terraform state mv 'module.old_name.azurerm_resource.example' 'module.new_name.azurerm_resource.example'

Bicep Troubleshooting

Bicep operates against the actual deployed resources in Azure; any correction to deployed resources must be made directly with Bicep or AZ CLI commands:

# Get deployment status and errors
az deployment sub show --name {deployment_name} --query "properties.error" -o json

# List all deployments at subscription level
az deployment sub list --query "[].{Name:name, State:properties.provisioningState}" -o table

# Get detailed information about a specific resource
az resource show --resource-group $RG_NAME --name {resource_name} --resource-type {resource_type}

# Delete a specific resource without deleting the whole resource group
az resource delete --resource-group $RG_NAME --name {resource_name} --resource-type {resource_type}

# View deployment operations for troubleshooting
az deployment sub operation list --name {deployment_name}

# Export a deployed resource group as a template for comparison or backup
az group export --name $RG_NAME > exported-resources.json

# Export all child resources of a specific parent resource (useful for nested resources)
az resource list --parent "subscriptions/{subscription_id}/resourceGroups/{resource_group}/providers/{provider}/{resource-type}/{parent-resource}" -o json > child-resources.json

Common Issues

  • Node joining failures: If worker nodes in a multi-node deployment fail to join the cluster, verify network connectivity between the VMs
  • Terraform timeouts: Multi-node deployments may require increased timeouts for resource creation; increase the timeout and retry the deployment
  • Arc-enabled Kubernetes issues: Arc connection issues may occur during the first deployment; retry the deployment
  • Custom Locations OID: Verify the correct OID for your tenant; it can vary between Azure AD instances and permission configurations
  • VM size availability: Ensure the chosen VM size is available in your selected region
  • Bicep deployment name too long: Keep the original deployment name roughly 5-8 characters long; this name is used as a prefix for additional deployments throughout the process
  • Resource name issues: Use only alphanumeric characters in the resource prefix and keep it at most 8 characters long. Additionally, pick a prefix that is likely to be unique, appending 4 unique characters if needed
  • Existing resource group issues: When using the existing resource group feature, make sure the resource group exists before deployment
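For the resource-prefix pitfalls above, a quick way to generate a short, unique, purely alphanumeric prefix (the `edg` base is an arbitrary example):

```shell
# Build a short, unique, alphanumeric resource prefix: a 3-character base plus
# 4 random hex characters stays within the 8-character limit.
base="edg"                                        # arbitrary example base
rand="$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')"
suffix="$(printf '%04x' "$rand")"
resource_prefix="${base}${suffix}"
echo "$resource_prefix"
```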

Common Terraform Commands

# Validate Terraform configuration
terraform validate

# Format Terraform files
terraform fmt

# Preview changes before applying
terraform plan -var-file=terraform.tfvars

# Clean up deployed resources
terraform destroy -var-file=terraform.tfvars

State Management

By default, Terraform state is stored locally. For team environments, consider configuring a remote backend:

terraform {
  backend "azurerm" {
    resource_group_name  = "terraform-state-rg"
    storage_account_name = "terraformstate"
    container_name       = "tfstate"
    key                  = "blueprint.tfstate"
  }
}
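The backend block above assumes the state storage already exists. A hedged sketch of provisioning it with the Azure CLI, reusing the placeholder names from the block (storage account names must be globally unique, so adjust before running; the `az` calls are guarded for machines without the CLI):

```shell
# Provision the remote-state storage referenced by the backend block above.
state_rg="terraform-state-rg"
state_sa="terraformstate"        # likely already taken: append unique characters
state_container="tfstate"

if command -v az >/dev/null 2>&1; then
  az group create --name "$state_rg" --location eastus2
  az storage account create --name "$state_sa" --resource-group "$state_rg" --sku Standard_LRS
  az storage container create --name "$state_container" --account-name "$state_sa"
fi
```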

Industry Solutions

The blueprints in this repository can be used to implement a variety of industry solutions across different pillars. We are actively working toward blueprints for each of these scenarios.

For a detailed list of industry pillars and scenarios, please see the Industry Scenarios and Platform Capabilities document.

Getting Started with Your Industry Solution

  1. Identify the industry scenario from the Industry Scenarios and Platform Capabilities document that best matches your requirements
  2. Select the appropriate blueprint based on your infrastructure needs (single node or multi-node)
  3. Follow the deployment instructions in this document
  4. After deployment, customize the solution with additional components specific to your industry scenario

For more information about implementing specific industry solutions, please contact the solution team.


🤖 Crafted with precision by ✨Copilot following brilliant human instruction, then carefully refined by our team of discerning human reviewers.