
---
title: Blueprints
description: Infrastructure as Code composition mechanism providing ready-to-deploy end-to-end solutions for edge computing environments with Azure IoT Operations
author: Edge AI Team
ms.date: 2025-06-07
ms.topic: reference
keywords:
  - blueprints
  - infrastructure as code
  - azure iot operations
  - edge computing
  - terraform
  - bicep
  - kubernetes
  - arc-enabled clusters
  - deployment templates
  - iac composition
estimated_reading_time: 7
---

Blueprints

Blueprints are the Infrastructure as Code (IaC) composition mechanism for this repository. They provide ready-to-deploy end-to-end solutions that showcase how to combine individual components into complete edge computing solutions. Blueprints can be deployed as-is, extended, modified, or layered to build complex multi-stage solutions that meet your specific requirements.

Available Blueprints

| Blueprint | Description |
| --- | --- |
| Minimum Single Cluster | Minimum deployment of Azure IoT Operations on a single-node, Arc-enabled Kubernetes cluster, omitting observability, messaging, and ACR components |
| Full Single Cluster | Complete deployment of Azure IoT Operations on a single-node, Arc-enabled Kubernetes cluster |
| Full Multi-node Cluster | Complete deployment of Azure IoT Operations on a multi-node, Arc-enabled Kubernetes cluster |
| CNCF Cluster Script Only | Generates scripts for cluster creation without deploying resources |
| Azure Fabric Environment | Provisions an Azure Fabric environment (currently Terraform only) |
| Dual Peered Single Node Cluster | Deploys two single-node clusters with peered networks to demonstrate secured communication across multiple AIO MQ instances |

More coming soon...

Bicep Architecture

Each Bicep blueprint in this repository follows a consistent structure:

  • Main Configuration: Root module that orchestrates component deployment using Azure's declarative syntax
  • Parameters: Defined with type safety and validation rules, with descriptions and default values
  • Outputs: Critical resource information returned after deployment
  • Type Definitions: Shared type definitions in types.core.bicep or component-specific types for parameter consistency
  • Reusable Modules: Leverages components from /src to ensure consistency and maintainability
  • Deployment Scope: Supports both subscription-level and resource group-level deployments

Terraform Architecture

Each Terraform blueprint in this repository follows a consistent structure:

  • Main Configuration: Root module that orchestrates component deployment
  • Variables: Defined in variables.tf with descriptions and default values
  • Outputs: Critical resource information returned after deployment in outputs.tf
  • Reusable Modules: Leverages components from /src to ensure consistency and maintainability
  • Local State: By default, state is stored locally but can be configured for remote backends

Blueprint Selection Guide

  • Full Single Cluster: Best for development, testing, and proof-of-concept deployments
  • Full Multi-node Cluster: Recommended for general purpose lab and production-grade deployments requiring high availability
  • CNCF Cluster Script Only: Ideal for environments with existing infrastructure or custom deployment processes
  • Azure Fabric Environment: For users looking to provision Azure Fabric environments with options to deploy Lakehouse, EventStream, and Fabric workspace

Testing Blueprints

Selected blueprints include comprehensive test suites for validation and quality assurance. Tests validate both infrastructure declarations (contract tests) and actual deployments (integration tests).

Available Tests:

  • Contract Tests - Fast static validation ensuring output declarations match expectations (zero cost, runs in seconds)
  • Deployment Tests - Full end-to-end validation creating real Azure resources and testing functionality

Blueprints with Test Coverage:

See individual blueprint tests/ directories for detailed testing documentation, setup instructions, and maintainer guidelines.

Using Existing Resource Groups

All blueprints support deploying to existing resource groups rather than creating new ones.

Terraform Implementation

To use an existing resource group with Terraform:

terraform apply -var="resource_group_name=your-existing-rg"

Bicep Implementation

To use an existing resource group with Bicep:

az deployment sub create --name deploy1 --location eastus \
  --template-file ./main.bicep \
  --parameters useExistingResourceGroup=true resourceGroupName=your-existing-rg

Important Considerations

When using an existing resource group:

  • Ensure it's in the same region specified in your deployment parameters
  • Verify you have appropriate permissions to deploy resources within it
  • Be aware that name conflicts may occur with existing resources
  • The existing resource group's location will be used for resources that are location-sensitive
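The region check in the first bullet can be scripted before deploying. A hedged sketch (the resource group name and region are placeholders; the `az` call is guarded so the comparison logic still runs where the CLI is unavailable):

```shell
# Verify the existing resource group's region matches the intended deployment region.
# locations_match is pure shell, so the comparison itself needs no cloud access.
locations_match() {
  if [ "$1" = "$2" ]; then
    echo "ok"
  else
    echo "MISMATCH: resource group is in '$1' but deployment targets '$2'"
  fi
}

intended_location="eastus2"            # what your tfvars/bicepparam specifies (placeholder)
if command -v az >/dev/null 2>&1; then
  rg_location="$(az group show --name "your-existing-rg" --query location -o tsv 2>/dev/null || echo unknown)"
else
  rg_location="eastus2"                # stubbed value when az is unavailable
fi
locations_match "$rg_location" "$intended_location"
```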

Required Permissions and Custom Roles

Deploying blueprints requires specific Azure permissions to provision and manage the various resources across multiple Azure services. This section outlines the minimum required permissions.

Custom Role Requirements

For environments requiring least-privilege access, create a custom role with the following permission categories:

Core Infrastructure Permissions

Resource Groups and Authorization:

  • Microsoft.Authorization/roleAssignments/read - Read role assignments
  • Microsoft.Authorization/roleAssignments/write - Create and update role assignments

Identity Management:

  • Microsoft.ManagedIdentity/userAssignedIdentities/read - Read user-assigned managed identities
  • Microsoft.ManagedIdentity/userAssignedIdentities/write - Create and update user-assigned managed identities
  • Microsoft.ManagedIdentity/userAssignedIdentities/assign/action - Assign identities to resources
  • Microsoft.ManagedIdentity/userAssignedIdentities/federatedIdentityCredentials/read - Read federated identity credentials
  • Microsoft.ManagedIdentity/userAssignedIdentities/federatedIdentityCredentials/write - Create and update federated identity credentials

Security and Secrets:

  • Microsoft.KeyVault/vaults/read - Read Key Vaults
  • Microsoft.KeyVault/vaults/write - Create and update Key Vaults
  • Microsoft.KeyVault/locations/deletedVaults/purge/action - Purge soft-deleted Key Vaults during cleanup
  • Microsoft.SecretSyncController/azureKeyVaultSecretProviderClasses/read - Read secret provider classes
  • Microsoft.SecretSyncController/azureKeyVaultSecretProviderClasses/write - Create and update secret provider classes

Storage:

  • Microsoft.Storage/storageAccounts/read - Read storage accounts
  • Microsoft.Storage/storageAccounts/write - Create and update storage accounts
  • Microsoft.Storage/storageAccounts/blobServices/containers/read - Read blob containers
  • Microsoft.Storage/storageAccounts/blobServices/containers/write - Create and update blob containers

Messaging and Eventing Permissions

Event Grid:

  • Microsoft.EventGrid/namespaces/read - Read Event Grid namespaces
  • Microsoft.EventGrid/namespaces/write - Create and update Event Grid namespaces for MQTT broker functionality
  • Microsoft.EventGrid/namespaces/topicSpaces/read - Read topic spaces
  • Microsoft.EventGrid/namespaces/topicSpaces/write - Create and update topic spaces for message routing

Event Hubs:

  • Microsoft.EventHub/namespaces/read - Read Event Hub namespaces
  • Microsoft.EventHub/namespaces/write - Create and update Event Hub namespaces
  • Microsoft.EventHub/namespaces/eventhubs/read - Read Event Hubs
  • Microsoft.EventHub/namespaces/eventhubs/write - Create and update individual Event Hubs
  • Microsoft.EventHub/namespaces/eventhubs/consumergroups/read - Read consumer groups
  • Microsoft.EventHub/namespaces/eventhubs/consumergroups/write - Create and update consumer groups
  • Microsoft.EventHub/namespaces/eventhubs/authorizationRules/read - Read authorization rules
  • Microsoft.EventHub/namespaces/eventhubs/authorizationRules/write - Create and update access policies and connection strings

IoT Operations Permissions

Azure IoT Operations Core:

  • Microsoft.IoTOperations/instances/* - Deploy and manage IoT Operations instances
  • Microsoft.IoTOperations/instances/brokers/* - Configure MQTT brokers
  • Microsoft.IoTOperations/instances/brokers/listeners/* - Set up broker listeners
  • Microsoft.IoTOperations/instances/brokers/authentications/* - Configure authentication methods
  • Microsoft.IoTOperations/instances/dataflowEndpoints/* - Define data flow endpoints
  • Microsoft.IoTOperations/instances/dataflowProfiles/* - Create data flow profiles and dataflows

Device Registry:

  • Microsoft.DeviceRegistry/schemaRegistries/* - Manage schema registries for message validation
  • Microsoft.DeviceRegistry/assets/* - Register and manage edge assets
  • Microsoft.DeviceRegistry/assetEndpointProfiles/* - Configure asset endpoint profiles

Arc-Enabled Kubernetes:

  • Microsoft.Kubernetes/connectedClusters/read - Read Arc-enabled cluster information
  • Microsoft.KubernetesConfiguration/extensions/* - Deploy and manage cluster extensions
  • Microsoft.ExtendedLocation/customLocations/resourceSyncRules/* - Configure resource synchronization

Observability Permissions

Monitoring and Logging:

  • Microsoft.OperationalInsights/workspaces/* - Create Log Analytics workspaces
  • Microsoft.Monitor/accounts/* - Manage Azure Monitor accounts for Prometheus metrics
  • Microsoft.Insights/dataCollectionRules/* - Define data collection rules
  • Microsoft.Insights/dataCollectionEndpoints/* - Configure data collection endpoints
  • Microsoft.Insights/dataCollectionRuleAssociations/* - Associate rules with resources
  • Microsoft.Insights/components/* - Create Application Insights resources

Grafana and Dashboards:

  • Microsoft.Dashboard/grafana/* - Provision managed Grafana instances for visualization
  • Microsoft.AlertsManagement/prometheusRuleGroups/* - Configure Prometheus alerting rules
  • Microsoft.OperationsManagement/solutions/* - Deploy monitoring solutions

Private Monitoring:

  • Microsoft.Insights/privateLinkScopes/* - Create private link scopes for secure monitoring
  • Microsoft.Insights/privateLinkScopes/scopedResources/* - Associate resources with private link scopes
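The permission categories above can be collected into a custom role definition. A hedged, abridged sketch: the role name, assignable scope, and file name are placeholders, and the `Actions` list shows only a representative subset — extend it with the remaining permissions in this section before real use:

```shell
# Write an abridged custom role definition covering the categories above.
# Role name and assignable scope are placeholders; extend "Actions" with the
# rest of the permissions listed in this section before using it for real.
cat > blueprint-deployer-role.json <<'EOF'
{
  "Name": "Blueprint Deployer (example)",
  "Description": "Least-privilege role for deploying edge blueprints",
  "Actions": [
    "Microsoft.Authorization/roleAssignments/read",
    "Microsoft.Authorization/roleAssignments/write",
    "Microsoft.ManagedIdentity/userAssignedIdentities/*",
    "Microsoft.KeyVault/vaults/read",
    "Microsoft.KeyVault/vaults/write",
    "Microsoft.IoTOperations/instances/*",
    "Microsoft.Kubernetes/connectedClusters/read",
    "Microsoft.KubernetesConfiguration/extensions/*"
  ],
  "NotActions": [],
  "AssignableScopes": ["/subscriptions/00000000-0000-0000-0000-000000000000"]
}
EOF

# Create the role (requires Owner or User Access Administrator rights);
# guarded for environments without the Azure CLI.
if command -v az >/dev/null 2>&1; then
  az role definition create --role-definition @blueprint-deployer-role.json
fi
```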

Detailed Deployment Workflow

Prerequisites

IMPORTANT: We strongly recommend using this project's integrated dev container to get started quickly on Windows-based systems; it also works well in nix-compatible environments.

Refer to the Environment Setup section in the Root README for detailed instructions on setting up your environment.

Ensure your Azure CLI is logged in and your subscription context is set correctly.
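The login and subscription check can be scripted. A hedged sketch (the subscription name is a placeholder, and the snippet is guarded so it degrades gracefully where `az` is not installed):

```shell
# Check that the Azure CLI is logged in and show the active subscription.
if command -v az >/dev/null 2>&1; then
  context="$(az account show --query name -o tsv 2>/dev/null || echo 'not-logged-in')"
  # Pin the target subscription explicitly (subscription name is a placeholder):
  # az account set --subscription "My Subscription"
else
  context="az-not-installed"
fi
echo "Active subscription context: $context"
```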

Getting Started and Deploying with Terraform

Note on Telemetry: If you wish to opt out of sending telemetry data to Microsoft when deploying Azure resources with Terraform, set the environment variable ARM_DISABLE_TERRAFORM_PARTNER_ID=true before running any terraform commands.

  1. Navigate to your chosen blueprint directory, as an example:

    # Navigate to the terraform directory
    cd ./full-single-node-cluster/terraform
  2. Set up required environment variables:

    • ARM_SUBSCRIPTION_ID -- The Azure Subscription ID target for this deployment (required to be set for the Terraform tasks below)
    # Dynamically get the Subscription ID or manually get and pass to ARM_SUBSCRIPTION_ID
    current_subscription_id=$(az account show --query id -o tsv)
    export ARM_SUBSCRIPTION_ID="$current_subscription_id"
  3. Generate a terraform.tfvars file using terraform-docs:

    # Generate the tfvars file
    terraform-docs tfvars hcl .

    If terraform-docs is not installed, you'll need to install it:

    # Install terraform-docs - macOS
    brew install terraform-docs
    
    # Install terraform-docs - Linux
    ./scripts/install-terraform-docs.sh

    Or visit the terraform-docs installation page for more options.

    The generated output will look similar to the following:

    # Required variables
    environment     = "dev"                 # Environment type (dev, test, prod)
    resource_prefix = "myprefix"            # Short unique prefix for resource naming
    location        = "eastus2"             # Azure region location
    # Optional (recommended) variables
    instance        = "001"                 # Deployment instance number

    Copy this output to a file named terraform.tfvars and fill in any required values. Update any optional values that you want to change as well.

    NOTE: To have Terraform automatically use your variables, you can name your tfvars file terraform.auto.tfvars. Terraform will use variables from any *.auto.tfvars files located in the same deployment folder.

  4. Initialize and apply Terraform:

    # Pulls down providers and modules, initializes state and backend
    terraform init -upgrade # Use '-reconfigure' if backend for tfstate needs to be reconfigured
    
    # Preview changes before applying
    terraform plan -var-file=terraform.tfvars  # Use -var-file if not using *.auto.tfvars file
    
    # Review resource change list, then deploy
    terraform apply -var-file=terraform.tfvars # Add '-auto-approve' to skip confirmation

    Note: To deploy to an existing resource group instead of creating a new one, add -var="resource_group_name=your-existing-rg" to your apply command.

  5. Wait for the deployment to complete. A successful run ends with a message like the following:

    Apply complete! Resources: *** added, *** changed, *** destroyed.
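Once the apply completes, the values declared in outputs.tf (see Terraform Architecture above) can be inspected with terraform output. A hedged sketch, guarded for machines without Terraform on PATH; the outputs.json file name is just an example:

```shell
# List the blueprint's declared outputs after a successful apply.
if command -v terraform >/dev/null 2>&1; then
  terraform output || true                       # human-readable list of all outputs
  terraform output -json > outputs.json || true  # machine-readable, e.g. for scripts
  status="listed"
else
  status="terraform-not-installed"
fi
echo "$status"
```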

Getting Started and Deploying with Bicep

Bicep provides an alternative Infrastructure as Code (IaC) approach that's native to Azure. Follow these steps to deploy blueprints using Bicep:

  1. Navigate to your chosen blueprint directory, as an example:

    # Navigate to the bicep directory
    cd ./full-single-node-cluster/bicep
  2. Use the Azure CLI to get the Custom Locations OID:

    # Get the custom locations OID and export it as an environment variable
    export CUSTOM_LOCATIONS_OID=$(az ad sp show --id bc313c14-388c-4e7d-a58e-70017303ee3b --query id -o tsv)
    
    # Verify the environment variable is set correctly
    echo $CUSTOM_LOCATIONS_OID
  3. Check that the Bicep CLI is installed or install it:

    # Verify Bicep installation (included in recent Azure CLI versions)
    az bicep version
    
    # If not installed:
    az bicep install
  4. Create a parameters file for your deployment:

    Generate a parameters file using the Azure CLI's Bicep parameter generation feature:

    # Generate the parameters file template
    az bicep generate-params --file main.bicep --output-format bicepparam --include-params all > main.bicepparam

    Edit the generated main.bicepparam file to customize your deployment parameters:

    // Parameters for full-single-node-cluster blueprint
    using './main.bicep'
    
    // Required parameters for the common object
    param common = {
      resourcePrefix: 'prf01a2'     // Keep short (max 8 chars) to avoid resource naming issues
      location: 'eastus2'            // Replace with your Azure region
      environment: 'dev'             // 'dev', 'test', or 'prod'
      instance: '001'                // For multiple deployments
    }
    
    // This is not optimal, to be replaced by KeyVault usage in future
    @secure()
    param adminPassword = 'YourSecurePassword123!' // Replace with a secure password
    
    // When customLocationsOid is required:
    param customLocationsOid = readEnvironmentVariable('CUSTOM_LOCATIONS_OID') // Read from environment variable
    
    // Any additional parameters with defaults, example:
    param resourceGroupName = 'rg-${common.resourcePrefix}-${common.environment}-${common.instance}'
    param shouldCreateAnonymousBrokerListener = false // Set to true only for dev/test environments
    param shouldInitAio = true // Deploy the Azure IoT Operations initial connected cluster resources
    param shouldDeployAio = true // Deploy an Azure IoT Operations Instance
    param useExistingResourceGroup = false // Set to true to use an existing resource group instead of creating a new one

    Note: When setting useExistingResourceGroup to true, ensure the resource group already exists or your deployment will fail.
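That precondition can be checked up front with az group exists. A hedged sketch (the resource group name is a placeholder; the `az` call is guarded so the script still runs where the CLI is unavailable):

```shell
# Confirm the target resource group exists before enabling useExistingResourceGroup.
rg_name="your-existing-rg"               # placeholder name
if command -v az >/dev/null 2>&1; then
  exists="$(az group exists --name "$rg_name")"   # prints 'true' or 'false'
else
  exists="false"                         # stubbed when az is unavailable
fi
if [ "$exists" = "true" ]; then
  echo "resource group '$rg_name' found"
else
  echo "resource group '$rg_name' not found; create it first or leave useExistingResourceGroup=false"
fi
```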

  5. (Optional) Determine available Azure locations:

    Navigate to the scripts directory:

    cd ../../../scripts

    Run the location-check.sh script:

    ./location-check.sh --blueprint {blueprint_name} --method bicep
  6. Deploy Resources with Bicep:

    # Deploy using the Azure CLI at the subscription level, keep deployment_name less than 8 characters:
    az deployment sub create --name {deployment_name} --location {location} --parameters ./main.bicepparam

    Note: When deploying with a customLocationsOid, ensure the CUSTOM_LOCATIONS_OID environment variable is set in your current shell session before running the deployment command.

  7. Monitor deployment progress:

    You can check the deployment status in the Azure portal or using the Azure CLI:

    # Get the resource group name (after deployment starts)
    RG_NAME="rg-{resource_prefix}-{environment}-{instance}"
    
    # List resources in the resource group
    az resource list --resource-group $RG_NAME -o table

Accessing Deployed Resources

After a successful deployment, verify that your resources were created correctly by listing everything in the resource group:

# Get the resource group name (after deployment starts)
RG_NAME="rg-{resource_prefix}-{environment}-{instance}"

# List resources in the resource group
az resource list --resource-group $RG_NAME -o table

Any Arc Connected Cluster Deployment

After a successful deployment, verify you can connect to the cluster and that there are pods:

# Get the arc connected cluster name after deployment, default looks like the following:
ARC_CONNECTED_CLUSTER_NAME="arck-{resource_prefix}-{environment}-{instance}"

# Access the Kubernetes cluster (in one prompt)
az connectedk8s proxy -n $ARC_CONNECTED_CLUSTER_NAME -g $RG_NAME

# View AIO resources (in a separate prompt)
kubectl get pods -n azure-iot-operations

# Check cluster node status
kubectl get nodes -o wide

Any Key Vault Stored Resources (Such as Scripts)

Some blueprints store resources (such as scripts) in Key Vault that you may want to retrieve locally for verification or later use. Check the deployment output for specifics on what to download.

The typical retrieval process, using cluster setup scripts as an example:

# Get the Key Vault name after deployment, default looks like the following:
KV_NAME="kv-{resource_prefix}-{environment}-{instance}"

# Retrieve scripts from Key Vault and save to local files
az keyvault secret show --name cluster-server-ubuntu-k3s --vault-name $KV_NAME --query value -o tsv > cluster-server-ubuntu-k3s.sh
az keyvault secret show --name cluster-node-ubuntu-k3s --vault-name $KV_NAME --query value -o tsv > cluster-node-ubuntu-k3s.sh

# Make scripts executable
chmod +x cluster-server-ubuntu-k3s.sh cluster-node-ubuntu-k3s.sh

Deployment Cleanup

It is recommended that you use either the Azure Portal or AZ CLI commands to delete deployed resources. If you've deployed a resource group with resources in it, the quickest way to clean up is to delete the resource group.

The following is an example using AZ CLI:

# Delete the resource group and all its resources
# (add '--yes --no-wait' to skip the confirmation prompt and return immediately)
az group delete --name "$RG_NAME"

Deployment Troubleshooting

Deployment duration for multi-node clusters will be longer than single-node deployments. Be patient during the provisioning process.

Terraform Troubleshooting

Terraform can fail if resources already exist in Azure but are not tracked in its state. Use the following commands to correct the Terraform state:

# List resources in the current state file
terraform state list

# Show details of a specific resource in state, example:
terraform state show 'module.edge_iot_ops.azurerm_arc_kubernetes_cluster_extension.iot_operations'

# Remove a resource from state (doesn't delete the actual resource), example:
terraform state rm 'module.edge_iot_ops.azurerm_arc_kubernetes_cluster_extension.iot_operations'

# Import an existing Azure resource into your Terraform state
# For an Arc-enabled K8s cluster (azurerm_arc_kubernetes_cluster), example:
terraform import 'module.edge_cncf_cluster.azurerm_arc_kubernetes_cluster.arc_cluster' \
  /subscriptions/{subscription_id}/resourceGroups/{resource_group}/providers/Microsoft.Kubernetes/connectedClusters/{cluster_name}

# Update state to match the real infrastructure without making changes
# (deprecated in newer Terraform versions in favor of 'terraform apply -refresh-only')
terraform refresh

# Remove all resources from state (useful when you want to start fresh without destroying resources)
terraform state rm $(terraform state list)

# Move a resource within your state (useful for refactoring), example:
terraform state mv 'module.old_name.azurerm_resource.example' 'module.new_name.azurerm_resource.example'

Bicep Troubleshooting

Bicep operates against the actual deployed resources in Azure; any correction to deployed resources must be made directly with Bicep or AZ CLI commands:

# Get deployment status and errors
az deployment sub show --name {deployment_name} --query "properties.error" -o json

# List all deployments at subscription level
az deployment sub list --query "[].{Name:name, State:properties.provisioningState}" -o table

# Get detailed information about a specific resource
az resource show --resource-group $RG_NAME --name {resource_name} --resource-type {resource_type}

# Delete a specific resource without deleting the whole resource group
az resource delete --resource-group $RG_NAME --name {resource_name} --resource-type {resource_type}

# View deployment operations for troubleshooting
az deployment sub operation list --name {deployment_name}

# Export a deployed resource group as a template for comparison or backup
az group export --name $RG_NAME > exported-resources.json

# Export all child resources of a specific parent resource (useful for nested resources)
az resource list --parent "subscriptions/{subscription_id}/resourceGroups/{resource_group}/providers/{provider}/{resource-type}/{parent-resource}" -o json > child-resources.json

Common Issues

  • Node joining failures: If worker nodes in a multi-node deployment fail to join the cluster, verify network connectivity between the VMs
  • Terraform timeouts: Multi-node deployments may require increased timeouts for resource creation; increase the timeout and retry the deployment
  • Arc-enabled Kubernetes issues: Arc connection issues may occur during the first deployment; retry the deployment
  • Custom Locations OID: Verify the correct OID for your tenant; it can vary between Azure AD instances and permission configurations
  • VM size availability: Ensure the chosen VM size is available in your selected region
  • Bicep deployment name too long: Keep the original deployment name roughly 5-8 characters long; this name is used as a prefix for additional deployments throughout the process
  • Resource name issues: Use only alphanumeric characters in the resource prefix and keep it at most 8 characters long. Additionally, pick a prefix that is likely to be unique, appending 4 unique characters if needed
  • Existing resource group issues: When using the existing resource group feature, make sure the resource group exists before deployment
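For the resource-prefix pitfalls above, a quick way to generate a short, unique, purely alphanumeric prefix (the `edg` base is an arbitrary example):

```shell
# Build a short, unique, alphanumeric resource prefix: a 3-character base plus
# 4 random hex characters stays within the 8-character limit.
base="edg"                                        # arbitrary example base
rand="$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')"
suffix="$(printf '%04x' "$rand")"
resource_prefix="${base}${suffix}"
echo "$resource_prefix"
```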

Common Terraform Commands

# Validate Terraform configuration
terraform validate

# Format Terraform files
terraform fmt

# Preview changes before applying
terraform plan -var-file=terraform.tfvars

# Clean up deployed resources
terraform destroy -var-file=terraform.tfvars

State Management

By default, Terraform state is stored locally. For team environments, consider configuring a remote backend:

terraform {
  backend "azurerm" {
    resource_group_name  = "terraform-state-rg"
    storage_account_name = "terraformstate"
    container_name       = "tfstate"
    key                  = "blueprint.tfstate"
  }
}
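The backend block above assumes the state storage already exists. A hedged sketch of provisioning it with the Azure CLI, reusing the placeholder names from the block (storage account names must be globally unique, so adjust before running; the `az` calls are guarded for machines without the CLI):

```shell
# Provision the remote-state storage referenced by the backend block above.
state_rg="terraform-state-rg"
state_sa="terraformstate"        # likely already taken: append unique characters
state_container="tfstate"

if command -v az >/dev/null 2>&1; then
  az group create --name "$state_rg" --location eastus2
  az storage account create --name "$state_sa" --resource-group "$state_rg" --sku Standard_LRS
  az storage container create --name "$state_container" --account-name "$state_sa"
fi
```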

Industry Solutions

The blueprints in this repository can be used to implement a variety of industry solutions across different pillars. We are actively working toward blueprints for each of these scenarios.

For a detailed list of industry pillars and scenarios, please see the Industry Scenarios and Platform Capabilities document.

Getting Started with Your Industry Solution

  1. Identify the industry scenario from the Industry Scenarios and Platform Capabilities document that best matches your requirements
  2. Select the appropriate blueprint based on your infrastructure needs (single node or multi-node)
  3. Follow the deployment instructions in this document
  4. After deployment, customize the solution with additional components specific to your industry scenario

For more information about implementing specific industry solutions, please contact the solution team.


🤖 Crafted with precision by ✨Copilot following brilliant human instruction, then carefully refined by our team of discerning human reviewers.