Understanding the Azure Image Builder ACI Failure

Azure Image Builder (AIB) relies on Azure Container Instances (ACI) to run customization and validation steps under the hood. When you hit the dreaded "internal error occurred", or the container instance gets stuck and never reaches the Running state, the cause is usually a network restriction, an unregistered resource provider, or a permission gap on the managed identity.

Let's walk through the most common root causes and how to resolve each one so your image building pipeline runs more reliably.

1. Register the Microsoft.ContainerInstance Resource Provider

Before AIB can spin up container instances for validation or customization, the Microsoft.ContainerInstance resource provider must be registered in your Azure subscription. If it isn't, the build can fail with a generic internal error rather than a message that names the missing provider.

az provider register --namespace Microsoft.ContainerInstance
az provider show -n Microsoft.ContainerInstance --query registrationState
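
Registration is asynchronous, so the state may read Registering for a few minutes after the first command returns. A minimal sketch of a polling helper follows; the function name wait_for_provider and its default attempt count and delay are illustrative assumptions, not part of the Azure CLI.

```shell
# Poll the provider's registrationState until it reads "Registered".
# wait_for_provider is a hypothetical helper; the defaults (30 tries,
# 10 seconds apart) are arbitrary and can be tuned.
wait_for_provider() {
  namespace="$1"
  attempts="${2:-30}"
  delay="${3:-10}"
  state=""
  i=0
  while [ "$i" -lt "$attempts" ]; do
    state="$(az provider show --namespace "$namespace" \
      --query registrationState --output tsv)"
    if [ "$state" = "Registered" ]; then
      echo "$namespace is registered"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "$namespace is still in state: $state" >&2
  return 1
}

# Example call (run after `az provider register`):
#   wait_for_provider Microsoft.ContainerInstance
```

The helper returns a non-zero status if the provider never reaches Registered, so a pipeline step can stop before it starts an image build that would fail anyway.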

2. Assign Correct Permissions to the User-Assigned Managed Identity

Azure Image Builder uses a User-Assigned Managed Identity to provision resources in the staging resource group (usually prefixed with IT_). If this identity lacks permissions, it cannot create or configure the container instances.

  • Ensure the managed identity has the Contributor role on the specific resource group where you are building the image, or on the subscription if a narrower scope is not practical.
  • If using a custom Virtual Network (VNet), the identity must have the Network Contributor role on the VNet/Subnet so it can join the container instance to your network.
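
The two role assignments above can be sketched with the Azure CLI as follows. This is an illustration under assumptions: the function name grant_aib_roles and all argument values are hypothetical, and the scopes must be replaced with the resource IDs from your own environment.

```shell
# Assign Contributor on the build scope and, optionally, Network Contributor
# on a VNet or subnet. grant_aib_roles is a hypothetical helper name.
grant_aib_roles() {
  identity_rg="$1"     # resource group that holds the managed identity
  identity_name="$2"   # name of the user-assigned managed identity
  build_scope="$3"     # resource ID of the build resource group (or subscription)
  vnet_scope="${4:-}"  # optional: resource ID of the VNet or subnet

  # Look up the identity's principal (object) ID.
  principal_id="$(az identity show \
    --resource-group "$identity_rg" \
    --name "$identity_name" \
    --query principalId --output tsv)" || return 1

  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Contributor" \
    --scope "$build_scope" || return 1

  # Only needed when the build runs inside a custom virtual network.
  if [ -n "$vnet_scope" ]; then
    az role assignment create \
      --assignee-object-id "$principal_id" \
      --assignee-principal-type ServicePrincipal \
      --role "Network Contributor" \
      --scope "$vnet_scope" || return 1
  fi
}

# Example call (placeholder values):
#   grant_aib_roles MyIdentityRg MyAibIdentity \
#     "/subscriptions/{sub-id}/resourceGroups/MyResourceGroup"
```

Passing the principal ID together with --assignee-principal-type avoids a directory lookup, which tends to help when the identity was created only moments earlier.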

3. Configure Subnet Delegation for ACI

If you are running your build inside a private virtual network, Azure Container Instances requires a dedicated subnet delegated specifically to container groups. If the subnet is not delegated properly, the container instance will fail to start.

To delegate a subnet to ACI via Azure CLI:

az network vnet subnet update \
  --resource-group MyResourceGroup \
  --vnet-name MyVnet \
  --name MyACISubnet \
  --delegations Microsoft.ContainerInstance/containerGroups

Note: Ensure this subnet has a sufficient range of IP addresses (at least /29) and that no restrictive Network Security Group (NSG) rules block outbound traffic to Azure Storage or Microsoft Entra ID (formerly Azure Active Directory).
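
To confirm that the delegation took effect, the subnet's delegations can be read back. The sketch below assumes the same placeholder names used in the command above; the function name subnet_delegated_to_aci is hypothetical.

```shell
# Succeeds (exit status 0) only if the subnet lists the ACI delegation.
# subnet_delegated_to_aci is a hypothetical helper name.
subnet_delegated_to_aci() {
  az network vnet subnet show \
    --resource-group "$1" \
    --vnet-name "$2" \
    --name "$3" \
    --query "delegations[].serviceName" --output tsv \
    | grep -Fxq "Microsoft.ContainerInstance/containerGroups"
}

# Example call (placeholder values):
#   subnet_delegated_to_aci MyResourceGroup MyVnet MyACISubnet \
#     && echo "delegated" || echo "not delegated"
```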

4. Define a Custom Staging Resource Group

By default, Azure Image Builder dynamically creates a staging resource group (e.g., IT_rgName_templateName_uuid) and deletes it later. Sometimes, subsequent runs fail because of orphaned resources or naming collisions. You can prevent this by explicitly defining a dedicated staging resource group in your ARM or Bicep template:

{
  "type": "Microsoft.VirtualMachineImages/imageTemplates",
  "apiVersion": "2022-07-01",
  "properties": {
    "stagingResourceGroup": "/subscriptions/{sub-id}/resourceGroups/my-custom-aib-staging-rg",
    ...
  }
}

With a fixed staging resource group, you can assign permissions once, ahead of time, and AIB gets a predictable environment in which to spin up validation containers.
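
Because orphaned resources are one of the failure causes mentioned above, it can help to create the staging resource group up front and check that nothing is left in it before each run. The sketch below is one way to do that; the function name prepare_staging_rg is hypothetical, and the premise that the group should be empty before a build is an assumption to verify against the current AIB documentation.

```shell
# Create the staging resource group if needed, then fail if it already
# contains resources. prepare_staging_rg is a hypothetical helper name.
prepare_staging_rg() {
  rg_name="$1"
  location="$2"

  # Returns the existing group when it is already present in this location.
  az group create --name "$rg_name" --location "$location" --output none || return 1

  # Count the resources currently in the group.
  count="$(az resource list --resource-group "$rg_name" \
    --query "length(@)" --output tsv)" || return 1

  if [ "$count" -ne 0 ]; then
    echo "$rg_name contains $count leftover resource(s)" >&2
    return 1
  fi
  echo "$rg_name is ready"
}

# Example call (placeholder values):
#   prepare_staging_rg my-custom-aib-staging-rg westeurope
```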

5. Check Regional Quotas and ACI Availability

ACI capacity and supported SKUs vary by region and availability zone. If your Azure Image Builder template is deployed in a capacity-constrained region, or in one where ACI availability is limited, container creation may time out. Try deploying your AIB template in an alternate region close to your primary resources.
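
A quick first check is whether the container group resource type is offered in the target region at all. The sketch below reads the provider's location list; the function name aci_available_in is hypothetical, and the check says nothing about current capacity or quota, only whether the region is listed.

```shell
# Succeeds (exit status 0) if the region appears in the location list for
# Microsoft.ContainerInstance/containerGroups.
# aci_available_in is a hypothetical helper name.
aci_available_in() {
  region="$1"   # region display name, for example "West Europe"
  az provider show --namespace Microsoft.ContainerInstance \
    --query "resourceTypes[?resourceType=='containerGroups'].locations | [0]" \
    --output tsv \
    | grep -Fxiq "$region"
}

# Example call:
#   aci_available_in "West Europe" && echo "listed" || echo "not listed"
```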