Skip to content

az login: intermittent "No subscriptions found" with OIDC service-principal login — transient empty ARM subscription list (HTTP 200) #33480

@varga-istvan-nx

Description

@varga-istvan-nx

Describe the bug

When authenticating a service principal via OIDC/federated token (no secret), az login intermittently fails with:

Attempting Azure CLI login by using OIDC...
Error: No subscriptions found for ***.

Re-running the exact same command, unchanged, succeeds. We see this in ~10–20% of CI runs. Configuration (SP, tenant, subscription, RBAC) is stable and correct.

This is the CLI-side counterpart of the widely-reported GitHub Actions symptom in Azure/login#592 (where @isra-fel is investigating) and the retry request in Azure/login#591. Since azure/login just shells out to az login, I believe the behaviour originates here, in Profile.login().

Root cause analysis

In src/azure-cli-core/azure/cli/core/_profile.py, Profile.login() calls SubscriptionFinder.find_using_specific_tenant() → a single client.subscriptions.list(). I believe ARM transiently returns a successful HTTP 200 with an empty value[] (not an error), because newly-effective access can lag due to ARM/RBAC propagation ("up to 10 minutes," with ARM caching — see Troubleshoot Azure RBAC). The empty list then trips:

if not subscriptions and not allow_no_subscriptions:
    ...
    raise CLIError("No subscriptions found for {}.".format(username))

Crucially, the azure-core RetryPolicy cannot mitigate this: is_retry() returns False for any status_code < 400 and never inspects the body, so a 200-with-empty-list is terminal to the HTTP retry layer. The retry has to live at the application layer, where the empty result is evaluated.

Caveat: I've verified each link in the chain — empty-200 semantics (Subscriptions - List), the RetryPolicy behaviour (azure-core _retry.py), and the documented RBAC propagation latency — from primary sources, but I have not found an MS doc attributing this exact intermittent-OIDC symptom to the race verbatim; it is inferred and corroborated by Azure/login#592.

Proposed fix

A bounded retry of subscription discovery, firing only on the exact path that would otherwise raise (not subscriptions and not is_bare_mode and not allow_no_subscriptions), with exponential backoff + jitter and an environment variable to tune/disable it. This adds zero latency for --allow-no-subscriptions and --skip-subscription-discovery logins (excluded by construction). I have a worked-out patch design and unit tests ready to implement, and will open a PR as soon as the approach/layer is confirmed.

Question for maintainers

Before I open the PR: do you agree the fix belongs in azure-cli (Profile.login) rather than in azure/login? And is a small, env-tunable retry-on-empty acceptable here, or would you prefer a different approach (e.g. surfacing a clearer transient error)?

Expected behavior

A transient empty subscription response from ARM immediately after token issuance should not fail az login; it should be retried briefly before giving up, so legitimate logins do not fail ~10–20% of the time.

Environment summary

Metadata

Metadata

Labels

Accountaz login/accountAuto-AssignAuto assign by botAzure CLI TeamThe command of the issue is owned by Azure CLI teamLoginact-identity-squadcustomer-reportedIssues that are reported by GitHub users external to the Azure organization.questionThe issue doesn't require a change to the product in order to be resolved. Most issues start as that

Type

No type
No fields configured for issues without a type.

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions