The Azure extraction functionality is now available as a module in src/backend/.
The easiest way to authenticate with Azure is using Azure CLI:
# Install Azure CLI (macOS)
brew install azure-cli
# Login to Azure (opens browser for authentication)
az login
# Verify you're logged in
az account show
After logging in, Azure credentials will be automatically used by the Python SDK.
Make sure you have your .env file with Azure credentials:
AZURE_DOC_KEY=your_azure_key_here
AZURE_EXISTING_AIPROJECT_ENDPOINT=your_endpoint_here
AZURE_EXISTING_AGENT_ID=your_agent_id_here
AZURE_SUBSCRIPTION_ID=your_subscription_id_here
Note: If you've run az login, you may not need all these variables depending on which Azure services you're using.
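To check which of the variables listed above are actually set, a small standard-library sketch like the following can help. The function names `parse_env_text` and `missing_vars` are hypothetical and are not part of src/backend; the parser only handles plain KEY=VALUE lines, which is an assumption about how your .env file is written.

```python
# Variable names copied from the .env listing above.
EXPECTED_VARS = (
    "AZURE_DOC_KEY",
    "AZURE_EXISTING_AIPROJECT_ENDPOINT",
    "AZURE_EXISTING_AGENT_ID",
    "AZURE_SUBSCRIPTION_ID",
)


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments.

    Hypothetical helper: it does not handle multi-line values or 'export'.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(values: dict, expected=EXPECTED_VARS) -> list:
    """Return the expected names that are absent or empty, in listing order."""
    return [name for name in expected if not values.get(name)]
```

A typical call would be `missing_vars(parse_env_text(open(".env").read()))`. As the note above says, an empty result is not required for every setup, so treat the output as a hint rather than an error.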
# Install dependencies
pip install -r requirements.txt
# Run from the command line
python src/backend/extractData.py front.jpg back.jpg
# Or import the functions in your own code
from src.backend import get_ocr_lines, parse_front, parse_back, print_results
# Extract data
lines = get_ocr_lines("front.jpg")
front_data = parse_front(lines)
# Access fields
account = front_data.get("account_number")
total = front_data.get("invoice_total_eur")
python example_usage.py
az login
Then run your scripts normally - authentication is handled automatically.
Set these in your .env file for service principal authentication:
AZURE_TENANT_ID=your_tenant_id
AZURE_CLIENT_ID=your_client_id
AZURE_CLIENT_SECRET=your_client_secret
Some Azure services support direct API key authentication via AZURE_DOC_KEY.
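The three options above (Azure CLI login, service principal variables, API key) can be told apart by looking at the environment. The sketch below is a hypothetical diagnostic, not code from src/backend: the function name `describe_auth_mode` and the precedence order (service principal first, then API key, then CLI login) are assumptions made for illustration.

```python
import os

# Names taken from the service principal listing above.
SERVICE_PRINCIPAL_VARS = (
    "AZURE_TENANT_ID",
    "AZURE_CLIENT_ID",
    "AZURE_CLIENT_SECRET",
)


def describe_auth_mode(env=None) -> str:
    """Report which authentication option the environment suggests.

    Hypothetical helper; the precedence order is an assumption.
    """
    env = os.environ if env is None else env
    # All three service principal variables must be non-empty.
    if all(env.get(name) for name in SERVICE_PRINCIPAL_VARS):
        return "service_principal"
    if env.get("AZURE_DOC_KEY"):
        return "api_key"
    # Nothing set: fall back to whatever az login provided.
    return "azure_cli"


print(describe_auth_mode())
```

Printing the result before running an extraction script makes it easier to see which credentials are likely in play when a request is rejected.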
- get_ocr_lines(file_path) - Extract text lines from image using Azure
- parse_front(lines) - Parse front page data
- parse_back(lines) - Parse back page data
- print_results(front_data, back_data) - Pretty print extracted data
- parse_date(s) - Helper to parse dates
- after(lines, keyword, offset) - Helper to find text after keyword
- find_re(text, pattern) - Helper to find regex patterns
- Customer info (name, address, account)
- Invoice details (number, dates, total)
- Service period and consumption
- Meter readings
- All charge breakdowns (supply, regulated, misc, VAT, municipality)
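To give a feel for how helpers such as after, find_re, and parse_date can pull fields like these out of OCR lines, here is a minimal sketch. These are illustrative stand-ins with a `_sketch` suffix, not the implementations in src/backend; the default offset, the case-insensitive match, the DD/MM/YYYY date format, and the sample lines are all assumptions.

```python
import re
from datetime import datetime


def after_sketch(lines, keyword, offset=1):
    """Return the line `offset` positions after the first line containing keyword."""
    for i, line in enumerate(lines):
        if keyword.lower() in line.lower():
            j = i + offset
            # Guard against running past the end of the OCR output.
            return lines[j].strip() if 0 <= j < len(lines) else None
    return None


def find_re_sketch(text, pattern):
    """Return the first capture group (or the whole match) of pattern in text."""
    m = re.search(pattern, text)
    if m is None:
        return None
    return m.group(1) if m.groups() else m.group(0)


def parse_date_sketch(s, fmt="%d/%m/%Y"):
    """Parse a date string; return None when it does not match fmt (assumed format)."""
    try:
        return datetime.strptime(s.strip(), fmt).date()
    except ValueError:
        return None


# Made-up OCR lines, for illustration only.
sample = ["Account number", "12345678", "Invoice date", "03/02/2024"]
print(after_sketch(sample, "account number"))
print(find_re_sketch("Total: 45.10 EUR", r"(\d+\.\d{2})"))
print(parse_date_sketch(after_sketch(sample, "invoice date")))
```

Returning None instead of raising keeps a single unreadable field from aborting the whole parse, which suits noisy OCR output; the real helpers may behave differently.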
If you see DefaultAzureCredential failed to retrieve a token:
- Run az login to authenticate
- Verify with az account show
- Make sure your account has permissions on the Azure resources
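The first two checks above can also be run from Python before a script starts. This is a sketch under assumptions: the function name `az_login_status` is hypothetical, and it treats a non-zero exit code from az account show as "not logged in". On Windows the CLI is a .cmd wrapper, so the subprocess call may need adjusting there.

```python
import shutil
import subprocess


def az_login_status(run=subprocess.run, which=shutil.which) -> str:
    """Return 'missing_cli', 'logged_in', or 'not_logged_in'.

    Hypothetical helper. `run` and `which` are injectable so the logic
    can be exercised without the Azure CLI installed.
    """
    if which("az") is None:
        return "missing_cli"
    # Assumption: a non-zero exit code means there is no active login.
    result = run(["az", "account", "show"], capture_output=True, text=True)
    return "logged_in" if result.returncode == 0 else "not_logged_in"
```

Calling `az_login_status()` with no arguments uses the real CLI. It only covers the login state; missing permissions on a resource still have to be checked in the portal.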
If you see permission errors, you need the appropriate role:
- Go to Azure Portal
- Navigate to your resource
- Access Control (IAM) → Add role assignment
- Assign "Azure ML Data Scientist" or "Contributor" role
Now all your other code is preserved, and you can easily import and use the extraction functions!