Securing Applications and Automation - Training Series
Section 5 of 6
Reference Guide

Maintenance and Management

Source Control, Observability, AI-Assisted Development, and Service Principal Hygiene


Section: Section 5 - Maintenance and Management
Document Type: Reference Guide
Format: Instructor-led with guided lab
Audience: Infrastructure and security administrators

Section purpose

Automation that runs silently and fails silently is worse than no automation, because administrators assume it is working when it is not. This section covers the operational discipline required to keep automation workloads maintainable and observable over time: source control, CI/CD deployment, safe use of AI-assisted development, monitoring and alerting, and service principal lifecycle hygiene. These practices apply across all the platforms covered in Section 4.

Learning Objectives

[Figure: the operational excellence pillars: Maintain (source control), Monitor (Log Analytics), Alert (Azure Monitor), Rotate (SP credentials). Configure all of them before production.]

Learning journey - the four operational excellence pillars covered in this section

Source control and version management

[Figure: timeline with markers at Day 0, Day 18, and Day 30. The CA exclusion runbook starts failing on every run after a Graph permission is revoked. 18 days of silence follow: no alerts, no visibility, no action. When break-glass access is needed, the CA exclusion is missing and emergency access is blocked. Takeaways: silent failure is worse than no automation; monitor before production, not after. Every section is motivated by this scenario.]

Opening scenario - 18 days of silent failure on a CA exclusion runbook

- Check the portal manually?
- Wait for someone to complain?
- Are any alerts configured?
- Were there any silent failures in the last 30 days?

Discussion prompt - how would you detect a silent automation failure today?

Without source control:
- No change history: who changed what?
- Portal edits are invisible: no review, no record.
- No rollback path: a bad deploy means a manual fix.
- No peer review: errors ship to production.

With source control:
- Full audit history: the Git log shows every change.
- The portal is read-only: only CI/CD deploys.
- Instant rollback: git revert in seconds.
- PR review enforced: four eyes before production.

Golden rule: treat the Azure portal as read-only for production automation.

Source control as a security requirement - the with/without comparison

  1. Code edit (Git branch) - developer
  2. Pull request (peer review) - team review
  3. CI validation (lint + tests) - PSScriptAnalyzer
  4. Merge to main (auto-trigger) - protected branch
  5. Deploy to Azure (WIF, no secrets) - auditable and reproducible

CI/CD pipeline - the five stages from code edit to deploy

Why source control is required

Without source control, automation code exists only in the Azure portal or in a local file on someone's workstation. There is no change history, no peer review, no rollback capability, and no audit trail. Portal-edited runbooks are the automation equivalent of ungoverned infrastructure: they change without anyone knowing, they break without anyone understanding why, and they cannot be recovered to a known-good state.

Rule: treat the Azure portal as read-only for production automation code. All changes must go through a pull request in a Git repository.

What belongs in source control

repo/ (automation-workloads)
  runbooks/     .ps1, .py     (ca-exclusion.ps1, graph-query.py)
  pipelines/    .yml, .yaml
  infra/        .bicep        (automation.bicep, logicapp.bicep)
  logic-apps/   workflow.json
  tests/        Pester, .pssa

NOT in source control: secrets, credentials, .env files. Use Key Vault references instead. A portal edit is an ungoverned change.

Repository layout - what belongs (and what does not) in source control

Automation Account source control sync

Automation Accounts support native source control integration with GitHub and Azure DevOps. Configure sync to pull from a specific branch. The sync copies runbook files from the repository to the Automation Account automatically when changes are merged.

Important limitation: native source control sync has incomplete support for PowerShell 7.x runbooks. Teams using PS7 must deploy via a CI/CD pipeline instead.

Two sync approaches:
- Native source control sync (GitHub / Azure DevOps): scheduled pull from the main branch into the Automation Account, easy setup, limited PS7 support, good for PS5.1 runbooks.
- CI/CD pipeline deploy (GitHub Actions / Azure DevOps): validate, then deploy on merge; full PS7 support; preferred for all teams.

Rule: the portal is read-only for production runbooks. All changes go through Git, are validated, and are deployed by the pipeline. Portal edits are not recommended; enforce read-only as a team rule.

Automation Account source control sync - native sync vs CI/CD pipeline

CI/CD for runbook deployment

A CI/CD pipeline provides what native sync cannot: pre-deployment validation.

Basic GitHub Actions pipeline for runbook deployment:

name: Deploy Runbook
on:
  push:
    branches: [main]
    paths: ['runbooks/**']

permissions:
  id-token: write
  contents: read

jobs:
  validate-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run PSScriptAnalyzer
        shell: pwsh
        run: |
          Install-Module PSScriptAnalyzer -Force -Scope CurrentUser
          $results = Invoke-ScriptAnalyzer -Path ./runbooks -Recurse -Severity Error,Warning
          if ($results) { $results; exit 1 }

      - uses: azure/login@v2
        with:
          client-id: ${{ vars.AZURE_CLIENT_ID }}
          tenant-id: ${{ vars.AZURE_TENANT_ID }}
          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}

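      - name: Deploy runbook
        run: |
          # The automation commands ship in an Azure CLI extension.
          az extension add --name automation --upgrade
          az automation runbook replace-content \
            --resource-group ${{ vars.RG_NAME }} \
            --automation-account-name ${{ vars.AA_NAME }} \
            --name MyRunbook \
            --content @./runbooks/MyRunbook.ps1
          # replace-content updates the draft only; publish so jobs run the new version.
          az automation runbook publish \
            --resource-group ${{ vars.RG_NAME }} \
            --automation-account-name ${{ vars.AA_NAME }} \
            --name MyRunbook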

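This pipeline runs PSScriptAnalyzer before deploying. Any result of severity Error or Warning fails the pipeline and blocks deployment. The azure/login step authenticates with workload identity federation (OIDC), which is why the workflow requests id-token: write and stores no client secret.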

  1. Commit a runbook with a PSScriptAnalyzer violation (for example, $pass = "hardcoded123", flagged as PSAvoidUsingPlainText...).
  2. The pipeline runs the PSScriptAnalyzer check and is blocked.
  3. Fix the violation and recommit; the pipeline passes.
  4. Deploy to Azure with az automation runbook.

CI/CD demo flow - violation blocks pipeline, fix unblocks deployment

Infrastructure as Code for automation resources

Define Automation Accounts, Logic Apps, Function Apps, RBAC role assignments, and managed identities in Bicep. Store IaC in the same repository as the code. This enables consistent, repeatable deployments and prevents configuration drift between environments.

AI-assisted development

  1. Hardcoded credentials. AI suggests: $clientSecret = "abc123...". Fix: use a managed identity or Key Vault. PSScriptAnalyzer rule: PSAvoidUsingPlainTextForPassword.
  2. Overly permissive scopes. AI suggests .ReadWrite.All to avoid permission errors. Fix: start with .Read and add only what the task needs. Review: does the task actually write data?
  3. Missing error handling. AI ignores 401/403 responses in token acquisition code. Fix: check status codes and surface failures to logging. Silent auth failures lead to silent automation failures.
  4. Insecure token storage. AI suggests writing tokens to logs or environment variables. Fix: keep tokens in memory only and never log them. A token in a log is a credential in a log.

The four common weaknesses in AI-assisted automation code

Safe sandboxes

When using GitHub Copilot, Claude, or other AI coding assistants to write automation:

- Treat AI-generated code as untrusted input and put it through peer code review.
- Test in a dev/test sandbox: a separate subscription or resource group, and an M365 developer tenant for identity code. Never test in production.
- No secrets in AI prompts (tokens, passwords).
- No production data in AI prompts (tenant IDs, PII).
- No direct production deployment: review, then test first.

Safe sandboxes for AI-assisted coding - review, sandbox, never paste production data

What AI-generated code commonly gets wrong

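AI models tend to generate code with predictable security weaknesses. Review every AI-generated script for hardcoded credentials, overly permissive scopes, missing error handling, and insecure token storage.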
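
To illustrate the error-handling and token-storage points, here is a minimal Python sketch of the kind of check a reviewer might ask for around token acquisition. The function name extract_access_token, the TokenAcquisitionError class, and the shape of payload (a parsed JSON body with an access_token key) are assumptions made for this example, not part of any specific SDK.

```python
import logging

logger = logging.getLogger("automation.auth")


class TokenAcquisitionError(RuntimeError):
    """Raised when the token endpoint does not return a usable token."""


def extract_access_token(status_code, payload):
    """Validate a token response and return the access token.

    status_code: HTTP status returned by the token endpoint.
    payload: parsed JSON body as a dict (assumed shape for this sketch).
    """
    if status_code in (401, 403):
        # Surface auth failures instead of continuing without a token.
        logger.error("Token request rejected with HTTP %s", status_code)
        raise TokenAcquisitionError(f"authentication rejected (HTTP {status_code})")
    if status_code != 200:
        logger.error("Token request failed with HTTP %s", status_code)
        raise TokenAcquisitionError(f"unexpected status (HTTP {status_code})")

    token = payload.get("access_token")
    if not token:
        logger.error("Token response did not contain an access token")
        raise TokenAcquisitionError("no access token in response")

    # Log the event, never the token value itself.
    logger.info("Access token acquired")
    return token
```

The token is returned to the caller and held in memory only; nothing in the function writes it to a log, a file, or an environment variable.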

PSScriptAnalyzer for PowerShell code review

PSScriptAnalyzer is a static analysis tool for PowerShell. Run it on every runbook before deployment:

Install-Module PSScriptAnalyzer -Scope CurrentUser

# Analyze a single runbook
Invoke-ScriptAnalyzer -Path .\MyRunbook.ps1 -Severity Error, Warning

# Analyze a directory recursively
Invoke-ScriptAnalyzer -Path .\runbooks -Recurse -Severity Error, Warning

# Include specific security rules
Invoke-ScriptAnalyzer -Path .\MyRunbook.ps1 `
  -IncludeRule PSAvoidUsingPlainTextForPassword, PSAvoidUsingConvertToSecureStringWithPlainText

PSScriptAnalyzer catches common security issues: plain-text passwords, use of deprecated cmdlets, missing error handling patterns, and code style violations that obscure intent.

[Figure: a runbook with a hardcoded secret on line 1 is analyzed with Invoke-ScriptAnalyzer -Severity Error,Warning. The PSAvoidUsingPlainTextForPassword violation blocks the pipeline. The developer fixes runbook.ps1, PSScriptAnalyzer runs again, and the pipeline deploys. Tools catch patterns, reviewers catch logic; both are required.]

PSScriptAnalyzer pipeline - block on violation, deploy on clean run

Bandit for Python

Bandit plays a similar role for Python: it is a static analyzer that looks for common security issues. Run it on Python runbooks and Function App scripts:

pip install bandit
bandit -r ./scripts/ -ll   # report medium and high severity

[Figure: Bandit as a CI pipeline step. Example checks: hardcoded passwords (B105 / B106), eval() use (B307), insecure random (B311). The sample report lists each issue with a severity and a confidence level, followed by a total issue count. Same principle as PSScriptAnalyzer: run in CI, block on HIGH, fix before deploy.]

Bandit for Python - same CI gate pattern for Python runbooks
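
The "block on HIGH" gate can be sketched as a small Python step that reads a Bandit JSON report (for example, one written with bandit -r ./scripts/ -f json -o bandit-report.json) and decides whether to fail the build. The helper names blocking_issues and gate and the report filename are assumptions for this example; the results, issue_severity, test_id, filename, and line_number keys follow Bandit's JSON formatter.

```python
import json

SEVERITY_ORDER = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}


def blocking_issues(report, min_severity="HIGH"):
    """Return the findings at or above min_severity from a parsed Bandit report."""
    floor = SEVERITY_ORDER[min_severity]
    return [
        issue
        for issue in report.get("results", [])
        if SEVERITY_ORDER.get(str(issue.get("issue_severity", "")).upper(), 0) >= floor
    ]


def gate(report_path, min_severity="HIGH"):
    """Return a process exit code: 1 if the report contains blocking findings."""
    with open(report_path, encoding="utf-8") as handle:
        report = json.load(handle)
    blockers = blocking_issues(report, min_severity)
    for issue in blockers:
        # Print enough context for the developer to find the line.
        print(f"{issue.get('test_id')} {issue.get('filename')}:{issue.get('line_number')}")
    return 1 if blockers else 0
```

Bandit's own exit code already fails a step when it reports issues at the selected level, so a script like this is only worth adding when the team wants to report every finding but block on a subset. In that setup the report step would run with --exit-zero and the pipeline would end with sys.exit(gate("bandit-report.json")).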

Peer review as a security control

Static analysis tools catch syntax and pattern issues. Peer review catches logic errors, incorrect permission scope choices, and missing security considerations that tools cannot detect. Require at least one reviewer for all automation code changes, even in small teams.

Monitoring automation health

- Platform telemetry: Automation Accounts (job history + diagnostic settings), Logic Apps (run history + diagnostic settings), Function Apps (Application Insights execution telemetry).
- Shared monitoring: Log Analytics workspace (centralized logs and queries), Azure Monitor alerts (failure, missing-run, and metric alerts).
- Response destinations: Microsoft Sentinel (SIEM / SOAR), Teams / email / tickets (action groups and connectors).

Monitoring Stack

The silent failure problem

Automation that runs silently and fails silently is operationally dangerous. An emergency access CA exclusion runbook that has been failing for 18 days looks fine from the outside - until a major incident occurs and the break-glass account does not have the expected policy exclusions. The failure was invisible because there were no alerts.

Monitoring must be configured on every automation workload before it is deployed to production.
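
One way to reason about what "monitored" means is to treat both failed runs and missing runs as alert conditions. The Python sketch below assumes job records have already been exported as (finished_at, status) pairs. The function name health_findings, the record shape, the status strings, and the 24-hour expected interval are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def health_findings(jobs, now, expected_interval=timedelta(hours=24)):
    """Return alert reasons for one runbook.

    jobs: iterable of (finished_at, status) tuples, e.g. "Completed" or "Failed".
    now: timezone-aware datetime used as the evaluation time.
    """
    findings = []
    window_start = now - expected_interval
    recent = [(ts, status) for ts, status in jobs if ts >= window_start]

    failed = [ts for ts, status in recent if status == "Failed"]
    if failed:
        findings.append(f"{len(failed)} failed run(s) in the last interval")

    if not any(status == "Completed" for _, status in recent):
        # Covers both "never ran" and "ran but never succeeded".
        findings.append("no successful run in the last interval")

    return findings


# Example: a single failed run two hours before the evaluation time.
now = datetime(2024, 3, 15, 12, 0, tzinfo=timezone.utc)
print(health_findings([(now - timedelta(hours=2), "Failed")], now))
```

The second check matters for the opening scenario: a runbook that stops being scheduled produces no failure records at all, so a failure-only alert would stay quiet.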

Automation Account monitoring

// Find failed runbook jobs in the last 24 hours
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.AUTOMATION"
| where Category == "JobLogs"
| where ResultType == "Failed"
| where TimeGenerated > ago(24h)
| project TimeGenerated, RunbookName_s, ResultDescription
| order by TimeGenerated desc

[Figure: Automation Account (runbook jobs) → diagnostic settings (JobLogs + JobStreams) → Log Analytics workspace (centralized log store) → KQL query for failed jobs → Azure Monitor alert that fires on any failure result. Portal job history exists, but it does not alert you; configure this before production.]

Automation Account monitoring - diagnostic settings → Log Analytics → alert

- ResourceProvider == "MICROSOFT.AUTOMATION": scopes the query to Automation Account records only.
- Category == "JobLogs": job status records (not output).
- ResultType == "Failed": failures only, the important ones.
- ResultDescription: the description text recorded for the job.

Sample result: 2024-03-15, ca-exclusion-runbook, "Insufficient privileges to complete the operation".

Log Analytics KQL anatomy - each clause has a teaching purpose

[Figure: the Log Analytics workspace training-la-workspace in the Azure portal, running the failed-jobs query. The results show two ca-exclusion failures (2024-03-15 08:14 and 2024-03-14 08:14), each with "Insufficient privileges..." in the description column, which is the error message field. Prompt: without this query, how would you know the job failed if you weren't looking in the portal?]

Log Analytics portal view - failed job results with the error message field highlighted

Logic App run history monitoring

Logic App run history is visible in the portal under the workflow view. Enable diagnostic settings to send run history to Log Analytics, then query for failed runs:

// Find failed Logic App runs
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.LOGIC"
| where Category == "WorkflowRuntime"
| where status_s == "Failed"
| where TimeGenerated > ago(24h)
| project TimeGenerated, resource_runId_s, code_s, error_message_s
| order by TimeGenerated desc

Function App monitoring with Application Insights

Function App execution telemetry is captured in Application Insights. Query for failures:

// Application Insights - failed function executions
requests
| where success == false
| where timestamp > ago(24h)
| project timestamp, name, resultCode, duration, operation_Id
| order by timestamp desc

Enable Application Insights on every security automation Function App. Without it, failures are not queryable.

Application Insights telemetry for Function Apps (enable on every security Function App):
- Execution telemetry: start/end, duration, status.
- Error tracking: exceptions, stack traces.
- Dependency tracing: Graph API and Key Vault calls.
- Performance metrics: latency, throttling, p99.

Cost is minimal; the risk of no monitoring is not. Enable before deployment and never disable for production.

Application Insights for Function Apps - four telemetry pillars

Error alerting and runbook failure notifications

Configure Azure Monitor alerts so that automation failures create visible signals rather than silent voids.

Creating an alert for failed runbook jobs

  1. In the Azure portal, go to Monitor > Alerts > Create alert rule.
  2. Select the Automation Account as the scope.
  3. In Condition, select the Total Jobs metric, filter the Status dimension to Failed, and set the threshold to greater than 0.
  4. In Actions, configure an action group that sends an email or Teams notification.
  5. In Details, name the alert and set severity.

Alternatively, create a Log Analytics alert rule that queries job logs:

AzureDiagnostics
| where ResourceProvider == "MICROSOFT.AUTOMATION"
| where Category == "JobLogs"
| where ResultType == "Failed"

Set this as a scheduled query alert that runs every 5 minutes and fires when the query returns more than 0 results.

[Figure: a playbook is deployed and the deployment succeeds. Assumption: it works because the deploy succeeded. Months pass. A real incident fires and Sentinel triggers the playbook, which fails silently because a Logic App connection expired or permissions changed. Fix: use a synthetic test incident to simulate the trigger and verify the playbook end to end. Test before relying on it and retest after changes.]

Sentinel playbooks - the testing gap that makes them silently broken

- Scope: the Automation Account or the Log Analytics workspace.
- Condition: metric (failed jobs > 0) or KQL (ResultType == "Failed"), evaluated every 5 minutes.
- Action group: email, Teams webhook, ServiceNow ticket.
- Severity: Sev 1 for CA/break-glass runbooks, Sev 2 for standard jobs.

The same pattern applies to Logic Apps and Function Apps (diagnostic settings to Log Analytics, Application Insights KQL queries). One action group can serve all workloads: email the on-call team, post to a Teams channel webhook, and auto-create a ServiceNow ticket. Threshold: > 0 failures. One failure means one alert; there is no grace margin on security automation.

Azure Monitor alert rule anatomy - scope, condition, action group, severity

Connecting failures to incident management

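Wire failure events to your incident management workflow: route alerts through an action group to the on-call email, a Teams channel webhook, or an automatically created ServiceNow ticket.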

⚠️Risk - Untested Sentinel Playbooks

A Sentinel playbook that has never been tested with a synthetic incident may have been silently broken for months. Running a playbook only on real incidents means the first time you discover it is broken is during an actual security event.

Service principal naming conventions and ownership

[Figure: the Entra ID display name secops-signinmonitor-prod broken into its parts.]
- [team]: who owns it? Examples: secops, infra, devteam.
- [purpose]: what does it do? Examples: signinmonitor, caexclusion.
- [env]: where does it run? Examples: prod, dev, staging.

More examples: infra-bicepdeployer-prod, devops-wifconnector-dev, secops-orphandetector-prod. Weak example: app1 - MyApp - test123. A good name answers owner, purpose, and environment.

SP naming convention - one display name answers owner, purpose, environment

[Figure: app registration secops-signinmonitor-prod with owner jane.smith@contoso.com and a Notes field that reads: Owner: jane.smith@contoso.com; Ticket: SEC-2847; Purpose: Sign-in anomaly detection; Last reviewed: 2024-03-01. With no owner assigned, a registration is orphaned once its creator leaves the organization and becomes attack surface. The Notes field is returned in Graph API responses, so it feeds automated inventory and monthly reports. The Notes field must contain the owner email address, the ITSM ticket reference, a one-sentence purpose description, and the last reviewed date.]

Owner field plus Notes field - making every SP self-documenting

Why naming conventions matter

Without a naming standard, SP inventories become unmanageable at scale. When a service principal named App1 or test_new shows up in sign-in logs, nothing in the name identifies its owner, purpose, or environment. Orphan detection and lifecycle management both depend on being able to tell what an SP is for from its display name.

Recommended naming convention

Use the pattern [team]-[purpose]-[env], for example secops-signinmonitor-prod, infra-bicepdeployer-prod, or devops-wifconnector-dev.

This encodes team ownership, automation purpose, and target environment into every display name. Combined with the Notes field and owner assignment, it makes inventory and lifecycle management tractable at scale.
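
A hygiene check can enforce the pattern mechanically. The following Python sketch assumes lowercase alphanumeric segments and an allowed environment list of prod, dev, and staging; the function name parse_sp_name and both of those constraints are illustrative choices, so adjust them to your own standard.

```python
import re

# Hypothetical allowed environments; adjust to your own convention.
ALLOWED_ENVS = {"prod", "dev", "staging"}

# Three lowercase alphanumeric segments separated by single hyphens.
NAME_PATTERN = re.compile(
    r"^(?P<team>[a-z0-9]+)-(?P<purpose>[a-z0-9]+)-(?P<env>[a-z0-9]+)$"
)


def parse_sp_name(display_name):
    """Return (team, purpose, env) if the name follows the convention, else None."""
    match = NAME_PATTERN.match(display_name)
    if match is None or match.group("env") not in ALLOWED_ENVS:
        return None
    return match.group("team"), match.group("purpose"), match.group("env")


for name in ["secops-signinmonitor-prod", "app1 - MyApp - test123", "legacy-noname"]:
    print(name, "->", parse_sp_name(name))
```

Names that return None are candidates for renaming or for a closer look at who owns them.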

Notes field and owner requirement

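The app registration Notes field is queryable via Graph API and can be included in automated inventory reports. Use it to record the owner email address, the ITSM ticket reference, a one-sentence purpose description, and the last reviewed date.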

Every app registration must have at least one owner assigned in the Entra portal. Ownerless registrations should be flagged by automated hygiene checks. An SP whose creator left the organization becomes an orphan with no one responsible for it.
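
An automated hygiene check can verify that the Notes field is actually filled in. The sketch below is in Python and assumes the notes are written as one "Key: value" pair per line; that layout, the field labels, and the function names are assumptions for this example rather than anything Entra ID enforces.

```python
# Field labels expected in the Notes text (assumed convention, lowercase for matching).
REQUIRED_NOTE_FIELDS = ("owner", "ticket", "purpose", "last reviewed")


def parse_notes(notes):
    """Parse 'Key: value' lines into a dict keyed by lowercase field name."""
    fields = {}
    for line in (notes or "").splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip().lower()] = value.strip()
    return fields


def missing_note_fields(notes):
    """Return the required fields that are absent or empty in the Notes text."""
    fields = parse_notes(notes)
    return [name for name in REQUIRED_NOTE_FIELDS if name not in fields]


example = (
    "Owner: jane.smith@contoso.com\n"
    "Ticket: SEC-2847\n"
    "Purpose: Sign-in anomaly detection\n"
    "Last reviewed: 2024-03-01"
)
print(missing_note_fields(example))
```

Feeding each application's notes value into missing_note_fields gives a per-registration list of gaps that can go straight into the monthly inventory report.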

Service principal lifecycle hygiene

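[Figure: SP lifecycle stages: Create (name, owner, notes) → Configure (least privilege, WIF/MI) → Monitor (sign-ins, failures) → Review (90-day hygiene check) → Retire (disable, then delete). Risks: no owner means orphan risk; silent failures mean blind automation; stale SPs are attack surface.]

SP hygiene - lifecycle stages and the risks of skipping them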

Credential expiry alerts

[Figure: a PowerShell session in the training tenant. After Connect-MgGraph -Scopes "Application.Read.All", the session lists application password credentials that expire before a 30-day threshold. Sample output (DisplayName, DaysRemaining): secops-signinmonitor-prod 7; infra-bicepdeployer-dev 18; devops-wifconnector-prod 24; legacy-noname-??? 3 (bad name, owner unknown). This takes 2 minutes to run. What would you do with these results?]

Credential expiry query - sample output highlighting bad names and 7-day-out secrets

Set expiry on all secrets and certificates. Build automation to alert on credentials expiring within 30, 60, and 90 days:

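# Find credentials expiring in the next 30 days
# Requires an authenticated Graph session with Application.Read.All
Connect-MgGraph -Scopes "Application.Read.All"
$cutoff = (Get-Date).AddDays(30)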

Get-MgApplication -All | ForEach-Object {
    $app = $_
    $app.PasswordCredentials | Where-Object {
        $_.EndDateTime -lt $cutoff -and $_.EndDateTime -gt (Get-Date)
    } | ForEach-Object {
        [PSCustomObject]@{
            DisplayName = $app.DisplayName
            AppId       = $app.AppId
            SecretHint  = $_.Hint
            ExpiresOn   = $_.EndDateTime
        }
    }
} | Sort-Object ExpiresOn

Key Vault publishes Event Grid events when certificates, keys, or secrets stored in the vault approach expiry. Use these events to trigger Logic Apps or Automation Account runbooks that alert or initiate rotation.
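
The 30, 60, and 90 day alert windows mentioned in this section can be expressed as a small bucketing function. This Python sketch assumes each credential's end date is already available as a timezone-aware datetime; the function name expiry_bucket and the "expired" label are illustrative.

```python
from datetime import datetime, timedelta, timezone


def expiry_bucket(end, now, windows=(30, 60, 90)):
    """Return the smallest window (in days) the credential expires within.

    Returns "expired" if the end date has passed, or None if the credential
    expires after the largest window.
    """
    if end <= now:
        return "expired"
    for days in sorted(windows):
        if end <= now + timedelta(days=days):
            return days
    return None


# Example evaluation against a fixed reference time.
now = datetime(2024, 3, 1, tzinfo=timezone.utc)
for days_left in (7, 45, 120, -3):
    print(days_left, expiry_bucket(now + timedelta(days=days_left), now))
```

Routing each bucket to a different severity (for instance, a ticket at 90 days and a page at 30) keeps the early warnings from being ignored.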

90-day activity review

Enterprise apps with no recent sign-in activity are candidates for review and possible deletion. Use sign-in logs as a first-pass heuristic by checking whether the app ID has appeared in the last 90 days:

Import-Module Microsoft.Graph.Applications
Import-Module Microsoft.Graph.Reports
Connect-MgGraph -Scopes "Application.Read.All", "AuditLog.Read.All"

# Enterprise apps with no sign-in activity in the past 90 days
$cutoff = (Get-Date).AddDays(-90)
$cutoffString = $cutoff.ToUniversalTime().ToString("yyyy-MM-ddTHH:mm:ssZ")

Get-MgServicePrincipal -All | ForEach-Object {
    $sp = $_
    $signIns = Get-MgAuditLogSignIn `
        -Filter "appId eq '$($sp.AppId)' and createdDateTime ge $cutoffString" `
        -Top 1

    if (-not $signIns) {
        [PSCustomObject]@{
            DisplayName = $sp.DisplayName
            AppId       = $sp.AppId
            ObjectId    = $sp.Id
        }
    }
} | Format-Table -AutoSize
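
📚Prerequisite Note

This query requires both Application.Read.All and AuditLog.Read.All, plus the Microsoft.Graph.Applications and Microsoft.Graph.Reports modules if you are using Graph PowerShell. Treat the result as a review list, not an automatic delete list, because appId-based sign-in activity is a heuristic rather than a complete ownership signal. Two limits make the heuristic weaker than it looks: Entra ID retains sign-in logs for a limited period (30 days with a premium license), so a full 90-day lookback generally needs sign-in logs exported to a Log Analytics workspace or another archive, and the v1.0 sign-in endpoint behind Get-MgAuditLogSignIn returns interactive user sign-ins by default, so service principal sign-ins may not appear.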

Credential expiry policies

Use Entra application authentication method policies to enforce maximum credential lifetimes. Restrict secrets to a maximum of 180 days (or less, depending on your policy). This prevents indefinitely-lived secrets from accumulating.
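
Before enforcing a maximum lifetime, it helps to know which existing credentials would violate it. The Python sketch below assumes each credential's start and end dates (the startDateTime and endDateTime values on a password credential) have been loaded as timezone-aware datetimes; the function name and the 180-day default mirror the example policy above and are otherwise arbitrary.

```python
from datetime import datetime, timedelta, timezone

# Example policy limit taken from the text; change to match your policy.
MAX_LIFETIME = timedelta(days=180)


def exceeds_max_lifetime(start, end, max_lifetime=MAX_LIFETIME):
    """Return True when a credential's validity period is longer than allowed."""
    return (end - start) > max_lifetime


# Example: a two-year secret compared with a 90-day secret.
issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(exceeds_max_lifetime(issued, issued + timedelta(days=730)))
print(exceeds_max_lifetime(issued, issued + timedelta(days=90)))
```

Running this over the existing inventory first shows how many credentials would need rotation before the policy can be switched on without breaking workloads.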

Orphan detection

An orphaned SP is one with no current owner assigned and no recent sign-in activity. Build a query that surfaces ownerless registrations:

# Find app registrations with no owner
Get-MgApplication -All | Where-Object {
    -not (Get-MgApplicationOwner -ApplicationId $_.Id)
} | Select-Object DisplayName, AppId, CreatedDateTime

Combine with the 90-day activity heuristic for a combined orphan report: no owner and no recent sign-in activity.

[Figure: decision flow over the output of Get-MgApplication -All.]
  1. Has an assigned owner (Get-MgApplicationOwner)? If no: ownerless; assign an owner or flag for review.
  2. Sign-in within 90 days (servicePrincipal sign-in logs)? If yes: active, no action; re-check next cycle.
  3. Has active credentials (passwordCredentials / keyCredentials)? If yes: high risk, because inactive plus credentials equals attack surface. If no: orphan candidate; review and retire.

Automate monthly and route results to the SP governance team.

Graph API orphan detection - three decisions, four outcomes
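
The three decisions and four outcomes can be written as a single classification function. This Python sketch is one reading of the decision flow; the function name, the boolean inputs, and the outcome labels are assumptions, and the inputs would come from the owner, sign-in, and credential queries shown earlier.

```python
def classify_app(has_owner, signed_in_last_90_days, has_active_credentials):
    """Classify an app registration following the three-question decision flow."""
    if not has_owner:
        # Decision 1: ownerless registrations are flagged regardless of activity.
        return "ownerless"
    if signed_in_last_90_days:
        # Decision 2: recent activity means no action this cycle.
        return "active"
    if has_active_credentials:
        # Decision 3: inactive but still holding usable credentials.
        return "high-risk"
    return "orphan-candidate"


# Example: owned, inactive, and still holding a valid secret.
print(classify_app(has_owner=True, signed_in_last_90_days=False, has_active_credentials=True))
```

Keeping the classification separate from the Graph queries makes the policy easy to review and to unit test without a tenant.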

Lab readiness notes

🧪Lab Readiness

Have sample runbook code, analyzer output, KQL examples, and Graph credential reports ready. Log Analytics and alert rules may not show results immediately; use the saved output if the live signal is delayed. Clean up alert rules, test failures, and sample service principals created during the lab.

Guided lab

Lab goal

Configure source control sync for an Automation Account, lint a runbook with PSScriptAnalyzer, inventory expiring Entra application credentials, send Automation diagnostics to Log Analytics, and create a Monitor alert rule from a Log Analytics query.

Prerequisites

Task 1: Source control sync for Automation Account

Azure Portal:

Create the source control connection
  1. Open the Automation Account.
  2. Go to Source control and add a new connection.
  3. Use GitHub, the main branch, and the /runbooks folder.
  4. After the first sync job, open the imported runbook and confirm the portal warns that the file is source-controlled.

Az CLI:

Create the source control connection and trigger a sync job
$resourceGroupName = "<resource-group>"
$automationAccountName = "<automation-account>"
$repoUrl = "https://github.com/<github-owner>/<repo-name>.git"
$githubToken = gh auth token

# Source control sync requires the Automation Account managed identity and a Contributor assignment on the Automation Account itself.
$automationId = az automation account show --automation-account-name $automationAccountName --resource-group $resourceGroupName --query id -o tsv
$automationPrincipalId = az automation account show --automation-account-name $automationAccountName --resource-group $resourceGroupName --query identity.principalId -o tsv

az role assignment create `
  --assignee-object-id $automationPrincipalId `
  --assignee-principal-type ServicePrincipal `
  --role Contributor `
  --scope $automationId | Out-Null

az automation source-control create `
  --resource-group $resourceGroupName `
  --automation-account-name $automationAccountName `
  --name github-runbooks `
  --repo-url $repoUrl `
  --branch main `
  --source-type GitHub `
  --folder-path /runbooks `
  --access-token $githubToken `
  --token-type PersonalAccessToken `
  --auto-sync false `
  --publish-runbook true | Out-Null

# In PowerShell, preserve the required empty string for commit-id with the stop-parsing operator.
az --% automation source-control sync-job create --resource-group <resource-group> --automation-account-name <automation-account> --source-control-name github-runbooks --job-id 11111111-1111-1111-1111-111111111111 --commit-id ""

az automation runbook list `
  --automation-account-name $automationAccountName `
  --resource-group $resourceGroupName `
  --query "[].{name:name,state:state}" `
  -o table

Expected outcomes:
- The sync job reaches Succeeded
- The source-controlled runbook appears in the Automation Account
- The imported runbook shows as Published

Task 2: PSScriptAnalyzer on a runbook

Native PowerShell:

Create a sample runbook and lint it
@'
function Invoke-LegacyLogin {
  param([string]$Password)
  Write-Host "Using password input"
}

Invoke-LegacyLogin -Password "hardcoded-password-value-here"
$password = ConvertTo-SecureString "plaintext" -AsPlainText -Force
'@ | Set-Content -Path .\Sample-Runbook.ps1 -Encoding utf8

Install-Module PSScriptAnalyzer -Scope CurrentUser -Force

Invoke-ScriptAnalyzer -Path .\Sample-Runbook.ps1 -Severity Error, Warning

Invoke-ScriptAnalyzer -Path .\Sample-Runbook.ps1 `
  -IncludeRule PSAvoidUsingPlainTextForPassword,PSAvoidUsingConvertToSecureStringWithPlainText,PSAvoidUsingWriteHost

Expected outcomes:
- PSScriptAnalyzer flags the plain-text password rules
- Write-Host is called out as a lint issue in the sample file
- Re-running after a fix removes the matching rule from the results

Task 3: Graph API credential expiry query

Native PowerShell:

Use a Graph token plus raw REST
$cutoff = (Get-Date).AddDays(30)
$token = az account get-access-token --resource-type ms-graph --query accessToken -o tsv
$headers = @{ Authorization = "Bearer $token" }
$uri = "https://graph.microsoft.com/v1.0/applications?`$select=displayName,appId,passwordCredentials&`$top=999"
$results = @()

do {
  $response = Invoke-RestMethod -Method GET -Uri $uri -Headers $headers
  $results += $response.value
  $uri = $response.'@odata.nextLink'
} while ($uri)

$results | ForEach-Object {
  $app = $_
  $app.passwordCredentials | Where-Object {
    [datetime]$_.endDateTime -lt $cutoff -and [datetime]$_.endDateTime -gt (Get-Date)
  } | ForEach-Object {
    [PSCustomObject]@{
      DisplayName = $app.displayName
      AppId       = $app.appId
      ExpiresOn   = $_.endDateTime
      DaysLeft    = [int]([datetime]$_.endDateTime - (Get-Date)).TotalDays
    }
  }
} | Sort-Object DaysLeft | Format-Table -AutoSize

Graph PowerShell:

Use Invoke-MgGraphRequest for the same inventory
Connect-MgGraph -Scopes "Application.Read.All"

$cutoff = (Get-Date).AddDays(30)
$uri = "https://graph.microsoft.com/v1.0/applications?`$select=displayName,appId,passwordCredentials&`$top=999"
$results = @()

do {
  $response = Invoke-MgGraphRequest -Method GET -Uri $uri
  $results += $response.value
  $uri = $response.'@odata.nextLink'
} while ($uri)

$results | ForEach-Object {
  $app = $_
  $app.passwordCredentials | Where-Object {
    [datetime]$_.endDateTime -lt $cutoff -and [datetime]$_.endDateTime -gt (Get-Date)
  } | ForEach-Object {
    [PSCustomObject]@{
      DisplayName = $app.displayName
      AppId       = $app.appId
      ExpiresOn   = $_.endDateTime
      DaysLeft    = [int]([datetime]$_.endDateTime - (Get-Date)).TotalDays
    }
  }
} | Sort-Object DaysLeft | Format-Table -AutoSize

Az CLI:

Stay in CLI but keep the same pagination logic
$cutoff = (Get-Date).AddDays(30)
$uri = "https://graph.microsoft.com/v1.0/applications?`$select=displayName,appId,passwordCredentials&`$top=999"
$results = @()

do {
  $page = az rest --method get --uri $uri | ConvertFrom-Json
  $results += $page.value
  $uri = $page.'@odata.nextLink'
} while ($uri)

$results | ForEach-Object {
  $app = $_
  $app.passwordCredentials | Where-Object {
    [datetime]$_.endDateTime -lt $cutoff -and [datetime]$_.endDateTime -gt (Get-Date)
  } | ForEach-Object {
    [PSCustomObject]@{
      DisplayName = $app.displayName
      AppId       = $app.appId
      ExpiresOn   = $_.endDateTime
      DaysLeft    = [int]([datetime]$_.endDateTime - (Get-Date)).TotalDays
    }
  }
} | Sort-Object DaysLeft | Format-Table -AutoSize

Expected outcomes:
- The query returns app registrations with credentials expiring inside the chosen window
- The output includes display name, app ID, expiry, and days remaining

Task 4: Log Analytics job diagnostics

Azure Portal:

Enable the Automation diagnostic categories you actually need
  1. Open the Automation Account.
  2. Go to Diagnostic settings and add a setting.
  3. Enable JobLogs, JobStreams, and AuditEvent.
  4. Send them to the Log Analytics workspace.

Az CLI:

Create the diagnostic setting
$automationId = az automation account show `
  --automation-account-name "<automation-account>" `
  --resource-group "<resource-group>" `
  --query id `
  -o tsv

$workspaceId = az monitor log-analytics workspace show `
  --resource-group "<workspace-resource-group>" `
  --workspace-name "<workspace-name>" `
  --query id `
  -o tsv

az monitor diagnostic-settings create `
  --name send-to-law `
  --resource $automationId `
  --workspace $workspaceId `
  --logs '[{"category":"JobLogs","enabled":true},{"category":"JobStreams","enabled":true},{"category":"AuditEvent","enabled":true}]' `
  --metrics '[{"category":"AllMetrics","enabled":true}]'

Az CLI:

Query the workspace
$workspaceCustomerId = az monitor log-analytics workspace show `
  --resource-group "<workspace-resource-group>" `
  --workspace-name "<workspace-name>" `
  --query customerId `
  -o tsv

az monitor log-analytics query `
  --workspace $workspaceCustomerId `
  --analytics-query "AzureDiagnostics | where ResourceProvider == 'MICROSOFT.AUTOMATION' | where Category in ('JobLogs', 'JobStreams', 'AuditEvent') | where TimeGenerated > ago(1h) | project TimeGenerated, RunbookName_s, Category, ResultType, ResultDescription | order by TimeGenerated desc"

Expected outcomes:
- The diagnostic setting is created successfully
- After the first post-enable runbook job, the Automation records land in Log Analytics
- You can pivot on Category, RunbookName_s, and ResultType

Task 5: Monitor alert for failed runbook

Azure Portal:

Create the alert from the Monitor blade
  1. Go to Monitor > Alerts > Create.
  2. Use the Log Analytics workspace or the Automation Account as the scope, depending on whether you want a query-based or metric-based alert.
  3. Keep the alert disabled until the first successful query proves your data shape.

Az CLI:

Create a disabled scheduled-query alert rule first
$workspaceId = az monitor log-analytics workspace show `
  --resource-group "<workspace-resource-group>" `
  --workspace-name "<workspace-name>" `
  --query id `
  -o tsv

az monitor scheduled-query create `
  --resource-group "<resource-group>" `
  --name "Lab5-RunbookAlert" `
  --scopes $workspaceId `
  --condition "count 'AutomationJobs' > 0" `
  --condition-query AutomationJobs="AzureDiagnostics | where ResourceProvider == 'MICROSOFT.AUTOMATION' | where Category == 'JobLogs' | where ResultType == 'Failed' | where TimeGenerated > ago(15m)" `
  --evaluation-frequency 15m `
  --window-size 15m `
  --severity 3 `
  --disabled true

Expected outcomes:
- The rule is created successfully
- You can inspect the condition safely before enabling it
- Once diagnostics are flowing and the query returns the expected schema, you can remove --disabled true

Common pitfalls

Admin takeaways

  1. Source control is mandatory. A portal edit is an ungoverned change; use Git, then CI/CD, then deploy.
  2. AI code has predictable weaknesses: hardcoded secrets, overprivileged APIs, missing error handling.
  3. Static analysis in CI is free insurance: PSScriptAnalyzer and Bandit add no meaningful time cost.
  4. Silent failure is operationally dangerous. Alert on a threshold of > 0 failures and configure alerts before production.
  5. Every SP needs a name, an owner, and expiry monitoring: no orphans, a clear purpose, and a monitored credential lifecycle.

These practices compound: skip one, and the others stop working.

Section takeaways - five operational principles that compound together

Quick recap questions

  1. Why is editing a production runbook directly in the Azure portal a governance problem?
  2. What are two common security issues in AI-generated automation code?
  3. Which PSScriptAnalyzer rule catches plain-text passwords?
  4. What monitoring must be enabled on a Function App to query failures in Log Analytics?
  5. What four fields should the app registration Notes field always contain?
  6. What defines an orphaned service principal?

Key reminders

[Figure: bridge from Section 5 to Section 6. Section 5: source control, CI/CD pipelines, static analysis, monitoring, SP lifecycle. Section 6: Bicep IaC, azd deploy, CA exclusion solution, Maester security, UAL monitoring. Course bridges: Bicep (infrastructure as code), managed identity (secretless auth), WIF (CI/CD without static secrets), azd up (a single command to provision, deploy, and configure). Everything from this course comes together in Section 6.]

Section 5 to Section 6 - practices carry forward into solution packaging

Current source notes

Volatile platform behavior and dated claims in this section were checked against these current sources on April 26, 2026:

References

References Appendix

All links below were reviewed on 2026-03-10.

Source control and CI/CD

Static analysis and code review

Monitoring and observability

Service principal lifecycle

Sentinel playbook testing
