Adding documentation for temporarily disabling policies
Brunoga-MS committed Feb 29, 2024
1 parent d45e198 commit 0a0dadb
Showing 8 changed files with 83 additions and 6 deletions.
19 changes: 14 additions & 5 deletions docs/content/patterns/alz/Disabling-Policies.md
@@ -4,19 +4,23 @@ geekdocCollapseSection: true
weight: 60
---

The policies in AMBA provide multiple methods to enable or disable the effects of the policy.

1. **Parameter: AlertState** - Determines the state of the alert rule. This either deploys an alert rule in a disabled state, or disables an already deployed alert rule at scale through policy.
2. **Parameter: PolicyEffect** - Determines the effect of a Policy Definition, allowing a Policy to be deployed in a disabled state.
3. **Tag: MonitorDisable** - A tag that determines whether the resource should be evaluated. Allows you to exclude selected resources from monitoring.

## AlertState parameter

Recognizing that it is not always possible to test alerts in a dev/test environment, we have introduced the AlertState parameter for all metric alerts. In the initiatives and the example parameter file, the parameter name combines {resourceType}, {metricName} and AlertState, for example VnetGwTunnelIngressAlertState. The parameter addresses scenarios such as an alert storm, where one or more alerts deployed via policies must be disabled through a controlled process. It could also be considered for a roll-back process as part of a change request.

### Allowed values

- "true" - Alert rule will be enabled. (Default)
- "false" - Alert rule will be disabled.
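
As a minimal sketch of how such a value might be supplied, the fragment below uses the generic Azure Policy assignment parameter shape with the example parameter name from this page. Where this entry sits inside the AMBA example parameter file is not shown here, so treat the surrounding structure as an assumption.

```json
{
  "parameters": {
    "VnetGwTunnelIngressAlertState": {
      "value": "false"
    }
  }
}
```

Setting the value back to "true" (the default) would re-enable the alert rule on the next remediation.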

### How it works

The AlertState parameter is used for both compliance evaluation and configuration of the state of the alert rule. The value of the **AlertState** parameter is passed on to the **enabled** parameter which is part of the existenceCondition of the Policy.

```json
@@ -55,14 +59,17 @@ These are the high-level steps that would need to take place:
Note that the above approach will not delete the alert objects in Azure; it merely disables them. To delete the alerts, you will have to do so manually. Also note that while you can use the PolicyEffect parameter to avoid deploying new alerts, you should not do so until you have successfully remediated the above. Otherwise the policy will be disabled, and you will not be able to turn alerts off via policy until that is changed back.

## PolicyEffect parameter

In general, we evaluate the alert rules based on best practices, field experience, customer feedback, type of alert and possible impact. There are situations where disabling the policy makes sense to prevent unnecessary or duplicate alerts and notifications. For example, we deploy an alert rule for VPN Gateway Bandwidth Utilization and have, in turn, disabled the alert rules for VPN Gateway Egress and Ingress.
The default is intended to provide a well-balanced baseline. However, you may want to enable or disable the creation of certain alert rules to meet your needs.

### Allowed values

- "deployIfNotExists" - Policy will deploy the alert rule if the conditions are met. (Default for most Policies)
- "disabled" - The policy itself will be created but will not create the corresponding Alert rule.

### How it works

The PolicyEffect parameter is used to configure the effect of the PolicyDefinition. In the initiatives and the example parameter file, the parameter name combines {resourceType}, {metricName} and PolicyEffect, for example ERCIRQoSDropBitsinPerSecPolicyEffect. The value of the **PolicyEffect** parameter is passed on to the **effect** parameter, which configures the effect of the Policy.

```json
@@ -84,9 +91,11 @@ The PolicyEffect parameter is used for the configuration of the effect of the Po
```

## MonitorDisable parameter

It's also possible to exclude certain resources from being monitored. For example, you may not want to monitor pre-production or dev environments. The MonitorDisable parameter contains the name of the tag that determines whether a resource should be included. By default, creating the tag MonitorDisable with value "true" will prevent deployment of alert rules on those resources. This is easily adjusted to use existing tags: for example, you could configure the parameter with the tag name "Environment" and deploy only if the tag value equals "prod", or only if the tag value isn't equal to "dev". Currently only the tag name is a parameter; other changes require minor changes in the code.

### How it works

The policyRule only continues if "allOf" is true. That is, the deployment continues as long as the MonitorDisable tag doesn't exist or doesn't hold the value "true". When the tag holds "true", "allOf" evaluates to false because "notEquals": "true" is no longer satisfied, and the deployment stops.

```json
@@ -103,4 +112,4 @@ The policyRule only continues if "allOf" is true. Meaning, the deployment will
}
]
}
```
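
The tag customization described in this section could be sketched as follows. This is an illustration only, assuming the MonitorDisable parameter has been set to the tag name "Environment" and that the condition in the policy code has been changed from "notEquals": "true" to "equals": "prod". The resource type is just an example, and the actual AMBA policyRule may be shaped differently (for instance, expressions embedded in ARM templates are escaped with a double opening bracket).

```json
{
  "allOf": [
    {
      "field": "type",
      "equals": "Microsoft.Network/virtualNetworkGateways"
    },
    {
      "field": "[concat('tags[', parameters('MonitorDisable'), ']')]",
      "equals": "prod"
    }
  ]
}
```

With this variant, alert rules would be deployed only for resources tagged Environment = prod. Replacing the second condition with "notEquals": "dev" would instead deploy for every resource except those tagged as dev.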
11 changes: 10 additions & 1 deletion docs/content/patterns/alz/Policy-Initiatives.md
@@ -122,4 +122,13 @@ This initiative is intended for assignment of policies relevant to service healt
| Deploy_activitylog_ServiceHealth_HealthAdvisory | [deploy-activitylog-ServiceHealth-Health.json](../../../services/Resources/subscriptions/Deploy-ActivityLog-ServiceHealth-Health.json) | deployIfNotExists |
| Deploy_activitylog_ServiceHealth_Incident | [deploy-activitylog-ServiceHealth-Incident.json](../../../services/Resources/subscriptions/Deploy-ActivityLog-ServiceHealth-Incident.json) | deployIfNotExists |
| Deploy_activitylog_ServiceHealth_Maintenance | [deploy-activitylog-ServiceHealth-Maintenance.json](../../../services/Resources/subscriptions/Deploy-ActivityLog-ServiceHealth-Maintenance.json) | deployIfNotExists |
| Deploy_AlertProcessing_Rule | [deploy-alertprocessingrule-deploy.json](../../../services/AlertsManagement/actionRules/Deploy-AlertProcessingRule-Deploy.json) | deployIfNotExists |
| Deploy_ServiceHealth_ActionGroups | [deploy-ServiceHealth-ActionGroups.json](../../../services/Resources/subscriptions/Deploy-ServiceHealth-ActionGroups.json) | deployIfNotExists |

## Notification Assets initiative

This initiative is intended for assignment of policies relevant to notification in ALZ. With the guidance provided in [Introduction to deploying the ALZ Pattern](../deploy/Introduction-to-deploying-the-ALZ-Pattern), this will be assigned to the ALZ intermediate root management group structure in the ALZ reference architecture. For details on which policies are included in the initiative, as well as the default enablement state of each policy, refer to the table below.

| **Policy Display Name** | **Reference ID** | **Path to policy json file** | **Policy default effect** |
|----------|----------|----------|----------|
| Deploy AMBA Notification Assets | ALZ_AlertProcessing_Rule | [deploy-AlertProcessingRule-deploy.json](../../../services/AlertsManagement/actionRules/Deploy-AlertProcessingRule-Deploy.json) | deployIfNotExists |
| Deploy AMBA Notification Suppression Asset | ALZ_Suppression_AlertProcessing_Rule | [deploy-AlertProcessingRule-Suppression.json](../../../services/AlertsManagement/actionRules/Deploy-AlertProcessingRule-Suppression.json) | deployIfNotExists |
52 changes: 52 additions & 0 deletions docs/content/patterns/alz/Temporarily-disabling-notifications.md
@@ -0,0 +1,52 @@
---
title: Temporarily disabling notifications
geekdocCollapseSection: true
weight: 65
---

Azure Monitor alerts targeted at a large scope allow for at-scale coverage, but reduce the flexibility to disable them for specific resources. There might be several reasons to stop alert notifications. For instance, customers could have resources that are stopped or disabled due to maintenance, or might just want to stop notifications during the night shift. To allow this kind of flexibility, AMBA provides an asset, as part of the Notification Assets policy initiative, to stop notifications for specific resources.

This asset consists of an alert processing rule (APR) with the following characteristics:

- deployed as disabled
- scoped at the subscription level
- suppression rule type
- scheduled to run always
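
The characteristics above can be pictured as an ARM resource of type Microsoft.AlertsManagement/actionRules. The fragment below is a hedged sketch, not the definition deployed by the AMBA policy: the subscription ID is a placeholder, the API version is an assumption, and leaving out the schedule property is how "always" is represented here.

```json
{
  "type": "Microsoft.AlertsManagement/actionRules",
  "apiVersion": "2021-08-08",
  "name": "apr-AMBA-<subscription display name>-002",
  "location": "Global",
  "properties": {
    "description": "Illustrative sketch only; not copied from the AMBA policy definition",
    "enabled": false,
    "scopes": [
      "/subscriptions/00000000-0000-0000-0000-000000000000"
    ],
    "actions": [
      {
        "actionType": "RemoveAllActionGroups"
      }
    ]
  }
}
```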

This APR needs to be configured with the resource ID of each resource for which you want to stop notifications, and then enabled every time you need it.

Once the resource is out of the maintenance period or when you don't need the suppression rule anymore, ***remember*** to remove the resources and disable the rule.

To learn more about how to suppress notifications, see [Suppress notifications during planned maintenance](https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/alerts-processing-rules?tabs=portal#suppress-notifications-during-planned-maintenance).

To configure the APR, do the following:

1. In **Monitor --> Alerts**, click on **Alert processing rules**

![Monitor/Alerts/Alert processing rule](../media/AlertProcessingRules.png)

2. Click on the APR named ***apr-AMBA-<mark>subscription display name</mark>-002*** with rule type **Suppression**

![Suppression alert processing rule](../media/SuppressionAlertProcessingRule.png)

3. Click on ***Edit***

![Edit alert processing rule](../media/Edit-AlertProcessingRule.png)

4. In the **Scope** tab, under the filter section, configure the following:

- Filters: ***Resource***
- Operator: ***Equals***
- Value: **Enter the <mark>resource IDs</mark> of the resources, separated by commas, <mark>with no spaces before, after or between the strings.</mark>**

![Configure filter](../media/Filter-AlertProcessingRule.png)

{{< hint type=Important >}}
Each filter can include up to 5 values. Should you need more than 5 resources, add more filter lines.
{{< /hint >}}

5. Click on ***Review + save*** and then ***Save***

{{< hint type=Note >}}
It is possible to apply other types of filters. For a complete list of allowed scopes and filters, refer to the official [Scope and filters for alert processing rules](https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/alerts-processing-rules?tabs=portal#scope-and-filters-for-alert-processing-rules) documentation.
{{< /hint >}}
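
For readers who prefer to see the result of step 4 as a resource definition rather than portal fields, the sketch below shows how the filter might appear in the APR's properties once it is configured and enabled. It assumes that the portal's **Resource** filter corresponds to the `TargetResource` condition field and that each comma-separated value becomes one array entry. The resource IDs are placeholders.

```json
{
  "properties": {
    "enabled": true,
    "conditions": [
      {
        "field": "TargetResource",
        "operator": "Equals",
        "values": [
          "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-example/providers/Microsoft.Compute/virtualMachines/vm-example-01",
          "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-example/providers/Microsoft.Compute/virtualMachines/vm-example-02"
        ]
      }
    ]
  }
}
```

When the maintenance window is over, clearing the values and setting `enabled` back to false returns the rule to its deployed state.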
7 changes: 7 additions & 0 deletions docs/content/patterns/alz/deploy/Remediate-Policies.md
@@ -55,3 +55,10 @@ $LZManagementGroup="The management group id for Landing Zones"
.\patterns\alz\scripts\Start-AMBARemediation.ps1 -managementGroupName $pseudoRootManagementGroup -policyName Alerting-ServiceHealth
.\patterns\alz\scripts\Start-AMBARemediation.ps1 -managementGroupName $pseudoRootManagementGroup -policyName Notification-Assets
```

Should you need to remediate just one policy definition and not the entire policy initiative, you can run the remediation script targeted at the policy reference id that can be found under [Policy Initiatives](../../Policy-Initiatives). For example, to remediate the ***Deploy AMBA Notification Assets*** policy, run the command below:

```powershell
#Run the following command to initiate remediation of a single policy definition
.\patterns\alz\scripts\Start-AMBARemediation.ps1 -managementGroupName $pseudoRootManagementGroup -policyName ALZ_AlertProcessing_Rule
```