UnixArena

Set Up Monitoring for EBS Snapshots – Amazon DLM | EC2 AMI

This article walks you through monitoring automated EBS snapshot failures using CloudWatch. Amazon Data Lifecycle Manager (DLM) automates the creation, copy, and deregistration of EBS-backed AMIs using policies. With DLM, AMIs can be copied to other regions for disaster recovery. DLM reduces operational cost by cleaning up AMIs according to the policy, and it removes the complexity of managing backup operations by hand. However, DLM operations still need to be monitored to confirm that snapshots and image copies complete without failure. If a failure occurs, a notification should be sent to the service owner.

AMI image monitoring using CloudWatch

Prerequisite: An active DLM policy. To learn how to create an Amazon Data Lifecycle Manager policy, please check it here.

CloudWatch – EBS Metrics:

1. Log in to the AWS console and navigate to CloudWatch. Click on “All Metrics”.

CloudWatch Console – AWS

2. Select the EBS metrics.

EBS Metrics – CloudWatch

3. Click on the “Data Lifecycle Manager” metrics.

Data Lifecycle Manager Metrics – CloudWatch

4. The Amazon DLM policy is shown in the left pane, and the pre-defined metrics are shown in the right column. I have chosen the “ImagesCopiedRegionFailed” metric to monitor.

ImagesCopiedRegionFailed – CloudWatch Metric

5. Specify the metric and conditions for the alarm.

Specify metrics and conditions – CloudWatch

6. Specify the threshold type. The threshold depends on how many image copy failures you can tolerate in the DR region. For example, if you have configured the DLM policy for a critical application, you can’t afford even a single image failure. I have configured the condition to trigger the alarm as soon as a single image fails.

Conditions – Image Failure threshold – CloudWatch

7. Create a new SNS topic if you do not have one already. Amazon SNS lets you send notifications to various endpoints.

Create a new SNS topic – CloudWatch

Note: You must confirm the subscription from the email address before it can receive notifications from the SNS topic.

Subscription confirmation – CloudWatch

8. If you want to take an automated action on an image copy failure using SSM, these are the available options.

Monitoring Alert – Action Pipeline – CloudWatch

9. Enter the alarm name.

Alarm name – CloudWatch

10. Review the metrics and conditions.

Preview the policy – CloudWatch – EBS snapshot monitoring
Preview and Create Policy – CloudWatch alarm

11. Create an alarm.

Create Alarm – CloudWatch
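
The console steps above can also be sketched in code. The following is a minimal Python sketch, not a definitive implementation. It assumes the DLM metrics are published under the “AWS/EBS” namespace with a “DLMPolicyId” dimension (verify this against the metric you see in your console), and that boto3 and AWS credentials are available for the second function. The alarm name pattern, the topic name “dlm-alerts”, the 300-second period, and the treatment of missing data are illustrative assumptions, not values from the walkthrough.

```python
from typing import Any, Dict


def build_copy_failure_alarm(policy_id: str, topic_arn: str) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        # Hypothetical naming scheme; use whatever name you entered in step 9.
        "AlarmName": f"dlm-{policy_id}-image-copy-failed",
        # Assumption: DLM metrics are listed under the EBS namespace.
        "Namespace": "AWS/EBS",
        "MetricName": "ImagesCopiedRegionFailed",
        # Assumption: the policy is identified by the DLMPolicyId dimension.
        "Dimensions": [{"Name": "DLMPolicyId", "Value": policy_id}],
        "Statistic": "Sum",
        "Period": 300,  # assumed evaluation window, in seconds
        "EvaluationPeriods": 1,
        # Alarm as soon as a single image copy fails (step 6).
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Assumption: periods with no data points are not treated as failures.
        "TreatMissingData": "notBreaching",
        # Notify the SNS topic when the alarm fires (step 7).
        "AlarmActions": [topic_arn],
    }


def create_alarm_with_email(policy_id: str, email: str,
                            topic_name: str = "dlm-alerts") -> str:
    """Create the SNS topic, the email subscription, and the alarm.

    Requires boto3 and valid AWS credentials. Returns the topic ARN.
    """
    import boto3  # imported here so the builder above works without boto3

    sns = boto3.client("sns")
    topic_arn = sns.create_topic(Name=topic_name)["TopicArn"]
    # The recipient still has to confirm the subscription by email.
    sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=email)

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**build_copy_failure_alarm(policy_id, topic_arn))
    return topic_arn
```

Keeping the alarm definition in a plain function makes it easy to review or reuse for other DLM metrics before anything is created in the account.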

Once you have created the alarm, you can view its details in the dashboard. If any image copy fails, you will get a notification from SNS at the email endpoints.

Alarms – Overview
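
Besides the dashboard, the alarm state can be checked from a script. The sketch below assumes a response shaped like the one returned by CloudWatch's describe_alarms call, where each entry under “MetricAlarms” carries an “AlarmName” and a “StateValue”. The sample data and alarm names are hand-written for illustration and are not real output.

```python
from typing import Any, Dict, List


def failing_alarms(response: Dict[str, Any]) -> List[str]:
    """Return the names of alarms whose state is ALARM."""
    return [
        alarm["AlarmName"]
        for alarm in response.get("MetricAlarms", [])
        if alarm.get("StateValue") == "ALARM"
    ]


# Hand-written sample shaped like a describe_alarms response (not real data).
sample_response = {
    "MetricAlarms": [
        {"AlarmName": "dlm-app1-image-copy-failed", "StateValue": "OK"},
        {"AlarmName": "dlm-app2-image-copy-failed", "StateValue": "ALARM"},
        {"AlarmName": "dlm-app3-image-copy-failed", "StateValue": "INSUFFICIENT_DATA"},
    ]
}

print(failing_alarms(sample_response))  # ['dlm-app2-image-copy-failed']
```

With boto3 installed and credentials configured, the real response could be fetched with boto3.client("cloudwatch").describe_alarms(AlarmNamePrefix="dlm-"), where the “dlm-” prefix is only an assumed naming convention.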

Hope this article is informative to you.
