
I’m trying to monitor the status of my EKS node groups. Sometimes a node group shows as Degraded, and I want a CloudWatch alert whenever that happens. I checked CloudWatch Metrics and found no standard metric for this, and I couldn’t find a corresponding event in CloudTrail either.


Is it possible to create such an alarm using AWS CloudTrail events, EventBridge, or CloudWatch?
Any help finding a solution would be appreciated.

2

Answers


  1. I think you can combine Lambda, CloudWatch, and EventBridge to implement a simple health check for one or more node groups.

    For the health-check Lambda function:

    1. Create a Lambda function with Python 3 (3.9, for example).
    2. Describe the node group using Boto3.
    3. Put a custom metric to CloudWatch: 1 if the status is Active (ACTIVE in the API response), otherwise 0.
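The three steps above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the namespace `Custom/EKS`, the metric name `NodegroupHealthy`, and the environment variables `CLUSTER_NAME` and `NODEGROUP_NAMES` are assumptions made for illustration, and the function's role is assumed to allow `eks:DescribeNodegroup` and `cloudwatch:PutMetricData`.

```python
import os


def status_to_metric(status):
    """Return 1 when the node group status is ACTIVE, otherwise 0."""
    return 1 if status == "ACTIVE" else 0


def publish_nodegroup_health(eks, cloudwatch, cluster, nodegroup):
    """Describe one node group and publish its health as a custom metric."""
    response = eks.describe_nodegroup(clusterName=cluster, nodegroupName=nodegroup)
    value = status_to_metric(response["nodegroup"]["status"])
    cloudwatch.put_metric_data(
        Namespace="Custom/EKS",  # assumed namespace, pick your own
        MetricData=[
            {
                "MetricName": "NodegroupHealthy",  # assumed metric name
                "Dimensions": [
                    {"Name": "ClusterName", "Value": cluster},
                    {"Name": "NodegroupName", "Value": nodegroup},
                ],
                "Value": value,
                "Unit": "Count",
            }
        ],
    )
    return value


def lambda_handler(event, context):
    # boto3 is imported here so the helpers above can be tested without AWS access.
    import boto3

    # Assumed configuration: cluster name and a comma-separated list of node groups.
    cluster = os.environ["CLUSTER_NAME"]
    nodegroups = os.environ["NODEGROUP_NAMES"].split(",")

    eks = boto3.client("eks")
    cloudwatch = boto3.client("cloudwatch")
    return {
        name: publish_nodegroup_health(eks, cloudwatch, cluster, name)
        for name in nodegroups
    }
```

Note that with this mapping, transitional states such as UPDATING also publish 0, so an alarm on the metric may fire during a planned node group update unless it evaluates several periods.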

    Once the function is ready, schedule it to run every minute (the interval is up to you):

    1. Create an EventBridge (EB) rule that triggers every minute.
    2. Set the Lambda function as the rule's target.
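The schedule can be set up in the console, but as a rough sketch the same two steps with Boto3 might look like this. The rule name and target ID are hypothetical, and `function_arn` is assumed to be the ARN of the health-check Lambda. EventBridge also needs permission to invoke the function, which the sketch grants with `add_permission`.

```python
def schedule_health_check(events, lambda_client, function_arn,
                          rule_name="eks-nodegroup-health-check"):
    """Create a 1-minute EventBridge rule that invokes the given Lambda."""
    # Step 1: a rule that fires every minute (rate(1 minute) uses the singular unit).
    rule_arn = events.put_rule(
        Name=rule_name,  # hypothetical rule name
        ScheduleExpression="rate(1 minute)",
        State="ENABLED",
    )["RuleArn"]

    # Allow EventBridge to invoke the function. This call fails with a
    # conflict error if the same StatementId was already added.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # Step 2: point the rule at the Lambda function.
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "health-check-lambda", "Arn": function_arn}],
    )
    return rule_arn


# Example call (requires AWS credentials and the boto3 package):
#   import boto3
#   schedule_health_check(boto3.client("events"), boto3.client("lambda"),
#                         "<your Lambda function ARN>")
```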

    Once CloudWatch has collected enough data points for the metric, create a CloudWatch alarm on it to send notifications by email or other channels.
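As an illustration, an alarm on such a metric might be created as below. This is a sketch under assumptions: the custom metric is taken to be published as `NodegroupHealthy` in a `Custom/EKS` namespace with `ClusterName` and `NodegroupName` dimensions (hypothetical names), and `sns_topic_arn` is assumed to be an existing SNS topic with an email subscription. The choice of three evaluation periods and treating missing data as breaching is also an assumption to tune.

```python
def create_degraded_alarm(cloudwatch, cluster, nodegroup, sns_topic_arn):
    """Alarm when the node group health metric drops below 1."""
    alarm_name = f"eks-nodegroup-unhealthy-{cluster}-{nodegroup}"
    cloudwatch.put_metric_alarm(
        AlarmName=alarm_name,
        AlarmDescription="EKS node group is not ACTIVE",
        Namespace="Custom/EKS",         # assumed namespace of the custom metric
        MetricName="NodegroupHealthy",  # assumed metric name
        Dimensions=[
            {"Name": "ClusterName", "Value": cluster},
            {"Name": "NodegroupName", "Value": nodegroup},
        ],
        Statistic="Minimum",
        Period=60,                      # matches a 1-minute publishing interval
        EvaluationPeriods=3,            # assumed: 3 consecutive unhealthy minutes
        Threshold=1,
        ComparisonOperator="LessThanThreshold",
        TreatMissingData="breaching",   # assumed: no data means the check is broken
        AlarmActions=[sns_topic_arn],
    )
    return alarm_name


# Example call (requires AWS credentials and the boto3 package):
#   import boto3
#   create_degraded_alarm(boto3.client("cloudwatch"), "my-cluster",
#                         "my-nodegroup", "<your SNS topic ARN>")
```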

