AWS ECS Monitor OpenClaw Skill - ClawHub
Do you want your AI agent to automate AWS ECS monitoring workflows? This free skill from ClawHub helps with DevOps & cloud tasks without building custom tools from scratch.
What this skill does
AWS ECS production health monitoring with CloudWatch log analysis — monitors ECS service health, ALB targets, SSL certificates, and provides deep CloudWatch log analysis for error categorization, restart detection, and production alerts.
Install
```sh
npx clawhub@latest install aws-ecs-monitor
```

Full SKILL.md

| name | version | description |
|---|---|---|
| aws-ecs-monitor | 1.0.1 | AWS ECS production health monitoring with CloudWatch log analysis — monitors ECS service health, ALB targets, SSL certificates, and provides deep CloudWatch log analysis for error categorization, restart detection, and production alerts. |
AWS ECS Monitor
Production health monitoring and log analysis for AWS ECS services.
What It Does
- Health Checks: HTTP probes against your domain, ECS service status (desired vs running), ALB target group health, SSL certificate expiry
- Log Analysis: Pulls CloudWatch logs, categorizes errors (panics, fatals, OOM, timeouts, 5xx), detects container restarts, filters health check noise
- Auto-Diagnosis: Reads health status and automatically investigates failing services via log analysis
Prerequisites
`aws` CLI configured with appropriate IAM permissions:
- `ecs:ListServices`, `ecs:DescribeServices`
- `elasticloadbalancing:DescribeTargetGroups`, `elasticloadbalancing:DescribeTargetHealth`
- `logs:FilterLogEvents`, `logs:DescribeLogGroups`

Also required:
- `curl` for HTTP health checks
- `python3` for JSON processing and log analysis
- `openssl` for SSL certificate checks (optional)
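The non-AWS prerequisites can be verified up front with a quick tool check (a minimal sketch; the `check_tools` helper below is illustrative and not part of the skill):

```sh
# check_tools: report any required CLI tool missing from PATH.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Verify the skill's dependencies (openssl is optional).
check_tools aws curl python3 openssl || echo "install the missing tools before running the skill"
```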
Configuration
All configuration is via environment variables:
| Variable | Required | Default | Description |
|---|---|---|---|
| ECS_CLUSTER | Yes | — | ECS cluster name |
| ECS_REGION | No | us-east-1 | AWS region |
| ECS_DOMAIN | No | — | Domain for HTTP/SSL checks (skip if unset) |
| ECS_SERVICES | No | auto-detect | Comma-separated service names to monitor |
| ECS_HEALTH_STATE | No | ./data/ecs-health.json | Path to write health state JSON |
| ECS_HEALTH_OUTDIR | No | ./data/ | Output directory for logs and alerts |
| ECS_LOG_PATTERN | No | /ecs/{service} | CloudWatch log group pattern ({service} is replaced) |
| ECS_HTTP_ENDPOINTS | No | — | Comma-separated name=url pairs for HTTP probes |
Directories Written
- `ECS_HEALTH_STATE` (default: `./data/ecs-health.json`) — Health state JSON file
- `ECS_HEALTH_OUTDIR` (default: `./data/`) — Output directory for logs, alerts, and analysis reports
Scripts
scripts/ecs-health.sh — Health Monitor
```sh
# Full check
ECS_CLUSTER=my-cluster ECS_DOMAIN=example.com ./scripts/ecs-health.sh

# JSON output only
ECS_CLUSTER=my-cluster ./scripts/ecs-health.sh --json

# Quiet mode (no alerts, just status file)
ECS_CLUSTER=my-cluster ./scripts/ecs-health.sh --quiet
```
Exit codes: 0 = healthy, 1 = unhealthy/degraded, 2 = script error
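Those exit codes make the health monitor easy to wire into cron or CI. A minimal sketch of acting on them (the `handle_health_exit` helper is illustrative, not part of the skill):

```sh
# Map the documented exit codes to a follow-up action.
# 0 = healthy, 1 = unhealthy/degraded, 2 = script error.
handle_health_exit() {
  case "$1" in
    0) echo "healthy" ;;
    1) echo "degraded: running auto-diagnosis" ;;
    *) echo "script error (exit $1): check AWS credentials and ECS_CLUSTER" ;;
  esac
}

# Hypothetical usage (not run here):
#   ./scripts/ecs-health.sh --quiet
#   handle_health_exit "$?"
```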
scripts/cloudwatch-logs.sh — Log Analyzer
```sh
# Pull raw logs from a service
ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh pull my-api --minutes 30

# Show errors across all services
ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh errors all --minutes 120

# Deep analysis with error categorization
ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh diagnose --minutes 60

# Detect container restarts
ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh restarts my-api

# Auto-diagnose from health state file
ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh auto-diagnose

# Summary across all services
ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh summary --minutes 120
```
Options: `--minutes N` (default: 60), `--json`, `--limit N` (default: 200), `--verbose`
Auto-Detection
When ECS_SERVICES is not set, both scripts auto-detect services from the cluster:
```sh
aws ecs list-services --cluster $ECS_CLUSTER
```
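Note that `list-services` returns full service ARNs (new-style ARNs look like `arn:aws:ecs:REGION:ACCOUNT:service/CLUSTER/NAME`), so a bare service name has to be derived from each ARN. A sketch of that step, assuming new-style ARNs:

```sh
# Extract the service name (the segment after the last slash) from an ECS service ARN.
service_name_from_arn() {
  echo "${1##*/}"
}

# Hypothetical usage against live AWS (not run here):
#   aws ecs list-services --cluster "$ECS_CLUSTER" --query 'serviceArns[]' --output text |
#     tr '\t' '\n' | while read -r arn; do service_name_from_arn "$arn"; done
```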
Log groups are resolved by pattern (default /ecs/{service}). Override with ECS_LOG_PATTERN:
```sh
# If your log groups are /ecs/prod/my-api, /ecs/prod/my-frontend, etc.
ECS_LOG_PATTERN="/ecs/prod/{service}" ECS_CLUSTER=my-cluster ./scripts/cloudwatch-logs.sh diagnose
```
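The `{service}` substitution itself is a plain string replacement. A sketch of how a log group name might be resolved (the `log_group_for` helper is illustrative, not part of the skill):

```sh
# Resolve a CloudWatch log group from ECS_LOG_PATTERN by substituting {service}.
log_group_for() {
  pattern="${ECS_LOG_PATTERN:-/ecs/{service}}"
  printf '%s\n' "$pattern" | sed "s|{service}|$1|"
}
```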
Integration
The health monitor can trigger the log analyzer for auto-diagnosis when issues are detected. Set ECS_HEALTH_OUTDIR to a shared directory and run both scripts together:
```sh
export ECS_CLUSTER=my-cluster
export ECS_DOMAIN=example.com
export ECS_HEALTH_OUTDIR=./data

# Run health check (auto-triggers log analysis on failure)
./scripts/ecs-health.sh

# Or run log analysis independently
./scripts/cloudwatch-logs.sh auto-diagnose --minutes 30
```
Error Categories
The log analyzer classifies errors into:
- `panic` — Go panics
- `fatal` — Fatal errors
- `oom` — Out of memory
- `timeout` — Connection/request timeouts
- `connection_error` — Connection refused/reset
- `http_5xx` — HTTP 500-level responses
- `python_traceback` — Python tracebacks
- `exception` — Generic exceptions
- `auth_error` — Permission/authorization failures
- `structured_error` — JSON-structured error logs
- `error` — Generic ERROR-level messages
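As a rough illustration of how this kind of categorization can work (a toy sketch covering a subset of the categories; the skill's actual analyzer is python3-based and more thorough):

```sh
# Toy classifier: map a single log line to one of the documented categories.
categorize_line() {
  case "$1" in
    *"panic:"*)                                  echo panic ;;
    *[Ff]atal*)                                  echo fatal ;;
    *"out of memory"*|*OOMKilled*)               echo oom ;;
    *[Tt]imeout*)                                echo timeout ;;
    *"connection refused"*|*"connection reset"*) echo connection_error ;;
    *Traceback*)                                 echo python_traceback ;;
    *ERROR*)                                     echo error ;;
    *)                                           echo uncategorized ;;
  esac
}
```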
Health check noise (GET/HEAD /health from ALB) is automatically filtered from error counts and HTTP status distribution.
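A filter along these lines is enough to drop that noise before counting (illustrative; the scripts' actual pattern may differ):

```sh
# Drop ALB health-check probe lines (GET/HEAD /health) from a log stream on stdin.
filter_health_noise() {
  grep -Ev '(GET|HEAD) /health'
}
```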