Alertika is a tool for alerting based on patterns of interest from data collected by Cloudwatch logs.
Alertika works by executing LogInsights queries against cloudwatch log groups and sending an SNS notification if a match is found. It keeps an in-memory database where it tracks rules that are in Alert mode and sends a recovery notification when the rule no longer matches.
Simply run make in the root folder:
make
The resulting binary will be stored in bin/alertika Current version does not support multiple aws accounts, so you need to run multiple versions of it for every account you want to monitor. Cwlogs-alert will inherit your aws configuration, so before your run it you need to specify which aws profile it will use:
export AWS_PROFILE=my-profile
The following invocation can be used to run alertika after compiling it:
$bin/alertika -config examples/config.toml
Configuration file contains 2 major sections general, and rules. The format is toml.
run_interval = "60s"
# golang text/template
sns_message_template = """
Rule {{.Rule.Name }} changed status to {{.State}}
At least {{ .Rule.NumEvents }} match(es) occured in the last {{ .Rule.Timeframe }}
Query:
{{ .Rule.Query }}
Result:
{{ .Result }}
"""
- run_interval - a string representing how long to wait before the query cycles (defaults to 60s ).
- template - a golang text/template that is used to render the message that will be send.
[rules]
# A timeframe string is a sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms" or "2h45m".
# Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
[rules.1]
name = "appAlpha exceptions"
log_group = "/aws/containerinsights/my-kubernetes-cluster/application"
num_events = 1
timeframe = "5m"
query = """
fields @message
| filter @message like /Exception/ and `kubernetes.labels.app_kubernetes_io/name` like 'appAlpha'
and `kubernetes.namespace_name` == 'production'
"""
sns_topic = "arn:aws:sns:us-west-2:1234565789:alerts"
[rules.2]
name = "appBravo exceptions"
log_group = "/aws/containerinsights/my-kubernetes-cluster/application"
num_events = 1
timeframe = "5m"
query = """
fields @message
| filter @message like /Exception/ and `kubernetes.labels.app_kubernetes_io/name` like 'appBravo'
and `kubernetes.namespace_name` == 'production'
"""
sns_topic = "arn:aws:sns:us-west-2:1234565789:alerts"
[rules.3]
name = "appCharlie exceptions"
log_group = "/aws/containerinsights/my-kubernetes-cluster/application"
num_events = 1
timeframe = "5m"
query = """
fields @message
| filter @message like /Exception/ and `kubernetes.labels.app_kubernetes_io/name` like 'appCharlie'
and `kubernetes.namespace_name` == 'production'
"""
sns_topic = "arn:aws:sns:us-west-2:1234565789:alerts"
[rule.X]
......
- name - Name of the rule
- log_group - the cloudwatchlogs log group that contains the log streams we would like to search in
- num_events - the minimum number of events our query should produce to trigger the alert
- timeframe - how long back in time we want to search (Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h")
- query - cloudwatchlog insights query to execute (use the CloudWatch Logs Insights console to validate the query)
- sns_topic - an aws sns topic to send the notification