unusual_produce_time-variables.tf
variable "unusual_produce_time_enabled" {
  type    = bool
  default = true
}

variable "unusual_produce_time_warning" {
  type    = number
  default = 10
}

variable "unusual_produce_time_critical" {
  type    = number
  default = 20
}

variable "unusual_produce_time_evaluation_period" {
  type    = string
  default = "last_30m"
}

variable "unusual_produce_time_note" {
  type    = string
  default = ""
}

variable "unusual_produce_time_docs" {
  type    = string
  default = <<EOFF
The TotalTimeMs metric family measures the total time taken to service a request (a produce, fetch-consumer, or fetch-follower request):

- produce: requests from producers to send data
- fetch-consumer: requests from consumers to get new data
- fetch-follower: requests from brokers that are the followers of a partition to get new data

Under normal conditions, this value should be fairly static, with minimal fluctuations. If you are seeing anomalous behavior, you may want to check the individual queue, local, remote, and response values to pinpoint the exact request segment that is causing the slowdown.
https://www.datadoghq.com/blog/monitoring-kafka-performance-metrics/#metric-to-watch-totaltimems
This monitor checks whether the produce request time is unusually high, which might indicate that the application is unable to perform its task.
EOFF
}

variable "unusual_produce_time_filter_override" {
  type    = string
  default = ""
}

variable "unusual_produce_time_alerting_enabled" {
  type    = bool
  default = true
}

variable "unusual_produce_time_require_full_window" {
  type    = bool
  default = true
}

variable "unusual_produce_time_priority" {
  description = "Number from 1 (high) to 5 (low)."

  type    = number
  default = 3
}