- When running the `pint watch` command, the `/health` HTTP endpoint can now be used for liveness probes.
- `# pint file/owner` comments were not validated properly for files with no rules. This is now fixed.
- Added `schema` option to the `parser` configuration block for setting the rule validation schema. This option is only used when files are parsed in strict mode, which is when the rule file path does NOT match any of the `parser:relaxed` regex values, or when `parser:relaxed` is not set at all. The default value is `prometheus`, which tells pint to expect rule files with the schema expected by Prometheus itself. If you use pint to validate rules loaded into the Thanos Rule component then set `schema` to `thanos` in your pint config file:

      parser {
        schema = "thanos"
      }

  File schema when using `schema: prometheus` (default):

      groups:
        - name: example
          rules:
            - record: ...
              expr: ...
            - alert: ...
              expr: ...

  When using `schema: thanos`:

      groups:
        - name: example
          partial_response_strategy: abort
          rules:
            - record: ...
              expr: ...
            - alert: ...
              expr: ...

- Rules configured in the `pint` config can now be locked. When a rule is locked users cannot disable it by adding `# pint disable ...` or `# pint snooze ...` comments.
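  A locked rule might look like the sketch below. It assumes the lock is switched on with a `locked` flag inside the `rule {}` block, which this entry does not spell out. The `match` condition and the `for` check are illustrative only, so confirm the exact option name against the configuration docs.

  ```hcl
  # Assumption: `locked` is the option name; match and check values are illustrative.
  rule {
    locked = true
    match {
      kind = "alerting"
    }
    for {
      severity = "bug"
      min      = "5m"
    }
  }
  ```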
- The console reporter won't color the output if the `--no-color` flag is set.
- Added rule/report check.
- pint now uses Prometheus 3.0 libraries for parsing PromQL, which adds support for the new query syntax that allows dots and UTF-8 characters in metric and label names. Example: `{"status.üp"} == 0`
- promql/rate will now report a warning if it detects a `rate(sum(...))` but doesn't have metadata to confirm whether `...` is a counter or not.
- Fixed a crash when parsing `vector()` calls with non-number arguments.
- Improved the accuracy of alerts/template check.
- Improved message formatting in the alerts/template check.
- Added `--checkstyle` flag to `pint lint` & `pint ci` for writing an XML report file in `checkstyle` format - #1129.
- Added `--json` flag to both `pint lint` and `pint ci` commands. This enables writing a JSON file with the report of all problems - #606.
- promql/fragile will now warn when alerting rules use one of the aggregation operations that can return different series on every evaluation, which can cause flapping alerts - #820.
- promql/regexp check now supports extra configuration options to disable reports on smelly selectors - #1096.
- Checks can be enabled or disabled specifically for some Prometheus rules via `rule {}` config blocks. Adding an `enable` or `disable` option with a list of check names allows you to selectively enable or disable checks only for Prometheus rules that match the given `rule {}` definition. Enabling checks only for matching rules will only work if these checks are disabled globally via the `checks { disabled = [] }` config block. For example, to disable the `promql/rate` check for all rules except alerting rules in the `rules/critical` folder:

      checks {
        // This will disable promql/rate by default.
        disabled = [ "promql/rate" ]
      }

      rule {
        match {
          path = "rules/critical/.*"
          kind = "alerting"
        }
        // This will enable promql/rate only for Prometheus rules matching all our match conditions above.
        enable = [ "promql/rate" ]
      }

- alerts/template check was refactored and will now produce more accurate results. Messages produced by this check might include details of the PromQL query fragment causing the problem if the query is complex enough.
- Don't try to create GitLab comments on unmodified lines - #1147.
- Fixed message formatting in the promql/series check.
- Added `fallbackTimeout` option to the promql/series check that controls how much time pint can spend checking other Prometheus servers for missing metrics.
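  As a rough sketch, the option would sit in the promql/series check settings. The `check "promql/series" {}` block shape and the `"5m"` duration are assumptions here, not something this entry specifies.

  ```hcl
  # Assumption: promql/series settings live in a check "promql/series" block.
  check "promql/series" {
    # Illustrative value: spend at most 5 minutes querying other servers.
    fallbackTimeout = "5m"
  }
  ```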
- Reverted the `Fixed colored output on some environments - #1106` change as it was breaking GitHub report comments.
- Fixed panics in rule/duplicate check.
- Fixed GitHub actions permissions.
- Improved accuracy of the rule/duplicate check.
- Fixed GitHub reporter trying to create pull request comments to unmodified lines - #1120.
- Fixed colored output on some environments - #1106.
- Show correct line number when reporting YAML syntax errors.
- Fixed a bug in `match` block `state` handling that caused all rules to always match any state.
- promql/regexp check will now look for smelly regexp selectors. See check docs for details.
- promql/range_query now allows configuring a custom maximum duration for range queries - #1064.
- Added `--enabled` flag to the pint command. Passing this flag will only run selected check(s).
- Added `state` option to the rule `match` block. See configuration docs for details.
- Don't try to report problems on unmodified files when using GitHub reporter.
- If there is a pint config file present then pint will now always add it to the `parser` block `exclude` list. This avoids trying to parse it as a rule file if it's stored in the same folder as rules.
- Validate `ignoreLabelsValue` option values in the pint config.
- Added `ignoreLabelsValue` to promql/series check settings.
- `include` & `exclude` options were moved from the `ci` to the `parser` configuration block. They now apply to all pint commands, not just `pint ci` - #631.
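  After the move the options might be set as in the sketch below. The paths are hypothetical, and the values are assumed to be regexp patterns, in the same style as the `include = [ "rules/.*" ]` example used elsewhere in this changelog.

  ```hcl
  parser {
    # Hypothetical paths: only files matching these patterns are parsed as rules.
    include = [ "rules/.*" ]
    # Files matching these patterns are skipped.
    exclude = [ "rules/tests/.*" ]
  }
  ```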
- Added alerts/absent check.
- Added rule/name check - #1020.
- promql/vector_matching will now report more details, including which Prometheus server reports problems and which part of the query is the issue.
- GitHub report code was refactored, it should behave as before.
- When running `pint ci` pint will now run all checks on unmodified rules when a `file/disable` comment was removed.
- Fixed false positive warnings from alerts/comparison when using `absent_over_time()`.
- GitHub reporter will now wait before making more requests if it's rate limited - #699.
- When using BitBucket reporter `pint ci` might create a comment longer than the limit allowed by BitBucket. To avoid this pint will now truncate long comments.
- Fixed false positive warnings from rule/dependency check.
- promql/series check will now generate warnings if there are `# pint disable` or `# pint rule/set` comments that do not match any valid query selector or Prometheus server.
- promql/series will now parse `rule/set` comments that target specific time series selectors in PromQL the same way as `# pint disable` comments do.
- Fixed false positive reports about removed rules when running `pint ci` on a branch that contains YAML syntax errors.
- Fixed release workflow on GitHub.
- Don't suggest using `humanize` when the alert template is already using printf to format the `$value`.
- Fixed git history parsing when running `pint ci` on a branch that includes merge commits.
- Refactored YAML syntax checks to avoid using rulefmt.Parse and effectively parsing rules twice. Some error messages will have different formatting.
- alerts/annotations and rule/label will now report when a label or annotation is required but set to an empty value.
- Fixed handling of `# pint ignore/line` comments on lines including multiple `#` characters.
- When reporting problems all messages will now use `publicURI` from each `prometheus` definition.
- Added GitLab support when running `pint ci`.
- Reduced severity of problems reported by promql/counter due to high number of false positives. It will now report warnings instead of errors.
- Fixed false positive reports from rule/duplicate when a rule file is being deleted and the rules are moved to a different file.
- Correctly handle default Prometheus retention value in promql/range_query check - #958.
- Fixed false positives from rule/duplicate check when running `pint ci` on files that are both edited and renamed in the same PR.
- rule/dependency will no longer report removed symlinks.
- Fixed compatibility with older git releases.
- Fixed `maxComments` defaulting to zero when using GitHub reporter.
- Fixed `maxComments` comment creation on GitHub - #935.
- Fixed release workflow on GitHub actions.
- GitHub and BitBucket reporters now support the `maxComments` option to limit the number of comments pint can create on a single pull request. The default value of `maxComments` is `50`.
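  A minimal sketch of lowering the limit for the GitHub reporter, assuming the option is set inside the same `repository { github { ... } }` block that holds other reporter settings. The value `10` is illustrative.

  ```hcl
  repository {
    github {
      # Illustrative value; when unset the default of 50 applies.
      maxComments = 10
    }
  }
  ```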
- query/cost will now only create reports if a query is more expensive than any of the configured limits.
- Fixed duplicated reports when using BitBucket reporter.
- promql/counter check will now consider a metric to be a counter only if all metadata entries for it use `TYPE counter`. Previously it would check for at least one metadata entry with `TYPE counter`.
- Added promql/counter check - #655.
- Checks using custom rules now accept an optional `comment` option for adding a text comment to all reported problems. Checks supporting the `comment` option:
  - alerts/annotation
  - alerts/count
  - promql/aggregate
  - query/cost
  - rule/for
  - rule/label
  - rule/link
  - rule/reject

  Example:

      for {
        comment  = "All alert rules must have `for: 5m` (or more) to avoid flaky alerts."
        severity = "bug"
        min      = "5m"
      }

- labels/conflict check will now warn if alerting rule is setting labels that are already configured as external labels.
- `pint watch` command now has two sub-commands: `pint watch glob` and `pint watch rule_files`. See watch mode docs for details.
- promql/series check will now ignore missing metrics if the query uses a fallback with `or vector(1)`. This will reduce the number of reported missing metrics for queries where the intention is to accept and ignore missing metrics. Example of a rule that will no longer report problems:

      - alert: Foo
        expr: sum(my_metric or vector(0)) > 1

- Fixed false positive reports from rule/duplicate when using symlinks.
- Fixed support for multi-document YAML files when using relaxed parsing mode - #746.
- rule/dependency check will now warn if an alerting rule that's being removed in a pull request is being used inside `ALERTS{alertname="..."}` or `ALERTS_FOR_STATE{alertname="..."}` queries.
- pint will now perform extra validation of YAML files to ensure that all values are mappings and strings. This will error on rules where, for example, a label value is an unquoted number.

  Bad rule:

      - alert: DeadMansSwitch
        expr: vector(1)
        labels:
          priority: 5

  Good rule:

      - alert: DeadMansSwitch
        expr: vector(1)
        labels:
          priority: "5"

- A large part of rule parsing code was refactored and more problems will now be deduplicated.
- Improved validation of labels and annotations.
- pint will now report any comment that looks like a pint control comment but cannot be parsed correctly.
- Fixed a crash when running `pint ci` and using `ignore/file` comments.
- Both alerts/annotation and rule/label now support more advanced validation of label and annotation values with the extra `token` option. In addition to the `value` regexp matching you can also validate values against a static list of allowed values using the new `values` option. See the documentation of both checks for details.
- More reports will now be merged into a single comment when using BitBucket.
- Fixed YAML anchor parsing.
- Fixed regexp matching for label names in rule/label.
- Fixed a few bugs in control comment parsing code.
- BitBucket tasks were not marked as resolved correctly.
- Added rule/dependency check.
- When running `pint ci` pint will now first try to parse all files in the current working directory, before checking for files modified on the current branch. This is to have a full list of all rules, which is needed for checks like the newly added rule/dependency. This can slow down `pint` runs if there are a lot of files in your repository. If there are non-rule files these may fail to parse and result in check errors. To avoid any errors or slowdowns from scanning unrelated files you might need to add a `ci` section to `.pint.hcl` with `include` and/or `exclude` options set. See examples/ci.hcl for an example config.
- `Bug` and `Fatal` severity problems are now reported as tasks when using BitBucket.
- promql/regexp will now check for more problems with vector selectors.
- `pint` will no longer run dynamic Prometheus discovery when the `--offline` flag is passed.
- Fixed alert preview link on `alerts/count` reports.
- `pint ci` now uses a new logic for deciding which rules have changed when validating pull requests. Changes that would previously be invisible to pint, like modifying comments or moving the entire rules file, will now trigger checks for affected rules.
- pint will now try to create fewer BitBucket comments by merging multiple problem reports into a single comment.
- Control comment handling code was refactored, there are some additional rules that comments must follow. See the `Control comments` section in pint docs.
- Fixed false positive reports from `promql/regexp` - #782.
- `Information` level reports using BitBucket were using the wrong comment icon.
- `alerts/count` check wasn't using the `uptime` field from `prometheus` config blocks for metric gap detection.
- Added alerts/external_labels check.
- Added support for reporting problems to TeamCity using Service Messages. To enable it run `pint --teamcity lint` or `pint --teamcity ci`.
- Problems reported to BitBucket and GitHub will now include more details.
- Added `publicURI` field to `prometheus` configuration blocks.
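  A minimal sketch of the new field, assuming `uri` stays the address pint queries while `publicURI` is the address used in links shown to users. Both hostnames are hypothetical.

  ```hcl
  prometheus "prod" {
    # Hypothetical internal address that pint sends its queries to.
    uri       = "https://prometheus.internal.example.com"
    # Hypothetical address that users can open from reported problems.
    publicURI = "https://prometheus.example.com"
  }
  ```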
- When promql/series finds that a time series used by a rule is missing it will now also check other defined Prometheus servers and add that information to the report. This allows pint to flag rules that are most likely deployed to the wrong servers, using missing scrape jobs.
- Reporting problems to BitBucket will now use comments instead of annotations. This only applies if there is an open pull request for the tested branch. If there is no open pull request, problems will be reported using code insight annotations.
- Fixed `pint watch` to correctly work with `discovery`.
- Prometheus `template` under the `discovery` block can now template `include` and `exclude` fields.
- Prometheus servers can now be dynamically configured using one of the supported discovery mechanisms: file paths or Prometheus metrics query.
- `alerts/template` check can now report problems with alerting rules when trying to use templates on a query that doesn't produce any labels at all. For example when using the `vector(...)` function:

  {% raw %}

      - alert: DeadMansSwitch
        expr: vector(1)
        annotations:
          summary: Deadman's switch on {{ $labels.instance }} is firing

  {% endraw %}

- `alerts/comparison` check can now warn if alerting rules use a query with `foo > 0 OR vector(1)`, which would always fire.
- `alerts/template` check will now check the `on(...)` clause on binary expressions. When `on(...)` is set only labels listed there will appear on result metrics. For example `app_type` here cannot appear on query results, even if it's present on `foo` time series.

      - alert: ...
        expr: foo / on (instance, app_name) bar
        annotations:
          summary: ... {{ $labels.app_type }} ...

- Added support for `keep_firing_for` in alerting rules - #713.
- Added `rule/keep_firing_for` check - #713.
- `alerts/count` check will now estimate alerts using the `keep_firing_for` field if set - #713.
- Configuration rule `match` block supports a new filter `keep_firing_for`.
- The `query/cost` check can now use Prometheus query stats to verify query evaluation time and the number of samples used by a query. See query/cost docs for details.
- Fixed a crash in `promql/series` check when Prometheus instance becomes unavailable - #682.
- Fixed false positive reports in `alerts/template` check - #681.
- Rule names were not checked for correctly, allowing for rules with empty names to pass checks.
- Fixed GitHub annotations being added to unmodified lines - #645.
- Added `exclude` option to `ci` config block - #609.
- Added `minCount` & `severity` to `alerts/count` check - #612. This allows showing the estimated alerts count only if there would be a high enough (>= `minCount`) number of alerts. Setting `severity` as well allows blocking rules that would create too many alerts.
- GitHub reporter will now include a folded list of all problems in the summary comment - #608.
- Fixed `alerts/annotation` check regexp matching - #613.
- When running `pint ci` using GitHub integration annotation comments are now reported only on modified lines - #640.
- Fixed `--base-branch` flag handling when branch name contains `/`.
- Added `--fail-on` flag to `pint lint` command - #570.
- Added `tls` section to `prometheus` configuration block - #540.
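  The entry does not list the options accepted by the `tls` section. The sketch below assumes `caCert`, `clientCert` and `clientKey` as option names and uses hypothetical file paths, so treat it as a shape to check against the configuration docs rather than a reference.

  ```hcl
  prometheus "prod" {
    uri = "https://prometheus.example.com"
    tls {
      # Assumption: option names and paths below are illustrative only.
      caCert     = "/etc/pint/ca.pem"
      clientCert = "/etc/pint/client.pem"
      clientKey  = "/etc/pint/client.key"
    }
  }
  ```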
- If a query run by pint fails because it was too expensive to run it will now be reported as a warning instead of an error.
- When validating queries using `{__name__=~"...", foo="bar"}` selectors pint could end up running queries matching a single label, like `count({foo="bar"})`, which could return too many results. This version ensures that queries always include a name matcher to avoid that.
- `alerts/template` check didn't correctly handle `label_replace()` calls in queries - #568.
- Fixed `--base-branch` flag handling. Value of this flag wasn't being used correctly - #559.
- Fixed incorrect results in `promql/series` check for time series with only a single data point.
- Fixed parsing of alert `for` field values with long durations (`for: 1d`).
- Added `--fail-on` flag to `pint ci` command - #525.
- promql/rate will now look for queries that call `rate()` on results of `sum(counter)` via recording rules. Example:

      - record: my:sum
        expr: sum(http_requests_total)
      - alert: my alert
        expr: rate(my:sum[5m])

- Added rule/for check.
- Added `owners` configuration block for setting the list of allowed rule owner values. See configuration for details.
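  A possible shape for this block is sketched below. The `allowed` option name and the team names are assumptions made for illustration, since the entry only says the block holds the list of allowed owner values.

  ```hcl
  owners {
    # Assumption: the list is set via an `allowed` option; team names are hypothetical.
    allowed = [ "team-a", "team-b" ]
  }
  ```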
- `pint lint` output will now include severity level as a text label - #524.
- Fixed a bug in `pint ci` that would cause a failure if a directory was renamed.
- Fixed false positive reports from `promql/series` check when a time series disappears from Prometheus.
- Fixed Prometheus flags parsing in `promql/range_query` check.
- Allow snoozing checks for entire file using `# pint file/snooze ...` comments.
- Added `lookbackRange` and `lookbackStep` configuration options to the promql/series check - #493.
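  A minimal sketch of setting both options, assuming they belong in a `check "promql/series" {}` block and take duration strings. The values shown are illustrative.

  ```hcl
  # Assumption: promql/series settings live in a check "promql/series" block.
  check "promql/series" {
    # Illustrative values: how far back to look, and at what resolution.
    lookbackRange = "7d"
    lookbackStep  = "5m"
  }
  ```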
- Reverted GitHub integration to use Pull Request Review API - #490.
- GitHub integration now uses Check Runs API - #478.
- `# pint file/disable` comments didn't properly handle Prometheus tags, this is fixed now.
- `prometheus` configuration blocks now accept a `tags` field with a list of tags. Tags can be used to disable or snooze specific checks on all Prometheus instances with that tag. See ignoring for details.
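  For illustration, a server definition carrying tags could look like this. The server name, URI and tag names are hypothetical.

  ```hcl
  prometheus "staging" {
    uri  = "https://staging.example.com"
    # Hypothetical tags that control comments can then refer to.
    tags = [ "staging", "testing" ]
  }
  ```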
- Added `pint_rule_file_owner` metric.
- Added ability to expand environment variables in pint configuration file. See configuration for details.
-
Use uber-go/automaxprocs to automatically set GOMAXPROCS to match Linux container CPU quota.
-
Added labels/conflict check.
- If you want to disable individual checks just for some time then you can now snooze them instead of disabling them forever. The difference between `# pint disable ...` and `# pint snooze ...` comments is that the snooze comment must include a timestamp. The selected check will be disabled until that timestamp. The timestamp must use either RFC3339 syntax or `YYYY-MM-DD` (if you don't care about time and want to snooze until a given date). Examples:

      # pint snooze 2023-01-12T10:00:00Z promql/series
      # pint snooze 2023-01-12 promql/rate
      - record: ...
        expr: ...

- Removed `cache` option from `prometheus` config blocks. Query cache will now auto-size itself as needed. If you have a config entry with the `cache` option, for example:

      prometheus "prod" {
        uri   = "https://prometheus.example.com"
        cache = 20000
      }

  then pint will fail to start. To fix this simply remove the `cache` option:

      prometheus "prod" {
        uri = "https://prometheus.example.com"
      }

- Added rule/duplicate check.
- Fixed a regression causing poor query cache hit rate.
- Added `uptime` field in the `prometheus` configuration block. This field can be used to set a custom metric used for Prometheus uptime checks; by default the `up` metric is used. If you have a Prometheus with a large number of scrape targets there might be a huge number of `up` time series, making those uptime checks slow to run. If your Prometheus is configured to scrape itself, then you most likely want to use one of the metrics exported by Prometheus, like `prometheus_build_info`:

      prometheus "prod" {
        uri    = "https://prometheus.example.com"
        uptime = "prometheus_build_info"
      }

- Refactored some queries used by promql/series check to avoid sending queries that might be very slow and/or return a huge amount of data.
- Prometheus query cache now takes into account the size of cached responses. This makes the memory usage needed for the query cache more predictable. As a result the `cache` option for the `prometheus` config block now means `the number of time series cached` instead of `the number of responses cached`, and the default for this option is now `50000`.
- promql/vector_matching was sending expensive queries resulting in high memory usage, this is now fixed.
- Added `pint_prometheus_cache_evictions_total` metric tracking the number of times cache results were evicted from query cache.
- Allow disabling individual checks for the entire file using `# pint file/disable ...` comments.
- Refactored query cache to only store queries that are requested more than once. This will avoid storing big responses that are never requested from the cache.
- Config validation will now check for duplicated `prometheus` block names.
- Fixed performance regression slowing down `pint watch` over time.
- `prometheus` configuration block now accepts an optional `headers` field, for setting request headers that will be attached to any request made to the given Prometheus server. Example:

      prometheus "protected" {
        uri     = "https://prod.example.com"
        headers = {
          "X-Auth": "secret",
          "X-User": "bob"
        }
      }

- Prometheus range query handling was rewritten to improve memory usage caused by queries returning huge number of results. As a result pint should use up to 5x less memory.
- Fixed false positive reports in promql/vector_matching for rules using `on(...)`. Example:

      sum(foo) without(instance) * on(app_name) group_left() bar

- Don't log passwords when Prometheus URI is using basic authentication.
- Fixed false positive reports in alerts/template suggesting to use `humanize` on queries that already use `round()`.
- Fixed false positive reports in alerts/comparison when `bool` modifier is used on a condition that is guarded by another condition. Example:

      alert: Foo
      expr: (foo > 1) > bool 1

- Fixed false positive reports in alerts/template warning about labels removed in a query despite being re-added by a join.
- Fixed incorrect line number reporting on BitBucket annotations.
- Fixed handling of symlinks when running `pint lint` and `pint watch` commands.
-
BitBucket only allows for annotations on modified lines, so when a high severity problem is reported on unmodified line pint will move that annotation to the first modified line, so it's still visible in BitBucket. Now pint will also add a note to that annotation to make it clear that the problem is really on a different line.
- alerts/template will now run extra checks to validate the syntax of queries executed from within alerting rule templates. Example template using a `sum(xxx` query that's missing the closing `)`:

  {% raw %}

      - alert: ...
        expr: ...
        annotations:
          summary: |
            {{ with query "sum(xxx" }}
            {{ . | first | value | humanize }}
            {{ end }}

  {% endraw %}

- If a file is ignored pint will now note that using an `Information` level annotation. This will make it more obvious that a CI check passed because pint didn't run any checks due to the file being excluded.
- Prometheus rule files can be symlinked between directories. If the symlink source and target files are in different directories they can end up querying different Prometheus servers when running pint checks. This means that when modifying a symlink target file checks must be executed against both the symlink source and target. Until now pint was ignoring symlinks but starting with this release it will try to follow them. This means that if you modify a file that has symlinks pointing to it pint will try to run checks against those symlinks too.

  NOTE: pint can only detect and check symlinks if they are located in the current working directory (as seen by the running pint process) or its sub-directories.
- Fixed a regression in promql/vector_matching that would cause a panic when parsing function calls with optional arguments.
- promql/vector_matching was incorrectly handling queries containing function calls with multiple arguments.
- Revert 'Use smaller buffers when decoding Prometheus API responses' change.
- Use smaller buffers when decoding Prometheus API responses.
- Fixed wrong request formatting for Prometheus metric metadata queries.
- Switched from using prometheus/client_golang API client to streaming JSON library prymitive/current
- Avoid reporting the same issue multiple times in `promql/rate` and `promql/regexp` checks.
- Updated Prometheus modules to v2.38.0. This adds support for the `toTime` template function.
- Fixed symlink handling when running `pint lint`.
- Remove noisy debug logs.
- Added `pint_prometheus_cache_miss_total` metric.
- Reduce log level for `File parsed` messages.
- Purge expired cache entries faster to reduce memory usage.
- Fix `absent()` handling in alerts/comparison - #330.
- Added `--min-severity` flag to the `pint lint` command. Default value is set to `warning`.
- Fix a regression in promql/vector_matching introduced in previous release.
- Fix promql/series disable comments not working when there are multiple comments on a rule.
- promql/series no longer emits an information message `metric is generated by alerts ...`.
- Don't use `topk` in promql/vector_matching check to avoid false positives.
- promql/rate check will now also validate `deriv` function usage.
- alerts/annotation check will now recommend using one of the humanize functions if the alert query is returning results based on `rate()` and the value is used in annotations.
- promql/series check now supports more flexible `# pint disable promql/series(...)` comments. Adding a comment `# pint disable promql/series({cluster="dev"})` will disable this check for any metric selector with a `cluster="dev"` matcher.
- query/cost check will now calculate how much Prometheus memory will be needed for storing the results of a given query. The `bytesPerSample` option that was previously used to calculate this was removed.
- `prometheus {}` config block now allows passing a list of paths to explicitly ignore by setting the `exclude` option. The existing `paths` option was renamed to `include` for consistency. Example migration:

      prometheus "foo" {
        [...]
        paths = [ "rules/.*" ]
      }

  becomes

      prometheus "foo" {
        [...]
        include = [ "rules/.*" ]
      }

- `pint_last_run_checks` and `pint_last_run_checks_done` were not updated properly.
- Deduplicate reports where possible to avoid showing same issue twice.
- rule/link check for validating URIs found in alerting rule annotations.
- Add more details to BitBucket CI reports.
- More compact console output when running `pint lint`.
- promql/range_query check.
- Strict parsing mode shouldn't fail on template errors, those will be later reported by the `alerts/template` check.
- All timeout options are now optional. This includes the following config blocks:
  - `prometheus { timeout = ... }`
  - `repository { bitbucket { timeout = ... } }`
  - `repository { github { timeout = ... } }`
- `pint` will now try to discover all repository settings from environment variables when run as part of a GitHub Actions workflow, so it doesn't need any `repository { github { ... } }` configuration block for that anymore. Setting `GITHUB_AUTH_TOKEN` is the only requirement for GitHub Actions now.
- Fixed line reporting on some strict parser errors.
- Added `--base-branch` flag to `pint ci` command.
- Added rate limit for Prometheus API requests with a default value of 100 requests per second. To customise it set the `rateLimit` field inside the selected `prometheus` server definition.
- Added `pint_last_run_checks` and `pint_last_run_checks_done` metrics to track progress when running `pint watch`.
- Improved range query cache efficiency.
- Added extra global configuration for `promql/series` check. See check documentation for details.
- `prometheus` server definition in `pint` config file can now accept an optional `cache` field (defaults to 10000) to allow fine tuning of built-in Prometheus API query caching.
- Added `pint_prometheus_cache_size` metric that exposes the number of entries currently in the query cache.
- Improved error reporting when strict mode is enabled.
- Fixed high memory usage when running range queries against Prometheus servers.
- The way `pint` sends API requests to Prometheus was changed to improve performance.

  The first change is that each `prometheus` server definition in the `pint` config file can now accept an optional `concurrency` field (defaults to 16) that sets a limit on how many concurrent requests that server can receive. There is a new metric that tracks how many queries are currently being run for each Prometheus server - `pint_prometheus_queries_running`.

  The second change is that range queries will now be split into smaller queries, so if `pint` needs to run a range query on one week of metrics, it will break this down into multiple queries, each for a two hour slot, and then merge all the results. Previously it would try to run a single query for a whole week and if that failed it would reduce the time range until a query succeeded.
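  As a sketch, lowering the limit for a server that is sensitive to query load might look like this. The server name, URI and the value `4` are illustrative.

  ```hcl
  prometheus "prod" {
    uri         = "https://prometheus.example.com"
    # Illustrative value; when unset the default of 16 applies.
    concurrency = 4
  }
  ```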
- Strict parsing mode didn't fully validate rule group files, this is now fixed and pint runs the same set of checks as Prometheus.
- Fixed `promql/series` handling of rules with `{__name__=~"foo|bar"}` queries.
- If Prometheus was stopped or restarted `promql/series` would occasionally report metrics as "sometimes present". This check will now try to find time ranges with no metrics in Prometheus and ignore these when checking if metrics are present.
- `pint_prometheus_queries_total` and `pint_prometheus_cache_hits_total` metrics weren't always correctly updated.
- Ignore `unknown` metric types in `promql/rate`.
- `promql/rate` check will now report if `rate()` or `irate()` function is being passed a non-counter metric.
- pint will now correctly handle YAML anchors.
- Parsing files in relaxed mode will now try to find rules inside multi-line strings - #252. This allows direct linting of k8s manifests like the one below:

      ---
      kind: ConfigMap
      apiVersion: v1
      metadata:
        name: example-app-alerts
        labels:
          app: example-app
      data:
        alerts: |
          groups:
            - name: example-app-alerts
              rules:
                - alert: Example_Is_Down
                  expr: kube_deployment_status_replicas_available{namespace="example-app"} < 1
                  for: 5m
                  labels:
                    priority: "2"
                    environment: production
                  annotations:
                    summary: "No replicas for Example have been running for 5 minutes"

- Fixed incorrect line reported when pint fails to unmarshal YAML file.
- Allow fine tuning `promql/series` check with extra control comments `# pint rule/set promql/series min-age ...` and `# pint rule/set promql/series ignore/label-value ...`. See promql/series for details.
- `promql/regexp` will report redundant use of regex anchors.
- `promql/series` will now report missing metrics only if they were last seen over 2 hours ago by default. This can be customised per rule with comments.
- Fix problem line reporting for `rule/owner` check.
- Add missing `rule/owner` documentation page.
- Fixed false positive reports from `promql/series` check when running `pint watch`.
- Added `pint_last_run_duration_seconds` metric.
- Added `--require-owner` flag support to `pint ci` command.
- Better handling of YAML unmarshal errors.
- Fixed false positive reports from `alerts/template` check when `absent()` is used inside a binary expression.
- File parse errors didn't report correct line numbers when running `pint ci`.
- File parse errors were not reported correctly when running `pint ci`.
- Handle `504 Gateway Timeout` HTTP responses from Prometheus the same as query timeouts and retry with a shorter range query.
- When running `pint ci` all checks will be skipped if any commit contains the `[skip ci]` or `[no ci]` string in the commit message.
- By default pint will now parse all files in strict mode, where all rule files must have the exact syntax Prometheus expects:

      groups:
        - name: example
          rules:
            - record: ...
              expr: ...

  Previous releases were only looking for individual rules so the `groups` object wasn't required. Now pint will fail to read any file that doesn't follow Prometheus syntax exactly. To enable the old behaviour add the `parser { relaxed = ["(.+)", ...]}` option in the config file. See Configuration for details. To enable the old (relaxed) behaviour for all files add:

      parser {
        relaxed = ["(.*)"]
      }

- Improved `promql/vector_matching` checks to detect more issues.
- Fixed reporting of problems detected on unmodified lines when running `pint ci`.
- Fixed false positive reports from `alerts/template` check when `absent()` function is receiving labels from a binary expression.
- When running `pint watch` exported metrics can include an `owner` label for each rule. This is useful to route alerts based on `pint_problem` metrics to the right team. To set an owner for all rules in a file add a `# pint file/owner $owner` comment in that file. You can also set an owner per rule by adding a `# pint rule/owner $owner` comment around the given rule. To enforce ownership comments in all files pass the `--require-owner` flag to `pint lint`.
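
A minimal sketch of how both comments might be combined in one file. The team names, the rules and the exact placement of the rule level comment are assumptions made for illustration only.

```yaml
# pint file/owner team-platform

# Hypothetical owners and rules, for illustration only.
groups:
  - name: example
    rules:
      # No rule level comment here, so only the file level owner is set.
      - alert: Example_Is_Down
        expr: up{job="example"} == 0

      # pint rule/owner team-database
      - alert: Database_Is_Down
        expr: up{job="database"} == 0
```
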
- `promql/series` check no longer runs duplicated checks on source metrics when a query depends on a recording rule added in the same PR.
- `promql/series` check was reporting that a metric stopped being exported when check queries would require a few retries.
- `promql/series` check was reporting both `Warning` and `Bug` problems for the same metric when it was using a newly added recording rule.
- Fixed false positive reports from `promql/fragile` when `foo OR bar` is used inside aggregation.
- Use more efficient queries for `promql/series` check.
- Fixed YAML parsing panics detected by Go 1.18 fuzzing.
- Improved query cache hit rate and added `pint_prometheus_cache_hits_total` metric to track the number of cache hits.
- When a range query returns a `query processing would load too many samples into memory` error and we retry it with a smaller time range, cache this information and start with that smaller time range for future calls to speed up running `pint watch`.
- Always print the number of detected problems when running `pint lint`.
- `promql/series` check was refactored and will now detect a range of problems. See promql/series for details.
- `promql/regexp` severity is now `Bug` instead of `Warning`.
- `promql/rate` check will no longer produce warnings, it will only report issues that cause queries to never return anything.
- Allow matching alerting rules by `for` field - #148. Example: `rule { match { for = ">= 10m" } }`
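
As a small illustration of what such a matcher would select, consider the two hypothetical alerting rules below. The first has a `for` value of at least `10m`, so a `for = ">= 10m"` matcher would apply to it, while the second would not be matched. Rule names and expressions are invented for this sketch.

```yaml
groups:
  - name: example
    rules:
      # for: 15m satisfies ">= 10m", so the rule block would apply here.
      - alert: Slow_Burn
        expr: up == 0
        for: 15m

      # for: 5m does not satisfy ">= 10m", so the rule block would be skipped.
      - alert: Fast_Burn
        expr: up == 0
        for: 5m
```
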
- Regexp matchers used in check rules can now reference rule fields. See Configuration for details.
- Added `filename` label to `pint_problem` metric - #170.
- Include Prometheus server URI in reported problems.
- Fixed `pint ci` handling when a file was added to git and then removed in the next commit.
- `yaml/parse` was using incorrect line numbers for errors caused by duplicated YAML keys.
- Don't use fail-over Prometheus servers in case of errors caused by the query itself, like `many-to-many matching not allowed`.
- `yaml/parse` error will be raised if a rule file contains duplicated keys, example:

      - record: foo
        expr: sum(my_metric)
        expr: sum(my_metric) without(instance)

- `prometheus` config block now allows specifying fail-over URIs using the `failover` field. If fail-over URIs are set and the main URI fails to respond pint will attempt to use them in the order specified until one of them works.
- `prometheus` config block now allows defining how upstream errors are handled using the `required` field. If `required` is set to `true` any check that depends on a remote Prometheus server will be reported as `bug` if it's unable to talk to it. If `required` is set to `false` pint will only emit `warning` level results. Default value for `required` is `false`. Set it to `true` if you want to hard fail in case of remote Prometheus issues. Note that setting it to `true` might block PRs when running `pint ci` until pint is able to talk to Prometheus again.
- Renamed `pint/parse` to `yaml/parse` and added missing documentation for it.
- Added `pint_last_run_time_seconds` and `pint_rules_parsed_total` metrics when running `pint watch`.
- `promql/comparison` only applies to alerts, so it was renamed to `alerts/comparison`.
- Online documentation hosted at cloudflare.github.io/pint was reworked.
- `alerts/count` check will now retry range queries with a shorter time window on `found duplicate series for the match group ...` errors from Prometheus.
- `pint_prometheus_queries_total` and `pint_prometheus_query_errors_total` metrics were not incremented correctly.
- Added `promql/regexp` check that will warn about unnecessary regexp matchers.
- Added `pint_prometheus_queries_total` and `pint_prometheus_query_errors_total` metrics when running `pint watch`.
- Fixed a number of bugs with `promql/vector_matching` check.
- `query/series` check was renamed to `promql/series`.
- Improved the logic of `promql/vector_matching` check.
- Removed `lines` label from `pint_problem` metric exported when running `pint watch`.
- Multiple `match` and `ignore` blocks can now be specified per each `rule`.
- Export `pint_version` metric when running `pint watch`.
- Added `--min-severity` flag to `pint watch` command.
- Added `--max-problems` flag to `pint watch` command.
- Updated Prometheus modules to v2.33.0. This adds support for `stripPort` template function.
- Added new `promql/fragile` check.
- BitBucket reports will now include a link to documentation.
- `--workers` flag to control the number of worker threads for running checks.
- More aggressive range reduction for `query processing would load too many samples into memory` errors when sending range queries to Prometheus servers.
- Added `command` filter to `match` / `ignore` blocks. This allows skipping some checks when (for example) running `pint watch` while still including them in a `pint lint` run.
- Cache responses from each Prometheus server to minimise the number of API calls.
- `pint watch` will start a daemon that will continuously check all matching rules and expose metrics describing all discovered problems.
- `alerts/annotation` and `rule/label` now include `required` flag value in `# pint disable ...` comments. Rename `# pint disable alerts/annotation($name)` to `# pint disable alerts/annotation($name:$required)` and `# pint disable rule/label($name)` to `# pint disable rule/label($name:$required)`.
- `--offline` and `--disabled` flags are now global, use `pint --offline lint` instead of `pint lint --offline`.
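
As a sketch of the `# pint disable` comment rename described above, assume a hypothetical setup where the `summary` annotation and the `team` label are both configured with `required` set to `true`. The names, the values and the comment placement are illustrative and not taken from a real config.

```yaml
- alert: Example_Is_Down
  # Before the rename this was: alerts/annotation(summary)
  # pint disable alerts/annotation(summary:true)
  # Before the rename this was: rule/label(team)
  # pint disable rule/label(team:true)
  expr: up{job="example"} == 0
```
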
- `promql/rate`, `query/series` and `promql/vector_matching` checks were not enabled for all defined `prometheus {}` blocks unless there was at least one `rule {}` block.
- `annotation` based `match` blocks didn't work correctly.
- File renames were not handled correctly when running `pint ci` on branches with multiple commits.
- Allow disabling `query/series` check for individual series using `# pint disable query/series(my_metric_name)` comments.
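
For example, a rule could silence the check for a single metric while leaving it active for the rest of the query. This is only a sketch: the rule, the second metric name and the comment placement are assumptions.

```yaml
- alert: Example_Errors
  # Only my_metric_name is excluded from the query/series check;
  # other_metric_name in the same query should still be checked.
  # pint disable query/series(my_metric_name)
  expr: rate(my_metric_name[5m]) > 0 and on(instance) other_metric_name == 1
```
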
- Fixed docker builds.
- `aggregate` check didn't report stripping required labels on queries using aggregation with no grouping labels (`sum(foo)`).
- `aggregate` check didn't test for name and label matches on alert rules.
- `template` check will now include alert query line numbers when reporting issues.
- Labels returned by `absent()` are only from equal match types (`absent(foo="bar")`, not `absent(foo=~"bar.+")`) but `alerts/template` didn't test for match type when checking for labels sourced from `absent()` queries.
- `aggregate` check was refactored and now runs a single test for both `by` and `without` conditions. As a result this check might now find issues previously undetected. Check suppression comments will need to be migrated:
  - `# pint disable promql/by` becomes `# pint disable promql/aggregate`
  - `# pint disable promql/without` becomes `# pint disable promql/aggregate`
  - `# pint ignore promql/by` becomes `# pint ignore promql/aggregate`
  - `# pint ignore promql/without` becomes `# pint ignore promql/aggregate`
- Fixed false positive reports in `aggregate` check.
- `--no-color` flag for disabling output colouring.
- Fixed duplicated warnings when multiple `rule {...}` blocks were configured.
- Specifying multiple `# pint disable ...` comments on a single rule would only apply the last comment. This now works correctly and all comments will be applied.
- Added `alerts/for` check that will look for invalid `for` values in alerting rules. This check is enabled by default.
- `comparison` check is now enabled by default and requires no configuration. Remove `comparison{ ... }` blocks from pint config file when upgrading.
- `template` check is now enabled by default and requires no configuration. Remove `template{ ... }` blocks from pint config file when upgrading.
- `rate` check is now enabled by default for all configured Prometheus servers. Remove `rate{ ... }` blocks from pint config file when upgrading.
- `series` check is now enabled by default for all configured Prometheus servers. Remove `series{ ... }` blocks from pint config file when upgrading.
- `vector_matching` check is now enabled by default for all configured Prometheus servers. Remove `vector_matching{ ... }` blocks from pint config file when upgrading.
- Support `parseDuration` function in alert templates added in Prometheus 2.32.0.
- Fixed `series` check handling of queries with `{__name__="foo"}` selectors.
- Fixed `template` check handling of `absent` calls on aggregated metrics, like `absent(sum(nonexistent{job="myjob"}))`.
- `template` check will now warn if any template is referencing a label that is not passed to `absent()`. Example:

  {% raw %}

      - alert: Foo
        expr: absent(foo{env="prod"})
        annotations:
          summary: "foo metric is missing for job {{ $labels.job }}"

  {% endraw %}

  Would generate a warning since `absent()` can only return labels that are explicitly passed to it and the above call only passes `env` label. This can be fixed by updating the query to `absent(foo{env="prod", job="bar"})`.
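
Applying the fix suggested above to the full rule might look like the sketch below. The `job="bar"` value comes from the example query and is a placeholder.

{% raw %}
```yaml
- alert: Foo
  # job is now passed to absent(), so {{ $labels.job }} can be populated.
  expr: absent(foo{env="prod", job="bar"})
  annotations:
    summary: "foo metric is missing for job {{ $labels.job }}"
```
{% endraw %}
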
- `comparison` check will now warn when alert query uses `bool` modifier after condition, which can cause alert to always fire. Example:

      - alert: Foo
        expr: rate(error_count[5m]) > bool 5

  Having `bool` as part of `> 5` condition means that the query will return value `1` when condition is met, and `0` when it's not, rather than returning the value of `rate(error_count[5m])` only when that value is `> 5`. Since all results of an alerting rule `expr` are considered alerts such alert rule could always fire, regardless of the value returned by `rate(error_count[5m])`.
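
A sketch of the same alert without the `bool` modifier, so that the query only returns series whose rate is above the threshold:

```yaml
- alert: Foo
  # Without bool the comparison acts as a filter: series at or below 5 are dropped,
  # so the alert only fires for series that actually exceed the threshold.
  expr: rate(error_count[5m]) > 5
```
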
- `comparison` check will now ignore `absent(foo)` alert queries without any condition.
- `--offline` flag for `pint ci` command.
- Fixed `template` check panic when alert query had a syntax error.
- `rule` block can now specify `ignore` conditions that have the same syntax as `match` but will disable `rule` for matching alerting and recording rules #48.
- `match` and `ignore` blocks can now filter alerting and recording rules by name. `record` will be used as name for recording rules and `alert` for alerting rules.
- `--offline` flag for `pint lint` command. When passed only checks that don't send any live queries to Prometheus server will be run.
- `template` check will now warn if a template is referencing a label that is being stripped by aggregation. Example:

  {% raw %}

      - alert: Foo
        expr: count(up) without(instance) == 0
        annotations:
          summary: "foo is down on {{ $labels.instance }}"

  {% endraw %}

  Would generate a warning since `instance` label is being stripped by `without(instance)`.
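
One possible way to resolve such a warning, shown here only as a sketch, is to stop referencing the stripped label in the template. Whether to change the template or the aggregation depends on what the alert is meant to report.

```yaml
- alert: Foo
  expr: count(up) without(instance) == 0
  annotations:
    # instance is removed by without(instance), so it is no longer referenced here.
    summary: "foo is down"
```
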
- Fixed file descriptor leak due to missing file `Close()` #69.
- Retry queries that error with `query processing would load too many samples into memory` using a smaller time range.
- `vector_matching` check for finding queries with incorrect `on()` or `ignoring()` keywords.
- `comparison` check would trigger false positive for rules using `unless` keyword.
- `# pint skip/line` placed between `# pint skip/begin` and `# pint skip/end` lines would reset ignore rules causing lines that should be ignored to be parsed.
- `value` check was replaced by `template`, which covers the same functionality and more. See docs for details.