From 8648a29a63b4ef8e9863dec7503ba902db5b431c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C4=93teris=20Caune?=

Note: if you use the "start" signal to measure job run times,
then Grace Time also specifies the maximum allowed time gap between "start" and
-"success" signals. Whenever SITE_NAME receives a "start" signal, it expects a subsequent
-"success" signal within Grace Time. If the success signal does not arrive within the
+"success" signals. Whenever SITE_NAME receives a "start" signal, it expects a subsequent
+"success" signal within Grace Time. If the success signal does not arrive within the
configured Grace Time, SITE_NAME will mark the check as failed and send out alerts.
Use "Cron" for monitoring cron jobs and other processes with more complex schedules.
@@ -60,8 +60,9 @@ alert for a late check.
[Images: Cron Schedules, OnCalendar Schedules]
Use "OnCalendar" schedules to monitor systemd timers that use OnCalendar= schedules.
-Same as with systemd timers, you can specify more than one OnCalendar expression,
-and SITE_NAME will expect a ping whenever any schedule matches.
+Same as with systemd timers, you can specify more than one OnCalendar expression
+(separated with newlines, one schedule per line), and SITE_NAME will expect a ping
+whenever any schedule matches.
See systemd.time(7) man page for complete OnCalendar syntax reference.
Grace Time is one of the configuration parameters you can set for each check.
It is the additional time to wait before sending an alert when a check is late.
Use this parameter to account for minor, expected deviations in job
-execution times. If you use "start" signals to
-measure job execution time, Grace Time also sets the
-maximum allowed time gap between "start" and "success" signals. If a job
-sends a "start" signal but does not send a "success" signal within grace time,
+execution times.
+When a check is considered late depends on whether the check uses a simple
+or cron schedule, and whether or not you are
+tracking job durations using the "start" events.
+For simple schedules, the check is late when the check's configured period has passed.
+For example, consider a periodic task that should run every hour, and the gaps between
+runs should not deviate by more than 5 minutes (Period = 1 hour,
+Grace Time = 5 minutes). And let's say the last successful ping arrived at 12:00.
+For cron and OnCalendar schedules, the check enters the late state at the exact
+moment when the current wall clock time matches the schedule. Let's consider a cron
+job with the schedule 10 * * * * (10 minutes past every hour) and grace time of 5 minutes.
+And let's say the last successful ping arrived at 12:30.
If you use "start" signals to measure job execution time,
+Grace Time also sets the maximum allowed time gap between "start" and "success" signals.
+If a job sends a "start" signal but does not send a "success" signal within grace time,
SITE_NAME will assume failure and send out alerts.
An Integration is a specific method for delivering monitoring alerts when a check's
@@ -96,7 +119,7 @@ For each check, you can specify which integrations it should use.
Configuring notifications.
Project. To keep things organized, you can group checks and integrations in Projects.
-Your account starts with a single default project, but you can create
+Your account starts with a single default project, but you can create
additional projects as needed. You can transfer existing checks between
projects while preserving their configuration and ping URLs.
Each project has a configurable name, a separate set of API keys, and a separate
diff --git a/templates/docs/introduction.md b/templates/docs/introduction.md
index 1238103981c6..404e57166fbc 100644
--- a/templates/docs/introduction.md
+++ b/templates/docs/introduction.md
@@ -55,7 +55,7 @@ Each check is always in one of the following states, depicted by a status icon:
: **Paused**. You can manually pause the monitoring of specific checks. For example,
-    if a frequently running cron job has a known problem, and a fix is in the works
+    if a frequently running cron job has a known problem, and a fix is in the works
    but not yet ready, you can pause monitoring the corresponding check temporarily
    to avoid unwanted alerts about a known issue.

@@ -96,10 +96,35 @@ Read more about Ping URLs in [Pinging API](http_api/).
**Grace Time** is one of the configuration parameters you can set for each check.
It is the additional time to wait before sending an alert when a check is late.
Use this parameter to account for minor, expected deviations in job
-execution times. If you use "start" signals to
-[measure job execution time](measuring_script_run_time/), Grace Time also sets the
-maximum allowed time gap between "start" and "success" signals. If a job
-sends a "start" signal but does not send a "success" signal within grace time,
+execution times.
+
+When a check is considered *late* depends on whether the check uses a simple
+or cron schedule, and whether or not you are
+[tracking job durations](measuring_script_run_time/) using the "start" events.
+
+For **simple schedules**, the check is late when the check's configured period has passed.
+For example, consider a periodic task that should run every hour, and the gaps between
+runs should not deviate by more than 5 minutes (Period = 1 hour,
+Grace Time = 5 minutes). And let's say the last successful ping arrived at 12:00.
+
+* At 13:00 the check will be declared late (because 1 hour will have passed
+  since the last ping).
+* At 13:05 the check will be declared down and the alerts will go out (because
+  1 hour + 5 minutes will have passed since the last ping).
+
+For **cron and OnCalendar schedules**, the check enters the late state at the exact
+moment when the current wall clock time matches the schedule. Let's consider a cron
+job with the schedule `10 * * * *` (10 minutes past every hour) and grace time of 5 minutes.
+And let's say the last successful ping arrived at 12:30.
+
+* At 13:10 the check will be declared late (because 13:10 is the next scheduled time
+  the cron job is expected to send a ping according to the cron schedule).
+* At 13:15 the check will be declared down and the alerts will go out (because 5
+  minutes will have passed since the time the cron job was expected to check in).
+
+If you use "start" signals to [measure job execution time](measuring_script_run_time/),
+Grace Time also sets the maximum allowed time gap between "start" and "success" signals.
+If a job sends a "start" signal but does not send a "success" signal within grace time,
SITE_NAME will assume failure and send out alerts.

---
@@ -115,7 +140,7 @@ For more information on integrations, see
---

**Project**. To keep things organized, you can group checks and integrations in **Projects**.
-Your account starts with a single default project, but you can create
+Your account starts with a single default project, but you can create
additional projects as needed. You can transfer existing checks between
projects while preserving their configuration and ping URLs.

@@ -124,4 +149,3 @@ project team. The project's team is the set of people you have granted read-only
read-write access to the project.

For more information on projects, see [Projects and teams](projects_teams/).
-
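
The late/down arithmetic described in the Grace Time hunk can be sketched in a few lines of Python. This is only an illustration of the documented examples, not SITE_NAME's actual implementation: the function names `simple_schedule_deadlines`, `hourly_cron_deadlines`, and `start_deadline` are hypothetical, the calendar date is arbitrary, and the cron helper handles only schedules of the form `<minute> * * * *` rather than full cron syntax.

```python
from datetime import datetime, timedelta


def simple_schedule_deadlines(last_ping, period, grace):
    """Hypothetical helper: (late_at, down_at) for a simple schedule."""
    # The check becomes late once the period has elapsed since the last ping,
    # and goes down once the grace time has elapsed on top of that.
    late_at = last_ping + period
    return late_at, late_at + grace


def hourly_cron_deadlines(last_ping, minute, grace):
    """Hypothetical helper: (late_at, down_at) for '<minute> * * * *' only."""
    # Find the first wall-clock time after last_ping whose minute matches.
    late_at = last_ping.replace(minute=minute, second=0, microsecond=0)
    if late_at <= last_ping:
        late_at += timedelta(hours=1)
    return late_at, late_at + grace


def start_deadline(start_at, grace):
    """Hypothetical helper: latest time a "success" may follow a "start"."""
    return start_at + grace


grace = timedelta(minutes=5)

# Simple schedule: Period = 1 hour, last ping at 12:00 (arbitrary date).
late, down = simple_schedule_deadlines(
    datetime(2024, 1, 1, 12, 0), timedelta(hours=1), grace
)
print(late.strftime("%H:%M"), down.strftime("%H:%M"))  # 13:00 13:05

# Cron schedule "10 * * * *", last ping at 12:30.
late, down = hourly_cron_deadlines(datetime(2024, 1, 1, 12, 30), 10, grace)
print(late.strftime("%H:%M"), down.strftime("%H:%M"))  # 13:10 13:15
```

The two printed pairs match the bullet lists in the patch: the late moment depends on the schedule type, while the down moment is always the late moment plus Grace Time.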