Add examples in docs / Introduction / Grace Time
cuu508 committed Jan 30, 2025
1 parent b11f239 commit 8648a29
Showing 4 changed files with 70 additions and 21 deletions.
9 changes: 5 additions & 4 deletions templates/docs/configuring_checks.html-fragment
@@ -38,8 +38,8 @@ execution times.</li>
</ul>
<p>Note: if you use the "start" signal to <a href="../measuring_script_run_time/">measure job run times</a>,
then Grace Time also specifies the maximum allowed time gap between "start" and
"success" signals. Whenever SITE_NAME receives a "start" signal, it expects a subsequent
"success" signal within Grace Time. If the success signal does not arrive within the
"success" signals. Whenever SITE_NAME receives a "start" signal, it expects a subsequent
"success" signal within Grace Time. If the success signal does not arrive within the
configured Grace Time, SITE_NAME will mark the check as failed and send out alerts.</p>
<h2>Cron Schedules</h2>
<p>Use "Cron" for monitoring cron jobs and other processes with more complex schedules.
@@ -60,8 +60,9 @@ alert for a late check.</li>
</ul>
<h2>OnCalendar Schedules</h2>
<p>Use "OnCalendar" schedules to monitor systemd timers that use <code>OnCalendar=</code> schedules.
Same as with systemd timers, you can specify more than one <code>OnCalendar</code> expression,
and SITE_NAME will expect a ping whenever any schedule matches.</p>
Same as with systemd timers, you can specify more than one <code>OnCalendar</code> expression
(separated with newlines, one schedule per line), and SITE_NAME will expect a ping
whenever any schedule matches.</p>
<p>See <a href="https://www.man7.org/linux/man-pages/man7/systemd.time.7.html#CALENDAR_EVENTS">systemd.time(7) man page</a>
for complete OnCalendar syntax reference.</p>
<p><img alt="Editing cron schedule" src="IMG_URL/edit_oncalendar_schedule.png" /></p>
9 changes: 5 additions & 4 deletions templates/docs/configuring_checks.md
@@ -45,8 +45,8 @@ execution times.

Note: if you use the "start" signal to [measure job run times](../measuring_script_run_time/),
then Grace Time also specifies the maximum allowed time gap between "start" and
"success" signals. Whenever SITE_NAME receives a "start" signal, it expects a subsequent
"success" signal within Grace Time. If the success signal does not arrive within the
"success" signals. Whenever SITE_NAME receives a "start" signal, it expects a subsequent
"success" signal within Grace Time. If the success signal does not arrive within the
configured Grace Time, SITE_NAME will mark the check as failed and send out alerts.

## Cron Schedules
@@ -73,8 +73,9 @@ alert for a late check.
## OnCalendar Schedules

Use "OnCalendar" schedules to monitor systemd timers that use `OnCalendar=` schedules.
Same as with systemd timers, you can specify more than one `OnCalendar` expression,
and SITE_NAME will expect a ping whenever any schedule matches.
Same as with systemd timers, you can specify more than one `OnCalendar` expression
(separated with newlines, one schedule per line), and SITE_NAME will expect a ping
whenever any schedule matches.

See [systemd.time(7) man page](https://www.man7.org/linux/man-pages/man7/systemd.time.7.html#CALENDAR_EVENTS)
for complete OnCalendar syntax reference.
35 changes: 29 additions & 6 deletions templates/docs/introduction.html-fragment
@@ -48,7 +48,7 @@ When a check transitions into the "Down" state, SITE_NAME sends alert
messages via the configured integrations.</dd>
<dt><span class="status ic-paused"></span></dt>
<dd><strong>Paused</strong>. You can manually pause the monitoring of specific checks. For example,
if a frequently running cron job has a known problem, and a fix is in the works
if a frequently running cron job has a known problem, and a fix is in the works
but not yet ready, you can pause monitoring the corresponding check temporarily to
avoid unwanted alerts about a known issue.</dd>
<dt><span class="status ic-up"></span><div class="spinner started"></div></dt>
@@ -82,10 +82,33 @@ anybody can send telemetry signals to your checks and mess with your monitoring.
<p><strong>Grace Time</strong> is one of the configuration parameters you can set for each check.
It is the additional time to wait before sending an alert when a check
is late. Use this parameter to account for minor, expected deviations in job
execution times. If you use "start" signals to
<a href="measuring_script_run_time/">measure job execution time</a>, Grace Time also sets the
maximum allowed time gap between "start" and "success" signals. If a job
sends a "start" signal but does not send a "success" signal within grace time,
execution times.</p>
<p>When a check is considered <em>late</em> depends on whether the check uses a simple,
cron, or OnCalendar schedule, and on whether you are
<a href="measuring_script_run_time/">tracking job durations</a> using "start" events.</p>
<p>For <strong>simple schedules</strong>, the check is late when its configured period has passed since the last ping.
For example, consider a periodic task that should run every hour, where the gaps between
runs should not deviate by more than 5 minutes (Period = 1 hour,
Grace Time = 5 minutes). Suppose the last successful ping arrived at 12:00.</p>
<ul>
<li>At 13:00 the check will be declared late (because 1 hour will have passed
since the last ping).</li>
<li>At 13:05 the check will be declared down and the alerts will go out (because
1 hour + 5 minutes will have passed since the last ping).</li>
</ul>
<p>For <strong>cron and OnCalendar schedules</strong>, the check enters the late state at the exact
moment when the current wall clock time matches the schedule. Consider a cron
job with the schedule <code>10 * * * *</code> (10 minutes past every hour) and a Grace Time of 5 minutes.
Suppose the last successful ping arrived at 12:30.</p>
<ul>
<li>At 13:10 the check will be declared late (because 13:10 is the next scheduled time
the cron job is expected to send a ping according to the cron schedule).</li>
<li>At 13:15 the check will be declared down and the alerts will go out (because 5
minutes will have passed since the time the cron job was expected to check in).</li>
</ul>
<p>If you use "start" signals to <a href="measuring_script_run_time/">measure job execution time</a>,
Grace Time also sets the maximum allowed time gap between "start" and "success" signals.
If a job sends a "start" signal but does not send a "success" signal within Grace Time,
SITE_NAME will assume failure and send out alerts.</p>
<hr />
<p>An <strong>Integration</strong> is a specific method for delivering monitoring alerts when a check's
@@ -96,7 +119,7 @@ For each check, you can specify which integrations it should use.</p>
<a href="configuring_notifications/">Configuring notifications</a>.</p>
<hr />
<p><strong>Project</strong>. To keep things organized, you can group checks and integrations in <strong>Projects</strong>.
Your account starts with a single default project, but you can create
Your account starts with a single default project, but you can create
additional projects as needed. You can transfer existing checks between projects
while preserving their configuration and ping URLs.</p>
<p>Each project has a configurable name, a separate set of API keys, and a separate
38 changes: 31 additions & 7 deletions templates/docs/introduction.md
@@ -55,7 +55,7 @@ Each check is always in one of the following states, depicted by a status icon:

<span class="status ic-paused"></span>
: **Paused**. You can manually pause the monitoring of specific checks. For example,
if a frequently running cron job has a known problem, and a fix is in the works
if a frequently running cron job has a known problem, and a fix is in the works
but not yet ready, you can pause monitoring the corresponding check temporarily to
avoid unwanted alerts about a known issue.

@@ -96,10 +96,35 @@ Read more about Ping URLs in [Pinging API](http_api/).
**Grace Time** is one of the configuration parameters you can set for each check.
It is the additional time to wait before sending an alert when a check
is late. Use this parameter to account for minor, expected deviations in job
execution times. If you use "start" signals to
[measure job execution time](measuring_script_run_time/), Grace Time also sets the
maximum allowed time gap between "start" and "success" signals. If a job
sends a "start" signal but does not send a "success" signal within grace time,
execution times.

When a check is considered *late* depends on whether the check uses a simple,
cron, or OnCalendar schedule, and on whether you are
[tracking job durations](measuring_script_run_time/) using "start" events.

For **simple schedules**, the check is late when its configured period has passed since the last ping.
For example, consider a periodic task that should run every hour, where the gaps between
runs should not deviate by more than 5 minutes (Period = 1 hour,
Grace Time = 5 minutes). Suppose the last successful ping arrived at 12:00.

* At 13:00 the check will be declared late (because 1 hour will have passed
since the last ping).
* At 13:05 the check will be declared down and the alerts will go out (because
1 hour + 5 minutes will have passed since the last ping).
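
The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the rule described above, not SITE_NAME's actual implementation; the function name `simple_schedule_deadlines` and the example date are assumptions made for the sketch.

```python
from datetime import datetime, timedelta


def simple_schedule_deadlines(last_ping, period, grace):
    """Return (late_at, down_at) for a simple schedule.

    Hypothetical helper: the check is late once `period` has passed since
    the last ping, and down once `grace` has passed on top of that.
    """
    late_at = last_ping + period
    down_at = late_at + grace
    return late_at, down_at


# Period = 1 hour, Grace Time = 5 minutes, last ping at 12:00
# (the calendar date is arbitrary).
last_ping = datetime(2025, 1, 30, 12, 0)
late_at, down_at = simple_schedule_deadlines(
    last_ping, timedelta(hours=1), timedelta(minutes=5)
)
print(late_at.strftime("%H:%M"), down_at.strftime("%H:%M"))  # 13:00 13:05
```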

For **cron and OnCalendar schedules**, the check enters the late state at the exact
moment when the current wall clock time matches the schedule. Consider a cron
job with the schedule `10 * * * *` (10 minutes past every hour) and a Grace Time of 5 minutes.
Suppose the last successful ping arrived at 12:30.

* At 13:10 the check will be declared late (because 13:10 is the next scheduled time
the cron job is expected to send a ping according to the cron schedule).
* At 13:15 the check will be declared down and the alerts will go out (because 5
minutes will have passed since the time the cron job was expected to check in).
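
The same example can be worked through in Python. The sketch below deliberately handles only schedules of the form `M * * * *` (a fixed minute past every hour) so that it needs no third-party cron parser; a general implementation would need a real cron expression library. The helper name `next_hourly_run` and the example date are assumptions, and this is not SITE_NAME's actual code.

```python
from datetime import datetime, timedelta


def next_hourly_run(after, minute):
    """Return the first time strictly after `after` matching `minute * * * *`.

    Hypothetical helper covering only the "N minutes past every hour" case.
    """
    candidate = after.replace(minute=minute, second=0, microsecond=0)
    if candidate <= after:
        candidate += timedelta(hours=1)
    return candidate


# Schedule `10 * * * *`, Grace Time = 5 minutes, last ping at 12:30
# (the calendar date is arbitrary).
last_ping = datetime(2025, 1, 30, 12, 30)
grace = timedelta(minutes=5)

late_at = next_hourly_run(last_ping, minute=10)  # next scheduled run
down_at = late_at + grace                        # alerts go out here
print(late_at.strftime("%H:%M"), down_at.strftime("%H:%M"))  # 13:10 13:15
```

Note the difference from simple schedules: the late time comes from the schedule itself, not from adding a period to the time of the last ping.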

If you use "start" signals to [measure job execution time](measuring_script_run_time/),
Grace Time also sets the maximum allowed time gap between "start" and "success" signals.
If a job sends a "start" signal but does not send a "success" signal within Grace Time,
SITE_NAME will assume failure and send out alerts.
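
As a rough illustration of the "start" rule, the following Python sketch decides whether a run has exceeded Grace Time. The function `run_is_overdue` and its parameters are hypothetical names chosen for this example, and the times are made up; it models the rule as stated above rather than SITE_NAME's internals.

```python
from datetime import datetime, timedelta


def run_is_overdue(start_at, grace, now, success_at=None):
    """Return True if no "success" signal arrived within `grace` of "start".

    Hypothetical helper: `success_at` is None while the job has not yet
    reported success.
    """
    deadline = start_at + grace
    if success_at is not None and success_at <= deadline:
        return False  # success arrived in time
    return now > deadline


start_at = datetime(2025, 1, 30, 12, 0)
grace = timedelta(minutes=5)

# Still running at 12:04: within Grace Time, no alert yet.
print(run_is_overdue(start_at, grace, now=datetime(2025, 1, 30, 12, 4)))  # False
# No "success" by 12:06: the run is assumed to have failed.
print(run_is_overdue(start_at, grace, now=datetime(2025, 1, 30, 12, 6)))  # True
```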

---
@@ -115,7 +140,7 @@ For more information on integrations, see
---

**Project**. To keep things organized, you can group checks and integrations in **Projects**.
Your account starts with a single default project, but you can create
Your account starts with a single default project, but you can create
additional projects as needed. You can transfer existing checks between projects
while preserving their configuration and ping URLs.

@@ -124,4 +149,3 @@ project team. The project's team is the set of people you have granted read-only
read-write access to the project.

For more information on projects, see [Projects and teams](projects_teams/).
