Rearrange monitor docs and add alerters
Showing 21 changed files with 574 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,186 @@ | ||
Alerter Configuration | ||
===================== | ||
|
||
Alerters send one-off alerts when a monitor fails. They can also send an alert | ||
when it succeeds again. | ||
|
||
An alerter knows if it is urgent or not; if a monitor defined as non-urgent | ||
fails, an urgent alerter will not trigger for it. This means you can avoid | ||
receiving SMS alerts for things which don’t require your immediate attention. | ||
|
||
Alerters can also have a time configuration for hours when they are or are not | ||
allowed to alert. They can also send an alert at the end of the silence period | ||
for any monitors which are currently failed. | ||
|
||
Alerters are defined in the main configuration file, which by default is :file:`monitor.ini`. The section name is the name of your alerter, which you should then add to the ``alerters`` configuration value. | ||
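
As a minimal sketch of how these pieces fit together, the fragment below defines one alerter and enables it. It assumes the ``alerters`` value lives in a ``[reporting]`` section, and the alerter name ``my_alerter`` is a hypothetical placeholder.

.. code-block:: ini

   # assumption: the alerters value is set in the [reporting] section
   [reporting]
   alerters=my_alerter

   # the section name is the alerter's name (hypothetical here)
   [my_alerter]
   type=some-alerter-type
   # alerter-specific options go here
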
|
||
.. contents:: | ||
|
||
Common options | ||
-------------- | ||
|
||
These options are common to all alerter types. | ||
|
||
.. confval:: type | ||
|
||
:type: string | ||
:required: true | ||
|
||
the type of the alerter; one of those in the list below. | ||
|
||
.. confval:: depend | ||
|
||
:type: comma-separated list of string | ||
:required: false | ||
:default: none | ||
|
||
a list of monitors this alerter depends on. If any of them fail, no attempt will be made to send the alert. | ||
|
||
.. confval:: limit | ||
|
||
:type: integer | ||
:required: false | ||
:default: ``1`` | ||
|
||
the number of times a monitor must have failed before this alerter fires for it. You can use this to escalate an alert to another email address or text messaging, for example. | ||
|
||
.. confval:: dry_run | ||
|
||
:type: boolean | ||
:required: false | ||
:default: ``false`` | ||
|
||
makes an alerter do everything except actually send the message; instead, it prints some information about what it would do.
|
||
.. confval:: ooh_success | ||
|
||
:type: boolean | ||
:required: false | ||
:default: ``false`` | ||
|
||
makes an alerter trigger its success action even if out of hours | ||
|
||
.. confval:: groups | ||
|
||
:type: comma-separated list of string | ||
:required: false | ||
:default: ``default`` | ||
|
||
list of monitor groups this alerter should fire for. See the :ref:`group<monitor-group>` setting for monitors. | ||
|
||
.. confval:: only_failures | ||
|
||
:type: boolean | ||
:required: false | ||
:default: ``false`` | ||
|
||
if true, only send alerts for failures (or catchups) | ||
|
||
.. _alerter-tz: | ||
|
||
.. confval:: tz | ||
|
||
:type: string | ||
:required: false | ||
:default: ``UTC`` | ||
|
||
the timezone to use in alert messages. See also :confval:`times_tz`. | ||
|
||
.. confval:: repeat | ||
|
||
:type: boolean | ||
:required: false | ||
:default: ``false`` | ||
|
||
fire this alerter (for a failed monitor) every iteration | ||
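
The following sketch shows how the common options might be combined to escalate an alert. The alerter names, the ``servers`` group, the ``internet`` monitor and the timezone name are all hypothetical, and the alerter-specific values are omitted. Boolean values are written as ``1`` here, which is an assumption about the accepted syntax.

.. code-block:: ini

   # fires on the first failure, and again on recovery
   [first_line]
   type=some-alerter-type
   limit=1
   groups=default,servers

   # fires only once a monitor has failed five times, and never for recoveries
   [escalation]
   type=some-alerter-type
   limit=5
   groups=default,servers
   only_failures=1
   # skip sending if the (hypothetical) "internet" monitor is itself failing
   depend=internet
   tz=Europe/London
   # print what would be sent instead of sending it, while testing
   dry_run=1
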
|
||
Time restrictions | ||
----------------- | ||
|
||
All alerters accept time period configuration. By default, an alerter is active at all times, so you will always immediately receive an alert at the point where a monitor has failed enough (more times than the limit). To set limits on when an alerter can send, use the configuration values below. | ||
|
||
Note that the :confval:`times_tz` option sets the timezone in which all of these values are interpreted. The default is the local timezone of the host evaluating the logic.
|
||
.. confval:: day | ||
|
||
:type: comma-separated list of integer | ||
:required: false | ||
:default: all days | ||
|
||
which days an alerter can operate on. ``0`` is Monday, ``6`` is Sunday. | ||
|
||
.. confval:: times_type | ||
|
||
:type: string | ||
:required: false | ||
:default: ``always`` | ||
|
||
one of ``always``, ``only``, or ``not``. ``only`` means that the limits specify the period the alerter is allowed to operate in. ``not`` means they specify the period in which it is not allowed to operate; outside of that period it is allowed.
|
||
.. confval:: time_lower | ||
|
||
:type: string | ||
:required: when :confval:`times_type` is not ``always`` | ||
|
||
the lower end of the time range. Must be lower than :confval:`time_upper`. The format is ``HH:mm`` in 24-hour clock. | ||
|
||
.. confval:: time_upper | ||
|
||
:type: string | ||
:required: when :confval:`times_type` is not ``always`` | ||
|
||
the upper end of the time range. Must be higher than :confval:`time_lower`. The format is ``HH:mm`` in 24-hour clock.
|
||
.. confval:: times_tz | ||
|
||
:type: string | ||
:required: false | ||
:default: the host's local time | ||
|
||
the timezone for :confval:`day`, :confval:`time_lower` and :confval:`time_upper` to be interpreted in. | ||
|
||
.. confval:: delay | ||
|
||
:type: boolean | ||
:required: false | ||
:default: ``false`` | ||
|
||
set to true to have the alerter send a "catch-up" alert about a failed monitor if it failed during a time the alerter was not allowed to send, and is still failed as the alerter enters the time it is allowed to send. If the monitor fails and recovers during the not-allowed time, no alert is sent either way. | ||
|
||
|
||
Time examples | ||
^^^^^^^^^^^^^ | ||
|
||
These snippets omit the alerter-specific configuration values. | ||
|
||
Don't trigger during the hours I'm in the office (8:30am to 5:30pm, Monday to Friday): | ||
|
||
.. code-block:: ini

   [out_of_hours]
   type=some-alerter-type
   times_type=not
   time_lower=08:30
   time_upper=17:30
   days=0,1,2,3,4

Don't send at antisocial times, but let me know later if something broke and hasn't recovered yet:
|
||
.. code-block:: ini

   [polite_alerter]
   type=some-alerter-type
   times_type=only
   time_lower=07:30
   time_upper=22:00
   delay=1

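
Only send during working hours in a specific timezone, regardless of where the host is running. This is a sketch: the alerter name is hypothetical, and the ``Europe/London`` style of timezone name is an assumption about the accepted format.

.. code-block:: ini

   [london_office_hours]
   type=some-alerter-type
   times_type=only
   time_lower=09:00
   time_upper=17:00
   # interpret the limits above in this timezone rather than the host's
   times_tz=Europe/London
   # also show times in alert messages in this timezone
   tz=Europe/London
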
Alerters | ||
-------- | ||
|
||
.. note:: The ``type`` of the alerter is the first word in its heading. | ||
|
||
.. toctree:: | ||
:glob: | ||
|
||
alerters/* |
@@ -0,0 +1,45 @@ | ||
46elks - 46elks notifications | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
.. include:: ../creds-warning.rst | ||
|
||
You will need to register for an account at 46elks_. | ||
|
||
.. _46elks: https://46elks.com/ | ||
|
||
.. confval:: username | ||
|
||
:type: string | ||
:required: true | ||
|
||
your 46elks username
|
||
.. confval:: password | ||
|
||
:type: string | ||
:required: true | ||
|
||
your 46elks password
|
||
.. confval:: target | ||
|
||
:type: string | ||
:required: true | ||
|
||
46elks target value | ||
|
||
.. confval:: sender | ||
|
||
:type: string | ||
:required: false | ||
:default: ``SmplMntr`` | ||
|
||
your SMS sender field. Start with a ``+`` if using a phone number. | ||
|
||
.. confval:: api_host | ||
|
||
:type: string | ||
:required: false | ||
:default: ``api.46elks.com`` | ||
|
||
API endpoint to use |
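
A minimal sketch of a 46elks alerter section. The alerter name, credentials and target number are placeholders, not real values, and the example assumes that ``target`` is the phone number to send to.

.. code-block:: ini

   [sms_46elks]
   type=46elks
   # placeholder credentials: do not commit real ones
   username=YOUR_46ELKS_USERNAME
   password=YOUR_46ELKS_PASSWORD
   # assumption: the destination phone number
   target=+46700000000
   sender=SmplMntr
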
@@ -0,0 +1,33 @@ | ||
bulksms - SMS via BulkSMS | ||
^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
.. warning:: Do not commit your credentials to a public repo! | ||
|
||
.. confval:: sender | ||
|
||
:type: string | ||
:required: false | ||
:default: ``SmplMntr`` | ||
|
||
who the SMS should appear to be from. Max 11 chars, and best to stick to alphanumerics. | ||
|
||
.. confval:: username | ||
|
||
:type: string | ||
:required: true | ||
|
||
your BulkSMS username | ||
|
||
.. confval:: password | ||
|
||
:type: string | ||
:required: true | ||
|
||
your BulkSMS password | ||
|
||
.. confval:: target | ||
|
||
:type: string | ||
:required: true | ||
|
||
the number to send the SMS to. Specify using country code and number, with no ``+`` or international prefix. For example, ``447777123456`` for a UK mobile. |
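
A minimal sketch of a BulkSMS alerter section. The alerter name and credentials are placeholders; the target reuses the UK mobile format described above.

.. code-block:: ini

   [sms_bulksms]
   type=bulksms
   # placeholder credentials: do not commit real ones
   username=YOUR_BULKSMS_USERNAME
   password=YOUR_BULKSMS_PASSWORD
   # country code and number, with no + or international prefix
   target=447777123456
   # at most 11 characters, alphanumeric
   sender=SmplMntr
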
@@ -0,0 +1,61 @@ | ||
email - send via SMTP | ||
^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
.. warning:: Do not commit your credentials to a public repo! | ||
|
||
.. confval:: host | ||
|
||
:type: string | ||
:required: true | ||
|
||
the email server to connect to | ||
|
||
.. confval:: port | ||
|
||
:type: integer | ||
:required: false | ||
:default: ``25`` | ||
|
||
the port to connect on | ||
|
||
.. confval:: from | ||
|
||
:type: string | ||
:required: true | ||
|
||
the email address to give as the sender | ||
|
||
.. confval:: to | ||
|
||
:type: string | ||
:required: true | ||
|
||
the email address to send to. You can specify multiple addresses by separating with ``;``. | ||
|
||
.. confval:: cc | ||
|
||
:type: string | ||
:required: false | ||
|
||
the email address to cc to. You can specify multiple addresses by separating with ``;``. | ||
|
||
.. confval:: username | ||
|
||
:type: string | ||
:required: false | ||
|
||
the username to log in to the SMTP server with | ||
|
||
.. confval:: password | ||
|
||
:type: string | ||
:required: false | ||
|
||
the password to log in to the SMTP server with | ||
|
||
.. confval:: ssl | ||
|
||
:type: string | ||
:required: false | ||
|
||
specify ``starttls`` to use StartTLS. Specify ``yes`` to use SMTP SSL. Otherwise, no SSL is used at all.
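
A minimal sketch of an email alerter that authenticates and uses StartTLS. The alerter name, host, addresses and credentials are placeholders, and port ``587`` is an assumption about the server (the default is ``25``).

.. code-block:: ini

   [email_admins]
   type=email
   host=smtp.example.com
   # assumption: this server accepts submissions on 587
   port=587
   from=monitor@example.com
   # multiple recipients are separated with ;
   to=admin@example.com;oncall@example.com
   cc=team@example.com
   # placeholder credentials: do not commit real ones
   username=YOUR_SMTP_USERNAME
   password=YOUR_SMTP_PASSWORD
   ssl=starttls
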
@@ -0,0 +1,41 @@ | ||
execute - run external command | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
.. confval:: fail_command | ||
|
||
:type: string | ||
:required: false | ||
|
||
command to execute when a monitor fails | ||
|
||
.. confval:: success_command | ||
|
||
:type: string | ||
:required: false | ||
|
||
command to execute when a monitor recovers
|
||
.. confval:: catchup_command | ||
|
||
:type: string | ||
:required: false | ||
|
||
command to execute when exiting a time period when the alerter couldn't fire, a monitor failed during that time, and hasn't recovered yet. (See the :confval:`delay` configuration option.) If you specify the literal string ``fail_command``, this will share the :confval:`fail_command` configuration value. | ||
|
||
You can specify the following variables inside ``{curly brackets}`` to have them substituted when the command is executed:
|
||
* ``hostname``: the host the monitor is running on | ||
* ``name``: the monitor's name | ||
* ``days``, ``hours``, ``minutes``, and ``seconds``: the monitor's downtime | ||
* ``failed_at``: the date and time the monitor failed | ||
* ``virtual_fail_count``: the monitor's virtual failure count (number of failed checks - :confval:`tolerance`)
* ``info``: the additional information the monitor recorded about its status | ||
* ``description``: description of what the monitor is checking | ||
|
||
You will probably need to quote parameters to the command. For example:: | ||
|
||
fail_command=say "Oh no, monitor {name} has failed at {failed_at}" | ||
|
||
The commands are executed directly by Python. If you require shell features, such as piping and redirection, you should use something like ``bash -c "..."``. For example:: | ||
|
||
fail_command=/bin/bash -c "/usr/bin/printf \"The simplemonitor for {name} has failed on {hostname}.\n\nTime: {failed_at}\nInfo: {info}\n\" | /usr/bin/mailx -A gmail -s \"PROBLEM: simplemonitor {name} has failed on {hostname}.\" email@address" |
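
Putting the three commands together, a complete section might look like the sketch below. The alerter name is hypothetical, and it assumes a ``logger`` binary at ``/usr/bin/logger``; substitute any command available on your system. The time restriction and ``delay`` are included only so that the catch-up command has a chance to run.

.. code-block:: ini

   [log_to_syslog]
   type=execute
   # assumption: /usr/bin/logger exists on this host
   fail_command=/usr/bin/logger "monitor {name} on {hostname} failed at {failed_at}: {info}"
   success_command=/usr/bin/logger "monitor {name} on {hostname} has recovered"
   # reuse fail_command for catch-up alerts
   catchup_command=fail_command
   times_type=only
   time_lower=07:30
   time_upper=22:00
   delay=1
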
@@ -0,0 +1,6 @@ | ||
nc - macOS notifications | ||
^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
Publishes alerts to the macOS Notification Center. Only for macOS. | ||
|
||
No configuration options. |
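
Since there are no alerter-specific options, a section for this alerter only needs its ``type`` (plus any common options you want). The alerter name below is hypothetical.

.. code-block:: ini

   [mac_notifications]
   type=nc
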
@@ -0,0 +1,15 @@ | ||
pushbullet - push notifications | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
.. include:: ../creds-warning.rst | ||
|
||
You will need to be registered at pushbullet_. | ||
|
||
.. _pushbullet: https://www.pushbullet.com/ | ||
|
||
.. confval:: token | ||
|
||
:type: string | ||
:required: true | ||
|
||
your pushbullet token |
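
A minimal sketch of a pushbullet alerter section. The alerter name and token are placeholders.

.. code-block:: ini

   [push_to_phone]
   type=pushbullet
   # placeholder token: do not commit a real one
   token=YOUR_PUSHBULLET_TOKEN
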