Rearrange monitor docs and add alerters
jamesoff committed Mar 21, 2021
1 parent aa912f0 commit 56c1cf1
Showing 21 changed files with 574 additions and 3 deletions.
186 changes: 186 additions & 0 deletions docs/alerters.rst
@@ -0,0 +1,186 @@
Alerter Configuration
=====================

Alerters send one-off alerts when a monitor fails. They can also send an alert
when it succeeds again.

An alerter knows if it is urgent or not; if a monitor defined as non-urgent
fails, an urgent alerter will not trigger for it. This means you can avoid
receiving SMS alerts for things which don’t require your immediate attention.

Alerters can also have a time configuration for hours when they are or are not
allowed to alert. They can also send an alert at the end of the silence period
for any monitors which are currently failed.

Alerters are defined in the main configuration file, which by default is :file:`monitor.ini`. The section name is the name of your alerter, which you should then add to the ``alerters`` configuration value.
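
As a rough sketch, defining an alerter and registering it could look like this. The section name ``my_email`` and all values are placeholders, and placing ``alerters`` in a ``[reporting]`` section is an assumption about the layout of :file:`monitor.ini`, not something stated above:

.. code-block:: ini

   # Assumption: the alerters value lives in a [reporting] section.
   [reporting]
   alerters=my_email

   # The section name is the alerter's name (placeholder).
   [my_email]
   type=email
   host=mail.example.com
   from=monitor@example.com
   to=admin@example.com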

.. contents::

Common options
--------------

These options are common to all alerter types.

.. confval:: type

:type: string
:required: true

the type of the alerter; one of those in the list below.

.. confval:: depend

:type: comma-separated list of string
:required: false
:default: none

a list of monitors this alerter depends on. If any of them fail, no attempt will be made to send the alert.
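
For instance, an email alerter could depend on a monitor that checks the mail server, so that no alert is attempted while the server is failing. This is a sketch only; ``mailserver_up`` is a hypothetical monitor name that would have to exist in your monitor configuration:

.. code-block:: ini

   [email_alerts]
   type=email
   host=mail.example.com
   from=monitor@example.com
   to=admin@example.com
   # Hypothetical monitor name; while it is failing, no alert is attempted.
   depend=mailserver_up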

.. confval:: limit

:type: integer
:required: false
:default: ``1``

the number of times a monitor must have failed before this alerter fires for it. You can use this to escalate an alert to another email address or text messaging, for example.
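
A possible escalation setup, sketched with placeholder values: the email alerter fires on the first failure, while the SMS alerter only fires once the monitor has failed five times. The alerter names and the threshold of ``5`` are illustrative:

.. code-block:: ini

   [first_line]
   type=email
   host=mail.example.com
   from=monitor@example.com
   to=admin@example.com
   # The default, shown for clarity.
   limit=1

   [escalate_sms]
   type=bulksms
   username=placeholder-user
   password=placeholder-password
   target=447777123456
   # Only fire once the monitor has failed 5 times.
   limit=5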

.. confval:: dry_run

:type: boolean
:required: false
:default: ``false``

makes an alerter do everything except actually send the message; instead, it prints some information about what it would have done.

.. confval:: ooh_success

:type: boolean
:required: false
:default: ``false``

makes an alerter trigger its success action even when it is out of hours (outside the times it is allowed to alert).

.. confval:: groups

:type: comma-separated list of string
:required: false
:default: ``default``

list of monitor groups this alerter should fire for. See the :ref:`group<monitor-group>` setting for monitors.

.. confval:: only_failures

:type: boolean
:required: false
:default: ``false``

if true, only send alerts for failures (or catchups)

.. _alerter-tz:

.. confval:: tz

:type: string
:required: false
:default: ``UTC``

the timezone to use in alert messages. See also :confval:`times_tz`.

.. confval:: repeat

:type: boolean
:required: false
:default: ``false``

fire this alerter (for a failed monitor) every iteration

Time restrictions
-----------------

All alerters accept time period configuration. By default, an alerter is active at all times, so you will receive an alert as soon as a monitor has failed enough times to reach the alerter's :confval:`limit`. To restrict when an alerter can send, use the configuration values below.

Note that the :confval:`times_tz` option sets the timezone in which all of these values are interpreted. The default is the local timezone of the host evaluating the logic.

.. confval:: days

:type: comma-separated list of integer
:required: false
:default: all days

which days an alerter can operate on. ``0`` is Monday, ``6`` is Sunday.

.. confval:: times_type

:type: string
:required: false
:default: ``always``

one of ``always``, ``only``, or ``not``. ``only`` means that the limits specify the period the alerter is allowed to operate in. ``not`` means they specify the period in which it is not allowed to operate; outside of that period it is allowed.

.. confval:: time_lower

:type: string
:required: when :confval:`times_type` is not ``always``

the lower end of the time range. Must be lower than :confval:`time_upper`. The format is ``HH:mm`` in 24-hour clock.

.. confval:: time_upper

:type: string
:required: when :confval:`times_type` is not ``always``

the upper end of the time range. Must be higher than :confval:`time_lower`. The format is ``HH:mm`` in 24-hour clock.

.. confval:: times_tz

:type: string
:required: false
:default: the host's local time

the timezone in which :confval:`days`, :confval:`time_lower` and :confval:`time_upper` are interpreted.

.. confval:: delay

:type: boolean
:required: false
:default: ``false``

set to true to have the alerter send a "catch-up" alert about a failed monitor if it failed during a time the alerter was not allowed to send, and is still failed as the alerter enters the time it is allowed to send. If the monitor fails and recovers during the not-allowed time, no alert is sent either way.


Time examples
^^^^^^^^^^^^^

These snippets omit the alerter-specific configuration values.

Don't trigger during the hours I'm in the office (8:30am to 5:30pm, Monday to Friday):

.. code-block:: ini

   [out_of_hours]
   type=some-alerter-type
   times_type=not
   time_lower=08:30
   time_upper=17:30
   days=0,1,2,3,4

Don't send at antisocial times, but let me know later if something broke and hasn't recovered yet:

.. code-block:: ini

   [polite_alerter]
   type=some-alerter-type
   times_type=only
   time_lower=07:30
   time_upper=22:00
   delay=1

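
A further sketch combining the options above: only alert during working hours on weekdays, with the times interpreted in a specific timezone. The value ``Europe/London`` assumes that ``times_tz`` accepts IANA timezone names, which is not stated here:

.. code-block:: ini

   [working_hours]
   type=some-alerter-type
   times_type=only
   time_lower=09:00
   time_upper=17:00
   # 0 is Monday, 4 is Friday.
   days=0,1,2,3,4
   # Assumption: IANA timezone names are accepted.
   times_tz=Europe/London
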
Alerters
--------

.. note:: The ``type`` of the alerter is the first word in its heading.

.. toctree::
   :glob:

   alerters/*

45 changes: 45 additions & 0 deletions docs/alerters/46elks.rst
@@ -0,0 +1,45 @@
46elks - 46elks notifications
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. include:: ../creds-warning.rst

You will need to register for an account at 46elks_.

.. _46elks: https://46elks.com/

.. confval:: username

:type: string
:required: true

your 46elks username

.. confval:: password

:type: string
:required: true

your 46elks password

.. confval:: target

:type: string
:required: true

46elks target value

.. confval:: sender

:type: string
:required: false
:default: ``SmplMntr``

your SMS sender field. Start with a ``+`` if using a phone number.

.. confval:: api_host

:type: string
:required: false
:default: ``api.46elks.com``

API endpoint to use
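
A minimal configuration might look like the following sketch. All values are placeholders, and the ``+``-prefixed international number shown for ``target`` is an assumption about what a target value looks like:

.. code-block:: ini

   [sms_46elks]
   type=46elks
   username=placeholder-username
   password=placeholder-password
   # Assumption: the target is a phone number in international format.
   target=+46700000000
   # Optional; this is the documented default.
   sender=SmplMntr
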
33 changes: 33 additions & 0 deletions docs/alerters/bulksms.rst
@@ -0,0 +1,33 @@
bulksms - SMS via BulkSMS
^^^^^^^^^^^^^^^^^^^^^^^^^

.. warning:: Do not commit your credentials to a public repo!

.. confval:: sender

:type: string
:required: false
:default: ``SmplMntr``

who the SMS should appear to be from. Max 11 chars, and best to stick to alphanumerics.

.. confval:: username

:type: string
:required: true

your BulkSMS username

.. confval:: password

:type: string
:required: true

your BulkSMS password

.. confval:: target

:type: string
:required: true

the number to send the SMS to. Specify using country code and number, with no ``+`` or international prefix. For example, ``447777123456`` for a UK mobile.
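
Put together, a BulkSMS alerter could be configured roughly as follows. The section name and credentials are placeholders; the target reuses the example number above:

.. code-block:: ini

   [sms_bulksms]
   type=bulksms
   username=placeholder-username
   password=placeholder-password
   # Country code and number, with no + or international prefix.
   target=447777123456
   # Optional; at most 11 characters. This is the documented default.
   sender=SmplMntr
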
61 changes: 61 additions & 0 deletions docs/alerters/email.rst
@@ -0,0 +1,61 @@
email - send via SMTP
^^^^^^^^^^^^^^^^^^^^^

.. warning:: Do not commit your credentials to a public repo!

.. confval:: host

:type: string
:required: true

the email server to connect to

.. confval:: port

:type: integer
:required: false
:default: ``25``

the port to connect on

.. confval:: from

:type: string
:required: true

the email address to give as the sender

.. confval:: to

:type: string
:required: true

the email address to send to. You can specify multiple addresses by separating with ``;``.

.. confval:: cc

:type: string
:required: false

the email address to cc to. You can specify multiple addresses by separating with ``;``.

.. confval:: username

:type: string
:required: false

the username to log in to the SMTP server with

.. confval:: password

:type: string
:required: false

the password to log in to the SMTP server with

.. confval:: ssl

:type: string
:required: false

specify ``starttls`` to use StartTLS. Specify ``yes`` to use SMTP SSL. Otherwise, no SSL is used at all.
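
As an illustration, an authenticated StartTLS setup could look like the sketch below. The host, addresses and credentials are placeholders, and port ``587`` is simply the conventional mail submission port rather than a value required by this alerter:

.. code-block:: ini

   [email_alerts]
   type=email
   host=smtp.example.com
   # Conventional submission port; the default is 25.
   port=587
   from=monitor@example.com
   # Multiple recipients are separated with a semicolon.
   to=admin@example.com;oncall@example.com
   username=placeholder-username
   password=placeholder-password
   ssl=starttls
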
41 changes: 41 additions & 0 deletions docs/alerters/execute.rst
@@ -0,0 +1,41 @@
execute - run external command
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. confval:: fail_command

:type: string
:required: false

command to execute when a monitor fails

.. confval:: success_command

:type: string
:required: false

command to execute when a monitor recovers

.. confval:: catchup_command

:type: string
:required: false

command to execute when exiting a time period when the alerter couldn't fire, a monitor failed during that time, and hasn't recovered yet. (See the :confval:`delay` configuration option.) If you specify the literal string ``fail_command``, this will share the :confval:`fail_command` configuration value.

You can specify the following variables inside ``{curly brackets}`` to have them substituted when the command is executed:

* ``hostname``: the host the monitor is running on
* ``name``: the monitor's name
* ``days``, ``hours``, ``minutes``, and ``seconds``: the monitor's downtime
* ``failed_at``: the date and time the monitor failed
* ``virtual_fail_count``: the monitor's virtual failure count (number of failed checks - :confval:`tolerance`)
* ``info``: the additional information the monitor recorded about its status
* ``description``: description of what the monitor is checking

You will probably need to quote parameters to the command. For example::

    fail_command=say "Oh no, monitor {name} has failed at {failed_at}"

The commands are executed directly by Python. If you require shell features, such as piping and redirection, you should use something like ``bash -c "..."``. For example::

    fail_command=/bin/bash -c "/usr/bin/printf \"The simplemonitor for {name} has failed on {hostname}.\n\nTime: {failed_at}\nInfo: {info}\n\" | /usr/bin/mailx -A gmail -s \"PROBLEM: simplemonitor {name} has failed on {hostname}.\" email@address"

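
A complete alerter section using all three commands might look like this sketch. The section name is a placeholder, and the use of ``/usr/bin/logger`` (and its path) is an assumption about the host system:

.. code-block:: ini

   [run_command]
   type=execute
   # Assumption: the logger utility is available at this path.
   fail_command=/usr/bin/logger "simplemonitor: {name} failed at {failed_at}: {info}"
   success_command=/usr/bin/logger "simplemonitor: {name} has recovered"
   # The literal string fail_command reuses the fail_command value.
   catchup_command=fail_command
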
6 changes: 6 additions & 0 deletions docs/alerters/nc.rst
@@ -0,0 +1,6 @@
nc - macOS notifications
^^^^^^^^^^^^^^^^^^^^^^^^

Publishes alerts to the macOS Notification Center. Only for macOS.

No configuration options.
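
Since there are no alerter-specific options, a section along these lines should be enough (the section name is a placeholder, and the common options described in the main alerter documentation still apply):

.. code-block:: ini

   [mac_notifications]
   type=nc
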
15 changes: 15 additions & 0 deletions docs/alerters/pushbullet.rst
@@ -0,0 +1,15 @@
pushbullet - push notifications
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. include:: ../creds-warning.rst

You will need to be registered at pushbullet_.

.. _pushbullet: https://www.pushbullet.com/

.. confval:: token

:type: string
:required: true

your pushbullet token
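
A minimal sketch, with a placeholder section name and token:

.. code-block:: ini

   [push]
   type=pushbullet
   # Placeholder; use the token from your own pushbullet account.
   token=placeholder-token
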