From 56c1cf123f7aff630e5245b04919c68f6fd43579 Mon Sep 17 00:00:00 2001 From: James Seward Date: Sun, 21 Mar 2021 21:07:05 +0000 Subject: [PATCH] Rearrange monitor docs and add alerters --- docs/alerters.rst | 186 ++++++++++++++++++ docs/alerters/46elks.rst | 45 +++++ docs/alerters/bulksms.rst | 33 ++++ docs/alerters/email.rst | 61 ++++++ docs/alerters/execute.rst | 41 ++++ docs/alerters/nc.rst | 6 + docs/alerters/pushbullet.rst | 15 ++ docs/alerters/pushover.rst | 22 +++ docs/alerters/ses.rst | 26 +++ docs/alerters/slack.rst | 32 +++ docs/alerters/sns.rst | 24 +++ docs/alerters/syslog.rst | 4 + docs/alerters/telegram.rst | 18 ++ docs/aws-boilerplate.rst | 3 + docs/aws-confvals.rst | 20 ++ docs/configuration.rst | 3 +- docs/creds-warning.rst | 1 + docs/monitors.rst | 2 + docs/monitors/eximqueue.rst | 20 ++ docs/monitors/{ring.rst => ring_doorbell.rst} | 4 +- docs/monitors/swap.rst | 11 ++ 21 files changed, 574 insertions(+), 3 deletions(-) create mode 100644 docs/alerters.rst create mode 100644 docs/alerters/46elks.rst create mode 100644 docs/alerters/bulksms.rst create mode 100644 docs/alerters/email.rst create mode 100644 docs/alerters/execute.rst create mode 100644 docs/alerters/nc.rst create mode 100644 docs/alerters/pushbullet.rst create mode 100644 docs/alerters/pushover.rst create mode 100644 docs/alerters/ses.rst create mode 100644 docs/alerters/slack.rst create mode 100644 docs/alerters/sns.rst create mode 100644 docs/alerters/syslog.rst create mode 100644 docs/alerters/telegram.rst create mode 100644 docs/aws-boilerplate.rst create mode 100644 docs/aws-confvals.rst create mode 100644 docs/creds-warning.rst create mode 100644 docs/monitors/eximqueue.rst rename docs/monitors/{ring.rst => ring_doorbell.rst} (89%) create mode 100644 docs/monitors/swap.rst diff --git a/docs/alerters.rst b/docs/alerters.rst new file mode 100644 index 00000000..a3395680 --- /dev/null +++ b/docs/alerters.rst @@ -0,0 +1,186 @@ +Alerter Configuration +===================== + 
+Alerters send one-off alerts when a monitor fails. They can also send an alert +when it succeeds again. + +An alerter knows if it is urgent or not; if a monitor defined as non-urgent +fails, an urgent alerter will not trigger for it. This means you can avoid +receiving SMS alerts for things which don’t require your immediate attention. + +Alerters can also have a time configuration for hours when they are or are not +allowed to alert. They can also send an alert at the end of the silence period +for any monitors which are currently failed. + +Alerters are defined in the main configuration file, which by default is :file:`monitor.ini`. The section name is the name of your alerter, which you should then add to the ``alerters`` configuration value. + +.. contents:: + +Common options +-------------- + +These options are common to all alerter types. + +.. confval:: type + + :type: string + :required: true + + the type of the alerter; one of those in the list below. + +.. confval:: depend + + :type: comma-separated list of string + :required: false + :default: none + + a list of monitors this alerter depends on. If any of them fail, no attempt will be made to send the alert. + +.. confval:: limit + + :type: integer + :required: false + :default: ``1`` + + the number of times a monitor must have failed before this alerter fires for it. You can use this to escalate an alert to another email address or text messaging, for example. + +.. confval:: dry_run + + :type: boolean + :required: false + :default: ``false`` + + makes an alerter do everything except actually send the message, and instead will print some information about what it would do. + +.. confval:: ooh_success + + :type: boolean + :required: false + :default: ``false`` + + makes an alerter trigger its success action even if out of hours + +.. confval:: groups + + :type: comma-separated list of string + :required: false + :default: ``default`` + + list of monitor groups this alerter should fire for. 
See the :ref:`group` setting for monitors. + +.. confval:: only_failures + + :type: boolean + :required: false + :default: ``false`` + + if true, only send alerts for failures (or catchups) + +.. _alerter-tz: + +.. confval:: tz + + :type: string + :required: false + :default: ``UTC`` + + the timezone to use in alert messages. See also :confval:`times_tz`. + +.. confval:: repeat + + :type: boolean + :required: false + :default: ``false`` + + fire this alerter (for a failed monitor) every iteration + +Time restrictions +----------------- + +All alerters accept time period configuration. By default, an alerter is active at all times, so you will always immediately receive an alert at the point where a monitor has failed enough (more times than the limit). To set limits on when an alerter can send, use the configuration values below. + +Note that the :confval:`times_tz` option sets the timezone all the values are interpreted as. The default is the local timezone of the host evaluating the logic. + +.. confval:: day + + :type: comma-separated list of integer + :required: false + :default: all days + + which days an alerter can operate on. ``0`` is Monday, ``6`` is Sunday. + +.. confval:: times_type + + :type: string + :required: false + :default: ``always`` + + one of ``always``, ``only``, or ``not``. ``only`` means that the limits specify the period the alerter is allowed to operate in. ``not`` means they specify the period it isn't, and outside of that time it is allowed. + +.. confval:: time_lower + + :type: string + :required: when :confval:`times_type` is not ``always`` + + the lower end of the time range. Must be lower than :confval:`time_upper`. The format is ``HH:mm`` in 24-hour clock. + +.. confval:: time_upper + + :type: string + :required: when :confval:`times_type` is not ``always`` + + the upper end of the time range. Must be higher than :confval:`time_lower`. The format is ``HH:mm`` in 24-hour clock. + +.. 
confval:: times_tz + + :type: string + :required: false + :default: the host's local time + + the timezone for :confval:`day`, :confval:`time_lower` and :confval:`time_upper` to be interpreted in. + +.. confval:: delay + + :type: boolean + :required: false + :default: ``false`` + + set to true to have the alerter send a "catch-up" alert about a failed monitor if it failed during a time the alerter was not allowed to send, and is still failed as the alerter enters the time it is allowed to send. If the monitor fails and recovers during the not-allowed time, no alert is sent either way. + + +Time examples +^^^^^^^^^^^^^ + +These snippets omit the alerter-specific configuration values. + +Don't trigger during the hours I'm in the office (8:30am to 5:30pm, Monday to Friday): + +.. code-block:: ini + + [out_of_hours] + type=some-alerter-type + times_type=not + time_lower=08:30 + time_upper=17:30 + days=0,1,2,3,4 + +Don't send at antisocial times, but let me know later if something broke and hasn't recovered yet: + +.. code-block:: ini + + [polite_alerter] + type=some-alerter-type + times_type=only + time_lower=07:30 + time_upper=22:00 + delay=1 + +Alerters +-------- + +.. note:: The ``type`` of the alerter is the first word in its heading. + +.. toctree:: + :glob: + + alerters/* diff --git a/docs/alerters/46elks.rst b/docs/alerters/46elks.rst new file mode 100644 index 00000000..2e55f1c3 --- /dev/null +++ b/docs/alerters/46elks.rst @@ -0,0 +1,45 @@ +46elks - 46elks notifications +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. include:: ../creds-warning.rst + +You will need to register for an account at 46elks_. + +.. _46elks: https://46elks.com/ + +.. confval:: username + + :type: string + :required: true + + your 46elks username + +.. confval:: password + + :type: string + :required: true + + your 46elks password + +.. confval:: target + + :type: string + :required: true + + 46elks target value + +.. 
confval:: sender + + :type: string + :required: false + :default: ``SmplMntr`` + + your SMS sender field. Start with a ``+`` if using a phone number. + +.. confval:: api_host + + :type: string + :required: false + :default: ``api.46elks.com`` + + API endpoint to use diff --git a/docs/alerters/bulksms.rst b/docs/alerters/bulksms.rst new file mode 100644 index 00000000..da8db008 --- /dev/null +++ b/docs/alerters/bulksms.rst @@ -0,0 +1,33 @@ +bulksms - SMS via BulkSMS +^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. warning:: Do not commit your credentials to a public repo! + +.. confval:: sender + + :type: string + :required: false + :default: ``SmplMntr`` + + who the SMS should appear to be from. Max 11 chars, and best to stick to alphanumerics. + +.. confval:: username + + :type: string + :required: true + + your BulkSMS username + +.. confval:: password + + :type: string + :required: true + + your BulkSMS password + +.. confval:: target + + :type: string + :required: true + + the number to send the SMS to. Specify using country code and number, with no ``+`` or international prefix. For example, ``447777123456`` for a UK mobile. diff --git a/docs/alerters/email.rst b/docs/alerters/email.rst new file mode 100644 index 00000000..ffd1bef5 --- /dev/null +++ b/docs/alerters/email.rst @@ -0,0 +1,61 @@ +email - send via SMTP +^^^^^^^^^^^^^^^^^^^^^ + +.. warning:: Do not commit your credentials to a public repo! + +.. confval:: host + + :type: string + :required: true + + the email server to connect to + +.. confval:: port + + :type: integer + :required: false + :default: ``25`` + + the port to connect on + +.. confval:: from + + :type: string + :required: true + + the email address to give as the sender + +.. confval:: to + + :type: string + :required: true + + the email address to send to. You can specify multiple addresses by separating with ``;``. + +.. confval:: cc + + :type: string + :required: false + + the email address to cc to. 
You can specify multiple addresses by separating with ``;``. + +.. confval:: username + + :type: string + :required: false + + the username to log in to the SMTP server with + +.. confval:: password + + :type: string + :required: false + + the password to log in to the SMTP server with + +.. confval:: ssl + + :type: string + :required: false + + specify ``starttls`` to use StartTLS. Specify ``yes`` to use SMTP SSL. Otherwise, no SSL is used at all. diff --git a/docs/alerters/execute.rst b/docs/alerters/execute.rst new file mode 100644 index 00000000..1d6050c9 --- /dev/null +++ b/docs/alerters/execute.rst @@ -0,0 +1,41 @@ +execute - run external command +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. confval:: fail_command + + :type: string + :required: false + + command to execute when a monitor fails + +.. confval:: success_command + + :type: string + :required: false + + command to execute when a monitor recovers + +.. confval:: catchup_command + + :type: string + :required: false + + command to execute when exiting a time period when the alerter couldn't fire, a monitor failed during that time, and hasn't recovered yet. (See the :confval:`delay` configuration option.) If you specify the literal string ``fail_command``, this will share the :confval:`fail_command` configuration value. + +You can specify the following variables inside ``{curly brackets}`` to have them substituted when the command is executed: + +* ``hostname``: the host the monitor is running on +* ``name``: the monitor's name +* ``days``, ``hours``, ``minutes``, and ``seconds``: the monitor's downtime +* ``failed_at``: the date and time the monitor failed +* ``virtual_fail_count``: the monitor's virtual failure count (number of failed checks - :confval:`tolerance`) +* ``info``: the additional information the monitor recorded about its status +* ``description``: description of what the monitor is checking + +You will probably need to quote parameters to the command. 
For example:: + + fail_command=say "Oh no, monitor {name} has failed at {failed_at}" + +The commands are executed directly by Python. If you require shell features, such as piping and redirection, you should use something like ``bash -c "..."``. For example:: + + fail_command=/bin/bash -c "/usr/bin/printf \"The simplemonitor for {name} has failed on {hostname}.\n\nTime: {failed_at}\nInfo: {info}\n\" | /usr/bin/mailx -A gmail -s \"PROBLEM: simplemonitor {name} has failed on {hostname}.\" email@address" diff --git a/docs/alerters/nc.rst b/docs/alerters/nc.rst new file mode 100644 index 00000000..a7304324 --- /dev/null +++ b/docs/alerters/nc.rst @@ -0,0 +1,6 @@ +nc - macOS notifications +^^^^^^^^^^^^^^^^^^^^^^^^ + +Publishes alerts to the macOS Notification Center. Only for macOS. + +No configuration options. diff --git a/docs/alerters/pushbullet.rst b/docs/alerters/pushbullet.rst new file mode 100644 index 00000000..1365fcf8 --- /dev/null +++ b/docs/alerters/pushbullet.rst @@ -0,0 +1,15 @@ +pushbullet - push notifications +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. include:: ../creds-warning.rst + +You will need to be registered at pushbullet_. + +.. _pushbullet: https://www.pushbullet.com/ + +.. confval:: token + + :type: string + :required: true + + your pushbullet token diff --git a/docs/alerters/pushover.rst b/docs/alerters/pushover.rst new file mode 100644 index 00000000..e9772152 --- /dev/null +++ b/docs/alerters/pushover.rst @@ -0,0 +1,22 @@ +pushover - notifications +^^^^^^^^^^^^^^^^^^^^^^^^ + +.. include:: ../creds-warning.rst + +You will need to be registered at pushover_. + +.. _pushover: https://pushover.net/ + +.. confval:: user + + :type: string + :required: true + + your pushover username + +.. 
confval:: token + + :type: string + :required: true + + your pushover token diff --git a/docs/alerters/ses.rst b/docs/alerters/ses.rst new file mode 100644 index 00000000..80f44d0f --- /dev/null +++ b/docs/alerters/ses.rst @@ -0,0 +1,26 @@ +ses - email via Amazon Simple Email Service +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. include:: ../creds-warning.rst + +.. include:: ../aws-boilerplate.rst + +You will need to `verify an address or domain`_. + +.. _verify an address or domain: https://docs.aws.amazon.com/ses/latest/dg/verify-addresses-and-domains.html + +.. confval:: from + + :type: string + :required: true + + the email address to send from + +.. confval:: to + + :type: string + :required: true + + the email address to send to + +.. include:: ../aws-confvals.rst diff --git a/docs/alerters/slack.rst b/docs/alerters/slack.rst new file mode 100644 index 00000000..fe6f8ef1 --- /dev/null +++ b/docs/alerters/slack.rst @@ -0,0 +1,32 @@ +slack - Slack webhook +^^^^^^^^^^^^^^^^^^^^^ + +.. warning:: Do not commit your credentials to a public repo! + +First, set up a webhook for this to use. + +* Go to https://slack.com/apps/manage +* Add a new webhook +* Configure it to taste (channel, name, icon) +* Copy the webhook URL for your configuration below + +.. confval:: url + + :type: string + :required: true + + the Slack webhook URL + +.. confval:: channel + + :type: string + :required: false + :default: the channel configured on the webhook + + the channel to send to + +.. confval:: username + + :type: string + :required: false + :default: a username to send to diff --git a/docs/alerters/sns.rst b/docs/alerters/sns.rst new file mode 100644 index 00000000..d9edf0c5 --- /dev/null +++ b/docs/alerters/sns.rst @@ -0,0 +1,24 @@ +sns - Amazon Simple Notification Service +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. include:: ../creds-warning.rst + +.. include:: ../aws-boilerplate.rst + +Note that not all regions with SNS also support sending SMS. + +.. 
confval:: topic + + :type: string + :required: yes, if ``number`` is not given + + the ARN of the SNS topic to publish to. Specify this, or ``number``, but not both. + +.. confval:: number + + :type: string + :required: yes, if ``topic`` is not given + + the phone number to SMS. Give the number as country code then number, without a ``+`` or other international access code. For example, ``447777123456`` for a UK mobile. Specify this, or ``topic``, but not both. + +.. include:: ../aws-confvals.rst diff --git a/docs/alerters/syslog.rst b/docs/alerters/syslog.rst new file mode 100644 index 00000000..c493b173 --- /dev/null +++ b/docs/alerters/syslog.rst @@ -0,0 +1,4 @@ +syslog - send to syslog +^^^^^^^^^^^^^^^^^^^^^^^ + +Syslog alerters have no additional configuration. diff --git a/docs/alerters/telegram.rst b/docs/alerters/telegram.rst new file mode 100644 index 00000000..c34c7fa5 --- /dev/null +++ b/docs/alerters/telegram.rst @@ -0,0 +1,18 @@ +telegram - send to a chat +^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. include:: ../creds-warning.rst + +.. confval:: token + + :type: string + :required: true + + the token to access Telegram + +.. confval:: client_id + + :type: string + :required: true + + the chat id to send to diff --git a/docs/aws-boilerplate.rst b/docs/aws-boilerplate.rst new file mode 100644 index 00000000..ca99b52c --- /dev/null +++ b/docs/aws-boilerplate.rst @@ -0,0 +1,3 @@ +If you have AWS credentials configured elsewhere (e.g. in :file:`~/.aws/credentials`), or in the environment, this will use those and you do not need to specify credentials in your configuration file. + +As a best practice, use an IAM User/Role which is only allowed to access the resources in use. diff --git a/docs/aws-confvals.rst b/docs/aws-confvals.rst new file mode 100644 index 00000000..071aefaa --- /dev/null +++ b/docs/aws-confvals.rst @@ -0,0 +1,20 @@ +.. confval:: aws_region + + :type: string + :required: false + + the AWS region to use (e.g. ``eu-west-1``) + +.. 
confval:: aws_access_key + + :type: string + :required: false + + the AWS access key to use + +.. confval:: aws_secret_access_key + + :type: string + :required: false + + the AWS secret access key to use diff --git a/docs/configuration.rst b/docs/configuration.rst index b89be1e9..88ca2291 100644 --- a/docs/configuration.rst +++ b/docs/configuration.rst @@ -105,7 +105,7 @@ This file must contain a ``[monitor]`` section, which must contain at least the :required: false :default: none - a file to watch the modification time on. If the modification time increases, SimpleMonitor reloads its configuration. + a file to watch the modification time on. If the modification time increases, SimpleMonitor :ref:`reloads its configuration`. .. tip:: SimpleMonitor will reload if it receives SIGHUP; this option is useful for platforms which don't have that. @@ -207,6 +207,7 @@ This is an example pair of configuration files to show what goes where. For more partition=/ limit=1G +.. _Reloading: Reloading --------- diff --git a/docs/creds-warning.rst b/docs/creds-warning.rst new file mode 100644 index 00000000..a5dc7a2d --- /dev/null +++ b/docs/creds-warning.rst @@ -0,0 +1 @@ +.. warning:: Do not commit your credentials to a public repo! diff --git a/docs/monitors.rst b/docs/monitors.rst index 94fd587d..f35df6b1 100644 --- a/docs/monitors.rst +++ b/docs/monitors.rst @@ -101,6 +101,8 @@ These options are common to all monitor types. if this monitor should alert at all. +.. _monitor-group: + .. confval:: group :type: string diff --git a/docs/monitors/eximqueue.rst b/docs/monitors/eximqueue.rst new file mode 100644 index 00000000..1d80a11a --- /dev/null +++ b/docs/monitors/eximqueue.rst @@ -0,0 +1,20 @@ +eximqueue - Exim queue size +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Checks the output of ``exigrep`` to make sure the queue isn't too big. + +.. confval:: max_length + + :type: integer + :required: false + :default: ``1`` + + the maximum acceptable queue length + +.. 
confval:: path + + :type: string + :required: false + :default: ``/usr/local/sbin`` + + the path containing the ``exigrep`` binary diff --git a/docs/monitors/ring.rst b/docs/monitors/ring_doorbell.rst similarity index 89% rename from docs/monitors/ring.rst rename to docs/monitors/ring_doorbell.rst index 09c7b23b..1ca5217f 100644 --- a/docs/monitors/ring.rst +++ b/docs/monitors/ring_doorbell.rst @@ -1,5 +1,5 @@ -ring - Ring doorbell battery -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +ring_doorbell - Ring doorbell battery +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Check the battery level of a Ring doorbell. diff --git a/docs/monitors/swap.rst b/docs/monitors/swap.rst new file mode 100644 index 00000000..733ecfc2 --- /dev/null +++ b/docs/monitors/swap.rst @@ -0,0 +1,11 @@ +swap - available swap space +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Checks for available swap space. + +.. confval:: percent_free + + :type: integer + :required: true + + minimum acceptable free swap percent