This repository has been archived by the owner on Feb 29, 2024. It is now read-only.

We aren't monitoring for ansible-runner failures #224

Open
gandelman-a opened this issue Apr 20, 2017 · 0 comments


@gandelman-a

Our deployment Ansible playbooks are run asynchronously via cron, and we are not currently monitoring them in any way. If one is failing, it usually goes unnoticed until one of us bothers to look. We should have ansible-runner tasks output something that indicates failure, and pick up on that via a Datadog monitor. This could be as simple as dropping a file for the failing environment somewhere on the filesystem, with a Datadog monitor that triggers when any such file exists.

@gandelman-a gandelman-a self-assigned this Apr 25, 2017
gandelman-a added a commit to gandelman-a/hoist that referenced this issue Apr 25, 2017
This updates ansible-runner to drop empty files in a known
directory for environments that are failing to complete their
playbooks. The files are named after the failing environment.
Upon successful ansible run, these files are cleaned up if
they exist.

This will allow us to set a Datadog check that fails when any
files exist here.  Our current Datadog check is broken: it
relies on log scraping and doesn't really work with multi-env
bastions.

Related-Issue: BonnyCI/projman#224

Signed-off-by: Adam Gandelman <adamg@ubuntu.com>
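
The flag-file behaviour described in this commit message could look roughly like the sketch below. It is not the actual hoist code: the function name `record_run_result`, the `FLAG_DIR` path, and the way the result is passed in are all assumptions made for illustration. Only the behaviour (an empty file named after the failing environment, removed again on success) comes from the commit message.

```python
from pathlib import Path

# Hypothetical location; the real directory used by ansible-runner is not
# stated in the issue or the commit message.
FLAG_DIR = Path("/var/run/ansible-runner/failed")


def record_run_result(flag_dir: Path, environment: str, succeeded: bool) -> Path:
    """Create or clear the failure flag file for one environment.

    On failure an empty file named after the environment is created.
    On success the file is removed if it exists.
    """
    flag_dir.mkdir(parents=True, exist_ok=True)
    flag = flag_dir / environment
    if succeeded:
        try:
            flag.unlink()
        except FileNotFoundError:
            pass  # nothing to clean up; the previous run did not fail
    else:
        flag.touch()  # empty file; only its existence matters
    return flag
```

A cron wrapper would call something like `record_run_result(FLAG_DIR, "staging", succeeded=(exit_code == 0))` after each playbook run, where `"staging"` and `exit_code` are again placeholders.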
gandelman-a added a commit to gandelman-a/hoist that referenced this issue Apr 25, 2017
This adds a simple datadog check that fails if it finds any flag files for
failing ansible-runner environments.  This depends on PR BonnyCI#363
but should pass OK if it lands before that merges.

Related-Issue: BonnyCI/projman#224

Signed-off-by: Adam Gandelman <adamg@ubuntu.com>
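
The check logic described here can be sketched in plain Python as follows. This is an assumption-laden illustration, not the check that was committed: it deliberately avoids the Datadog agent API, the names `failing_environments` and `check_ansible_runner` are hypothetical, and the `OK`/`CRITICAL` values simply follow the common Nagios-style convention. Treating a missing directory as passing mirrors the note that the check should pass if it lands before the ansible-runner change.

```python
from pathlib import Path
from typing import List, Tuple

# Nagios-style status codes, chosen for illustration only.
OK, CRITICAL = 0, 2


def failing_environments(flag_dir: Path) -> List[str]:
    """Return the names of environments that currently have a flag file."""
    if not flag_dir.is_dir():
        # Directory not created yet: nothing has reported a failure.
        return []
    return sorted(p.name for p in flag_dir.iterdir() if p.is_file())


def check_ansible_runner(flag_dir: Path) -> Tuple[int, str]:
    """Fail if any flag file exists, naming the affected environments."""
    failing = failing_environments(flag_dir)
    if failing:
        return CRITICAL, "ansible-runner failing for: " + ", ".join(failing)
    return OK, "no failing ansible-runner environments"
```

Because each flag file is named after its environment, the failure message can list every failing environment on a multi-env bastion without any log scraping.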