This project is a disk monitoring and system maintenance tool that tracks disk utilization, identifies large files, manages log rotation, and detects zombie processes. It runs inside Jenkins pipelines for scheduled checks and sends alerts via Microsoft Teams and Slack. The tool is designed to work across various environments, including:
- Oracle Enterprise Linux
- Oracle DB, MongoDB, Cassandra, Redis, Kafka
- OpenShift clusters
- Jenkins, Bitbucket, Jira, Confluence servers
- Application servers running Java and other test environments
/.
│
├── modules/
│ ├── disk_usage.py # Disk monitoring and cleanup
│ ├── process_monitor.py # Zombie process detection and termination
│ ├── alerting.py # Teams and Slack notifications
│ ├── system_detection.py # Filesystem and application detection
│ └── elk_logger.py # ELK Logging integration
│
├── Dockerfile # Docker containerization
├── main.py # Main script integrating all modules
├── Jenkinsfile # Jenkins pipeline for automation
└── requirements.txt # Python dependencies
- Threshold-based Monitoring:
- Monitors disk usage and sends alerts if a specified threshold is exceeded.
- Supports multiple mount points and paths.
- Large File Detection:
- Scans for files exceeding a specified size and archives or deletes them.
- Log Rotation and Archiving:
- Automatically compresses or deletes log files older than a specified number of days.
- Log Files (.log): Compress, delete, or rotate based on retention days.
- Backup Files (.bak, .tar, .gz): Archive or delete after a certain period.
- Temporary Files (.tmp, .swp): Immediate cleanup of temporary files.
- Database Dumps (.sql, .db): Archive or move older dumps.
- Media Files (.mp4, .mkv): Move or archive large media files.
- ISO Images (.iso, .img): Archive or delete after a period.
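The threshold check and the retention-based scan described above can be sketched as follows. This is a minimal illustration, not the code in `modules/disk_usage.py`: the function names are hypothetical, and it uses `shutil.disk_usage` from the standard library instead of `psutil` so that it runs without extra dependencies.

```python
import shutil
import time
from pathlib import Path


def usage_percent(path: str) -> float:
    """Return used space on the filesystem containing `path`, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a size of zero.
        return 0.0
    return usage.used / usage.total * 100


def exceeds_threshold(path: str, threshold: float) -> bool:
    """True when usage is strictly above the threshold (e.g. 85)."""
    return usage_percent(path) > threshold


def find_stale_files(root: str, suffixes: tuple[str, ...], retention_days: int) -> list[Path]:
    """List files under `root` with a matching suffix and an mtime older than the retention period."""
    cutoff = time.time() - retention_days * 86400
    stale = []
    for candidate in Path(root).rglob("*"):
        try:
            if (
                candidate.is_file()
                and candidate.suffix in suffixes
                and candidate.stat().st_mtime < cutoff
            ):
                stale.append(candidate)
        except OSError:
            # The file vanished or is unreadable; skip it.
            continue
    return stale
```

A caller might run `find_stale_files("/var/log", (".log",), 30)` only when `exceeds_threshold("/var/log", 85)` is true, then decide per file type whether to compress, archive, or delete.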
- Zombie Process Detection:
- Scans for zombie processes (ps aux | grep 'Z') and identifies their parent processes (PPID).
- Parent Process Cleanup:
- Automatically terminates parent processes (kill -9) to eliminate zombie processes.
- Threshold-Based Alerts:
- Sends Teams or Slack notifications when the number of zombie processes exceeds a defined threshold.
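As a rough sketch of the detection step, the snippet below parses `ps` output and collects the parent PIDs of zombie processes. It is an assumption about how this could be done, not the contents of `modules/process_monitor.py`; the names `Zombie`, `parse_zombies`, `parent_pids`, and `should_alert` are hypothetical. It checks only the STAT column, which avoids the false matches that a plain `grep 'Z'` produces when a command name happens to contain a capital Z. Terminating the parents is not shown.

```python
import subprocess
from typing import NamedTuple


class Zombie(NamedTuple):
    pid: int
    ppid: int
    command: str


def parse_zombies(ps_output: str) -> list[Zombie]:
    """Parse lines of 'PID PPID STAT COMMAND' and keep entries whose STAT starts with Z."""
    zombies = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue
        pid, ppid, stat = fields[:3]
        command = fields[3] if len(fields) > 3 else ""
        if stat.startswith("Z"):
            zombies.append(Zombie(int(pid), int(ppid), command))
    return zombies


def list_zombies() -> list[Zombie]:
    """Run ps with empty column headers so that no header line is printed."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "ppid=", "-o", "stat=", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_zombies(result.stdout)


def parent_pids(zombies: list[Zombie]) -> list[int]:
    """Distinct parent PIDs, sorted for stable reporting."""
    return sorted({z.ppid for z in zombies})


def should_alert(zombies: list[Zombie], threshold: int) -> bool:
    """True when the zombie count is strictly above the threshold."""
    return len(zombies) > threshold
```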
- Pipeline Integration:
- Designed to run as part of Jenkins pipelines.
- Supports parameterized builds for flexible disk checks and process monitoring.
- Centralized Logging:
- Sends disk monitoring and process cleanup logs to ELK (Elasticsearch, Logstash, Kibana).
- Makes these logs searchable and available for visualization in Kibana.
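A log event destined for ELK might be shaped like the sketch below. The field names (`@timestamp`, `level`, `message`, `host`) and the helper `build_log_document` are assumptions made for illustration; the actual schema used by `modules/elk_logger.py` is not documented here, and handing the document to the `elasticsearch` client is not shown.

```python
import socket
from datetime import datetime, timezone

ALLOWED_LEVELS = {"INFO", "WARNING", "ERROR"}


def build_log_document(level: str, message: str, **fields) -> dict:
    """Build a JSON-serializable log event with a UTC timestamp."""
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"unsupported log level: {level!r}")
    document = {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "message": message,
        "host": socket.gethostname(),
    }
    # Extra keyword arguments (mount point, PID, file path, ...) become fields.
    document.update(fields)
    return document
```

For example, `build_log_document("WARNING", "Disk usage above threshold", mount="/var", used_percent=91.2)` yields a dictionary that can be serialized with `json.dumps` and indexed as a single event.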
- Oracle DB (on ext4 or XFS):
- Cleans up Oracle database logs and archive files automatically.
- Detects database dump files and moves them to archival storage.
- MongoDB, Cassandra, Redis:
- Scans /var/lib directories for large .wt or .sst files.
- Performs automatic compaction or backup on large files.
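The scan for large data files could look roughly like this. It is a sketch under assumptions: `find_large_files` is a hypothetical helper, the size limit is supplied by the caller, and compaction or backup of the hits is left to database-specific tooling.

```python
import os
from pathlib import Path


def find_large_files(root: str, suffixes: tuple[str, ...], min_bytes: int) -> list[tuple[Path, int]]:
    """Return (path, size) pairs for files under `root` that match a suffix
    and are larger than `min_bytes`, biggest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            path = Path(dirpath) / name
            try:
                size = path.stat().st_size
            except OSError:
                # File removed or unreadable during the walk; skip it.
                continue
            if size > min_bytes:
                hits.append((path, size))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits
```

A call such as `find_large_files("/var/lib", (".wt", ".sst"), 1024**3)` would list data files above 1 GiB.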
- Local Logging:
- All actions (disk usage alerts, file deletions, zombie process detection) are logged in /var/log/disk_monitor.log.
- Supports INFO, WARNING, and ERROR log levels.
- Jenkins Build Artifacts:
- Disk usage reports and logs are attached as artifacts to Jenkins builds.
- Failed builds trigger automatic alerts to Microsoft Teams or Slack.
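Alerts of this kind are typically sent as an HTTP POST to an incoming-webhook URL. The sketch below builds such a request with a simple `{"text": ...}` JSON body, a form that Slack incoming webhooks accept and that is assumed here to be sufficient for the Teams webhook as well. The helper names are hypothetical, and `urllib` from the standard library stands in for `requests` to keep the example self-contained.

```python
import json
import urllib.request


def build_alert_request(webhook_url: str, message: str) -> urllib.request.Request:
    """Prepare a JSON POST carrying a plain-text alert message."""
    body = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_alert(webhook_url: str, message: str, timeout: float = 10.0) -> int:
    """Send the alert and return the HTTP status code."""
    request = build_alert_request(webhook_url, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from sending makes the payload easy to check without network access.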
- Language: Python 3.9+
- Dependencies:
- psutil – Disk and process monitoring
- requests – Microsoft Teams and Slack integration
- subprocess (Python standard library) – Command execution for disk and process management
- python-dotenv – Environment variable management
- elasticsearch – ELK integration
- Tools:
- Jenkins (for scheduled execution)
- Microsoft Teams / Slack (for alerting)
- Docker (for containerization)
- Python 3.9+
- Jenkins (optional for CI/CD integration)
- Microsoft Teams or Slack Webhook URL for notifications
- ELK stack for centralized logging (optional)
- Clone the repository
- Install dependencies:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
- Set the Microsoft Teams webhook URL and the ELK connection details in a .env file:
TEAMS_WEBHOOK_URL=https://outlook.office.com/webhook/YOUR-WEBHOOK-URL
ELK_HOST=your-elk-host
ELK_PORT=9200
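Once these variables are in the process environment (for instance after `load_dotenv()` from python-dotenv has read the .env file), they could be validated along these lines. The `Settings` class and `load_settings` function are hypothetical, and treating `ELK_HOST` as optional with a default port of 9200 is an assumption based on the sample values above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    teams_webhook_url: str
    elk_host: Optional[str]
    elk_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read and validate configuration from an environment-like mapping."""
    webhook = env.get("TEAMS_WEBHOOK_URL", "").strip()
    if not webhook:
        raise ValueError("TEAMS_WEBHOOK_URL is not set")
    # ELK is treated as optional: an empty or missing host disables it.
    host = env.get("ELK_HOST", "").strip() or None
    port = int(env.get("ELK_PORT", "9200"))
    return Settings(teams_webhook_url=webhook, elk_host=host, elk_port=port)
```

Passing the mapping as a parameter keeps the function testable without touching the real environment.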
python3 main.py --threshold 85 --log_retention_days 30 --scan_path /var/log --check_zombies true
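The flags in the command above could be parsed with `argparse` roughly as follows. This is a sketch, not the parser in `main.py`: the defaults are assumptions that mirror the Jenkins parameter defaults, and `str_to_bool` is a hypothetical helper. An explicit converter is needed because `--check_zombies` receives the text `true` or `false`, and `type=bool` would treat any non-empty string, including `false`, as true.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Convert 'true'/'false' style text into a bool, rejecting anything else."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Disk and process monitoring")
    parser.add_argument("--threshold", type=float, default=85.0,
                        help="disk usage threshold in percent")
    parser.add_argument("--log_retention_days", type=int, default=30,
                        help="log retention period in days")
    parser.add_argument("--scan_path", default="/",
                        help="directory to scan")
    parser.add_argument("--check_zombies", type=str_to_bool, default=True,
                        help="enable zombie process detection (true or false)")
    return parser
```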
pipeline {
    agent any
    environment {
        PYTHON_VERSION = '3.9'
    }
    parameters {
        string(name: 'THRESHOLD', defaultValue: '85', description: 'Disk usage threshold')
        string(name: 'LOG_RETENTION_DAYS', defaultValue: '30', description: 'Log retention period (days)')
        string(name: 'SCAN_PATH', defaultValue: '/', description: 'Directory to scan')
        booleanParam(name: 'CHECK_ZOMBIES', defaultValue: true, description: 'Enable zombie process detection')
    }
    stages {
        stage('Disk and Process Monitoring') {
            steps {
                sh '''
                    # The workspace is cleaned after every build, so recreate the virtualenv each time.
                    python3 -m venv venv
                    # "." is the POSIX form of "source" and also works when /bin/sh is not bash.
                    . venv/bin/activate
                    pip install -r requirements.txt
                    python main.py --threshold "${THRESHOLD}" --log_retention_days "${LOG_RETENTION_DAYS}" --scan_path "${SCAN_PATH}" --check_zombies "${CHECK_ZOMBIES}"
                '''
            }
        }
    }
    post {
        always {
            cleanWs()
        }
    }
}