Thanks for the quick response. Yes, that was one of the first things I tried to adjust, and I always got the same error message despite lowering it. I lowered it to 500G, 400G, 300G, 200G, 180G, 120G, 100G, and as low as 8G.
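For example, the 200G attempt looked roughly like this (same flags as the original run below, with only the memory cap changed):
# one of the lowered-memory attempts; only --maxMemory differs from the original command (200G shown as an example)
cactus jobstore /scratch/cs1890/cactus_data/input/sequenceFile.txt \
    /scratch/cs1890/cactus_data/output/output.hal \
    --batchSystem slurm \
    --batchLogsDir /scratch/cs1890/cactus_logs \
    --coordinationDir /scratch/cs1890/tmp \
    --workDir /scratch/cs1890/tmp \
    --consCores 64 \
    --maxMemory 200G \
    --doubleMem true \
    --maxJobs 100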
From: Glenn Hickey ***@***.***>
Date: Wednesday, February 12, 2025 at 1:23 PM
To: ComparativeGenomicsToolkit/cactus ***@***.***>
Cc: cs1890 ***@***.***>, Author ***@***.***>
Subject: Re: [ComparativeGenomicsToolkit/cactus] Cactus Fails with “Not Enough Memory” on 12-species Alignment (SLURM, Toil) (Issue #1611)
Try lowering --maxMemory from 1.4Ti.
I successfully ran Progressive Cactus v2.9.3 on a SLURM-based HPC cluster for a 5-species alignment using the following commands:
# created a tmux session
module load python/3.8.2
source /scratch/cs1890/cactus-bin-v2.9.3/venv-cactus-v2.9.3/bin/activate
cd /scratch/cs1890/cactus_data/output
TOIL_SLURM_ARGS="partition=mem --time=3-00:00:00"
cactus jobstore /scratch/cs1890/cactus_data/input/sequenceFile.txt \
    /scratch/cs1890/cactus_data/output/output.hal \
    --batchSystem slurm \
    --batchLogsDir /scratch/cs1890/cactus_logs \
    --coordinationDir /scratch/cs1890/tmp \
    --workDir /scratch/cs1890/tmp \
    --consCores 64 \
    --maxMemory 1.4Ti \
    --doubleMem true \
    --maxJobs 100
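For reference, the memory that SLURM advertises for the nodes in the mem partition used above can be checked with a query along these lines (a sketch, not part of my original run; sinfo reports memory in MB):
# hypothetical check: node names, CPU counts, and memory (MB) SLURM advertises for the mem partition
sinfo -p mem -o "%N %c %m"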
However, after increasing to 12 species, the job consistently fails due to memory issues.
Not enough memory! User limited to 153931627886 bytes but we only have 135285168640 bytes.
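Converted to GiB (dividing by 2^30), those two byte counts come out to roughly 143.4 GiB allowed versus 126.0 GiB actually available:
# convert the byte counts from the error message above to GiB (1 GiB = 2^30 bytes)
awk 'BEGIN { printf "limit: %.1f GiB, available: %.1f GiB\n", 153931627886/2^30, 135285168640/2^30 }'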
This has been posted before, but I have tried everything that was recommended, and all my genomes have been repeat-masked.
Other error messages I received:
slurmstepd: error: Detected 1 oom_kill event in StepId=41103254.batch. Some of the step tasks have been OOM Killed. Exit reason: MEM LIMIT
Due to failure we are reducing the remaining try count of job 'CutHeadersJob' kind-CutHeadersJob/instance-bnhuhvlgv1 with ID kind-CutHeadersJob/instance-bnhuhvlg to 1
(I attached my log so you can see the exact messages)
I tried playing around with the flags and adjusting the parameters, but nothing worked. I also tried breaking the run into separate steps, which didn't help, and the cluster support staff I contacted were unable to come up with a solution.
The scratch space where my working directory and coordination directory are located is 2 TB.
cactus_run.log