Internal error in ComputeBranchCache #1822

Open
legason opened this issue Mar 18, 2025 · 6 comments

@legason

legason commented Mar 18, 2025

Dear HyPhy developers and users,

I tried to analyse my DNA sequences (54 in total, each ~135 kb long) for recombination using the command below:

ENV=TOLERATE_NUMERICAL_ERRORS=1 hyphy gard --alignment EBV_trimmed.fasta --tree EBV_trimmed.fasta.treefile --max-breakpoints 500 --model GTR+G --cpu 8

Of course, this is after trying to construct the tree in GARD, which failed several times. So I constructed the tree in RAxML, hoping that running GARD with a precomputed tree would make the analysis more efficient because HyPhy wouldn’t have to infer the tree dynamically. Is my alignment too large? I even tried reducing the breakpoints to 100 but still got the same error. I appreciate your help.

Thanks, Ismail.

@spond
Member

spond commented Mar 18, 2025

Dear @legason,

A couple of things.

  1. The ENV statement is a command argument to hyphy (not a shell variable); see below.
  2. I would suggest using HYPHYMPI. Also, the cpu= flag doesn't do anything (it's CPU=X, case-sensitive).
  3. The --tree option is ignored; it makes no sense for GARD. Similarly, --model is not an option HyPhy understands.

So something like

mpirun -np N HYPHYMPI gard --alignment EBV_trimmed.fasta  ENV="TOLERATE_NUMERICAL_ERRORS=1;" 
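The difference between the two placements of ENV (point 1 above) can be illustrated without HyPhy. The sketch below is only a demonstration of general process behaviour: `PROBE` and `run_probe` are hypothetical names, and the probe is a stand-in Python one-liner, not HyPhy. It shows that a `NAME=value` prefix before the program name ends up in the program's environment, whereas the same text placed after the program name arrives as an ordinary argument.

```python
import os
import subprocess
import sys

# Stand-in for any program: prints the arguments it received and the value
# of ENV in its environment (None if unset). This is NOT HyPhy.
PROBE = "import os, sys; print(sys.argv[1:], os.environ.get('ENV'))"


def run_probe(args, extra_env=None):
    """Run the stand-in program with the given arguments and extra environment."""
    # Start from the current environment, minus any pre-existing ENV variable.
    env = {k: v for k, v in os.environ.items() if k != "ENV"}
    env.update(extra_env or {})
    done = subprocess.run(
        [sys.executable, "-c", PROBE, *args],
        env=env, capture_output=True, text=True, check=True,
    )
    return done.stdout.strip()


# Equivalent of the shell line: ENV=TOLERATE_NUMERICAL_ERRORS=1 program gard
prefix_form = run_probe(["gard"], {"ENV": "TOLERATE_NUMERICAL_ERRORS=1"})

# Equivalent of the shell line: program gard ENV="TOLERATE_NUMERICAL_ERRORS=1;"
argument_form = run_probe(["gard", "ENV=TOLERATE_NUMERICAL_ERRORS=1;"])

print(prefix_form)    # the program sees no ENV=... argument at all
print(argument_form)  # the program receives ENV=... as an argument
```

In the prefix form the program never receives an ENV=... argument, which is consistent with the advice to place it after the analysis name.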

For long alignments like yours (135 kbp), I would expect GARD to run a long time; it's not really designed for very long genomes. It is meant more for RNA viruses or genome fragments, something on the order of several kbp.

Best,
Sergei

@legason
Author

legason commented Mar 18, 2025

Thanks, Sergei!

However, does supplying precomputed tree work or not?

@stevenweaver
Member

Dear @legason,

No, supplying a precomputed tree does not work for GARD. As mentioned in the response by @spond, the --tree option is ignored in GARD because the method itself is designed to infer trees dynamically as part of its recombination detection process. Using a precomputed tree will not make the analysis more efficient, as GARD still needs to evaluate different topologies at potential breakpoints to detect recombination events.

Best,
Steven

@legason
Author

legason commented Mar 19, 2025

Thanks, @spond and @stevenweaver, for your responses. I have run it as per @spond's guidance, but it takes hours.

Thanks again,
Ismail

@spond
Member

spond commented Mar 19, 2025

Dear @legason,

If you have highly similar sequences, doing a data compression like the one described in #1448 could help.
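Whether the sequences are in fact highly similar can be checked before trying any compression. The sketch below is only a rough diagnostic and does not implement the procedure from #1448; `read_fasta` and `summarize` are hypothetical helper names, and the toy alignment is made up for illustration. It counts sequences, exact duplicates, and alignment columns that contain more than one character.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a {name: sequence} dict (upper-cased)."""
    records, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def summarize(records):
    """Return basic size and variability counts for an aligned set of sequences."""
    seqs = list(records.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences differ in length; input is not an alignment")
    # A column is 'variable' if it holds more than one distinct character
    # (gaps and ambiguity codes count as characters in this simple sketch).
    variable = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    return {
        "sequences": len(seqs),
        "unique_sequences": len(set(seqs)),
        "sites": length,
        "variable_sites": variable,
    }


# Toy alignment, made up for illustration only.
toy = """>a
ACGTACGT
>b
ACGTACGA
>c
ACGTACGT
""".splitlines()

summary = summarize(read_fasta(toy))
print(summary)
```

For a real file, the same functions can be applied to an open handle, for example `with open("EBV_trimmed.fasta") as handle: print(summarize(read_fasta(handle)))`. A small share of variable sites or many duplicate sequences would suggest the alignment is highly redundant.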

Best,
Sergei

@legason
Author

legason commented Mar 19, 2025

Dear @spond
I will try it if the current session takes another 12 hours.

Thanks,
Ismail
