
cactus-pangenome inconsistent alignments in genic region between different runs #1613

Open
avril-m-harder opened this issue Feb 12, 2025 · 0 comments


avril-m-harder commented Feb 12, 2025

We have a ~30-kb region of interest that spans 5 genes in our primary reference genome (black path in the tube map). Based on alignments of the gene sequences (blat) and of the whole region (minimap2), we know that the haplotype represented by the purple path at the bottom of this tube map has a deletion that removes the first 2 genes in the region, while the last 3 genes are present. I extracted the sequences for the nodes traversed only by the purple path (these sequences include the 3 retained genes), and they align nearly perfectly to the primary reference (and to ≥20 other haplotypes in the graph). Is there any reason why these nodes might be kept separate and traversed only by the purple path, rather than the purple haplotype sequences aligning properly to the sequences on the other side of purple's deletion?
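For anyone wanting to repeat the extraction step, here is a minimal sketch of pulling node sequences out of a GFA as FASTA so they can be aligned back to the reference. It assumes the standard GFA `S` line layout (record type, segment ID, sequence); the function name `segments_to_fasta` and the toy graph are hypothetical and not taken from the actual run.

```python
def segments_to_fasta(gfa_lines, node_ids):
    """Return FASTA text for the S-line sequences of the requested node IDs."""
    wanted = set(node_ids)
    records = []
    for line in gfa_lines:
        # Only segment (S) lines carry node sequences.
        if not line.startswith("S\t"):
            continue
        fields = line.rstrip("\n").split("\t")
        node_id, sequence = fields[1], fields[2]
        if node_id in wanted:
            records.append(f">{node_id}\n{sequence}")
    return "\n".join(records) + ("\n" if records else "")


# Toy graph (hypothetical): four nodes, two of which are requested.
example_gfa = [
    "H\tVN:Z:1.1",
    "S\t1\tAC",
    "S\t2\tGG",
    "S\t3\tTT",
    "S\t4\tCA",
]
fasta_text = segments_to_fasta(example_gfa, {"2", "3"})
```

The resulting FASTA can then be passed to an aligner such as minimap2 or blat, as described above.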

Using the GFA, I confirmed that these purple-specific nodes are not traversed by any other path in the graph. When I build a graph using just the primary reference and the purple haplotype, the region aligns as it should. The purple haplotype and the primary reference aren't very divergent: the mash distance was 0.008, middle of the pack for the 33-haplotype set.
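The GFA check can be sketched roughly as follows. This is only an illustration, not the exact commands used: it assumes paths are stored as GFA `P` lines or GFA 1.1 `W` (walk) lines, and the helper names `paths_per_node` and `private_nodes`, the path names, and the toy graph are all hypothetical.

```python
import re
from collections import defaultdict


def paths_per_node(gfa_lines):
    """Map each node ID to the set of path/walk names that traverse it."""
    node_paths = defaultdict(set)
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "P":
            # P line: name, then comma-separated steps such as "12+,13-".
            name = fields[1]
            nodes = [step[:-1] for step in fields[2].split(",") if step]
        elif fields[0] == "W":
            # W line: sample, haplotype index, sequence ID, start, end, walk.
            # The walk is a string of oriented steps such as ">12<13>14".
            name = "#".join(fields[1:4])
            nodes = re.findall(r"[<>]([^<>]+)", fields[6])
        else:
            continue
        for node in nodes:
            node_paths[node].add(name)
    return node_paths


def private_nodes(gfa_lines, path_name):
    """Return node IDs traversed by path_name and by no other path."""
    return sorted(
        node
        for node, names in paths_per_node(gfa_lines).items()
        if names == {path_name}
    )


# Toy graph (hypothetical): node 3 is visited only by the "purple" walk.
example_gfa = [
    "S\t1\tAC",
    "S\t2\tGG",
    "S\t3\tTT",
    "S\t4\tCA",
    "P\tref\t1+,2+,4+\t*",
    "W\tpurple\t1\tchr1\t0\t6\t>1>3>4",
]
purple_only = private_nodes(example_gfa, "purple#1#chr1")
```

An empty result for every other path name, together with a non-empty result for the purple path, would match the observation above.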

(The two graphs mentioned above were built with v2.9.3. I previously built a graph with v2.7.1 using a different set of assemblies--not including the exact one represented in purple here but with 3 assemblies that have the same haplotype in this region--and the sequences aligned as expected in the graph.)

[Image: tube map of the region of interest]

Thanks!

*Edited to add: there are no path breaks for the purple haplotype here. There are alignments of this subpath on either side of these purple-specific nodes.
