We have a ~30-kb region of interest that spans 5 genes in our primary reference genome (black path in tube map). Based on alignments of the gene sequences (blat) and of the whole region (minimap2), we know that the haplotype represented by the purple path at the bottom of this tube map has a deletion that removes the first 2 genes in the region, while the last 3 genes are present. I extracted the sequences for the nodes traversed by only the purple path (these sequences include the 3 retained genes), and they align nearly perfectly to the primary reference (and to ≥20 other haplotypes in the graph). Are there any reasons why these nodes might be kept separate and traversed only by the purple path, rather than the purple haplotype sequence aligning to the shared sequence on the other side of purple's deletion?
Using the GFA, I confirmed that these purple-specific nodes are not traversed by any other path in the graph. When I build a graph using just the primary reference and the purple haplotype, the region aligns as it should. The purple haplotype and the primary reference aren't especially divergent: their mash distance was 0.008, middle of the pack for the 33-haplotype set.
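For anyone who wants to repeat the GFA check described above, here is a minimal sketch of one way to list the nodes traversed by a single path only. It is not the exact script I used. It assumes a GFA 1.x file where haplotypes are stored as `P` lines or `W` lines; the function names and the `sample#hap#seqid` naming used for `W` lines are my own choices for illustration.

```python
import re
from collections import defaultdict


def nodes_by_path(gfa_lines):
    """Map each segment ID to the set of path/walk names that traverse it."""
    node_paths = defaultdict(set)
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "P" and len(fields) >= 3:
            # P line: name, then comma-separated steps such as "12+,13-"
            name = fields[1]
            for step in fields[2].split(","):
                if step:
                    node_paths[step[:-1]].add(name)  # drop the +/- orientation
        elif fields[0] == "W" and len(fields) >= 7:
            # W line: sample, haplotype index, sequence ID, start, end, walk
            # The name below is an illustrative convention, not part of the spec.
            name = "#".join(fields[1:4])
            for node in re.findall(r"[<>]([^<>]+)", fields[6]):
                node_paths[node].add(name)
    return node_paths


def private_nodes(gfa_lines, path_name):
    """Return segment IDs traversed by path_name and by no other path."""
    return sorted(
        node
        for node, paths in nodes_by_path(gfa_lines).items()
        if paths == {path_name}
    )
```

With a real file, something like `private_nodes(open("graph.gfa"), "purple")` would return the purple-only segment IDs, where both the file name and the path name are placeholders. The sequences for those IDs can then be pulled from the matching `S` lines.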
(The two graphs mentioned above were built with v2.9.3. I previously built a graph with v2.7.1 using a different set of assemblies--not including the exact one represented in purple here but with 3 assemblies that have the same haplotype in this region--and the sequences aligned as expected in the graph.)
Thanks!
*Edited to add: there are no path breaks for the purple haplotype here. There are alignments of this subpath on either side of these purple-specific nodes.