Tag "transcript_id" not found in attributes when in index with vg #4515

Open
ld9866 opened this issue Jan 31, 2025 · 1 comment
ld9866 commented Jan 31, 2025

Dear developer:

Mapping transcriptome data to the pangenome with vg takes a very long time, about 5 days per sample on 32 cores, which may be caused by the large number of variants in the graph. I therefore want to extract a single gene region for analysis, but after extracting the region separately, we found that the GTF is no longer accepted as input.

I have read all the previous posts and still don't know what to do. Can you help me?

Best regards,
Dong

The code:
vg autoindex --threads 32 --workflow mpmap --workflow rpvg --prefix vg_rpvg --ref-fasta test.fa --vcf test.vcf.gz --tx-gff test.gtf

The error:
[IndexRegistry]: Checking for phasing in VCF(s).
[IndexRegistry]: Chunking inputs for parallelism.
[IndexRegistry]: Constructing spliced VG graph from FASTA and VCF input.
ERROR: Tag "transcript_id" not found in attributes (line 3).

The data:

test.tar.gz

@jeizenga
Contributor

The 9th column in a GTF file is a semi-structured list of named fields and values, and typically, one of these fields indicates a unique identifier for the transcript, such as an accession number/ID. In GENCODE, that field is called "transcript_id", so that is the default in vg autoindex and vg rna. However, if you are using annotations from a different source, they often have a different name for this field. You'll probably be able to figure out which field is the identifier if you look at line 3 of the GTF (as the error indicates) and then you can provide the name of that field to vg autoindex with the --gff-tx-tag option.
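
To make the "look at line 3" step concrete, here is a minimal Python sketch that lists the attribute names in the 9th column of each GTF line and reports the lines that lack a given tag. It is not part of vg; the helper names `attribute_tags` and `lines_missing_tag` are hypothetical, and the parsing is deliberately simplified (it assumes no semicolons inside quoted values). It counts physical lines in the file and checks every feature type, which may not match exactly how vg numbers or filters lines.

```python
import re


def attribute_tags(line):
    """Return the attribute names found in column 9 of one GTF/GFF line."""
    fields = line.rstrip("\n").split("\t")
    if line.startswith("#") or len(fields) < 9:
        return []
    tags = []
    # Attributes are separated by ';'. Each one is either `name "value"`
    # (GTF) or `name=value` (GFF3), so the name ends at the first
    # whitespace character or '='.
    for item in fields[8].split(";"):
        item = item.strip()
        if item:
            tags.append(re.split(r"[\s=]", item, maxsplit=1)[0])
    return tags


def lines_missing_tag(lines, tag="transcript_id"):
    """Yield (line_number, tags) for annotation lines that lack `tag`."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        tags = attribute_tags(line)
        if tag not in tags:
            yield number, tags


if __name__ == "__main__":
    import sys

    # Usage (file name taken from the command in the question):
    #   python check_gtf_tags.py test.gtf
    with open(sys.argv[1]) as handle:
        for number, tags in lines_missing_tag(handle):
            print(f"line {number}: tags present = {tags}")
```

If the output shows that the transcript identifier is stored under a different name, that name is what to pass to `vg autoindex` through the `--gff-tx-tag` option mentioned above.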
