16S species assignment #6

Open
jsan4christ opened this issue Aug 20, 2019 · 3 comments
Comments

@jsan4christ

Hi @mreppell,

I trust that you are well.

I'm looking into the possibility of using Karp to get better species assignments for 16S. I'm not sure that I'm using it for the right purpose, and after looking at the examples it's not clear to me how to proceed. Assume I have an OTU table and a species assignment reference database such as SILVA or RDP. How would I go from these to a valid taxa file?

Please advise,

@mreppell
Owner

Thank you for reaching out. Karp uses the base quality scores in sequencing reads to help resolve multiply mapping reads. As such, it requires the raw sequencing data, in fastq format, as input. An OTU table does not have the information that Karp needs to assign taxonomy.
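
As a rough illustration of the input requirement, the sketch below checks whether the first record of a file has the four-line FASTQ layout (an "@" header, a sequence, a "+" separator, and a quality string as long as the sequence). The function name is_fastq is hypothetical and is not part of Karp. The check assumes a gzip that passes uncompressed input through unchanged with -cdf, and it only looks at the first record, so it is a sanity check rather than a validator.

```shell
# Hypothetical helper, not part of Karp: succeeds if the first record of
# the file (plain or gzipped) looks like a four-line FASTQ record.
is_fastq() {
    gzip -cdf -- "$1" 2>/dev/null | head -n 4 | awk '
        NR == 1 { ok = (substr($0, 1, 1) == "@") }        # header line
        NR == 2 { seqlen = length($0) }                   # sequence
        NR == 3 { ok = ok && (substr($0, 1, 1) == "+") }  # separator
        NR == 4 { ok = ok && (length($0) == seqlen) }     # quality string
        END { exit !(ok && NR == 4) }
    '
}
```

For example, `is_fastq yourdata.fastq.gz && echo "looks like FASTQ"` prints the message for raw reads, while an OTU table would fail the check because it has no per-base quality strings.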

I hope this helps, and I'm happy to answer any additional questions you have,

Mark Reppell

@jsan4christ
Author

jsan4christ commented Aug 20, 2019 via email

@mreppell
Owner

If you have installed Karp and have:

Raw fastq file with reads: "yourdata.fastq.gz"
Reference database fasta file: "reference.fasta"
Reference database taxonomy file: "reference.tax" (the format of this file is described on the Karp main page)

Then first, build an index of your reference database:

./karp -c index -r reference.fasta -i reference.index

This will create "reference.index". Then, you classify your fastq file using:

./karp -c quantify -r reference.fasta -i reference.index -f yourdata.fastq.gz -o yourdata.results -t reference.tax

This will produce a file "yourdata.results" containing Karp's estimates of taxa abundance in your sample. If you are using paired-end reads, the command becomes:

./karp -c quantify -r reference.fasta -i reference.index -f yourdata.R1.fastq.gz -q yourdata.R2.fastq.gz --paired -o yourdata.results -t reference.tax
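
The index and quantify steps above can be sketched as a small dry-run helper that prints the commands for one sample, adding -q and --paired only when a reverse-read file is given. The function name karp_commands and its argument order are hypothetical. The flags are copied from the commands in this thread, and the index name is assumed to be the reference fasta name with ".fasta" replaced by ".index".

```shell
# Hypothetical dry-run helper: prints the Karp commands instead of running them.
# Usage: karp_commands REF_FASTA REF_TAX OUTPUT FORWARD_FASTQ [REVERSE_FASTQ]
karp_commands() {
    ref_fasta=$1   # reference database fasta, e.g. reference.fasta
    ref_tax=$2     # reference taxonomy file, e.g. reference.tax
    out=$3         # results file, e.g. yourdata.results
    fwd=$4         # single-end or forward reads
    rev=${5:-}     # optional reverse reads (paired-end data)
    index="${ref_fasta%.fasta}.index"   # assumed naming convention

    # Step 1: index the reference database.
    echo "./karp -c index -r $ref_fasta -i $index"

    # Step 2: quantify, with paired-end flags only if reverse reads were given.
    if [ -n "$rev" ]; then
        echo "./karp -c quantify -r $ref_fasta -i $index -f $fwd -q $rev --paired -o $out -t $ref_tax"
    else
        echo "./karp -c quantify -r $ref_fasta -i $index -f $fwd -o $out -t $ref_tax"
    fi
}
```

Calling `karp_commands reference.fasta reference.tax yourdata.results yourdata.fastq.gz` prints the two single-end commands so they can be reviewed before being run by hand.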

I hope this is helpful,

Mark
