Skip to content

Latest commit

 

History

History
51 lines (36 loc) · 2.25 KB

README.md

File metadata and controls

51 lines (36 loc) · 2.25 KB

Cleaning of the Melastomataceae probe set for target enrichment

This repository details the cleaning process carried out for the Melastomataceae probe set.

See An updated and extended version of the Melastomataceae probe set for target capture (Dagallier & Michelangeli 2024).

Cleaning process

The cleaning process is detailed here.

Clean probe set

The new and clean probe set is available here: PROBE_SET_CLEAN.FNA (nucleotides version) and PROBE_SET_CLEAN_prot.FAA (amino acid version).

N.B. The purpose of this clean probe set is to be used bioinformatically to recover targeted sequences from sequencing reads, but not to physically target the DNA in vitro.

Additional note. It might be interesting to remove short sequences from the probe set, e.g. with hybpiper fix_targetfile (https://github.com/mossmatters/HybPiper/wiki/Troubleshooting,-common-issues,-and-recommendations#14-fixing-and-filtering-your-target-file)

Comparison between the old and new probe set

See details here.

statplot Figure 1. Summary of recovery statistics computed with HybPiper for the assemblies with the old probe set (blue) and the new probe set in nucleotide format (yellow), and with the new probe set in amino-acids format (orange). A: number of loci with mapped reads, B: number of loci with assembled sequences, and C: number of loci with assembled sequences equal or longer to 75% of the length of their locus reference in the probe set. Burrow-Wheeler aligner (bwa) was used to map the reads with nucleotide probe sets, and Diamond was used for the amino-acids probe set. Numbers right to the boxplots are the median value.

How to cite

Please cite as: Dagallier L-PMJ, Michelangeli FA. 2024. An updated and extended version of the Melastomataceae probe set for target capture. Applications in Plant Sciences 12: e11564. https://doi.org/10.1002/aps3.11564