layout | title | permalink |
---|---|---|
page | Tutorial | /tutorial/ |
# download the pipeline
wget https://github.com/terrimporter/MetaWorks/releases/download/v1.13.0/MetaWorks1.13.0.zip
# unzip the pipeline
unzip MetaWorks1.13.0.zip
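# Download miniconda3 if you don't already have conda on your system
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh
# Create a bin folder in your home directory (if needed) and link conda into it
mkdir -p ~/bin
cd ~/bin
ln -s ~/miniconda3/bin/conda conda
# Check which shell you are using, then initialize conda for it
echo $SHELL
conda init bash
# Move into the MetaWorks folder (from the directory where you unzipped it)
cd MetaWorks1.13.0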
conda env create -f environment.yml
conda activate MetaWorks_v1.13.0
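Before going further, it can help to confirm that the environment is actually active. The following is a minimal sketch, assuming that `conda activate` has set the `CONDA_DEFAULT_ENV` variable in the current shell; the helper name `env_is_active` is hypothetical and not part of MetaWorks.

```shell
# Hypothetical helper: succeed only if the named conda environment is active.
# Relies on CONDA_DEFAULT_ENV, which conda sets when an environment is activated.
env_is_active() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if env_is_active "MetaWorks_v1.13.0"; then
    echo "MetaWorks_v1.13.0 is active"
else
    echo "MetaWorks_v1.13.0 is not active; run: conda activate MetaWorks_v1.13.0"
fi
```

If the check reports that the environment is not active right after running `conda init bash`, opening a new terminal and activating again is usually worth trying first.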
# download the COI v5.0.0 classifier
wget https://github.com/terrimporter/CO1Classifier/releases/download/RDP-COI-v5.0.0/RDP_COIv5.0.0.zip
# decompress the file
unzip RDP_COIv5.0.0.zip
# Note the full path to the rRNAClassifier.properties file, ex. mydata_trained/rRNAClassifier.properties
The MetaWorks documentation lists the custom-trained classifiers that work with the pipeline.
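To capture the full path to the rRNAClassifier.properties file for later use in the config file, something like the following may help. It is a sketch only: the helper name `classifier_properties_path` is hypothetical, and it assumes the file sits in a `mydata_trained` folder as in the example path above.

```shell
# Hypothetical helper: print the absolute path of rRNAClassifier.properties
# inside the directory given as $1 (for example, mydata_trained).
classifier_properties_path() {
    if [ -f "$1/rRNAClassifier.properties" ]; then
        # cd + pwd -P turns the directory into an absolute physical path
        echo "$(cd "$1" && pwd -P)/rRNAClassifier.properties"
    else
        echo "rRNAClassifier.properties not found in $1" >&2
        return 1
    fi
}

# Example: run from the folder where the classifier archive was unzipped.
# "|| true" keeps the script going if the folder is not there yet.
classifier_properties_path mydata_trained || true
```

The printed path is what goes into the `t:` entry of the config file.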
# download ORFfinder if you wish to filter out putative pseudogenes
wget ftp://ftp.ncbi.nlm.nih.gov/genomes/TOOLS/ORFfinder/linux-i64/ORFfinder.gz
# decompress
gunzip ORFfinder.gz
# make executable
chmod 755 ORFfinder
# Create a bin folder in your home directory if you do not already have one.
mkdir -p ~/bin
# move ORFfinder into a folder that is in your PATH (e.g. ~/bin)
mv ORFfinder ~/bin/
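Moving ORFfinder into ~/bin only helps if that folder is in your PATH. The sketch below checks for this; the helper name `dir_on_path` is hypothetical, and the suggested `export` line assumes a bash setup that reads `~/.bashrc`.

```shell
# Hypothetical helper: succeed if directory $1 is one of the entries in PATH.
dir_on_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if dir_on_path "$HOME/bin"; then
    echo "$HOME/bin is on PATH"
else
    echo "$HOME/bin is not on PATH; consider adding this line to ~/.bashrc:"
    echo 'export PATH="$HOME/bin:$PATH"'
fi

# command -v prints the resolved location if the shell can find ORFfinder
command -v ORFfinder || echo "ORFfinder not found on PATH"
```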
We have provided a small set of COI paired-end Illumina MiSeq files for this tutorial. These sequence files contain reads for several pooled COI amplicons, but here we will focus on the COI-BR5 amplicon (Hajibabaei et al., 2012; Gibson et al., 2014).
The config_testing_COI_data.yaml file has been 'preset' to work with the COI_data files in the testing folder. You will, however, still need to add the path to the trained COI classifier and save your changes.
RDP:
    # If you are using a custom-trained reference set
    # enter the path to the trained RDP classifier rRNAClassifier.properties file here:
    t: "/path/to/CO1Classifier/v4/mydata_trained/rRNAClassifier.properties"
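After saving the config file, a quick check that the classifier path points at a real file can catch typos early. This is a rough sketch: the helper name `configured_classifier` is hypothetical, and it assumes the path is stored on a single line of the form `t: "..."` ending in rRNAClassifier.properties, as in the fragment above.

```shell
# Hypothetical helper: print the classifier path stored under the t: key.
# Assumes a line of the form:  t: "/some/path/rRNAClassifier.properties"
configured_classifier() {
    sed -n 's/^[[:space:]]*t:[[:space:]]*"\(.*rRNAClassifier\.properties\)".*$/\1/p' "$1"
}

config="config_testing_COI_data.yaml"
if [ -f "$config" ]; then
    classifier=$(configured_classifier "$config")
    if [ -f "$classifier" ]; then
        echo "Classifier found: $classifier"
    else
        echo "Classifier path does not exist yet: $classifier"
    fi
fi
```

Comment lines are skipped because the pattern only matches lines whose first non-blank text is `t:`.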
Then you should be ready to run the MetaWorks pipeline on the testing data.
# Adjust --jobs to the number of cores you have available, e.g. --jobs 1 or --jobs 4
snakemake --jobs 2 --snakefile snakefile_ESV --configfile config_testing_COI_data.yaml
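If you are unsure how many cores are available, the sketch below prints a suggested command instead of running it, so it can be reviewed first. The helper name `available_cores` is hypothetical; it relies on `nproc` where present and falls back to `getconf`, then to 1.

```shell
# Hypothetical helper: print the number of available CPU cores, falling back to 1.
available_cores() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
    fi
}

jobs_to_use=$(available_cores)
# Print the command rather than executing it.
# Adding -n (--dry-run) makes snakemake list the planned jobs without running them.
echo "snakemake --jobs $jobs_to_use --snakefile snakefile_ESV --configfile config_testing_COI_data.yaml"
```

On a shared machine you may prefer a smaller number than the one printed here.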
The final output file is called results.csv and contains the results for the COI-BR5 amplicon. It can be imported into R for bootstrap support filtering, pivot table creation, normalization, analysis with the vegan package, etc. The stats directory contains additional output files showing the total number of reads processed at each step as well as the sequence lengths. Log files are also available for the dereplication, denoising, and chimera removal steps.
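Before importing results.csv into R, a quick look at its columns and size from the shell can be useful. The following is a sketch only: the helper names `list_columns` and `count_records` are hypothetical, and it assumes a comma-separated header row without quoted commas.

```shell
# Hypothetical helper: list the column names of a CSV file, numbered, one per line.
list_columns() {
    head -n 1 "$1" | tr ',' '\n' | awk '{ print NR ": " $0 }'
}

# Hypothetical helper: count data rows (every line after the header).
count_records() {
    awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}

if [ -f results.csv ]; then
    list_columns results.csv
    echo "records: $(count_records results.csv)"
fi
```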
If you are done with MetaWorks, deactivate the conda environment:
conda deactivate