The following directories will be created in the output directory after the pipeline has finished.
Directory | Description |
---|---|
qc/fastqc/raw/ | FastQC results for read 1 and read 2 before adapter trimming. |
qc/fastqc/trim/ | FastQC results for read 1 and read 2 after adapter trimming. |
qc/fastq_screen/ | FastQ Screen results for read 1 and read 2 before adapter trimming. |
qc/cutadapt/ | Log files generated by cutadapt containing adapter trimming information. |
qc/multiqc/ | HTML file generated by MultiQC that collates pipeline QC from FastQC, FastQ Screen, cutadapt, samtools flagstat, samtools idxstats, picard CollectMultipleMetrics and picard MarkDuplicates. |
Directory | Description |
---|---|
align/ | Filtered, coordinate-sorted alignment files in BAM format at the run level for each sample. |
align/flagstat/ | Multiple BAM files will be generated before the final filtered BAM file is created. The SAMtools flagstat files for a selection of these will be placed in this directory. |
align/idxstats/ | SAMtools idxstats files to determine the percentage of reads mapping to each contig in the reference assembly. |
align/picard_metrics/ | Alignment QC files from picard CollectMultipleMetrics and the metrics file from MarkDuplicates. |
align/sysout/ | Sysout files for various programs to aid in troubleshooting errors. |
Directory | Description |
---|---|
align/replicateLevel/ | Replicate-level, merged, coordinate-sorted BAM files after the re-marking and removal of duplicates. |
align/replicateLevel/flagstat/ | Flagstat files associated with the final filtered merged BAM file. |
align/replicateLevel/picard_metrics/ | Metrics file from MarkDuplicates. |
align/replicateLevel/sysout/ | Sysout files for various programs to aid in troubleshooting errors. |
align/replicateLevel/bigwig/ | Normalised bigWig files scaled to 1 million mapped read pairs. |
align/replicateLevel/danpos/ | Normalised wig files scaled to 1 million mapped reads, and a spreadsheet containing genome-wide nucleosome positions generated by DANPOS2. |
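
The directory layout listed in the tables above can be checked after a run with a short script. The following is a minimal sketch, not part of the pipeline: the function name `missing_output_dirs` is hypothetical, and the list simply repeats the directories documented in the tables.

```python
from pathlib import Path

# Directories documented in the tables above, relative to the output directory.
EXPECTED_DIRS = [
    "qc/fastqc/raw/",
    "qc/fastqc/trim/",
    "qc/fastq_screen/",
    "qc/cutadapt/",
    "qc/multiqc/",
    "align/",
    "align/flagstat/",
    "align/idxstats/",
    "align/picard_metrics/",
    "align/sysout/",
    "align/replicateLevel/",
    "align/replicateLevel/flagstat/",
    "align/replicateLevel/picard_metrics/",
    "align/replicateLevel/sysout/",
    "align/replicateLevel/bigwig/",
    "align/replicateLevel/danpos/",
]


def missing_output_dirs(outdir):
    """Return the documented directories that do not exist under outdir."""
    base = Path(outdir)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]
```

Calling `missing_output_dirs("results")` on a completed run should return an empty list if every documented directory was created; `results` here is only a placeholder for your own output directory.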
Genome-specific files required by selected processes in the pipeline.
An IGV session file called igv_session.xml will be created at the end of the pipeline. This avoids having to load all the data individually into IGV for visualisation. Once IGV is installed, open it, go to File > Open Session and select the igv_session.xml file for loading.
File paths in the IGV session file will be set as absolute paths to the directory containing the results. If you prefer to load the data over the web, replace the relevant portion of each file path in the session file with a URL.
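
Swapping the local path prefix for a web link can be scripted rather than done by hand. The sketch below is an illustration only: the function name `rewrite_session_paths` is hypothetical, and it assumes that file locations are stored as XML attribute values beginning with the absolute results path. It rewrites any attribute value that starts with the given prefix, so it makes no assumption about which elements hold the paths.

```python
import xml.etree.ElementTree as ET


def rewrite_session_paths(session_xml, local_prefix, url_prefix):
    """Replace local_prefix with url_prefix at the start of every attribute value."""
    root = ET.fromstring(session_xml)
    for element in root.iter():
        # Copy the items so the attributes can be updated while looping.
        for name, value in list(element.attrib.items()):
            if value.startswith(local_prefix):
                element.set(name, url_prefix + value[len(local_prefix):])
    return ET.tostring(root, encoding="unicode")
```

To use it, read igv_session.xml into a string, call the function with your results directory and the base URL where the same files are hosted, and write the returned string to a new session file. Note that any attribute starting with the prefix is rewritten, including the genome path if it lives under the same directory.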
The reference genome FASTA file will be soft-linked to the genome/ directory, and by default this will be set as the genome for the IGV session. If you prefer to use a built-in genome provided by IGV, change the file path to the name of the IGV genome, e.g. mm10 or hg19.
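
Changing the session genome can also be done programmatically. This is a minimal sketch under one assumption: that the genome is recorded in a `genome` attribute on the root element of the session file. The function name `set_session_genome` is hypothetical, and the function raises an error rather than guessing if the attribute is absent.

```python
import xml.etree.ElementTree as ET


def set_session_genome(session_xml, genome_id):
    """Set the root element's 'genome' attribute to an IGV genome name such as mm10."""
    root = ET.fromstring(session_xml)
    if "genome" not in root.attrib:
        # Assumption not met: do not invent a location for the genome setting.
        raise ValueError("no 'genome' attribute found on the root element")
    root.set("genome", genome_id)
    return ET.tostring(root, encoding="unicode")
```

For example, passing the contents of igv_session.xml together with `"mm10"` returns the session XML with the genome switched to the mm10 genome provided by IGV.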
Nextflow provides functionality for generating various reports on the running and execution of the pipeline. These reports allow you to troubleshoot errors with the running of the pipeline, and also provide other information such as launch commands, run times and resource usage. The default reports generated by the pipeline are BABS-MNASeqPE_report.html, BABS-MNASeqPE_timeline.html, BABS-MNASeqPE_trace.txt and BABS-MNASeqPE_dag.dot.
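
When troubleshooting, the trace file can be scanned for tasks that did not finish. The sketch below assumes the trace file is tab-separated with a header row that includes `name` and `status` columns, and that successful tasks are reported as `COMPLETED` or `CACHED`, as in Nextflow's default trace output. The function name `unfinished_tasks` is hypothetical.

```python
import csv
import io


def unfinished_tasks(trace_text):
    """Return (name, status) for every task not marked COMPLETED or CACHED."""
    reader = csv.DictReader(io.StringIO(trace_text), delimiter="\t")
    return [
        (row.get("name"), row.get("status"))
        for row in reader
        if row.get("status") not in ("COMPLETED", "CACHED")
    ]
```

Reading BABS-MNASeqPE_trace.txt into a string and passing it to this function gives a short list of the tasks worth inspecting first, which can then be cross-referenced with the sysout files described above.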