Add Biscuit #295

Open
wants to merge 31 commits into
base: dev
3f1fde8
add biscuit modules
Dec 8, 2022
694f7bb
install new modules
Dec 12, 2022
ccea079
Merge branch 'master' into biscuit
njspix Dec 12, 2022
7fc6063
git fixes
Dec 12, 2022
0f1e260
just starting to prototype biscuit
njspix Dec 12, 2022
a44715c
local change biscuit modules; hopefully upstream soon
njspix Dec 14, 2022
d6bcf6a
adding pileup
njspix Dec 14, 2022
ab1a7d1
fix a bit of epic munging
njspix Dec 14, 2022
84748ae
filling out biscuit, changing some options
njspix Dec 15, 2022
41e873d
params update
njspix Dec 15, 2022
7691ea3
more params tweaks
njspix Dec 15, 2022
2470c54
more tweaks
njspix Dec 15, 2022
96039c5
install njspix biscuit modules (temp)
njspix Dec 19, 2022
c42d4b4
module updates
njspix Dec 19, 2022
ad32b36
update
njspix Jan 2, 2023
0e9edb2
Merge branch 'master' of https://github.com/njspix/methylseq into bis…
Jan 2, 2023
e14555c
tweaks
njspix Jan 2, 2023
451d880
more config tweaks, refactoring
njspix Jan 2, 2023
d393907
update schema, modules, etc
njspix Jan 18, 2023
fe38f2a
update modules
njspix Mar 7, 2023
4d758a6
getting rid of enable_conda
njspix Mar 7, 2023
9bcaa02
linting passes now
njspix Mar 7, 2023
068a590
more conda leftovers
njspix Mar 7, 2023
5c1108e
working on unifying options
njspix Mar 7, 2023
88e7cf0
schema updates
njspix Mar 7, 2023
f52a294
fancy config for methyldackel actually works now
njspix Mar 8, 2023
b5e919a
more schema tweaks
njspix Mar 8, 2023
7c98649
changelog updates
njspix Mar 8, 2023
8380e76
Merge branch 'dev' into biscuit
Jan 8, 2024
dd85987
module updates
Jan 8, 2024
6c92ffa
schema updates
Jan 8, 2024
18 changes: 18 additions & 0 deletions CHANGELOG.md
Expand Up @@ -49,6 +49,24 @@
- 🐛 fix `ignore_3prime_r2` param #299
- 🐛 removed unused directory #297

## Working list...

### Pipeline Updates

- Add `Biscuit` aligner as a separate, 3rd workflow
- Add `biscuit_index` parameter
- Add a `--save_merged` option to save merged (`cat`'d) fastq files for the same sample

### Refactoring

- 🧹 Refactoring of parameters, especially those that generalize across aligners:
  - `--non_directional` is now under 'Alignment options' rather than 'Bismark options', as both the Bismark and Biscuit aligners support directional and non-directional alignment.
  - New 'Methylation calling options' section:
    - `--merge_cg` replaces the former `--cytosine_report` option. Biscuit and bwa-meth produce stranded methylation calls by default, while Bismark does not. This flag abstracts away the necessary flags for each tool and defaults to producing merged (unstranded) methylation calls.
    - `--meth_cutoff`, `--comprehensive`, and the `--ignore` parameters apply to all 3 aligners.
    - `--no_overlap` applies to Biscuit and Bismark but NOT bwa-meth.
  - `--nomeseq` has been moved to 'Special Library Types', as both `bismark` and `biscuit` support NOMe-seq.
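
As a reading aid for the refactoring notes above, the aligner coverage of each shared option can be written down as a small lookup. This is only a sketch in Python, not pipeline code: the table is transcribed from the changelog text, while the dictionary layout, the workflow keys (`bismark`, `bwameth`, `biscuit`), and the helper name `unsupported_options` are assumptions made for illustration.

```python
# Aligner support for the shared options, transcribed from the changelog entry.
# The table layout and the helper name are illustrative assumptions.
ALL = {"bismark", "bwameth", "biscuit"}

SUPPORTED_BY = {
    "non_directional": {"bismark", "biscuit"},
    "merge_cg": ALL,
    "meth_cutoff": ALL,
    "comprehensive": ALL,
    "ignore": ALL,
    "ignore_3prime": ALL,
    "ignore_r2": ALL,
    "ignore_3prime_r2": ALL,
    "no_overlap": {"bismark", "biscuit"},
    "nomeseq": {"bismark", "biscuit"},
}


def unsupported_options(aligner, options):
    """Return the names of set options that the chosen workflow does not honour."""
    return sorted(
        name
        for name, value in options.items()
        if value and name in SUPPORTED_BY and aligner not in SUPPORTED_BY[name]
    )
```

For example, calling `unsupported_options("bwameth", {"no_overlap": True, "comprehensive": True})` would report only `no_overlap`, which matches the note that `--no_overlap` does not apply to bwa-meth.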

## [v2.3.0](https://github.com/nf-core/methylseq/releases/tag/2.3.0) - 2022-12-16

### Pipeline Updates
Expand Down
34 changes: 17 additions & 17 deletions README.md
Expand Up @@ -22,23 +22,23 @@ On release, automated continuous integration tests run the pipeline on a full-si

## Pipeline Summary

The pipeline allows you to choose between running either [Bismark](https://github.com/FelixKrueger/Bismark) or [bwa-meth](https://github.com/brentp/bwa-meth) / [MethylDackel](https://github.com/dpryan79/methyldackel).
Choose between workflows by using `--aligner bismark` (default, uses bowtie2 for alignment), `--aligner bismark_hisat` or `--aligner bwameth`.

| Step | Bismark workflow | bwa-meth workflow |
| -------------------------------------------- | ---------------- | --------------------- |
| Generate Reference Genome Index _(optional)_ | Bismark | bwa-meth |
| Merge re-sequenced FastQ files | cat | cat |
| Raw data QC | FastQC | FastQC |
| Adapter sequence trimming | Trim Galore! | Trim Galore! |
| Align Reads | Bismark | bwa-meth |
| Deduplicate Alignments | Bismark | Picard MarkDuplicates |
| Extract methylation calls | Bismark | MethylDackel |
| Sample report | Bismark | - |
| Summary Report | Bismark | - |
| Alignment QC | Qualimap | Qualimap |
| Sample complexity | Preseq | Preseq |
| Project Report | MultiQC | MultiQC |
The pipeline allows you to choose between running [Bismark](https://github.com/FelixKrueger/Bismark), [bwa-meth](https://github.com/brentp/bwa-meth) / [MethylDackel](https://github.com/dpryan79/methyldackel), or [Biscuit](https://huishenlab.github.io/biscuit/).
Choose between workflows by using `--aligner bismark` (default, uses bowtie2 for alignment), `--aligner bismark_hisat`, `--aligner bwameth`, or `--aligner biscuit`.

| Step | Bismark workflow | bwa-meth workflow | Biscuit workflow |
| -------------------------------------------- | ---------------- | --------------------- | ---------------- |
| Generate Reference Genome Index _(optional)_ | Bismark | bwa-meth | Biscuit `index` |
| Merge re-sequenced FastQ files | cat | cat | cat |
| Raw data QC | FastQC | FastQC | FastQC |
| Adapter sequence trimming | Trim Galore! | Trim Galore! | Cutadapt |
| Align Reads | Bismark | bwa-meth | Biscuit `align` |
| Deduplicate Alignments | Bismark | Picard MarkDuplicates | samblaster |
| Extract methylation calls | Bismark | MethylDackel | Biscuit `pileup` |
| Sample report | Bismark | - | Biscuit `qc` |
| Summary Report | Bismark | - | Biscuit `qc` |
| Alignment QC | Qualimap | Qualimap | Qualimap |
| Sample complexity | Preseq | Preseq | Preseq |
| Project Report | MultiQC | MultiQC | MultiQC |
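
The relationship between the four `--aligner` values and the three workflow columns in the table can be sketched as a lookup. The Python below is illustrative only; the dictionary and the function name `workflow_for` are hypothetical and do not appear in the pipeline.

```python
# Map each documented --aligner value to the workflow column it selects.
# Names here are illustrative; the pipeline itself is written in Nextflow.
WORKFLOW_FOR_ALIGNER = {
    "bismark": "Bismark",        # default, bowtie2 alignment
    "bismark_hisat": "Bismark",  # same workflow, different underlying aligner
    "bwameth": "bwa-meth",
    "biscuit": "Biscuit",
}


def workflow_for(aligner="bismark"):
    """Return the workflow name for an --aligner value, or raise on unknown values."""
    try:
        return WORKFLOW_FOR_ALIGNER[aligner]
    except KeyError:
        valid = ", ".join(sorted(WORKFLOW_FOR_ALIGNER))
        raise ValueError(f"unknown aligner {aligner!r}; expected one of: {valid}")
```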

## Usage

Expand Down
215 changes: 204 additions & 11 deletions conf/modules.config
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,15 @@ process {
]
}

withName: CAT_FASTQ {
publishDir = [
path: { "${params.outdir}/merged_fastq" },
mode: params.publish_dir_mode,
pattern: '*.gz',
enabled: params.save_merged
]
}
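
The effect of the `CAT_FASTQ` publishing rule can be modelled roughly as follows. This is a Python approximation, assuming that Nextflow's `pattern: '*.gz'` behaves like a simple filename glob and that `save_merged` defaults to off; the function name and the sample file names are hypothetical.

```python
from fnmatch import fnmatchcase


def merged_fastq_destination(filename, outdir, save_merged=False):
    """Return where a CAT_FASTQ output would be published, or None if it is not."""
    if not save_merged:
        return None  # publishDir is disabled unless --save_merged is set
    if not fnmatchcase(filename, "*.gz"):
        return None  # only gzipped files match the publish pattern
    return f"{outdir}/merged_fastq/{filename}"
```

Under these assumptions, a file such as `sampleA_1.merged.fastq.gz` would land in `<outdir>/merged_fastq/` only when `--save_merged` is given.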

withName: FASTQC {
ext.args = '--quiet'
publishDir = [
Expand Down Expand Up @@ -209,11 +218,11 @@ process {
params.comprehensive ? ' --comprehensive --merge_non_CpG' : '',
params.meth_cutoff ? " --cutoff ${params.meth_cutoff}" : '',
params.nomeseq ? '--CX' : '',
params.ignore_r1 > 0 ? "--ignore ${params.ignore_r1}" : '',
params.ignore_3prime_r1 > 0 ? "--ignore_3prime ${params.ignore_3prime_r1}" : '',
meta.single_end ? '' : (params.no_overlap ? ' --no_overlap' : '--include_overlap'),
meta.single_end ? '' : (params.ignore_r2 > 0 ? "--ignore_r2 ${params.ignore_r2}" : ""),
meta.single_end ? '' : (params.ignore_3prime_r2 > 0 ? "--ignore_3prime_r2 ${params.ignore_3prime_r2}": "")
meta.single_end ? '' : (params.no_overlap ? ' --no_overlap' : '--include_overlap'),
params.ignore > 0 ? "--ignore ${params.ignore}" : '',
params.ignore_3prime > 0 ? "--ignore_3prime ${params.ignore_3prime}" : '',
meta.single_end ? '' : (params.ignore_r2 > 0 ? "--ignore_r2 ${params.ignore_r2}" : ""),
meta.single_end ? '' : (params.ignore_3prime_r2 > 0 ? "--ignore_3prime_r2 ${params.ignore_3prime_r2}" : "")
].join(' ').trim() }
publishDir = [
[
Expand Down Expand Up @@ -304,6 +313,173 @@ process {
]
}
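
The Bismark methylation-extractor `ext.args` closure earlier in this file maps the unified trimming parameters onto same-named Bismark flags. A minimal Python model of that mapping is given here as a sketch. It assumes that read 1 trimming (`--ignore`, `--ignore_3prime`) should also apply to single-end libraries, as it did in the lines the hunk replaces, and that each `*_r2` parameter maps to the Bismark flag of the same name. The function name and the dictionary-based `params` are illustrative.

```python
def bismark_extractor_args(params, single_end):
    """Assemble methylation-extractor flags from the unified pipeline parameters."""
    args = []
    if params.get("comprehensive"):
        args += ["--comprehensive", "--merge_non_CpG"]
    if params.get("meth_cutoff"):
        args += ["--cutoff", str(params["meth_cutoff"])]
    if params.get("nomeseq"):
        args.append("--CX")
    # Read 1 trimming applies to both single-end and paired-end libraries.
    for name in ("ignore", "ignore_3prime"):
        if params.get(name, 0) > 0:
            args += [f"--{name}", str(params[name])]
    if not single_end:
        # Overlap handling and read 2 trimming only make sense for pairs.
        args.append("--no_overlap" if params.get("no_overlap") else "--include_overlap")
        for name in ("ignore_r2", "ignore_3prime_r2"):
            if params.get(name, 0) > 0:
                args += [f"--{name}", str(params[name])]
    return " ".join(args)
```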

withName: BISCUIT_INDEX {
ext.args = ''
publishDir = [
path: { "${params.outdir}/${params.aligner}/reference_genome" },
saveAs: { it =~ /.*\.yml/ ? null : it },
mode: params.publish_dir_mode,
enabled: params.save_reference
]
}

withName: BISCUIT_ALIGN {
ext.args = { [
// Use directional alignment (align read 1 to parent, read 2 to daughter).
// Note that `pbat` libraries are directional, but 'backwards' (read 1 aligns to daughter);
// this 'inversion' is handled in the `biscuit` subworkflow by reversing read 1 and read 2 if `pbat` is set.
( params.pbat || params.single_cell || params.zymo || (!params.non_directional) ) ? '-b 1' : '',
].join(' ').trim() }
ext.args2 = ''
publishDir = [
[
path: { "${params.outdir}/${params.aligner}/alignments/raw" },
mode: params.publish_dir_mode,
pattern: "*.bam.*",
enabled: (params.save_align_intermeds || params.skip_deduplication || params.rrbs)
]
]
}
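
The condition that decides whether `BISCUIT_ALIGN` receives `-b 1` can be traced with a small Python model. This is only a sketch of the boolean expression in the config above; the function name and the dictionary-based `params` are assumptions for illustration.

```python
def biscuit_align_args(params):
    """Mirror the ext.args condition: request directional alignment with '-b 1'."""
    directional = (
        params.get("pbat")
        or params.get("single_cell")
        or params.get("zymo")
        or not params.get("non_directional")
    )
    return "-b 1" if directional else ""
```

One consequence worth noting: as the expression is written, setting `pbat`, `single_cell`, or `zymo` yields `-b 1` even when `non_directional` is also set, so `--non_directional` only takes effect when none of those library types is selected.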

withName: BISCUIT_BLASTER {
ext.args = { [
// Use directional alignment (align read 1 to parent, read 2 to daughter).
// Note that `pbat` libraries are directional, but 'backwards' (read 1 aligns to daughter);
// this 'inversion' is handled in the `biscuit` subworkflow by reversing read 1 and read 2 if `pbat` is set.
( params.pbat || params.single_cell || params.zymo || (!params.non_directional) ) ? '-b 1' : '',
].join(' ').trim() }
ext.args2 = { [
meta.single_end ? "--ignoreUnmated" : ""
].join(' ').trim() }
ext.args3 = ''
publishDir = [
[
path: { "${params.outdir}/${params.aligner}/alignments/dup_marked" },
mode: params.publish_dir_mode,
pattern: "*.bam*",
enabled: (params.save_align_intermeds || params.skip_deduplication || params.rrbs)
]
]
}
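
For `BISCUIT_BLASTER`, two settings depend on run parameters: the duplicate-marking argument and whether the duplicate-marked BAMs are published. The Python sketch below models both. It assumes that `ext.args2` is passed to samblaster inside the module, which the config alone does not confirm; the function and key names are hypothetical.

```python
def blaster_settings(params, single_end):
    """Model the parameter-dependent parts of the BISCUIT_BLASTER config block."""
    return {
        # Assumed to be handed to samblaster: skip mate checks for single-end reads.
        "samblaster_args": "--ignoreUnmated" if single_end else "",
        # BAMs are published if intermediates are kept, or if this is the final BAM
        # because deduplication is skipped or the library is RRBS.
        "publish_bam": bool(
            params.get("save_align_intermeds")
            or params.get("skip_deduplication")
            or params.get("rrbs")
        ),
    }
```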

withName: BISCUIT_BSCONV {
ext.prefix = { "${meta.id}_bsconv" }
ext.args = { "-f ${params.bs_conv_filter}" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/alignments/bisulfite_conversion_filtered" },
mode: params.publish_dir_mode,
pattern: "*.bam",
enabled: params.save_align_intermeds
]
}

withName: BISCUIT_PILEUP {
errorStrategy = 'retry'
maxRetries = 3
ext.args = { [
params.no_overlap ? '' : '-d',
params.nomeseq ? '-N' : '',
params.ignore > 0 ? "-5 ${params.ignore}" : '',
params.ignore_3prime > 0 ? "-3 ${params.ignore_3prime}" : '',
].join(' ').trim() }
publishDir = [
path: { "${params.outdir}/${params.aligner}/snp_data/" },
mode: params.publish_dir_mode,
enabled: params.save_align_intermeds,
pattern: "*.vcf.gz"
]
}
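
The `BISCUIT_PILEUP` argument list can be read as the following mapping from pipeline parameters to `biscuit pileup` flags. This is a Python sketch of the closure above, not a replacement for it; the function name and the default of 0 for the trimming parameters are assumptions.

```python
def biscuit_pileup_args(params):
    """Translate unified pipeline parameters into the pileup flags used in the config."""
    args = []
    if not params.get("no_overlap"):
        args.append("-d")  # passed unless --no_overlap is set
    if params.get("nomeseq"):
        args.append("-N")  # NOMe-seq mode
    if params.get("ignore", 0) > 0:
        args.append(f"-5 {params['ignore']}")  # bases ignored at the 5' end
    if params.get("ignore_3prime", 0) > 0:
        args.append(f"-3 {params['ignore_3prime']}")  # bases ignored at the 3' end
    return " ".join(args)
```

Note that `ignore_r2` and `ignore_3prime_r2` have no counterpart in this block, so read 2 trimming values are not forwarded to the pileup step as configured.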

withName: BISCUIT_VCF2BED {
ext.args = { [
(params.meth_cutoff) ? "-k ${params.meth_cutoff}" : '-k 1',
(params.comprehensive) ? '-t c' : '',
(params.nomeseq) ? '-t hcg' : ''
].join(' ').trim() }
publishDir = [
path: { "${params.outdir}/${params.aligner}/methylation_calls/" },
mode: params.publish_dir_mode,
enabled: !params.merge_cg,
pattern: "*.bed.gz"
]
}

withName: BISCUIT_VCF2BED_NOME {
ext.args = { [
(params.meth_cutoff) ? "-k ${params.meth_cutoff}" : '-k 1',
(params.nomeseq) ? '-t gch' : ''
].join(' ').trim() }
publishDir = [
path: { "${params.outdir}/${params.aligner}/accessibility_data/" },
mode: params.publish_dir_mode,
enabled: !params.merge_cg,
pattern: "*.bed.gz"
]
}
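
The two `vcf2bed` blocks differ only in which `-t` type they request. A compact Python model of both argument lists follows, as a sketch; the function name and the `accessibility` switch are illustrative and stand in for the two separate process selectors.

```python
def vcf2bed_args(params, accessibility=False):
    """Model ext.args for BISCUIT_VCF2BED (default) or BISCUIT_VCF2BED_NOME."""
    cutoff = params.get("meth_cutoff")
    args = [f"-k {cutoff}" if cutoff else "-k 1"]  # minimum coverage, default 1
    if accessibility:
        if params.get("nomeseq"):
            args.append("-t gch")
    else:
        if params.get("comprehensive"):
            args.append("-t c")
        if params.get("nomeseq"):
            args.append("-t hcg")
    return " ".join(args)
```

With both `comprehensive` and `nomeseq` set, the methylation block produces `-k 1 -t c -t hcg`, that is, two `-t` options on one command line. Whether that combination is intended may be worth confirming during review.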

withName: BISCUIT_MERGECG {
ext.prefix = { "${meta.id}_mergecg" }
ext.args = ''
publishDir = [
path: { "${params.outdir}/${params.aligner}/methylation_calls/" },
mode: params.publish_dir_mode,
enabled: params.merge_cg,
pattern: "*.bed.gz"
]
}

withName: BISCUIT_MERGECG_NOME {
ext.prefix = { "${meta.id}_mergecg" }
ext.args = '-N'
publishDir = [
path: { "${params.outdir}/${params.aligner}/accessibility_data/" },
mode: params.publish_dir_mode,
enabled: params.merge_cg,
pattern: "*.bed.gz"
]
}
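
The `enabled` flags on the four BED-producing blocks mean that exactly one pair of processes publishes its output for a given `merge_cg` setting. A Python sketch of that routing is shown here; the table is read off the config above, while the function name is hypothetical. Whether the `_NOME` processes run at all presumably depends on `nomeseq`, which is handled outside this config.

```python
# (process name, output subdirectory, published when merge_cg is ...)
BED_PUBLISHERS = [
    ("BISCUIT_VCF2BED", "methylation_calls", False),
    ("BISCUIT_VCF2BED_NOME", "accessibility_data", False),
    ("BISCUIT_MERGECG", "methylation_calls", True),
    ("BISCUIT_MERGECG_NOME", "accessibility_data", True),
]


def published_bed_sources(merge_cg):
    """Return {subdirectory: process} for the processes whose *.bed.gz are published."""
    return {
        subdir: process
        for process, subdir, when_merge_cg in BED_PUBLISHERS
        if when_merge_cg == bool(merge_cg)
    }
```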

withName: BISCUIT_QC {
ext.args = ''
publishDir = [
[
path: { "${params.outdir}/${params.aligner}/qc/mbias/cpg" },
mode: params.publish_dir_mode,
pattern: "*_CpGRetentionByReadPos.txt"
],
[
path: { "${params.outdir}/${params.aligner}/qc/mbias/cph" },
mode: params.publish_dir_mode,
pattern: "*_CpHRetentionByReadPos.txt"
],
[
path: { "${params.outdir}/${params.aligner}/qc/dup_report/" },
mode: params.publish_dir_mode,
pattern: "*_dup_report.txt"
],
[
path: { "${params.outdir}/${params.aligner}/qc/insert_size/" },
mode: params.publish_dir_mode,
pattern: "*_isize_table.txt"
],
[
path: { "${params.outdir}/${params.aligner}/qc/mapping_qual/" },
mode: params.publish_dir_mode,
pattern: "*_mapq_table.txt"
],
[
path: { "${params.outdir}/${params.aligner}/qc/mapping_strand/" },
mode: params.publish_dir_mode,
pattern: "*_strand_table.txt"
],
[
path: { "${params.outdir}/${params.aligner}/qc/conversion/" },
mode: params.publish_dir_mode,
pattern: "*_totalReadConversionRate.txt"
],
]
}
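
The `BISCUIT_QC` block routes each report into its own subdirectory by filename suffix. The Python sketch below reproduces that routing so the resulting layout is easier to see. It assumes Nextflow's `pattern` behaves like a plain filename glob here; the function name and sample file names are illustrative.

```python
from fnmatch import fnmatchcase

# (glob pattern, subdirectory under <outdir>/<aligner>/), copied from the config.
QC_ROUTES = [
    ("*_CpGRetentionByReadPos.txt", "qc/mbias/cpg"),
    ("*_CpHRetentionByReadPos.txt", "qc/mbias/cph"),
    ("*_dup_report.txt", "qc/dup_report"),
    ("*_isize_table.txt", "qc/insert_size"),
    ("*_mapq_table.txt", "qc/mapping_qual"),
    ("*_strand_table.txt", "qc/mapping_strand"),
    ("*_totalReadConversionRate.txt", "qc/conversion"),
]


def qc_destination(filename, outdir, aligner="biscuit"):
    """Return the published path for a QC file, or None if no pattern matches."""
    for pattern, subdir in QC_ROUTES:
        if fnmatchcase(filename, pattern):
            return f"{outdir}/{aligner}/{subdir}/{filename}"
    return None
```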

withName: PICARD_MARKDUPLICATES {
ext.args = "--ASSUME_SORTED true --REMOVE_DUPLICATES false --VALIDATION_STRINGENCY LENIENT --PROGRAM_RECORD_ID 'null' --TMP_DIR tmp"
ext.prefix = { "${meta.id}.markdup.sorted" }
Expand Down Expand Up @@ -381,6 +557,18 @@ process {
]
}

withName: SAMTOOLS_INDEX_BSCONV {
ext.args = ""
publishDir = [
[
path: { "${params.outdir}/${params.aligner}/alignments/bisulfite_conversion_filtered" },
mode: params.publish_dir_mode,
pattern: "*.bai",
enabled: params.save_align_intermeds
]
]
}

withName: SAMTOOLS_INDEX_DEDUPLICATED {
ext.args = ""
publishDir = [
Expand All @@ -407,12 +595,17 @@ process {
}

withName: METHYLDACKEL_EXTRACT {
ext.args = [
params.comprehensive ? ' --CHG --CHH' : '',
params.ignore_flags ? " --ignoreFlags" : '',
params.methyl_kit ? " --methylKit" : '',
params.min_depth > 0 ? " --minDepth ${params.min_depth}" : ''
].join(" ").trim()
ext.args = {
def simple_args = [
params.comprehensive ? ' --CHG --CHH' : '',
params.ignore_flags ? " --ignoreFlags" : '',
params.methyl_kit ? " --methylKit" : '',
params.meth_cutoff ? " --minDepth ${params.meth_cutoff}" : '',
].join(" ")
def ignore_bases = [params.ignore, params.ignore_3prime, params.ignore_r2, params.ignore_3prime_r2].join(",")
def ignore_args = ["--nOT", "--nOB", "--nCTOT", "--nCTOB"].collect { "$it $ignore_bases" }.join(" ")
return [simple_args, ignore_args].join(" ").trim()
}
publishDir = [
[
path: { "${params.outdir}/methyldackel" },
Expand Down
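
The new `METHYLDACKEL_EXTRACT` closure builds one comma-separated trimming list and applies it to all four strand options. The Python model below shows the resulting argument string, as a sketch only. It assumes the four `ignore*` parameters default to 0, and it normalizes whitespace, whereas the Groovy version joins padded strings and may leave repeated spaces. The function name is hypothetical.

```python
def methyldackel_extract_args(params):
    """Model the argument string assembled for MethylDackel extract."""
    simple = []
    if params.get("comprehensive"):
        simple += ["--CHG", "--CHH"]
    if params.get("ignore_flags"):
        simple.append("--ignoreFlags")
    if params.get("methyl_kit"):
        simple.append("--methylKit")
    if params.get("meth_cutoff"):
        simple.append(f"--minDepth {params['meth_cutoff']}")
    # One bounds list, in the order used by the config:
    # ignore, ignore_3prime, ignore_r2, ignore_3prime_r2.
    bounds = ",".join(
        str(params.get(name, 0))
        for name in ("ignore", "ignore_3prime", "ignore_r2", "ignore_3prime_r2")
    )
    # The same bounds are applied to every strand option.
    trimming = [f"{flag} {bounds}" for flag in ("--nOT", "--nOB", "--nCTOT", "--nCTOB")]
    return " ".join(simple + trimming)
```

Because the trimming options are always emitted, a run with no `ignore*` values set still passes `--nOT 0,0,0,0` and its three siblings under this model.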
1 change: 1 addition & 0 deletions main.nf
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ nextflow.enable.dsl = 2
params.fasta = WorkflowMain.getGenomeAttribute(params, 'fasta')
params.fasta_index = WorkflowMain.getGenomeAttribute(params, 'fasta_index')
params.bismark_index = WorkflowMain.getGenomeAttribute(params, 'bismark')
params.biscuit_index = WorkflowMain.getGenomeAttribute(params, 'biscuit')
params.bwa_meth_index = WorkflowMain.getGenomeAttribute(params, 'bwa_meth')
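
The new `params.biscuit_index` line follows the same lookup pattern as the other index parameters. A rough Python model of that lookup is sketched here, assuming `getGenomeAttribute` reads `params.genomes[params.genome][attribute]` and yields nothing when the genome or attribute is missing. The PR does not show the helper's body, so this behaviour is an assumption, and the genome key and path in the usage note are made up.

```python
def get_genome_attribute(params, attribute):
    """Look up a per-genome attribute such as 'biscuit', returning None if absent."""
    genomes = params.get("genomes") or {}
    genome = params.get("genome")
    if genome in genomes:
        return genomes[genome].get(attribute)
    return None
```

With a hypothetical config of `{"genome": "GRCh38", "genomes": {"GRCh38": {"biscuit": "/refs/GRCh38/biscuit"}}}`, looking up `"biscuit"` returns the path, while looking up `"bismark"` returns `None`. In the latter case the pipeline would presumably need to build the index itself or be given one explicitly.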

/*