Add BISCUIT #186

Closed
wants to merge 64 commits into from
64 commits
c003c6e
Added aligner-biscuit, with all relevant steps and parameters and upd…
ekushele Mar 25, 2020
4323546
workflow files
ekushele Mar 25, 2020
6717545
change docs: output.md and usage.md. Update CHANGELOG.md accordingly.…
ekushele Mar 25, 2020
add54f3
change environment.yml and Dockerfile to include biscuit as bioconda …
ekushele Apr 21, 2020
a5d4f4a
Resolved merge conflict by incorporating both suggestions-mine and me…
ekushele Apr 21, 2020
42c3f95
passed lint test
ekushele Apr 21, 2020
ff55790
Update docs/usage.md
ekushele Apr 22, 2020
ed1203f
Update docs/usage.md
ekushele Apr 22, 2020
81946ea
Update CHANGELOG.md
ekushele Apr 22, 2020
2b5d492
Update docs/usage.md
ekushele Apr 22, 2020
0c0d93d
change trim_galore process to newer version
ekushele Apr 22, 2020
10c0f9b
Merge https://github.com/ekushele/methylseq into dev
ekushele Apr 22, 2020
7b71df0
changed bismark_methXtract to newer version
ekushele Apr 22, 2020
8aee069
remove ch_try
ekushele Apr 22, 2020
e0e3a09
Update CHANGELOG.md
ekushele Apr 22, 2020
c8775cf
Update docs/usage.md
ekushele Apr 22, 2020
2d691dc
Update docs/output.md
ekushele Apr 22, 2020
46b8b7c
Update docs/output.md
ekushele Apr 22, 2020
f9f799e
set parameters to false
ekushele Apr 22, 2020
2c6b12a
Merge https://github.com/ekushele/methylseq into dev
ekushele Apr 22, 2020
b7d3e8f
remove SNP step, until biscuit finish the progress with it
ekushele Apr 22, 2020
cb95f33
changed web address for soloWCGW file
ekushele Apr 22, 2020
8472d46
change soloWCGW to solo WCGW-in common PMDs
ekushele Apr 22, 2020
532e997
remove soloWCGW step
ekushele Apr 23, 2020
1f4dfd5
remove swift
ekushele Apr 28, 2020
ac62a93
added epiread convertion with paired-end merging and get SNP files an…
ekushele Feb 8, 2021
a76e7d5
fix some problems with main.nf, get cpg from assets dir, add common_d…
ekushele Feb 8, 2021
f1e5597
change environment file, fix problems in scrape_software_versions.py …
ekushele Feb 10, 2021
f972b35
make main to be like in 1f4dfd5 commit (with sorted bam for preseq an…
ekushele Feb 10, 2021
e6c9083
makes ch_splicesites_for_bismark_hisat_align to be like https://githu…
ekushele Feb 11, 2021
842ae99
improve README.md for lint test
ekushele Feb 11, 2021
9aae954
improve output.md for lint test
ekushele Feb 11, 2021
50ceb39
improve output.md for lint test
ekushele Feb 11, 2021
db60919
pushing last commit before PR
ekushele Feb 11, 2021
41d48d5
change knwwn splice main
ekushele Feb 11, 2021
48d2716
added option blacklist
ekushele Feb 14, 2021
962995e
update software and CHANGELOG.md for current updates, few changes on …
ekushele Feb 15, 2021
6efc3d6
update CHANGELOG.md
ekushele Feb 15, 2021
37f01fc
update CHANGELOG.md
ekushele Feb 15, 2021
2f9aa2b
clear markdown lint problems
ekushele Feb 15, 2021
022eaff
change environment.yml and CHANELOG.md to pass nf-core lint
ekushele Feb 15, 2021
68bbb5b
fix nf-core CI: '\--epiread to \'--epiread
ekushele Feb 15, 2021
3a4fc81
add biscuit check to epiread if
ekushele Feb 15, 2021
bd2615a
add biscuit before check to epiread if
ekushele Feb 15, 2021
f2f0cd1
add try and catch in epiread_pairedEnd_convertion
ekushele Feb 19, 2021
9031da4
remove tailing spaces
ekushele Feb 20, 2021
020b0e1
remove software updates from CHANGELOG.md
ekushele Feb 20, 2021
b0735e4
Merge branch 'dev' into PR_branch
ewels Feb 22, 2021
63f9202
Update CHANGELOG.md
ewels Feb 22, 2021
310e45a
Update CHANGELOG.md
ewels Feb 22, 2021
b9cd416
change bin/epiread_pairedEnd_convertion to bin/epiread_pairedEnd_conv…
ekushele Feb 22, 2021
b3a9729
Merge branch 'PR_branch' of https://github.com/ekushele/methylseq int…
ekushele Feb 22, 2021
4a7415f
remove cleanup=true from base.config
ekushele Feb 22, 2021
72a0dae
processes names as snake_case
ekushele Feb 22, 2021
9329e42
change processes name to snake_case
ekushele Feb 22, 2021
16b78d9
change epiread_convertion to epiread_conversion in process name
ekushele Feb 22, 2021
e980520
add linebreaks and indent in markDuplicates_samblaster
ekushele Feb 22, 2021
439ccd8
change debug_epiread to be in saveAs option
ekushele Feb 22, 2021
7ae0ffa
remove save_pileup_file, replaced with save_align_intermidiates
ekushele Feb 22, 2021
724d36c
Fix merge conflicts
ewels Mar 30, 2021
6c37778
Minor whitespace formatting & cleanup
ewels Mar 30, 2021
4ab3f8a
make 'common' group in nextflow_schema.json, remove nondirectional_li…
ekushele Apr 4, 2021
26de2f1
use getSimpleName instead of assembly_name
ekushele Apr 4, 2021
d3bff3a
add optional to output section for epiread original file
ekushele Jun 13, 2021
4 changes: 3 additions & 1 deletion .github/workflows/ci.yml
@@ -21,14 +21,16 @@ jobs:
       matrix:
         # Nextflow versions: check pipeline minimum and current latest
         nxf_ver: ['20.07.1', '21.03.0-edge']
-        aligner: ['bismark', 'bismark_hisat', 'bwameth']
+        aligner: ['bismark', 'bismark_hisat', 'bwameth', 'biscuit']
         include:
           - aligner: 'bismark'
             ref_index: --bismark_index results/reference_genome/BismarkIndex/
           - aligner: 'bismark_hisat'
             ref_index: --bismark_index results/reference_genome/BismarkIndex/
           - aligner: 'bwameth'
             ref_index: --bwa_meth_index results/reference_genome/genome.fa
+          - aligner: 'biscuit'
+            ref_index: --bwa_biscuit_index results/reference_genome/genome.fa
     steps:
       - name: Check out pipeline code
         uses: actions/checkout@v2
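
As a rough illustration of what this matrix change means for CI, the sketch below expands the matrix in Python in the way a GitHub Actions matrix with `include` is generally expanded: every `nxf_ver` and `aligner` pair becomes one job, and each `include` entry whose `aligner` matches adds its `ref_index`. The names `expand_matrix` and `MATRIX_KEYS` are illustrative and do not appear in the PR; the real expansion is performed by GitHub Actions, not by pipeline code.

```python
from itertools import product

# Values copied from the workflow hunk above.
NXF_VERSIONS = ["20.07.1", "21.03.0-edge"]
ALIGNERS = ["bismark", "bismark_hisat", "bwameth", "biscuit"]
INCLUDE = [
    {"aligner": "bismark", "ref_index": "--bismark_index results/reference_genome/BismarkIndex/"},
    {"aligner": "bismark_hisat", "ref_index": "--bismark_index results/reference_genome/BismarkIndex/"},
    {"aligner": "bwameth", "ref_index": "--bwa_meth_index results/reference_genome/genome.fa"},
    {"aligner": "biscuit", "ref_index": "--bwa_biscuit_index results/reference_genome/genome.fa"},
]

# Hypothetical constant: the keys that define the base matrix.
MATRIX_KEYS = ("nxf_ver", "aligner")


def expand_matrix(nxf_versions, aligners, include):
    """Return one job dict per (nxf_ver, aligner) pair, merged with matching include entries."""
    jobs = []
    for nxf_ver, aligner in product(nxf_versions, aligners):
        job = {"nxf_ver": nxf_ver, "aligner": aligner}
        for extra in include:
            # An include entry applies when its matrix keys agree with the job.
            if all(job[key] == value for key, value in extra.items() if key in MATRIX_KEYS):
                job.update(extra)
        jobs.append(job)
    return jobs


if __name__ == "__main__":
    for job in expand_matrix(NXF_VERSIONS, ALIGNERS, INCLUDE):
        print(job["nxf_ver"], job["aligner"], job["ref_index"])
```

Under these assumptions the change takes the matrix from six jobs to eight, with the two new `biscuit` jobs receiving the `--bwa_biscuit_index` argument.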
15 changes: 14 additions & 1 deletion CHANGELOG.md
@@ -2,7 +2,20 @@

 ## v1.7dev

-_..nothing yet.._
+### Pipeline Updates
+
+* Added Picard CollectInsertSizeMetrics and Picard CollectGcBiasMetrics
+* Improved Qualimap and preseq by adding `samtools sort` and `samtools index` steps in the Bismark aligner
Member

I have a vague memory of this being a bad idea for some reason, that Bismark alignments shouldn't be sorted. But it's from years ago and I can't remember the reasoning now..

@FelixKrueger do you know what I'm talking about?

Author

Yes, bismark_deduplication needs an unsorted BAM, so it is still given the unsorted file.
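
The routing described in this thread can be sketched as follows: the unsorted BAM goes to deduplication unchanged, while a sorted and indexed copy is prepared for the QC tools named in the changelog entry (Qualimap and preseq). This is only a Python illustration of the idea, not the Nextflow code in the PR; `plan_bismark_bam_routing` and the `.sorted.bam` naming are hypothetical, and the commands are built as lists rather than executed.

```python
from pathlib import Path


def plan_bismark_bam_routing(unsorted_bam):
    """Describe which BAM each downstream step would receive (hypothetical helper)."""
    bam = Path(unsorted_bam)
    # Assumed naming convention for the sorted copy, for illustration only.
    sorted_bam = bam.with_suffix(".sorted.bam")
    return {
        # Deduplication keeps the aligner's original, unsorted output.
        "dedup_input": str(bam),
        # Coordinate-sort a copy and index it for the QC steps.
        "sort_cmd": ["samtools", "sort", "-o", str(sorted_bam), str(bam)],
        "index_cmd": ["samtools", "index", str(sorted_bam)],
        "qc_input": str(sorted_bam),
    }


if __name__ == "__main__":
    plan = plan_bismark_bam_routing("sample.bam")
    for step, value in plan.items():
        print(step, value)
```

Keeping the two paths separate means the sort step never touches the file that deduplication reads.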

+* Added BISCUIT aligner as an optional aligner, with all relevant steps (alignment, duplicate marking with [samblaster](https://github.com/GregoryFaust/samblaster), methylation extraction, QC for biscuit, and optional [Epi-read](https://huishenlab.github.io/biscuit/epiread_format/) file creation with SNP information).
+
+### New software
+
+* samblaster `0.1.26`
+* bedtools `2.30.0`
+* biscuit `0.3.16.20200420`
+* bcftools `1.10`
+* parallel `20201122`
+* gawk `5.1.0`

 ## [v1.6](https://github.com/nf-core/methylseq/releases/tag/1.6) - 2021-03-26

31 changes: 16 additions & 15 deletions README.md
@@ -18,21 +18,22 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool
 ## Pipeline Summary

 The pipeline allows you to choose between running either [Bismark](https://github.com/FelixKrueger/Bismark) or [bwa-meth](https://github.com/brentp/bwa-meth) / [MethylDackel](https://github.com/dpryan79/methyldackel).
-Choose between workflows by using `--aligner bismark` (default, uses bowtie2 for alignment), `--aligner bismark_hisat` or `--aligner bwameth`.
-
-| Step                                         | Bismark workflow | bwa-meth workflow     |
-|----------------------------------------------|------------------|-----------------------|
-| Generate Reference Genome Index _(optional)_ | Bismark          | bwa-meth              |
-| Raw data QC                                  | FastQC           | FastQC                |
-| Adapter sequence trimming                    | Trim Galore!     | Trim Galore!          |
-| Align Reads                                  | Bismark          | bwa-meth              |
-| Deduplicate Alignments                       | Bismark          | Picard MarkDuplicates |
-| Extract methylation calls                    | Bismark          | MethylDackel          |
-| Sample report                                | Bismark          | -                     |
-| Summary Report                               | Bismark          | -                     |
-| Alignment QC                                 | Qualimap         | Qualimap              |
-| Sample complexity                            | Preseq           | Preseq                |
-| Project Report                               | MultiQC          | MultiQC               |
+Choose between workflows by using `--aligner bismark` (default, uses bowtie2 for alignment), `--aligner bismark_hisat`, `--aligner bwameth` or `--aligner biscuit`.
+
+| Step                                         | Bismark workflow | bwa-meth workflow     | biscuit workflow  |
+|----------------------------------------------|------------------|-----------------------|-------------------|
+| Generate Reference Genome Index _(optional)_ | Bismark          | bwa-meth              | biscuit           |
+| Raw data QC                                  | FastQC           | FastQC                | FastQC            |
+| Adapter sequence trimming                    | Trim Galore!     | Trim Galore!          | Trim Galore!      |
+| Align Reads                                  | Bismark          | bwa-meth              | biscuit           |
+| Deduplicate Alignments                       | Bismark          | Picard MarkDuplicates | samblaster        |
+| Extract methylation calls                    | Bismark          | MethylDackel          | biscuit           |
+| Sample Report                                | Bismark          | -                     | biscuit QC        |
+| Summary Report                               | Bismark          | -                     | -                 |
+| Picard Metrics                               | Picard           | Picard                | Picard            |
+| Alignment QC                                 | Qualimap         | Qualimap              | Qualimap          |
+| Sample complexity                            | Preseq           | Preseq                | Preseq            |
+| Project Report                               | MultiQC          | MultiQC               | MultiQC           |

 ## Quick Start
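
For readers comparing the workflows, the three tool-specific rows of the updated README table can be written as a small lookup. This is a Python sketch for illustration only; `WORKFLOW_TOOLS`, `ALIGNER_TO_WORKFLOW` and `tools_for` are hypothetical names that do not exist in the pipeline, and mapping `bismark_hisat` onto the "Bismark workflow" column is an assumption based on the README text.

```python
# Tool used for each step, copied from the updated README table.
WORKFLOW_TOOLS = {
    "bismark": {
        "align": "Bismark",
        "deduplicate": "Bismark",
        "extract_methylation": "Bismark",
    },
    "bwameth": {
        "align": "bwa-meth",
        "deduplicate": "Picard MarkDuplicates",
        "extract_methylation": "MethylDackel",
    },
    "biscuit": {
        "align": "biscuit",
        "deduplicate": "samblaster",
        "extract_methylation": "biscuit",
    },
}

# Assumption: `bismark_hisat` follows the "Bismark workflow" column.
ALIGNER_TO_WORKFLOW = {
    "bismark": "bismark",
    "bismark_hisat": "bismark",
    "bwameth": "bwameth",
    "biscuit": "biscuit",
}


def tools_for(aligner):
    """Return the step-to-tool mapping for an `--aligner` value (hypothetical helper)."""
    try:
        workflow = ALIGNER_TO_WORKFLOW[aligner]
    except KeyError:
        raise ValueError(f"Unknown aligner: {aligner!r}") from None
    return WORKFLOW_TOOLS[workflow]


if __name__ == "__main__":
    print(tools_for("biscuit")["deduplicate"])
```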

11 changes: 11 additions & 0 deletions assets/common_dbsnp.hdr
@@ -0,0 +1,11 @@
+##INFO=<ID=TYPE,Number=1,Type=String,Description="Variation class/type snv">
+##INFO=<ID=COMMON_SOME,Number=1,Type=Integer,Description="1 percent common in some of the population sources in UCSC">
+##INFO=<ID=COMMON_ALL,Number=1,Type=Integer,Description="1 percent common in all of the available population sources in UCSC">
+##INFO=<ID=REF_MIN,Number=.,Type=Character,Description="ref genotype, less ambiguous from dbsnp or other sources">
+##INFO=<ID=ALT_MIN,Number=.,Type=Character,Description="alt genotype, less ambiguous from dbsnp or other sources">
+##INFO=<ID=REF_DBSNP,Number=.,Type=Character,Description="ref genotype, dbSNP">
+##INFO=<ID=ALT_DBSNP,Number=.,Type=Character,Description="alt genotype, dbSNP">
+##INFO=<ID=REF_ALL,Number=.,Type=Character,Description="ref genotype, other sources">
+##INFO=<ID=ALT_ALL,Number=.,Type=Character,Description="alt genotype, other sources">
+##INFO=<ID=MAX_MAF,Number=.,Type=Float,Description="maximum minor allele frequency among all sources">
+##INFO=<ID=RSID,Number=1,Type=String,Description="dbsnp ID">
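
Each line of this header file follows the VCF meta-information layout for INFO fields (`ID`, `Number`, `Type`, `Description`). To make that structure concrete, here is a minimal Python sketch that parses such lines into a dictionary. It is not part of the PR; `INFO_LINE` and `parse_info_header` are hypothetical names, and the pattern assumes the four keys appear in exactly the order used in this file.

```python
import re

# Assumes the key order used in common_dbsnp.hdr: ID, Number, Type, Description.
INFO_LINE = re.compile(
    r'^##INFO=<ID=(?P<id>[^,]+),Number=(?P<number>[^,]+),'
    r'Type=(?P<type>[^,]+),Description="(?P<description>.*)">$'
)


def parse_info_header(lines):
    """Map each INFO ID to its Number, Type and Description; skip non-matching lines."""
    fields = {}
    for line in lines:
        match = INFO_LINE.match(line.strip())
        if match:
            fields[match["id"]] = {
                "number": match["number"],
                "type": match["type"],
                "description": match["description"],
            }
    return fields


if __name__ == "__main__":
    header = [
        '##INFO=<ID=REF_DBSNP,Number=.,Type=Character,Description="ref genotype, dbSNP">',
        '##INFO=<ID=MAX_MAF,Number=.,Type=Float,Description="maximum minor allele frequency among all sources">',
        '##INFO=<ID=RSID,Number=1,Type=String,Description="dbsnp ID">',
    ]
    for field_id, spec in parse_info_header(header).items():
        print(field_id, spec["number"], spec["type"])
```

A `Number` of `.` means the count of values is not fixed, which is why the allele and frequency fields use it while `RSID` and `TYPE` use `1`.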
Binary file added bin/epiread_pairedEnd_conversion
Binary file not shown.