Rework main workflow to tolerate larger datasets #70

Open
wants to merge 36 commits into
base: dev
36 commits
e0cdf4d
drop in replace cat_cat with find_concatenate
BioWilko Jan 27, 2025
ed826a0
Fix modules conf
BioWilko Jan 27, 2025
148fbfd
[automated] Fix code linting
nf-core-bot Jan 27, 2025
2da5527
replace gunzip with find/unpigz
BioWilko Jan 29, 2025
b072ad6
No taxid duplication check
BioWilko Jan 29, 2025
7096fdb
Do not require unique tax ids
BioWilko Jan 30, 2025
acb8910
Remove taxid uniquness again, for the final time
BioWilko Jan 30, 2025
5bd7ae6
Fix it
BioWilko Jan 30, 2025
d65039e
collect batches
BioWilko Jan 30, 2025
5de809b
grouptuple before find_cat
BioWilko Jan 30, 2025
b4780be
Remap flattened fastas to metadata with an inner join
BioWilko Jan 30, 2025
0b45f7a
remove taxid uniqueness param entirely
BioWilko Jan 30, 2025
21fbf81
Merge branch 'dev' into find_concatenate_patch
jfy133 Jan 30, 2025
9279716
Merge branch 'find_concatenate_patch' of github.com:nf-core/createtax…
jfy133 Jan 30, 2025
2b53f99
Fixlinting
jfy133 Jan 30, 2025
b660f27
Update find/unpigz remove hack
BioWilko Jan 31, 2025
c0168f0
enforce id uniqueness
BioWilko Jan 31, 2025
02d747f
remove local find unzip module
BioWilko Jan 31, 2025
c033e6b
unhide unzip batch size
BioWilko Jan 31, 2025
2542775
Merge branch 'dev' into find_concatenate_patch
jfy133 Jan 31, 2025
0208693
[automated] Fix code linting
nf-core-bot Jan 31, 2025
4675773
update find/unpigz
BioWilko Feb 3, 2025
daa0820
Patch workflow join logic
BioWilko Feb 3, 2025
24ba8c6
Filter out rows without dna fastas for matching
BioWilko Feb 6, 2025
4a604df
Fix AA batching logic
BioWilko Feb 6, 2025
e3a55b2
Join unbatched aa fastas with their metadata for kaiju
BioWilko Feb 6, 2025
c3a8626
First subworkflow test
BioWilko Feb 6, 2025
ce87fdb
Fix preprocessing subworkflow
BioWilko Feb 10, 2025
aa0a10f
input ungrouped fasta refs to ganon
BioWilko Feb 10, 2025
34c619b
Initialise outputs as empty channels
BioWilko Feb 10, 2025
6fcd794
ganon wants grouped DNA fastas
BioWilko Feb 10, 2025
79cabe6
Fix kaiju outs
BioWilko Feb 10, 2025
718d178
Reduce unzip batch size for test profile
BioWilko Feb 28, 2025
faf9b9a
test unzip batch size to 1
BioWilko Mar 1, 2025
ed8fce9
Merge branch 'dev' into find_concatenate_patch
BioWilko Mar 7, 2025
96ce159
remove commented includes
BioWilko Mar 7, 2025
Fixlinting
jfy133 committed Jan 30, 2025
commit 2b53f99ec438ba563e898471e580e0f9d3a5b7f2
31 changes: 8 additions & 23 deletions assets/schema_input.json
@@ -12,16 +12,12 @@
                 "pattern": "^\\S+$",
                 "uniqueItems": true,
                 "errorMessage": "Sequence reference name must be provided and cannot contain spaces",
-                "meta": [
-                    "id"
-                ]
+                "meta": ["id"]
             },
             "taxid": {
                 "type": "integer",
                 "errorMessage": "Please provide a valid taxonomic ID in integer format",
-                "meta": [
-                    "taxid"
-                ]
+                "meta": ["taxid"]
             },
             "fasta_dna": {
                 "type": "string",
@@ -38,33 +34,22 @@
                 "errorMessage": "FASTA file for amino acid reference sequence cannot contain spaces and must have a valid FASTA extension (fasta, faa, fa, fas), optionally gzipped"
             }
         },
-        "required": [
-            "id",
-            "taxid"
-        ],
+        "required": ["id", "taxid"],
         "anyOf": [
             {
-                "required": [
-                    "fasta_dna"
-                ]
+                "required": ["fasta_dna"]
             },
             {
-                "required": [
-                    "fasta_aa"
-                ]
+                "required": ["fasta_aa"]
             }
         ]
     },
     "allOf": [
         {
-            "uniqueEntries": [
-                "fasta_dna"
-            ]
+            "uniqueEntries": ["fasta_dna"]
         },
         {
-            "uniqueEntries": [
-                "fasta_aa"
-            ]
+            "uniqueEntries": ["fasta_aa"]
         }
     ]
-}
+}
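
For reviewers who want to sanity-check what the schema in this diff accepts, here is a rough, standard-library-only Python sketch of the row-level rules as I read them: `id` and `taxid` are required, `id` must not contain whitespace, at least one of `fasta_dna` / `fasta_aa` must be present (the `anyOf` block), and no two rows may share the same `fasta_dna` or the same `fasta_aa` value (the `uniqueEntries` blocks). The function name `validate_rows` and the error wording are hypothetical, treating an empty cell as "absent" is an assumption, and this is not how the pipeline itself performs validation. The FASTA extension patterns are deliberately left out because they are not visible in this diff.

```python
import re


def validate_rows(rows):
    """Return a list of error strings for samplesheet rows given as dicts.

    Hypothetical helper: mirrors only the rules visible in the diff above.
    """
    errors = []
    # Values already seen per column, used for the uniqueEntries-style check.
    seen = {"fasta_dna": set(), "fasta_aa": set()}

    for number, row in enumerate(rows, start=1):
        # "required": ["id", "taxid"] plus the "^\S+$" pattern on id.
        ident = row.get("id")
        if not isinstance(ident, str) or re.fullmatch(r"\S+", ident) is None:
            errors.append(f"row {number}: id must be provided and cannot contain spaces")

        # "type": "integer" (bool is excluded because it subclasses int in Python).
        taxid = row.get("taxid")
        if not isinstance(taxid, int) or isinstance(taxid, bool):
            errors.append(f"row {number}: taxid must be an integer")

        # "anyOf": at least one FASTA column must be filled in.
        # Assumption: an empty or missing cell counts as absent.
        present = [column for column in seen if row.get(column)]
        if not present:
            errors.append(f"row {number}: provide fasta_dna and/or fasta_aa")

        # "uniqueEntries": a FASTA path may appear only once per column.
        for column in present:
            value = row[column]
            if value in seen[column]:
                errors.append(f"row {number}: duplicate {column} entry {value}")
            seen[column].add(value)

    return errors


if __name__ == "__main__":
    example = [
        {"id": "ref1", "taxid": 562, "fasta_dna": "ref1.fasta"},
        {"id": "ref 2", "taxid": 9606, "fasta_aa": "ref2.faa"},
        {"id": "ref3", "taxid": 10090},
        {"id": "ref4", "taxid": 562, "fasta_dna": "ref1.fasta"},
    ]
    for message in validate_rows(example):
        print(message)
```

Note that two rows may share a `taxid` in this sketch (rows 1 and 4 above only fail on the repeated FASTA path), which matches the schema shown here: it places no uniqueness constraint on `taxid`.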