IndexError: string index out of range #21

Open
robinycfang opened this issue Dec 12, 2021 · 0 comments

Hi

I was running ConsensusCruncher to collapse UMIs. It seems to have produced some results:

├── sscs
│   ├── sscs.sorted.bam (.bai)                     
│   ├── singleton.sorted.bam (.bai)                
├── sscs_SC
│   ├── sscs.sc.sorted.bam (.bai)
├── dcs
│   ├── dcs.sorted.bam (.bai)
├── dcs_SC
│   ├── dcs.sc.sorted.bam(.bai)
│   ├── all.unique.dcs.sorted.bam(.bai)
├── read_families.txt                       Family size and frequency
├── stats.txt                               Consensus sequence formation metrics
├── tag_fam_size.png                        Distribution of reads across family size

However, when I checked the log, I found an error:

# === DCS ===
SSCS - Total reads: 26020276
SSCS - Unmapped reads: 0
SSCS - Secondary/Supplementary reads: 0
DCS reads: 89306
SSCS singletons: 25841664 

[bam_sort_core] merging from 6 files and 1 in-memory blocks...
Traceback (most recent call last):
  File "/ConsensusCruncher/singleton_correction.py", line 320, in <module>
    main()
  File "/ConsensusCruncher/singleton_correction.py", line 268, in main
    corrected_read = strand_correction(tag, duplex, query_name, singleton_dict)
  File "/ConsensusCruncher/singleton_correction.py", line 101, in strand_correction
    dcs = duplex_consensus(read, complement_read)
  File "/ConsensusCruncher/singleton_correction.py", line 71, in duplex_consensus
    if read1.query_sequence[i] == read2.query_sequence[i] and \
IndexError: string index out of range
[bam_sort_core] merging from 6 files and 1 in-memory blocks...
# === DCS - Singleton Correction ===
SSCS SC - Total reads: 26023245
SSCS SC - Unmapped reads: 0

It looks like DCS was not performed properly? For my experiments, I might only need sscs.sc.sorted.bam. Are these final BAM files still safe to use? These are the commands I ran. Thanks!
python3 ConsensusCruncher.py fastq2bam --fastq1 sample_R1.fastq --fastq2 sample_R2.fastq -o out_dir -b bwa -g /picard/2.10.9/picard.jar -r hg38.fasta -s samtools -l umilist.txt

python3 ConsensusCruncher.py consensus -i sample.sorted.bam -o out_dir -s samtools -b cytoBand.txt -g hg38 --cleanup True --scorrect True
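
From the traceback, the IndexError is raised at the comparison `read1.query_sequence[i] == read2.query_sequence[i]`, which is what I would expect if the two reads being compared have different lengths. Below is a minimal sketch of that failure mode, using plain strings in place of the reads. The function names are hypothetical and this is not the actual singleton_correction.py code, only my guess at the pattern involved, plus one possible length guard.

```python
def duplex_consensus_sketch(seq1, seq2):
    """Hypothetical stand-in: compare two sequences position by position,
    indexing both by the length of the first one."""
    consensus = []
    for i in range(len(seq1)):
        # Raises IndexError if seq2 is shorter than seq1.
        if seq1[i] == seq2[i]:
            consensus.append(seq1[i])
        else:
            consensus.append("N")
    return "".join(consensus)


def guarded_duplex_consensus_sketch(seq1, seq2):
    """Hypothetical variant: skip pairs whose lengths differ
    instead of indexing past the end of the shorter one."""
    if len(seq1) != len(seq2):
        return None
    return "".join(a if a == b else "N" for a, b in zip(seq1, seq2))


if __name__ == "__main__":
    try:
        duplex_consensus_sketch("ACGT", "ACG")
    except IndexError as err:
        print("unguarded:", err)
    print("guarded:", guarded_duplex_consensus_sketch("ACGT", "ACG"))
```

If this is what is happening, reads of unequal length (for example after trimming) would trigger the error in my data. Whether skipping such pairs is the right behaviour is something the maintainers would need to confirm.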
