Overhaul readme
snystrom committed Oct 18, 2021
1 parent 4c6017c commit 44f1f6f
Showing 1 changed file with 123 additions and 18 deletions.
141 changes: 123 additions & 18 deletions README.md
@@ -46,33 +46,138 @@ vplot --version

## Usage

Note that in the returned matrix, the columns are genomic coordinate (leftmost
column is 5' end), and rows are fragment size (top row is *largest* fragment
size, this can be swapped using `-i`)
`vplot` takes aligned reads as an indexed bam file and a bed file of genomic
coordinates as input, and returns a CSV-format matrix of fragment size plotted
against genomic coordinate. By default, `vplot` aggregates all regions in the
bed file (one region per line) into a single matrix. Set the `--multi` flag to
write a separate matrix for each region instead.

In the returned matrix, columns are genomic coordinates (the leftmost column is
the 5' end) and rows are fragment sizes (the top row is the *largest* fragment size).
In keeping with the spirit of the original Henikoff V-plot, larger fragments are
returned at the top of the matrix, while small fragments are plotted along the
bottom. This can be awkward for indexing later, because row 1 corresponds to
the max fragment size. To make row-wise positional indexing easier, set
`--invert` to return smaller fragments at the top of the matrix
(i.e. row 1 = 1bp fragment size).
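
To make the row orientation concrete, here is a minimal Python sketch of the row-to-size mapping described above. The function name `fragment_size_for_row` is hypothetical, and the sketch assumes the matrix has exactly `--max-size` rows (one per fragment size from 1bp up to the maximum), which the help text does not state explicitly.

```python
def fragment_size_for_row(row, max_size=700, inverted=False):
    """Map a 1-based matrix row to the fragment size (in bp) it represents.

    Assumption: the matrix has `max_size` rows, one per fragment size.
    """
    if not 1 <= row <= max_size:
        raise ValueError(f"row must be between 1 and {max_size}")
    if inverted:
        # with --invert, row 1 is the smallest (1bp) fragment
        return row
    # default orientation: row 1 is the largest fragment
    return max_size - row + 1
```

Under these assumptions, row 1 maps to 700bp by default and to 1bp when the matrix was written with `--invert`.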

Finally, `--fragment-type` (`-f`) controls how reads are summarized in the
matrix. Setting `-f midpoint` (default) only adds signal at the midpoint of a
fragment. Setting `-f ends` adds signal at the start and end positions of each
read. Setting `-f fragment` adds signal along the full length of the read
fragment. Typically, `midpoint` is what you want, but plotting whole fragments
or fragment ends could be useful for different assays and data visualizations.
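
The three modes can be sketched in Python as follows. This is an illustration only, not `vplot`'s implementation: the function `add_fragment` is hypothetical, and it assumes half-open `[start, end)` fragment coordinates, a floor-rounded midpoint, and the default (non-inverted) row orientation.

```python
def add_fragment(matrix, frag_start, frag_end, region_start, mode="midpoint"):
    """Add one fragment's signal to `matrix` (a list of rows of counts).

    Assumptions: rows are fragment sizes with the largest on top, columns are
    positions relative to `region_start`, and fragments are half-open intervals.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    size = frag_end - frag_start
    if not 1 <= size <= n_rows:
        return  # fragment is larger than the maximum size: ignored
    row = n_rows - size  # 0-based index; the top row holds the largest size
    if mode == "midpoint":
        positions = [frag_start + size // 2]
    elif mode == "ends":
        positions = [frag_start, frag_end - 1]
    elif mode == "fragment":
        positions = range(frag_start, frag_end)
    else:
        raise ValueError(f"unknown mode: {mode}")
    for pos in positions:
        col = pos - region_start
        if 0 <= col < n_cols:  # skip signal that falls outside the region
            matrix[row][col] += 1
```

For a single fragment, `midpoint` touches one cell, `ends` touches two, and `fragment` fills every column the fragment covers, all within the row for that fragment's size.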

Like examples? Here you go:

``` sh
vplot 0.3.3

# regions must be equal width
$ cat regions.bed
chr2L 100 200 .
chr3R 5000 5100 .

# remember to index your bam file
$ samtools index reads.bam

# default behavior aggregates reads & prints to stdout
$ vplot reads.bam regions.bed > vplot_matrix.csv

# make vplots for each region separately instead with --multi
# this writes a file per region
$ vplot --multi reads.bam regions.bed
# returns:
chr2L-100-200.csv
chr3R-5000-5100.csv

# set a custom file prefix for multi-output files:
$ vplot --multi -o myPrefix_ reads.bam regions.bed
# returns:
myPrefix_chr2L-100-200.csv
myPrefix_chr3R-5000-5100.csv
```
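
Because regions must be equal width, it can be handy to check a bed file before running `vplot`. The following Python sketch is one way to do that; the helper `region_widths` is hypothetical and assumes whitespace-delimited bed lines with `chr`, `start`, `end` in the first three columns.

```python
def region_widths(bed_lines):
    """Return the set of region widths (end - start) found in bed-format lines."""
    widths = set()
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        _chrom, start, end = line.split()[:3]
        widths.add(int(end) - int(start))
    return widths


if __name__ == "__main__":
    import sys

    # usage: python check_widths.py regions.bed
    with open(sys.argv[1]) as bed:
        found = region_widths(bed)
    if len(found) > 1:
        sys.exit(f"regions have differing widths: {sorted(found)}")
```

A result containing more than one width means the regions are not all the same size.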

## Full Help Text

``` sh

vplot 0.3.4

USAGE:
vplot [FLAGS] [OPTIONS] <bam> <regions>

FLAGS:
-h, --help Prints help information
-i, --invert Invert the matrix so that the smallest fragments appear at the top
-m, --multi Instead of aggregating reads into 1 matrix, write 1 matrix for each region. Matrices are written as
1 csv per region named: `chr-start-end.csv`
-V, --version Prints version information
-h, --help
Prints help information

-i, --invert
Invert the matrix so that the smallest fragments appear at the top

-m, --multi
Instead of aggregating reads into 1 matrix, write 1 matrix for each region. Matrices are written as 1 csv
per region named: `chr-start-end.csv`
-V, --version
Prints version information


OPTIONS:
-f, --fragment-type <fragment-type> How reads are counted in the matrix. Using either the midpoint of the
fragment, fragment ends, or the whole fragment [default: midpoint] [possible
values: midpoint, ends, fragment]
-x, --max-size <max-fragment-size> Maximum fragment size to include in the V-plot matrix [default: 700]
-o, --output <output> Set output file name or output directory. This option behaves differently
depending on which input flags are set. See --help for details [default: -]
-f, --fragment-type <fragment-type>
How reads are counted in the matrix. Using either the midpoint of the fragment, fragment ends, or the whole
fragment [default: midpoint] [possible values: midpoint, ends, fragment]
-x, --max-size <max-fragment-size>
Maximum fragment size to include in the V-plot matrix [default: 700]

-o, --output <output>
Set output file name or output directory. This option behaves differently depending on which input flags are
set. See --help for details.

If --multi is unset and -o is set to a directory, the output file will be written to:
outdir/<bamfile>.vmatrix.csv. if --multi is unset and -o is a file path, output file will be written to this
file name. if --multi is set and -o is a directory, files will be written to outdir as: outdir/chr-start-
end.csv. if --multi is set and -o is a string, the string will be used as a prefix, and
files will be written as: <prefix>chr-start-end.csv.

Examples:

vplot reads.bam regions.bed > output.csv

vplot -o outdir/ reads.bam regions.bed

returns: outdir/reads.bam.vmatrix.csv

vplot -o matrix.csv reads.bam regions.bed

returns: matrix.csv

vplot -m -o outdir/ reads.bam regions.bed

returns: - outdir/chr1-1000-2000.csv

- outdir/chr2-1000-2000.csv

vplot -m -o myPrefix_ reads.bam regions.bed

returns:

- myPrefix_chr1-1000-2000.csv

- myPrefix_chr2-1000-2000.csv

vplot -m -o outdir/myPrefix_ reads.bam regions.bed

returns:

- outdir/myPrefix_chr1-1000-2000.csv

- outdir/myPrefix_chr2-1000-2000.csv [default: -]

ARGS:
<bam> Path to an indexed bam file
<regions> Path to a bed file (must be in bed4 format: chr, start, end, strand) Of a region (or regions) in
which to generate the vplot. If using multiple regions, all entries must be the same width.
<bam>
Path to an indexed bam file

<regions>
Path to a bed file (must be in bed4 format: chr, start, end, strand) Of a region (or regions) in which to
generate the vplot. If using multiple regions, all entries must be the same width. If setting multiple
regions, reads will be aggregated into a single matrix unless `--multi` is set
```
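
The `--output` rules in the help text above can be summarized with a short Python sketch. It is an illustration of the documented behavior, not the tool's actual code: the function `resolve_output` is hypothetical, and treating a trailing `/` (or an existing directory) as "a directory" is an assumption.

```python
import os


def resolve_output(output, bam, region=None, multi=False):
    """Return the output path implied by -o, or "-" for stdout.

    `region` is a (chrom, start, end) tuple and is only used when `multi` is set.
    """
    # assumption: a trailing slash or an existing directory counts as a directory
    is_dir = output.endswith("/") or os.path.isdir(output)
    if not multi:
        if output == "-":
            return "-"  # default: print the matrix to stdout
        if is_dir:
            return os.path.join(output, os.path.basename(bam) + ".vmatrix.csv")
        return output  # plain file path
    chrom, start, end = region
    name = f"{chrom}-{start}-{end}.csv"
    if output == "-":
        return name  # default multi output: chr-start-end.csv
    if is_dir:
        return os.path.join(output, name)
    return output + name  # any other string is used as a prefix
```

For example, `resolve_output("outdir/myPrefix_", "reads.bam", ("chr1", 1000, 2000), multi=True)` yields `outdir/myPrefix_chr1-1000-2000.csv`, matching the last example in the help text.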
