rustybam [OPTIONS] <SUBCOMMAND>
or
rb [OPTIONS] <SUBCOMMAND>
```
rustybam 0.1.23
Mitchell R. Vollger mrvollger@gmail.com
bioinformatics toolkit in rust

USAGE:
    rb [OPTIONS] <SUBCOMMAND>

OPTIONS:
    -t, --threads

SUBCOMMANDS:
    stats          Get percent identity stats from a sam/bam/cram or PAF
    bed-length     Count the number of bases in a bed file [aliases: bedlen, bl, bedlength]
    filter         Filter PAF records in various ways
    invert         Invert the target and query sequences in a PAF along with the CIGAR string
    liftover       Liftover target sequence coordinates onto query sequence using a PAF
    trim-paf       Trim paf records that overlap in query sequence [aliases: trim, tp]
    orient         Orient paf records so that most of the bases are in the forward direction
    break-paf      Break PAF records with large indels into multiple records (useful for SafFire) [aliases: breakpaf, bp]
    paf-to-sam     Convert a PAF file into a SAM file. Warning, all alignments will be marked as primary! [aliases: paftosam, p2s, paf2sam]
    fasta-split    Reads in a fasta from stdin and divides into files (can compress by adding .gz) [aliases: fastasplit, fasplit]
    fastq-split    Reads in a fastq from stdin and divides into files (can compress by adding .gz) [aliases: fastqsplit, fqsplit]
    get-fasta      Mimic bedtools getfasta but allow for bgzip in both bed and fasta inputs [aliases: getfasta, gf]
    nucfreq        Get the frequencies of each bp at each position
    repeat         Report the longest exact repeat length at every position in a fasta
    suns           Extract the intervals in a genome (fasta) that are made up of SUNs
    help           Print this message or the help of the given subcommand(s)
```
```shell
mamba install -c bioconda rustybam
```
```shell
cargo install rustybam
```
Download from releases (may be slower than locally compiled versions).
```shell
git clone https://github.com/mrvollger/rustybam.git
cd rustybam
cargo build --release
```
and the executables will be built here:
```shell
target/release/{rustybam,rb}
```
For BAM files with extended CIGAR operations, we can calculate statistics about the alignment and report them in BED format.
```shell
rustybam stats {input.bam} > {stats.bed}
```
The same can be done with PAF files as long as they are generated with `-c --eqx`.
```shell
rustybam stats --paf {input.paf} > {stats.bed}
```
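Since `stats` relies on extended CIGAR operations, it can help to check a PAF before running it. The sketch below is a hypothetical helper, not part of rustybam: it assumes the CIGAR is stored in a `cg:Z:` tag after the 12 mandatory PAF columns and that an extended CIGAR uses `=`/`X` with no `M` operations. The sample records are made up for illustration.

```shell
# Hypothetical helper (not part of rustybam): succeeds only if every record
# carries a cg:Z: tag whose CIGAR contains no M operations.
has_eqx_cigar() {
    awk -F'\t' '
        {
            cigar = ""
            # Optional tags start at column 13 in PAF.
            for (i = 13; i <= NF; i++)
                if ($i ~ /^cg:Z:/) cigar = substr($i, 6)
            if (cigar == "" || cigar ~ /M/) bad = 1
        }
        END { exit ((bad || NR == 0) ? 1 : 0) }
    ' "$1"
}

# Two made-up single-record PAF files for illustration.
tmp=$(mktemp -d)
printf 'q1\t100\t0\t100\t+\tt1\t200\t50\t150\t99\t100\t60\tcg:Z:60=1X39=\n' > "$tmp/eqx.paf"
printf 'q1\t100\t0\t100\t+\tt1\t200\t50\t150\t99\t100\t60\tcg:Z:100M\n' > "$tmp/plain.paf"

has_eqx_cigar "$tmp/eqx.paf" && echo "eqx.paf: extended CIGAR found"
has_eqx_cigar "$tmp/plain.paf" || echo "plain.paf: regenerate with -c --eqx"
```

If the check fails, the alignments likely need to be regenerated with `-c --eqx` before `rustybam stats --paf` is run on them.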
I have a PAF and I want to subset it for just a particular region in the reference.
With `rustybam` it's easy:
```shell
rustybam liftover \
    --bed <(printf "chr1\t0\t250000000\n") \
    input.paf > trimmed.paf
```
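The `<( ... )` process substitution above is a bash/zsh feature. As a minimal sketch for shells without it, the same region can be written to a temporary BED file first; the variable name `region_bed` is only illustrative, and the `rustybam` call is left as a comment so the snippet runs without the tool installed.

```shell
# Write the region of interest to a BED file instead of using <( ... ).
region_bed=$(mktemp)
printf 'chr1\t0\t250000000\n' > "$region_bed"
cat "$region_bed"

# The file can then be passed in place of the process substitution:
#   rustybam liftover --bed "$region_bed" input.paf > trimmed.paf
```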
But I also want the alignment statistics for the region.
No problem, `rustybam liftover` does not just trim the coordinates but also the CIGAR, so it is ready for `rustybam stats`:
```shell
rustybam liftover \
    --bed <(printf "chr1\t0\t250000000\n") \
    input.paf \
    | rustybam stats --paf \
    > trimmed.stats.bed
```
Okay, but Evan asked for an "align slider" so I need to realign in chunks.
No need, just make your BED query to `rustybam liftover` a set of sliding windows and it will do the rest.
```shell
rustybam liftover \
    --bed <(bedtools makewindows -w 100000 \
        -b <(printf "chr1\t0\t250000000\n") \
    ) \
    input.paf \
    | rustybam stats --paf \
    > trimmed.stats.bed
```
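If `bedtools` is not available, the window BED can be generated with `awk` alone. The following is a minimal sketch, not part of rustybam: `make_windows` is a hypothetical helper that emits fixed-width, non-overlapping windows and truncates the last one at the region end, which is intended to resemble what `bedtools makewindows -w` produces for a single interval.

```shell
# Hypothetical helper: print fixed-width BED windows covering one interval.
# usage: make_windows CHROM START END WIDTH
make_windows() {
    awk -v chrom="$1" -v start="$2" -v end="$3" -v width="$4" 'BEGIN {
        OFS = "\t"
        for (s = start; s < end; s += width) {
            e = s + width
            if (e > end) e = end   # truncate the final window
            print chrom, s, e
        }
    }'
}

# Small illustrative region: three windows, the last one shorter.
make_windows chr1 0 250000 100000
```

The output could then be passed to `rustybam liftover` in the same way as the `bedtools makewindows` output above, for example `--bed <(make_windows chr1 0 250000000 100000)`.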
You can also use `rustybam breakpaf` to break PAF records at indels above a certain size, which gives more "miropeats"-like intervals.
```shell
rustybam breakpaf --max-size 1000 input.paf \
    | rustybam liftover \
    --bed <(printf "chr1\t0\t250000000\n") \
    | rustybam stats --paf \
    > trimmed.stats.bed
```
Yeah but how do I visualize the data?
Try out SafFire!
Split a fasta file between `stdout` and two other files, one compressed and one uncompressed.
```shell
cat {input.fasta} | rustybam fasta-split two.fa.gz three.fa
```
Split a fastq file between `stdout` and two other files, one compressed and one uncompressed.
```shell
cat {input.fastq} | rustybam fastq-split two.fq.gz three.fq
```
This tool is designed to mimic `bedtools getfasta`, but it also allows the fasta to be bgzipped.
```shell
samtools faidx {seq.fa(.gz)}
rb get-fasta --name --strand --bed {regions.of.interest.bed} --fasta {seq.fa(.gz)}
```
`trim-paf`.
`bedtools getfasta`-like operation that actually works with bgzipped input.
`D4` for Nucfreq.
`suns`.