Current release: 0.3.0
cargo install intspan
spanr
```text
$ spanr help
intspan 0.2.1-alpha.0
wang-q <wang-q@outlook.com>
`intspan` operates chromosome IntSpan files

USAGE:
    spanr [SUBCOMMAND]

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

SUBCOMMANDS:
    combine    Combine multiple sets of runlists in a yaml file
    compare    Compare 2 YAML files
    convert    Convert runlist file to ranges file
    cover      Output covers on chromosomes
    genome     Convert chr.size to runlists
    gff        Convert gff3 to covers on chromosomes
    help       Prints this message or the help of the given subcommand(s)
    merge      Merge runlist yaml files
    range      Convert runlist file to ranges file
    some       Extract some records from a runlist yaml file
    span       Operate spans in a YAML file
    split      Split a runlist yaml file
    stat       Coverage on chromosomes for runlists
    statop     Coverage on chromosomes for one YAML crossed another
```
linkr
```text
$ linkr help
linkr 0.2.1-alpha.0
wang-q <wang-q@outlook.com>
`linkr` operates ranges on chromosomes and links of ranges

USAGE:
    linkr [SUBCOMMAND]

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

SUBCOMMANDS:
    circos     Convert links to circos links or highlights
    clean      Replace ranges within links, incorporate hit strands and remove nested links
    connect    Connect bilateral links into multilateral ones
    filter     Filter links by numbers of ranges or length differences
    help       Prints this message or the help of the given subcommand(s)
    merge      Merge overlapped ranges via overlapping graph
    sort       Sort links and ranges within links
```
An IntSpan represents sets of integers as a number of inclusive ranges, for example `1-10,19,45-48` or `-99--10,1-10,19,45-48`.
The following picture is the schema of an IntSpan object. Jump lines are above the baseline; loop lines are below it.
Also, AlignDB::IntSpan and jintspan are implementations of IntSpan objects in Perl and Java, respectively.
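To make the runlist notation concrete, here is a minimal sketch that counts the integers in a runlist using only `tr` and `awk`. The function name `runlist_cardinality` is hypothetical and is not part of `spanr`; the sketch assumes a well-formed runlist whose ranges do not overlap.

```bash
# Count the integers covered by a runlist such as "1-10,19,45-48".
# Hypothetical helper for illustration only; it is not part of spanr.
# Assumes the ranges in the runlist do not overlap.
runlist_cardinality() {
    printf '%s\n' "$1" | tr ',' '\n' | awk '
        {
            # Lower bound: an optionally negative integer at the start.
            if (!match($0, /^-?[0-9]+/)) { bad = 1; exit }
            lower = substr($0, 1, RLENGTH) + 0
            rest = substr($0, RLENGTH + 1)
            if (rest == "") {
                upper = lower                # a single integer such as "19"
            } else if (rest ~ /^--?[0-9]+$/) {
                upper = substr(rest, 2) + 0  # skip the separating "-"
            } else {
                bad = 1; exit
            }
            total += upper - lower + 1       # both ends are inclusive
        }
        END {
            if (bad) exit 1
            print total + 0
        }'
}

runlist_cardinality "1-10,19,45-48"          # prints 15
runlist_cardinality "-99--10,1-10,19,45-48"  # prints 105
```

The second call shows why `-99--10` is unambiguous: the first `-` is a sign, the second is the range separator, and the third is the sign of the upper bound.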
chr.sizes
Single
Multi
Examples in `S288c.ranges`:

```text
I:1-100
I(+):90-150
S288c.I(-):190-200
II:21294-22075
II:23537-24097
```
Simple rules:

* `chromosome` and `start` are required
* `species`, `strand` and `end` are optional
* `.` separates `species` and `chromosome`
* `strand` is one of `+` and `-`, surrounded by round brackets
* `:` separates names and digits
* `-` separates `start` and `end`
* `species`:
    * `species` should be alphanumeric and without spaces; the one exception character is `/`.
    * `species` is an identifier; you can also treat it as a strain name, a lineage or something else.

In the schema below, carets mark the required parts and dashes mark the optional ones.

```text
species.chromosome(strand):start-end
--------^^^^^^^^^^--------^^^^^^----
```
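The rules above can be sketched as a pair of `sed` substitutions. This is an illustration only: `parse_range` is a hypothetical helper, and its patterns are an assumption drawn from the rules listed here, not the parser that `spanr` or `linkr` actually use. In particular, it assumes chromosome names contain no `.`, `(`, `)`, `:` or whitespace.

```bash
# Split a range string into its five fields.
# Hypothetical helper; the patterns are an assumption based on the rules
# in this README, not the parser used by spanr or linkr.
# First expression: a missing end defaults to start ("II:21294" -> "II:21294-21294").
# Second expression: optional "species.", chromosome, optional "(strand)",
# then ":start-end".
parse_range() {
    local parsed
    parsed=$(printf '%s\n' "$1" | sed -n -E \
        -e 's#:([0-9]+)$#:\1-\1#' \
        -e 's#^(([A-Za-z0-9/]+)\.)?([^.():[:space:]]+)(\(([+-])\))?:([0-9]+)-([0-9]+)$#species=\2 chromosome=\3 strand=\5 start=\6 end=\7#p')
    if [ -z "$parsed" ]; then
        echo "invalid range: $1" >&2
        return 1
    fi
    printf '%s\n' "$parsed"
}

parse_range "I:1-100"             # species= chromosome=I strand= start=1 end=100
parse_range "S288c.I(-):190-200"  # species=S288c chromosome=I strand=- start=190 end=200
```

Optional parts that are absent come out as empty values, which keeps the output shape the same for every valid range.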
Types of links:

* Bilateral links

        I(+):13063-17220    I(-):215091-219225
        I(+):139501-141431  XII(+):95564-97485

* Bilateral links with hit strand

        I(+):13327-17227    I(+):215084-218967  -
        I(+):139501-141431  XII(+):95564-97485  +

* Multilateral links

        II(+):186984-190356 IX(+):12652-16010   X(+):12635-15993
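The three shapes can be told apart by counting fields. The sketch below is a hypothetical helper, not a `linkr` subcommand, and it assumes the fields of a link line are tab-separated, as the `.tsv` file names used elsewhere in this document suggest.

```bash
# Classify one line of a links file by the shapes listed above.
# Hypothetical helper; assumes tab-separated fields.
classify_link() {
    printf '%s\n' "$1" | awk -F '\t' '
        {
            ranges = NF
            has_strand = ($NF == "+" || $NF == "-")
            if (has_strand) ranges--   # the last field is the hit strand
            if (ranges == 2 && has_strand)
                print "bilateral with hit strand"
            else if (ranges == 2)
                print "bilateral"
            else if (ranges > 2 && !has_strand)
                print "multilateral"
            else
                print "unknown"
        }'
}

classify_link "$(printf 'I(+):13063-17220\tI(-):215091-219225')"    # prints bilateral
classify_link "$(printf 'I(+):13327-17227\tI(+):215084-218967\t-')" # prints bilateral with hit strand
```

A line with three or more ranges and no trailing strand is reported as `multilateral`; anything else falls through to `unknown` rather than being guessed at.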
spanr
```bash
spanr genome tests/spanr/S288c.chr.sizes
spanr genome tests/spanr/S288c.chr.sizes | spanr stat tests/spanr/S288c.chr.sizes stdin --all
spanr some tests/spanr/Atha.yml tests/spanr/Atha.list
spanr merge tests/spanr/I.yml tests/spanr/II.yml
spanr cover tests/spanr/S288c.ranges
spanr cover tests/spanr/S288c.ranges -c 2
spanr cover tests/spanr/dazzname.ranges
spanr gff tests/spanr/NC_007942.gff --tag tRNA
spanr range --op overlap tests/spanr/intergenic.yml tests/spanr/S288c.ranges
spanr span --op cover tests/spanr/brca2.yml
spanr combine tests/spanr/Atha.yml
jrunlist combine -o stdout tests/spanr/Atha.yml
spanr compare \
    --op intersect \
    tests/spanr/intergenic.yml \
    tests/spanr/repeat.yml
spanr split tests/spanr/I.II.yml
spanr stat tests/spanr/S288c.chr.sizes tests/spanr/intergenic.yml
spanr stat tests/spanr/S288c.chr.sizes tests/spanr/I.II.yml
diff <(spanr stat tests/spanr/Atha.chr.sizes tests/spanr/Atha.yml) \
    <(jrunlist stat -o stdout tests/spanr/Atha.chr.sizes tests/spanr/Atha.yml)
spanr statop \
    --op intersect \
    tests/spanr/S288c.chr.sizes \
    tests/spanr/intergenic.yml \
    tests/spanr/repeat.yml
diff <(spanr statop \
        --op intersect --all \
        tests/spanr/Atha.chr.sizes \
        tests/spanr/Atha.yml \
        tests/spanr/paralog.yml ) \
    <(jrunlist statop \
        -o stdout \
        --op intersect --all \
        tests/spanr/Atha.chr.sizes \
        tests/spanr/Atha.yml \
        tests/spanr/paralog.yml )
spanr convert tests/spanr/repeat.yml tests/spanr/intergenic.yml |
    spanr cover stdin |
    spanr stat tests/spanr/S288c.chr.sizes stdin --all
spanr merge tests/spanr/repeat.yml tests/spanr/intergenic.yml |
    spanr combine stdin |
    spanr stat tests/spanr/S288c.chr.sizes stdin --all
```
linkr
```bash
linkr sort tests/linkr/II.links.tsv -o tests/linkr/II.sort.tsv
linkr merge tests/linkr/II.links.tsv -v
linkr clean tests/linkr/II.sort.tsv
linkr clean tests/linkr/II.sort.tsv --bundle 500
linkr clean tests/linkr/II.sort.tsv -r tests/linkr/II.merge.tsv
linkr connect tests/linkr/II.clean.tsv -v
linkr filter tests/linkr/II.connect.tsv -n 2
linkr filter tests/linkr/II.connect.tsv -n 3 -r 0.99
linkr circos tests/linkr/II.connect.tsv
linkr circos --highlight tests/linkr/II.connect.tsv
```
Steps:

    sort
      |
      v
    clean -> merge
      |      /
      |    /
      v
    clean
      |
      v
    connect
      |
      v
    filter

```bash
linkr sort tests/S288c/links.lastz.tsv tests/S288c/links.blast.tsv \
    -o tests/S288c/sort.tsv
linkr clean tests/S288c/sort.tsv \
    -o tests/S288c/sort.clean.tsv
linkr merge tests/S288c/sort.clean.tsv -c 0.95 \
    -o tests/S288c/merge.tsv
linkr clean tests/S288c/sort.clean.tsv -r tests/S288c/merge.tsv --bundle 500 \
    -o tests/S288c/clean.tsv
linkr connect tests/S288c/clean.tsv -r 0.8 \
    -o tests/S288c/connect.tsv
linkr filter tests/S288c/connect.tsv -r 0.8 \
    -o tests/S288c/filter.tsv
wc -l tests/S288c/*.tsv
cat tests/S288c/filter.tsv |
    perl -nla -F"\t" -e 'print for @F' |
    spanr cover stdin -o tests/S288c/cover.yml
spanr stat tests/S288c/chr.sizes tests/S288c/cover.yml -o stdout
```
```bash
gzip -dcf tests/Atha/links.lastz.tsv.gz tests/Atha/links.blast.tsv.gz |
    linkr sort stdin -o tests/Atha/sort.tsv
linkr clean tests/Atha/sort.tsv \
    -o tests/Atha/sort.clean.tsv
linkr merge tests/Atha/sort.clean.tsv -c 0.95 \
    -o tests/Atha/merge.tsv
linkr clean tests/Atha/sort.clean.tsv -r tests/Atha/merge.tsv --bundle 500 \
    -o tests/Atha/clean.tsv
linkr connect tests/Atha/clean.tsv -o tests/Atha/connect.tsv
linkr filter tests/Atha/connect.tsv -r 0.8 \
    -o tests/Atha/filter.tsv
wc -l tests/Atha/*.tsv
cat tests/Atha/filter.tsv |
    perl -nla -F"\t" -e 'print for @F' |
    spanr cover stdin -o tests/Atha/cover.yml
spanr stat tests/Atha/chr.sizes tests/Atha/cover.yml -o stdout
```
```text
$ cd ~/Scripts/rust/intspan
$ cargo build --release --examples
$ command time -l target/release/examples/benchmark
["target/release/examples/benchmark"]
step 2 duration: 0.04019228699999999 s
step 3 duration: 0.048044463 s
step 4 duration: 0.101135046 s
step 5 duration: 0.378076464 s
step 6 duration: 0.6468764170000001 s
        1.21 real         1.21 user         0.00 sys
   1024000  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
       259  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         0  voluntary context switches
       304  involuntary context switches
```
```text
$ cd ~/Scripts/java/jintspan
$ mvn clean verify
$ command time -l java -jar target/jintspan-*-jar-with-dependencies.jar benchmark
step 2 duration 0.023358
step 3 duration 0.035295
step 4 duration 0.053588
step 5 duration 0.316216
step 6 duration 0.561500
        1.12 real         1.38 user         0.07 sys
 110718976  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
     30334  page reclaims
         4  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         1  signals received
         4  voluntary context switches
       926  involuntary context switches
```
```text
$ cd ~/Scripts/cpan/AlignDB-IntSpanXS/benchmark
$ make
$ command time -l ./test_c benchmark
step 2 duration 0.022875
step 3 duration 0.032172
step 4 duration 0.057164
step 5 duration 0.294729
step 6 duration 0.525069
        0.93 real         0.93 user         0.00 sys
   1085440  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
       274  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         0  voluntary context switches
       176  involuntary context switches
```
```text
$ cd ~/Scripts/cpan/AlignDB-IntSpanXS/benchmark
$ command time -l perl test_ai.pl benchmark
step 2 duration 2.506869
step 3 duration 2.831008
step 4 duration 2.969270
step 5 duration 46.395918
step 6 duration 96.724945
      151.45 real       151.25 user         0.10 sys
   6377472  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
      1566  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         0  voluntary context switches
     14911  involuntary context switches
```
```text
$ cd ~/Scripts/cpan/AlignDB-IntSpanXS/benchmark
$ command time -l perl test_ai.pl benchmark xs
step 2 duration 0.273726
step 3 duration 0.296036
step 4 duration 0.344481
step 5 duration 2.072225
step 6 duration 9.789098
       12.80 real        12.76 user         0.02 sys
   6475776  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
      1590  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         0  voluntary context switches
      3016  involuntary context switches
```
```text
$ cd ~/Scripts/rust/intspan
$ cargo build --release --examples
$ command time -l target/release/examples/file
["target/release/examples/file"]
step 1 create duration: 0.022158192 s
step 2 intersect duration: 0.846951539 s
step 3 intersect runlist duration: 0.9379064089999999 s
        1.81 real         1.80 user         0.00 sys
   2555904  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
       633  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         0  voluntary context switches
       383  involuntary context switches
```
```text
$ cd ~/Scripts/java/jintspan
$ mvn clean verify
$ command time -l java -jar target/jintspan-*-jar-with-dependencies.jar file
step 1 create duration 0.071450
step 2 intersect duration 0.499175
step 3 intersect runlist duration 0.789997
        1.52 real         1.69 user         0.14 sys
 308686848  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
     78554  page reclaims
         2  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         1  signals received
         3  voluntary context switches
      2034  involuntary context switches
```
```text
$ cd ~/Scripts/cpan/AlignDB-IntSpanXS/benchmark
$ make
$ command time -l ./test_c file
step 1 create duration 0.118375
step 2 intersect duration 2.174462
step 3 intersect runlist duration 18.218233
       20.51 real        20.42 user         0.05 sys
   2121728  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
       521  page reclaims
         6  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         1  voluntary context switches
      6360  involuntary context switches
```
```text
$ cd ~/Scripts/cpan/AlignDB-IntSpanXS/benchmark
$ command time -l perl test_ai.pl file
==> test against large sets
step 1 create duration 4.548069
step 2 intersect duration 61.313397
step 3 intersect runlist duration 61.335031
      127.25 real       126.56 user         0.38 sys
  11943936  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
      2924  page reclaims
         1  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         6  voluntary context switches
     45869  involuntary context switches
```
```text
$ cd ~/Scripts/cpan/AlignDB-IntSpanXS/benchmark
$ command time -l perl test_ai.pl file xs
==> test against large sets
step 1 create duration 0.116019
step 2 intersect duration 8.530752
step 3 intersect runlist duration 8.677303
       17.37 real        17.26 user         0.05 sys
   9822208  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
      2407  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         0  voluntary context switches
      7011  involuntary context switches
```