A collection of command-line tools for working with STAM.
Various tools are grouped under the `stam` tool and invoked with a subcommand:

- `stam annotate`: Add an annotation from a JSON file.
- `stam info`: Return information regarding a STAM model.
- `stam init`: Initialize a new STAM annotation store.
- `stam to-text`: Print the text of any resources in the model.
- `stam to-tsv`: Convert STAM to a simple TSV (Tab Separated Values) format. This is not lossless but provides a decent view of the data.
- `stam validate`: Validate a STAM model.
- `stam save`: Write a STAM model to file(s). This can be used to switch between STAM JSON and STAM CSV output, based on the file extension.
- `stam tag`: Regular-expression based tagger on plain text.

For many of these, you can set `--verbose` for extra details in the output.
To install the tools with Cargo:

    $ cargo install stam-tools

Add the `--help` flag after the subcommand for extensive usage instructions.
Most tools take as input a STAM JSON file containing an annotation store. Any files mentioned via the `@include` mechanism are loaded automatically.
Instead of passing STAM JSON files, you can read from stdin and/or write to stdout by setting the filename to `-`. This works in many places.
These tools also support reading and writing STAM CSV.
The `stam tag` tool matches regular expressions in text and associates annotations with the matches. It can be used for tokenization or other tagging tasks, for example.
The `stam tag` command takes a TSV file containing regular expression rules for the tagger.
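The file contains the following tab-separated columns:

1. The regular expression to match.
2. The ID of the annotation data set.
3. The ID of the data key.
4. The value to set.
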
Example:
```tsv
\w+(?:[-_]\w+)*	simpletokens	type	word
[.\?,/]+	simpletokens	type	punctuation
[0-9]+(?:[,.][0-9]+)	simpletokens	type	number
```
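
To get a feel for what rules like these match, the following is a minimal Python sketch that applies the three example rules to a sample sentence. It is not how `stam tag` is implemented. The names `RULES_TSV`, `parse_rules` and `tag` are hypothetical helpers made up for this illustration, and Python's `re` module stands in for the tool's regex engine, so syntax details may differ.

```python
import re

# The three example rules, tab-separated:
# regex, annotation data set, data key, value.
RULES_TSV = (
    "\\w+(?:[-_]\\w+)*\tsimpletokens\ttype\tword\n"
    "[.\\?,/]+\tsimpletokens\ttype\tpunctuation\n"
    "[0-9]+(?:[,.][0-9]+)\tsimpletokens\ttype\tnumber\n"
)


def parse_rules(tsv_text):
    """Parse rule lines into (compiled regex, data set, key, value) tuples."""
    rules = []
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        pattern, dataset, key, value = line.split("\t")
        rules.append((re.compile(pattern), dataset, key, value))
    return rules


def tag(text, rules):
    """Apply every rule independently; return one tuple per match.

    Each tuple is (begin, end, matched text, data set, key, value),
    sorted by begin offset.
    """
    results = []
    for regex, dataset, key, value in rules:
        for m in regex.finditer(text):
            results.append((m.start(), m.end(), m.group(0), dataset, key, value))
    return sorted(results)


if __name__ == "__main__":
    for row in tag("Pi is 3.14, roughly.", parse_rules(RULES_TSV)):
        print(row)
```

In this sketch each rule is applied on its own, so `3.14` is matched once by the number rule while the word rule also matches `3` and `14` separately. Whether the actual tool reports overlapping matches in the same way is not something this example establishes.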