`x12pp` is a CLI pretty-printer for X12 EDI files.
X12 is an arcane format consisting of a fixed-length header followed by a series of segments, each separated by a segment terminator character.
These segments are generally not separated by newlines, so extracting a range of lines from a file or taking a peek at the start using the usual Unix toolbox becomes unnecessarily painful.
Of course, you could split the lines using `sed -e 's/~/~\n/g'` and get on with
your day, but:

- although `~` is the traditional and most widely-used segment terminator,
  it's not required -- each X12 file specifies its own terminators as part of
  the header.
- using `sed` or `perl` would mean I wouldn't have a chance to explore fast
  stream processing in Rust.

So here we are.
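To make the point about terminators concrete, here is a minimal Python sketch (not how `x12pp` itself works, which is written in Rust) that reads the delimiters from the header instead of assuming `~`. It relies on the ISA interchange header being exactly 106 bytes long, with the element separator as the fourth byte and the segment terminator as the last one. The names `read_delimiters` and `pretty_print` are made up for illustration, and edge cases such as a newline used as the terminator are ignored.

```python
ISA_LENGTH = 106  # the ISA interchange header is fixed-width, terminator included


def read_delimiters(data: bytes) -> tuple:
    """Return (element_separator, segment_terminator) taken from the ISA header."""
    if len(data) < ISA_LENGTH or data[:3] != b"ISA":
        raise ValueError("input does not start with a complete ISA header")
    element_separator = data[3:4]       # the byte right after the "ISA" tag
    segment_terminator = data[105:106]  # the last byte of the 106-byte header
    return element_separator, segment_terminator


def pretty_print(data: bytes) -> bytes:
    """Put each segment on its own line, whatever terminator the file uses."""
    _, terminator = read_delimiters(data)
    lines = []
    for raw in data.split(terminator):
        segment = raw.lstrip(b"\r\n")  # tolerate input that is already wrapped
        if segment:
            lines.append(segment + terminator)
    return b"\n".join(lines) + b"\n"
```

Given a file that uses `|` as its terminator, this yields one `|`-terminated segment per line, where the `sed` one-liner would typically leave the file as it was. It also reads the whole input into memory, which is fine for a sketch but not for a multi-gigabyte file.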
Assuming you have Rust and Cargo installed on your machine, clone this
repository and then from the root run `cargo build --release`. This will
result in a statically-compiled binary at `target/release/x12pp`, which you can
then copy wherever you need.
```
$ x12pp < FILE > NEWFILE
$ x12pp FILE -o NEWFILE
$ x12pp --uglify FILE
```
See the manpage or `--help` for more.
All tests were performed on an Intel Core i9-7940X, using a 1.3G X12 test file
located on a RAM disk. In each case, shell redirection was used to
pipe the file through the test command and into `/dev/null` in order to get
as close as possible to measuring pure processing time. For example:
$ time sed -e 's/~/~\n/g' < test-file > /dev/null
| Tool        | Command                       | Terminator detection | Pre-wrapped? | SIGPIPE? | Time  |
|-------------|-------------------------------|----------------------|--------------|----------|-------|
| x12pp       | `x12pp`                       | ✓                    | ✓            | ✓        | 1.05s |
| GNU sed 4.7 | `sed -e 's/~/~\n/g'`          | ✗                    | ✗            | ✗        | 7.6s  |
| perl 5.28.2 | `perl -pe 's/~[\r\n]*/~\n/g'` | ✗                    | ✓ but slower | ✗        | 8.5s  |
| edicat      | `edicat`                      | ✓                    | ✓            | ✓        | 7m41s |
A goal of `x12pp` was to be able to run `x12pp < FILE | head -n 100` without
having to plough through a multi-gigabyte file.
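As an illustration of what the "SIGPIPE?" column is getting at, here is a rough Python sketch of a reader that works through the input in chunks and stops as soon as whatever is downstream closes the pipe. This is not `x12pp`'s actual code: `segments` and `write_lines` are hypothetical names, the terminator is passed in rather than detected, and Python reports the closed pipe as a `BrokenPipeError` rather than being killed by the signal.

```python
def segments(stream, terminator: bytes, chunk_size: int = 64 * 1024):
    """Yield terminated segments lazily, reading the input one chunk at a time."""
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last terminator is complete; keep the rest for later.
        *complete, pending = pending.split(terminator)
        for segment in complete:
            yield segment + terminator
    if pending.strip():
        yield pending  # unterminated tail at the end of the input


def write_lines(stream, out, terminator: bytes) -> None:
    """Write one segment per line and stop as soon as the reader goes away."""
    try:
        for segment in segments(stream, terminator):
            out.write(segment + b"\n")
        out.flush()
    except BrokenPipeError:
        # Downstream (e.g. `head`) closed the pipe: give up on the rest of the input.
        pass
```

With something like `head -n 100` on the receiving end, only a small prefix of the input should ever be read, however large the file is.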