djot.rs

The fastest Djot parser ever?

Djot is a light markup syntax. It derives most of its features from CommonMark, but it fixes a few things that make CommonMark's syntax complex and difficult to parse efficiently. It is also much fuller-featured than CommonMark, with support for definition lists, footnotes, tables, several new kinds of inline formatting (insert, delete, highlight, superscript, subscript), math, smart punctuation, attributes that can be applied to any element, and generic containers for block-level, inline-level, and raw content.

djot.rs is a pull parser for Djot written in pure Rust. It is built on three principles:

These principles are not always compatible with each other. For example, the speed requirement led me to target only UTF-8 and to handle all text as raw bytes, which hurts legibility. I try to strike a balance where possible.
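To illustrate the byte-level trade-off, here is a minimal sketch of scanning UTF-8 text as bytes. It is not code from djot.rs: the function names `find_delimiter` and `split_at_delimiter` are made up for this example. The sketch relies on one property of UTF-8: every byte of a multi-byte character has its high bit set, so a byte equal to an ASCII delimiter can never be part of another character.

```rust
/// Returns the byte offset of the first `*` or `_` in `text`, if any.
/// Hypothetical helper for illustration, not part of djot.rs.
fn find_delimiter(text: &str) -> Option<usize> {
    // Compare raw bytes instead of decoding chars. Bytes of multi-byte
    // UTF-8 sequences are all >= 0x80, so they never match an ASCII byte.
    text.as_bytes()
        .iter()
        .position(|&b| b == b'*' || b == b'_')
}

/// Splits `text` just before its first delimiter.
/// Returns the whole text and an empty string when there is none.
fn split_at_delimiter(text: &str) -> (&str, &str) {
    match find_delimiter(text) {
        // The offset points at an ASCII byte, which is always a char
        // boundary, so `split_at` cannot panic here.
        Some(offset) => text.split_at(offset),
        None => (text, ""),
    }
}
```

The cost shows up in legibility: the code reasons about byte offsets and encoding invariants rather than characters, and each such invariant has to be kept in mind by whoever reads it.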

NOTE: djot.rs is not finished and is therefore not yet usable. If you want to help out, send a mail to the djot discussion mailing list, djot-discuss.

Usage

djot.rs is written as a library that parses Djot into an iterator of markup events. You can combine this iterator with a writer to produce the desired output. djot.rs will ship with built-in writers for formats such as HTML, but you can build your own as well.
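The sketch below shows the general shape of the event and writer split described above. Since djot.rs is not finished, nothing here is its actual API: the `Event` and `Tag` types and the `write_html` function are hypothetical names chosen for illustration, and the events are built by hand instead of coming from a parser.

```rust
/// Hypothetical block and inline containers; the real set would be larger.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tag {
    Paragraph,
    Emphasis,
}

/// Hypothetical markup event, borrowing its text from the source.
#[derive(Debug, Clone, PartialEq)]
enum Event<'a> {
    Start(Tag),
    End(Tag),
    Text(&'a str),
}

impl Tag {
    /// HTML element name used by the writer below.
    fn html_name(self) -> &'static str {
        match self {
            Tag::Paragraph => "p",
            Tag::Emphasis => "em",
        }
    }
}

/// A custom writer: consumes any stream of events and renders HTML.
fn write_html<'a>(events: impl Iterator<Item = Event<'a>>) -> String {
    let mut out = String::new();
    for event in events {
        match event {
            Event::Start(tag) => {
                out.push('<');
                out.push_str(tag.html_name());
                out.push('>');
            }
            Event::End(tag) => {
                out.push_str("</");
                out.push_str(tag.html_name());
                out.push('>');
            }
            Event::Text(text) => {
                // Escape the characters that are special in HTML text.
                for c in text.chars() {
                    match c {
                        '<' => out.push_str("&lt;"),
                        '>' => out.push_str("&gt;"),
                        '&' => out.push_str("&amp;"),
                        _ => out.push(c),
                    }
                }
            }
        }
    }
    out
}

/// Events a parser might yield for the Djot input `Hello _world_ & co.`
fn example() -> String {
    let events = vec![
        Event::Start(Tag::Paragraph),
        Event::Text("Hello "),
        Event::Start(Tag::Emphasis),
        Event::Text("world"),
        Event::End(Tag::Emphasis),
        Event::Text(" & co."),
        Event::End(Tag::Paragraph),
    ];
    write_html(events.into_iter())
}
```

Because the writer only depends on the iterator, it never needs the whole document in memory, and a writer for another output format can reuse the same event stream.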

A CLI for djot.rs is hosted on GitHub. It lets you convert Djot into the desired output format from the command line.

Parsing logic

The parsing logic is organized into two modules: a lexer and a parser.

Both the lexer and the parser work as iterators, which keeps allocation to a minimum. Two passes over the input are still necessary, because the map of reference and footnote definitions must be built before parsing starts.
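The two-pass idea can be sketched as follows. This is a deliberately simplified, line-based illustration and not the djot.rs implementation: `collect_references`, `events`, and the `Event` type are hypothetical, footnotes are left out, and a reference is only recognized when it fills a whole line.

```rust
use std::collections::HashMap;

/// Hypothetical event type for this sketch.
#[derive(Debug, PartialEq)]
enum Event<'a> {
    Text(&'a str),
    Link { text: &'a str, url: Option<&'a str> },
}

/// Parses a definition line of the form `[label]: url`.
fn parse_definition(line: &str) -> Option<(&str, &str)> {
    let rest = line.strip_prefix('[')?;
    let (label, url) = rest.split_once("]:")?;
    Some((label, url.trim()))
}

/// Parses a whole-line reference of the form `[text][label]`.
fn parse_reference(line: &str) -> Option<(&str, &str)> {
    let rest = line.strip_prefix('[')?.strip_suffix(']')?;
    rest.split_once("][")
}

/// First pass: collect every definition into a map. Keys and values
/// borrow from the source, so no strings are copied.
fn collect_references(src: &str) -> HashMap<&str, &str> {
    src.lines()
        .filter_map(|line| parse_definition(line))
        .collect()
}

/// Second pass: a lazy iterator of events that resolves references
/// against the map built by the first pass.
fn events<'a>(
    src: &'a str,
    refs: &'a HashMap<&'a str, &'a str>,
) -> impl Iterator<Item = Event<'a>> + 'a {
    src.lines()
        // Skip blank lines and the definitions handled in pass one.
        .filter(|line| !line.trim().is_empty() && parse_definition(line).is_none())
        .map(move |line| match parse_reference(line) {
            Some((text, label)) => Event::Link {
                text,
                url: refs.get(label).copied(),
            },
            None => Event::Text(line),
        })
}
```

The first pass is needed because a definition may appear after the place where it is used. With the input `"[the spec][djot]\nplain text\n\n[djot]: https://example.com/djot\n"`, the link on the first line can only be resolved once the last line has been seen.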

Contributing

All patches should be sent to the djot.rs development mailing list djot-dev.