Chumsky

A friendly parser combinator crate that makes writing LL(k) parsers with error recovery and partial parsing easy.

Example usage with my own language, Tao

Note: Error diagnostic rendering is performed by Ariadne

Features

What is a parser combinator?

Parser combinators are a technique for implementing parsers by defining them in terms of other parsers. The resulting parsers use a recursive descent strategy to transform an input into an output. Using parser combinators to define parsers is roughly analogous to using Rust's `Iterator` trait to define iterative algorithms: the type-driven API of `Iterator` makes it harder to make mistakes and easier to encode complicated iteration logic than writing the same code by hand. The same is true of parsers and parser combinators.
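To make the idea concrete, here is a minimal, dependency-free sketch of the technique. It is not Chumsky's API: the `ParseResult` alias and the free functions `just`, `or`, `map`, `repeated`, and `bits` are hypothetical names chosen for illustration, and the sketch has no error reporting at all. It only shows how small parsers can be combined into larger ones.

```rust
// A parser takes the remaining input and, on success, returns an output
// together with whatever input is left over.
type ParseResult<'a, O> = Option<(O, &'a str)>;

// Primitive parser: accept exactly the character `expected`.
fn just<'a>(expected: char) -> impl Fn(&'a str) -> ParseResult<'a, char> {
    move |input| {
        let mut chars = input.chars();
        match chars.next() {
            Some(c) if c == expected => Some((c, chars.as_str())),
            _ => None,
        }
    }
}

// Combinator: try `first`; if it fails, try `second` on the same input.
fn or<'a, O>(
    first: impl Fn(&'a str) -> ParseResult<'a, O>,
    second: impl Fn(&'a str) -> ParseResult<'a, O>,
) -> impl Fn(&'a str) -> ParseResult<'a, O> {
    move |input| first(input).or_else(|| second(input))
}

// Combinator: transform the output of `parser` with `f`.
fn map<'a, A, B>(
    parser: impl Fn(&'a str) -> ParseResult<'a, A>,
    f: impl Fn(A) -> B,
) -> impl Fn(&'a str) -> ParseResult<'a, B> {
    move |input| parser(input).map(|(out, rest)| (f(out), rest))
}

// Combinator: apply `parser` zero or more times, collecting the outputs.
// Assumes `parser` consumes input whenever it succeeds; otherwise this loops.
fn repeated<'a, O>(
    parser: impl Fn(&'a str) -> ParseResult<'a, O>,
) -> impl Fn(&'a str) -> ParseResult<'a, Vec<O>> {
    move |mut input| {
        let mut outputs = Vec::new();
        while let Some((output, rest)) = parser(input) {
            outputs.push(output);
            input = rest;
        }
        Some((outputs, input))
    }
}

// A parser for a run of binary digits, built only from the pieces above.
fn bits<'a>() -> impl Fn(&'a str) -> ParseResult<'a, Vec<bool>> {
    let bit = or(
        map(just('0'), |_: char| false),
        map(just('1'), |_: char| true),
    );
    repeated(bit)
}
```

Calling `bits()("1011x")` yields the four parsed bits along with the unconsumed remainder `"x"`. A real combinator library layers error types, spans, and recovery on top of this basic shape.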

Example Brainfuck Parser

See `examples/brainfuck.rs` for the full interpreter (`cargo run --example brainfuck -- examples/sample.bf`).

```rust
use chumsky::prelude::*;

#[derive(Clone)]
enum Instr {
    Left, Right,
    Incr, Decr,
    Read, Write,
    Loop(Vec<Self>),
}

fn parser() -> impl Parser<char, Vec<Instr>, Error = Simple<char>> {
    // `bf` refers to the parser being defined, so loops can nest
    recursive(|bf| bf.delimited_by('[', ']').map(Instr::Loop)
        .or(just('<').to(Instr::Left))
        .or(just('>').to(Instr::Right))
        .or(just('+').to(Instr::Incr))
        .or(just('-').to(Instr::Decr))
        .or(just(',').to(Instr::Read))
        .or(just('.').to(Instr::Write))
        // A program is a sequence of zero or more instructions
        .repeated())
}
```

Other examples include:

Error Recovery

Chumsky supports error recovery: when it encounters a syntax error, it can report the error and then attempt to recover to a state from which parsing can continue. This allows multiple errors to be reported in a single pass, and a partial AST can still be generated from the input for later compilation stages to consume.

However, there is no silver bullet strategy for error recovery. By definition, if the input to a parser is invalid then the parser can only make educated guesses as to the meaning of the input. Different recovery strategies will work better for different languages, and for different patterns within those languages.
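As a rough illustration of one such strategy, the dependency-free sketch below recovers by skipping to the next delimiter. It does not use Chumsky or its recovery API: the toy grammar (integers terminated by `;`) and the names `Stmt`, `ParseError`, and `parse_program` are assumptions made up for this example. A failed statement is recorded as an error, replaced by a placeholder node, and parsing resumes after the next `;`.

```rust
// Toy AST: a program is a list of integer statements terminated by ';'.
#[derive(Debug, PartialEq)]
enum Stmt {
    Number(i64),
    // Placeholder left in the tree where a statement failed to parse.
    Error,
}

#[derive(Debug, PartialEq)]
struct ParseError {
    // Byte offset in the input where the failed statement begins.
    offset: usize,
    // The text that could not be parsed.
    found: String,
}

// Parse every statement, recovering from bad ones by skipping to the next ';'.
// Returns a (possibly partial) AST together with all errors encountered.
fn parse_program(input: &str) -> (Vec<Stmt>, Vec<ParseError>) {
    let mut stmts = Vec::new();
    let mut errors = Vec::new();
    let mut offset = 0;

    // Splitting on ';' is what makes the delimiter a recovery point: each
    // chunk is parsed independently of whether the previous one failed.
    for chunk in input.split_terminator(';') {
        let text = chunk.trim();
        match text.parse::<i64>() {
            Ok(n) => stmts.push(Stmt::Number(n)),
            Err(_) => {
                errors.push(ParseError { offset, found: text.to_string() });
                stmts.push(Stmt::Error);
            }
        }
        // Advance past this chunk and its ';' terminator.
        offset += chunk.len() + 1;
    }

    (stmts, errors)
}
```

For the input `"1; x; 3; y z;"` this reports two errors while still producing four AST nodes, two of them `Stmt::Error` placeholders. Skipping to a delimiter suits this toy grammar because `;` cannot appear inside a statement; in a language with nested delimiters the same approach could skip far too much or too little, which is why the choice of strategy matters.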

Chumsky provides a variety of recovery strategies (each implementing the `Strategy` trait). It's important to understand that which strategies you apply, where you apply them, and in what order will greatly affect the quality of the errors that Chumsky is able to produce, along with the extent to which it is able to recover a useful AST. Where possible, you should attempt more 'specific' recovery strategies first rather than those that mindlessly skip large swathes of the input.

It is recommended that you experiment with applying different strategies in different situations and at different levels of the parser to find a configuration that you are happy with. If none of the provided error recovery strategies cover the specific pattern you wish to catch, you can even create your own by digging into Chumsky's internals and implementing your own strategies! If you come up with a useful strategy, feel free to open a PR against the main repo!

Planned Features

Philosophy

Chumsky should:

Other Information

My apologies to Noam for choosing such an absurd name.

License

Chumsky is licensed under the MIT license (see LICENSE in the main repository).