Minimal Parsing Language (MPL)

This is minimal parser combinator of Minimal Parsing Language (MPL) like Top-Down Parsing Language (TDPL). It creates a abstract syntax tree (AST) for each input.

Getting Started

  1. implement Variable
  2. insert each rule into HashMap
  3. minimal_parse()

Example

```rust use crate::ParenthesesVariable::; use mpl::parse::Parse; use mpl::rules::{RightRule, RightRuleKind::}; use mpl::span::StartAndLenSpan; use mpl::symbols::{StrTerminal::*, Variable}; use mpl::trees::AST; use std::collections::HashMap;

[derive(Clone, Debug, Hash, Eq, PartialEq)]

enum ParenthesesVariable { Open, Parentheses, Close, }

impl Variable for ParenthesesVariable {}

/// /// Open = '(' Parentheses / () /// Parentheses = Open Close / f /// Close = ")" Open / f /// fn main() { let mut rules = HashMap::new();

rules.insert(
    Open,
    RightRule::from_right_rule_kind((T(Char('(')), V(Parentheses)), Empty),
);
rules.insert(
    Parentheses,
    RightRule::from_right_rule_kind((V(Open), V(Close)), Failure),
);
rules.insert(
    Close,
    RightRule::from_right_rule_kind((T(Str(")")), V(Open)), Failure),
);

let input = "(()(()))";

// all of the span
let all_of_the_span = StartAndLenSpan::<u32, u16>::from_start_len(0, input.len() as u16);

let result: Result<
    AST<ParenthesesVariable, StartAndLenSpan<u32, u16>, ()>,
    AST<ParenthesesVariable, StartAndLenSpan<u32, u16>, ()>,
> = input.minimal_parse(&rules, &Open, &all_of_the_span);

if let Ok(ast) = result {
    println!("{}", ast);
}

} ```

Test Examples

Parsers written with MPL

MPL

Definition of MPL grammar

A MPL grammar G is a tuple G = (V, Σ, R, S) in which: - V is a finite set of variables. - Σ is a finite set of original terminal symbols. - T is an union of Σ or M (Σ ∪ M) (M (= {(), f}) is a finite set of metasymbols). - R is a finite set of rules of the form - A = B C / D
A in V (A ∈ V),
B, C, D in E (E = T ∪ V) (T ∩ V = ∅) (B, C, D ∈ E).
For any variable A there is exactly one rule with A to the left of =. - S in V (S ∈ V) is the start variable.

Empty

() is a metasymbol that always succeeds without consuming input.

Empty = () () / ()

Failure

f is a metasymbol that always fails without consuming input.

Failure = f f / f

Extended MPL

Since one of the goals of MPL is to create an AST, it also supports two features in terms of ease of use and speed.

Any

? is a metasymbol representing any single input like wildcard character. This succeeds if there is any input left, and fails if there is no input left.

Any = ? () / f

To extend the difinition of MPL grammar, let ? ∈ M.

All

* is a metasymbol representing All remaining input like wildcard character. This will succeed even if the remaining inputs are zero.

All = * () / f

Same as All = ? All / ().

To extend the difinition of MPL grammar, let * ∈ M.

Difference between TDPL and MPL

The biggest difference between the two grammars is the rule form. There are two rule forms in TDPL.

A..BC/D, A,B,C,D in V.
A..a, a in ∑ ∪ {ε, f}, f is a metasymbol not in ∑ and ε is the null string.

MPL, on the other hand, has one rule form.

TODO

Practice

Sequence

A <- e1 e2 rust A = e1 e2 / f

Choice

A <- e1 / e2 rust A = e1 () / e2

Zero or more

A <- e* rust A = e A / ()

Not predicate

A <- !e . rust A = B ? / f B = e * / ()

References