This is minimal parser combinator of Minimal Parsing Language (MPL) like Top-Down Parsing Language (TDPL). It creates a abstract syntax tree (AST) for each input.
Variable
HashMap
minimal_parse()
Input
[T]
and str
by defaultPosition
u*
, i*
, and f*
by defaultSpan
StartAndLenSpan
by defaultTerminal
SliceTerminal
, StrTerminal
, and U8SliceTerminal
by defaultOutput
()
by defaultRules
HashMap
by defaultParse
[T]
, str
, and [u8]
by default```rust use crate::ParenthesesVariable::; use mpl::parse::Parse; use mpl::rules::{RightRule, RightRuleKind::}; use mpl::span::StartAndLenSpan; use mpl::symbols::{StrTerminal::*, Variable}; use mpl::trees::AST; use std::collections::HashMap;
enum ParenthesesVariable { Open, Parentheses, Close, }
impl Variable for ParenthesesVariable {}
///
/// Open = '(' Parentheses / ()
/// Parentheses = Open Close / f
/// Close = ")" Open / f
///
fn main() {
let mut rules = HashMap::new();
rules.insert(
Open,
RightRule::from_right_rule_kind((T(Char('(')), V(Parentheses)), Empty),
);
rules.insert(
Parentheses,
RightRule::from_right_rule_kind((V(Open), V(Close)), Failure),
);
rules.insert(
Close,
RightRule::from_right_rule_kind((T(Str(")")), V(Open)), Failure),
);
let input = "(()(()))";
// all of the span
let all_of_the_span = StartAndLenSpan::<u32, u16>::from_start_len(0, input.len() as u16);
let result: Result<
AST<ParenthesesVariable, StartAndLenSpan<u32, u16>, ()>,
AST<ParenthesesVariable, StartAndLenSpan<u32, u16>, ()>,
> = input.minimal_parse(&rules, &Open, &all_of_the_span);
if let Ok(ast) = result {
println!("{}", ast);
}
} ```
A MPL grammar G
is a tuple G = (V, Σ, R, S)
in which:
- V
is a finite set of variables.
- Σ
is a finite set of original terminal symbols.
- T
is an union of Σ
or M
(Σ ∪ M) (M
(= {(), f}) is a finite set of metasymbols).
- R
is a finite set of rules of the form
- A = B C / D
A in V (A ∈ V),
B, C, D in E (E = T ∪ V) (T ∩ V = ∅) (B, C, D ∈ E).
For any variable A there is exactly one rule with A to the left of =
.
- S in V (S ∈ V) is the start variable.
()
is a metasymbol that always succeeds without consuming input.
Empty = () () / ()
f
is a metasymbol that always fails without consuming input.
Failure = f f / f
Since one of the goals of MPL is to create an AST, it also supports two features in terms of ease of use and speed.
?
is a metasymbol representing any single input like wildcard character. This succeeds if there is any input left, and fails if there is no input left.
Any = ? () / f
To extend the difinition of MPL grammar, let ? ∈ M.
*
is a metasymbol representing All remaining input like wildcard character. This will succeed even if the remaining inputs are zero.
All = * () / f
Same as All = ? All / ()
.
To extend the difinition of MPL grammar, let * ∈ M.
The biggest difference between the two grammars is the rule form. There are two rule forms in TDPL.
A..BC/D
, A,B,C,D in V.
A..a
, a in ∑ ∪ {ε, f}, f is a metasymbol not in ∑ and ε is the null string.
MPL, on the other hand, has one rule form.
A <- e1 e2
rust
A = e1 e2 / f
A <- e1 / e2
rust
A = e1 () / e2
A <- e*
rust
A = e A / ()
A <- !e .
rust
A = B ? / f
B = e * / ()