Gobble is a simple parser combinator system for parsing strings.
*Note: It works well, but it is still under heavy development, so the API may change significantly between versions. If the 'b' in "0.b.c" changes, there will be breaking changes, though I believe I'm now close to settling on the API.
I'm very open to receiving feedback on GitHub.*
Creating parsers in Rust should be quite straightforward. For example, parsing a function call:
```rust
use gobble::*;
parser!{
    (Ident->String)
    string((Alpha.one(),(Alpha,NumDigit,'_').istar()))
}

parser!{
    (FSig->(String,Vec
```
If you'd prefer not to use macros, you don't have to:
```rust
use gobble::*;
let ident = || string((Alpha.one(),(Alpha,NumDigit,'_').istar()));

let fsig = (first(ident(),"("),sep_until_ig(ident(),",",")"));

let (nm, args) = fsig.parse_s("loadFile1(fname,ref)").unwrap();
assert_eq!(nm, "loadFile1");
assert_eq!(args, vec!["fname", "ref"]);
//identifiers can't start with numbers
assert!(fsig.parse_s("23file(fname,ref)").is_err());
```
But the macros guarantee zero-sized types, which is nice when combining them.
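As a rough illustration of why zero-sized types are convenient to combine, here is a minimal sketch using only the standard library. `Ident`, `Comma` and `Repeat` are hypothetical stand-ins, not gobble types. The point is only that a tuple of unit structs occupies no memory, while a combinator that carries data does.

```rust
use std::mem::size_of;

// Hypothetical stand-ins for parsers declared as unit structs.
pub struct Ident;
pub struct Comma;

// A tuple that combines only unit structs has nothing to store.
pub fn combined_size() -> usize {
    size_of::<(Ident, Comma, Ident)>()
}

// For comparison: a hypothetical combinator that carries data.
pub struct Repeat {
    pub min: usize,
}

// The tuple now needs room for the `usize` inside `Repeat`.
pub fn stateful_size() -> usize {
    size_of::<(Repeat, Comma)>()
}
```

Here `combined_size()` is zero however many unit structs are combined, whereas `stateful_size()` is the size of the stored `usize`.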
To work, this library depends on the following:
```rust
pub enum ParseError {
    //...
}

// In the Ok case the values mean
//   LCChars = copy of original, but moved forward,
//   V = The resulting type
//   Option

//implements Iterator and can be cloned relatively cheaply
pub struct LCChars<'a>{
    it:std::str::Chars<'a>,
    line:usize,
    col:usize,
}

pub trait Parser
    //...helper methods
}
pub trait CharBool {
    fn char_bool(&self,c:char)->bool;
    //....helper methods
}
```
Parser is automatically implemented for:
* Fn<'a>(&LCChars<'a>)->ParseRes<'a,String>
* &'static str -- which will return itself if it matches
* char -- which will return itself if it matches the next char
* Tuples of up to 6 parsers -- returning a tuple of the results of all the parsers, matched one after the other
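The automatic implementations listed above can be pictured with a small standalone sketch. It assumes a simplified trait, `MiniParser`, that works on `&str` instead of `LCChars` and reports errors as a `String`. These names are made up for illustration and are not gobble's API.

```rust
// A simplified parser trait (hypothetical): on success it returns the
// remaining input together with the parsed value.
pub trait MiniParser {
    type Out;
    fn parse_prefix<'a>(&self, input: &'a str) -> Result<(&'a str, Self::Out), String>;
}

// A char matches itself at the start of the input and returns itself.
impl MiniParser for char {
    type Out = char;
    fn parse_prefix<'a>(&self, input: &'a str) -> Result<(&'a str, Self::Out), String> {
        match input.strip_prefix(*self) {
            Some(rest) => Ok((rest, *self)),
            None => Err(format!("expected '{}'", self)),
        }
    }
}

// A string literal matches itself at the start of the input and returns itself.
impl MiniParser for &'static str {
    type Out = &'static str;
    fn parse_prefix<'a>(&self, input: &'a str) -> Result<(&'a str, Self::Out), String> {
        match input.strip_prefix(*self) {
            Some(rest) => Ok((rest, *self)),
            None => Err(format!("expected \"{}\"", self)),
        }
    }
}

// A pair of parsers runs one after the other and returns both results.
impl<A: MiniParser, B: MiniParser> MiniParser for (A, B) {
    type Out = (A::Out, B::Out);
    fn parse_prefix<'a>(&self, input: &'a str) -> Result<(&'a str, Self::Out), String> {
        let (rest, a) = self.0.parse_prefix(input)?;
        let (rest, b) = self.1.parse_prefix(rest)?;
        Ok((rest, (a, b)))
    }
}
```

Because a tuple of parsers is itself a parser, a value such as `("let", ' ')` can be used anywhere a single parser is expected, and tuples can be nested.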
Most of the time a parser can be built simply by combining other parsers:

```rust
use gobble::*;

// map can be used to convert one result to another
// keyval is now a function that returns a parser
let keyval = || (common::Ident,":",common::Quoted).map(|(a,_,c)|(a,c));

//this can also be written as below for better type safety
fn keyval2()->impl Parser

// or as a macro: KeyVal is now a struct like:
// pub struct KeyVal;
parser!{
    (KeyVal->(String,String))
    (common::Ident,":",common::Quoted).map(|(a,_,c)|(a,c))
}

//parse_s is a helper on Parsers
let (k,v) = keyval().parse_s(r#"car:"mini""#).unwrap();
assert_eq!(k,"car");
assert_eq!(v,"mini");

//this can now be combined with other parsers.
// 'ig_then' combines 2 parsers and drops the result of the first
// 'then_ig' drops the result of the second
// 'sep_until_ig' will repeat the first term into a Vec, separated by the second
//    until the final term.
let obj = || "{".ig_then(sep_until_ig(keyval(),",","}"));

let obs = obj().parse_s(r#"{cat:"Tiddles",dog:"Spot"}"#).unwrap();
assert_eq!(obs[0],("cat".to_string(),"Tiddles".to_string()));
```
## CharBool

CharBool is the trait for boolean char checks. It is auto implemented for:
* Fn(char)->bool
* char -- returns true if the input matches the char
* &'static str -- returns true if the str contains the input
* several zero size types - Alpha,NumDigit,HexDigit,WS,WSL,Any
* Tuples of up to 6 CharBools -- returning true if any of the members succeed

This means you can combine them in tuples.

CharBool also provides several helper methods which each return a parser, and a helper that returns a CharBool.

```rust
use gobble::*;
let id = (Alpha,"*").min_n(4).parse_s("sm*shinggame+you").unwrap();
assert_eq!(id,"sm*shinggame");

// not enough matches
assert!((NumDigit,"abc").min_n(4).parse_s("23fflr").is_err());

// star succeeds even with no matches, equivalent to min_n(0) but "Zero Size"
assert_eq!((NumDigit,"abc").star().parse_s("23fflr"),Ok("23".to_string()));
assert_eq!((NumDigit,"abc").star().parse_s("fflr"),Ok("".to_string()));
```

## White Space

White space is pretty straightforward to handle:

```rust
use gobble::*;
let my_ws = || " \t".star();
// middle takes three parsers and returns the result of the middle
// this could also be done easily with 'map' or 'then_ig'
let mys = |p| middle(my_ws(),p,my_ws());

let spid = mys(common::Ident);
let v = spid.parse_s(" \t doggo ").unwrap();
assert_eq!(v,"doggo");
```
## Recursive Structures

Some structures, like JSON or programming languages, need to be able to handle recursion. However, with the techniques we have used so far this would lead to infinitely sized structures. The way to handle this is to make sure one member of the loop is not built into the structure:

```rust
use gobble::*;
#[derive(Debug,PartialEq)]
enum Expr {
    Val(isize),
    Add(Box<Expr>,Box<Expr>),
    Paren(Box<Expr>),
}

fn expr_l()->impl Parser

// using the full fn def we avoid the recursive structure
fn expr<'a>(it:&LCChars<'a>)->ParseRes<'a,Expr> {
    //note that expr_l has brackets but expr doesn't.
    //expr is a reference to a static function
    let p = (expr_l(),maybe(s_("+").ig_then(expr)))
        .map(|(l,opr)|match opr{
            Some(r)=>Expr::Add(Box::new(l),Box::new(r)),
            None=>l,
        });
    p.parse(it)
}

let r = expr.parse_s("45 + (34+3 )").unwrap();

//recursive structures are never fun to write manually
assert_eq!(r,Expr::Add(
    Box::new(Expr::Val(45)),
    Box::new(Expr::Paren(Box::new(Expr::Add(
        Box::new(Expr::Val(34)),
        Box::new(Expr::Val(3))
    ))))
));
```

Note that Parser output is now a trait associated type (`Out`).
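To make the recursion idea more concrete, here is a standalone sketch that does not use gobble at all. `Bracketed`, `Res`, `ExprFn` and `int` are hypothetical names, and parsing is done directly on `&str`. The combinator type `Bracketed<P>` embeds the type of its inner parser, so the recursive member is passed in as a plain function pointer, which keeps the overall type finite.

```rust
#[derive(Debug, PartialEq)]
pub enum Expr {
    Val(isize),
    Add(Box<Expr>, Box<Expr>),
    Paren(Box<Expr>),
}

pub type Res<'a> = Result<(&'a str, Expr), String>;

// Combinator for "(" inner ")". Its type contains the type of `inner`.
pub struct Bracketed<P>(pub P);

impl<P> Bracketed<P>
where
    P: for<'a> Fn(&'a str) -> Res<'a>,
{
    pub fn parse<'a>(&self, input: &'a str) -> Res<'a> {
        let rest = input
            .trim_start()
            .strip_prefix('(')
            .ok_or_else(|| "expected '('".to_string())?;
        let (rest, inner) = (self.0)(rest)?;
        let rest = rest
            .trim_start()
            .strip_prefix(')')
            .ok_or_else(|| "expected ')'".to_string())?;
        Ok((rest, Expr::Paren(Box::new(inner))))
    }
}

// Parses an unsigned run of digits into Expr::Val.
fn int(input: &str) -> Res<'_> {
    let input = input.trim_start();
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    let n: isize = input[..end]
        .parse()
        .map_err(|_| "expected integer".to_string())?;
    Ok((&input[end..], Expr::Val(n)))
}

// The recursive member is referred to through a function pointer type.
type ExprFn = for<'a> fn(&'a str) -> Res<'a>;

// `expr` is handed over as a function pointer, so the returned type is
// just Bracketed<ExprFn> and never has to contain itself.
fn expr_l() -> Bracketed<ExprFn> {
    Bracketed(expr as ExprFn)
}

pub fn expr(input: &str) -> Res<'_> {
    // Try a bracketed expression first, then fall back to an integer.
    let (rest, left) = expr_l().parse(input).or_else(|_| int(input))?;
    match rest.trim_start().strip_prefix('+') {
        Some(after_plus) => {
            let (rest, right) = expr(after_plus)?;
            Ok((rest, Expr::Add(Box::new(left), Box::new(right))))
        }
        None => Ok((rest, left)),
    }
}
```

If `expr_l` instead had to name the full combinator type of `expr`, that type would need to include `Bracketed<...>` of itself, which is the infinitely sized structure described above.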
Combining CharBools in a tuple, `(Alpha,NumDigit,"_").char_bool(c)` will be true if any of them match.

The CharBool helper methods that each return a parser:
* `one(self)` matches and returns exactly 1 character
* `plus(self)` '+' requires at least 1 match and returns a string
* `min_n(self,n:usize)` requires at least n matches and returns a string
* `star(self)` '*' matches any number of chars returning a string
* `exact(self,n:usize)` matches exactly n chars returning a string
* `iplus(self)` '+' requires at least 1 match and returns a ()
* `istar(self)` '*' matches any number of chars returning a ()
* `iexact(self,n:usize)` matches exactly n chars returning a ()

The helper that returns a CharBool:
* `except(self,cb:CharBool)` passes if self does, and cb doesn't
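The behaviour of these checks and helpers can be approximated in a few lines of standalone Rust. This is only a sketch under assumptions: `CharCheck`, `Except` and `take_min_n` are hypothetical names, the local `Alpha` is a stand-in defined here, and nothing below is gobble's actual implementation.

```rust
// A simplified boolean char check (hypothetical, not gobble's CharBool).
pub trait CharCheck {
    fn check(&self, c: char) -> bool;
}

// A char accepts exactly itself.
impl CharCheck for char {
    fn check(&self, c: char) -> bool {
        *self == c
    }
}

// A string accepts any char it contains.
impl CharCheck for &'static str {
    fn check(&self, c: char) -> bool {
        self.contains(c)
    }
}

// Stand-in zero-sized check for ASCII letters.
pub struct Alpha;
impl CharCheck for Alpha {
    fn check(&self, c: char) -> bool {
        c.is_ascii_alphabetic()
    }
}

// A pair accepts a char if either member does.
impl<A: CharCheck, B: CharCheck> CharCheck for (A, B) {
    fn check(&self, c: char) -> bool {
        self.0.check(c) || self.1.check(c)
    }
}

// `except`-style wrapper: passes if the first does and the second doesn't.
pub struct Except<A, B>(pub A, pub B);
impl<A: CharCheck, B: CharCheck> CharCheck for Except<A, B> {
    fn check(&self, c: char) -> bool {
        self.0.check(c) && !self.1.check(c)
    }
}

// `min_n`-style helper: takes the longest accepted prefix and requires it
// to hold at least `n` chars. Returns (matched, rest) on success.
pub fn take_min_n<'a, C: CharCheck>(cb: &C, n: usize, input: &'a str) -> Option<(&'a str, &'a str)> {
    let end = input
        .char_indices()
        .find(|&(_, c)| !cb.check(c))
        .map(|(i, _)| i)
        .unwrap_or(input.len());
    if input[..end].chars().count() >= n {
        Some((&input[..end], &input[end..]))
    } else {
        None
    }
}
```

With `n` set to 0 the helper behaves like `star`: it succeeds even when nothing matches and returns an empty string.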
```rust
use gobble::*;
let s = |c| c > 'w' || c == 'z';
let xv = s.one().parse_s("xhello").unwrap();
assert_eq!(xv,'x');
```

For white space, gobble already provides `WS` and `s_(p)`:

```rust
use gobble::*;
//eoi = end of input
let p = repeat_until_ig(s_("abc".plus()),eoi);
let r = p.parse_s("aaa \tbbb bab").unwrap();
assert_eq!(r,vec!["aaa","bbb","bab"]);
```
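For recursive structures, the member of the loop that is not built into the structure is instead created using the 'Fn' form, or with a macro, which is certain to return a zero-sized struct. In the recursive example, `Expr` needs `#[derive(Debug,PartialEq)]` for the final `assert_eq!`, and `expr` has to finish by returning `p.parse(it)`.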
## Changelog

v 0.5.3

v 0.5.2

v 0.5.1

v 0.5.0

v 0.4.4:
* `skip_star(p)`
* `skip_plus(p)`
* `skip_exact(p,n)`

v 0.4.3:
* `string<A:Parser>(a:A)->impl Parser<String>` to create a parser that reads the internal parser but returns the whole string it matched on

v 0.4.2:

v 0.4.1:

v 0.4.0:

v 0.3.0: Breaking Changes
* Use `impl Parser<Out=V>` instead of `impl Parser<V>` and most things should work
* read_fs removed - use CharBool.min_n(usize) instead
* Esc removed - see common::common_str for how to handle escapes

v 0.2.1 :

v 0.2.0 -- Major update:

v 0.1.6:
* `one_char(&str)` Parser to check the next char is a member of that.

v 0.1.5 :

v 0.1.4:
* `common_int` and `common_bool` parsers

v 0.1.3:

v 0.1.2 :
* `sep_until(main,sep,close)`
* `repeat_until(main,close)`

v 0.1.1 :
* `eoi` and `to_end()` functions for making sure you have the end of the input
* `common_str()` for getting the most common form of string