A parser library designed for Advent of Code.
This library mainly provides a macro, `parser!`, that lets you write
a custom parser for your [AoC] puzzle input in seconds.
For example, my puzzle input for December 2, 2015 looked like this:
4x23x21
22x29x19
11x4x11
8x10x5
24x18x16
...
The parser for this format is a one-liner: `parser!(lines(u64 "x" u64 "x" u64))`.
If you are NOT using [aoc-runner], you can use aoc-parse like this:
```rust
use aoc_parse::{parser, prelude::*};

let p = parser!(lines(u64 "x" u64 "x" u64));
assert_eq!(
    p.parse("4x23x21\n22x29x19\n").unwrap(),
    vec![(4, 23, 21), (22, 29, 19)]
);
```
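For a sense of what the one-liner replaces, here is a rough hand-written equivalent using only the standard library. It is a sketch, not how aoc-parse is implemented: `parse_boxes` is a hypothetical name, every failure collapses to `None` instead of a descriptive error, and std's `parse` also tolerates a leading `+`, which the unsigned patterns are described as not accepting.

```rust
/// Hand-written stand-in for `parser!(lines(u64 "x" u64 "x" u64))`.
/// `parse_boxes` is a hypothetical helper, not part of aoc-parse.
fn parse_boxes(text: &str) -> Option<Vec<(u64, u64, u64)>> {
    let mut boxes = Vec::new();
    let mut rest = text;
    while !rest.is_empty() {
        // Every line, including the last, must end with '\n'.
        let (line, tail) = rest.split_once('\n')?;
        let mut fields = line.split('x');
        let l = fields.next()?.parse::<u64>().ok()?;
        let w = fields.next()?.parse::<u64>().ok()?;
        let h = fields.next()?.parse::<u64>().ok()?;
        if fields.next().is_some() {
            return None; // a fourth field means the line doesn't fit the format
        }
        boxes.push((l, w, h));
        rest = tail;
    }
    Some(boxes)
}
```

Even for this simple format, the manual version has to spell out line splitting, field counting, and error handling that the pattern expresses in a single line.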
If you ARE using aoc-runner, do this instead:
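```rust
use aoc_runner_derive::*;

#[aoc_generator(day2)]
fn parse_input(text: &str) -> anyhow::Result<Vec<(u64, u64, u64)>> {
    use aoc_parse::{parser, prelude::*};
    // Assumed body: `aoc_parse` is taken to be the prelude helper for input
    // whose trailing newline has been stripped by aoc-runner.
    aoc_parse(text, parser!(lines(u64 "x" u64 "x" u64)))
}

assert_eq!(
    parse_input("4x23x21\n22x29x19").unwrap(),
    vec![(4, 23, 21), (22, 29, 19)]
);
```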
The argument you need to pass to the `parser!` macro is a pattern;
all aoc-parse does is match strings against your chosen pattern
and convert them into Rust values.
Here are the pieces that you can use in a pattern:
`i8`, `i16`, `i32`, `i64`, `i128`, `isize` - These match an integer, written out using decimal digits, with an optional `+` or `-` sign at the start, like `0` or `-11474`.
It's an error if the string contains a number too big to fit in the type you chose. For example, `parser!(i8).parse("1000")` is an error. (It matches the string, but fails during the "convert" phase.)
`u8`, `u16`, `u32`, `u64`, `u128`, `usize` - The same, but without the sign.
`i8_bin`, `i16_bin`, `i32_bin`, `i64_bin`, `i128_bin`, `isize_bin`, `u8_bin`, `u16_bin`, `u32_bin`, `u64_bin`, `u128_bin`, `usize_bin`, `i8_hex`, `i16_hex`, `i32_hex`, `i64_hex`, `i128_hex`, `isize_hex`, `u8_hex`, `u16_hex`, `u32_hex`, `u64_hex`, `u128_hex`, `usize_hex` - Match an integer in base 2 or base 16. The `_hex` parsers allow both uppercase and lowercase digits `A`-`F`.
`bool` - Matches either `true` or `false` and converts it to the corresponding `bool` value.
`alpha`, `alnum`, `upper`, `lower` - Match single characters of various categories. (These use the Unicode categories, even though Advent of Code historically sticks to ASCII.)
`digit`, `digit_bin`, `digit_hex` - Match a single ASCII character that's a digit in base 10, base 2, or base 16, respectively. The digit is converted to its numeric value, as a `usize`.
`any_char` - Match the next character, no matter what it is (like `.` in a regular expression, except that `any_char` matches newline characters).
`"x"` - A Rust string, in quotes, is a pattern that matches that exact string only. Exact patterns don't produce a value.
`pattern1 pattern2 pattern3...` - Patterns can be concatenated to form larger patterns. This is how `parser!(u64 "x" u64 "x" u64)` matches the string `4x23x21`. It simply matches each subpattern in order. It converts the match to a tuple if there are two or more subpatterns that produce values.
`parser_var` - You can use previously defined parsers that you've stored in local variables.
For example, the `amount` parser below makes use of the `fraction` parser defined on the previous line.
```rust
let fraction = parser!(i64 "/" u64);
let amount = parser!(fraction " tsp");

assert_eq!(amount.parse("1/4 tsp").unwrap(), (1, 4));
```
`string(pattern)` - Matches the given pattern, but instead of converting it to some value, simply returns the matched characters as a `String`.
By default, `alpha+` returns a `Vec<char>`, and sometimes that is handy in AoC, but often it's better to have it return a `String`.
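The note above about `parser!(i8).parse("1000")` distinguishes matching from converting. A minimal plain-Rust model of those two phases follows, with a second helper showing the digit-to-value conversion described for `digit_hex`. The function names and the `Outcome` enum are hypothetical, and the code only mirrors the behavior described in the list; aoc-parse's internals may differ.

```rust
/// Hypothetical result type separating the two phases.
#[derive(Debug, PartialEq)]
enum Outcome {
    NoMatch,      // the text is not a signed decimal integer at all
    ConvertError, // the text matched, but the value doesn't fit the type
    Value(i8),
}

/// Models an `i8` pattern: match first, then convert.
fn match_then_convert_i8(s: &str) -> Outcome {
    // Match phase: an optional '+' or '-' followed by one or more ASCII digits.
    let digits = s.strip_prefix(|c: char| c == '+' || c == '-').unwrap_or(s);
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return Outcome::NoMatch;
    }
    // Convert phase: after a successful match, parsing can only fail on overflow.
    match s.parse::<i8>() {
        Ok(value) => Outcome::Value(value),
        Err(_) => Outcome::ConvertError,
    }
}

/// Models `digit_hex`: one ASCII hex digit, converted to its value as a `usize`.
fn hex_digit_value(c: char) -> Option<usize> {
    c.to_digit(16).map(|d| d as usize)
}
```

In this model `"1000"` lands in `ConvertError` while `"12a"` lands in `NoMatch`, which is the distinction the parenthetical remark draws.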
Repeating patterns:
`pattern*` - Any pattern followed by an asterisk matches that pattern zero or more times. It converts the results to a `Vec`. For example, `parser!("A"*)` matches the strings `A`, `AA`, `AAAAAAAAAAAAAA`, and so on, as well as the empty string.
`pattern+` - Matches the pattern one or more times, producing a `Vec`. `parser!("A"+)` matches `A`, `AA`, etc., but not the empty string.
`pattern?` - Optional pattern, producing a Rust `Option`. For example, `parser!("x=" i32?)` matches `x=123`, producing `Some(123)`; it also matches `x=`, producing the value `None`.
These behave just like the `*`, `+`, and `?` special characters in regular expressions.
`repeat_sep(pattern, separator)` - Match the given pattern any number of times, separated by the separator. This converts only the bits that match pattern to Rust values, producing a `Vec`. Any parts of the string matched by separator are not converted.
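To make the shape of the results concrete, here is a plain-Rust model of two of the patterns above: `repeat_sep(i64, ",")` and `"x=" i32?`. The function names are hypothetical and the code only approximates the described behavior, with `None` standing in for a parse error.

```rust
/// Models `repeat_sep(i64, ",")`: only the items are converted, never the commas.
fn parse_comma_list(s: &str) -> Option<Vec<i64>> {
    if s.is_empty() {
        return Some(Vec::new()); // zero repetitions
    }
    // In this sketch a doubled or trailing separator yields an empty item and fails.
    s.split(',').map(|item| item.parse::<i64>().ok()).collect()
}

/// Models `"x=" i32?`: the outer `Option` says whether the text matched,
/// the inner one is the value the optional pattern produces.
fn parse_x_assignment(s: &str) -> Option<Option<i32>> {
    let rest = s.strip_prefix("x=")?;
    if rest.is_empty() {
        return Some(None);
    }
    rest.parse::<i32>().ok().map(Some)
}
```

The return types are the point here: a repetition yields a `Vec` of the item type, and an optional pattern yields an `Option` of it.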
Custom conversion:
`... (name1: pattern1) ... => expr` - On successfully matching the patterns to the left of `=>`, evaluate the Rust expression expr to convert the results to a single Rust value.
Use this to convert input to structs or enums. For example, suppose we have input that looks like `(3,66)-(27,8)` and we want to produce these structs:
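```rust
// Debug and PartialEq are needed for the assert_eq! comparison further down.
#[derive(Debug, PartialEq)]
struct Point(i64, i64);

#[derive(Debug, PartialEq)]
struct Line {
    p1: Point,
    p2: Point,
}
```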
The patterns we need are:
```rust
let point = parser!("(" (x: i64) "," (y: i64) ")" => Point(x, y));
let line_parser = parser!((p1: point) "-" (p2: point) => Line { p1, p2 });

assert_eq!(
    line_parser.parse("(3,66)-(27,8)").unwrap(),
    Line { p1: Point(3, 66), p2: Point(27, 8) },
);
```
Patterns with two or more alternatives:
`{pattern1, pattern2, ...}` - First try matching pattern1; if it matches, stop. If not, try pattern2, and so on. All the patterns must produce the same type of Rust value.
For example, `parser!({"<" => -1, ">" => 1})` either matches `<`, returning the value `-1`, or matches `>`, returning `1`.
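The try-in-order rule can be sketched in plain Rust as a list of functions that all return the same type, which is also why every alternative must produce the same type of value. The names `first_match`, `left`, `right`, and `direction` are hypothetical; this models the described behavior and is not aoc-parse's implementation.

```rust
/// Try each alternative in order and stop at the first one that matches.
fn first_match<T>(input: &str, alternatives: &[fn(&str) -> Option<T>]) -> Option<T> {
    alternatives.iter().find_map(|alt| alt(input))
}

/// Models the alternative `"<" => -1`.
fn left(s: &str) -> Option<i32> {
    if s == "<" { Some(-1) } else { None }
}

/// Models the alternative `">" => 1`.
fn right(s: &str) -> Option<i32> {
    if s == ">" { Some(1) } else { None }
}

/// Models `parser!({"<" => -1, ">" => 1})`.
fn direction(s: &str) -> Option<i32> {
    let alternatives: [fn(&str) -> Option<i32>; 2] = [left, right];
    first_match(s, &alternatives[..])
}
```

Because the slice holds a single function type, an alternative returning, say, a `String` would not fit alongside these two.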
Lines and sections:
`line(pattern)` - Matches a single line of text that matches pattern, and the newline at the end of the line. This is like `^pattern\n` in regular expressions, except `line(pattern)` will only ever match exactly one line of text, even if pattern could match more newlines.
`line(string(any_char+))` matches a line of text, strips off the newline character, and returns the rest as a `String`.
`line("")` matches a blank line.
`lines(pattern)` - Matches any number of lines of text matching pattern. Each line must be terminated by a newline, `'\n'`. Equivalent to `line(pattern)*`.
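The one-line restriction on `line(pattern)` is easy to miss, so here is a plain-Rust model of `line(string(any_char+))`. The name `one_line` is hypothetical, and the code is a sketch of the described behavior, not the library's implementation.

```rust
/// Models `line(string(any_char+))`: exactly one non-empty line ending in '\n'.
fn one_line(s: &str) -> Option<String> {
    // The terminating newline is required and is stripped from the result.
    let body = s.strip_suffix('\n')?;
    // `any_char+` needs at least one character, and `line` never spans a second line.
    if body.is_empty() || body.contains('\n') {
        return None;
    }
    Some(body.to_string())
}
```

In this model, input without a final newline is rejected, and so is input that contains two lines, even though `any_char` on its own would happily match the newline between them.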
License: MIT