This lightweight crate helps you write long regular expressions as readably as possible by spelling out the important parts verbosely, and it gives you options that make expressions easier to write and read. It works in combination with the regex crate.
It consists of three main methods and plenty of handy functions. Together with ready-made combinations of flags, special characters, and prepared patterns (website URLs, date/time formats, the French, Persian, and Chinese alphabets, etc.), they let you write long regular expressions that are easy to follow and read.
The main methods are `literal`, `list` and `group`. They are chained together, and each takes two arguments: one for an expression, the other for settings such as special characters and flags.
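To illustrate the chaining idea in isolation, here is a minimal, standard-library-only sketch. `PatternSketch`, `url_prefix`, and the plain string `suffix` parameter are hypothetical names invented for this illustration; the crate itself takes a settings value as the second argument, and its internals may differ from what is shown here.

```rust
/// Hypothetical stand-in for the chaining idea; this is NOT the crate's EasyRegex type.
#[derive(Clone, Debug, Default)]
pub struct PatternSketch(String);

impl PatternSketch {
    /// Starts with an empty pattern.
    pub fn new() -> Self {
        PatternSketch(String::new())
    }

    /// Appends the expression as-is, followed by a quantifier suffix such as "" or "{2}".
    /// (Assumption: a plain string suffix replaces the crate's settings argument.)
    pub fn literal(mut self, expression: &str, suffix: &str) -> Self {
        self.0.push_str(expression);
        self.0.push_str(suffix);
        self
    }

    /// Wraps the expression in a character class: [expression]suffix.
    pub fn list(mut self, expression: &str, suffix: &str) -> Self {
        self.0.push_str(&format!("[{}]{}", expression, suffix));
        self
    }

    /// Wraps the expression in a capturing group: (expression)suffix.
    pub fn group(mut self, expression: &str, suffix: &str) -> Self {
        self.0.push_str(&format!("({}){}", expression, suffix));
        self
    }

    /// Returns the pattern text built so far.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Chains the three methods to build the beginning of a URL pattern.
pub fn url_prefix() -> String {
    PatternSketch::new()
        .group("http|https|ftp", "")
        .literal(":", "")
        .literal("/", "{2}")
        .list("a-zA-Z0-9-.", "+")
        .as_str()
        .to_string()
}
```

Each method consumes the builder and returns it, which is what allows the calls to be chained; `url_prefix()` yields `(http|https|ftp):/{2}[a-zA-Z0-9-.]+`.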
To create a regex like
```rust
r"(?i)(?-i:Don't capture)\s(me)";
```
one would use the crate as follows:
``` rust
use easy_regex::{EasyRegex, settings::{base::DEFAULT, group::{DEFAULT_GROUP, SENSITIVE_NON_CAPTURE}}};

let text = "Don't capture ME"; // a text to be matched by our regex.
let result = EasyRegex::insensitive()
    .group("Don't capture", &SENSITIVE_NON_CAPTURE) // SENSITIVE_NON_CAPTURE refers to the
                                                    // (?-i) and (?: ...) options, which
                                                    // together make the (?-i: ...) pattern.
    .whitespace(&DEFAULT)
    .group(r"me", &DEFAULT_GROUP);

let captured_text = result.clone().get_regex().unwrap()
    .captures(text).unwrap()
    .get(1).unwrap().as_str();
assert_eq!(r"(?i)(?-i:Don't capture)\s(me)", result.get_regex().unwrap().as_str());
assert_eq!("ME", captured_text); // insensitive ME
```
For
```rust
r"^(http|https|ftp):/{2}([a-zA-Z0-9-.]+\.[a-zA-Z]{2,4})(:[0-9]+)?/?([a-zA-Z0-9-._?,'/\\+&%$#=~]*)";
```
it would be:
```rust
use easy_regex::{
    EasyRegex,
    settings::{Settings, base::*, group::*},
    collection::{ALPHA_NUMERIC, UPPER_LOWER_CASE}
};

let section_one = EasyRegex::start_of_line()
    .group(r"http|https|ftp", &DEFAULT_GROUP)
    .literal(":", &DEFAULT)
    .literal(r"/", &Settings::exactly(2));

let section_two = EasyRegex::new_section()
    .list(r"a-zA-Z0-9-.", &ONE_OR_MORE)
    .literal(r".", &DEFAULT)
    .list(&UPPER_LOWER_CASE, &Settings::range(Some(2), Some(4)))
    .into_group(&DEFAULT) // put all previous patterns of "section_two" into a group with default options,
                          // i.e. a capturing group like (previous patterns)
    .group(":[0-9]+", &OPTIONAL_GROUP)
    .literal(r"/", &OPTIONAL);

let section_three = EasyRegex::new_section()
    .literal(&ALPHA_NUMERIC, &DEFAULT)
    .literal(r"-._?,'/\+&%$#=~", &DEFAULT) // special characters need not be escaped
                                           // because of the next method, into_list.
    .into_list(&NIL_OR_MORE)
    .into_group(&DEFAULT);

let collected_sections = format!(
    "{}{}{}",
    section_one.get_regex().unwrap(),
    section_two.get_regex().unwrap(),
    section_three.get_regex().unwrap()
);

let is_result_ok = regex::RegexBuilder::new(&collected_sections).build().is_ok();
assert_eq!(true, is_result_ok);
```
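To see which part of the target expression each of the three sections is responsible for, the following standard-library-only sketch splits the target by hand and joins the pieces in the same way as the `format!` call above. The constant and function names are hypothetical, and the fragments are read off the target expression rather than taken from the crate's output, so treat the exact per-section strings as an assumption.

```rust
// Hand-split pieces of the target URL expression, one per section.
// Assumption: these are read off the target regex, not produced by the crate.

/// Scheme and the two slashes: start of line, (http|https|ftp), ":", "/" twice.
pub const SECTION_ONE: &str = r"^(http|https|ftp):/{2}";

/// Host name with top-level domain, an optional port, and an optional slash.
pub const SECTION_TWO: &str = r"([a-zA-Z0-9-.]+\.[a-zA-Z]{2,4})(:[0-9]+)?/?";

/// The rest of the URL as one capturing group around a character class.
pub const SECTION_THREE: &str = r"([a-zA-Z0-9-._?,'/\\+&%$#=~]*)";

/// Joins the three pieces in order into a single pattern string.
pub fn collect_sections() -> String {
    format!("{}{}{}", SECTION_ONE, SECTION_TWO, SECTION_THREE)
}
```

Splitting a long expression this way keeps each builder chain short, and the joined string can then be handed to the regex crate as a whole.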
The crate also provides ready-made regular expressions for complicated patterns such as website URLs, date/time formats, non-English alphabets and so on. Here are some examples.
``` rust
use easy_regex::{
    EasyRegex,
    settings::{Settings, group::DEFAULT_GROUP},
    collection::{DATE, TIME_HH_MM_24}
};

let text = r#" Feb 17 2009 5:3am 03/26/1994 8:41 23/7/2030 9:20Pm 12 Sept 2015 6:14 03-26-1994 02:18 2030/4/27 3:50 "#;
let result = EasyRegex::new_section()
    .group(DATE, &DEFAULT_GROUP) // will capture any valid format of a date.
    .literal_space()
    .group(TIME_HH_MM_24, &DEFAULT_GROUP); // will capture hours and minutes in 24-hour clock.

result
    .clone()
    .get_regex()
    .unwrap()
    .captures_iter(text)
    .for_each(|captures| println!("{}", captures.get(0).unwrap().as_str()));
// The captures will be:
// 03/26/1994 8:41
// 12 Sept 2015 6:14
// 03-26-1994 02:18
// 2030/4/27 3:50

let matched_patterns_count = result.get_regex().unwrap().captures_iter(text).count();
assert_eq!(4, matched_patterns_count);
```
There is a collection of useful regular expressions for other languages, including French.
```rust
use easy_regex::{EasyRegex, collection::FRENCH_ALPHABET, settings::base::ONE_OR_MORE};

let text = "Adélaïde Aurélie";
let result = EasyRegex::new_section().list(&FRENCH_ALPHABET, &ONE_OR_MORE);

let count = result.get_regex().unwrap().captures_iter(text).count();
assert_eq!(2, count);
```
To make life easier, there are methods for creating certain expressions, such as HTML elements, which can also have child elements. See Helpers.