This lightweight crate was created to make long regular expressions as readable as possible by letting you write them in a verbose, explicit style.
It consists of a few main methods and plenty of handy functions. You can build a long regex with them, along with ready-made settings and prepared patterns for English, French, German, Persian, Chinese and Arabic.
To create a regex like

```rust
r"(?i)(?-i:Don't capture)\s(me)";
```

one would use this crate as follows:
``` rust
use easy_regex::{EasyRegex, settings::{base::DEFAULT, group::{DEFAULT_GROUP, SENSITIVE_NON_CAPTURE}}};

let text = "Don't capture ME"; // a text to be matched by our regex.

let result = EasyRegex::insensitive()
    .group("Don't capture", &SENSITIVE_NON_CAPTURE) // SENSITIVE_NON_CAPTURE combines the
                                                    // (?-i) and (?: ...) options, which
                                                    // together make the (?-i: ...) pattern.
    .literal(r"\s", &DEFAULT)
    .group(r"me", &DEFAULT_GROUP);

let captured_text = result.clone().get_regex().unwrap()
    .captures(text).unwrap().get(1).unwrap().as_str();

assert_eq!(r"(?i)(?-i:Don't capture)\s(me)", result.get_regex().unwrap().as_str());
assert_eq!("ME", captured_text); // insensitive ME
```
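To make the idea of chaining more concrete, here is a minimal sketch of how a builder of this kind could assemble a pattern string one fragment at a time. It uses only the standard library, and `PatternSketch`, its methods, and `build_example` are hypothetical names invented for illustration; this is not the crate's actual implementation or API.

```rust
/// Hypothetical stand-in for a chained regex builder (illustration only,
/// not the crate's code).
#[derive(Clone, Debug)]
struct PatternSketch {
    pattern: String,
}

impl PatternSketch {
    /// Start a pattern with the case-insensitive flag `(?i)`.
    fn insensitive() -> Self {
        PatternSketch { pattern: String::from("(?i)") }
    }

    /// Append a fragment unchanged.
    fn literal(mut self, fragment: &str) -> Self {
        self.pattern.push_str(fragment);
        self
    }

    /// Append a capturing group: `(fragment)`.
    fn group(mut self, fragment: &str) -> Self {
        self.pattern.push('(');
        self.pattern.push_str(fragment);
        self.pattern.push(')');
        self
    }

    /// Append a case-sensitive, non-capturing group: `(?-i:fragment)`.
    fn sensitive_non_capture(mut self, fragment: &str) -> Self {
        self.pattern.push_str("(?-i:");
        self.pattern.push_str(fragment);
        self.pattern.push(')');
        self
    }

    /// Consume the builder and return the assembled pattern.
    fn into_string(self) -> String {
        self.pattern
    }
}

/// Rebuilds the pattern from the example above, one fragment per call.
fn build_example() -> String {
    PatternSketch::insensitive()
        .sensitive_non_capture("Don't capture")
        .literal(r"\s")
        .group("me")
        .into_string()
}
```

Calling `build_example()` yields the same pattern string that the assertion above compares against. Each method takes `self` by value and returns it, which is what lets the calls chain.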
There is a collection of useful regular expressions for other languages, including French.

```rust
use easy_regex::{EasyRegex, collection::FRENCH_ALPHABET, settings::base::ONE_OR_MORE};

let text = "Adélaïde Aurélie";
let result = EasyRegex::new_section().list(&FRENCH_ALPHABET, &ONE_OR_MORE);

let count = result.get_regex().unwrap().captures_iter(text).count();
assert_eq!(2, count);
```
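For intuition about what that count means, the following standard-library sketch counts maximal runs of alphabetic characters, which is roughly what a "one or more alphabet characters" pattern matches. Note that it swaps in a different technique: it relies on `char::is_alphabetic` (any Unicode letter) rather than the crate's French character class, and `count_alphabetic_runs` is a hypothetical helper name.

```rust
/// Counts maximal runs of alphabetic characters in `text`.
/// Assumption: "alphabetic" means any Unicode letter, which is broader
/// than a French-only character class.
fn count_alphabetic_runs(text: &str) -> usize {
    text.split(|c: char| !c.is_alphabetic())
        .filter(|run| !run.is_empty()) // skip empty pieces between separators
        .count()
}
```

For the text used above, the two names are separated by a single space, so the function finds two runs.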
And for a long one:

```rust
r"^(http|https|ftp):[/]{2}([a-zA-Z0-9-.]+\.[a-zA-Z]{2,4})(:[0-9]+)?/?([a-zA-Z0-9-._?,'/\\+&%$#=~]*)";
```

It would be:
```rust
use easy_regex::{
    EasyRegex,
    settings::{
        Settings,
        base::{DEFAULT, OPTIONAL, NIL_OR_MORE, ONE_OR_MORE},
        group::{DEFAULT_GROUP, OPTIONAL_GROUP},
    },
    collection::{ALPHA_NUMERIC, UPPER_LOWER_CASE},
};

let section_one = EasyRegex::start_of_line()
    .group(r"http|https|ftp", &DEFAULT_GROUP)
    .literal(":", &DEFAULT)
    .list(r"/", &Settings::exactly(2));

let section_two = EasyRegex::new_section()
    .list(r"a-zA-Z0-9-.", &ONE_OR_MORE)
    .literal(r"\.", &DEFAULT)
    .list(&UPPER_LOWER_CASE, &Settings::range(Some(2), Some(4)))
    .into_group(&DEFAULT) // put all previous patterns of "section_two" into a group with
                          // default options, i.e. a capturing group like (previous patterns)
    .group(":[0-9]+", &OPTIONAL_GROUP)
    .literal(r"/", &OPTIONAL);

let section_three = EasyRegex::new_section()
    .literal(&ALPHA_NUMERIC, &DEFAULT)
    .literal(r"-._?,'/\\+&%$#=~", &DEFAULT) // special characters need not be escaped
                                            // due to the next method, into_list.
    .into_list(&NIL_OR_MORE)
    .into_group(&DEFAULT);

// Join the three sections into a single pattern string.
let collected_sections = format!(
    "{}{}{}",
    section_one.get_regex().unwrap(),
    section_two.get_regex().unwrap(),
    section_three.get_regex().unwrap()
);

let is_result_ok = regex::RegexBuilder::new(&collected_sections).build().is_ok();
assert_eq!(true, is_result_ok);
```
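The settings in the example above, such as an exact count or a range, end up as ordinary regex quantifiers. A minimal sketch of that mapping, under the assumption that each setting simply renders to a suffix, might look like the following. The `Repeat` enum and the `quantifier`, `list`, and `first_section` functions are hypothetical names for illustration and are not part of the crate.

```rust
/// Hypothetical repetition settings (illustrative names only).
#[derive(Clone, Copy, Debug)]
enum Repeat {
    Once,
    Optional,
    NilOrMore,
    OneOrMore,
    Exactly(u8),
    Range(Option<u8>, Option<u8>),
}

/// Renders a repetition setting as a regex quantifier suffix.
fn quantifier(repeat: Repeat) -> String {
    match repeat {
        Repeat::Once => String::new(),
        Repeat::Optional => "?".to_owned(),
        Repeat::NilOrMore => "*".to_owned(),
        Repeat::OneOrMore => "+".to_owned(),
        // `{{` and `}}` are escaped braces in format strings.
        Repeat::Exactly(n) => format!("{{{}}}", n),
        Repeat::Range(min, max) => {
            // Assumption: a missing lower bound is rendered as 0.
            let min = min.unwrap_or(0);
            match max {
                Some(max) => format!("{{{},{}}}", min, max),
                None => format!("{{{},}}", min),
            }
        }
    }
}

/// Wraps `chars` in a character class and appends the quantifier.
fn list(chars: &str, repeat: Repeat) -> String {
    format!("[{}]{}", chars, quantifier(repeat))
}

/// Rebuilds the first section of the URL pattern by plain concatenation.
fn first_section() -> String {
    format!("^({}):{}", "http|https|ftp", list("/", Repeat::Exactly(2)))
}
```

Here `first_section()` produces the opening part of the long URL pattern, and further sections can be appended to it with `format!` in the same way the example joins its three sections.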