CommonRegex port for Rust
Find many kinds of common information in a string.
Pull requests welcome!
Please note that this is currently English/US specific.
Install via Cargo with:

```sh
cargo install commonregex
```
Or add it to your Cargo.toml:
```toml
[package]
...

[dependencies]
commonregex = "0.1.0"
```
You can build a CommonRegex value by passing a string to CommonRegex(text) and then read the matches from its fields. To search other strings, call the equivalent function and pass the string as a parameter. You can also skip CommonRegex(text) entirely and use only the functions.
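To make the two usage styles concrete, here is a minimal stand-alone sketch of the pattern. It is not the crate's code: `Matches`, and the simplified `zip_codes` and `hex_colors` matchers, are hypothetical stand-ins written with the standard library only, and their matching rules are assumptions chosen for brevity.

```rust
/// Hypothetical stand-in: maximal digit runs that are exactly five digits long.
/// (An assumption for illustration, not the crate's real pattern.)
fn zip_codes(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_ascii_digit())
        .filter(|run| run.len() == 5)
        .map(String::from)
        .collect()
}

/// Hypothetical stand-in: `#` followed by exactly three or six hex digits.
/// (An assumption for illustration, not the crate's real pattern.)
fn hex_colors(text: &str) -> Vec<String> {
    // Split the text into runs made only of '#' and hex digits,
    // then keep the runs shaped like "#abc" or "#aabbcc".
    text.split(|c: char| c != '#' && !c.is_ascii_hexdigit())
        .filter(|run| {
            run.strip_prefix('#').map_or(false, |digits| {
                (digits.len() == 3 || digits.len() == 6)
                    && digits.chars().all(|c| c.is_ascii_hexdigit())
            })
        })
        .map(String::from)
        .collect()
}

/// Hypothetical container: every field is filled once, at construction,
/// by calling the free function of the same name.
#[derive(Debug)]
struct Matches {
    zip_codes: Vec<String>,
    hex_colors: Vec<String>,
}

impl Matches {
    fn new(text: &str) -> Self {
        Matches {
            zip_codes: zip_codes(text),
            hex_colors: hex_colors(text),
        }
    }
}
```

With these definitions, `Matches::new(text).zip_codes` and `zip_codes(text)` return the same vector for the same text. That is the relationship between a field and its equivalent function that this sketch is meant to illustrate.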
Available fields and their equivalent functions:
dates(text: &str)
times(text: &str)
phones(text: &str)
phones_with_exts(text: &str)
links(text: &str)
emails(text: &str)
ips(text: &str)
ipv6s(text: &str)
prices(text: &str)
hex_colors(text: &str)
credit_cards(text: &str)
visas(text: &str)
mastercards(text: &str)
btc_addresses(text: &str)
street_addresses(text: &str)
zip_codes(text: &str)
po_boxs(text: &str)
ssns(text: &str)
md5s(text: &str)
sha1s(text: &str)
sha2s(text: &str)
guids(text: &str)
isbn13s(text: &str)
isbn10s(text: &str)
mac_addresses(text: &str)
ibans(text: &str)
gitrepos(text: &str)
CommonRegex(text: &str)
parse(regex: &str, text: &str)
```rust
// A trailing `\` continues the string literal and skips the next line's indentation.
let text = "John, please get that article on www.linkedin.com to me by 5:00PM\n\
            on Jan 9th 2012. 4:00 would be ideal, actually. If you have any questions,\n\
            you can reach my associate at (012)-345-6789 or associative@mail.com.\n\
            I'll be on UK during the whole week on a J.R.R. Tolkien convention.";

let parsed = commonregex::CommonRegex(text);

println!("{:?}", parsed);
/* prints (sample dump; its phone, link, and email values do not come from `text` above)
CommonRegex { dates: ["Jan 9th 2012"], times: ["5:00PM", "4:00 "],
  phones: ["(519)-236-2723"], phoneswithexts: ["(519)-236-2723x341"],
  links: ["www.linkedin.com", "harold.smith@gmail.com"],
  emails: ["harold.smith@gmail.com"], ipv4s: [], ipv6s: [], prices: [],
  hexcolors: ["201", "dea", "eac", "519", "236", "272", "341"],
  creditcards: [], visas: [], mastercards: [], btcaddresses: [],
  streetaddresses: [], zipcodes: [], poboxs: [], ssns: [], md5s: [],
  sha1s: [], sha2s: [], guids: [], isbn13s: [], isbn10s: [],
  mac_addresses: [], ibans: [], gitrepos: [] }
*/

println!("{:?}", parsed.dates);  // prints ["Jan 9th 2012"]
println!("{:?}", parsed.times);  // prints ["5:00PM", "4:00"]
println!("{:?}", parsed.phones); // prints ["(012)-345-6789"]
println!("{:?}", parsed.links);  // prints ["www.linkedin.com"]
println!("{:?}", parsed.emails); // prints ["associative@mail.com"]
```
Alternatively, you can skip building a CommonRegex value and call the functions directly on each piece of text you want to parse.
```rust
println!("{:?}", commonregex::times("When are you free? Do you want to meet up for coffee at 4:00?"));
// prints ["4:00"]

println!("{:?}", commonregex::prices("They said the price was $5,000.90, actually it is $3,900.5. It's $1100.4 less, can you imagine this?"));
// prints ["$5,000.90", "$3,900.5", "$110"]

println!("{:?}", commonregex::ipv6s("The IPv6 address for localhost is 0:0:0:0:0:0:0:1, or alternatively, ::1."));
// prints ["0:0:0:0:0:0:0:1", "::1"]
```
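When the same function has to run over several pieces of text, a small helper can map it across them. The sketch below is only an illustration under stated assumptions: `scan_segments` is a hypothetical helper, `times` is a simplified standard-library stand-in for the crate's function of the same name (it recognizes only bare `H:MM` or `HH:MM` tokens), and the `Vec<String>` return type is an assumption, since the crate's actual signature is not shown here.

```rust
/// Hypothetical stand-in for a matcher such as `commonregex::times`.
/// Keeps whitespace-separated tokens that, once non-digits are trimmed from
/// both ends, look like `H:MM` or `HH:MM`. NOT the crate's real pattern.
fn times(text: &str) -> Vec<String> {
    text.split_whitespace()
        .map(|tok| tok.trim_matches(|c: char| !c.is_ascii_digit()))
        .filter(|tok| match tok.split_once(':') {
            Some((h, m)) => {
                (1..=2).contains(&h.len())
                    && m.len() == 2
                    && h.chars().all(|c| c.is_ascii_digit())
                    && m.chars().all(|c| c.is_ascii_digit())
            }
            None => false,
        })
        .map(String::from)
        .collect()
}

/// Hypothetical helper: applies `finder` to every segment and returns one
/// result vector per segment, in the same order as the input.
fn scan_segments<F>(segments: &[&str], finder: F) -> Vec<Vec<String>>
where
    F: Fn(&str) -> Vec<String>,
{
    segments.iter().map(|seg| finder(*seg)).collect()
}
```

Because `scan_segments` accepts any function or closure of the shape `Fn(&str) -> Vec<String>`, the stand-in could be swapped for a real matcher with a compatible signature, or wrapped in a closure that converts its result.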
There are CommonRegex ports for other languages; see here.