A tiny library to efficiently search strings for substrings or sets of ASCII characters.

```rust
use jetscii::AsciiChars;

let mut search = AsciiChars::new();
search.push(b'-');
search.push(b':');
let part_number = "86-J52:rev1";
let parts: Vec<_> = part_number.split(search.with_fallback(|c| {
    c == b'-' || c == b':'
})).collect();
assert_eq!(&parts, &["86", "J52", "rev1"]);
```

```rust
use jetscii::Substring;

let colors: Vec<_> = "red, blue, green".split(Substring::new(", ")).collect();
assert_eq!(&colors, &["red", "blue", "green"]);
```

We use a particular set of x86-64 SSE 4.2 instructions (`PCMPESTRI` and `PCMPESTRM`) to gain great speedups. This method stays fast even when searching for a character in a set of up to 16 choices.

When the `PCMPxSTRx` instructions are not available, we fall back to reasonably fast but universally-supported methods.
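
The detect-then-fall-back pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's actual implementation: the names `find_any_byte`, `find_any_sse42`, and `find_any_fallback` are hypothetical, and the SIMD path copies each 16-byte chunk into a buffer for simplicity rather than for speed. It assumes the set holds at most 16 bytes, matching the limit mentioned above.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m128i, _mm_cmpestri, _mm_loadu_si128, _SIDD_CMP_EQUAL_ANY};

/// Portable fallback: scan byte by byte (hypothetical helper).
fn find_any_fallback(haystack: &[u8], needles: &[u8]) -> Option<usize> {
    haystack.iter().position(|b| needles.contains(b))
}

/// SSE 4.2 path built on PCMPESTRI (hypothetical helper).
/// Caller must ensure SSE 4.2 is available and `needles.len() <= 16`.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse4.2")]
unsafe fn find_any_sse42(haystack: &[u8], needles: &[u8]) -> Option<usize> {
    // Pack the byte set into one 128-bit register; the explicit length
    // passed below tells the instruction to ignore the zero padding.
    let mut set = [0u8; 16];
    set[..needles.len()].copy_from_slice(needles);
    let set_v = unsafe { _mm_loadu_si128(set.as_ptr() as *const __m128i) };

    let mut offset = 0;
    for chunk in haystack.chunks(16) {
        // Copy into a fixed buffer so a short final chunk can be loaded safely.
        let mut buf = [0u8; 16];
        buf[..chunk.len()].copy_from_slice(chunk);
        let hay_v = unsafe { _mm_loadu_si128(buf.as_ptr() as *const __m128i) };

        // "Equal any" mode: index of the first haystack byte that matches
        // any byte of the set, or 16 when nothing in this chunk matches.
        let idx = unsafe {
            _mm_cmpestri::<{ _SIDD_CMP_EQUAL_ANY }>(
                set_v,
                needles.len() as i32,
                hay_v,
                chunk.len() as i32,
            )
        };
        if idx < 16 {
            return Some(offset + idx as usize);
        }
        offset += 16;
    }
    None
}

/// Picks the SIMD path at runtime when possible, otherwise the fallback.
pub fn find_any_byte(haystack: &[u8], needles: &[u8]) -> Option<usize> {
    #[cfg(target_arch = "x86_64")]
    {
        if needles.len() <= 16 && is_x86_feature_detected!("sse4.2") {
            // Safe to call: the feature check above just succeeded.
            return unsafe { find_any_sse42(haystack, needles) };
        }
    }
    find_any_fallback(haystack, needles)
}
```

Calling `find_any_byte(b"86-J52:rev1", b"-:")` returns the same index on every platform; only the code path taken differs.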

Searching a 5MiB string of `a`s with a single space at the end:

| Method                                             | Speed     |
|----------------------------------------------------|-----------|
| `str.find(AsciiChars)`                             | 5719 MB/s |
| `str.as_bytes().iter().position(\|&v\| v == b' ')` | 1620 MB/s |
| `str.find(\|c\| c == ' ')`                         | 1090 MB/s |
| `str.find(' ')`                                    | 1085 MB/s |
| `str.find(&[' '][..])`                             | 602 MB/s  |
| `str.find(" ")`                                    | 293 MB/s  |
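
For readers who want to try a comparable measurement, here is a rough sketch of how such a haystack can be built and a throughput figure derived. It is not the harness that produced the numbers above: the helper names are hypothetical, wall-clock timing with `Instant` is far cruder than a proper benchmark, and treating 1 MB as 1,000,000 bytes is an assumption.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Builds a haystack of `len` bytes: all `a`s with `last` as the final character.
fn make_haystack(len: usize, last: char) -> String {
    assert!(len >= 1 && last.is_ascii());
    let mut s = "a".repeat(len - 1);
    s.push(last);
    s
}

/// Converts bytes processed and elapsed time into MB/s
/// (assumes 1 MB = 1,000,000 bytes).
fn megabytes_per_second(bytes: usize, elapsed: Duration) -> f64 {
    (bytes as f64 / 1_000_000.0) / elapsed.as_secs_f64()
}

/// Runs `search` over `haystack` `iterations` times and returns
/// the last search result together with the measured MB/s.
fn measure<F>(haystack: &str, iterations: u32, search: F) -> (Option<usize>, f64)
where
    F: Fn(&str) -> Option<usize>,
{
    let start = Instant::now();
    let mut found = None;
    for _ in 0..iterations {
        // black_box keeps the optimizer from discarding the repeated work.
        found = black_box(search(black_box(haystack)));
    }
    let elapsed = start.elapsed();
    let total_bytes = haystack.len() * iterations as usize;
    (found, megabytes_per_second(total_bytes, elapsed))
}
```

For example, `measure(&make_haystack(5 * 1024 * 1024, ' '), 100, |s| s.find(' '))` times the `str.find(' ')` row; absolute numbers will vary with hardware and compiler flags.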

Searching a 5MiB string of `a`s with a single ampersand at the end:

| Method                                      | Speed     |
|---------------------------------------------|-----------|
| `str.find(AsciiChars)`                      | 5688 MB/s |
| `str.as_bytes().iter().position(\|&v\| ...)` | 1620 MB/s |
| `str.find(\|c\| ...)`                       | 1022 MB/s |
| `str.find(&['<', '>', '&'][..])`            | 361 MB/s  |

| Method                              | Speed     |
|-------------------------------------|-----------|
| `str.find(Substring::new("xyzzy"))` | 5017 MB/s |
| `str.find("xyzzy")`                 | 3837 MB/s |

1. Create a feature branch (`git checkout -b my-new-feature`)
2. Commit your changes (`git commit -am 'Add some feature'`)
3. Push the branch (`git push origin my-new-feature`)