base256u

A simple Rust crate that maps between bytes and Unicode glyphs. It includes reference printable-ASCII-preserved Unicode (papu) encoder and decoder functions. The papu encoding leaves text that consists only of printable ASCII characters unchanged, and maps every other byte to a single-codepoint, non-combining, printable glyph, skipping oddities such as NBSP (no-break space) and SHY (soft hyphen).
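To make the mapping concrete, here is a minimal standalone sketch of a papu-style byte-to-glyph table. It does not use the crate and is not taken from its source: the ranges are inferred from the test vectors further down in this README, and the helper names `papu_encode_byte` and `papu_decode_char` are hypothetical.

```rust
/// Hypothetical helper (not the crate's API): map one byte to a papu-style glyph.
/// The ranges are an assumption inferred from this README's test vectors.
fn papu_encode_byte(b: u8) -> char {
    let code = match b {
        // Control bytes 0x00..=0x1F land on U+00B0..=U+00CF ('°' through 'Ï').
        0x00..=0x1F => 0x00B0 + u32::from(b),
        // Printable ASCII is preserved unchanged.
        0x20..=0x7E => u32::from(b),
        // DEL maps to '§'.
        0x7F => 0x00A7,
        // 0xC9 would land on U+0149 ('ŉ'); the test vector shows '¤' there instead.
        0xC9 => 0x00A4,
        // The remaining high bytes land on Latin Extended-A, U+0100..=U+017F.
        0x80..=0xFF => 0x0100 + u32::from(b - 0x80),
    };
    char::from_u32(code).expect("every code point above is a valid scalar value")
}

/// Hypothetical inverse: `None` for any glyph outside the table.
fn papu_decode_char(c: char) -> Option<u8> {
    let code = u32::from(c);
    match code {
        0x0020..=0x007E => Some(code as u8),
        0x00B0..=0x00CF => Some((code - 0x00B0) as u8),
        0x00A7 => Some(0x7F),
        0x00A4 => Some(0xC9),
        // 'ŉ' is skipped by the encoder, so it has no byte to decode to.
        0x0149 => None,
        0x0100..=0x017F => Some((code - 0x0100) as u8 + 0x80),
        _ => None,
    }
}
```

With these helpers, `b"Hi\n".iter().map(|&b| papu_encode_byte(b)).collect::<String>()` yields `"Hiº"`: the two letters pass through untouched and the newline byte becomes a visible glyph.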

You can find the documentation in the usual place.

Using this crate is as simple as `use base256u::{Decode, Encode};` and then calling the `base256u()` method, or `base256u_papu()` to get the default papu encoding.

```rust
// These tests live inside the crate; from another crate, import with
// `use base256u::{Decode, Encode};` instead.
use crate::{Decode, Encode};

#[test]
fn encoding() {
    // Every byte value, in order, encodes to exactly one glyph.
    let encoded: String = (u8::MIN..=u8::MAX).base256u_papu().collect();
    assert_eq!(encoded, "°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏ !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~§ĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪīĬĭĮįİıĲĳĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇň¤ŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžſ");

    // Printable ASCII passes through unchanged.
    let encoded: String = b"Pack my box with five dozen liquor jugs."
        .into_iter()
        .copied()
        .base256u_papu()
        .collect();
    assert_eq!(encoded, "Pack my box with five dozen liquor jugs.");
}

#[test]
fn decoding() {
    // The two trailing glyphs, 'Ɲ' and 'ŉ', are outside the mapping and decode to `None`.
    let decoded: Vec<Option<u8>> = "°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏ !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~§ĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪīĬĭĮįİıĲĳĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇň¤ŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžſƝŉ".chars().base256u_papu().collect();
    let mut matcher: Vec<Option<u8>> = (u8::MIN..=u8::MAX).map(|b| Some(b)).collect();
    matcher.push(None);
    matcher.push(None);
    assert_eq!(decoded, matcher);

    // Decoding pure printable ASCII returns the original bytes.
    let decoded: Vec<u8> = "Pack my box with five dozen liquor jugs."
        .chars()
        .base256u_papu()
        .map(|c| c.unwrap())
        .collect();
    assert_eq!(
        String::from_utf8(decoded).unwrap(),
        "Pack my box with five dozen liquor jugs."
    );
}
```