Enhanced C#
Language of your choice: library documentation
Encodes and decodes BAIS (Byte Array In String) encoding, which preserves runs of ASCII characters unchanged. This encoding is useful for debugging (since ASCII runs are visible) and for conversion of bytes to JSON.
Arrays encoded with ByteArrayInString.Convert(ArraySlice<byte>, bool) tend to be slightly more compact than standard uuencoding or Base64. When you use this encoding in JSON with UTF-8, the output is typically also more compact than yEnc, because BAIS avoids characters above 127, which take two bytes each in UTF-8.
A BAIS string alternates between runs of "direct" bytes (usually bytes in the ASCII range that are represented as themselves) and runs of a special base-64 encoding. The base-64 encoding is a sequence of 6-bit digits with 64 added to them, except for 63 which is mapped to itself. This is easier and faster to encode and decode than standard Base64 and has an interesting property described below.
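The digit mapping described above can be sketched in a few lines. This is an illustration written from the description only, not the library's source; the class name BaisDigitSketch and its method names are hypothetical, and the real EncodeBase64Digit and DecodeBase64Digit may be implemented differently.

```csharp
using System;

// Hypothetical helper, not the library's source: maps 6-bit digits to BAIS characters.
public static class BaisDigitSketch
{
    // Digits 0..62 become characters 64..126 ('@'..'~'); digit 63 stays at 63 ('?').
    public static char EncodeDigit(int digit)
    {
        if (digit < 0 || digit > 63)
            throw new ArgumentOutOfRangeException(nameof(digit));
        return digit == 63 ? '?' : (char)(digit + 64);
    }

    // Inverse mapping: '?' (63) decodes to 63; characters 64..126 decode to 0..62.
    public static int DecodeDigit(char c)
    {
        if (c < 63 || c > 126)
            throw new ArgumentOutOfRangeException(nameof(c));
        return c == '?' ? 63 : c - 64;
    }
}
```

Because the mapping is plain arithmetic, no lookup table is needed, which is one plausible reason it is simpler and faster than standard Base64.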
A BAIS string begins in ASCII mode and switches to base 64 when the '\b' character (backspace, character 8) is encountered. Base-64 mode ends, returning to ASCII, when a '!' character is encountered.
For example:
    //                   C   a   t             E        A   B   C   D
    var b = new byte[] { 67, 97, 116, 128, 10, 69, 255, 65, 66, 67, 68 };
    Assert.AreEqual(ByteArrayInString.Convert(b), "Cat\b`@iE?tEB!CD");
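To make the mode switching concrete, here is a minimal decoder sketch. It is not the library's Convert(string) implementation. The class name BaisDecodeSketch is hypothetical, and the sketch assumes that leftover padding bits at the end of a base-64 run are simply discarded and that a string may end while still in base-64 mode.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of BAIS decoding, written from the description only.
public static class BaisDecodeSketch
{
    static int DecodeDigit(char c) => c == '?' ? 63 : c - 64;

    public static byte[] Decode(string s)
    {
        var output = new List<byte>();
        int i = 0;
        while (i < s.Length)
        {
            if (s[i] != '\b')
            {
                // ASCII mode: the character stands for itself.
                output.Add((byte)s[i++]);
                continue;
            }
            i++; // skip '\b' and enter base-64 mode
            int bits = 0, bitCount = 0;
            while (i < s.Length && s[i] != '!')
            {
                char c = s[i++];
                if (c < 63 || c > 126)
                    throw new FormatException("Invalid base-64 digit in BAIS string.");
                // Accumulate six bits per digit; emit a byte once eight are available.
                bits = (bits << 6) | DecodeDigit(c);
                bitCount += 6;
                if (bitCount >= 8)
                {
                    bitCount -= 8;
                    output.Add((byte)(bits >> bitCount));
                    bits &= (1 << bitCount) - 1;
                }
            }
            if (i < s.Length) i++; // skip '!' and return to ASCII mode
        }
        return output.ToArray();
    }
}
```

Under these assumptions, BaisDecodeSketch.Decode("Cat\b`@iE?tEB!CD") yields the eleven bytes of the example array.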
A byte sequence such as 128, 10, 69, 255 can be encoded in base 64 as illustrated:
             ---128---  ---10----  ---69----  ---255---
    Bytes:   1000 0000  0000 1010  0100 0101  1111 1111
    Base 64: 100000   000000   101001   000101   111111   110000
    Encoded: 01100000 01000000 01101001 01000101 00111111 01110000
             ---96--- ---64--- --105--- ---69--- ---63--- --112---
                `        @        i        E        ?        p
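The bit regrouping can also be expressed as code. The following sketch encodes a byte run as base-64 digits only, without the '\b' and '!' markers. The class name BaisGroupSketch is hypothetical, and padding a trailing partial group with zero bits is an assumption based on the four-byte illustration.

```csharp
using System;
using System.Text;

// Hypothetical sketch: regroups bytes into 6-bit digits and maps them to characters.
public static class BaisGroupSketch
{
    // Digit 63 maps to itself ('?'); every other digit has 64 added.
    static char EncodeDigit(int digit) => digit == 63 ? '?' : (char)(digit + 64);

    public static string EncodeBase64Run(byte[] bytes)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < bytes.Length; i += 3)
        {
            int n = Math.Min(3, bytes.Length - i);
            // Pack up to three bytes into a 24-bit value, padding with zero bits.
            int bits = bytes[i] << 16;
            if (n > 1) bits |= bytes[i + 1] << 8;
            if (n > 2) bits |= bytes[i + 2];
            // n bytes need n + 1 six-bit digits.
            for (int d = 0; d <= n; d++)
                sb.Append(EncodeDigit((bits >> (18 - 6 * d)) & 63));
        }
        return sb.ToString();
    }
}
```

For the bytes 128, 10, 69, 255 this produces the six characters "`@iE?p": the top six bits of 255 form digit 63, which maps to '?', and the remaining two bits are padded with zeroes to form digit 48, which maps to 'p'.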
An interesting property of this base-64 encoding is that when it encodes bytes between 63 and 126, those bytes appear unchanged in the output if they fall at certain offsets within the encoded run (specifically the third, sixth, ninth, etc.). In this example, since the third byte is 'E' (69), it also appears as 'E' in the output.
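The reason is that the last digit of each three-byte group consists of exactly the low six bits of the third byte. A short sketch (the class name BaisThirdByteSketch and its method names are hypothetical) makes the range 63 to 126 easy to check:

```csharp
// Hypothetical sketch of the "third byte appears unchanged" property.
public static class BaisThirdByteSketch
{
    // Character produced for the last digit of a three-byte group:
    // only the low six bits of the third byte contribute to it.
    public static char FourthChar(byte third)
    {
        int digit = third & 63;
        return digit == 63 ? '?' : (char)(digit + 64);
    }

    // True when the third byte of a group shows up unchanged in the output.
    public static bool AppearsUnchanged(byte third) => FourthChar(third) == (char)third;
}
```

For bytes 64 to 126, masking with 63 subtracts 64 and the digit mapping adds it back; byte 63 becomes digit 63, which maps to itself. Every other byte value changes.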
When viewing BAIS strings, another thing to keep in mind is that runs of zeroes ('\0') will tend to appear as runs of @ characters in the base 64 encoding, although a single zero is not always enough to make a @ appear. Runs of 255 will tend to appear as runs of ?.
There are many ways to encode a given byte array as BAIS.
Static Public Member Functions
static string | Convert (ArraySlice< byte > bytes, bool allowControlChars=true)
    Encodes a byte array to a string with BAIS encoding, which preserves runs of ASCII characters unchanged.
static ArraySlice< byte > | Convert (string s)
    Decodes a BAIS string back to a byte array.
static char | EncodeBase64Digit (int digit)
static int | DecodeBase64Digit (char digit)
static string Convert (ArraySlice< byte > bytes, bool allowControlChars=true) [inline, static]
Encodes a byte array to a string with BAIS encoding, which preserves runs of ASCII characters unchanged.
allowControlChars | If true, control characters under 32 are treated as ASCII (except character 8, '\b').
If the byte array can be interpreted as ASCII, it is returned as characters, e.g. Convert(new byte[] { 65,66,67,33 }) == "ABC!". When non-ASCII bytes are encountered, they are encoded as described in the description of this class.
For simplicity, this method's base-64 encoding always encodes groups of three bytes if possible (as four characters). This decision may, unfortunately, cut off the beginning of some ASCII runs.
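The effect of encoding in groups of three can be seen in a simplified encoder sketch. This is not the library's algorithm: the class name BaisEncodeSketch, the IsDirect rule (byte 127 is assumed to be non-direct), and the policy of leaving base-64 mode as soon as the next byte is direct are all assumptions. Since a byte array can be encoded as BAIS in many ways, the real method may produce different output for other inputs.

```csharp
using System;
using System.Text;

// Hypothetical, simplified BAIS encoder written from the description only.
public static class BaisEncodeSketch
{
    static char EncodeDigit(int digit) => digit == 63 ? '?' : (char)(digit + 64);

    // Assumed rule for "direct" bytes: printable ASCII, plus control characters
    // (other than '\b') when allowControlChars is true.
    static bool IsDirect(byte b, bool allowControlChars) =>
        b < 127 && b != '\b' && (b >= 32 || allowControlChars);

    public static string Encode(byte[] bytes, bool allowControlChars = true)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < bytes.Length)
        {
            if (IsDirect(bytes[i], allowControlChars))
            {
                sb.Append((char)bytes[i++]); // ASCII mode: emit the byte as itself
                continue;
            }
            sb.Append('\b'); // switch to base-64 mode
            do
            {
                // Always take three bytes if available, even if some are ASCII.
                int n = Math.Min(3, bytes.Length - i);
                int bits = bytes[i] << 16;
                if (n > 1) bits |= bytes[i + 1] << 8;
                if (n > 2) bits |= bytes[i + 2];
                for (int d = 0; d <= n; d++)
                    sb.Append(EncodeDigit((bits >> (18 - 6 * d)) & 63));
                i += n;
            } while (i < bytes.Length && !IsDirect(bytes[i], allowControlChars));
            if (i < bytes.Length) sb.Append('!'); // back to ASCII mode
        }
        return sb.ToString();
    }
}
```

With this sketch, the bytes 128, 72, 105, 33 (that is, 128 followed by "Hi!") encode as "\b`Dai!!": the group of three starting at 128 swallows "Hi", so only the final '!' survives as a direct character after the mode-ending '!'.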
static ArraySlice< byte > Convert (string s) [inline, static]
Decodes a BAIS string back to a byte array.
s | String to decode. |
Returns the decoded bytes (use Convert(s).ToArray() if you need a true array).