Enhanced C#
Language of your choice: library documentation
Loyc.ByteArrayInString Class Reference

Encodes and decodes BAIS (Byte Array In String) encoding, which preserves runs of ASCII characters unchanged. This encoding is useful for debugging (since ASCII runs are visible) and for conversion of bytes to JSON.


Remarks

Encodes and decodes BAIS (Byte Array In String) encoding, which preserves runs of ASCII characters unchanged. This encoding is useful for debugging (since ASCII runs are visible) and for conversion of bytes to JSON.

Arrays encoded with ByteArrayInString.Convert(ArraySlice<byte>, bool) tend to be slightly more compact than standard Uuencoding or Base64, and when you use this encoding in JSON with UTF-8, the output is typically also more compact than yEnc since double-byte characters above 127 are avoided.

A BAIS string alternates between runs of "direct" bytes (usually bytes in the ASCII range that are represented as themselves) and runs of a special base-64 encoding. The base-64 encoding is a sequence of 6-bit digits with 64 added to them, except for 63 which is mapped to itself. This is easier and faster to encode and decode than standard Base64 and has an interesting property described below.
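The digit mapping described above needs no lookup table, which is part of why it is cheaper than standard Base64. The following is a minimal sketch of that mapping, not the library's source; the class and method names (BaisDigitSketch, EncodeDigit, DecodeDigit) are hypothetical, and returning -1 for an invalid character is an assumption made for illustration.

```csharp
using System;

// Hypothetical helper, written only to illustrate the digit mapping.
public static class BaisDigitSketch
{
    // Maps a 6-bit value (0..63) to a character:
    // add 64, except that 63 is mapped to itself ('?').
    public static char EncodeDigit(int digit)
    {
        if (digit < 0 || digit > 63)
            throw new ArgumentOutOfRangeException(nameof(digit));
        return (char)(digit == 63 ? 63 : digit + 64);
    }

    // Inverse mapping. Returning -1 for characters outside the
    // digit alphabet is an assumption of this sketch.
    public static int DecodeDigit(char c)
    {
        if (c == '?') return 63;                  // 63 maps to itself
        if (c >= '@' && c <= '~') return c - 64;  // 64..126 -> 0..62
        return -1;
    }
}
```

With this mapping every digit lands in the printable range 63..126 ('?' through '~'), so the output contains no control characters.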

A BAIS string begins in ASCII mode and switches to base-64 mode when the '\b' character (backspace, character 8) is encountered. Base-64 mode ends, returning to ASCII mode, when a '!' character is encountered.

For example:

  //                    C   a   t        \n   E        A   B   C   D
  var b = new byte[] { 67, 97, 116, 128, 10, 69, 255, 65, 66, 67, 68 };
  Assert.AreEqual(ByteArrayInString.Convert(b), "Cat\b`@iE?tEB!CD");
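The mode-switching rule can be sketched as a small decoder. This is an illustration written from the description on this page, not the library's implementation: the names BaisDecodeSketch and Decode are hypothetical, and the error handling (throwing FormatException on an invalid digit, silently dropping leftover padding bits) is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical decoder sketch; not the library's actual code.
public static class BaisDecodeSketch
{
    public static byte[] Decode(string s)
    {
        var output = new List<byte>();
        bool base64Mode = false;   // a BAIS string starts in ASCII mode
        int acc = 0, bits = 0;     // bit accumulator for base-64 mode

        foreach (char c in s)
        {
            if (!base64Mode)
            {
                if (c == '\b') { base64Mode = true; acc = 0; bits = 0; }
                else output.Add((byte)c);   // direct byte (assumes c < 256)
            }
            else if (c == '!')
            {
                base64Mode = false;         // leftover padding bits are dropped
            }
            else
            {
                int digit;
                if (c == '?') digit = 63;
                else if (c >= '@' && c <= '~') digit = c - 64;
                else throw new FormatException("Invalid base-64 digit: " + (int)c);

                acc = (acc << 6) | digit;
                bits += 6;
                if (bits >= 8)
                {
                    bits -= 8;
                    output.Add((byte)(acc >> bits));
                    acc &= (1 << bits) - 1;   // keep only the unread bits
                }
            }
        }
        return output.ToArray();
    }
}
```

Under these assumptions, Decode("Cat\b`@iE?tEB!CD") yields the eleven bytes of the example above: 67, 97, 116, 128, 10, 69, 255, 65, 66, 67, 68.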

A byte sequence such as 128, 10, 69, 255 can be encoded in base 64 as illustrated:

             ---128---    ---10----    ---69----  ---255---
  Bytes:     1000 0000    0000 1010    0100 0101  1111 1111
  Base 64:   100000   000000   101001    000101   111111   110000
  Encoded: 01100000 01000000 01101001  01000101 00111111 01110000
           ---96--- ---64--- --105---  ---69--- ---63--- --112---
              `        @        i         E        ?        p
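The regrouping of 8-bit bytes into 6-bit digits can be sketched with a simple bit accumulator. The names BaisBase64RunSketch and EncodeRun are hypothetical, and padding the final partial digit with zero bits is taken from the last column of the illustration; this is a sketch of the technique, not the library's code.

```csharp
using System;
using System.Text;

// Hypothetical sketch: encodes one run of bytes as BAIS base-64 digits
// (without the surrounding '\b' and '!' mode-switch characters).
public static class BaisBase64RunSketch
{
    public static string EncodeRun(byte[] bytes)
    {
        var sb = new StringBuilder();
        int acc = 0, bits = 0;
        foreach (byte b in bytes)
        {
            acc = (acc << 8) | b;   // shift in 8 new bits
            bits += 8;
            while (bits >= 6)       // emit every complete 6-bit digit
            {
                bits -= 6;
                AppendDigit(sb, (acc >> bits) & 63);
            }
            acc &= (1 << bits) - 1; // keep only the bits not yet emitted
        }
        if (bits > 0)               // pad the last digit with zero bits
            AppendDigit(sb, (acc << (6 - bits)) & 63);
        return sb.ToString();
    }

    // Digit + 64, except 63, which is mapped to itself ('?').
    private static void AppendDigit(StringBuilder sb, int digit)
    {
        sb.Append((char)(digit == 63 ? 63 : digit + 64));
    }
}
```

For the four bytes in the illustration, EncodeRun(new byte[] { 128, 10, 69, 255 }) produces the six characters `@iE?p, where the digit 63 (binary 111111) appears as '?' because 63 is mapped to itself.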

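An interesting property of this base-64 encoding is that when it encodes bytes between 63 and 126, those bytes appear unchanged at certain offsets (specifically the third, sixth, ninth, etc.). This happens because the low six bits of every third byte form a digit on their own, and adding 64 to that digit restores the original byte value. In this example, since the third byte is 'E' (69), it also appears as 'E' in the output.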

When viewing BAIS strings, another thing to keep in mind is that runs of zeroes ('\0') will tend to appear as runs of @ characters in the base-64 encoding, although a single zero is not always enough to make an @ appear. Runs of 255 will tend to appear as runs of ?.
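These observations can be checked by encoding a single three-byte group by hand. The sketch below slices the 24 bits of a group directly into four digits; BaisGroupSketch and EncodeGroup are hypothetical names used only for this illustration.

```csharp
using System;

// Hypothetical sketch: encodes exactly three bytes as four BAIS digits.
public static class BaisGroupSketch
{
    public static string EncodeGroup(byte b0, byte b1, byte b2)
    {
        int d0 = b0 >> 2;                       // top 6 bits of b0
        int d1 = ((b0 & 3) << 4) | (b1 >> 4);   // low 2 of b0, top 4 of b1
        int d2 = ((b1 & 15) << 2) | (b2 >> 6);  // low 4 of b1, top 2 of b2
        int d3 = b2 & 63;                       // low 6 bits of b2
        return new string(new[] { Digit(d0), Digit(d1), Digit(d2), Digit(d3) });
    }

    // Digit + 64, except 63, which is mapped to itself ('?').
    private static char Digit(int d)
    {
        return (char)(d == 63 ? 63 : d + 64);
    }
}
```

EncodeGroup(0, 0, 0) gives "@@@@" and EncodeGroup(255, 255, 255) gives "????", while EncodeGroup(255, 0, 255) gives "?pC?": the single zero is split across two digits and no @ appears. EncodeGroup(128, 10, 69) gives "`@iE", with the third byte 'E' passing through unchanged.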

There are many ways to encode a given byte array as BAIS.

Static Public Member Functions

static string Convert (ArraySlice< byte > bytes, bool allowControlChars=true)
 Encodes a byte array to a string with BAIS encoding, which preserves runs of ASCII characters unchanged.
 
static ArraySlice< byte > Convert (string s)
 Decodes a BAIS string back to a byte array.
 
static char EncodeBase64Digit (int digit)
 
static int DecodeBase64Digit (char digit)
 

Member Function Documentation

◆ Convert() [1/2]

static string Loyc.ByteArrayInString.Convert (ArraySlice< byte > bytes, bool allowControlChars = true)
inline static

Encodes a byte array to a string with BAIS encoding, which preserves runs of ASCII characters unchanged.

Parameters
allowControlChars: If true, control characters under 32 are treated as ASCII (except character 8, '\b').
Returns
The encoded string.

If the byte array can be interpreted as ASCII, it is returned as characters, e.g. Convert(new byte[] { 65,66,67,33 }) == "ABC!". When non-ASCII bytes are encountered, they are encoded as described in the description of this class.

For simplicity, this method's base-64 encoding always encodes groups of three bytes if possible (as four characters). This decision may unfortunately cut off the beginning of some ASCII runs.
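The grouping behavior can be illustrated with a simplified encoder. This is a sketch under stated assumptions, not the library's implementation: the names BaisEncodeSketch, Encode and IsDirect are hypothetical, it takes a plain byte[] rather than ArraySlice<byte>, it assumes "ASCII" means bytes 32..126 (plus control characters other than 8 when allowControlChars is true), and it returns to ASCII mode as soon as a group boundary is followed by a direct byte. The real method may choose its switch points differently.

```csharp
using System;
using System.Text;

// Hypothetical encoder sketch; the real Convert method may differ.
public static class BaisEncodeSketch
{
    public static string Encode(byte[] bytes, bool allowControlChars = true)
    {
        var sb = new StringBuilder();
        bool base64Mode = false;
        int i = 0;
        while (i < bytes.Length)
        {
            if (IsDirect(bytes[i], allowControlChars))
            {
                if (base64Mode) { sb.Append('!'); base64Mode = false; }
                sb.Append((char)bytes[i++]);
            }
            else
            {
                if (!base64Mode) { sb.Append('\b'); base64Mode = true; }
                // Always take a group of three bytes if possible,
                // even if the second or third byte is ASCII.
                int n = Math.Min(3, bytes.Length - i);
                int acc = 0;
                for (int k = 0; k < 3; k++)
                {
                    int next = k < n ? (int)bytes[i + k] : 0;
                    acc = (acc << 8) | next;
                }
                // n bytes need n + 1 six-bit digits.
                for (int d = 0; d <= n; d++)
                {
                    int digit = (acc >> (18 - 6 * d)) & 63;
                    sb.Append((char)(digit == 63 ? 63 : digit + 64));
                }
                i += n;
            }
        }
        return sb.ToString();
    }

    // Assumption: which bytes count as "direct" ASCII in this sketch.
    private static bool IsDirect(byte b, bool allowControlChars)
    {
        if (b == 8) return false;             // '\b' is the mode switch
        if (b < 32) return allowControlChars;
        return b < 127;
    }
}
```

For the class-level example bytes (67, 97, 116, 128, 10, 69, 255, 65, 66, 67, 68) this sketch produces "Cat\b`@iE?tEB!CD". The group starting at 255 swallows 65 and 66, so only "CD" of the ASCII run "ABCD" survives as direct characters, which is the cut-off effect described above.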

◆ Convert() [2/2]

static ArraySlice<byte> Loyc.ByteArrayInString.Convert (string s)
inline static

Decodes a BAIS string back to a byte array.

Parameters
s: String to decode.
Returns
Decoded byte array (use Convert(s).ToArray() if you need a true array).