Enhanced C#
Language of your choice: library documentation
Loyc.Syntax.StreamCharSource Class Reference

Exposes a stream as an ICharSource, as though it were an array of characters. The stream must support seeking, and if a text decoder is specified, it must meet certain constraints.


Inheritance: Loyc.Collections.Impl.ListSourceBase< char >, Loyc.Collections.ICharSource, Loyc.Collections.IListSource< char >

Remarks

Exposes a stream as an ICharSource, as though it were an array of characters. The stream must support seeking, and if a text decoder is specified, it must meet certain constraints.

This class reads small blocks of bytes from a stream, reloading blocks from the stream when necessary. Data is cached with a pair of character buffers, and a third buffer is used to read from the stream. A Stream is required rather than a TextReader because TextReader doesn't support seeking.

This class assumes the underlying stream never changes.

StreamCharSource does not (and probably cannot, if I understand the System.Text.Decoder API correctly) save the decoder state at each block boundary. Consequently, only encodings that meet special constraints will work with StreamCharSource. These include Encoding.Unicode, Encoding.UTF8, and Encoding.UTF32, but not Encoding.UTF7. Using an unsupported encoding will cause exceptions and/or corrupted output while reading from the StreamCharSource.

The decoder must meet the following constraints:

  1. Characters must be divided on a byte boundary. UTF-7 doesn't work because some characters are encoded using Base64.
  2. Between characters output by the decoder, the decoder must be stateless. Therefore, encodings that support compression generally won't work.
  3. The decoder must produce at least one character from a group of 8 bytes (StreamCharSource.MaxSeqSize).
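The second constraint can be spot-checked for a given encoding with a small experiment. The following is a minimal sketch, not part of Loyc: `DecoderConstraintDemo` and `DecodesIndependently` are hypothetical names, and the code uses only `System.Text`. It decodes the bytes on each side of a character boundary with two fresh decoders and compares the result with the original text, which is roughly what happens when a block is reloaded without saved decoder state.

```csharp
using System;
using System.Text;

public static class DecoderConstraintDemo
{
    // Hypothetical helper (not part of Loyc): returns true if decoding the
    // bytes before and after a character boundary with two fresh decoders
    // yields the same text as the original string.
    // splitChar must not fall in the middle of a surrogate pair.
    public static bool DecodesIndependently(Encoding encoding, string text, int splitChar)
    {
        byte[] all = encoding.GetBytes(text);
        // Byte offset of the character boundary at splitChar.
        int splitByte = encoding.GetByteCount(text.Substring(0, splitChar));

        string head = Decode(encoding.GetDecoder(), all, 0, splitByte);
        string tail = Decode(encoding.GetDecoder(), all, splitByte, all.Length - splitByte);
        return head + tail == text;
    }

    // Decodes one byte range with the given (fresh) decoder.
    static string Decode(Decoder decoder, byte[] bytes, int start, int count)
    {
        char[] chars = new char[decoder.GetCharCount(bytes, start, count)];
        int n = decoder.GetChars(bytes, start, count, chars, 0);
        return new string(chars, 0, n);
    }
}
```

A passing check on one sample string does not prove that an encoding satisfies all three constraints; it only shows that decoding can restart at that particular boundary.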

Properties

override int Count [get]
 
- Properties inherited from Loyc.Collections.Impl.ListSourceBase< char >
abstract override int Count [get]
 
bool IsEmpty [get]
 
T this[int index] [get]
 

Public Member Functions

 StreamCharSource (Stream stream)
 
 StreamCharSource (Stream stream, Decoder decoder)
 
 StreamCharSource (Stream stream, Encoding encoding)
 
 StreamCharSource (Stream stream, Decoder decoder, int bufSize)
 
new StringSlice Slice (int startIndex, int length)
 Returns a substring from the character source. If some of the requested characters are past the end of the stream, the string is truncated to the available number of characters.
 
sealed override char TryGet (int index, out bool fail)
 
- Public Member Functions inherited from Loyc.Collections.Impl.ListSourceBase< char >
int IndexOf (T item)
 
Slice_< T > Slice (int start, int count)
 
override IEnumerator< T > GetEnumerator ()
 
- Public Member Functions inherited from Loyc.Collections.IListSource< char >
T TryGet (int index, out bool fail)
 Gets the item at the specified index, and does not throw an exception on failure.
 
IRange< T > Slice (int start, int count=int.MaxValue)
 Returns a sub-range of this list.
 

Protected Member Functions

void SwapBlks ()
 
bool Access (int charIndex)
 
void ReloadBlockOf (int charIndex)
 
void ScanPast (int index)
 
void ReadNextBlock ()
 
- Protected Member Functions inherited from Loyc.Collections.Impl.ListSourceBase< char >
int ThrowIndexOutOfRange (int index)
 

Protected fields

Stream _stream
 
byte[] _buf
 
char[] _blk
 
int _blkStart
 
int _blkLen
 
List< Pair< int, uint > > _blkOffsets = new List<Pair<int,uint>>()
 A sorted list of mappings between byte positions and character indexes. In each Pair<A,B>, A is the character index and B is the byte index. This list is built on demand.
 
bool _reachedEnd = false
 Set true when the last block has been scanned. If true, then _eofIndex and _eofPosition indicate the Count and the size of the stream, respectively.
 
int _eofIndex = 0
 _eofIndex is the character index of EOF if it has been reached or, if not, the index of the first unscanned character. _eofIndex equals _blkOffsets[_blkOffsets.Count-1].A.
 
uint _eofPosition = 0
 _eofPosition is the byte position of EOF if it has been reached or, if not, the position of the first unscanned character. _eofPosition equals _blkOffsets[_blkOffsets.Count-1].B.
 
Decoder _decoder
 
const int DefaultBufSize = 2048 + MaxSeqSize - 1
 
const int MaxSeqSize = 8
 

Member Function Documentation

new StringSlice Loyc.Syntax.StreamCharSource.Slice (int startIndex, int length)
inline

Returns a substring from the character source. If some of the requested characters are past the end of the stream, the string is truncated to the available number of characters.

Parameters
  startIndex  Index of the first character to return. If startIndex >= Count, an empty string is returned.
  length      Number of characters desired.
Exceptions
  ArgumentException  Thrown if startIndex or length is negative.
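The argument handling described above can be illustrated on an ordinary string. This is a minimal sketch under stated assumptions: `SliceDemo.SliceOrTruncate` is a hypothetical stand-in that returns a `string` rather than a `StringSlice`, and it mirrors only the documented behavior (truncation, empty result, ArgumentException), not Loyc's implementation.

```csharp
using System;

public static class SliceDemo
{
    // Hypothetical stand-in that operates on a string instead of a stream.
    // It mirrors only the documented argument handling of Slice.
    public static string SliceOrTruncate(string source, int startIndex, int length)
    {
        if (startIndex < 0)
            throw new ArgumentException("startIndex must not be negative.", nameof(startIndex));
        if (length < 0)
            throw new ArgumentException("length must not be negative.", nameof(length));

        // A start position at or past the end yields an empty result.
        if (startIndex >= source.Length)
            return string.Empty;

        // Clamp without computing startIndex + length, which could overflow.
        int available = source.Length - startIndex;
        return source.Substring(startIndex, Math.Min(length, available));
    }
}
```

Clamping against the remaining length, rather than comparing `startIndex + length` with the total, keeps the check safe when a caller passes `int.MaxValue` to mean "everything from here on".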

Implements Loyc.Collections.ICharSource.

References Loyc.Collections.ArraySlice< T >.InternalList, and Loyc.Localize.Localized().

Member Data Documentation

List<Pair<int,uint> > Loyc.Syntax.StreamCharSource._blkOffsets = new List<Pair<int,uint>>()
protected

A sorted list of mappings between byte positions and character indexes. In each Pair<A,B>, A is the character index and B is the byte index. This list is built on demand.
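Because the list is sorted by character index, a binary search is a natural way to find the entry that covers a given character. The sketch below is illustrative only: `BlockOffsetDemo.FindBlock` is a hypothetical helper, and value tuples stand in for Loyc's `Pair<int,uint>`. The reference does not state how StreamCharSource performs this lookup.

```csharp
using System;
using System.Collections.Generic;

public static class BlockOffsetDemo
{
    // Hypothetical lookup (not Loyc's code): given offsets sorted by character
    // index, return the list position of the last entry whose character index
    // is <= charIndex, or -1 if there is no such entry.
    public static int FindBlock(List<(int CharIndex, uint BytePos)> offsets, int charIndex)
    {
        int lo = 0, hi = offsets.Count - 1, found = -1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (offsets[mid].CharIndex <= charIndex)
            {
                found = mid;   // candidate; look for a later one
                lo = mid + 1;
            }
            else
            {
                hi = mid - 1;
            }
        }
        return found;
    }
}
```

In a design like this, the byte position stored in the entry that was found would be the place to seek to before decoding the block again.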

int Loyc.Syntax.StreamCharSource._eofIndex = 0
protected

_eofIndex is the character index of EOF if it has been reached or, if not, the index of the first unscanned character. _eofIndex equals _blkOffsets[_blkOffsets.Count-1].A.

uint Loyc.Syntax.StreamCharSource._eofPosition = 0
protected

_eofPosition is the byte position of EOF if it has been reached or, if not, the position of the first unscanned character. _eofPosition equals _blkOffsets[_blkOffsets.Count-1].B.

bool Loyc.Syntax.StreamCharSource._reachedEnd = false
protected

Set true when the last block has been scanned. If true, then _eofIndex and _eofPosition indicate the Count and the size of the stream, respectively.