Enhanced C#
Language of your choice: library documentation
Contains classes related to lexical analysis, such as the universal token type (Loyc.Syntax.Lexing.Token) and Loyc.Syntax.Lexing.TokensToTree.
Classes
class | BaseILexer
A version of BaseLexer<CharSrc> that implements ILexer<Token>. You should use this base class if you want to wrap your lexer in a postprocessor such as IndentTokenGenerator or TokensToTree. It can also be used with the EnumerableExt.Buffered extension method to help feed data to your parser.
class | BaseLexer
The recommended base class for lexers generated by LLLPG, when not using the inputSource option.
class | CharCategory
interface | ILexer
A standard interface for lexers.
interface | ILllpgApi
For reference purposes, this interface is a list of the non-static methods that LLLPG expects to be able to call when it is generating code. LLLPG does not actually need lexers and parsers to implement this interface; they simply need to implement the same set of methods as this interface contains.
interface | ILllpgLexerApi
For reference purposes, this interface contains the non-static methods that LLLPG expects lexers to implement. LLLPG does not actually expect lexers to implement this interface; they simply need to implement the same set of methods as this interface contains.
class | IndentTokenGenerator
A preprocessor usually inserted between the lexer and parser that inserts "indent", "dedent", and "end-of-line" tokens at appropriate places in a token stream.
interface | ISimpleToken
Basic information about a token as expected by BaseParser<Token>: a token Type, which is the type of a "word" in the program (string, identifier, plus sign, etc.), a value (e.g. the name of an identifier), and an index where the token starts in the source file.
interface | IToken
The methods of Token in the form of an interface.
class | LexerSource
An implementation of the LLLPG Lexer API, used with the LLLPG options inputSource and inputClass.
class | LexerSourceFile
Adds the AfterNewline method to SourceFile.
class | LexerSourceWorkaround
This class only exists to work around a limitation of the C# language: "cannot change access modifiers when overriding 'protected' inherited member Error(...)".
class | LexerWrapper
A base class for wrappers that modify lexer behavior. Implements the ILexer interface, except for the NextToken() method.
struct | Token
A common token type recommended for Loyc languages that want to use features such as token literals or the TokensToTree class.
class | TokenListAsLexer
Adapter: converts IEnumerable<Token> to the ILexer<Token> interface.
class | TokensToTree
A preprocessor usually inserted between the lexer and parser that converts a token list into a token tree. Everything inside brackets, parens or braces is made a child of the open bracket.
class | TokenTree
A list of Token structures along with the ISourceFile object that represents the source file that the tokens came from.
class | TriviaSaver
A lexer wrapper that saves whitespace tokens into a list (TriviaList).
class | WhitespaceFilter
Filters out tokens whose Value is WhitespaceTag.Value.
class | WhitespaceTag
WhitespaceTag.Value can be used as the Token.Value of whitespace tokens, to make whitespace easy to filter out.
Enumerations
enum | TokenKind : ushort {
    Other = 0x0000, Comment = 0x0100, Id = 0x0200, Literal = 0x0300,
    Dot = 0x0600, Assignment = 0x0700, Operator = 0x0800, Separator = 0x0900,
    AttrKeyword = 0x0A00, TypeKeyword = 0x0B00, OtherKeyword = 0x0C00, Spaces = 0x0F00,
    LParen = 0x1000, RParen = 0x1100, LBrack = 0x1200, RBrack = 0x1300,
    LBrace = 0x1400, RBrace = 0x1500, Indent = 0x1600, Dedent = 0x1700,
    LOther = 0x1800, ROther = 0x1900,
    KindMask = 0x1F00, BracketFlag = 0x1000, CloserFlag = 0x0100
}
A list of token categories that most programming languages have.
Some Loyc languages will support the concept of a "token literal", which is a TokenTree, and some DSLs will rely on these token literals for input. However, tokens differ between languages; for instance, the set of operators varies. On the other hand, most languages do have some concept of "an operator" and "an identifier", and TokenKind reflects this fact.
When you are using Token to represent tokens in your language, it is recommended to define every value of your "TokenType" enumeration in terms of TokenKind using integer offsets, like this:
    enum MyTokenType {
        EOF = TokenKind.Spaces,
        Id = TokenKind.Id,
        IfKeyword = TokenKind.OtherKeyword,
        ForKeyword = TokenKind.OtherKeyword + 1,
        LoopKeyword = TokenKind.OtherKeyword + 2,
        ...
        MulOp = TokenKind.Operator,
        AddOp = TokenKind.Operator + 1,
        DivOp = TokenKind.Operator + 2,
        DotOp = TokenKind.Dot,
        ...
    }
Using TokenKind is only important if you intend to support DSLs via token literals (e.g. LLLPG) in your language.
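The offset scheme described above keeps the category in the high bits of each value, so the category can presumably be recovered by masking with KindMask. The following is a minimal sketch of that idea, not the library's own implementation. It declares a local copy of a few TokenKind values (taken from the enumeration listing) so that it compiles without the Loyc libraries; MyTokenType, KindDemo and KindOf are hypothetical names, and the explicit (int) casts are added here so that the enum initializers are plain integer constants.

```csharp
using System;

// Local copy of a few TokenKind values from the listing, so that this
// sketch compiles on its own. The real enum lives in Loyc.Syntax.Lexing.
enum TokenKind : ushort
{
    Id           = 0x0200,
    Dot          = 0x0600,
    Operator     = 0x0800,
    OtherKeyword = 0x0C00,
    KindMask     = 0x1F00,
}

// Hypothetical token types for an imaginary language: each value is a
// TokenKind category plus a small integer offset.
enum MyTokenType
{
    Id         = (int)TokenKind.Id,
    IfKeyword  = (int)TokenKind.OtherKeyword,
    ForKeyword = (int)TokenKind.OtherKeyword + 1,
    MulOp      = (int)TokenKind.Operator,
    AddOp      = (int)TokenKind.Operator + 1,
    DotOp      = (int)TokenKind.Dot,
}

public static class KindDemo
{
    // Masks off the low bits (the per-language offset), leaving the category.
    public static TokenKind KindOf(MyTokenType type)
    {
        return (TokenKind)((int)type & (int)TokenKind.KindMask);
    }

    public static void Main()
    {
        Console.WriteLine(KindOf(MyTokenType.ForKeyword)); // OtherKeyword
        Console.WriteLine(KindOf(MyTokenType.AddOp));      // Operator
        Console.WriteLine(KindOf(MyTokenType.DotOp));      // Dot
    }
}
```

Because offsets occupy only the low byte, up to 256 distinct token types fit in each category before a value would spill into the bits covered by KindMask.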
A DSL that just needs simple tokens like "strings", "identifiers" and "dots" can write a parser based on values of Token.Kind alone. If it needs specific operators or "keywords" that do not have a dedicated TokenKind, such as + and %, it can further check the Value of the token. Meanwhile, the host language puts a global Symbol in Token.Value to represent operators, keywords and identifiers.
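As a rough illustration of a DSL that inspects the kind first and the value second, the sketch below classifies tokens using a stand-in type. MiniToken, DslDemo and Describe are hypothetical names invented for this example; MiniToken is not Loyc's Token, and a plain string stands in for the global Symbol that a host language would store in Token.Value. The TokenKind values are again a local copy taken from the listing.

```csharp
using System;

// Local copy of a few TokenKind values, so the sketch compiles on its own.
enum TokenKind : ushort
{
    Id       = 0x0200,
    Literal  = 0x0300,
    Dot      = 0x0600,
    Operator = 0x0800,
}

// Hypothetical stand-in for a token: only a kind and a value.
public struct MiniToken
{
    public readonly TokenKind Kind;
    public readonly object Value;

    public MiniToken(TokenKind kind, object value)
    {
        Kind = kind;
        Value = value;
    }
}

public static class DslDemo
{
    // Coarse decisions use Kind alone; Value is consulted only for
    // operators, which share a single TokenKind.
    public static string Describe(MiniToken t)
    {
        switch (t.Kind)
        {
            case TokenKind.Id:      return "identifier";
            case TokenKind.Literal: return "literal";
            case TokenKind.Dot:     return "dot";
            case TokenKind.Operator:
                // Assumption: the operator name is stored as a string here.
                if ("+".Equals(t.Value)) return "plus";
                if ("%".Equals(t.Value)) return "percent";
                return "other operator";
            default:
                return "unsupported";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(new MiniToken(TokenKind.Id, "x")));       // identifier
        Console.WriteLine(Describe(new MiniToken(TokenKind.Operator, "%"))); // percent
        Console.WriteLine(Describe(new MiniToken(TokenKind.Operator, "*"))); // other operator
    }
}
```

The DSL never needs to know the host language's own token type enumeration; it depends only on the shared categories and, where necessary, on the token's value.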