1. Introduction to LLLPG

30 May 2016 (although LLLPG 0.9 was first published 7 Oct 2013)

LLLPG (Loyc LL(k) Parser Generator) is a recursive-descent parser generator for C#, with a feature set slightly better than ANTLR version 2. It’s a system that I decided to create after I tried to use ANTLR3’s C# module, about seven years ago now, and ran into C#-specific bugs that I couldn’t overcome. The author of ANTLR, a Java guy, wasn’t about to fix them, and the author of the C# target had seemingly vanished.

Besides, I wasn’t happy with the ANTLR-generated code; I thought I could generate simpler and more efficient code. “How hard could it be to make an LL(k) parser generator?” I wondered. The answer: really hard, actually. Even today I’ve never seen a paper about how it should be done. Since I didn’t really know what I was doing, it ended up taking several years to write LLLPG, and while its performance could use improvement, I’m happy with the result.

While ANTLR has advanced some ways in that time period, it is still Java-centric, and I think the advantages of LLLPG still make it worth considering even if all the major C#-specific bugs in ANTLR have been fixed (I don’t know if they have or not, but the C# version still lags behind the Java version).

Typically, you will use the LLLPG Visual Studio Custom Tool (a.k.a. Single-File Generator):

[Screenshot: LLLPG in Visual Studio]

What kind of parser generator is LLLPG?

There are several types of parser generators, e.g. LALR(1), PEG, LL(1), LL(k), and LL(*). Of these, I think PEG (Parsing Expression Grammars, usually implemented with packrat parsers) and LL(k)/LL(*) (hand-written parsers and ANTLR 3/4) are the most popular for writing new grammars today. Some people also use regular expressions, but regexes are much less powerful than “proper” parser generators because they do not support full recursion.

Of course, LLLPG is an LL(k) parser generator. In addition to plain LL(k), LLLPG has a few extra, advanced features because some programming languages are difficult to express with an LL(k) grammar alone. LL(k) has two main advantages: potentially high performance (especially if k is low), and output that is relatively easy to understand. To be honest, ANTLR 3/4 is more powerful than LLLPG because its lookahead value k is unlimited, but unlimited lookahead is not free; if your goal is to write a fast parser, limiting yourself to LL(k) is something you might do anyway. In LLLPG, you can still do unlimited lookahead with a zero-width assertion. It’s just not automatic; you have to ask for it.
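To make the idea of bounded lookahead concrete, here is a minimal hand-written sketch of an LL(2) decision. It is not LLLPG output, and the names `OperatorScanner`, `LA` and `ScanLessThanOperator` are invented for this illustration. The scanner has to choose between `<`, `<=` and `<<`, and it can always do so by looking at no more than two characters:

```csharp
using System;

// Hypothetical hand-written scanner, for illustration only (not LLLPG output).
public static class OperatorScanner
{
    // LA(i) peeks at the character i positions ahead of pos;
    // '\0' stands for "end of input".
    static char LA(string src, int pos, int i)
    {
        return pos + i < src.Length ? src[pos + i] : '\0';
    }

    // Scans one operator starting at pos, which must point at '<'.
    // Two characters of lookahead (k = 2) are enough to pick a branch.
    public static string ScanLessThanOperator(string src, int pos)
    {
        if (LA(src, pos, 0) != '<')
            throw new FormatException("Expected '<' at position " + pos);

        switch (LA(src, pos, 1))
        {
            case '=': return "<=";  // second character selects this branch
            case '<': return "<<";
            default:  return "<";   // anything else: plain less-than
        }
    }
}
```

Because the decision never looks further than the second character, its cost is constant no matter what follows. That is the kind of predictability a low k buys you.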

LLLPG is not a dedicated tool the way ANTLR is. Instead, LLLPG is designed to be embedded inside another programming language. While you may use LLLPG similarly to other parser generators, it’s really just a “macro” inside a programming language I’m making called Enhanced C# — one of a hundred macros that you might be using, and perhaps in the future you’ll write a macro or two yourself.

As of early 2016, Enhanced C# is incomplete; only two components of it are ready (the parser, and the macro runner, which is called LeMP). Hopefully, though, you’ll find what exists so far fairly user-friendly and fun.

A focus on prediction

LLLPG is designed to focus on one job and do it as well as possible: LL(k) prediction analysis. LLLPG doesn’t try to do everything for you: it doesn’t construct tokens, it doesn’t create syntax trees. You’re a programmer, and you already have a programming language; so I assume you know enough to design your own Token class and syntax tree classes. If I designed and built your syntax trees for you, I figure I’d just be increasing the learning curve: not only would you have to learn how to use LLLPG, you’d have to learn my class library too! No, LLLPG’s main goal is to eliminate the most difficult and error-prone part of writing LL(k) parsers by hand: figuring out which branch to take, or which method to call. LLLPG still leaves you in charge of the rest.
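As a rough picture of that division of labour, the following hand-written sketch shows a tiny recursive-descent parser over a user-defined token type. Everything in it (`TokKind`, `Tok`, `TinyParser`, `LA0`, `Match`) is a hypothetical name chosen for this example, not part of LLLPG or the Loyc libraries, and for brevity it evaluates the expression instead of building a syntax tree. The `switch` in `Atom` and the `while` condition in `Expr` are the prediction decisions, which is the part a generator would work out for you; the token type and the actions are the part that stays yours.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical token types, for illustration only.
public enum TokKind { Number, Plus, Minus, LParen, RParen, EOF }

public struct Tok
{
    public TokKind Kind;
    public int Value;   // meaningful only when Kind == Number
    public Tok(TokKind kind, int value = 0) { Kind = kind; Value = value; }
}

public class TinyParser
{
    readonly IList<Tok> _tokens;
    int _pos;

    public TinyParser(IList<Tok> tokens) { _tokens = tokens; }

    // One token of lookahead (k = 1); EOF once the input is exhausted.
    TokKind LA0
    {
        get { return _pos < _tokens.Count ? _tokens[_pos].Kind : TokKind.EOF; }
    }

    // Consumes the next token, which must be of the expected kind.
    Tok Match(TokKind expected)
    {
        if (LA0 != expected)
            throw new FormatException("Expected " + expected + " but found " + LA0);
        return _tokens[_pos++];
    }

    // Expr : Atom (('+' | '-') Atom)* ;
    public int Expr()
    {
        int result = Atom();
        while (LA0 == TokKind.Plus || LA0 == TokKind.Minus)  // loop prediction
        {
            bool add = LA0 == TokKind.Plus;
            _pos++;                                           // consume the operator
            int rhs = Atom();
            result = add ? result + rhs : result - rhs;
        }
        return result;
    }

    // Atom : Number | '-' Atom | '(' Expr ')' ;
    int Atom()
    {
        switch (LA0)                                          // branch prediction
        {
            case TokKind.Number:
                return Match(TokKind.Number).Value;
            case TokKind.Minus:
                Match(TokKind.Minus);
                return -Atom();
            case TokKind.LParen:
                Match(TokKind.LParen);
                int inner = Expr();
                Match(TokKind.RParen);
                return inner;
            default:
                throw new FormatException("Unexpected token " + LA0);
        }
    }
}
```

In a grammar this small the decisions are obvious. They become tedious and error-prone once alternatives share prefixes or need more than one token of lookahead.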

That said, I have designed a universal syntax tree as part of the Loyc project, called the Loyc tree, but LLLPG is not oriented toward helping you use it. Even so, I hope you’ll consider using Loyc trees, and this manual will show you how later. Internally, LLLPG uses them heavily.

Advantages of LLLPG over other tools

“Blah, blah, blah! Show me this thing already!”

Sure! Let’s look at some simple examples.