Using LeMP for C# code generation and analysis
2 Mar 2016 (edited March 5)Introduction
Today I planned to write an article about the new pattern-matching and “algebraic data type” features I added to C# via LeMP, but then I saw the new WuffProjects.CodeGeneration library and thought “wait a minute, LeMP has made that easy for a year now!” In fact, LeMP can do some pretty neat stuff, as you’ll see!
LeMP is a macro processor for a superset of C# called “Enhanced C#”. If you’ve ever used sweet.js, LeMP is basically the same thing for C#, just not as polished. Also, whereas sweet.js seems focused on letting you create your own macros, LeMP comes with many useful macros right-out-of-the-box, but creating new ones isn’t as easy (yet).
So here’s the scenario: you want to write a program that generates C# source code, and either runs it or analyzes it somehow. How should you do it?
In fact, this article also shows how to parse and analyze C# source code, but I’ll focus first on code generation. This article also contains links to some fascinating stuff, so try to read to the end before you click off to somewhere else… this article gets more interesting (IMO) as you geet farther into it!
Background: The Old Ways
First, let’s touch on a couple of alternatives:
- Obviously, you can generate C# code as a string and simply write it to a file. That’s okay for simple tasks, but it can be challenging to ensure that the syntax and spacing is correct. Therefore, many people use CodeDom, but that’s a “lowest common denominator” technology; the constructs supported by CodeDom are limited. Finally there’s Roslyn, which includes the recently-open-sourced official C# compiler engine. Compared to LeMP it’s downright clunky to use, but it does offer semantic analysis and 100% perfect reliability.
- If you’d like to generate code inside Visual Studio, the most well-known approach is to use T4 templates. However, it may be better to use LeMP instead, because (1) the output looks better (no ugly indentation problems), (2) your input file is much more elegant, (3) LeMP has lots of built-in features to enhance C#, and (4) your input file gets syntax highlighting thanks to a free Visual Studio extension. Let me know in the comments what you’d like to accomplish with LeMP; I might be able to help!
Now, a couple of years ago I wrote the LL(k) parser generator, which needs a robust way to generate C# code. For this it uses Loyc trees printed by the Enhanced C# printing engine in Loyc.Ecs.dll, all of which is part of the Loyc repo on GitHub.
In layman’s terms, you can use somewhat scary code like this to generate C# (without any LeMP goodness):
public static void Main(string[] args)
{
File.WriteAllText("helloWorld.cs", HelloWorldProgram("Hello, World!"));
}
static string HelloWorldProgram(string whatToPrint)
{
var F = new LNodeFactory(EmptySourceFile.Unknown);
var code = LNode.List(
F.Call(CodeSymbols.Import, F.Id("System")),
F.Call(CodeSymbols.Import, F.Dot(F.Id("System"), F.Id("Collections"), F.Id("Generic"))),
F.Call(CodeSymbols.Namespace, F.Id("Namespaze"), F.Missing, F.Braces(
F.Call(CodeSymbols.Class, F.Id("Klass"), F.List(), F.Braces(
F.Fn(F.Void, F.Id("Main"), F.List(), F.Braces(
F.Call(F.Dot(F.Id("Console"), F.Id("WriteLine")), F.Literal(whatToPrint))
))
))
)));
return EcsLanguageService.WithPlainCSharpPrinter.Print(code);
}
Then HelloWorldProgram("Hello, World!")
returns
using System;
using System.Collections.Generic;
namespace Namespaze
{
class Klass
{
void Main()
{
Console.WriteLine("Hello, world!");
}
}
}
It’s a little easier if you ask the parser to do some of the work, and then do a find-and-replace…
static string HelloWorldProgram(string whatToPrint)
{
var F = new LNodeFactory(EmptySourceFile.Unknown);
IEnumerable<LNode> code = EcsLanguageService.Value.Parse(@"
using System;
using System.Collections.Generic;
namespace Namespaze {
class Klass {
void Main() {
Console.WriteLine(PLACEHOLDER);
}
}
}",
MessageSink.Console, ParsingService.Stmts);
// Now substitute code requested by caller
code = code.Select((LNode stmt) =>
stmt.ReplaceRecursive(expr => {
if (expr.IsIdNamed("PLACEHOLDER"))
return F.Literal(whatToPrint);
return null;
}));
return EcsLanguageService.WithPlainCSharpPrinter.Print(code);
}
But this isn’t really what you want, since syntax errors are not detected at compile-time, and you’re wasting runtime CPU cycles on parsing code instead of generating it.
Introducing LeMP
LeMP lets you do code generation with a “literal” representation of the code (in comp-sci jargon, it makes C# pretend to be homoiconic). For example, suppose you want to generate a method called Square()
that takes a parameter of a user-defined type T
and squares it. You’ll be able to write that as
static LNode GetSquareFunction(LNode T) {
return quote {
public static $T Square($T x) => x*x;
};
}
First, you’ll need to install LeMP.
LeMP itself is a code generator, so what I’m doing now is showing you how to use a code generator in Visual Studio (LeMP) to generate another code generator that runs outside Visual Studio. (You could then, if you wanted, reprogram your code generator to run inside Visual Studio by reading my article about Custom Tools, or better yet, by writing a macro to be called by LeMP itself.)
This may sound complicated, but it’s easy to do, at least after you’ve installed LeMP, made an example.ecs
file in your project and assigned LeMP
as the Custom Tool.
You’ll need to add references to the following assemblies from your copy of LeMP:
Loyc.Essentials.dll
Loyc.Collections.dll
Loyc.Syntax.dll
Loyc.Ecs.dll
Put the following code in your example.ecs
file:
using System(.Collections.Generic, .Linq, .Text, );
using Loyc(.Collections, .Syntax, .Ecs, );
namespace Loyc.Ecs {
class Example {
public static string HelloWorldProgram(string whatToPrint) {
LNode code = quote {
using System;
using System.Collections.Generic;
namespace Namespaze {
class Klass {
void Main() {
Console.WriteLine($(LNode.Literal(whatToPrint)));
}
}
}
};
return EcsLanguageService.WithPlainCSharpPrinter.Print(code.Args);
}
}
}
Then locate your Main
method and add a call to Console.WriteLine(Example.HelloWorldProgram("Howdy folks!"))
. Run and make sure it works.
The trick here, and the reason we’re using LeMP instead of plain C#, is that LeMP includes a neat trick called “quote”, which allows us to generate syntax trees inside our C# code. In this case we’ve quoted an entire C# source file:
LNode code = quote {
using System;
using System.Collections.Generic;
namespace Namespaze {
class Klass {
void Main() {
Console.WriteLine($(LNode.Literal(whatToPrint)));
}
}
}
};
LNode
(short for Loyc tree node) is a flexible “generic” syntax tree which could, theoretically, represent code in any programming language, but happens (at the moment) to represent C# code. An LNode
is immutable (read-only).
quote
is a macro — a function that transforms one syntax tree into another. It generates code to construct the syntax tree you asked for; for example, if you write
LNode call = quote(func(12345));
You’ll see code in your output file (example.out.cs) to create a syntax tree representing a call to func
with 12345
as its argument list:
LNode call = LNode.Call((Symbol) "func", LNode.List(LNode.Literal(12345)));
quote
allows you to insert subtrees into your tree. For example, if you write
LNode assignment = quote(x = $call);
quote
assumes that $call
refers to a variable of type LNode
called call
, so it inserts call
into the output, like this:
LNode assignment = LNode.Call(CodeSymbols.Assign,
LNode.List(LNode.Id((Symbol) "x"), call)).SetStyle(NodeStyle.Operator);
quote
accepts either an (expression in parentheses)
or a { statement in braces; }
. When using braces, make sure to add a semicolon at the end of the statement! If you’d like to create a syntax tree that itself represents a braced block, you’ll need to use double braces as in quote { { stuff; } }
.
Because the output from quote
refers to data types such as Symbol
, CodeSymbols
, and LNode
, you may need to add references to the following namespaces when using this macro:
using Loyc; // For Symbol
using Loyc.Syntax; // For LNode, CodeSymbols
Macros themselves cannot make non-local changes, so quote
itself cannot add these using
directives on your behalf.
Other useful namespaces include
using Loyc.Collections; // For VList<LNode>, a list of LNodes (value type)
using Loyc.Ecs; // For EcsLanguageService (Enhanced C# parser/printer)
Generating code in a loop
If you’re generating code, a common task is generating a sequence of similar statements, methods, or data types.
The normal data type for lists of LNode
is called VList<LNode>
. Note that VLists are value types, so they can only be empty, never null.
As an example, here’s one way to generate a sequence of using
statements from a sequence of namespaces:
VList<LNode> namespaces = quote(System, System.Text, System.Linq).Args;
VList<LNode> usings = LNode.List(namespaces.Select(ns => quote { using $ns; }));
Note: quote
always produces a single LNode
. If you quote multiple things, the outer node will be a call to the special identifier #splice
— in this case #splice(System, System.Text, System.Linq)
. By writing quote(...).Args
, we are extracting the three arguments to the #splice
pseudo-function.
Alternately, you could use a loop:
VList<LNode> namespaces = quote(System, System.Text, System.Linq).Args;
VList<LNode> usings = LNode.List();
foreach (var ns in namespaces)
usings.Add(quote { using $ns; });
If you have a VList<LNode>
or IEnumerable<LNode>
, you can “splice” it into an argument list by using $(..list)
inside of a quote
. For example, given a list of method arguments, we might like to splice them into a method declaration:
VList<LNode> args = quote{ string second; object third; }.Args;
LNode function = quote {
void function(int first, $(..args), long fourth) {}
};
Console.WriteLine(EcsLanguageService.Value.Print(function));
The output is
void function(int first, string second, object third, long fourth)
{
}
As you can see, the $(..args)
expression causes the nodes in args
to be expanded and treated as part of the function’s argument list.
A peculiar thing here is that string second; object third;
are separated by semicolons, even though arguments in a method’s argument list are separated by commas. In fact, if you try to separate these variables by commas, you’ll get a syntax error. So what’s really going on here?
There are two parts to the answer.
- First, the syntax: why is are there semicolons? The Enhanced C# parser used by LeMP has no idea that the variables
second
andthird
will be used later in a function argument list. All it sees are two ordinary variable declarations inside braces. Because of the braces, a semicolon is required at the end of each statement. Note: If you change the code toquote(string second, object third)
, you’ll actually get a syntax error because in that case, the parser treatsquote()
as an ordinary function call, so its arguments are not allowed to be (unassigned) variable declarations. - Second: why does this work? The fact that you can insert what appear to be statements into the middle of an argument list works because a syntax tree that represents a variable declaration like “
int x;
” is identical to the syntax tree that represents a method argument like “int x
”. This is a property of the mapping from “Enhanced C#” (the syntax accepted by LeMP) to Loyc trees, and this mapping tends to be designed in such a way that you can transplant syntax trees from one place to another and it “just works” the way you want it to.
I won’t distract you with too much depth on this topic; if you want to know more, please ask.
Converting code to text and compiling it
To convert an LNode
to text, you can use EcsLanguageService.WithPlainCSharpPrinter.Print(lnode)
as shown earlier. Using EcsLanguageService.WithPlainCSharpPrinter
instead of EcsLanguageService.Value
tells the printer to avoid using syntax that is not part of “plain-old” C#, if possible.
You can also simply call LNode.ToString()
as in
Console.WriteLine("{0}", quote { class Foo {} });
/* Output:
#class(Foo, @``, {});
*/
But this doesn’t work the way you want, because the default output language is LES, not C#. You can, however, change the current output language to C# by using (LNode.PushPrinter(...))
:
using (LNode.PushPrinter(EcsLanguageService.WithPlainCSharpPrinter.Printer))
Console.WriteLine("{0}", quote { class Foo {} });
/* Output:
class Foo
{
}
*/
That’s better! Having converted your code to a string, you can run it using the CSharpCodeProvider
in System.dll. Here’s a demonstration:
static void CompileAndRun()
{
VList<LNode> code = quote {
using System;
namespace Example {
public class Code {
public static double Square(double x) { return x*x; }
}
}
}.Args;
// Compile the code to an assembly in your "Temp" directory
string[] codeStrings = {EcsLanguageService.Value.Print(code)};
Assembly asm = CompileToAssembly(codeStrings,
new[] { "System.dll" }, MessageSink.Console);
// Use reflection to find our compiled method
var module = asm.GetModules()[0];
do {
if (module != null) {
Type mt = module.GetType("Example.Code");
if (mt != null) {
MethodInfo methInfo = mt.GetMethod("Square");
if (methInfo != null) {
double n = 9.0;
Console.WriteLine("The Square of {0} is {1}", n,
methInfo.Invoke(null, new object[] { n }));
break;
}
}
}
Console.WriteLine("Failed to locate method");
} while (false);
}
static Assembly CompileToAssembly(string[] sourceFiles, string[] references = null, IMessageSink sink = null)
{
references = references ?? new[] { "System.dll" };
sink = sink ?? MessageSink.Current;
CompilerParameters CompilerParams = new CompilerParameters();
CompilerParams.GenerateInMemory = true;
CompilerParams.TreatWarningsAsErrors = false;
CompilerParams.GenerateExecutable = false;
CompilerParams.CompilerOptions = "/optimize";
CompilerParams.ReferencedAssemblies.AddRange(references);
CSharpCodeProvider provider = new CSharpCodeProvider();
CompilerResults compile = provider.CompileAssemblyFromSource(CompilerParams, sourceFiles);
StringBuilder msgs = new StringBuilder("Compiler errors:\n");
foreach (CompilerError msg in compile.Errors) {
LogMessage lmsg = new LogMessage(msg.IsWarning ? Severity.Warning : Severity.Error,
new LineAndPos(msg.Line, msg.Column), "{0}: {1}", msg.ErrorNumber, msg.ErrorText);
lmsg.WriteTo(MessageSink.Current);
msgs.Append(lmsg.ToString() + "\n");
}
if (compile.Errors.HasErrors)
throw new FormatException(msgs.ToString());
return compile.CompiledAssembly;
}
It’s that easy, although error handling can be quite a pain if the source code doesn’t exist in a file anywhere, since the locations mentioned in the error messages don’t tell you much.
Analyzing and manipulating syntax trees
What else can you do with a syntax tree?
Writing it to a file
That’s too easy:
VList<LNode> code = quote {
using System;
class HelloWorld {
public static void Main(string[] args) {
Console.WriteLine("I'm not talking to you.");
}
}
}.Args;
string text = EcsLanguageService.WithPlainCSharpPrinter.Print(code,
MessageSink.Console, ParsingService.File);
File.WriteAllText("HelloWorld.cs", text);
Finding and replacing
Given an LNode
you can use ReplaceRecursive
to find something, and optionally change it. For example, the following code finds every literal true
in a code block and changes it to !false
node = node.ReplaceRecursive(expr => {
// you could call expr.IsLiteral to find out if something is a literal,
// but there's no need to do that if you're just looking for a Value.
if (true.Equals(expr.Value))
return quote(!false);
return null; // make no change here
}));
Remember that LNode
is immutable, so this doesn’t change the existing syntax tree, it creates a new one with some part(s) changed. That’s why we write node = node.ReplaceRecursive(...)
.
If you simply want to search for “everything that matches a certain pattern” and not change the syntax tree, you should still use the same ReplaceRecursive
function, but always return null
from it. If you want to avoid examining children of a particular node, simply return the same node you were given, which prevents children from being scanned without creating a new syntax tree:
node = node.ReplaceRecursive(expr => {
matchCode (expr) {
case { $type $method($(.._)) { $(.._); } }: // any normal method
Console.WriteLine("Method '{0}' returns '{1}'", method, type);
return expr; // prevent children from being scanned
}
return null;
});
Pattern matching using matchCode
The matchCode
macro provides pattern matching for a single LNode
. You can use $
to create variables that “capture” part of the syntax tree, or use $_
for parts you don’t care about. For example,
static Symbol GetLoopType(LNode code)
{
matchCode(code) {
case { while($_) $_; }: return CodeSymbols.While;
case { for($_; $_; $_) $_; }: return CodeSymbols.For;
case { do $_; while($_); }: return CodeSymbols.Do;
default: return null;
}
}
What’s a Symbol?
A Symbol
is a kind of singleton string. Many programming languages, including Ruby and Ecmascript 6, have a “symbol” concept, but since .NET doesn’t have a standard Symbol type, the Loyc libraries have their own. A WeakValueDicionary
is used to store “global” Symbols; when you do (Symbol) "mySymbol"
you are looking up an existing global symbol or creating a new one if it doesn’t exist yet.
The main advantage of Symbol
over string
is that, since Symbol
s are singletons, two Symbol
s never need to be compared for equality like strings do, which improves their performance. If two Symbol
references are equal then the symbols are equal, otherwise they are not equal; there is no need to compare the Symbol.Name
inside the two symbols, and in fact no operator==
is defined for Symbol
. I introduced Symbol
in an old CodeProject article.
Pattern-matching example #2: extract class name
Here’s a method that finds the name of a class, struct, or enum type:
static LNode GetName(LNode type)
{
matchCode(type) {
case { class $name : $(.._) { $(.._); } },
{ struct $name : $(.._) { $(.._); } },
{ enum $name : $(.._) { $(.._) } }:
return name;
default:
return null;
}
}
The capture $(.._)
contains the expression .._
, which consists of two parts:
..
which means “match any number of nodes (arguments or statements)”_
which means “discard the matching code”. If you don’t discard the result, it has typeVList<LNode>
.
So this thing:
class $name : $(.._) { $(.._); }
means “match a class definition with any number of base types, and any number of statements inside the braces”.
This demo shows the GetName
function in action:
public static void Demo()
{
using (LNode.PushPrinter(EcsLanguageService.WithPlainCSharpPrinter.Printer)) {
Console.WriteLine(GetName(quote {
public sealed class String : System.Object, IEnumerable<char> {}
}));
Console.WriteLine(GetName(quote {
public enum BinaryDigits { Zero, One }
}));
Console.WriteLine(GetName(quote {
public struct Point<T> {
public T X { get; set; }
public T Y { get; set; }
}
}));
}
}
The output should be
String;
BinaryDigits;
Point<T>;
By the way, if you’re reading this and thinking “this LeMP thing has some pretty impressive capabilities… why haven’t I heard of it before?” the answer is twofold:
- this is the first article I’ve written about LeMP’s new
matchCode
construct, and - I stopped working on LeMP for a couple of months because people showed very little interest in it. Currently if you Google “LeMP”, my original article about LeMP doesn’t show up in the top 10 search results, because nobody blogged about it so there ain’t no links anywhere about it. If you like LeMP, please say so, share it with your friends, blog about it, make a YouTube video… something!
Pattern-matching example 3: [notify]
You might want to look for a particular attribute on a particular construct. For example, let’s say you want to find a ‘notify’ attribute on a property, which will help implement the standard INotifyPropertyChanged
interface by calling a user-defined NotifyPropertyChanged
method. In other words, suppose we want to transform
public string CompanyName { get { return _companyName; } [notify] set; }
into this:
public string CompanyName {
get { return _companyName; }
set {
if (_companyName != null ? !_companyName.Equals(value) : value != null) {
_companyName = value;
NotifyPropertyChanged("CompanyName");
}
}
}
Here’s a method that detects a property of the expected form and returns a new one, or the same property if unchanged:
public static LNode MaybeTransformNotifyProperty(LNode input) {
matchCode (input) {
// Detect if this is a property with an empty setter, and grab its parts
case {
[$(..attrs)] $Type $Name {
[$(..getAttrs)] $getter;
[$(..setAttrs)] set;
}
}:
// Look for `[notify]` in `setAttrs`
LNode notify;
setAttrs = setAttrs.WithoutNodeNamed((Symbol) "notify", out notify);
if (notify == null)
return input;
// Support custom NotifyPropertyChanged method
LNode notifyMethod = quote(NotifyPropertyChanged);
matchCode(notify) {
case $_($(ref notifyMethod)): // do nothing
}
// Discover the field name
LNode fieldName;
matchCode (getter) {
case { get => $(ref fieldName); }, // C# 6 syntax?
{ get { $(.._); return $(ref fieldName); } }:
// do nothing
default:
return input; // fail
}
// Choose difference check
LNode changed;
matchCode (Type) {
case int, uint, byte, sbyte, short, ushort,
float, double, decimal, string:
changed = quote(value != $fieldName);
default:
changed = quote($fieldName != null ?
!$fieldName.Equals(value) : value != null);
}
// Extract property name and return output
string propNameString = Name.Name.Name;
return quote {
[$(..attrs)] public $Type $Name {
[$(..getAttrs)] $getter;
[$(..setAttrs)] set {
if ($changed) {
$fieldName = value;
$notifyMethod($(LNode.Literal(propNameString)));
}
};
};
};
default:
return null;
}
}
First, notice that the initial matchCode (input)
looks for a property but doesn’t directly match the [notify]
attribute we’re looking for. That’s because of a limitation of matchCode
in the current version: it is unable to do pattern matching on attributes, it can only grab the entire attribute list. Instead I call setAttrs.WithoutAttrNamed
to search for an attribute with a particular name and remove it from the attribute list, which we need to do anyway.
Second, what does this do?
LNode notifyMethod = quote(NotifyPropertyChanged);
matchCode(notify) {
case $_($(ref notifyMethod)): // do nothing
}
You can control the name of the function that should be called if the property changed by writing [notify(MethodName)]
. The default method is NotifyPropertyChanged
. In matchCode
, $(ref X)
tells matchCode
to assign the matching syntax tree to X
rather than to create a new variable (which would be scoped to the inside of the case
handler).
You do not need a break
statement at the end of each case
. In fact, as shown here, you don’t need any code inside a case at all!
Next, look at the matchCode (getter)
construct, which figures out the name of the backing field. Notice that case
can accepts multiple patterns, each one optionally enclosed in braces; the braces indicate that you want to match a statement rather than an expression. In truth, Loyc trees do not distinguish between the concepts of “statement” and “expression”; what the braces really do is tell the parser to expect “statement syntax” rather than “expression syntax”.
Choosing a difference-check is straightforward, but doesn’t actually work right:
// Choose difference check
LNode changed;
matchCode (Type) {
case int, uint, byte, sbyte, short, ushort,
float, double, decimal, string:
changed = quote(value != $fieldName);
default:
changed = quote($fieldName != null ?
!$fieldName.Equals(value) : value != null);
}
What we really want is to use the first check if the type has a meaningful !=
operator, and use the second check if it’s a reference type. However, LeMP doesn’t have a semantic analysis engine, so it has no idea whether Type
is a reference type or not. One way to solve this problem would be to somehow allow the programmer to signal what kind of equality check is desired, but I’ll leave that as an exercise.
This line is a little funny:
string propNameString = Name.Name.Name;
The property name is an LNode
called Name
, and it has a property called Name
which gets the identifier name if it is a simple identifier. The name is a Symbol
, not a string, but Symbol
has a Name
property that gets the string stored inside it.
This line is actually wrong, because it won’t work properly for explicit interface implementations like this:
T IFunky.FunkyProp<T> { get { return _funkyProp; } [notify] set; }
In this case Name
will hold the syntax tree for IFunky.FunkyProp<T>
. If we want to extract the name FunkyProp
from this, there is a method EcsValidators.KeyNameComponentOf(Name)
for doing this. You can use this method as follows:
string propNameString = EcsValidators.KeyNameComponentOf(Name).Name.Name;
Finally, notice how we call the notifyMethod
:
$notifyMethod($(LNode.Literal(propNameString)));
Remember that quote
assumes its inputs have type LNode
(or VList<LNode>
if you use $(..list)
), so I’ve used $(LNode.Literal(propNameString))
to convert the string into an LNode
by calling LNode.Literal()
.
Calling your method
Now that we have a method to transform a property, the final task is to find the properties. Again, this can be done with ReplaceRecursive
:
node = node.ReplaceRecursive(MaybeTransformNotifyProperty);
Exercises for you
First exercise: modify the code so that
notify public string CompanyName => _companyName;
produces the same output as the original statement,
public string CompanyName { get { return _companyName; } [notify] set; }
This time I’ve used notify
instead of [notify]
; this makes it a “custom word attribute”, similar to partial class
or yield return
: partial
and yield
are not keywords, but the C# compiler treats them as if they were. Similarly, Enhanced C# treats notify
as if it were a keyword attribute like public
or virtual
. The following two statements are equivalent:
notify public string CompanyName => _companyName;
[#notify, #public] string CompanyName => _companyName;
That is:
- Keyword attributes like
public
andsealed
are treated like normal attributes, just with#
prepended on the front. Note:#
is not considered to be an operator and it does not necessarily mark a preprocessor directive; it’s simply part of the identifier. Unless it’s a preprocessor directive like#region
, Enhanced C# treats#
like a normal identifier character, exactly like_
or a letter of the alphabet. - The word attribute
notify
is treated as a regular attribute called#notify
.
Knowing this, you should be able to complete the exercise.
Second exercise: in order to speed up the search function, modify it so that it detects methods and ignores their contents, by returning n
:
node = node.ReplaceRecursive(n => {
matchCode(n) {
// TODO: Detect method and return n
}
return MaybeTransformNotifyProperty(n);
});
The pattern to match an arbitrary method was shown earlier in this article. Test your new code on this syntax tree:
LNode node = quote {
void Nonsensical(int _y) {
public string Y { get { return _y; } [notify] set; }
}
public string Z { get { return _z; } [notify] set; }
};
When you print the output, you should see that Z
was modified but not Y
.
Matching patterns specified at run-time
matchCode
performs pattern-matching at compile-time — whenever you save your *.ecs file. You can also do pattern matching at run-time using the MatchesPattern
method. For instance, if you run this code:
LNode code = quote(this.foo(Math.PI * 2, bar + 1));
LNode pattern = rawQuote(this.$_($A, $B));
MMap<Symbol, LNode> captures;
if (code.MatchesPattern(pattern, out captures)) {
foreach (KeyValuePair<Symbol, LNode> p in captures)
Console.WriteLine("['{0}'] = {1}", p.Key,
EcsLanguageService.Value.Print(p.Value, null, ParsingService.Exprs));
} else
Console.WriteLine("DID NOT MATCH PATTERN");
Here I’ve used rawQuote
rather than a normal quote
, which causes the $
operator to be treated literally: $A
and $B
will become literal parts of the syntax tree (rawQuote
, unlike quote
, does not treat A
and B
as existing variables to insert into the tree.)
When you run this, the output is
['_'] = foo
['A'] = Math.PI * 2
['B'] = bar + 1
I’ve used
EcsLanguageService.Value.Print(pair.Value, null, ParsingService.Exprs)
to ensure that the node is printed as an expression, instead of the default printing mode which treats every node as a statement. As I mentioned earlier, the nodes themselves do not distinguish between statements and expressions; a node cannot tell you if it is a statement or an expression, so you have to tell the printing engine explicitly.
Note: MatchesPattern
uses a completely separate pattern-matching engine than matchCode
(akin to the difference between an interpreter and a compiler). One difference I noticed while writing this: MatchesPattern
actually captures _
rather than ignoring it; it probably shouldn’t do that. Please write a comment if you notice any other differences. By the way, MatchesPattern
is used by the replace
macro built into LeMP, which was described in the first article about LeMP.
Writing macros
Finally, you can use the skills you’ve learned here to write your own macro DLL that LeMP will load and use; this is more convenient than having to write and maintain your own Visual Studio extension. I’ll write an article about writing and using macros just as soon as someone asks for one.
Step 1 is as follows. Remember the MaybeTransformNotifyProperty
method from above? You can easily change this into a macro by adding a LexicalMacro
attribute and an IMacroContext
parameter, like this:
[LexicalMacro("public string Name { get { return _name; } [notify(NPC)] set; }",
"Generates code for INotifyPropertyChanged. This example will call "+
"NPC(\"CompanyName\") if the new value is different from the old value. "+
"The argument on the [notify] attribute is optional; if absent, the "+
"default method, `NotifyPropertyChanged`, is called.",
"#property", "notify", Mode = MacroMode.Passive | MacroMode.Normal)]
public static LNode MaybeTransformNotifyProperty(LNode input, IMacroContext context) {
// same as before
}
The first two strings in the attribute are documentation, and the third string "#property"
(which is actually a params string[]
) is the name of the node for which the macro function should be invoked. This function modifies properties, and it just so happens that properties are represented in a Loyc tree by the #property()
pseudo-function; therefore "#property"
causes this method to be called whenever LeMP encounters a property.
We’re not quite done yet, but whoops, would you look at the time! This article is getting pretty long, so I will end it now.
Learn more about LeMP!
You can learn about some other LeMP capabilities in the previous article, and I plan to write another article soon, this time about pattern matching with match
(as opposed to matchCode
which you saw in this article).
In addition to LeMP macros, you’ll find that the LNode
class has numerous useful methods for querying and modifying Loyc trees. See the LNode
class reference. You can search for any class or method you’ve seen in the search box above the Code Documentation, or browse through the Loyc namespace to find out what else is available.
Conclusion
LeMP is a useful code generation and code analysis tool. QED. Let me know if you have any questions, and what you’re doing with it!
Tip: when using LeMP, keep your Error List open. An error in your example.ecs
file will often lead to more errors in your example.out.cs
file, but unfortunately Visual Studio often puts errors from the *.out.cs
file first, so when diagnosing errors, you’ll have to look near the end of the error list first for any errors in your *.ecs
file.
Tip: Because there is no IntelliSense in ecs files, learn to love partial class
. You may find it useful to break up some of your classes into two parts, one in an .ecs file (so you can use LeMP macros), and another in a .cs file (so you can use IntelliSense).
Help wanted
I would like someone to help make a Roslyn back-end for LeMP.
P.S. a shout out to the srclib project. I wish I had time to implement the Visual Studio version!