Sunday, October 11, 2009

Type Systems for Xtext DSLs

I recently implemented a type system for an Xtext DSL. I wanted to give you a couple of pointers of how to do it, if you have to.

What is a type system? A type system is essentially a set of sophisticated constraints. I.e. it is a way of determining whether a model is correct beyond its structure. We all know type systems from programming languages. If you try to add an int and a String, you'll get a type error.

So, in principle, you could simply implement a type system as a set of Check constraints. This is true, but for non trivial type systems and languages, this can grow complex. A more principled approach is recommended, and it consists of three ingredients.

The type meta model: the structure of types and their relationship often warrants its own meta model. For example, an array type isn't trivial: It has to express that it is an Array, it has to contain the size, and it has to point to the type of the elements in the array. The array type itself is structured. Hence, it is a good idea to create a meta model, or language, to capture the types. In the context of Xtext, you can either do this as a separate EMF meta model that you reference from the Xtext DSL, or you simply add additional rules to the language that represent the types once they are transformed to Ecore. Make sure that the grammar structure does not allow you to actually "write down" the types in models - you want to work with them behind the scenes.

So how do you associate the type of an element with the element itself? A good way is simply to create a bunch of Xtend functions. I usually call it typeof(...) and define it polymorphically for all the language elements for which I want to have a type. These functions typically do one of two things: for atomic elements (e.g. an integer literal or an array declaration) the typeof(...) extension simply returns the type object, i.e. instances of the type meta model (or language) defined above. For non-atomic elements (let's say, a comparison operation that compares the result of a constant and a plus expression) the typeof function contains a type derivation rule. For example, for a plus expression, it returns the type of one of the operands (the types of the two operands have to be the same, see below). For a comparison
operation, it simply returns boolean.

Finally, once you have these two ingredients in place, you can now implement the actual type constraints. For example, it a type constraint might say that for a plus expression, the typeof(...) the left operand and the typeof(...) the right operand need to be the same. Note that by calling the typeof(...) function for both operands, the resulting type is calculated correctly even if the two operands are, for example, multiplication operations themselves.

One last comment: usually, types don't have to be the same for a constraint to be correct, but they have to be "compatible", whereas compatible is defined specficially for each type. To incorporate this problem, simply create an isCompatible(t1, t2) Xtend function, which you polymorphically override for all relevant type combinations.

That's it! Using this approach gives you a scalable, and maintainable type system implementation.

PS: MPS uses the same approach. It even comes with a separate DSL for defining type systems, including a set of operators for type inference and type compatibility. Very nice!
Helpful post! What about a full comparison between Eclipse Modeling (Xtext) and Jetbrains MPS?
I am thinking about it, maybe I'll find the time...

Cool, we did the same when adapting existing types from a SAP system to build arithmetic expressios. Did you reduce types to a limited set of kinds (e.g. number, string, boolean) when declaring types in your DSL, too?
I am not sure I understand what you mean. Yes, I have number, string, boolean as types, but the DSL currently does not have "different kinds of numbers" such as rational, int or real.
I see, so you don't allow new subranges of types in your DSL such as "type SmallInteger like Integer bounded by -128 and 127"? Instead you rely on static types that can be constructed by literals, right?

Would be helpful to see a screenshot of your editor including error markers. Any chance to update your post?
At this time, I cannot publish a screenshot since it's in a (currently still) proprietary DSL.

Post a Comment

<< Home

back to

This is Markus Voelter's Blog. It is not intended as a replacement for my regular web site, but rather as a companion that contains ideas, thoughts and loose ends.

December 2005 / January 2006 / February 2006 / March 2006 / April 2006 / May 2006 / June 2006 / July 2006 / August 2006 / September 2006 / October 2006 / November 2006 / December 2006 / February 2007 / March 2007 / April 2007 / May 2007 / June 2007 / July 2007 / September 2007 / October 2007 / November 2007 / December 2007 / January 2008 / February 2008 / March 2008 / April 2008 / May 2008 / June 2008 / July 2008 / August 2008 / September 2008 / October 2008 / November 2008 / December 2008 / January 2009 / February 2009 / March 2009 / April 2009 / May 2009 / June 2009 / July 2009 / August 2009 / September 2009 / October 2009 / November 2009 / December 2009 / January 2010 / February 2010 / April 2010 / May 2010 / June 2010 / July 2010 / August 2010 / September 2010 / October 2010 / November 2010 / December 2010 / January 2011 / March 2011 / April 2011 / May 2011 / June 2011 / July 2011 / October 2011 / November 2011 / December 2011 / January 2012 / February 2012 / October 2012 / January 2013 /

You can get an atom feed for this blog.