Type Systems for Xtext DSLs
I recently implemented a type system for an
Xtext DSL. I wanted to give you a couple of pointers of how to do it, if you have to.
What is a type system? A type system is essentially a set of sophisticated constraints. I.e. it is a way of determining whether a model is correct beyond its structure. We all know type systems from programming languages. If you try to add an int and a String, you'll get a type error.
So, in principle, you could simply implement a type system as a set of Check constraints. This is true, but for non trivial type systems and languages, this can grow complex. A more principled approach is recommended, and it consists of three ingredients.
The type meta model: the structure of types and their relationship often warrants its own meta model. For example, an array type isn't trivial: It has to express that it is an Array, it has to contain the size, and it has to point to the type of the elements in the array. The array type itself is structured. Hence, it is a good idea to create a meta model, or language, to capture the types. In the context of Xtext, you can either do this as a separate EMF meta model that you reference from the Xtext DSL, or you simply add additional rules to the language that represent the types once they are transformed to Ecore. Make sure that the grammar structure does not allow you to actually "write down" the types in models - you want to work with them behind the scenes.
So how do you associate the type of an element with the element itself? A good way is simply to create a bunch of Xtend functions. I usually call it typeof(...) and define it polymorphically for all the language elements for which I want to have a type. These functions typically do one of two things: for atomic elements (e.g. an integer literal or an array declaration) the typeof(...) extension simply returns the type object, i.e. instances of the type meta model (or language) defined above. For non-atomic elements (let's say, a comparison operation that compares the result of a constant and a plus expression) the typeof function contains a type derivation rule. For example, for a plus expression, it returns the type of one of the operands (the types of the two operands have to be the same, see below). For a comparison
operation, it simply returns boolean.
Finally, once you have these two ingredients in place, you can now implement the actual type constraints. For example, it a type constraint might say that for a plus expression, the typeof(...) the left operand and the typeof(...) the right operand need to be the same. Note that by calling the typeof(...) function for both operands, the resulting type is calculated correctly even if the two operands are, for example, multiplication operations themselves.
One last comment: usually, types don't have to be the same for a constraint to be correct, but they have to be "compatible", whereas compatible is defined specficially for each type. To incorporate this problem, simply create an isCompatible(t1, t2) Xtend function, which you polymorphically override for all relevant type combinations.
That's it! Using this approach gives you a scalable, and maintainable type system implementation.
PS:
MPS uses the same approach. It even comes with a separate DSL for defining type systems, including a set of operators for type inference and type compatibility. Very nice!