When creating a grammar with Xtext, we sometimes stumble upon the issue of ID vs. keyword:

Given the Xtext Grammar

Entity: "entity" name=ID "{"
"}";

it will work perfectly fine to have a model like

entity person {
}

but once you try a model like

entity entity {
}

it will result in the following parse error

mismatched input 'entity' expecting RULE_ID

at the place of the entity’s name.

What goes wrong here

Xtext uses ANTLR v3 under the hood, and the ANTLR lexer splits the whole document into a sequence of tokens, independent of context, before the parser even runs. In our case the token list could look something like

Keyword "entity", WS, Keyword "entity", WS, Keyword "{", WS, Keyword "}"

so that the parser will see the keyword “entity” instead of the ID terminal that is expected according to the grammar.
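
To make the context-free behaviour concrete, here is a minimal Python sketch of such a lexer. It is not the lexer that Xtext or ANTLR generates; the names `KEYWORDS`, `TOKEN_RE` and `tokenize` are made up for illustration, and whitespace is simply dropped, on the assumption that WS is a hidden token the parser never sees.

```python
import re

# Keywords taken from the string literals in the example grammar.
KEYWORDS = ("entity", "{", "}")

# Whitespace, braces, or an identifier-like word (illustrative pattern only).
TOKEN_RE = re.compile(r"(?P<WS>\s+)|(?P<BRACE>[{}])|(?P<WORD>[A-Za-z_]\w*)")


def tokenize(text):
    """Split text into (kind, value) pairs without looking at any context."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup == "WS":
            continue  # assumption: whitespace is hidden from the parser
        value = match.group()
        # The keyword check wins over ID, no matter where the word appears.
        kind = "KEYWORD" if value in KEYWORDS else "ID"
        tokens.append((kind, value))
    return tokens


print(tokenize("entity person {\n}"))
print(tokenize("entity entity {\n}"))
```

In the second call both occurrences of `entity` come out as keyword tokens, because the lexer has no way of knowing that the second one sits where a name is expected.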

Datatype rules to the rescue

Xtext supports not only parser rules and lexer (terminal) rules, but also a special kind of parser rule: the datatype rule. Datatype rules are processed by the parser, but like terminal rules they produce values of data types such as INT or STRING, not EObjects as normal parser rules do.

Here is what our grammar could look like using datatype rules.

Entity: "entity" name=MyID "{"
"}";

MyID: ID | "entity";

Now the parser can also handle the “entity” keyword in the place of the name attribute and thus accepts our model.
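
The effect of the MyID rule can be sketched in a few lines of Python. This is a hand-written illustration, not what Xtext generates: the function `parse_entity`, its flag `allow_keyword_as_name`, and the `(kind, text)` token pairs are all assumptions made for the example. The flag switches between the original grammar (name=ID) and the datatype-rule version (name=MyID).

```python
def parse_entity(tokens, allow_keyword_as_name):
    """Parse `"entity" name "{" "}"` from (kind, text) pairs; return the name."""

    def expect_keyword(index, text):
        if index >= len(tokens) or tokens[index] != ("KEYWORD", text):
            raise SyntaxError(f"expected keyword {text!r} at token {index}")

    expect_keyword(0, "entity")
    if len(tokens) < 2:
        raise SyntaxError("missing entity name")

    kind, value = tokens[1]
    is_id = kind == "ID"
    is_entity_keyword = (kind, value) == ("KEYWORD", "entity")
    # name=ID accepts only ID tokens; name=MyID also accepts the keyword.
    if not (is_id or (allow_keyword_as_name and is_entity_keyword)):
        raise SyntaxError(f"mismatched input {value!r} expecting RULE_ID")

    expect_keyword(2, "{")
    expect_keyword(3, "}")
    return value


# Token stream for the model "entity entity { }", as a context-free lexer
# would produce it (written out by hand here).
tokens = [
    ("KEYWORD", "entity"),
    ("KEYWORD", "entity"),
    ("KEYWORD", "{"),
    ("KEYWORD", "}"),
]

print(parse_entity(tokens, allow_keyword_as_name=True))
```

With `allow_keyword_as_name=False` the same token list raises a `SyntaxError`, mirroring the parse error from the original grammar. Note that the token stream itself is identical in both cases; only the parser's set of accepted tokens at the name position changes.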