Strange error messages

This forum is to report technical problems with Rel.
HughDarwen
Posts: 124
Joined: Sat May 24, 2008 4:49 pm

Strange error messages

Post by HughDarwen »

I've been getting some strange-looking error messages from V3.12. E.g. (in DBrowser):

(2+1))
ERROR: Encountered " ")" ") "" at line 1, column 6.

In general, the word "Encountered" seems to be followed by a repetition of the rogue token. Here's another example:

tuple tuple { }
ERROR: Encountered " "TUPLE" "tuple "" at line 1, column 7.

Notice that the upper-case version is apparently given as the encountered token, even though I had typed it in lower case.

Hugh
HughDarwen
Posts: 124
Joined: Sat May 24, 2008 4:49 pm

Re: Strange error messages

Post by HughDarwen »

Here's another:

tupple { c 'abc' }
ERROR: Encountered " <STRING_LITERAL> "\'abc\' "" at line 1, column 12.
Was expecting one of:
"}" ...
"," ...

This kind of thing seems to happen whenever a key word is misspelled. The message treats some subsequent token as the rogue one. Another example:

ootput 1+2 ;
ERROR: Encountered " <INTEGER_LITERAL> "1 "" at line 1, column 8.
Was expecting:
":" ...

Here it seems that "ootput" is assumed to be intended as a statement label!

Hugh
Dave
Site Admin
Posts: 372
Joined: Sun Nov 27, 2005 7:19 pm

Re: Strange error messages

Post by Dave »

The Rel parser is generated by the JavaCC parser generator (https://javacc.dev.java.net/). I'm using the beta 4.1 version, because it fixes some bugs in the previous versions. Unfortunately, it also changes the parse error messages. Previously, upon encountering a parse error, it would just print the associated token value. Now, it prints the internal token name (used in the grammar definition) followed by the token value. The token value is surrounded by double-quotes, and the token name and value together are surrounded by another set of double-quotes. I presume this is intended to make parse errors easier to parse by external tools.

That's why you get rather unreadable output like " <INTEGER_LITERAL> "1 "". It's also why you get apparent duplication like " ")" ") "" or " "TUPLE" "tuple "". In the former case, the internal token name is ")" and the token value is ") ". In the latter case, the token name is TUPLE and the token value is "tuple".

Facilities built into JavaCC can be used to extract the individual components out of parse errors, which means I should be able to construct more readable error messages.
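As a rough illustration of what a more readable message could look like, here is a minimal Java sketch. It does not use JavaCC's own facilities. It simply post-processes the message text with a regular expression, assuming the layout seen in the examples quoted in this thread (token name, then the quoted token value, then the line and column). The class name ParseErrorFormatter, the method simplify, and the output wording are all hypothetical and are not part of Rel or JavaCC.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Rel or JavaCC.
class ParseErrorFormatter {
    // Assumed layout: Encountered " <name> "<value> "" at line N, column M.
    // Group 1 = token name, group 2 = token value, groups 3 and 4 = position.
    private static final Pattern ENCOUNTERED = Pattern.compile(
        "Encountered \" (\\S+) \"(.*) \"\" at line (\\d+), column (\\d+)\\.");

    // Returns a shorter message, or the input unchanged if the layout is not recognised.
    static String simplify(String rawMessage) {
        Matcher m = ENCOUNTERED.matcher(rawMessage);
        if (!m.find()) {
            return rawMessage;
        }
        // Undo the backslash-escaping of single quotes seen in the string literal example.
        String value = m.group(2).replace("\\'", "'");
        return "Unexpected token: " + value
            + " (line " + m.group(3) + ", column " + m.group(4) + ")";
    }

    public static void main(String[] args) {
        String raw = "ERROR: Encountered \" \"TUPLE\" \"tuple \"\" at line 1, column 7.";
        // Prints: Unexpected token: tuple (line 1, column 7)
        System.out.println(simplify(raw));
    }
}
```

Note that this sketch drops the "Was expecting" list that follows some messages. A real fix would more likely read the token and position directly from the exception object that the generated parser throws, rather than re-parsing its text.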

As for errors where a token is misspelled at the start of a statement, the results are understandably odd but make sense in the context of the grammar:

If a token is not recognised, the parser assumes the token must be an identifier. Hence, the parser thinks "tupple { c 'abc' }" is an attempt to perform a projection on a relvar named 'tupple', using an attribute list that starts with an attribute named 'c' followed by the string literal 'abc'. Of course, a string literal doesn't make grammatical sense in an attribute list. So, the parser complains about an unexpected string literal, because everything up to that point is legitimate Tutorial D! In this case, I might be able to produce a more meaningful error message by checking to see if the identifier exists (e.g., is a relvar) during the parsing phase.

Similarly, when the parser encounters "ootput 1+2 ;", the only place where an identifier can appear in that context is as a statement label (variable assignment is already excluded via a lookahead), so the parser complains when it doesn't find the required colon after the statement label "ootput".