Kim Nguyễn authored
to use two lexers (depending on whether we are between square brackets or not) is too brittle: it crudely tries to parse ``( [whitespace] 'a [whitespace] )'' as a variable, to force the user to write the variable between parentheses. However, this does not scale to types with two arguments (say [ t ('a, 'b) ]).

We use a simpler heuristic (with lookahead):
(1) Try to see if the regular expression
      ' (anything but ', \n)* ' (anything but the first letter of an identifier)
    can be found. If so, we put the lexeme back in the buffer and parse it as a string.
(2) If (1) failed, try to parse it as a variable.
(3) If (2) failed, try to parse it again as a string. We are guaranteed to fail here, but it means we have a malformed string, so we parse it as a string to get a proper error message.

The only thing this does not cover are cases like
    type t = [ 'abcd'Int ]
which was tokenized before as [, 'abcd', Int, ] and is now tokenized as [, 'abcd, 'Int, ]. It does not seem to be a problem in practice though (in the code I have seen thus far, people were at least putting a space). It is easy to emit a warning in this case, suggesting that the user add whitespace to get the old behaviour back.
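The three-step heuristic can be illustrated with a small standalone sketch. This is not the lexer code from this commit: the names `quote_token`, `is_ident_start`, `is_ident_char` and `classify` are hypothetical, identifiers are assumed to be ASCII letters, digits and underscores, and end of input after the closing quote is assumed to count as "not the first letter of an identifier".

```ocaml
(* Hypothetical token type, for illustration only. *)
type quote_token =
  | String_lit of string   (* step (1): contents between the two quotes *)
  | Type_var of string     (* step (2): identifier following the quote *)
  | Malformed_string       (* step (3): left to the string lexer to report *)

(* Assumption: identifiers start with an ASCII letter or underscore. *)
let is_ident_start c =
  match c with
  | 'a' .. 'z' | 'A' .. 'Z' | '_' -> true
  | _ -> false

let is_ident_char c =
  is_ident_start c || (match c with '0' .. '9' -> true | _ -> false)

(* Classify the token starting at s.[pos], which must be a single quote.
   Returns the token and the position just after it. *)
let classify s pos =
  let n = String.length s in
  assert (pos < n && s.[pos] = '\'');
  (* Advance from i while the predicate p holds. *)
  let rec scan i p = if i < n && p s.[i] then scan (i + 1) p else i in
  (* Step (1): look ahead for ' (anything but ', \n)* ' not followed by
     the first letter of an identifier. *)
  let close = scan (pos + 1) (fun c -> c <> '\'' && c <> '\n') in
  let closed = close < n && s.[close] = '\'' in
  let followed_ok = close + 1 >= n || not (is_ident_start s.[close + 1]) in
  if closed && followed_ok then
    (String_lit (String.sub s (pos + 1) (close - pos - 1)), close + 1)
  else if pos + 1 < n && is_ident_start s.[pos + 1] then
    (* Step (2): the lookahead failed, so read a variable. *)
    let stop = scan (pos + 2) is_ident_char in
    (Type_var (String.sub s (pos + 1) (stop - pos - 1)), stop)
  else
    (* Step (3): neither worked; a real lexer would re-enter the string
       rule here to produce a proper error message. *)
    (Malformed_string, pos + 1)
```

Under these assumptions, `classify "'abcd' Int" 0` yields a string, while `classify "'abcd'Int" 0` yields the variable `'abcd` and a second call at position 5 yields the variable `'Int`, which is the behaviour change described above.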
36b83c45