    Fix the handling of polymorphic variables in the lexer · 36b83c45
    Kim Nguyễn authored
    The previous solution, which used two lexers (depending on whether we are
    between square brackets or not), is too brittle: it crudely tries to parse
     ``( [whitespace] 'a [whitespace] )'' as a variable, forcing the user
    to write the variable between parentheses. This does not scale
    to types with two arguments (say, [ t ('a, 'b) ]).
    
    We now use a simpler heuristic (with lookahead):
    
    (1) try to see if the regular expression
    
    ' (anything but ', \n)* '(anything but the first letter of an identifier)
    
    can be found. If so, we put the lexeme back in the buffer and parse it as
    a string.
    
    (2) if (1) failed, try to parse it as a variable
    
    (3) if (2) failed, try to parse it again as a string. We are
    guaranteed to fail here but it means we have a malformed string, so we
    parse as a string to get a proper error message.
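
The three steps above can be sketched as follows. This is a minimal, hand-written illustration, not the code from ulexer.ml: the token type, the function names (`string_end`, `var_end`, `lex_quote`) and the ASCII-only definition of an identifier are all assumptions made for the example, and end of input after the closing quote is assumed to count as "not the first letter of an identifier".

```ocaml
(* Hypothetical token type for this sketch only. *)
type token =
  | Str of string        (* 'abc' : quoted string literal *)
  | Var of string        (* 'a    : polymorphic type variable *)
  | Malformed of string  (* a string literal that could not be closed *)

(* Assumption: identifiers start with an ASCII letter or underscore. *)
let is_ident_start c =
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c = '_'

let is_ident_char c = is_ident_start c || (c >= '0' && c <= '9')

(* Step (1): look ahead from the quote at index [i] for
   ' [^'\n]* ' followed by something that cannot start an identifier.
   Returns the index just past the closing quote on success. *)
let string_end s i =
  let n = String.length s in
  let rec scan j =
    if j >= n || s.[j] = '\n' then None
    else if s.[j] = '\'' then
      if j + 1 < n && is_ident_start s.[j + 1] then None else Some (j + 1)
    else scan (j + 1)
  in
  scan (i + 1)

(* Step (2): a quote immediately followed by an identifier.
   Returns the index just past the identifier on success. *)
let var_end s i =
  let n = String.length s in
  if i + 1 < n && is_ident_start s.[i + 1] then begin
    let j = ref (i + 2) in
    while !j < n && is_ident_char s.[!j] do incr j done;
    Some !j
  end else None

(* Lex the token starting at the quote s.[i]; returns the token and
   the index where lexing should resume. *)
let lex_quote s i =
  match string_end s i with
  | Some j -> (Str (String.sub s (i + 1) (j - i - 2)), j)
  | None ->
    match var_end s i with
    | Some j -> (Var (String.sub s (i + 1) (j - i - 1)), j)
    | None ->
      (* Step (3): neither worked, so report it as a bad string. *)
      (Malformed "unterminated or malformed string", String.length s)
```

With these definitions, `lex_quote "('a, 'b)" 1` yields a variable rather than the string `'a, '`, because the candidate closing quote is directly followed by the identifier character `b`.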
    
    The only case this does not cover is input like
    type t = [ 'abcd'Int ]
    which was previously tokenized as [, 'abcd', Int, ]
    and is now tokenized as [, 'abcd, 'Int, ]
    This does not seem to be a problem in practice (in the code
    I have seen thus far, people put at least a space there).
    It is easy to emit a warning in this case, suggesting that the user add
    whitespace to get the old behaviour back.
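
One possible shape for such a warning check is sketched below. It is an assumption about how the check could look, not something taken from the commit: the helper name `ambiguous_after_var` is hypothetical, and identifiers are assumed to start with an ASCII letter or underscore. The idea is that, right after a variable token has been lexed, a quote immediately followed by an identifier character suggests the user may have meant a string followed by a type name.

```ocaml
(* Assumption: identifiers start with an ASCII letter or underscore. *)
let is_ident_start c =
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c = '_'

(* [stop] is the index just past a variable token in [s].
   Returns true when the variable is directly followed by a quote and an
   identifier start, as in 'abcd'Int, where a space after the second
   quote would restore the old string-then-identifier reading. *)
let ambiguous_after_var s stop =
  stop + 1 < String.length s
  && s.[stop] = '\''
  && is_ident_start s.[stop + 1]
```

A lexer could call this after producing a variable token and, when it returns true, print a message suggesting that whitespace be added after the closing quote.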
Name
ast.ml
cduce_curl.ml
cduce_loc.ml
cduce_loc.mli
cduce_netclient.ml
parser.ml
parser.mli
ulexer.ml
ulexer.mli
url.ml
url.mli
wlexer.mll