parser/ulexer.ml · 07bc7bc14d48c9dbe8482c589fff1b0e0570f303 · cduce / cduce

Implement the syntax ``t (t1, …, tn)'' for type instantiation, without · 07bc7bc1

Kim Nguyễn authored Feb 15, 2015

requiring an abscence of whitespace between ``t'' and ``(''.  Outside
of regular expression contexts, ``t (t1, …, tn)'' is parsed with a
higher precedence than & and \, to allow one to write
``t (Int) & Bool'' without extra parentheses (i.e.
``(t (Int)) & Bool'').  Inside a regular expression, type
instantiation and sequencing become ambiguous, and there is no way to
distinguish syntactically: ``[ Int (Int, Int) ]'' from
``[ t (Int, Int) ]''. The former should resolve to a sequence while
the latter only makes sense as an instantiation (if ``t'' is a
parametric type). Both are treated as element sequencing and
disambiguated during identifier resolution (more precisely during the
"derecurse" phase, before typechecking).

Note that due to the lower precedence of sequencing w.r.t to other
regular expression constructs, a type ``[ t (Int)* ]'' will be parsed
correctly, but yield an error message saying that t is not fully
intantiated. One has to write ``[ (t (Int))* ]'' which is similar to
function applications for expressions.

Finally, we also re-order sequencing after typing to always group a
potential type instantiation with its argument, i.e. we transform
sequences such as
``[ t1 t2 t3 ... tn ]'' (which are parsed as
``[ (((t1 t2) t3) ... tn) ]'' because sequence concatenation is
left-associative) into ``[ ... (ti tj) ... ]'' if ``ti'' is an
identifier and ``tj'' is of the form ``(s1,...,sk)''. This is sound
because concatenation of regular expression is associative (and the
original sequence would fail, anyway).

07bc7bc1