t_2$). Therefore, we can exclude from $t_1$ all the functions that, when applied to an argument
in $t_2^+$, either diverge or yield a result not in $t_1\!\circ t_2^+$.
Both of these things can be obtained simply by removing from $t_1$ the functions in
$(t_2^+\to\neg(t_1\!\circ t_2^+))$. Therefore, for what concerns $e_1$, we can refine its type in the ``then'' branch as
$t_1^+=t_1\setminus(t_2^+\to\neg(t_1\!\circ t_2^+))$. That this removes the functions that, applied to $t_2^+$ arguments, yield results not in $t_1\!\circ t_2^+$ should be pretty obvious. That this also removes functions diverging on $t_2^+$ arguments is subtler and deserves some
explanation. In particular, the interpretation of a type $t\to s$ is the set
of all functions that, when applied to an argument of type $t$,
either diverge or return a value in $s$. As such the interpretation of $t\to s$ contains
all the functions that diverge (at least) on $t$. Therefore removing $t\to s$ from a type $u$ removes from $u$ not only all the functions that when applied to a $t$ argument return a result in $s$, but also all the functions that diverge on $t$.
Ergo $t_1\setminus(t_2^+\to
\neg(t_1\!\circ t_2^+))$ removes, among others, all functions in $t_1$ that diverge on $t_2^+$.
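One way to see this last point, using \Empty{} to denote the empty type (the type that contains no value), is the following: a function that diverges on every argument in $t$ belongs, by the interpretation above, to $t\to\Empty$, and since \Empty{} is a subtype of every type we have
\[
  t\to\Empty \ \leq\ t\to s \qquad\text{for every type } s,
\]
so that every such function is necessarily among those discarded when $t\to s$ is removed from $u$.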
Let us show all of this on our example \eqref{exptre}; in particular, let us show how by this technique we deduce that the type of $x_1$ in the ``then'' branch is (a subtype of) $\Int{\vee}\String\to\Int$. Take the static type of $x_1$, that is, $(\Int{\vee}\String\to\Int)\vee(\Bool{\vee}\String\to\Bool)$, and remove from it the type
$(t_2^+\to\neg(t_1\!\circ t_2^+))$, that is, intersect it with $\neg(\String\to\neg\Int)$. Since intersection distributes over unions,
what we obtain is a strict subtype of $\Int{\vee}\String\to\Int$, namely, the functions of type $\Int{\vee}\String\to\Int$ minus those
that diverge on all \String{} arguments.
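Spelled out, the computation goes as follows; the step marked $\simeq$ holds because $\Bool{\vee}\String\to\Bool$ is a subtype of $\String\to\neg\Int$ (a function of the former, applied to a \String{} argument, either diverges or returns a Boolean, hence a value not in \Int), so the corresponding summand is empty:
\begin{align*}
  & \big((\Int{\vee}\String\to\Int)\vee(\Bool{\vee}\String\to\Bool)\big)\setminus(\String\to\neg\Int)\\
  = {} & \big((\Int{\vee}\String\to\Int)\wedge\neg(\String\to\neg\Int)\big)\vee\big((\Bool{\vee}\String\to\Bool)\wedge\neg(\String\to\neg\Int)\big)\\
  \simeq {} & (\Int{\vee}\String\to\Int)\wedge\neg(\String\to\neg\Int)\ \leq\ \Int{\vee}\String\to\Int
\end{align*}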
This is essentially what we formalize in Section~\ref{sec:language}, in the type system by the rule \Rule{PAppL} and in the typing algorithm with the case \eqref{uno} of the definition of the function \constrf.
In the previous section we outlined the main ideas of our approach to occurrence typing. However, the devil is in the details. So the formalization we give in Section~\ref{sec:language} is not as smooth as what we just outlined: we must introduce several auxiliary definitions to handle some corner cases. This section presents, through small examples, the main technical difficulties we had to overcome and the definitions we introduced to handle them. As such, it provides a kind of road-map for the technicalities of Section~\ref{sec:language}.
\paragraph{Typing occurrences} As it should be clear by now, not only variables but also generic expressions are given different types in the ``then'' and ``else'' branches of type tests. For instance, in \eqref{two} the expression $x_1x_2$ has type \Int{} in the positive branch and type \Bool{} in the negative one. In this specific case it is possible to deduce these typings from the specialized types of the single variables (in particular, thanks to the fact that $x_2$ has type \Int{} in the positive branch and \Bool{} in the negative one), but this is not possible in general. For instance, consider $x_1:\Int\to(\Int\vee\Bool)$, $x_2:\Int$, and the expression
\begin{equation}\label{twobis}
\ifty{x_1x_2}{\Int}{...x_1x_2...}{...x_1x_2...}
\end{equation}
It is not possible to specialize the type of the variables in the
branches. Nevertheless, we want to be able to deduce that $x_1x_2$ has
type \Int{} in the positive branch and type \Bool{} in the negative
one. In order to do so, in Section~\ref{sec:language} we will use
special type environments that map not only variables but also generic
...
...
assumptions are not only about variables but also about
expressions. Two corner cases are particularly difficult. The first is
shown by the following example
\begin{equation}\label{bistwo}
\ifty{e(42)}{\Bool}{e}{...}
\end{equation}
If $e$ is an expression of type $\Int\to t$, then, as discussed before,
the positive branch will have type $(\Int\to t)\setminus(\Int\to\neg
\Bool)$. If furthermore the negative branch is of the same type (or of a subtype), then this
will also be the type of the whole expression in \eqref{bistwo}. Now imagine
that the application $e(42)$ reduces to a Boolean value, then the whole
expression in \eqref{bistwo} reduces to $e$; but this has type
$\Int\to t$ which, in general, is \emph{not} a subtype of $(\Int\to
t)\setminus(\Int\to\neg\Bool)$, and therefore the type is not preserved by the reduction. To cope with this problem, in
Section~\ref{sec:language} we resort to \emph{type schemes}, a
technique introduced by~\citet{Frisch2008} to type expressions by
sets of types, so that the expression in \eqref{bistwo} will have both the types at issue.
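To see why $\Int\to t$ is not, in general, a subtype of $(\Int\to t)\setminus(\Int\to\neg\Bool)$, observe that a function that diverges on every integer argument has, by the interpretation of arrow types recalled above, both type $\Int\to t$ and type $\Int\to\neg\Bool$; it therefore belongs to $\Int\to t$ but not to the difference:
\[
  \Int\to t \ \not\leq\ (\Int\to t)\setminus(\Int\to\neg\Bool).
\]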
The second corner case is a modification of the example above
where the positive branch is $e(42)$, e.g.,
$\ifty{e(42)}{\Bool}{e(42)}{\textsf{true}}$. In this case the type deduced for the
whole expression is \Bool, while after reduction we would obtain
the expression $e(42)$ which is not of type \Bool{} but of type $t$ (even though it will
eventually reduce to a \Bool). This problem will be handled in the
proof of type preservation by considering parallel reductions (e.g., if
$e(42)$ reduces in one step to, say, $\textsf{false}$, then
$\ifty{e(42)}{\Bool}{e(42)}{\textsf{true}}$ reduces in one step to
$\ifty{\textsf{false}}{\Bool}{\textsf{false}}{\textsf{true}}$): see Appendix~\ref{app:parallel}.
\paragraph{Nested Checks} The last class of technical problems arises
from the nesting of type checks. In particular, there are two cases
that pose a problem. The first can be shown by this example
our technique deduces for the positive branch $e$ that both $x$ and
$y$ have type $\Int\vee\Bool$. But if inside this branch $e$ there is a test such as
$\ifty{x}{\Int}{(y+x)}{(\neg y)}$, then we want to
consider this test well-typed, that is, we want to deduce that $y$ is of type
$\Int$ in the positive branch of this test and of type \Bool{} in the negative
one. To be able to deduce that, we need to produce type environments
using all the available hypotheses, even when these hypotheses are formulated later in the flow of control. In the example, when we try to type $y+x$, we want to use the
hypothesis that the test $x\in\Int$ succeeded not only locally, but also to
refine the information we can get by supposing that the test in
\eqref{nest1} succeeded, thus deducing that $(x,y)$ will be of type $\Int\times\Int$. This will be done in the type systems of
Section~\ref{sec:language} by the rule \Rule{Path} and will require, at the
algorithmic level, the computation of a fixpoint of a function, or of
an approximation thereof.
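As an illustration, suppose (purely as a hypothetical instance, chosen to make the computation concrete) that the type tested on the pair $(x,y)$ in \eqref{nest1} is $(\Int\times\Int)\vee(\Bool\times\Bool)$. Combining the hypothesis that this test succeeded with the hypothesis $x\in\Int$ of the inner test then boils down to computing
\[
  \big((\Int\times\Int)\vee(\Bool\times\Bool)\big)\wedge\big(\Int\times(\Int{\vee}\Bool)\big)\ \simeq\ \Int\times\Int,
\]
from which the type \Int{} for $y$ in the positive branch of the inner test follows.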
Finally, a nested check may help refine not only the types but also the type assumptions on some expressions. For instance, when typing the positive branch $e$ of
...
retyping the expression without that assumption (see rule
\Rule{Env\Aa} in Section~\ref{sec:algorules}).
\subsubsection*{Outline} Our presentation proceeds as follows.
For space reasons, several technical definitions and all the proofs are omitted from this presentation; they can be found in the Appendix, available online.
...
As explained in Section~\ref{sec:challenges}, we need to be able to
deduce for, say, the function $\lambda^{\Int\to\Int} x.x$ a type such
as $(\Int\to\Int)\wedge\neg(\Bool\to\Bool)$ (in particular, if this is
...
...
should be enough to capture all realistic cases