\code{number|string|void}, since they track when operations may yield \code{void} results. Considering this would be easy but would also clutter our presentation, which is why we omit

such details.} But it is clear that a more precise type would be

one that states that \code{foo} returns a number when it is applied to

a number and it returns a string when it is applied to a string, so that the type deduced for, say, \code{foo(42)} would be \code{number} rather than \code{number|string}. This is

exactly what the intersection type \code{(number$\to$number) \&

(string$\to$string)} states (intuitively, an expression has an intersection type, denoted \code{\&}, if and only if it has all the types of the intersection) and corresponds in Flow to declaring \code{foo} as follows:

\begin{alltt}\color{darkblue}

...

...

typing approach, even though they are used only at meta-theoretic

level,\footnote{At the time of writing there is a pending pull

request to add negation types to the syntax of TypeScript, but that is all.} in

particular to determine the type environment when the type case

fails. We already saw negation types at work when we informally typed the ``else''

branch in \code{foo}, for which we assumed that $x$ did \emph{not} have

type \code{number}---i.e., it had the (negation) type \code{$\neg$number}---and

deduced from it that $x$ then had type \code{string}---i.e.,

\code{(number|string)\&$\neg$number}, which is equivalent to the set-theoretic difference

\code{(number|string)\textbackslash\,number} and, thus, to \code{string}.
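This else-branch reasoning is directly observable in TypeScript, which performs the same subtraction when narrowing a \code{typeof} test; a small runnable sketch (the function body is ours, purely for illustration):

```typescript
// In the "then" branch x is narrowed to number; in the "else" branch the
// checker assigns x the type (number|string) minus number, i.e., string.
function foo(x: number | string): number | string {
  if (typeof x === "number") {
    return x + 1;        // x : number here
  } else {
    return x.trim();     // x : string here, so .trim() type-checks
  }
}

console.log(foo(41), foo("  hi  "));
```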

The approaches cited above essentially focus on refining the type

of variables that occur in an expression whose type is being tested. They

do so when the variable occurs at top level in the test (i.e., the variable is the
expression being tested) or in some specific positions, such as in

nested pairs or at the end of a path of selectors. \beppe{Not precise,

please check which are the cases that are handled in the cited

papers}. In this work we aim at removing this limitation on the

contexts and develop a general theory to refine the type of variables

that occur in tested expressions under generic contexts, such as variables occurring on the

left or right expression of an application. In other words, we aim at establishing a formal framework to

...

...

We focus our study on conditionals that test types and consider the following syntax (e.g.,

\(

\ifty{x}{\Int}{x+1}{(\textsf{trim } x)}

\)).

In particular, in this introduction we concentrate on applications, since they constitute the most difficult case and many other cases can be reduced to them. A typical example is the expression

\begin{equation}\label{typical}

\ifty{x_1x_2}{t}{e_1}{e_2}

\end{equation}

where the $x_i$'s denote variables, $t$ is some type, and the $e_i$'s are generic expressions.

Depending on the actual $t$ and the static types of $x_1$ and $x_2$, we
can make type assumptions for $x_1$, for $x_2$, \emph{and} for the application $x_1x_2$

when typing $e_1$ that are different from those we can make when typing

$e_2$. For instance, suppose $x_1$ is bound to the function \code{foo} defined in \eqref{foo2}. Thus $x_1$ has type $(\Int\to\Int)\wedge(\String\to\String)$ (we use the type syntax of Section~\ref{sec:language}, where unions and intersections are denoted by $\vee$ and $\wedge$).
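In TypeScript such an intersection of arrow types can be approximated with overload signatures, so that the application \code{foo(42)} is given type \code{number} rather than \code{number|string}; a sketch (the implementation body is ours, for illustration):

```typescript
// Overload signatures approximating (number -> number) & (string -> string):
// the call foo(42) is then typed number, and foo(" ok ") is typed string.
function foo(x: number): number;
function foo(x: string): string;
function foo(x: number | string): number | string {
  return typeof x === "number" ? x + 1 : x.trim();
}

const n: number = foo(42);      // accepted: foo(42) has type number
const s: string = foo(" ok ");  // accepted: foo(" ok ") has type string
console.log(n, s);
```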

Then it is not hard to see that the expression\footnote{Most of the following expressions are given just for the sake of example. Determining the type of expressions other than variables is interesting for constructor expressions but less so for destructors such as applications, projections, and selections: no reasonable programmer would repeat the same application twice; they would rather store its result in a variable. This becomes meaningful when we introduce constructors such as pairs, as we do for instance in the expression in~\eqref{pair}.}

%

\begin{equation}\label{mezzo}

...

...

$x_1x_2$ is of type $\Bool$.

Let us recap. If $e$ is an expression of type $t_0$ and we are trying to type

%

\(\ifty{e}{t}{e_1}{e_2}\),

%

then we can assume that $e$ has type $t_0\wedge t$ when typing $e_1$

and type $t_0\setminus t$ when typing $e_2$. If furthermore $e$ is of

...

...

third, we also know the type of the application, that is, the type

that denotes all the values that may result from the application of a

function in $t_1$ to an argument in $t_2$, a type that we denote by

$t_1\circ t_2$. For instance, if $t_1=\Int\to\Bool$ and $t_2=\Int$,

then $\dom{t_1}=\Int$ and $t_1\circ t_2=\Bool$. Notice that introducing operations such as $\dom{}$ and $\circ$ is redundant when working with simple types, but becomes necessary in the presence of set-theoretic types. If for instance $t_1$ is the type of \eqref{foo2}, that is, \(t_1=(\Int{\to}\Int)\wedge(\String{\to}\String)\),
%---which denotes functions that return an integer if applied to an integer and a string if applied to a string---
then

\(\dom{t_1}=\Int\vee\String\),

that is, the union of all the possible

input types, while the precise return type of such a

...

...

that $e_1e_2$ has type $t$ succeeds or fails. Let us start with refining the type of $e_2$ for the case in
which the test succeeds. Intuitively, we want to remove from $t_2$ all

the values for which the application will surely return a result not

in $t$, thus making the test fail. Consider $t_1$ and let $s$ be the

largest subtype of $\dom{t_1}$ such that\vspace{-1.29mm}

\begin{equation}\label{eq1}

t_1\circ s\leq\neg t

\end{equation}
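For a concrete illustration (a worked instance we add here): take \(t_1=(\Int{\to}\Int)\wedge(\String{\to}\String)\) and \(t=\Int\); then the largest such $s$ is
\[
s=\String, \qquad\text{since}\qquad t_1\circ\String=\String\leq\neg\Int,
\]
while any larger subtype of $\dom{t_1}=\Int\vee\String$ would contain integer arguments, for which $t_1$ yields results in $\Int\not\leq\neg\Int$.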

...

...

$t_2$ all the values in $s$ or, equivalently, keep in $t_2$ all the

values of $\dom{t_1}$ that are not in $s$. Let us implement the second

viewpoint: the set of all elements of $\dom{t_1}$ for which an

application \emph{does not} surely give a result in $\neg t$ is

denoted $\worra{t_1}t$ (read, ``$t_1$ worra $t$'') and defined as $\min\{u\leq\dom{t_1}\alt

t_1\circ(\dom{t_1}\setminus u)\leq\neg t\}$: it is easy to see that

according to this definition $\dom{t_1}\setminus(\worra{t_1} t)$ is

the largest subset of $\dom{t_1}$ satisfying \eqref{eq1}. Then we can

...

...

$\worra{t_1} t$, the set of arguments that when applied to a function

of type $t_1$ \emph{may} return a result in $t$; then we can refine the

type of $e_2$ as $t_2^+\eqdeftiny t_2\wedge(\worra{t_1} t)$ in the ``then'' branch (we call it the \emph{positive} branch)

and as $t_2^-\eqdeftiny t_2\setminus(\worra{t_1} t)$ in the ``else'' branch (we call it the \emph{negative} branch).
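For instance (a worked case we add for illustration), take $t_1=(\Int{\to}\Int)\wedge(\String{\to}\String)$ and $t=\Int$: then \(\worra{t_1}\Int=\Int\), since removing $\Int$ from $\dom{t_1}=\Int\vee\String$ leaves $\String$ and $t_1\circ\String=\String\leq\neg\Int$, while no smaller $u$ satisfies the condition. Hence $t_2^+=t_2\wedge\Int$ and $t_2^-=t_2\setminus\Int$.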

As a side remark, note\marginpar{\tiny\beppe{Remove if space is needed}}

that the set $\worra{t_1} t$ is different from the set of elements that return a

result in $t$ (though it is a supertype of it). To see that, consider

for $t$ the type \String{} and for $t_1$ the type $(\Bool\to\Bool)\wedge(\Int\to(\String\vee\Int))$,

...

...

$t_1\!\circ t_2^+$ (which is a subtype of $t_1\!\circ

t_2$). Therefore, we can exclude from $t_1$ all the functions that, when applied to an argument
in $t_2^+$, either diverge or yield a result not in $t_1\!\circ t_2^+$.

Both of these things can be obtained simply by removing from $t_1$ the functions in

$(t_2^+\to\neg(t_1\!\circ t_2^+))$, that is, we refine the type of $e_1$ in the ``then'' branch as

$t_1^+=t_1\setminus(t_2^+\to
\neg(t_1\!\circ t_2^+))$. That this removes the functions that, applied to $t_2^+$ arguments, yield results not in $t_1\!\circ t_2^+$ should be pretty obvious. That this also removes functions diverging on $t_2^+$ arguments is subtler and deserves some

explanation. In particular, the interpretation of a type $t\to s$ is the set

...

...

that diverge on all \String{} arguments.

This is essentially what we formalize in Section~\ref{sec:language}, in the type system by the rule \Rule{PAppL} and in the typing algorithm with the case \eqref{due} of the definition of the function \constrf.

In the previous section we outlined the main ideas of our approach to occurrence typing. However, the devil is in the details. So the formalization we give in Section~\ref{sec:language} is not as smooth as just outlined: we must introduce several auxiliary definitions to handle some corner cases. This section presents, through small examples, the main technical difficulties we had to overcome and the definitions we introduced to handle them. As such, it provides a kind of road map for the technicalities of Section~\ref{sec:language}.

\paragraph{Typing occurrences} As it should be clear by now, not only variables but also generic expressions are given different types in the ``then'' and ``else'' branches of type tests. For instance, in \eqref{two} the expression $x_1x_2$ has type \Int{} in the positive branch and type \Bool{} in the negative one. In this specific case it is possible to deduce these typings from the refined types of the variables (in particular, thanks to the fact that $x_2$ has type \Int{} in the positive branch and \Bool{} in the negative one), but this is not possible in general. For instance, consider $x_1:\Int\to(\Int\vee\Bool)$, $x_2:\Int$, and the expression

\begin{equation}\label{twobis}

\ifty{x_1x_2}{\Int}{...x_1x_2...}{...x_1x_2...}

\end{equation}
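As a point of comparison (our remark): TypeScript refines only variables and aliases, not arbitrary expressions, so to give the two occurrences of the application different types one must bind it to a variable first; a sketch, where \code{classify} and \code{h} are hypothetical names:

```typescript
// Binding x1(x2) to r lets each branch see a refined type for r:
// number in the positive branch, boolean in the negative one.
function classify(x1: (n: number) => number | boolean, x2: number): string {
  const r = x1(x2);            // r : number | boolean
  if (typeof r === "number") {
    return "int";              // r : number here
  } else {
    return "bool";             // r : boolean here
  }
}

// A hypothetical x1 of type Int -> (Int v Bool):
const h = (n: number): number | boolean => (n % 2 === 0 ? n : n > 0);
console.log(classify(h, 4), classify(h, 3));
```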

...

...

$\ifty{e(42)}{\Bool}{e(42)}{\textsf{true}}$ reduces in one step to

$\ifty{\textsf{false}}{\Bool}{\textsf{false}}{\textsf{true}}$): see Appendix~\ref{app:parallel}.

\paragraph{Checks interdependence} The last class of technical problems arises
from the mutual dependence of different type checks. In particular, there are two cases
that pose a problem. The first can be shown by considering two functions $f$ and $g$, both of type $(\Int\to\Int)\wedge(\Any\to\Bool)$, and the test

if we independently check $fx$ against $\Int$ and $gx$ against $\Bool$, we deduce $\Int$ for the first occurrence of $x$ and $\Any$ for the second. Thus we would type the positive branch under the hypothesis that $x$ is of type $\Int$. But if we use the hypothesis generated by the test of $fx$, that is, that $x$ is of type \Int, to check $gx$, then it is easy to see that the positive branch of \eqref{nest1} can never be selected. In other words, we want to produce type environments for occurrence typing by taking into account all the hypotheses available, even when these hypotheses are formulated later in the flow of control. This will be done in the type systems of

Section~\ref{sec:language} by the rule \Rule{Path} and will require, at the
algorithmic level, looking for a fix-point solution of a function, or
an approximation thereof.
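To give a flavor of such a fix-point computation, here is a deliberately tiny model (entirely ours, not the paper's algorithm: types are finite sets of value tags, and the common behavior of $f$ and $g$ is encoded as a tag table), showing how iterating both hypotheses empties the type of $x$ and thereby detects that the positive branch is unreachable:

```typescript
// Toy model: a type is a finite set of tags; an environment maps
// variables to types; each succeeded test contributes a refinement.
type Ty = Set<string>;
type Env = Map<string, Ty>;
type Refinement = (env: Env) => Env;

// Apply all refinements repeatedly until the environment stops shrinking.
function refineToFixpoint(env: Env, rs: Refinement[]): Env {
  for (;;) {
    let changed = false;
    for (const r of rs) {
      const next = r(env);
      for (const [v, ty] of next) {
        if (ty.size !== env.get(v)!.size) changed = true;
      }
      env = next;
    }
    if (!changed) return env;
  }
}

// Both f and g behave like (Int -> Int) & (Any -> Bool): an int result
// on int inputs, a bool result on anything else (modeled as a table).
const result: Record<string, string> = { int: "int", bool: "bool", str: "bool" };

// Hypothesis from "f x has type Int": keep the tags of x mapped to "int".
const fromF: Refinement = env =>
  new Map(env).set("x", new Set([...env.get("x")!].filter(t => result[t] === "int")));
// Hypothesis from "g x has type Bool": keep the tags of x mapped to "bool".
const fromG: Refinement = env =>
  new Map(env).set("x", new Set([...env.get("x")!].filter(t => result[t] === "bool")));

const env0: Env = new Map([["x", new Set(["int", "bool", "str"])]]);
const env1 = refineToFixpoint(env0, [fromF, fromG]);
// x's refined type is empty: the positive branch can never be selected.
console.log(env1.get("x")!.size);
```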

\mick{The paragraph above is not up to date.}

\beppe{Is it ok now?}

Finally, a nested check may help refine
the type assumptions on some outer expressions. For instance, when typing