type a let binding such as \code{

%\begin{alltt}\color{darkblue}

let x = (y\(\in\)Int)?`yes:`no in (x\(\in\)`yes)?y+1:not(y)%

%\end{alltt}

}

which is clearly safe when $y:\Int\vee\Bool$. Nor can this example be solved by partial evaluation since we do not handle nesting of tests in the condition \code{( ((y\(\in\)Int)?`yes:`no)\(\in\)`yes )? y+1 : not(y)},

and both are issues that the system by~\citet{THF10} can handle. We think that it is possible

to reuse some of their ideas to perform an information flow analysis on top of

warrant a formal treatment. In particular, the rule [{\sc OverApp}]

only detects the application of an overloaded function once, when

type-checking the body of the function against the coarse input type

(i.e., $\psi$ is computed only once). But we could repeat this

process whilst type-checking the inferred arrows (i.e., we would

enrich $\psi$ while using it to find the various arrow types of the

lambda abstraction). Clearly, if untamed, such a process may never

made to converge and, foremost, whether it is of use in practice is among our objectives.

But the real challenges that lie ahead are the handling of side

effects and the addition of polymorphic types. Our analysis works in a

functions typed by intersection types and/or when integrating gradual

typing. This deserves a whole strand of nontrivial research that we plan to

The previous analysis already covers a large range of realistic cases. For instance, it already handles list data structures, since products and recursive types can encode them as right-associative nested pairs, as is done in the language CDuce~\cite{BCF03} (e.g., $X =\textsf{Nil}\vee(\Int\times X)$ is the type of lists of integers). Moreover, the presence of union types makes it possible to type heterogeneous lists whose content is described by regular expressions on types, as proposed by~\citet{hosoya00regular}. Since the main application of occurrence typing is to type dynamic languages, it is worth showing how to extend our work to records. We use record types as they are defined in CDuce, which are obtained by extending types with the following two type constructors:\centerline{\(\textbf{Types} ~~ t ~ ::= ~ \record{\ell_1=t \ldots\ell_n=t}{t}\alt\Undef\)}

Depending on the actual $t$ and the static types of $x_1$ and $x_2$, we

can make type assumptions for $x_1$, for $x_2$, \emph{and} for the application $x_1x_2$

when typing $e_1$ that are different from those we can make when typing

$e_2$. For instance, suppose $x_1$ is bound to the function \code{foo} defined in \eqref{foo2}. Thus $x_1$ has type $(\Int\to\Int)\wedge(\String\to\String)$ (we used the syntax of the types of Section~\ref{sec:language} where unions and intersections are denoted by $\vee$ and $\wedge$).

Then it is not hard to see that the expression\footnote{This and most of the following expressions are just given for the sake of example. Determining the type of expressions other than variables is interesting for constructors but less so for destructors such as applications, projections, and selections: any reasonable programmer would not repeat the same application twice; (s)he would store its result in a variable. This becomes meaningful when we introduce constructors such as pairs, as we do for instance in the expression in~\eqref{pair}.}

%

\begin{equation}\label{mezzo}

\texttt{let }x_1 \texttt{\,=\,}\code{foo}\texttt{ in }\ifty{x_1x_2}{\Int}{((x_1x_2)+x_2)}{\texttt{42}}

\end{equation}

%

is well typed with type $\Int$: when typing the branch ``then'' we

succeeded and that, therefore, not

only $x_1x_2$ is of type \Int, but also that $x_2$ is of type $\Int$: the other possibility,

$x_2:\String$, would have made the test fail.

For~\eqref{mezzo} we reasoned only on the type of the variables in the ``then'' branch but we can do the same

of type $t_1$ \emph{may} return a result in $t$; then we can refine the

type of $e_2$ as $t_2^+\eqdeftiny t_2\wedge(\worra{t_1} t)$ in the ``then'' branch (we call it the \emph{positive} branch)

and as $t_2^-\eqdeftiny t_2\setminus(\worra{t_1} t)$ in the ``else'' branch (we call it the \emph{negative} branch).
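
As a sketch, these two definitions can be instantiated on the function \code{foo} used in~\eqref{mezzo}, with $t=\Int$ and $t_1=(\Int\to\Int)\wedge(\String\to\String)$. We assume here, for illustration only, that the static type of the argument is $t_2=\Int\vee\String$, that $\Int$ and $\String$ are disjoint, and that $\worra{t_1}\Int=\Int$ (only integer arguments may yield an integer result):
\begin{align*}
t_2^+ &= (\Int\vee\String)\wedge\Int \simeq \Int\\
t_2^- &= (\Int\vee\String)\setminus\Int \simeq \String
\end{align*}
Under these assumptions the positive branch may treat the argument as an integer and the negative branch as a string, which matches the informal reasoning given for~\eqref{mezzo}.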

that the set $\worra{t_1} t$ is different from the set of elements that return a

result in $t$ (though it is a supertype of it). To see that, consider

string; then we have that $\dom{t_1}=\Int\vee\Bool$ and $\worra{t_1}\String=\Int$,

but there is no (non-empty) type that ensures that an application of a

function in $t_1$ will surely yield a $\String$ result.

\fi

Once we have determined $t_2^+$, it is then not very difficult to refine the

type $t_1$ for the positive branch, too. If the test succeeded, then we know two facts: first,

In this section we formalize the ideas we outlined in the introduction. We start with the definition of types, followed by the language and its reduction semantics. The static semantics is the core of our work: we first present a declarative type system that deduces (possibly many) types for well-typed expressions and then the algorithms to decide whether an expression is well typed or not.

\subsection{Types}

\qquad

\end{mathpar}

\begin{mathpar}

\Infer[PAppR]

{\pvdash\Gamma e t \varpi.0:\arrow{t_1}{t_2}\\\pvdash\Gamma e t \varpi:t_2'}

{\pvdash\Gamma e t \varpi.1:\neg t_1 }

{ t_2\land t_2' \simeq\Empty}

\Infer[PAppL]

{\pvdash\Gamma e t \varpi.1:t_1 \\\pvdash\Gamma e t \varpi:t_2 }

{\pvdash\Gamma e t \varpi.0:\neg (\arrow{t_1}{\neg t_2}) }

That is, $\tyof{e}{\Gamma}=\ts$ if and only if $\Gamma\vdashA e:\ts$ is provable.

We start by defining the algorithm for each single occurrence, that is, for the deduction of $\pvdash\Gamma e t \varpi:t'$. This is obtained by defining two mutually recursive functions $\constrf$ and $\env{}{}$:

(\lnot\True\to\lnot\True\to\False)$}

We consider \True, \Any and $\lnot\True$ as candidate types for

\texttt{x} which, in turn, allows us to deduce the precise type given in the table. Finally, thanks to this rule it is no longer necessary to force refinement by using a type case. As a consequence we can define the functions \texttt{and\_} and \texttt{xor\_} more naturally as:

\begin{alltt}\color{darkblue}\morecompact

let and_ = fun (x : Any) -> fun (y : Any) -> not_ (or_ (not_ x) (not_ y))

let xor_ = fun (x : Any) -> fun (y : Any) -> and_ (or_ x y) (not_ (and_ x y))

would, in many cases, not return much better information, since most

partitions would collapse on the same return type: type-cases on the

parameter are the tipping points that are likely to make a difference, by returning different

types for different partitions, thus yielding more precise typing. But they are not the only such tipping points: see rule \Rule{OverApp} in Section~\ref{sec:practical} and our discussion of future work in Section~\ref{sec:conclusion}.