The type system we defined in the previous section implements the ideas we

illustrated in the introduction and it is safe. Now the problem is to

decide whether an expression is well typed or not, that is, to find an

algorithm that, given a type environment $\Gamma$ and an expression $e$

...

...

of its soundness by using and adapting the technique of \emph{type
schemes}, that is, representations of the infinite sets of types of

$\lambda$-abstractions which can be used to define an algorithmic

system that can be easily proved to be sound. The simpler algorithm

that we propose in this section implies (i.e., it is less precise than) the one with type schemes (\emph{cf}.\

Lemma~\ref{soundness_simple_ts}) and it is thus sound, too. The algorithm of this

section is not only simpler but, as we discuss in Section~\ref{sec:algoprop},

is also the one that should be used in practice. This is why we preferred to

...

...

All the operators above but $\worra{}{}$ are already present in the

theory of semantic subtyping: the reader can find how to compute them

in~\cite[Section

6.11]{Frisch2008} (see also~\citep[\S4.4]{Cas15} for a detailed description). Below we just show our new formula that computes

$\worra t s$ for a $t$ subtype of $\Empty\to\Any$. For that, we use a

result of semantic subtyping that states that every type $t$ is

equivalent to a type in disjunctive normal form and that if

...

...
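To make the $\worra{}{}$ operator concrete, the following sketch may help fix intuitions. It is a brute-force illustration of the \emph{specification} of $\worra t s$ (the largest argument type guaranteeing a result in $s$) in a toy model where types are finite sets of values and a function type is an intersection of arrows; it is not the symbolic formula on disjunctive normal forms computed by the algorithm, and all names and the two-element encodings of $\Int$ and $\Bool$ are assumptions of this sketch.

```python
from functools import reduce

# Toy model (our own illustration): a "type" is a finite set of values,
# and a function type is an intersection of arrows, represented as a
# list of (domain, codomain) pairs of finite sets.

def dom(arrows):
    # Domain of an intersection of arrows: the union of the d_i.
    return reduce(frozenset.union, (d for d, _ in arrows), frozenset())

def apply_type(arrows, u):
    # Best result type of an application to an argument of type u: the
    # union of the codomains c_i whose domain d_i overlaps u.
    return reduce(frozenset.union,
                  (c for d, c in arrows if d & u), frozenset())

def worra(arrows, s):
    # worra t s: the largest subtype u of dom(t) such that applying t to
    # an argument in u guarantees a result in s.  In this finite model
    # apply_type is pointwise, so each value can be tested alone.
    return frozenset(v for v in dom(arrows)
                     if apply_type(arrows, frozenset([v])) <= s)

# A function of type (Int -> Int) ∧ (Bool -> Bool), with Int and Bool
# shrunk to two-element sets of strings (an assumption of this sketch):
Int, Bool = frozenset({"1", "2"}), frozenset({"true", "false"})
f = [(Int, Int), (Bool, Bool)]
assert worra(f, Int) == Int    # only Int arguments guarantee an Int result
assert worra(f, Bool) == Bool
```

The pointwise computation is what the symbolic formula achieves without enumerating values.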

All the functions above are defined if and only if the initial path

$\varpi$ is valid for $e$ (i.e., $\occ e{\varpi}$ is defined) and $e$

is well-typed (which implies that all $\tyof{\occ e{\varpi}}\Gamma$

in the definition are defined)%

\iflongversion%

.\footnote{Note that the definition is

well-founded. This can be seen by analyzing the rule

\Rule{Case\Aa} of Section~\ref{sec:algorules}: the definition of $\Refine{e,t}\Gamma$ and

$\Refine{e,\neg t}\Gamma$ use $\tyof{\occ e{\varpi}}\Gamma$, and

...

...

\Rule{Case\Aa} states that $\Gamma\vdash e:t_0$ (and this is

possible only if we were able to deduce under the hypothesis

$\Gamma$ the type of every occurrence of $e$.)\vspace{-3mm}}

\else

; the well-foundedness of the definition can be deduced by analyzing the rule~\Rule{Case\Aa} of Section~\ref{sec:algorules}.

\fi

Each case of the definition of the $\constrf$ function corresponds to the

application of a logical rule

\iflongversion
(\emph{cf.} definition in Footnote~\ref{fo:rules})
\fi

in

the deduction system for $\vdashp$: case \eqref{uno} corresponds

to the application of \Rule{PEps}; case \eqref{due} implements \Rule{Pappl}

straightforwardly; the implementation of rule \Rule{PAppR} is subtler:

instead of finding the best $t_1$ to subtract (by intersection) from the

static type of the argument, \eqref{tre} finds directly the best type for the argument by

applying the $\worra{}{}$ operator to the static type of the function

and the refined type of the application. The remaining (\ref{quattro}--\ref{sette})

cases are the straightforward implementations of the rules

\Rule{PPairL}, \Rule{PPairR}, \Rule{PFst}, and \Rule{PSnd},

...

...
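The effect of the pair rules can be illustrated by brute force in the same toy model of types as finite sets of values: to refine the type of a variable $x$ under the hypothesis that some projection of $x$ has type $t$, keep exactly the values whose projection lands in $t$. This sketch is ours and works by enumeration rather than symbolically on type syntax, as $\constrf$ does; every name in it is hypothetical.

```python
# Brute-force illustration (ours) of refining a variable through pair
# projections: types are finite sets of values, pairs are tuples, and a
# path of projections is written outermost first, so ("fst", "snd")
# denotes fst(snd(x)).

def project(v, path):
    for step in reversed(path):          # apply the innermost step first
        v = v[0] if step == "fst" else v[1]
    return v

def refine(tx, path, t):
    # Best refinement of x's type tx under the hypothesis  path(x) ∈ t:
    # keep exactly the values of tx whose projection lies in t.
    return frozenset(v for v in tx if project(v, path) in t)

Int  = frozenset({"1", "2"})
Bool = frozenset({"true", "false"})
# x : (Int ∨ Bool) × Bool, given here as an explicit set of pairs
tx = frozenset((a, b) for a in Int | Bool for b in Bool)
# Under the hypothesis  fst x ∈ Int  we learn  x : Int × Bool
assert refine(tx, ("fst",), Int) == frozenset((a, b) for a in Int for b in Bool)
```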

It remains to explain how to compute the environment $\Gamma'$ produced from $\Gamma$ by the deduction system for $\Gamma\evdash e t \Gamma'$. Alas, this is the most delicate part of our algorithm.

%

In a nutshell, what we want to do is to define a function

$\Refine{\_,\_}{\_}$ that takes a type environment $\Gamma$, an

expression $e$ and a type $t$ and returns the best type environment

$\Gamma'$ such that $\Gamma\evdash e t \Gamma'$ holds. By the best

The previous analysis already covers a large gamut of realistic

cases. For instance, the analysis already handles list data

structures, since products and recursive types can encode them as

right-associative nested pairs, as it is done in the language

CDuce (e.g., $X =\textsf{Nil}\vee(\Int\times X)$ is the

type of the lists of integers): see Code 8 in Table~\ref{tab:implem} of Section~\ref{sec:practical} for a concrete example. Even more, thanks to the presence of

union types it is possible to type heterogeneous lists whose

content is described by regular expressions on types as proposed

by~\citet{hosoya00regular}. Since the main application of occurrence

typing is to type dynamic languages, it is worth showing how to

extend our work to records. Although we use the record \emph{types} as they are

defined in CDuce, we cannot do the same for CDuce record \emph{expressions}

which require non-trivial modifications for occurrence typing,

especially because we want to capture the typing of record field extension and field

deletion, two widely used record operations that current gradual typing systems fail to capture. CDuce record types are obtained by extending types with the

following two type constructors:\\[1.4mm]

%

\centerline{\(\textbf{Types}\quad t ~ ::= ~ \record{\ell_1=t \ldots\ell_n=t}{t}\alt\Undef\)}\\[1.4mm]

%% \]

where $\ell$ ranges over an infinite set of labels $\Labels$ and $\Undef$

is a special singleton type whose only value is the constant

$\undefcst$, a constant not in $\Any$. The type

$\record{\ell_1=t_1\ldots\ell_n=t_n}{t}$ is a \emph{quasi-constant

function} that maps every $\ell_i$ to the type $t_i$ and every other

$\ell\in\Labels$ to the type $t$ (all the $\ell_i$'s must be

distinct). Quasi-constant functions are the internal representation of
record types in CDuce. These are not visible to the programmer, who can use
only two specific forms of quasi-constant functions, open record types and closed record types (as for OCaml object types), provided by the

following syntactic sugar and that form the \emph{record types} of our language%

\iflongversion%

\footnote{Note that in the definitions ``$\ldots{}$'' is meta-syntax to denote the presence of other fields while in the open records ``{\large\textbf{..}}'' is the syntax that distinguishes them from closed ones.}

\fi

\begin{itemize}[nosep]

\item$\crecord{\ell_1=t_1, \ldots, \ell_n=t_n}$ for $\record{\ell_1=t_1\ldots\ell_n=t_n}{\Undef}$\hfill(closed records).

\item$\orecord{\ell_1=t_1, \ldots, \ell_n=t_n}$ for $\record{\ell_1=t_1\ldots\ell_n=t_n}{\Any\vee\Undef}$\hfill (open records).

\end{itemize}

plus the notation $\mathtt{\ell\eqq} t$ to denote optional fields,

which corresponds, in the quasi-constant function notation, to using the

field $\ell= t \vee\Undef$%

\iflongversion

.

\else

\ (note that ``$\ldots{}$'' is meta-syntax while ``{\large\textbf{..}}'' is syntax).

\fi
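Quasi-constant functions can be rendered quite directly in code. The sketch below is ours (a tiny finite universe stands in for $\Any$, and all helper names are assumptions): a record type is a finite label map plus a default type, closed records default to $\Undef$, open records to $\Any\vee\Undef$, and an optional field $\ell\eqq t$ is encoded as $\ell= t \vee\Undef$.

```python
# Sketch (ours) of record types as quasi-constant functions: a finite
# map from labels to types plus a default type for every other label.
# A tiny finite universe stands in for Any; all names are assumptions.

UNDEF = frozenset({"undef"})                      # singleton type of `undefined`
ANY   = frozenset({"1", "2", "true", "false"})    # toy stand-in for Any

class RecType:
    def __init__(self, fields, default):
        self.fields, self.default = dict(fields), default
    def at(self, label):
        # The type the quasi-constant function assigns to a label.
        return self.fields.get(label, self.default)

def closed(fields): return RecType(fields, UNDEF)        # closed record type
def open_(fields):  return RecType(fields, ANY | UNDEF)  # open record type
def optional(t):    return t | UNDEF                     # field  l =? t

Int = frozenset({"1", "2"})
r1 = closed({"id": Int})
r2 = open_({"id": Int, "tag": optional(frozenset({"true"}))})
assert r1.at("other") == UNDEF        # closed: no field besides those listed
assert r2.at("other") == ANY | UNDEF  # open: any other field may be present
```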

For what concerns expressions, we adapt CDuce records to our analysis. In particular records are built starting from the empty record expression \erecord{} by adding, updating, or removing fields:\vspace{-1mm}

\[

\begin{array}{lrcl}

\textbf{Expr}& e & ::=&\erecord{}\alt\recupd e \ell e \alt\recdel e \ell\alt e.\ell
\end{array}
\]

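On the quasi-constant view of record types, the three record-building operations above admit a direct typing sketch: extension/update makes the field surely present, deletion makes it surely absent (i.e., of type $\Undef$), and selection reads the quasi-constant function. The code below is an illustrative assumption of ours, not the implementation; a record type is represented as a pair of a label map and a default type.

```python
# Typing sketch (ours, not the implementation) of the three record
# operations, on record types represented as (fields, default):
# a dict from labels to types plus a default for all other labels.

UNDEF = frozenset({"undef"})
Int, Bool = frozenset({"1", "2"}), frozenset({"true", "false"})

def empty():
    return ({}, UNDEF)                  # type of the empty record {}

def update(r, label, t):
    # Field extension/update: afterwards the field is surely present
    # with the type t of the added expression.
    fields, default = r
    return ({**fields, label: t}, default)

def delete(r, label):
    # Field deletion: afterwards the field is surely absent, which the
    # quasi-constant view records as type Undef.
    fields, default = r
    return ({**fields, label: UNDEF}, default)

def select(r, label):
    fields, default = r
    return fields.get(label, default)

r = update(update(empty(), "x", Int), "y", Bool)
assert select(r, "x") == Int
assert select(delete(r, "x"), "x") == UNDEF
assert select(r, "z") == UNDEF          # built from {}: closed, no other fields
```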
We have implemented the algorithmic system $\vdashA$. Our

implementation is written in OCaml and uses CDuce as a library to

provide the semantic subtyping machinery. Besides the type-checking

algorithm defined on the base language, our implementation supports

record types (Section \ref{ssec:struct}) and the refinement of

...

...

$\texttt{x}$ is specialized to \Int{} in the ``then'' case and to \Bool{}

in the ``else'' case. The function is thus type-checked twice more,
once under each hypothesis for \texttt{x}, yielding the precise type
$(\Int\to\Int)\land(\Bool\to\Bool)$. Note that w.r.t.\ rule \Rule{AbsInf+} of Section~\ref{sec:refining}, our implementation improves the precision of the computed

type. Indeed, using rule~[{\sc AbsInf}+] we would obtain the

then the rule \Rule{OverApp} applies and \True, \Any, and $\lnot\True$ become candidate types for

\texttt{x}, which allows us to deduce the precise type given in the table. Finally, thanks to rule \Rule{OverApp} it is not necessary to use a type case to force refinement. As a consequence, we can define the functions \texttt{and\_} and \texttt{xor\_} more naturally as:

\begin{alltt}\color{darkblue}\morecompact

let and_ = fun (x : Any) -> fun (y : Any) -> not_ (or_ (not_ x) (not_ y))

let xor_ = fun (x : Any) -> fun (y : Any) -> and_ (or_ x y) (not_ (and_ x y))
\end{alltt}

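As a sanity check, a direct Python transliteration of the two definitions above, with \texttt{not\_} and \texttt{or\_} assumed to act as Boolean negation and disjunction (which is what their overloaded types guarantee on Boolean arguments), computes the expected truth tables:

```python
# Transliteration of the two definitions of and_ and xor_, with not_
# and or_ assumed to behave as Boolean negation and disjunction.

def not_(x): return not x
def or_(x, y): return x or y

def and_(x, y): return not_(or_(not_(x), not_(y)))
def xor_(x, y): return and_(or_(x, y), not_(and_(x, y)))

assert [and_(x, y) for x in (True, False) for y in (True, False)] == \
       [True, False, False, False]
assert [xor_(x, y) for x in (True, False) for y in (True, False)] == \
       [False, True, True, False]
```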
For each arrow declared in the interface of the function, we

first typecheck the body of the function as usual (to check that the

arrow is valid) and collect the refined types for the parameter $x$.

Then we deduce all possible output types for this refined set of input

types and add the resulting arrows to the type deduced for the whole

function (see Appendix~\ref{app:optimize} for an even more precise rule).
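The collection step just described can be sketched in the finite-set toy model: a type-case on the parameter splits a declared domain into the partitions $d\wedge t$ and $d\wedge\neg t$, each branch is typed on its own partition, and one arrow is collected per non-empty partition. The function below is our illustrative reconstruction, not the implementation's inference rule.

```python
# Sketch (ours) of the partitioning strategy: a type-case on the
# parameter splits a declared domain into d∧t and d∧¬t; each branch is
# typed on its own partition, yielding one arrow per non-empty part.

Int, Bool = frozenset({"1", "2"}), frozenset({"true", "false"})
ANY = Int | Bool

def partition_arrows(domain, test, branch_then, branch_else):
    # branch_then / branch_else map a partition to the branch's output
    # type when the parameter is assumed to range over that partition.
    arrows = []
    for part, branch in ((domain & test, branch_then),
                         (domain - test, branch_else)):
        if part:                        # empty partitions yield no arrow
            arrows.append((part, branch(part)))
    return arrows

# fun (x : Any) = (x ∈ Int) ? x : x   -- the identity, split on Int
arrows = partition_arrows(ANY, Int, lambda p: p, lambda p: p)
assert arrows == [(Int, Int), (Bool, Bool)]  # i.e. (Int->Int) ∧ (Bool->Bool)
```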

...

...

In summary, in order to type a

function we use the type-cases on its parameter to partition the

domain of the function and we type-check the function on each single partition rather
than on the union thereof. Of course, we could use a much finer

partition: the finest (but impossible) one is to check the function

against the singleton types of all its inputs. But any finer partition

would, in many cases, not return much better information, since most

...

...

therefore add to our deduction system a last rule:\\[1mm]

Whenever a function parameter is the argument of an

overloaded function, we record as possible types for this parameter

all the domains $t_i$ of the arrows that type the overloaded

function, restricted (via intersection) by the static type $t$ of the parameter and provided that the type is not empty ($t\wedge t_i\not\simeq\Empty$). We show the remarkable power of this rule on some practical examples in Section~\ref{sec:practical}.
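The candidate-collection step of this rule is easy to sketch in the finite-set toy model: given the static type $t$ of the parameter and the arrows of the overloaded function, each domain $t_i$ contributes the candidate $t\wedge t_i$ whenever that intersection is non-empty. The down-scaled encoding of \texttt{not\_}'s type below is an assumption of ours.

```python
# Sketch (ours) of the candidate-collection step of rule [OverApp]:
# each domain t_i of the overloaded function contributes the candidate
# t ∧ t_i for a parameter of static type t, when non-empty.

Int, Bool = frozenset({"1", "2"}), frozenset({"true", "false"})
TRUE = frozenset({"true"})

def candidates(t, arrows):
    return [t & d for d, _ in arrows if t & d]

# A down-scaled rendering of not_ : (True -> False) ∧ (¬True -> True),
# with ¬True cut down to our four-value universe (an assumption):
not_type = [(TRUE, frozenset({"false"})),
            (Int | frozenset({"false"}), TRUE)]

# For x : Bool the candidates are Bool∧True and Bool∧¬True
assert candidates(Bool, not_type) == [TRUE, frozenset({"false"})]
```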