particular connective (here, a type constructor, that is, either

$\to$, or $\times$, or $b$), while identity rules (e.g., axioms and

cuts) and structural rules (e.g., weakening and contraction) do


not.\svvspace{-3.3mm}}

such as \Rule{Subs} and \Rule{Inter}. We handle their presence
in the classic way: we define an algorithmic system that tracks the
minimum type of an expression; this system is obtained from the

...

...

things get more difficult, since a function can be typed by, say, a

union of intersections of arrows and negations of types. Checking that

the function has a functional type is easy since it corresponds to

checking that it has a type subtype of $\Empty{\to}\Any$. Determining

its domain and the type of the application is more complicated and needs the operators $\dom{}$ and $\circ$ that we informally described in Section~\ref{sec:ideas}, where we also introduced the operator $\worra{}{}$. These three operators are used by our algorithm and are formally defined as:\svvspace{-0.5mm}

\begin{eqnarray}

\dom t & = &\max\{ u \alt t\leq u\to\Any\}

\\[-1mm]

...

...
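As a purely illustrative aside (not part of the formal development), the computation of $\dom{}$ can be sketched in a toy model where ground types are finite sets of tags, an arrow $s\to t$ is the pair $(s,t)$, and a function type is given as a union of intersections of positive arrows; all names and the representation below are our own assumptions.

```python
from functools import reduce

# Toy model (illustrative only): ground types are frozensets of tags,
# an arrow s -> t is the pair (s, t), and a function type is a list of
# summands, each summand being the list of its positive arrows
# (negated arrows do not contribute to the domain).
Int, Bool = frozenset({"int"}), frozenset({"bool"})

def dom(summands):
    """dom(\\/_i /\\_p (s_p -> t_p)) = /\\_i \\/_p s_p:
    an argument is acceptable only if every summand accepts it."""
    per_summand = [reduce(frozenset.union, (s for s, _ in ps), frozenset())
                   for ps in summands]
    return reduce(frozenset.intersection, per_summand)

# (Int -> Int) /\ (Bool -> Bool): one summand with two positive arrows.
t = [[(Int, Int), (Bool, Bool)]]
assert dom(t) == Int | Bool

# (Int -> Int) \/ (Bool -> Bool): two summands; only arguments accepted
# by both branches are safe, so the domain is empty.
u = [[(Int, Int)], [(Bool, Bool)]]
assert dom(u) == frozenset()
```

The two assertions mirror the intuition that $\dom{}$ is covariant in intersections (unions of domains) and contravariant in unions (intersections of domains).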

We need similar operators for projections since the type $t$

of $e$ in $\pi_i e$ may not be a single product type but, say, a union

of products: all we know is that $t$ must be a subtype of

$\pair\Any\Any$. So let $t$ be a type such that $t\leq\pair\Any\Any$,


then we define:\svvspace{-0.7mm}

\begin{equation}

\begin{array}{lcrlcr}

\bpl t & = &\min\{ u \alt t\leq\pair u\Any\}\qquad&\qquad

\bpr t & = &\min\{ u \alt t\leq\pair\Any u\}


\end{array}\svvspace{-0.7mm}

\end{equation}
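For a type that is a plain union of products, the two projection operators reduce to componentwise unions. The following sketch (our own toy representation, ignoring intersections and negations of products) illustrates this special case.

```python
from functools import reduce

# Toy model (illustrative): a product type s x t is the pair (s, t) of
# frozensets of tags; a subtype of 1 x 1 is given as a union (list) of
# such products. Intersections/negations of products are not modeled.
Int, Bool, Str = frozenset({"int"}), frozenset({"bool"}), frozenset({"str"})

def bpl(products):
    """First projection: min{ u | t <= u x 1 } = \\/_i s_i."""
    return reduce(frozenset.union, (s for s, _ in products), frozenset())

def bpr(products):
    """Second projection: min{ u | t <= 1 x u } = \\/_i t_i."""
    return reduce(frozenset.union, (t for _, t in products), frozenset())

# (Int x Bool) \/ (Str x Int): projections are componentwise unions.
t = [(Int, Bool), (Str, Int)]
assert bpl(t) == Int | Str
assert bpr(t) == Bool | Int
```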

All the operators above but $\worra{}{}$ are already present in the

theory of semantic subtyping: the reader can find how to compute them

...

...

furthermore $t\leq\Empty\to\Any$, then $t \simeq \bigvee_{i\in I}\left(\bigwedge_{p\in P_i}(s_p\to t_p)\bigwedge_{n\in N_i}\neg(s_n'\to t_n')\right)$ with $\bigwedge_{p\in P_i}(s_p\to

t_p)\bigwedge_{n\in N_i}\neg(s_n'\to t_n')\not\simeq\Empty$ for all

$i$ in $I$. For such a $t$ and any type $s$ we have:\svvspace{-1.0mm}

%

\begin{equation}


\worra t s = \dom t \wedge\bigvee_{i\in I}\left(\bigwedge_{\{P\subseteq P_i\alt s\leq\bigvee_{p \in P}\neg t_p\}}\left(\bigvee_{p \in P}\neg s_p\right) \right)\svvspace{-1.0mm}

\end{equation}
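To make the formula concrete, here is an executable sketch in a toy model of our own devising: the universe is a finite set of tags, types are frozensets over it (so negation is complement), and a function type is a union of summands, each a list of positive arrows. This is only an illustration of the formula's shape, not the paper's algorithm.

```python
from itertools import combinations
from functools import reduce

# Toy model (illustrative): types are frozensets over a finite universe U,
# an arrow s -> t is the pair (s, t), a function type is a list of
# summands, each a list of positive arrows.
U = frozenset({"int", "bool"})
Int, Bool = frozenset({"int"}), frozenset({"bool"})
neg = lambda t: U - t

def dom(summands):
    return reduce(frozenset.intersection,
                  [reduce(frozenset.union, (s for s, _ in ps), frozenset())
                   for ps in summands])

def worra(summands, s):
    """worra t s: the part of dom(t) from which an application may yield a
    result in s; P ranges over subsets of the positive arrows P_i."""
    acc = frozenset()
    for ps in summands:
        inner = U  # big intersection over the qualifying subsets P
        for k in range(len(ps) + 1):
            for P in combinations(ps, k):
                neg_ts = reduce(frozenset.union,
                                (neg(tp) for _, tp in P), frozenset())
                if s <= neg_ts:  # s <= \/_{p in P} not t_p
                    inner &= reduce(frozenset.union,
                                    (neg(sp) for sp, _ in P), frozenset())
        acc |= inner
    return dom(summands) & acc

# For (Int -> Int) /\ (Bool -> Bool), only Int arguments can produce an
# Int result, and only Bool arguments a Bool result.
t = [[(Int, Int), (Bool, Bool)]]
assert worra(t, Int) == Int
assert worra(t, Bool) == Bool
```

Note how the empty family of qualifying subsets yields the top type for `inner`, matching the convention that an empty intersection imposes no constraint.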

The formula considers only the positive arrows of each summand that

forms $t$ and states that, for each summand, whenever you take a subset

...

...

extends $\Gamma$ with hypotheses on the occurrences of $e$ that are

the most general that can be deduced by assuming that $e\,{\in}\,t$ succeeds. For that we need the notation $\tyof{e}{\Gamma}$ which denotes the type deduced for $e$ under the type environment $\Gamma$ in the algorithmic type system of Section~\ref{sec:algorules}.

That is, $\tyof{e}{\Gamma}=t$ if and only if $\Gamma\vdashA e:t$ is provable.


We start by defining the algorithm for each single occurrence, that is for the deduction of $\pvdash\Gamma e t \varpi:t'$. This is obtained by defining two mutually recursive functions $\constrf$ and $\env{}{}$:\svvspace{-1.3mm}

\env{\Gamma,e,t} (\varpi) & = &{\constr\varpi{\Gamma,e,t}\wedge\tyof{\occ e \varpi}\Gamma}\label{otto}


\end{eqnarray}\svvspace{-5mm}\\

All the functions above are defined if and only if the initial path

$\varpi$ is valid for $e$ (i.e., $\occ e{\varpi}$ is defined) and $e$

is well-typed (which implies that all $\tyof{\occ e{\varpi}}\Gamma$

...

...

in the definition are defined)%

this is defined for all $\varpi$ since the first premise of
\Rule{Case\Aa} states that $\Gamma\vdash e:t_0$ (and this is
possible only if we were able to deduce under the hypothesis
$\Gamma$ the type of every occurrence of $e$).\svvspace{-3mm}}

\else

; the well-foundedness of the definition can be deduced by analysing the rule~\Rule{Case\Aa} of Section~\ref{sec:algorules}.

\fi

...

...

$\Refinef$ yields for $x$ a type strictly more precise than the type deduced in the previous iteration.


The solution we adopt in practice is to bound the number of iterations to some number $n_o$. This is obtained by the following definition of $\Refinef$\svvspace{-1mm}
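The bounded-iteration idea can be sketched generically as follows; the environment representation, the `step` function, and the default bound are our own illustrative assumptions, not the paper's definition of $\Refinef$.

```python
# Illustrative sketch: iterate a refinement step at most n_o times,
# stopping early if a fixpoint is reached. An environment is a dict from
# variable names to frozenset "types"; `step` is any refinement function
# supplied by the caller.
def refine(step, env, n_o=5):
    for _ in range(n_o):
        new_env = step(env)
        if new_env == env:   # fixpoint reached before the bound
            return env
        env = new_env
    return env               # bound reached: sound but possibly less precise

# Example: repeatedly intersect x's type with a fixed constraint.
constraint = frozenset({"int", "bool"})
step = lambda env: {x: t & constraint for x, t in env.items()}
env = {"x": frozenset({"int", "str"})}
assert refine(step, env)["x"] == frozenset({"int"})
```

Stopping at the bound never compromises soundness: the last environment computed is still a valid (if coarser) refinement.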

For what concerns \emph{expressions}, we cannot use CDuce record expressions

as they are, but we must adapt them to our analysis. In particular, we


consider records that are built starting from the empty record expression \erecord{} by adding, updating, or removing fields:\svvspace{-0.75mm}

\[

\begin{array}{lrcl}

\textbf{Expr}& e & ::=&\erecord{} ~\alt~ \recupd e \ell e ~\alt~ \recdel e \ell ~\alt~ e.\ell


\end{array}\svvspace{-.75mm}

\]

in particular $\recdel e \ell$ deletes the field $\ell$ from $e$, $\recupd e \ell e'$ adds the field $\ell=e'$ to the record $e$ (deleting any existing $\ell$ field), while $e.\ell$ is field selection with the reduction:

\(\erecord{...,\ell=e,...}.\ell\ \reduces\ e\).
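The operational behaviour of these three record operations can be mimicked with ordinary dictionaries; the sketch below is only an analogy (function names are ours), but it captures the reduction rules, in particular that an update deletes any existing field with the same label.

```python
# Illustrative sketch of the record expression semantics: records as
# Python dicts built from the empty record by adding/updating, deleting,
# and selecting fields.
def update(r, label, value):      # {e with l = e'}: add or replace field l
    return {**r, label: value}

def delete(r, label):             # e \ l: remove field l if present
    return {k: v for k, v in r.items() if k != label}

def select(r, label):             # e.l reduces to the value of field l
    return r[label]

r = update(update({}, "x", 1), "y", 2)   # start from the empty record
assert select(r, "x") == 1
assert delete(r, "y") == {"x": 1}
assert update(r, "x", 3)["x"] == 3       # update replaces the old field
```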

...

...

To define record type subtyping and record expression type inference we need three operators.

%

Then two record types $t_1$ and $t_2$ are in subtyping relation, $t_1\leq t_2$, if and only if for all $\ell\in\Labels$ we have $\proj\ell{t_1}\leq\proj\ell{t_2}$. In particular $\orecord{\!\!}$ is the largest record type.
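The label-wise characterization of record subtyping can be sketched as follows; the representation (record types as dicts of frozensets, a tag `undef` for possibly-absent fields, unlisted labels unconstrained) is our own simplification of the paper's quasi-constant functions.

```python
# Illustrative check of label-wise record subtyping:
# t1 <= t2 iff for every label l, proj l t1 <= proj l t2.
Int, Str, undef = frozenset({"int"}), frozenset({"str"}), frozenset({"undef"})
ANY = Int | Str | undef          # top content: any value, or absent

def proj(t, label):
    return t.get(label, ANY)     # unlisted labels are unconstrained

def record_subtype(t1, t2, labels):
    return all(proj(t1, l) <= proj(t2, l) for l in labels)

# {x : Int} is a subtype of {x : Int \/ Str}, and every record type is a
# subtype of the empty open record, the largest record type.
t1, t2 = {"x": Int}, {"x": Int | Str}
assert record_subtype(t1, t2, ["x", "y"])
assert record_subtype(t1, {}, ["x", "y"])
```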


Expressions are then typed by the following rules (already in algorithmic form).\svvspace{-.1mm}

\begin{mathpar}

\Infer[Record]

{~}

...

...


that $e_1e_2$ has type $t$ succeeds or fails. Let us start with refining the type $t_2$ of $e_2$ in the case in which the test succeeds. Intuitively, we want to remove from $t_2$ all

the values for which the application will surely return a result not

in $t$, thus making the test fail. Consider $t_1$ and let $s$ be the


largest subtype of $\dom{t_1}$ such that%

\svvspace{-1.29mm}

\begin{equation}\label{eq1}

t_1\circ s\leq\neg t

\end{equation}

...

...


Let us see all this on our example \eqref{exptre}, in particular, by showing how this technique deduces that the type of $x_1$ in the positive branch is (a subtype of) $\Int{\vee}\String\to\Int$.

Take the static type of $x_1$, that is $(\Int{\vee}\String\to\Int)\vee(\Bool{\vee}\String\to\Bool)$ and intersect it with

$\neg(t_2^+\to\neg t)$, that is, $\neg(\String{\to}\neg\Int)$. Since intersection distributes over unions we obtain $((\Int{\vee}\String{\to}\Int)\wedge\neg(\String{\to}\neg\Int))\vee((\Bool{\vee}\String{\to}\Bool)\wedge\neg(\String{\to}\neg\Int))$; since
$(\Bool{\vee}\String{\to}\Bool)\wedge\neg(\String{\to}\neg\Int)$ is empty

...

...

This is essentially what we formalize in Section~\ref{sec:language}, in the type system.

In the previous section we outlined the main ideas of our approach to occurrence typing. However, the devil is in the details, so the formalization we give in Section~\ref{sec:language} is not as smooth as just outlined: we must introduce several auxiliary definitions to handle some corner cases. This section presents, through small examples, the main technical difficulties we had to overcome and the definitions we introduced to handle them. As such, it provides a kind of road map for the technicalities of Section~\ref{sec:language}.

\paragraph{Typing occurrences} As it should be clear by now, not only variables
but also generic expressions are given different types in the ``then'' and
``else'' branches of type tests. For instance, in \eqref{two} the expression
$x_1x_2$ has type \Int{} in the positive branch and type \Bool{} in the negative
one. In this specific case it is possible to deduce these typings from the
refined types of the variables (in particular, thanks to the fact that $x_2$ has
type \Int{} in the positive branch and \Bool{} in the negative one), but this is
not possible in general. For instance, consider $x_1:\Int\to(\Int\vee\Bool)$,
$x_2:\Int$, and the expression\svvspace{-1mm}

In this section we formalize the ideas we outlined in the introduction. We start with the definition of types.

\subsection{Types}

\begin{definition}[Types]\label{def:types}

%\iflongversion%%%%%%%


The set of types \types{} is formed by the terms $t$ coinductively produced by the grammar:\svvspace{-1.45mm}

\[

\begin{array}{lrcl}

\textbf{Types}& t & ::=& b\alt t\to t\alt t\times t\alt t\vee t \alt\neg t \alt\Empty

...

...

and that satisfy the following conditions

\begin{itemize}[nosep]

\item (regularity) every term has a finite number of different sub-terms;

\item (contractivity) every infinite branch of a term contains an infinite number of occurrences of the


arrow or product type constructors.\svvspace{-1mm}

\end{itemize}

\iffalse%%%%%%%%%%%%%%%%%%%%%%%%%%


A type $t\in\types{}$ is a term coinductively produced by the grammar:\svvspace{-1.45mm}

\[

\begin{array}{lrcl}

\textbf{Types}& t & ::=& b\alt t\to t\alt t\times t\alt t\vee t \alt\neg t \alt\Empty

...

...


\]

that satisfies the following conditions: $(1)$\emph{Regularity}: the

term has a finite number of different sub-terms; $(2)$\emph{Contractivity}: every infinite branch of the term contains an infinite number of occurrences of the


arrow or product type constructors.\svvspace{-1mm}

\fi%%%%%%%%%%%%%%%%%%%

\end{definition}

We use the following abbreviations: $

...

...

union of the values of the two types). We use $\simeq$ to denote the

symmetric closure of $\leq$: thus $s\simeq t$ (read, $s$ is equivalent to $t$) means that $s$ and $t$ denote the same set of values and, as such, they are semantically the same type.

\subsection{Syntax}\label{sec:syntax}


The expressions $e$ and values $v$ of our language are inductively generated by the following grammars:\svvspace{-1mm}

values. We write $v\in t$ if the most specific type of $v$ is a subtype of $t$.

\subsection{Dynamic semantics}\label{sec:opsem}


The dynamic semantics is defined as a classic left-to-right call-by-value reduction for a $\lambda$-calculus with pairs, enriched with specific rules for type-cases. We have the following notions of reduction:\svvspace{-1.2mm}

\[

\begin{array}{rcll}

(\lambda^{\wedge_{i\in I}s_i\to t_i} x.e)\,v &\reduces& e\subst x v\\[-.4mm]

...

...

standard, for what concerns the type system we will have to introduce several

unconventional features that we anticipated in

Section~\ref{sec:challenges} and are at the core of our work. Let

us start with the standard part, that is the typing of the functional


core and the use of subtyping, given by the following typing rules:\svvspace{-1mm}

\begin{mathpar}

\Infer[Const]

{}

...

...


{\Gamma\vdash e:t\\t\leq t' }

{\Gamma\vdash e: t' }

{}


\qquad\svvspace{-3mm}

\end{mathpar}

These rules are quite standard and do not need any particular explanation besides those already given in Section~\ref{sec:syntax}. Just notice that subtyping is embedded in the system by the classic subsumption rule \Rule{Subs}. Next we focus on the unconventional aspects of our system, from the simplest to the hardest.

...

...

The first unconventional aspect is that, as explained in

Section~\ref{sec:challenges}, our type assumptions are about

expressions. Therefore, in our rules the type environments, ranged over

by $\Gamma$, map \emph{expressions}---rather than just variables---into


types. This explains why the classic typing rule for variables is replaced by a more general \Rule{Env} rule defined below:\svvspace{-1mm}

\begin{mathpar}

\Infer[Env]

{}

...

...


\Infer[Inter]

{\Gamma\vdash e:t_1\\\Gamma\vdash e:t_2 }

{\Gamma\vdash e: t_1 \wedge t_2 }


{}\svvspace{-3mm}

\end{mathpar}

The \Rule{Env} rule is coupled with the standard intersection introduction rule \Rule{Inter}

which allows us to deduce for a complex expression the intersection of

...

...

environment $\Gamma$ with the static type deduced for the same

expression by using the other typing rules. This same intersection

rule is also used to infer the second unconventional aspect of our

system, that is, the fact that $\lambda$-abstractions can have negated


arrow types, as long as these negated types do not make the type deduced for the function empty:\svvspace{-.5mm}

We introduce the new syntactic category of \emph{type schemes}, which are the terms $\ts$ produced by the following grammar:

\]

Type schemes denote sets of types, as formally stated by the following definition:

\begin{definition}[Interpretation of type schemes]


We define the function $\tsint{\_}$ that maps type schemes into sets of types.\svvspace{-2.5mm}

\begin{align*}

\begin{array}{lcl}

\tsint t &=&\{s\alt t \leq s\}\\

...

...

We also need to perform intersections of type schemes so as to intersect the static type of an expression with the type assumed for it in the environment.

\end{lemma}

Finally, given a type scheme $\ts$ it is straightforward to choose in its interpretation a type $\tsrep\ts$ which serves as the canonical representative of the set (i.e., $\tsrep\ts\in\tsint\ts$):

\begin{definition}[Representative]


We define a function $\tsrep{\_}$ that maps every non-empty type scheme into a type, \textit{representative} of the set of types denoted by the scheme.\svvspace{-2mm}

\begin{align*}

\begin{array}{lcllcl}

\tsrep t &=& t &\tsrep{\ts_1 \tstimes\ts_2}&=&\pair{\tsrep{\ts_1}}{\tsrep{\ts_2}}\\

Expression substitutions, ranged over by $\rho$, map an expression into another expression. The application of an expression substitution $\rho$ to an expression $e$, noted $e\rho$, is the capture-avoiding replacement defined as follows:

\begin{itemize}

\item If $e'\equiv_\alpha e''$, then $e''\subst{e'}e = e$.\vspace{1mm}


\item If $e'\not\equiv_\alpha e''$, then $e''\subst{e'}e$ is inductively defined as \svvspace{-1.5mm}

As we explained in the introduction, both TypeScript and Flow deduce the type


\code{(number$\vee$string) $\to$ (number$\vee$string)} for the first definition of the function \code{foo} in~\eqref{foo}, and the more precise type\svvspace{-3pt}