The type system we defined in the previous section implements the ideas we illustrated in the introduction and it is safe. Now the problem is to decide whether an expression is well typed or not, that is, to find an algorithm that, given a type environment $\Gamma$ and an expression $e$ ...

... of its soundness by using and adapting the technique of \emph{type schemes}, representations of the infinite sets of types of $\lambda$-abstractions, which can be used to define an algorithmic system that can be easily proved to be sound. The simpler algorithm that we propose in this section implies (i.e., it is less precise than) the one with type schemes (\emph{cf}.\ Lemma~\ref{soundness_simple_ts}) and it is thus sound, too. The algorithm of this section is not only simpler but, as we discuss in Section~\ref{sec:algoprop}, is also the one that should be used in practice. This is why we preferred to ...

... then we define:\vspace{-1.2mm}

... All the operators above but $\worra{}{}$ are already present in the theory of semantic subtyping: the reader can find how to compute them in~\cite[Section 6.11]{Frisch2008} (see also~\citep[\S4.4]{Cas15} for a detailed description). Below we just show our new formula that computes $\worra t s$ for a $t$ subtype of $\Empty\to\Any$. For that, we use a result of semantic subtyping that states that every type $t$ is equivalent to a type in disjunctive normal form and that if ...
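To give a rough operational intuition of the $\worra{}{}$ operator, here is a toy model (ours, not part of the formal development): if we degrade types to finite sets of tags and a function type to a finite map from input tags to output tags, then $\worra t s$ reduces to collecting the inputs that can yield an output in $s$.

```typescript
// Toy analogue of the "worra" operator: types are degraded to finite
// sets of tags, and a function type to a finite map from input tags to
// output tags. worra(arrows, s) collects the inputs whose output lands
// in s -- the best type for an argument known to produce a result in s.
// This is an illustration only, not the paper's DNF-based formula.
type Tag = string;

function worra(arrows: Map<Tag, Tag>, s: Set<Tag>): Set<Tag> {
  const result = new Set<Tag>();
  for (const [input, output] of arrows) {
    if (s.has(output)) result.add(input); // this input may produce a result in s
  }
  return result;
}
```

For instance, for a map modelling `is_int : (Int -> True) /\ (not Int -> False)`, asking which arguments can produce `True` returns exactly the `Int` inputs.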
... We start by defining the algorithm for each single occurrence, that is, for the d...

All the functions above are defined if and only if the initial path $\varpi$ is valid for $e$ (i.e., $\occ e{\varpi}$ is defined) and $e$ is well-typed (which implies that all $\tyof {\occ e{\varpi}} \Gamma$ in the definition are defined)%
\iflongversion%
.\footnote{Note that the definition is well-founded. This can be seen by analyzing the rule \Rule{Case\Aa} of Section~\ref{sec:algorules}: the definition of $\Refine {e,t} \Gamma$ and $\Refine {e,\neg t} \Gamma$ use $\tyof{\occ e{\varpi}}\Gamma$, and ...
... \Rule{Case\Aa} states that $\Gamma\vdash e:t_0$ (and this is possible only if we were able to deduce under the hypothesis $\Gamma$ the type of every occurrence of $e$).\vspace{-3mm}}
\else
; the well-foundedness of the definition can be deduced by analysing the rule~\Rule{Case\Aa} of Section~\ref{sec:algorules}.
\fi
Each case of the definition of the $\constrf$ function corresponds to the application of a logical rule \iflongversion (\emph{cf.} definition in Footnote~\ref{fo:rules}) \fi in the deduction system for $\vdashp$: case \eqref{uno} corresponds to the application of \Rule{PEps}; case \eqref{due} implements \Rule{PAppL} straightforwardly; the implementation of rule \Rule{PAppR} is subtler: instead of finding the best $t_1$ to subtract (by intersection) from the static type of the argument, \eqref{tre} directly finds the best type for the argument by applying the $\worra{}{}$ operator to the static type of the function and the refined type of the application. The remaining cases (\ref{quattro}--\ref{sette}) are the straightforward implementations of the rules \Rule{PPairL}, \Rule{PPairR}, \Rule{PFst}, and \Rule{PSnd}, ...
... the definition of $\constrf$.

It remains to explain how to compute the environment $\Gamma'$ produced from $\Gamma$ by the deduction system for $\Gamma \evdash e t \Gamma'$. Alas, this is the most delicate part of our algorithm.
%
In a nutshell, what we want to do is to define a function $\Refine{\_,\_}{\_}$ that takes a type environment $\Gamma$, an expression $e$, and a type $t$ and returns the best type environment $\Gamma'$ such that $\Gamma \evdash e t \Gamma'$ holds. By the best ...

... cases.

\subsubsection{Algorithmic typing rules}\label{sec:algorules}
We now have all the definitions we need for our typing algorithm%
\iflongversion%
, which is defined by the following rules.
\else%
...

... for an expression $e$ by occurrence typing and stored in $\Gamma$ with the type deduced for $e$ by the logical rules: this is simply obtained by removing any hypothesis about $e$ from $\Gamma$, so that the deduction of the type $t$ for $e$ cannot but end by a logical rule. Of course, this does not apply when the expression $e$ is a variable, since a hypothesis in $\Gamma$ is the only way to deduce the type of a variable, which is why the algorithm reintroduces the classic rule for variables. Finally, notice that there is no counterpart for the rule \Rule{Abs-} and that therefore it is not possible to deduce negated arrow types for functions. This means that the algorithmic system is not complete, as we discuss in detail in the next section. ...

... definition).
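The role played by $\Refine{\_,\_}{\_}$ on the simplest expressions, variables, can be illustrated by a small sketch (ours, with types again degraded to finite sets of tags): the positive branch of a typecase intersects the known type of the variable with the tested type, while the negative branch subtracts it. This mimics only the variable case, not the full generality of $\Refine{}{}$, which refines arbitrary expressions.

```typescript
// Minimal sketch of environment refinement after a typecase (x in t).
// Types are modelled as finite sets of value tags; the positive branch
// narrows x to oldType /\ t, the negative branch to oldType \ t.
type Ty = Set<string>;
type Env = Map<string, Ty>;

const inter = (a: Ty, b: Ty): Ty => new Set([...a].filter(v => b.has(v)));
const diff = (a: Ty, b: Ty): Ty => new Set([...a].filter(v => !b.has(v)));

function refineVar(env: Env, x: string, t: Ty): { pos: Env; neg: Env } {
  const old = env.get(x) ?? new Set<string>();
  const pos = new Map(env); pos.set(x, inter(old, t)); // "then" branch environment
  const neg = new Map(env); neg.set(x, diff(old, t));  // "else" branch environment
  return { pos, neg };
}
```

For example, starting from `x : Int | Bool` and testing `x in Int`, the positive environment assigns `Int` to `x` and the negative one assigns `Bool`.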
Then we have: ...
\end{theorem}

We can use the algorithmic system $\vdashAts$ defined for the proof of Theorem~\ref{th:algosound} to give a far more precise characterization of the terms for which our algorithm is complete: positivity is a very useful but rough approximation. The system $\vdashAts$ copes with negated arrow types but it still is not complete, essentially for two reasons: $(i)$ the recursive nature of rule \Rule{Path} and $(ii)$ the use of nested \Rule{PAppL}, which yields a precision that the algorithm loses by using type schemes in the definition of \constrf{} (case~\eqref{due} is the critical one). Completeness is recovered by $(i)$ limiting the depth of the ...
The previous analysis already covers a large gamut of realistic cases. For instance, the analysis already handles list data structures, since products and recursive types can encode them as right-associative nested pairs, as it is done in the language CDuce (e.g., $X = \textsf{Nil}\vee(\Int\times X)$ is the type of the lists of integers): see Code 8 in Table~\ref{tab:implem} of Section~\ref{sec:practical} for a concrete example. Even more, thanks to the presence of union types it is possible to type heterogeneous lists whose content is described by regular expressions on types, as proposed by~\citet{hosoya00regular}.

Since the main application of occurrence typing is to type dynamic languages, it is worth showing how to extend our work to records. Although we use the record \emph{types} as they are defined in CDuce, we cannot do the same for CDuce record \emph{expressions}, which require non-trivial modifications for occurrence typing, especially because we want to capture the typing of record field extension and field deletion, two widely used record operations that current gradual typing systems fail to capture.
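The encoding of lists as right-associative nested pairs mentioned above can be transcribed, for illustration only, into TypeScript's recursive type aliases (with `null` playing the role of `Nil`):

```typescript
// Transcription of X = Nil \/ (Int x X) as a TypeScript recursive type:
// a list of integers is either empty (null) or a pair of a number and a
// list. Recursive references in tuple positions are supported since TS 3.7.
type IntList = null | [number, IntList];

// Structural recursion over the nested-pair encoding.
function sum(l: IntList): number {
  return l === null ? 0 : l[0] + sum(l[1]);
}
```

For instance, the list `1, 2, 3` is written `[1, [2, [3, null]]]`.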
CDuce record types are obtained by extending types with the following two type constructors:\\[1.4mm]
\centerline{$\textbf{Types} \quad t ~ ::= ~ \record{\ell_1=t \ldots \ell_n=t}{t}\alt \Undef$}\\[1.4mm]
where $\ell$ ranges over an infinite set of labels $\Labels$ and $\Undef$ is a special singleton type whose only value is $\undefcst$, a constant not in $\Any$. The type $\record{\ell_1=t_1 \ldots \ell_n=t_n}{t}$ is a \emph{quasi-constant function} that maps every $\ell_i$ to the type $t_i$ and every other $\ell \in \Labels$ to the type $t$ (all the $\ell_i$'s must be distinct). Quasi-constant functions are the internal representation of record types in CDuce. These are not visible to the programmer, who can use only two specific forms of quasi-constant functions, open record types and closed record types (as for OCaml object types), provided by the following syntactic sugar and that form the \emph{record types} of our language%
\iflongversion%
\footnote{Note that in the definitions ``$\ldots{}$'' is meta-syntax to denote the presence of other fields while in the open records ``{\large\textbf{..}}'' is the syntax that distinguishes them from closed ones.}
\fi
\begin{itemize}[nosep]
\item $\crecord{\ell_1=t_1, \ldots, \ell_n=t_n}$ for $\record{\ell_1=t_1 \ldots \ell_n=t_n}{\Undef}$ \hfill(closed records).
\item $\orecord{\ell_1=t_1, \ldots, \ell_n=t_n}$ for $\record{\ell_1=t_1 \ldots \ell_n=t_n}{\Any \vee \Undef}$\hfill (open records).
\end{itemize}
plus the notation $\mathtt{\ell \eqq} t$ to denote optional fields, which corresponds to using in the quasi-constant function notation the field $\ell = t \vee \Undef$%
\iflongversion
.
\else
\ (note that ``$\ldots{}$'' is meta-syntax while ``{\large\textbf{..}}'' is syntax).
\fi

As for expressions, we adapt CDuce records to our analysis. In particular, records are built starting from the empty record expression \erecord{} by adding, updating, or removing fields:\vspace{-1mm}
\[
\begin{array}{lrcl}
\textbf{Expr} & e & ::= & \erecord {} \alt \recupd e \ell e \alt \recdel e \ell \alt e.\ell
...
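For readers more familiar with mainstream syntax, the record operations above have loose TypeScript analogues (an illustration of the operations themselves, not a claim that TypeScript tracks them as precisely as our system; the `Point` type is a hypothetical example of ours): optional properties correspond to $\mathtt{\ell \eqq} t$ fields, spread syntax to field extension/update, and rest destructuring to field deletion.

```typescript
// Loose TypeScript analogues of the record operations discussed above.
// TS object types are "open" in the sense that extra fields may be
// present behind a supertype; `color?` is an optional field.
type Point = { x: number; y: number; color?: string };

const p: Point = { x: 1, y: 2 };
const q = { ...p, color: "red" }; // field extension/update via spread
const { color, ...rest } = q;     // field deletion via rest destructuring
```

After these operations, `q` has a `color` field while `rest` is `q` with that field deleted.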
In a sense, occurrence typing is a discipline designed to push forward the frontiers beyond which gradual typing is needed, thus reducing the number of runtime checks required. For instance, the JavaScript code of~\eqref{foo} and~\eqref{foo2} in the introduction can also be typed by using gradual typing:%\vspace{-.2mm}
\begin{alltt}\color{darkblue}\morecompact
function foo(x\textcolor{darkred}{ : \pmb{\dyn}}) \{
  return (typeof(x) === "number")? x++ : x.trim(); \refstepcounter{equation} \mbox{\color{black}\rm(\theequation)}\label{foo3}
...

... $\code{number} \land \dyn\simeq \code{number}$ is not an option, since it would force us to choose between having the gradual guarantee or having, say, $\code{number} \land \code{string}$ be more precise than $\code{number} \land \dyn$.\vspace{-2mm}}
Hence the need for the second part of Rule~\textsc{[AbsInf+]}: the maximal interpretation of $\code{number} \land \dyn$ is $\code{number}$, and it is clear that, if $\code x$ is given type \code{number}, ...
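For comparison, a sketch of \eqref{foo3} in plain TypeScript, where `any` plays the role of the gradual type $\dyn$ and `typeof` narrowing plays the role of the occurrence-typing analysis (we write `x + 1` instead of `x++` so that the branch returns the incremented value):

```typescript
// TypeScript rendering of the gradually-typed foo: `any` stands in for
// the gradual type; the typeof test narrows x in each branch.
function foo(x: any): number | string {
  return typeof x === "number" ? x + 1 : x.trim();
}
```

As in the text, the check on `x` lets the number branch use arithmetic and the other branch use `trim()`.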
We have implemented the algorithmic system $\vdashA$. Our implementation is written in OCaml and uses CDuce as a library to provide the semantic subtyping machinery. Besides the type-checking algorithm defined on the base language, our implementation supports record types (Section \ref{ssec:struct}) and the refinement of ...

... $\texttt{x}$ is specialized to \Int{} in the ``then'' case and to \Bool{} in the ``else'' case. The function is thus type-checked twice more, once under each hypothesis for \texttt{x}, yielding the precise type $(\Int\to\Int)\land(\Bool\to\Bool)$. Note that w.r.t.\ rule \Rule{AbsInf+} of Section~\ref{sec:refining}, our implementation improves the computed type. Indeed, using rule~[{\sc AbsInf}+] we would obtain the type $(\Int\to\Int)\land(\Bool\to\Bool)\land(\Bool\vee\Int\to\Bool\vee\Int)$ with a redundant arrow. Here we can see that, since we deduced the first two arrows $(\Int\to\Int)\land(\Bool\to\Bool)$, and since the union of their domains exactly covers the domain of the third arrow, the latter is not needed. Code~2 shows what happens when the argument of the function is left unannotated (i.e., it is annotated by the top type \Any, written ``\texttt{Any}'' in our implementation). Here ...

... $(\True\to\Any\to\True)\land(\Any\to\True\to\True)\land(\lnot\True\to\lnot\True\to\False)$ %}\\[.7mm]
then the rule \Rule{OverApp} applies and \True, \Any, and $\lnot\True$ become candidate types for \texttt{x}, which allows us to deduce the precise type given in the table. Finally, thanks to rule \Rule{OverApp} it is not necessary to use a type case to force refinement.
As a consequence, we can define the functions \texttt{and\_} and \texttt{xor\_} more naturally as:
\begin{alltt}\color{darkblue}\morecompact
let and_ = fun (x : Any) -> fun (y : Any) -> not_ (or_ (not_ x) (not_ y))
let xor_ = fun (x : Any) -> fun (y : Any) -> and_ (or_ x y) (not_ (and_ x y))
...
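A hypothetical TypeScript counterpart of this chain of definitions (`Any` is rendered as `unknown`, and `not_`/`or_` are given the obvious semantics on values that may or may not be `true`; the overloaded intersection types of the paper are not expressible this precisely in TypeScript):

```typescript
// De Morgan-style definitions of and_ and xor_ from not_ and or_,
// following the chain of definitions in the text.
function not_(x: unknown): boolean { return x !== true; }
function or_(x: unknown, y: unknown): boolean { return x === true || y === true; }
const and_ = (x: unknown, y: unknown): boolean => not_(or_(not_(x), not_(y)));
const xor_ = (x: unknown, y: unknown): boolean => and_(or_(x, y), not_(and_(x, y)));
```

The point made in the text is that, with rule \Rule{OverApp}, such definitions type-check precisely without any explicit type case.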
In this section we show how to do it by using the theory of occurrence typing we developed in the first part of the paper. In particular, we collect the different types that are assigned to the parameter of a function in its body, and use this information to partition the domain of the function and to re-type its body. Consider a more involved example, in a pseudo-code that uses our notation:
\begin{alltt}\color{darkblue}\morecompact
function (x \textcolor{darkred}{: \(\tau\)}) \{
  (x \(\in\) Real) ? \{ (x \(\in\) Int) ? succ x : sqrt x \} : \{ \(\neg\)x \} \refstepcounter{equation} \mbox{\color{black}\rm(\theequation)}\label{foorefine}
\}
\end{alltt}
where we assume that \code{Int} is a subtype of \code{Real}. When $\tau$ is \code{Real$\vee$Bool} we want to deduce for this function the type \code{$(\Int\to\Int)\wedge(\Real\backslash\Int\to\Real)\wedge(\Bool\to\Bool)$}. When $\tau$ is \Any, ...

... by~\citet{THF10} since, for instance,
\begin{alltt}\color{darkblue}\morecompact
  if is_int z then z+1 else 42
\end{alltt}
is well typed since the function \code{is\_int} is given type $(\Int\to\True)\wedge(\neg\Int\to\False)$. We propose a more general approach than the one by~\citet{THF10} since we allow the programmer to hint a particular type for the argument and let the system deduce, if possible, an intersection type for the function.

We start by considering the system where $\lambda$-abstractions are ...
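A hypothetical TypeScript transcription of \eqref{foorefine} (with `typeof x === "number"` standing in for the `x ∈ Real` test and `Number.isInteger` for `x ∈ Int`) behaves as the intersection type \code{$(\Int\to\Int)\wedge(\Real\backslash\Int\to\Real)\wedge(\Bool\to\Bool)$} predicts:

```typescript
// Transcription of the foorefine example for tau = Real \/ Bool:
// integers are incremented, non-integer reals get their square root,
// and booleans are negated.
function f(x: number | boolean): number | boolean {
  return typeof x === "number"
    ? (Number.isInteger(x) ? x + 1 : Math.sqrt(x))
    : !x;
}
```

Each branch exercises one of the three arrows that the refined rule is expected to deduce.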
... Where $\psi\setminus\{x\}$ is the function defined as $\psi$ but undefined on $x$ ...
\end{displaymath}
\noindent
\else.~\fi
All that remains to do is to replace the rule [{\sc Abs}+] with the following rule\vspace{-.8mm}
\begin{mathpar}
\Infer[AbsInf+]
...

... $\psi$ is always equal to the domain of $\Gamma$ restricted to variables. Simply put, this rule first collects all possible types that are deduced for a variable $x$ during the typing of the body of the $\lambda$ and then uses them to re-type the body under this new refined hypothesis for the type of $x$. The re-typing ensures that the type safety property carries over to this new rule. This system is enough to type our case study \eqref{foorefine} for the case $\tau$ defined as \code{Real$\vee$Bool}. Indeed, the analysis of the body yields $\psi(x)=\{\Int,\Real\setminus\Int\}$ for the branch \code{(x\,$\in$\,Int) ? succ\,x : sqrt\,x} and, since \code{$(\Bool\vee\Real)\setminus\Real = \Bool$}, yields $\psi(x)=\{\Bool\}$ for the branch \code{$\neg$x}. So the function ...

... will be checked for the input types $\Int$, $\Real\setminus\Int$, and ...

It is not too difficult to generalize this rule when the lambda is typed by an intersection type:\vspace{-.8mm}
\begin{mathpar}
\Infer[AbsInf+]
  {\forall i\in I\hspace*{2mm}\Gamma,x:s_i\vdash e\triangleright\psi_i \and \Gamma,x:s_i\vdash ...
... I}s_i\to t_i} x.e:\bigwedge_{i\in I}(s_i\to t_i)\land\bigwedge_{(u, w)\in T_i}(u\to w) }
  {}\vspace{-3mm}
\end{mathpar}
For each arrow declared in the interface of the function, we first typecheck the body of the function as usual (to check that the arrow is valid) and collect the refined types for the parameter $x$. Then we deduce all possible output types for this refined set of input types and add the resulting arrows to the type deduced for the whole function (see Appendix~\ref{app:optimize} for an even more precise rule).

In summary, in order to type a function we use the type-cases on its parameter to partition the domain of the function and we type-check the function on each single partition rather than on the union thereof. Of course, we could use a much finer partition: the finest (but impossible) one is to check the function against the singleton types of all its inputs. But any finer partition would return, in many cases, not much better information, since most ...

... therefore add to our deduction system one last rule:\\[1mm]
Whenever a function parameter is the argument of an overloaded function, we record as possible types for this parameter all the domains $t_i$ of the arrows that type the overloaded function, restricted (via intersection) by the static type $t$ of the parameter and provided that the type is not empty ($t\wedge t_i\not\simeq\Empty$).
We show the remarkable power of this rule on some practical examples in Section~\ref{sec:practical}.

\ignore{\color{darkred} RANDOM THOUGHTS:
A delicate point is the definition of the union ...