Victor Lanvin committed Jul 04, 2019 1 2 3 4 5 6 7 8 9 10 Gradual typing is an approach proposed by~\citet{siek2006gradual} to combine the safety guarantees of static typing with the programming flexibility of dynamic typing. The idea is to introduce an \emph{unknown} (or \emph{dynamic}) type, denoted $\dyn$, used to inform the compiler that some static type-checking can be omitted, at the cost of some additional runtime checks. The use of both static typing and dynamic typing in a same program creates a boundary between the two, where the compiler automatically adds ---often costly~\cite{takikawa2016sound}--- dynamic type-checks to ensure that a value crossing the barrier is correctly typed.  Victor Lanvin committed Jun 27, 2019 11 12 Occurrence typing and gradual typing are two complementary disciplines which have a lot to gain to be integrated, although we are not aware  Victor Lanvin committed Jul 04, 2019 13 14 15 of any work in this sense. Moreover, the integration of gradual typing with set-theoretic types has already been studied by~\citet{castagna2019gradual}, which allows us to keep the same formalism. In a sense, occurrence typing is a  Victor Lanvin committed Jun 27, 2019 16 discipline designed to push forward the frontiers beyond which gradual  Victor Lanvin committed Jul 04, 2019 17 18 typing is needed, thus reducing the amount of runtime checks needed. For instance, the example at the beginning can be  Victor Lanvin committed Jun 27, 2019 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 typed by using gradual typing: \begin{alltt}\color{darkblue} function foo(x\textcolor{darkred}{ : \dyn}) \{ (typeof(x) === "number")? x++ : x.length \} \end{alltt} Using standard'' gradual typing this is compiled into: \begin{alltt}\color{darkblue} function foo(x) \{ (typeof(x) === "number")? (\Cast{number}{x})++ : (\Cast{string}{x}).length \} \end{alltt} where {\Cast{$t$}{$e$}} is a type-cast.\footnote{Intuitively, \code{\Cast{$t$}{$e$}} is syntactic sugar for \code{(typeof($e$)==="$t$")? $e$ : (throw "Type error")}. Not exactly though, since to implement compilation \emph{à la} sound gradual typing we need cast on function types.} % We have already seen in the introduction that by using occurrence typing combined with a union type instead of the gradual type \dyn for parameter annotation, we can avoid the insertion of any cast, at the cost of some additional type annotations. But occurrence typing can be used also on the gradually typed code. If we use occurrence typing to type the gradually-typed version of \code{foo}, this allows the system to avoid inserting the first cast \code{\Cast{number}{x}} since, thanks to occurrence typing, the occurrence of \code{x} at issue is given type \code{number} (the second cast is still necessary however). But removing this cast is far from being satisfactory, since when this function is applied to an integer there are some casts that still need to be inserted outside of the function. The reason is that the compiled version of the function has type \code{\dyn$\to$number}, that is, it expects an argument of type \dyn, and thus we have to apply a cast (either to the argument or to the function) whenever this is not the case. In particular, the application \code{foo(42)} will be compiled as \code{foo(\Cast{\dyn}{42})}. Now the main problem with such a cast is not that it produces some unnecessary overhead by performing useless checks (a cast to \dyn can easily be detected and safely ignored at runtime). The main problem is that the combination of such a cast with type-cases will lead to unintuitive results under the standard operational semantics of type-cases and casts. Indeed, consider the standard semantics of the type-case \code{(typeof($e$)==="$t$")} which consists in reducing $e$ to a value and checking whether the type of the value is a subtype of $t$. In standard gradual semantics, \code{\Cast{\dyn}{42}} is a value. And this value is of type \code{\dyn}, which is not a subtype of \code{number}. Therefore the check in \code{foo} would fail for \code{\Cast{\dyn}{42}}, and so would the whole function call. Although this behavior is sound, this is the opposite of what every programmer would expect: one would expect the test \code{(typeof($e$)==="number")} to return true for \code{\Cast{\dyn}{42}} and false for, say, \code{\Cast{\dyn}{true}}, whereas the standard semantics of type-cases would return false in both cases. A solution is to modify the semantics of type-cases, and in particular of \code{typeof}, to strip off all the casts in a value, even nested ones. This however adds a new overhead at runtime. Another solution is to simply accept this counter-intuitive result, which has the additional benefit of promoting the dynamic type to a first class type, instead of just considering it as a directive to the front-end. Indeed, this approach allows to dynamically check whether some argument has the dynamic type \code{\dyn} (i.e., whether it was applied to a cast to such a type, simply by \code{(typeof($e$)==="\dyn")}. Whatever solution we choose it is clear that in both cases it would be much better if the application \code{foo(42)} were compiled as is, thus getting rid of a cast that at best is useless and at worse gives a counter-intuitive and unexpected semantics. This is where the previous section about refining function types comes in handy. To get rid of all superfluous casts, we have to fully exploit the information provided to us by occurrence typing and deduce for the compiled function the type \code{(number$\to$number)$\wedge$((\dyn\textbackslash number)$\to$number)}, so that no cast is inserted when the function is applied to a number. To achieve this, we simply modify the typing rule for functions that we defined in the previous section to accommodate for gradual typing. For every gradual type $\tau$, we define $\tau^*$ as the type obtained from $\tau$ by replacing all covariant occurrences of \dyn by \Any\ and all contravariant ones by \Empty. The type $\tau^*$ can be seen as the \emph{maximal} interpretation of $\tau$, that is, any expression that can safely be cast to $\tau$ is of type $\tau^*$. In other words, if a function expects an argument of type $\tau$ but can be typed under the hypothesis that the argument has type $\tau^*$, then no casts are needed, since every cast that succeeds will always be to a subtype of $\tau^*$. Taking advantage of this property, we modify the typing rule for functions as follows:  Victor Lanvin committed Jul 04, 2019 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 %\begin{mathpar} % \Infer[Abs] % {\Gamma,x:\tau\vdash e\triangleright\psi\and \forall i\in I\quad \Gamma,x:\sigma_i\vdash e:\tau_i % \and \forall j \in J \subseteq I\quad \Gamma,x:\sigma_j^*\vdash e:\tau_j} % { % \Gamma\vdash\lambda x:\tau.e:\textstyle\bigwedge_{i\in I}\sigma_i\to \tau_i % \land \bigwedge_{j\in J}\sigma_j^*\to \tau_j % } % {\psi(x)=\{\sigma_i\alt i\in I\}} %\end{mathpar} \textsc{[AbsInf+]} \frac {  Mickael Laurent committed Jul 04, 2019 117  \begin{aligned}  Victor Lanvin committed Jul 04, 2019 118 119 120 121  \Gamma,x:&\sigma\vdash e\triangleright\psi \qquad \qquad \Gamma,x:\sigma\vdash e:\tau\\ T = \{ (\sigma, \tau) \} &\cup \{ (\sigma,\tau) ~|~ \sigma \in \psi(x) \land \Gamma, x: \sigma \vdash e: \tau \}\\ &\cup \{ (\sigma^*,\tau) ~|~ \sigma \in \psi(x) \land \Gamma, x: \sigma^* \vdash e: \tau \}  Mickael Laurent committed Jul 04, 2019 122  \end{aligned}  Victor Lanvin committed Jul 04, 2019 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149  } { \Gamma\vdash\lambda x:\sigma.e:\textstyle\bigwedge_{(\sigma,\tau) \in T}\sigma\to \tau } The main idea behind this rule is the same as before: we first collect all the information we can into $\psi$ by analyzing the body of the function. We then retype the function using the new hypothesis $x : \sigma$ for every $\sigma \in \psi(x)$. However, we also retype the function using the hypothesis $x : \sigma^*$, following the explanation we gave in the previous paragraph. This allows us to eliminate unnecessary gradual types. Going back to our previous example with function \code{foo}, we first deduce the refined hypothesis $\psi(x) = \{\code{number}\land\dyn, \dyn \textbackslash \code{number}\}$. Typing the function using this new hypothesis but without considering the maximal interpretation would yield $(\dyn \to \code{number}) \land ((\code{number} \land \dyn) \to \code{number}) \land ((\dyn \textbackslash \code{number}) \to \code{number})$. However, as we stated before, this would introduce an unnecessary cast if the function were to be applied to an integer. Hence the need for the second part of Rule~\textsc{[AbsInf+]}: the maximal interpretation of $\code{number} \land \dyn$ is $\code{number}$, and it is clear that, if $x$ is given type \code{number}, the function type-checks, thanks to occurrence typing. Thus, after some simplifications, we can actually deduce the desired type $(\code{number} \to \code{number}) \land ((\dyn \textbackslash \code{number}) \to \code{number})$.