gradual.tex 9.53 KB
Newer Older
Victor Lanvin's avatar
Victor Lanvin committed
1
2
3
4
5
6
7
Gradual typing is an approach proposed by~\citet{siek2006gradual} to
combine the safety guarantees of static typing with the programming
flexibility of dynamic typing. The idea is to introduce an \emph{unknown} 
(or \emph{dynamic}) type, denoted $\dyn$, used to inform the compiler that
some static type-checking can be omitted, at the cost of some additional
runtime checks. The use of both static typing and dynamic typing in a same
program creates a boundary between the two, where the compiler automatically
Giuseppe Castagna's avatar
Giuseppe Castagna committed
8
adds---often costly~\cite{takikawa2016sound}---dynamic type-checks to ensure 
Victor Lanvin's avatar
Victor Lanvin committed
9
10
that a value crossing the barrier is correctly typed.

Victor Lanvin's avatar
Victor Lanvin committed
11
12
Occurrence typing and gradual typing are two complementary disciplines
which have a lot to gain to be integrated, although we are not aware
Giuseppe Castagna's avatar
Giuseppe Castagna committed
13
14
of any study in this sense. We study it for the formalism of Section~\ref{sec:language} for which the integration of gradual typing was defined by~\citet{castagna2019gradual}.
In a sense, occurrence typing is a
Victor Lanvin's avatar
Victor Lanvin committed
15
discipline designed to push forward the frontiers beyond which gradual
Victor Lanvin's avatar
Victor Lanvin committed
16
typing is needed, thus reducing the amount of runtime checks needed. For 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
17
instance, the the JavaScript code of~\eqref{foo} and~\eqref{foo2} in the introduction can be
Giuseppe Castagna's avatar
Giuseppe Castagna committed
18
typed by using gradual typing:\vspace{-.4mm}
19
\begin{alltt}\color{darkblue}\morecompact
Giuseppe Castagna's avatar
Giuseppe Castagna committed
20
21
  function foo(x\textcolor{darkred}{ : \pmb{\dyn}}) \{
    (typeof(x) === "number")? x++ : x.trim()  \refstepcounter{equation}                                \mbox{\color{black}\rm(\theequation)}\label{foo3}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
22
  \}\negspace
Victor Lanvin's avatar
Victor Lanvin committed
23
\end{alltt}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
24
``Standard'' gradual typing inserts two dynamic checks since it compiles the code above into:
25
\begin{alltt}\color{darkblue}\morecompact
Victor Lanvin's avatar
Victor Lanvin committed
26
  function foo(x) \{
Giuseppe Castagna's avatar
Giuseppe Castagna committed
27
    (typeof(x) === "number")? (\textcolor{darkred}{\Cast{number}{{\color{darkblue}x}}})++ : (\textcolor{darkred}{\Cast{string}{{\color{darkblue}x}}}).trim()
Giuseppe Castagna's avatar
Giuseppe Castagna committed
28
  \}\negspace
Victor Lanvin's avatar
Victor Lanvin committed
29
\end{alltt}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
30
31
where {\Cast{$t$}{$e$}} is a type-cast that dynamically checks whether the value returned by $e$ has type $t$.\footnote{Intuitively, \code{\Cast{$t$}{$e$}} is
  syntactic sugar for \code{(typeof($e$)==="$t$")\,?\,$e$\,:\,(throw "Type
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
32
    error")}. Not exactly though, since to implement compilation \emph{à la} sound gradual typing it is necessary to use casts on function types that need special handling.}
Victor Lanvin's avatar
Victor Lanvin committed
33
%
Giuseppe Castagna's avatar
Giuseppe Castagna committed
34
35
36
We already saw  that thanks to occurrence typing we can annotate the parameter \code{x} by \code{number|string} instead of \dyn{} and avoid the insertion of any cast. 
But occurrence typing can be used also on the gradually typed code in order to statically detect the insertion of useless casts. Using
occurrence typing to type the gradually-typed version of \code{foo} in~\eqref{foo3}, allows the system to avoid inserting the first cast
Victor Lanvin's avatar
Victor Lanvin committed
37
38
\code{\Cast{number}{x}} since, thanks to occurrence typing, the
occurrence of \code{x} at issue is given type \code{number} (the
Giuseppe Castagna's avatar
Giuseppe Castagna committed
39
second cast is still necessary however). But removing only this cast is far
Victor Lanvin's avatar
Victor Lanvin committed
40
from being satisfactory, since when this function is applied to an integer
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
41
there are some casts that still need to be inserted outside  the function.
Victor Lanvin's avatar
Victor Lanvin committed
42
43
44
45
46
The reason is that the compiled version of the function
has type \code{\dyn$\to$number}, that is, it expects an argument of type
\dyn, and thus we have to apply a cast (either to the argument or
to the function) whenever this is not the case. In particular, the
application \code{foo(42)} will be compiled as
Giuseppe Castagna's avatar
Giuseppe Castagna committed
47
\code{foo(\Cast{\dyn}{42})}. Now, the main problem with such a cast is not
Victor Lanvin's avatar
Victor Lanvin committed
48
that it produces some unnecessary overhead by performing useless
Giuseppe Castagna's avatar
Giuseppe Castagna committed
49
checks (a cast to \dyn{} can easily be detected and safely ignored at runtime). 
Victor Lanvin's avatar
Victor Lanvin committed
50
51
52
53
54
55
56
57
58
59
The main problem is that the combination of such a cast with type-cases 
will lead to unintuitive results under the standard operational
semantics of type-cases and casts.
Indeed, consider the standard semantics
of the type-case \code{(typeof($e$)==="$t$")} which consists in
reducing $e$ to a value and checking whether the type of the value is a
subtype of $t$. In standard gradual semantics, \code{\Cast{\dyn}{42}} is a value. 
And this value is of type \code{\dyn}, which is not a subtype of \code{number}. 
Therefore the check in \code{foo} would fail for \code{\Cast{\dyn}{42}}, and so
would the whole function call. 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
60
Although this behavior is type safe, this is the opposite of
Victor Lanvin's avatar
Victor Lanvin committed
61
62
63
64
65
66
67
68
what every programmer would expect: one would expect the test
\code{(typeof($e$)==="number")} to return true for \code{\Cast{\dyn}{42}}
and false for, say, \code{\Cast{\dyn}{true}}, whereas
the standard semantics of type-cases would return false in both cases.

A solution is to modify the semantics of type-cases, and in particular of 
\code{typeof}, to strip off all the casts in a value, even nested ones. This
however adds a new overhead at runtime. Another solution is to simply accept
Giuseppe Castagna's avatar
Giuseppe Castagna committed
69
this counter-intuitive result, which has at least the benefit of promoting
Victor Lanvin's avatar
Victor Lanvin committed
70
the dynamic type to a first class type, instead of just considering it as a 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
71
directive to the front-end. Indeed, this would allow to dynamically check
Victor Lanvin's avatar
Victor Lanvin committed
72
whether some argument has the dynamic type \code{\dyn} (i.e., whether it was
Giuseppe Castagna's avatar
Giuseppe Castagna committed
73
applied to a cast to such a type) simply by \code{(typeof($e$)==="\dyn")}. 
Victor Lanvin's avatar
Victor Lanvin committed
74
75
76
77
78
79
80
Whatever solution we choose it is clear that in both cases it would be much
better if the application \code{foo(42)} were compiled as is, thus getting
rid of a cast that at best is useless and at worse gives a counter-intuitive and
unexpected semantics.

This is where the previous section about refining function types comes in handy.
To get rid of all superfluous casts, we have to fully exploit the information 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
81
provided to us by occurrence typing and deduce for the function in~\eqref{foo3} the type
Victor Lanvin's avatar
Victor Lanvin committed
82
\code{(number$\to$number)$\wedge$((\dyn\textbackslash
Giuseppe Castagna's avatar
Giuseppe Castagna committed
83
  number)$\to$string)}, so that no cast is inserted when the
Victor Lanvin's avatar
Victor Lanvin committed
84
85
function is applied to a number. 
To achieve this, we simply modify the typing rule for functions that we defined
Giuseppe Castagna's avatar
Giuseppe Castagna committed
86
87
88
89
90
in the previous section to accommodate for gradual typing. Let $\sigma$ and $\tau$ range over \emph{gradual types}, that is the types produced by the grammar in Definition~\ref{def:types} to which we add \dyn{} as basic type (see~\citet{castagna2019gradual} for the definition of the subtyping relation on these types). For every gradual type
$\tau$, define $\tauUp$ as the (non graudal) type obtained from $\tau$ by replacing all
covariant occurrences of \dyn{} by \Any{} and all contravariant ones by \Empty. The
type $\tauUp$ can be seen as the \emph{maximal} interpretation of $\tau$, that is,
every expression that can safely be cast to $\tau$ is of type $\tauUp$. In
Victor Lanvin's avatar
Victor Lanvin committed
91
other words, if a function expects an argument of type $\tau$ but can be 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
92
93
94
typed under the hypothesis that the argument has type $\tauUp$, then no casts
are needed, since every cast that succeeds will be a subtype of
$\tauUp$. Taking advantage of this property, we modify the rule for
Giuseppe Castagna's avatar
Giuseppe Castagna committed
95
functions as: \vspace{-2mm}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
96
%
Victor Lanvin's avatar
Victor Lanvin committed
97
98
99
100
101
102
103
104
105
106
107
108
109
110
%\begin{mathpar}
%  \Infer[Abs]
%    {\Gamma,x:\tau\vdash e\triangleright\psi\and \forall i\in I\quad \Gamma,x:\sigma_i\vdash e:\tau_i
%     \and \forall j \in J \subseteq I\quad \Gamma,x:\sigma_j^*\vdash e:\tau_j}
%    {
%    \Gamma\vdash\lambda x:\tau.e:\textstyle\bigwedge_{i\in I}\sigma_i\to \tau_i
%      \land \bigwedge_{j\in J}\sigma_j^*\to \tau_j
%    }
%    {\psi(x)=\{\sigma_i\alt i\in I\}}
%\end{mathpar}
\[
  \textsc{[AbsInf+]}
  \frac
  {
Giuseppe Castagna's avatar
Giuseppe Castagna committed
111
112
113
114
    \begin{array}{c}
    \hspace*{-8mm}T = \{ (\sigma', \tau') \} \cup \{ (\sigma,\tau) ~|~ \sigma \in \psi(x) \land \Gamma, x: \sigma \vdash e: \tau \} \cup \{ (\sigmaUp,\tau) ~|~ \sigma \in \psi(x) \land \Gamma, x: \sigmaUp \vdash e: \tau \}\\
    \Gamma,x:\sigma'\vdash e\triangleright\psi \qquad \qquad \qquad\Gamma,x:\sigma'\vdash e:\tau'
    \end{array}
Victor Lanvin's avatar
Victor Lanvin committed
115
116
  }
  {
Giuseppe Castagna's avatar
Giuseppe Castagna committed
117
    \Gamma\vdash\lambda x:\sigma'.e:\textstyle\bigwedge_{(\sigma,\tau) \in T}\sigma\to \tau
Giuseppe Castagna's avatar
Giuseppe Castagna committed
118
  }\vspace{-2mm}
Victor Lanvin's avatar
Victor Lanvin committed
119
120
121
122
\]
The main idea behind this rule is the same as before: we first collect all the
information we can into $\psi$ by analyzing the body of the function. We then
retype the function using the new hypothesis $x : \sigma$ for every 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
123
124
125
$\sigma \in \psi(x)$. Furthermore, we also retype the function using the hypothesis
$x : \sigmaUp$: as explained before the rule, whenever this typing suceeds it eliminates unnecessary gradual types and, thus, unecessary casts.
Let us see how this works on the function \code{foo} in \eqref{foo3}. First, we deduce
Victor Lanvin's avatar
Victor Lanvin committed
126
the refined hypothesis 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
127
$\psi(\code x) = \{\,\code{number}{\land}\dyn\;,\;\dyn \textbackslash \code{number}\,\}$.
Victor Lanvin's avatar
Victor Lanvin committed
128
129
Typing the function using this new hypothesis but without considering the
maximal interpretation would yield
Giuseppe Castagna's avatar
Giuseppe Castagna committed
130
131
$(\dyn \to \code{number}\vee\code{string}) \land ((\code{number} \land \dyn) \to \code{number})
\land ((\dyn \textbackslash \code{number}) \to \code{string})$. However, as
Victor Lanvin's avatar
Victor Lanvin committed
132
133
134
we stated before, this would introduce an unnecessary cast if the function 
were to be applied to an integer. Hence the need for the second part of
Rule~\textsc{[AbsInf+]}: the maximal interpretation of $\code{number} \land \dyn$
Giuseppe Castagna's avatar
Giuseppe Castagna committed
135
is $\code{number}$, and it is clear that, if $\code x$ is given type \code{number},
Victor Lanvin's avatar
Victor Lanvin committed
136
the function type-checks, thanks to occurrence typing. Thus, after some
Giuseppe Castagna's avatar
Giuseppe Castagna committed
137
138
routine simplifications, we can actually deduce the desired type
$(\code{number} \to \code{number}) \land ((\dyn \textbackslash \code{number}) \to \code{string})$.
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
139
140
141
142



\beppe{Problem: how to compile functions with intersection types, since their body is typed several distinct types. I see two possible solutions: either merge the casts of the various typings (as we did in the compilation of polymorphic functions for semantic subtyping) or allow just one gradual type in the intersection when a function is explicitly typed (reasonable since, why would you use more gradual types in an intersection?)}