gradual.tex 10.5 KB
Newer Older
Victor Lanvin's avatar
Victor Lanvin committed
1
2
3
4
5
6
7
Gradual typing is an approach proposed by~\citet{siek2006gradual} to
combine the safety guarantees of static typing with the programming
flexibility of dynamic typing. The idea is to introduce an \emph{unknown} 
(or \emph{dynamic}) type, denoted $\dyn$, used to inform the compiler that
some static type-checking can be omitted, at the cost of some additional
runtime checks. The use of both static typing and dynamic typing in a same
program creates a boundary between the two, where the compiler automatically
Giuseppe Castagna's avatar
Giuseppe Castagna committed
8
adds---often costly~\cite{takikawa2016sound}---dynamic type-checks to ensure 
Victor Lanvin's avatar
Victor Lanvin committed
9
10
that a value crossing the barrier is correctly typed.

Victor Lanvin's avatar
Victor Lanvin committed
11
12
Occurrence typing and gradual typing are two complementary disciplines
which have a lot to gain to be integrated, although we are not aware
Giuseppe Castagna's avatar
Giuseppe Castagna committed
13
14
of any study in this sense. We study it for the formalism of Section~\ref{sec:language} for which the integration of gradual typing was defined by~\citet{castagna2019gradual}.
In a sense, occurrence typing is a
Victor Lanvin's avatar
Victor Lanvin committed
15
discipline designed to push forward the frontiers beyond which gradual
Victor Lanvin's avatar
Victor Lanvin committed
16
typing is needed, thus reducing the amount of runtime checks needed. For 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
17
instance, the JavaScript code of~\eqref{foo} and~\eqref{foo2} in the introduction can also be
Giuseppe Castagna's avatar
Giuseppe Castagna committed
18
typed by using gradual typing:\vspace{-.4mm}
19
\begin{alltt}\color{darkblue}\morecompact
Giuseppe Castagna's avatar
Giuseppe Castagna committed
20
21
  function foo(x\textcolor{darkred}{ : \pmb{\dyn}}) \{
    (typeof(x) === "number")? x++ : x.trim()  \refstepcounter{equation}                                \mbox{\color{black}\rm(\theequation)}\label{foo3}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
22
  \}\negspace
Victor Lanvin's avatar
Victor Lanvin committed
23
\end{alltt}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
24
``Standard'' gradual typing inserts two dynamic checks since it compiles the code above into:
25
\begin{alltt}\color{darkblue}\morecompact
Victor Lanvin's avatar
Victor Lanvin committed
26
  function foo(x) \{
Giuseppe Castagna's avatar
Giuseppe Castagna committed
27
    (typeof(x) === "number")? (\textcolor{darkred}{\Cast{number}{{\color{darkblue}x}}})++ : (\textcolor{darkred}{\Cast{string}{{\color{darkblue}x}}}).trim()
Giuseppe Castagna's avatar
Giuseppe Castagna committed
28
  \}\negspace
Victor Lanvin's avatar
Victor Lanvin committed
29
\end{alltt}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
30
31
where {\Cast{$t$}{$e$}} is a type-cast that dynamically checks whether the value returned by $e$ has type $t$.\footnote{Intuitively, \code{\Cast{$t$}{$e$}} is
  syntactic sugar for \code{(typeof($e$)==="$t$")\,?\,$e$\,:\,(throw "Type
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
32
    error")}. Not exactly though, since to implement compilation \emph{à la} sound gradual typing it is necessary to use casts on function types that need special handling.}
Victor Lanvin's avatar
Victor Lanvin committed
33
%
Giuseppe Castagna's avatar
Giuseppe Castagna committed
34
We already saw  that thanks to occurrence typing we can annotate the parameter \code{x} by \code{number|string} instead of \dyn{} and avoid the insertion of any cast. 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
35
But occurrence typing can be used also on the gradually typed code above in order to statically detect the insertion of useless casts. Using
Giuseppe Castagna's avatar
Giuseppe Castagna committed
36
occurrence typing to type the gradually-typed version of \code{foo} in~\eqref{foo3}, allows the system to avoid inserting the first cast
Victor Lanvin's avatar
Victor Lanvin committed
37
\code{\Cast{number}{x}} since, thanks to occurrence typing, the
Giuseppe Castagna's avatar
Giuseppe Castagna committed
38
39
occurrence of \code{x} at issue is given type \code{number} (but the
second cast is still necessary though). But removing only this cast is far
Victor Lanvin's avatar
Victor Lanvin committed
40
from being satisfactory, since when this function is applied to an integer
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
41
there are some casts that still need to be inserted outside  the function.
Victor Lanvin's avatar
Victor Lanvin committed
42
43
44
45
46
The reason is that the compiled version of the function
has type \code{\dyn$\to$number}, that is, it expects an argument of type
\dyn, and thus we have to apply a cast (either to the argument or
to the function) whenever this is not the case. In particular, the
application \code{foo(42)} will be compiled as
Giuseppe Castagna's avatar
Giuseppe Castagna committed
47
\code{foo(\Cast{\dyn}{42})}. Now, the main problem with such a cast is not
Victor Lanvin's avatar
Victor Lanvin committed
48
that it produces some unnecessary overhead by performing useless
Giuseppe Castagna's avatar
Giuseppe Castagna committed
49
checks (a cast to \dyn{} can easily be detected and safely ignored at runtime). 
Victor Lanvin's avatar
Victor Lanvin committed
50
51
52
53
54
55
56
57
58
The main problem is that the combination of such a cast with type-cases 
will lead to unintuitive results under the standard operational
semantics of type-cases and casts.
Indeed, consider the standard semantics
of the type-case \code{(typeof($e$)==="$t$")} which consists in
reducing $e$ to a value and checking whether the type of the value is a
subtype of $t$. In standard gradual semantics, \code{\Cast{\dyn}{42}} is a value. 
And this value is of type \code{\dyn}, which is not a subtype of \code{number}. 
Therefore the check in \code{foo} would fail for \code{\Cast{\dyn}{42}}, and so
59
60
61
62
63
would the whole function call.
Although this behavior is type safe, this violates the gradual 
guarantee~\cite{siek2015refined} since giving a \emph{more precise} type to
the parameter \code{x} (such as \code{number}) would make the function succeed,
as the cast to $\dyn$ would not be inserted.
Victor Lanvin's avatar
Victor Lanvin committed
64
A solution is to modify the semantics of type-cases, and in particular of 
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
\code{typeof}, to strip off all the casts in values, even nested ones.
While this adds a new overhead at runtime, this is preferable to losing the
gradual guarantee, and the overhead can be mitigated by having a proper
representation of cast values that allows to strip all casts at once.

However, this problem gets much more complex when considering functional values.
In fact, as we hinted in Section~\ref{ssec:algorithm}, there is no way to
modify the semantics of type cases to preserve both the gradual guarantee and
the soundness of the system in the presence of arbitrary type cases.
For example, consider the function
$f = \lambda^{(\Int \to \Int) \to \Int} g. \tcase{g}{(\Int \to \Int)}{g\
1}{\code{true}}$. This function is well-typed since the type of the
parameter guarantees that only the first branch can be taken, and thus that
only an integer can be returned. However, if we apply this function
to $h = \MCast{\Int \to \Int}{(\lambda^{\dyn \to \dyn} x.\ x)}$, the type case
strips off the cast around $h$ (to preserve the gradual guarantee), then
checks if $\lambda^{\dyn \to \dyn} x.\ x$ has type $\Int \to \Int$. 
Since $\dyn \to \dyn$ is not a subtype of $\Int \to \Int$, the check fails
and the application returns $\code{true}$, which is unsound.
Therefore, to preserve soundness in the presence of gradual types, type cases
should not test functional types other than $\Empty \to \Any$, which is
the same restriction as the one presented by~\citet{siek2016recursive}.
Victor Lanvin's avatar
Victor Lanvin committed
87

88
89
90
91
While this solves the problem of the gradual guarantee, it is clear that
it would be much better if the application \code{foo(42)} were compiled as is,
without introducing the cast \code{\Cast{\dyn}{42}}, thus getting rid of the
overhead associated with removing this cast in the type case.
Victor Lanvin's avatar
Victor Lanvin committed
92
93
This is where the previous section about refining function types comes in handy.
To get rid of all superfluous casts, we have to fully exploit the information 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
94
provided to us by occurrence typing and deduce for the function in~\eqref{foo3} the type
Victor Lanvin's avatar
Victor Lanvin committed
95
\code{(number$\to$number)$\wedge$((\dyn\textbackslash
Giuseppe Castagna's avatar
Giuseppe Castagna committed
96
  number)$\to$string)}, so that no cast is inserted when the
Victor Lanvin's avatar
Victor Lanvin committed
97
98
function is applied to a number. 
To achieve this, we simply modify the typing rule for functions that we defined
Giuseppe Castagna's avatar
Giuseppe Castagna committed
99
100
101
102
103
in the previous section to accommodate for gradual typing. Let $\sigma$ and $\tau$ range over \emph{gradual types}, that is the types produced by the grammar in Definition~\ref{def:types} to which we add \dyn{} as basic type (see~\citet{castagna2019gradual} for the definition of the subtyping relation on these types). For every gradual type
$\tau$, define $\tauUp$ as the (non graudal) type obtained from $\tau$ by replacing all
covariant occurrences of \dyn{} by \Any{} and all contravariant ones by \Empty. The
type $\tauUp$ can be seen as the \emph{maximal} interpretation of $\tau$, that is,
every expression that can safely be cast to $\tau$ is of type $\tauUp$. In
Victor Lanvin's avatar
Victor Lanvin committed
104
other words, if a function expects an argument of type $\tau$ but can be 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
105
106
107
typed under the hypothesis that the argument has type $\tauUp$, then no casts
are needed, since every cast that succeeds will be a subtype of
$\tauUp$. Taking advantage of this property, we modify the rule for
Giuseppe Castagna's avatar
Giuseppe Castagna committed
108
functions as: \vspace{-2mm}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
109
%
Victor Lanvin's avatar
Victor Lanvin committed
110
111
112
113
114
115
116
117
118
119
120
121
122
123
%\begin{mathpar}
%  \Infer[Abs]
%    {\Gamma,x:\tau\vdash e\triangleright\psi\and \forall i\in I\quad \Gamma,x:\sigma_i\vdash e:\tau_i
%     \and \forall j \in J \subseteq I\quad \Gamma,x:\sigma_j^*\vdash e:\tau_j}
%    {
%    \Gamma\vdash\lambda x:\tau.e:\textstyle\bigwedge_{i\in I}\sigma_i\to \tau_i
%      \land \bigwedge_{j\in J}\sigma_j^*\to \tau_j
%    }
%    {\psi(x)=\{\sigma_i\alt i\in I\}}
%\end{mathpar}
\[
  \textsc{[AbsInf+]}
  \frac
  {
Giuseppe Castagna's avatar
Giuseppe Castagna committed
124
125
126
127
    \begin{array}{c}
    \hspace*{-8mm}T = \{ (\sigma', \tau') \} \cup \{ (\sigma,\tau) ~|~ \sigma \in \psi(x) \land \Gamma, x: \sigma \vdash e: \tau \} \cup \{ (\sigmaUp,\tau) ~|~ \sigma \in \psi(x) \land \Gamma, x: \sigmaUp \vdash e: \tau \}\\
    \Gamma,x:\sigma'\vdash e\triangleright\psi \qquad \qquad \qquad\Gamma,x:\sigma'\vdash e:\tau'
    \end{array}
Victor Lanvin's avatar
Victor Lanvin committed
128
129
  }
  {
Giuseppe Castagna's avatar
Giuseppe Castagna committed
130
    \Gamma\vdash\lambda x:\sigma'.e:\textstyle\bigwedge_{(\sigma,\tau) \in T}\sigma\to \tau
Giuseppe Castagna's avatar
Giuseppe Castagna committed
131
  }\vspace{-2mm}
Victor Lanvin's avatar
Victor Lanvin committed
132
133
134
135
\]
The main idea behind this rule is the same as before: we first collect all the
information we can into $\psi$ by analyzing the body of the function. We then
retype the function using the new hypothesis $x : \sigma$ for every 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
136
137
138
$\sigma \in \psi(x)$. Furthermore, we also retype the function using the hypothesis
$x : \sigmaUp$: as explained before the rule, whenever this typing suceeds it eliminates unnecessary gradual types and, thus, unecessary casts.
Let us see how this works on the function \code{foo} in \eqref{foo3}. First, we deduce
Victor Lanvin's avatar
Victor Lanvin committed
139
the refined hypothesis 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
140
$\psi(\code x) = \{\,\code{number}{\land}\dyn\;,\;\dyn \textbackslash \code{number}\,\}$.
Victor Lanvin's avatar
Victor Lanvin committed
141
142
Typing the function using this new hypothesis but without considering the
maximal interpretation would yield
Giuseppe Castagna's avatar
Giuseppe Castagna committed
143
144
$(\dyn \to \code{number}\vee\code{string}) \land ((\code{number} \land \dyn) \to \code{number})
\land ((\dyn \textbackslash \code{number}) \to \code{string})$. However, as
Victor Lanvin's avatar
Victor Lanvin committed
145
146
147
we stated before, this would introduce an unnecessary cast if the function 
were to be applied to an integer. Hence the need for the second part of
Rule~\textsc{[AbsInf+]}: the maximal interpretation of $\code{number} \land \dyn$
Giuseppe Castagna's avatar
Giuseppe Castagna committed
148
is $\code{number}$, and it is clear that, if $\code x$ is given type \code{number},
Victor Lanvin's avatar
Victor Lanvin committed
149
the function type-checks, thanks to occurrence typing. Thus, after some
Giuseppe Castagna's avatar
Giuseppe Castagna committed
150
151
routine simplifications, we can actually deduce the desired type
$(\code{number} \to \code{number}) \land ((\dyn \textbackslash \code{number}) \to \code{string})$.
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
152
153
154
155



\beppe{Problem: how to compile functions with intersection types, since their body is typed several distinct types. I see two possible solutions: either merge the casts of the various typings (as we did in the compilation of polymorphic functions for semantic subtyping) or allow just one gradual type in the intersection when a function is explicitly typed (reasonable since, why would you use more gradual types in an intersection?)}