Commit eaf6d4de authored by Giuseppe Castagna's avatar Giuseppe Castagna
Browse files

Finished pass on Section 3

parent 2f04fe50
......@@ -142,22 +142,3 @@ It is equivalent to the type $\orecord{b=\Bool}$, and thus we can deduce that $x
\subsection{Integrating gradual typing}
\input{gradual}
\subsection{Limitations}
In general we cannot handle flow of information. In particular, the result of a type test can flow only to the branches but not outside the test. As a consequence:
\begin{itemize}
\item Cannot handle let bindings:
\begin{alltt}
let x = (y\(\in\)Int)?`yes:`no
in (x\(\in\)`yes)?y+1:not(y)
\end{alltt}
which is well typed for $y:\Int\vee\Bool$
\item The previous example cannot be solved by partial evaluation since we cannot handle nesting of ifs in the condition:
\begin{alltt}
( ((y\(\in\)Int)?`yes:`no)\(\in\)`yes )? y+1 : not(y)
\end{alltt}
\end{itemize}
Both things that system of~\citet{THF10} can do.
......@@ -5,43 +5,38 @@ flexibility of dynamic typing. The idea is to introduce an \emph{unknown}
some static type-checking can be omitted, at the cost of some additional
runtime checks. The use of both static typing and dynamic typing in a same
program creates a boundary between the two, where the compiler automatically
adds ---often costly~\cite{takikawa2016sound}--- dynamic type-checks to ensure
adds---often costly~\cite{takikawa2016sound}---dynamic type-checks to ensure
that a value crossing the barrier is correctly typed.
Occurrence typing and gradual typing are two complementary disciplines
which have a lot to gain to be integrated, although we are not aware
of any work in this sense. Moreover, the integration of gradual typing with
set-theoretic types has already been studied by~\citet{castagna2019gradual},
which allows us to keep the same formalism. In a sense, occurrence typing is a
of any study in this sense. We study it for the formalism of Section~\ref{sec:language} for which the integration of gradual typing was defined by~\citet{castagna2019gradual}.
In a sense, occurrence typing is a
discipline designed to push forward the frontiers beyond which gradual
typing is needed, thus reducing the amount of runtime checks needed. For
instance, the example at the beginning can be
instance, the the JavaScript code of~\eqref{foo} and~\eqref{foo2} in the introduction can be
typed by using gradual typing:
\begin{alltt}\color{darkblue}
function foo(x\textcolor{darkred}{ : \dyn}) \{
(typeof(x) === "number")? x++ : x.trim()
function foo(x\textcolor{darkred}{ : \pmb{\dyn}}) \{
(typeof(x) === "number")? x++ : x.trim() \refstepcounter{equation} \mbox{\color{black}\rm(\theequation)}\label{foo3}
\}
\end{alltt}
Using ``standard'' gradual typing this is compiled into:
``Standard'' gradual typing inserts two dynamic checks since it compiles the code above into:
\begin{alltt}\color{darkblue}
function foo(x) \{
(typeof(x) === "number")? (\Cast{number}{x})++ : (\Cast{string}{x}).trim()
(typeof(x) === "number")? (\textcolor{darkred}{\Cast{number}{{\color{darkblue}x}}})++ : (\textcolor{darkred}{\Cast{string}{{\color{darkblue}x}}}).trim()
\}
\end{alltt}
where {\Cast{$t$}{$e$}} is a type-cast.\footnote{Intuitively, \code{\Cast{$t$}{$e$}} is
syntactic sugar for \code{(typeof($e$)==="$t$")? $e$ : (throw "Type
error")}. Not exactly though, since to implement compilation \emph{à la} sound gradual typing we need casts on function types.}
where {\Cast{$t$}{$e$}} is a type-cast that dynamically checks whether the value returned by $e$ has type $t$.\footnote{Intuitively, \code{\Cast{$t$}{$e$}} is
syntactic sugar for \code{(typeof($e$)==="$t$")\,?\,$e$\,:\,(throw "Type
error")}. Not exactly though, since to implement compilation \emph{à la} sound gradual typing is is necessary to use casts on function types that need special handling.}
%
We have already seen in the introduction that by using occurrence
typing combined with a union type instead of the gradual type \dyn{}
for parameter annotation, we can avoid the insertion of any cast, at the cost
of some additional type annotations.
But occurrence typing can be used also on the gradually typed code. If we use
occurrence typing to type the gradually-typed version of \code{foo}, this
allows the system to avoid inserting the first cast
We already saw that thanks to occurrence typing we can annotate the parameter \code{x} by \code{number|string} instead of \dyn{} and avoid the insertion of any cast.
But occurrence typing can be used also on the gradually typed code in order to statically detect the insertion of useless casts. Using
occurrence typing to type the gradually-typed version of \code{foo} in~\eqref{foo3}, allows the system to avoid inserting the first cast
\code{\Cast{number}{x}} since, thanks to occurrence typing, the
occurrence of \code{x} at issue is given type \code{number} (the
second cast is still necessary however). But removing this cast is far
second cast is still necessary however). But removing only this cast is far
from being satisfactory, since when this function is applied to an integer
there are some casts that still need to be inserted outside of the function.
The reason is that the compiled version of the function
......@@ -49,9 +44,9 @@ has type \code{\dyn$\to$number}, that is, it expects an argument of type
\dyn, and thus we have to apply a cast (either to the argument or
to the function) whenever this is not the case. In particular, the
application \code{foo(42)} will be compiled as
\code{foo(\Cast{\dyn}{42})}. Now the main problem with such a cast is not
\code{foo(\Cast{\dyn}{42})}. Now, the main problem with such a cast is not
that it produces some unnecessary overhead by performing useless
checks (a cast to \dyn can easily be detected and safely ignored at runtime).
checks (a cast to \dyn{} can easily be detected and safely ignored at runtime).
The main problem is that the combination of such a cast with type-cases
will lead to unintuitive results under the standard operational
semantics of type-cases and casts.
......@@ -62,7 +57,7 @@ subtype of $t$. In standard gradual semantics, \code{\Cast{\dyn}{42}} is a value
And this value is of type \code{\dyn}, which is not a subtype of \code{number}.
Therefore the check in \code{foo} would fail for \code{\Cast{\dyn}{42}}, and so
would the whole function call.
Although this behavior is sound, this is the opposite of
Although this behavior is type safe, this is the opposite of
what every programmer would expect: one would expect the test
\code{(typeof($e$)==="number")} to return true for \code{\Cast{\dyn}{42}}
and false for, say, \code{\Cast{\dyn}{true}}, whereas
......@@ -71,11 +66,11 @@ the standard semantics of type-cases would return false in both cases.
A solution is to modify the semantics of type-cases, and in particular of
\code{typeof}, to strip off all the casts in a value, even nested ones. This
however adds a new overhead at runtime. Another solution is to simply accept
this counter-intuitive result, which has the additional benefit of promoting
this counter-intuitive result, which has at least the benefit of promoting
the dynamic type to a first class type, instead of just considering it as a
directive to the front-end. Indeed, this approach allows to dynamically check
directive to the front-end. Indeed, this would allow to dynamically check
whether some argument has the dynamic type \code{\dyn} (i.e., whether it was
applied to a cast to such a type, simply by \code{(typeof($e$)==="\dyn")}.
applied to a cast to such a type) simply by \code{(typeof($e$)==="\dyn")}.
Whatever solution we choose it is clear that in both cases it would be much
better if the application \code{foo(42)} were compiled as is, thus getting
rid of a cast that at best is useless and at worse gives a counter-intuitive and
......@@ -83,22 +78,22 @@ unexpected semantics.
This is where the previous section about refining function types comes in handy.
To get rid of all superfluous casts, we have to fully exploit the information
provided to us by occurrence typing and deduce for the compiled function the type
provided to us by occurrence typing and deduce for the function in~\eqref{foo3} the type
\code{(number$\to$number)$\wedge$((\dyn\textbackslash
number)$\to$number)}, so that no cast is inserted when the
number)$\to$string)}, so that no cast is inserted when the
function is applied to a number.
To achieve this, we simply modify the typing rule for functions that we defined
in the previous section to accommodate for gradual typing. For every gradual type
$\tau$, we define $\tau^*$ as the type obtained from $\tau$ by replacing all
covariant occurrences of \dyn by \Any\ and all contravariant ones by \Empty. The
type $\tau^*$ can be seen as the \emph{maximal} interpretation of $\tau$, that is,
any expression that can safely be cast to $\tau$ is of type $\tau^*$. In
in the previous section to accommodate for gradual typing. Let $\sigma$ and $\tau$ range over \emph{gradual types}, that is the types produced by the grammar in Definition~\ref{def:types} to which we add \dyn{} as basic type (see~\citet{castagna2019gradual} for the definition of the subtyping relation on these types). For every gradual type
$\tau$, define $\tauUp$ as the (non graudal) type obtained from $\tau$ by replacing all
covariant occurrences of \dyn{} by \Any{} and all contravariant ones by \Empty. The
type $\tauUp$ can be seen as the \emph{maximal} interpretation of $\tau$, that is,
every expression that can safely be cast to $\tau$ is of type $\tauUp$. In
other words, if a function expects an argument of type $\tau$ but can be
typed under the hypothesis that the argument has type $\tau^*$, then no casts
are needed, since every cast that succeeds will always be to a subtype of
$\tau^*$. Taking advantage of this property, we modify the typing rule for
functions as follows:
typed under the hypothesis that the argument has type $\tauUp$, then no casts
are needed, since every cast that succeeds will be a subtype of
$\tauUp$. Taking advantage of this property, we modify the rule for
functions as:
%
%\begin{mathpar}
% \Infer[Abs]
% {\Gamma,x:\tau\vdash e\triangleright\psi\and \forall i\in I\quad \Gamma,x:\sigma_i\vdash e:\tau_i
......@@ -109,44 +104,38 @@ functions as follows:
% }
% {\psi(x)=\{\sigma_i\alt i\in I\}}
%\end{mathpar}
\[
\textsc{[AbsInf+]}
\frac
{
\begin{aligned}
\Gamma,x:&\sigma\vdash e\triangleright\psi \qquad \qquad \Gamma,x:\sigma\vdash e:\tau\\
T = \{ (\sigma, \tau) \}
&\cup \{ (\sigma,\tau) ~|~ \sigma \in \psi(x) \land \Gamma, x: \sigma \vdash e: \tau \}\\
&\cup \{ (\sigma^*,\tau) ~|~ \sigma \in \psi(x) \land \Gamma, x: \sigma^* \vdash e: \tau \}
\end{aligned}
\begin{array}{c}
\hspace*{-8mm}T = \{ (\sigma', \tau') \} \cup \{ (\sigma,\tau) ~|~ \sigma \in \psi(x) \land \Gamma, x: \sigma \vdash e: \tau \} \cup \{ (\sigmaUp,\tau) ~|~ \sigma \in \psi(x) \land \Gamma, x: \sigmaUp \vdash e: \tau \}\\
\Gamma,x:\sigma'\vdash e\triangleright\psi \qquad \qquad \qquad\Gamma,x:\sigma'\vdash e:\tau'
\end{array}
}
{
\Gamma\vdash\lambda x:\sigma.e:\textstyle\bigwedge_{(\sigma,\tau) \in T}\sigma\to \tau
\Gamma\vdash\lambda x:\sigma'.e:\textstyle\bigwedge_{(\sigma,\tau) \in T}\sigma\to \tau
}
\]
The main idea behind this rule is the same as before: we first collect all the
information we can into $\psi$ by analyzing the body of the function. We then
retype the function using the new hypothesis $x : \sigma$ for every
$\sigma \in \psi(x)$. However, we also retype the function using the hypothesis
$x : \sigma^*$, following the explanation we gave in the previous paragraph.
This allows us to eliminate unnecessary gradual types.
Going back to our previous example with function \code{foo}, we first deduce
$\sigma \in \psi(x)$. Furthermore, we also retype the function using the hypothesis
$x : \sigmaUp$: as explained before the rule, whenever this typing suceeds it eliminates unnecessary gradual types and, thus, unecessary casts.
Let us see how this works on the function \code{foo} in \eqref{foo3}. First, we deduce
the refined hypothesis
$\psi(x) = \{\code{number}\land\dyn, \dyn \textbackslash \code{number}\}$.
$\psi(\code x) = \{\,\code{number}{\land}\dyn\;,\;\dyn \textbackslash \code{number}\,\}$.
Typing the function using this new hypothesis but without considering the
maximal interpretation would yield
$(\dyn \to \code{number}) \land ((\code{number} \land \dyn) \to \code{number})
\land ((\dyn \textbackslash \code{number}) \to \code{number})$. However, as
$(\dyn \to \code{number}\vee\code{string}) \land ((\code{number} \land \dyn) \to \code{number})
\land ((\dyn \textbackslash \code{number}) \to \code{string})$. However, as
we stated before, this would introduce an unnecessary cast if the function
were to be applied to an integer. Hence the need for the second part of
Rule~\textsc{[AbsInf+]}: the maximal interpretation of $\code{number} \land \dyn$
is $\code{number}$, and it is clear that, if $x$ is given type \code{number},
is $\code{number}$, and it is clear that, if $\code x$ is given type \code{number},
the function type-checks, thanks to occurrence typing. Thus, after some
simplifications, we can actually deduce the desired type
$(\code{number} \to \code{number}) \land ((\dyn \textbackslash \code{number}) \to \code{number})$.
routine simplifications, we can actually deduce the desired type
$(\code{number} \to \code{number}) \land ((\dyn \textbackslash \code{number}) \to \code{string})$.
......
......@@ -121,13 +121,12 @@ Finally, I took advantage of this opportunity to describe some undocumented adva
SLIDES = polydeuces-slides.pdf,
}
@article{popl19,
@article{castagna2019gradual,
AUTHOR = {Giuseppe Castagna and Victor Lanvin and Tommaso Petrucciani and Jeremy G.~Siek},
TITLE = {Gradual Typing: a New Perspective},
JOURNAL = {Proc. ACM Program. Lang.},
JOURNAL = {Proc.\ ACM Program.\ Lang.},
VOLUME = {3, Article 16},
NUMBER = {POPL\,'19 46nd ACM Symposium on Principles of Programming Languages},
MONTH = jan,
YEAR = {2019},
}
......@@ -152,13 +151,4 @@ Finally, I took advantage of this opportunity to describe some undocumented adva
organization={ACM}
}
@article{castagna2019gradual,
title={Gradual typing: a new perspective},
author={Castagna, Giuseppe and Lanvin, Victor and Petrucciani, Tommaso and Siek, Jeremy G},
journal={Proceedings of the ACM on Programming Languages},
volume={3},
number={POPL},
pages={16},
year={2019},
publisher={ACM}
}
......@@ -227,7 +227,7 @@
\section{Related work}
\input{related}
\section{Conclusion}
\section{Future work and conclusion}
\input{conclusion}
......
......@@ -176,13 +176,13 @@ function.
we could do the same (by collecting type scheme) in $\psi$ but we
then need to choose a candidate in the type scheme \ldots }
In a nutshell, what we have done is that in order to type a
In summary, in order to type a
function we use the type-cases on its parameter to partition the
domain of the function and we type-check the function on the single partitions rather
domain of the function and we type-check the function on each single partitions rather
than on the union thereof. Of course, we could use much a finer
partition: the finest (but impossible) one is to check the function
against the singleton types of all its inputs. But any finer partition
would not in general return much better information, since most
would not return, in general, much better information, since most
partitions would collapse on the same return type: type-cases on the
parameter are the tipping points that are likely make a difference, by returning different
types for different partitions thus yielding more precise typing.
......
......@@ -48,6 +48,8 @@
\newcommand {\Rule}[1] {[\textsc{#1}]}
\newcommand{\tcase}[4]{\ensuremath{(#1{\in}#2)\,\texttt{\textup{?}}\,#3\,\texttt{\textup{:}}\,#4}}
\newcommand{\tauUp}{\tau^\Uparrow}
\newcommand{\sigmaUp}{\sigma^\Uparrow}
\newcommand{\types}{\textbf{Types}}
\newcommand{\occ}[2]{#1{\downarrow}#2}
\newcommand{\basic}[1]{\text{\fontshape{\itdefault}\fontseries{\bfdefault}\selectfont b}_{#1}}
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment