Commit 039367e0 authored by Giuseppe Castagna

modified explanation for fixpoint

parent 68879542
since \code{++} returns an integer and \code{trim()} a string, then our
function \code{foo} has type
\code{(number|string)$\to$(number|string)}.\footnote{\label{null}Actually,
both Flow and TypeScript deduce as return type
\code{number|string|void}, since they track when operations may yield \code{void} results. Taking this into account would be easy but would clutter our presentation, which is why we omit
such details.} But it is clear that a more precise type would be
one that states that \code{foo} returns a number when it is applied to
a number and it returns a string when it is applied to a string, so that the type deduced for, say, \code{foo(42)} would be \code{number} rather than \code{number|string}. This is
exactly what the intersection type \code{(number$\to$number) \&
(string$\to$string)} states (intuitively, an expression has an intersection of types, denoted \code{\&}, if and only if it has all the types of the intersection) and corresponds in Flow to declaring \code{foo} as follows:
\begin{alltt}\color{darkblue}
typing approach, even though they are used only at meta-theoretic
level,\footnote{At the moment of writing there is a pending pull
request to add negation types to the syntax of TypeScript, but that is all.} in
particular to determine the type environment when the type case
fails. We already saw negation types at work when we informally typed the ``else''
branch in \code{foo}, for which we assumed that $x$ did \emph{not} have
type \code{number}---i.e., it had the (negation) type \code{$\neg$number}---and
deduced from it that $x$ then had type \code{string}---i.e.,
\code{(number|string)\&$\neg$number} which is equivalent to the set-theoretic difference
\code{(number|string)\textbackslash\,number} and, thus, to \code{string}.
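As a hedged illustration (our own sketch, not taken from the cited systems), this set-theoretic difference can be observed directly in TypeScript: the built-in \code{Exclude} utility computes it, and narrowing in the else branch realizes the intersection with the negation. The function \code{describe} below is hypothetical.

```typescript
// Sketch (our example): (number|string) \ number in TypeScript.
type Arg = number | string;
type Rest = Exclude<Arg, number>; // = string, i.e. (number|string)\number

function describe(x: Arg): string {
  if (typeof x === "number") {
    return `number ${x}`;   // here x : Arg & number = number
  }
  // here x : Arg & ¬number, i.e. (number|string)\number = string
  const rest: Rest = x;     // type-checks: the narrowed type is exactly Rest
  return rest.trim();
}
```

Here \code{Exclude<Arg, number>} plays the role of the difference \code{(number|string)\textbackslash\,number} discussed above.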
The approaches cited above essentially focus on refining the type
of variables that occur in an expression whose type is being tested. They
do it when the variable occurs at top-level in the test (i.e., the variable is the
expression being tested) or in some specific positions, such as in
nested pairs or at the end of a path of selectors. \beppe{Not precise,
please check which are the cases that are handled in the cited
papers}. In this work we aim at removing this limitation on the
contexts and at developing a general theory to refine the type of variables
that occur in tested expressions under generic contexts, such as variables occurring in the
left or the right expression of an application. In other words, we aim at establishing a formal framework to
We focus our study on conditionals that test types and consider the following s
\(
\ifty{x}{\Int}{x+1}{(\textsf{trim } x)}
\)).
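The conditional above can be rendered in TypeScript as follows; this is only an illustrative sketch of ours, where overload signatures play the role of the intersection type \code{(number$\to$number) \& (string$\to$string)} and the \code{typeof} test acts as the type case.

```typescript
// Sketch: overloads approximate the intersection (number→number)&(string→string).
function foo(x: number): number;
function foo(x: string): string;
function foo(x: number | string): number | string {
  // the typeof check plays the role of the type case (x ∈ Int) ? x+1 : trim x
  return typeof x === "number" ? x + 1 : x.trim();
}

const n: number = foo(42);     // the overload selects number → number
const s: string = foo("  a "); // the overload selects string → string
```

With the overloads, \code{foo(42)} is typed \code{number} rather than \code{number|string}, which is exactly the gain in precision discussed in the text.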
In particular, in this introduction we concentrate on applications, since they constitute the most difficult case and many other cases can be reduced to them. A typical example is the expression
\begin{equation}\label{typical}
\ifty{x_1x_2}{t}{e_1}{e_2}
\end{equation}
where $x_i$'s denote variables, $t$ is some type, and $e_i$'s are generic expressions.
Depending on the actual $t$ and the static types of $x_1$ and $x_2$, we
can make type assumptions for $x_1$, for $x_2$, \emph{and} for the application $x_1x_2$
when typing $e_1$ that are different from those we can make when typing
$e_2$. For instance, suppose $x_1$ is bound to the function \code{foo} defined in \eqref{foo2}. Thus $x_1$ has type $(\Int\to\Int)\wedge(\String\to\String)$ (we used the syntax of the types of Section~\ref{sec:language} where unions and intersections are denoted by $\vee$ and $\wedge$).
Then it is not hard to see that the expression\footnote{Most of the following expressions are given just for the sake of example. Determining the type of expressions other than variables is interesting for constructor expressions but less so for destructors such as applications, projections, and selections: no reasonable programmer would repeat the same application twice; they would store its result in a variable. This becomes meaningful when we introduce constructors such as pairs, as we do for instance in the expression in~\eqref{pair}.}
%
\begin{equation}\label{mezzo}
$x_1x_2$ is of type $\Bool$.
Let us recap. If $e$ is an expression of type $t_0$ and we are trying to type
%
\(\ifty{e}{t}{e_1}{e_2}\),
%
then we can assume that $e$ has type $t_0\wedge t$ when typing $e_1$
and type $t_0\setminus t$ when typing $e_2$. If furthermore $e$ is of
third, we also know the type of the application, that is the type
that denotes all the values that may result from the application of a
function in $t_1$ to an argument in $t_2$, a type that we denote by
$t_1\circ t_2$. For instance, if $t_1=\Int\to\Bool$ and $t_2=\Int$,
then $\dom{t_1}=\Int$ and $t_1\circ t_2 = \Bool$. Notice that introducing operations such as $\dom{}$ and $\circ$ is redundant when working with simple types, but becomes necessary in the presence of set-theoretic types. If, for instance, $t_1$ is the type of \eqref{foo2}, that is, \(t_1=(\Int{\to}\Int)\) \(\wedge\) \((\String{\to}\String)\),
%---which denotes functions that return an integer if applied to an integer and a string if applied to a string---
then
\( \dom{t_1} = \Int \vee \String \),
that is, the union of all the possible
input types, while the precise return type of such a
that $e_1e_2$ has type $t$ succeeds or fails. Let us start with refining the typ
which the test succeeds. Intuitively, we want to remove from $t_2$ all
the values for which the application will surely return a result not
in $t$, thus making the test fail. Consider $t_1$ and let $s$ be the
largest subtype of $\dom{t_1}$ such that\vspace{-1.29mm}
\begin{equation}\label{eq1}
t_1\circ s\leq \neg t
\end{equation}
$t_2$ all the values in $s$ or, equivalently, keep in $t_2$ all the
values of $\dom{t_1}$ that are not in $s$. Let us implement the second
viewpoint: the set of all elements of $\dom{t_1}$ for which an
application \emph{does not} surely give a result in $\neg t$ is
denoted $\worra{t_1}t$ (read, ``$t_1$ worra $t$'') and defined as $\min\{u\leq \dom{t_1}\alt
t_1\circ(\dom{t_1}\setminus u)\leq\neg t\}$: it is easy to see that
according to this definition $\dom{t_1}\setminus(\worra{t_1} t)$ is
the largest subset of $\dom{t_1}$ satisfying \eqref{eq1}. Then we can
$\worra{t_1} t$, the set of arguments that when applied to a function
of type $t_1$ \emph{may} return a result in $t$; then we can refine the
type of $e_2$ as $t_2^+ \eqdeftiny t_2\wedge(\worra{t_1} t)$ in the ``then'' branch (we call it the \emph{positive} branch)
and as $t_2^- \eqdeftiny t_2\setminus(\worra{t_1} t)$ in the ``else'' branch (we call it the \emph{negative} branch).
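To make the definition of $\worra{t_1}t$ concrete, here is a small executable model. It is entirely ours, with strong simplifying assumptions: types are finite sets of atomic tags, and the arrows in the intersection $t_1$ have pairwise disjoint atomic domains, so that $t_1\circ s$ is simply the union of the codomains of the arrows whose domain meets $s$. Under these assumptions, $\worra{t_1}t$ reduces to the union of the domains of the arrows whose codomain intersects $t$.

```typescript
// Toy model (ours): a type is a set of atoms; t1 is an intersection of arrows.
type Atom = "int" | "bool" | "string";
type Ty = Set<Atom>;
interface Arrow { dom: Ty; cod: Ty }

const union = (a: Ty, b: Ty): Ty => new Set([...a, ...b]);
const inter = (a: Ty, b: Ty): Ty => new Set([...a].filter(x => b.has(x)));

// dom(t1): the union of all the arrows' domains
const domOf = (t1: Arrow[]): Ty =>
  t1.reduce((acc, a) => union(acc, a.dom), new Set<Atom>());

// worra(t1, t): arguments for which the application may yield a result in t,
// i.e. the domains of the arrows whose codomain intersects t
const worra = (t1: Arrow[], t: Ty): Ty =>
  t1.filter(a => inter(a.cod, t).size > 0)
    .reduce((acc, a) => union(acc, a.dom), new Set<Atom>());

// t1 = (Bool→Bool) ∧ (Int→(String∨Int)), t = String
const t1: Arrow[] = [
  { dom: new Set(["bool"]), cod: new Set(["bool"]) },
  { dom: new Set(["int"]),  cod: new Set(["string", "int"]) },
];
console.log([...domOf(t1)]);                      // ["bool", "int"]
console.log([...worra(t1, new Set(["string"]))]); // ["int"]
```

In this instance only the second arrow can produce a \String{} result, so the model computes $\worra{t_1}{\String} = \Int$, matching the discussion of this example in the text.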
As a side remark note\marginpar{\tiny\beppe{Remove if space is needed}}
that the set $\worra{t_1} t$ is different from the set of elements that return a
result in $t$ (though it is a supertype of it). To see that, consider
for $t$ the type \String{} and for $t_1$ the type $(\Bool \to\Bool)\wedge(\Int\to(\String\vee\Int))$,
$t_1\!\circ t_2^+$ (which is a subtype of $t_1\!\circ
t_2$). Therefore, we can exclude from $t_1$ all the functions that, when applied to an argument
in $t_2^+$, either diverge or yield a result not in $t_1\!\circ t_2^+$.
Both of these things can be obtained simply by removing from $t_1$ the functions in
$(t_2^+\to \neg (t_1\!\circ t_2^+))$, that is, we refine the type of $e_1$ in the ``then'' branch as
$t_1^+=t_1\setminus (t_2^+\to
\neg(t_1\!\circ t_2^+))$. That this removes the functions that, applied to $t_2^+$ arguments, yield results not in $t_1\!\circ t_2^+$ should be pretty obvious. That this also removes functions diverging on $t_2^+$ arguments is subtler and deserves some
explanation. In particular, the interpretation of a type $t\to s$ is the set
that diverge on all \String{} arguments.
This is essentially what we formalize in Section~\ref{sec:language}, in the type system by the rule \Rule{PAppL} and in the typing algorithm with the case \eqref{due} of the definition of the function \constrf.
\subsection{Technical challenges}\label{sec:challenges}
In the previous section we outlined the main ideas of our approach to occurrence typing. However, the devil is in the details. So the formalization we give in Section~\ref{sec:language} is not as smooth as just outlined: we must introduce several auxiliary definitions to handle some corner cases. This section presents, via small examples, the main technical difficulties we had to overcome and the definitions we introduced to handle them. As such, it provides a kind of road map for the technicalities of Section~\ref{sec:language}.
\paragraph{Typing occurrences} As should be clear by now, not only variables but also generic expressions are given different types in the ``then'' and ``else'' branches of type tests. For instance, in \eqref{two} the expression $x_1x_2$ has type \Int{} in the positive branch and type \Bool{} in the negative one. In this specific case it is possible to deduce these typings from the refined types of the variables (in particular, thanks to the fact that $x_2$ has type \Int{} in the positive branch and \Bool{} in the negative one), but this is not possible in general. For instance, consider $x_1:\Int\to(\Int\vee\Bool)$, $x_2:\Int$, and the expression
\begin{equation}\label{twobis}
\ifty{x_1x_2}{\Int}{...x_1x_2...}{...x_1x_2...}
\end{equation}
$\ifty{e(42)}{\Bool}{e(42)}{\textsf{true}}$ reduces in one step to
$\ifty{\textsf{false}}{\Bool}{\textsf{false}}{\textsf{true}}$): see Appendix~\ref{app:parallel}.
\paragraph{Checks interdependence} The last class of technical problems arises
from the mutual dependence of different type checks. In particular, there are two cases
that pose a problem. The first can be shown by considering two functions $f$ and $g$, both of type $(\Int\to\Int)\wedge(\Any\to\Bool)$, and the test
\begin{equation}\label{nest1}
\ifty{(fx,gx)}{\pair\Int\Bool}{\,...\,}{\,...}
\end{equation}
if we independently check $fx$ against $\Int$ and $gx$ against $\Bool$, we deduce $\Int$ for the first occurrence of $x$ and $\Any$ for the second. Thus we would type the positive branch under the hypothesis that $x$ is of type $\Int$. But if we use the hypothesis generated by the test of $fx$, namely that $x$ is of type \Int, to check $gx$, then it is easy to see that the positive branch of \eqref{nest1} can never be selected. In other words, we want to produce type environments for occurrence typing by taking into account all the hypotheses available, even when these hypotheses are formulated later in the flow of control. This will be done in the type system of
Section~\ref{sec:language} by the rule \Rule{Path} and will require, at the
algorithmic level, computing a fixpoint of a function, or
an approximation thereof.
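The need for a fixpoint can be seen on the example above. The following is a hedged sketch, entirely ours and hard-coded for this one example (it is not the algorithm of Section~\ref{sec:language}): a refinement step that uses all currently available hypotheses is iterated until the environment stabilizes. Note that the type deduced for $gx$ only becomes empty at the second iteration, once the first iteration has refined $x$ to \Int.

```typescript
// Kleene iteration of a (hard-coded) refinement step until a fixpoint.
type Ty = Set<string>;
type Env = Map<string, Ty>;

const inter = (a: Ty, b: Ty): Ty => new Set([...a].filter(x => b.has(x)));
const eqTy = (a: Ty, b: Ty) => a.size === b.size && [...a].every(x => b.has(x));
const eqEnv = (a: Env, b: Env) =>
  [...a].every(([k, t]) => eqTy(t, b.get(k) ?? new Set()));

// One step for the test (f x, g x) ∈ Int × Bool, with
// f, g : (Int→Int) ∧ (Any→Bool); the hypotheses are hard-coded:
function step(env: Env): Env {
  const out = new Map(env);
  const x = env.get("x")!;
  out.set("x", inter(x, new Set(["int"])));        // from (f x) ∈ Int
  // once x : Int, the application g x has type Int ∧ Bool = ∅
  const gxBound: Ty =
    x.size === 1 && x.has("int") ? new Set<string>() : new Set(["bool"]);
  out.set("gx", inter(env.get("gx")!, gxBound));
  return out;
}

function fixpoint(env: Env): Env {
  let cur = env;
  for (;;) {
    const next = step(cur);
    if (eqEnv(next, cur)) return cur;
    cur = next;
  }
}

const env0: Env = new Map([
  ["x",  new Set(["int", "bool", "string"])],
  ["gx", new Set(["int", "bool"])],
]);
const res = fixpoint(env0);
console.log([...res.get("x")!], [...res.get("gx")!]); // ["int"] [] : branch unreachable
```

Two iterations are needed: the first refines $x$; only then does the second discover that $gx$ is empty, i.e., that the positive branch is unreachable.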
Finally, a nested check may help refine
the type assumptions on some outer expressions. For instance, when typing
the positive branch $e$ of
\begin{equation}\label{pair}
\ifty{(x,y)}{(\pair{(\Int\vee\Bool)}\Int)}{e}{...}