Commit afe52f23 authored by Giuseppe Castagna

More on Section 2 ... explanation of the iteration

parent b543d3ae
In this section we formalize the ideas we outlined in the introduction. We start by the definition of types followed by the language and its reduction semantics. The static semantics is the core of our work: we first present a declarative type system that deduces (possibly many) types for well-typed expressions and then the algorithms to decide whether according to it an expression is well typed or not.
\subsection{Types}
algorithm that, given a type environment $\Gamma$ and an expression $e$,
decides whether there exists a type $t$ such that $\Gamma\vdash e:t$
is provable. For that we need to solve essentially two problems:
$(i)$~how to handle the fact that it is possible to deduce several
types for the same well-typed expression and $(ii)$~how to compute the
auxiliary deduction system for paths.
Multiple types have two distinct origins, each requiring a distinct
it is possible to deduce for every well-typed lambda abstraction
infinitely many types, that is, the annotation of the function
intersected with as many negations of arrow types as possible
without making the type empty. To handle this multiplicity we use and
adapt the technique of \emph{type schemes} defined
by~\citet{Frisch2008}. Type schemes---whose definition we recall in Section~\ref{sec:type-schemes}---are canonical representations of
the infinite sets of types of $\lambda$-abstractions. The second origin is due
to the presence of structural rules\footnote{\label{fo:rules}In logic, logical rules
refer to a particular connective (here, a type constructor, that is,
either $\to$, or $\times$, or $b$), while identity rules (e.g.,
axioms and cuts) and structural rules (e.g., weakening and
contraction) do not.} such as \Rule{Subs} and \Rule{Inter}. We
handle it in the classic way: we define an algorithmic system that
tracks the minimum type---actually, the minimum \emph{type scheme}---of an
expression; this system is obtained from the original system by eliminating
the two structural rules and by distributing suitable checks of the subtyping
relation in the remaining rules. To do that in the presence of
system given in Section~\ref{sec:algorules}.
\subsubsection{Type schemes}\label{sec:type-schemes}
We introduce the new syntactic category of \emph{type schemes}, which are the terms~$\ts$ inductively produced by the following grammar.
\[
\begin{array}{lrcl}
\textbf{Type schemes} & \ts & ::= & t \alt \tsfun {\arrow t t ; \cdots ; \arrow t t} \alt \ts \tstimes \ts \alt \ts \tsor \ts \alt \tsempty
\end{align*}
\end{definition}
Note that $\tsint \ts$ is closed under subsumption and intersection
and that $\tsempty$, which denotes the empty set of types, is
different from $\Empty$, whose interpretation is the set of all types.
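To make this last distinction concrete, here is a minimal Python sketch (entirely our own toy model, not part of the formalization: ``types'' are subsets of a two-element universe, subtyping is set inclusion, and the interpretation of a type seen as a scheme is its set of supertypes):

```python
# Toy model (ours): a "type" is a frozenset of values from a tiny
# universe, subtyping is inclusion, and interpreting a type t as a
# scheme yields all its supertypes {s | t <= s}.
from itertools import combinations

UNIVERSE = frozenset({'a', 'b'})

def all_types():
    """Every subset of the universe, i.e. every type of the model."""
    elems = sorted(UNIVERSE)
    return [frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]

def interp(t):
    """Interpretation of t: all supertypes of t.  By construction it
    is closed under subsumption and under intersection."""
    return {s for s in all_types() if t <= s}

EMPTY_TYPE = frozenset()   # the type 0: denotes no value
TS_EMPTY = set()           # the scheme with empty interpretation

# 0 seen as a scheme denotes *all* types, while tsempty denotes none:
assert interp(EMPTY_TYPE) == set(all_types())
assert TS_EMPTY == set()
```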
Let $\ts$ be a type scheme and $t$ a type. We can compute a type scheme, written $t \tsand \ts$, such that
\(\tsint {t \tsand \ts} = \{s \alt \exists t' \in \tsint \ts.\ t \land t' \leq s \}\).
\end{lemma}
\beppe{Do we need the next definition or can it go in the appendix?\\
Finally, given a type scheme it is straightforward to choose in its interpretation a type which serves as the canonical representative of the set:
\begin{definition}[Representative]
We define a function $\tsrep {\_}$ that maps every non-empty type scheme into a type, \textit{representative} of the set of types denoted by the scheme.
\end{definition}
Note that $\tsrep \ts \in \tsint \ts$.
}
{}
\subsubsection{Operators for type constructors}\label{sec:typeops}
{}\beppe{The explanation that follows is redundant with Section~\ref{sec:ideas}. Harmonize!}
In order to define the algorithmic typing of expressions like
applications and projections we need to define the operators on
\begin{eqnarray}
\dom t & = & \max \{ u \alt t\leq u\to \Any\}
\\
\apply t s & = &\,\min \{ u \alt t\leq s\to u\}
\\
\worra t s & = &\,\min\{u \alt t\circ(\dom t\setminus u)\leq \neg s\}\label{worra}
\end{eqnarray}
The first two operators belong to the theory of semantic subtyping, while the last one is new: we described it in Section~\ref{sec:ideas}.
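To make the behavior of these three operators concrete, here is a brute-force Python model (entirely our own illustration: it replaces semantic subtyping with plain set inclusion over a finite universe, so it sketches the operators' specification, not the paper's algorithms):

```python
# Brute-force model (ours) of dom, apply, and worra for an
# intersection of arrows over a finite universe.  A function type is a
# list of (domain, codomain) arrows, each component a frozenset.
INT  = frozenset({'i'})       # stand-in for Int
BOOL = frozenset({'t', 'f'})  # stand-in for Bool

def dom(arrows):
    """Largest type of accepted arguments: union of the arrow domains."""
    return frozenset().union(*(d for d, _ in arrows))

def apply_to(arrows, arg):
    """Best result type: each value v of arg lands in the intersection
    of the codomains of every arrow whose domain contains v."""
    out = frozenset()
    for v in arg:
        codomains = [c for d, c in arrows if v in d]
        if codomains:
            out |= frozenset.intersection(*codomains)
    return out

def worra(arrows, s):
    """Smallest u such that applying the function outside u cannot
    yield s: keep exactly the inputs whose image may intersect s."""
    return frozenset(v for v in dom(arrows)
                     if apply_to(arrows, frozenset({v})) & s)

t = [(INT, INT), (BOOL, BOOL)]   # (Int -> Int) /\ (Bool -> Bool)
assert worra(t, INT) == INT      # only an Int argument can return an Int
```

In this model, just as in the example of Section~\ref{sec:ideas}, testing whether the result of an application is an integer refines the argument of an $(\Int \to \Int) \wedge (\Bool \to \Bool)$ function to $\Int$.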
\bpl t & = & \min \{ u \alt t\leq \pair u\Any\}\\
\bpr t & = & \min \{ u \alt t\leq \pair \Any u\}
\end{eqnarray}
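In the same spirit, the projection operators can be sketched in the toy finite model where a product type is just a set of pairs (again our own illustration, not the paper's implementation):

```python
# Toy model (ours) of the projection operators: a product type is a
# set of pairs, and each projection keeps the smallest type covering
# one component.
PAIRS = frozenset({(1, 'a'), (2, 'b')})

def proj1(t):
    """Smallest u with t <= u x Any: the set of first components."""
    return frozenset(a for a, _ in t)

def proj2(t):
    """Smallest u with t <= Any x u: the set of second components."""
    return frozenset(b for _, b in t)

assert proj1(PAIRS) == frozenset({1, 2})
assert proj2(PAIRS) == frozenset({'a', 'b'})
```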
All the operators above but $\worra{}{}$ are already present in the
theory of semantic subtyping: the reader can find how to compute them
and how to extend their definition to type schemes in~\cite[Section
6.11]{Frisch2008}. Below we just show the formula that computes
$\worra t s$ for a $t$ subtype of $\Empty\to\Any$. For that, we use a
result of semantic subtyping that states that every type $t$ is
equivalent to a type in disjunctive normal form and that if
furthermore $t\leq\Empty\to\Any$, then $t \simeq \bigvee_{i\in
I}\left(\bigwedge_{p\in P_i}(s_p\to t_p)\bigwedge_{n\in
N_i}\neg(s_n'\to t_n')\right)$ with $\bigwedge_{p\in P_i}(s_p\to
t_p)\bigwedge_{n\in N_i}\neg(s_n'\to t_n') \not\simeq \Empty$ for all
$i$ in $I$. For such a $t$ and any type $s$, we have:
%
\begin{equation}
\worra t s = \dom t \wedge\bigvee_{i\in I}\left(\bigwedge_{\{P\subset P_i\alt s\leq \bigvee_{p \in P} \neg t_p\}}\left(\bigvee_{p \in P} \neg s_p\right) \right)
The proof that this type satisfies \eqref{worra} is given in the Appendix~\ref{}
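As a sanity check, we can instantiate the formula by hand on the example above (a worked example of ours): take $t=(\Int\to\Int)\wedge(\Bool\to\Bool)$ and $s=\Int$, so that $I$ is a singleton with $s_1\to t_1=\Int\to\Int$ and $s_2\to t_2=\Bool\to\Bool$. The only sets $P$ satisfying $\Int\leq\bigvee_{p\in P}\neg t_p$ are those containing the second arrow, and the corresponding intersection of the unions $\bigvee_{p \in P} \neg s_p$ simplifies to $\neg\Bool$. Therefore
\[
\worra t \Int \simeq (\Int\vee\Bool)\wedge\neg\Bool \simeq \Int
\]
that is, only $\Int$ arguments may produce an $\Int$ result.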
The last step for our presentation is to define the algorithm for the
deduction of $\Gamma \evdash e t \Gamma'$, that is, an algorithm that
takes as input $\Gamma$, $e$, and $t$, and returns an environment that
extends $\Gamma$ with hypotheses on the occurrences of $e$ that are
the most general that can be deduced by assuming that $e\in t$ succeeds.
to the application of \Rule{PEps}; case \eqref{due} implements \Rule{PAppL}
straightforwardly; the implementation of rule \Rule{PAppR} is subtler:
instead of finding the best $t_1$ to subtract (by intersection) from the
static type of the argument, \eqref{tre} finds directly the best type for the argument by
applying the $\worra{}{}$ operator to the static types of the function
and the refined type of the application. The remaining cases
(\ref{quattro}--\ref{sette}) are the straightforward implementations of the rules
\Rule{PPairL}, \Rule{PPairR}, \Rule{PFst}, and \Rule{PSnd},
respectively.
The other recursive function, $\env{}{}$, implements the two structural
rules \Rule{PInter} and \Rule{PTypeof} by intersecting the type
obtained for $\varpi$ by the logical rules with the static type
deduced by the type system for the expression occurring at $\varpi$. The
remaining structural rule, \Rule{Psubs}, is accounted for by the use
of the operators $\worra{}{}$ and $\boldsymbol{\pi}_i$ in
the definition of $\constr{}{}$.
$\tyof{\occ e{\varpi}}\Gamma$, and this is defined for all $\varpi$ since the first premise of \Rule{Case\Aa} states that
$\Gamma\vdash e:\ts_0$ (and this is possible only if we were able to deduce, under the hypothesis $\Gamma$, the type of every occurrence of $e$).}
\beppe{TODO
It remains to explain how to compute the environment $\Gamma'$ produced from $\Gamma$ by the deduction system for $\Gamma \evdash e t \Gamma'$. Alas, this is the most delicate part of our algorithm.
In a nutshell, what we want to do is to define a function
$\Refine{\_,\_}{\_}$ that takes a type environment $\Gamma$, an
expression $e$ and a type $t$ and returns the best type environment
$\Gamma'$ such that $\Gamma \evdash e t \Gamma'$ holds. By the best
environment we mean the one in which the occurrences of $e$ are
associated to the largest possible types (type environments are
hypotheses so they are contravariant: the larger the type the better
the hypothesis). Recall that in Section~\ref{sec:challenges} we said
that we want our analysis to be able to capture all the information
available from nested checks. If we gave up this kind of precision,
then the definition of $\Refine{}{}$ would be pretty easy: it would map
each subexpression of $e$ to the intersection of the types deduced by
$\vdashp$ (\emph{ie}, by $\env{}{}$) for each of its occurrences. That
is, for each expression $e'$ occurring in $e$, $\Refine {e,t}\Gamma$
is a type environment that maps $e'$ into $\bigwedge_{\{\varpi \alt
\occ e \varpi \equiv e'\}} \env {\Gamma,e,t} (\varpi)$. As we
explained in Section~\ref{sec:challenges}, the intersection is needed
to apply occurrence typing to expressions such as
$\tcase{(x,x)}{\pair{t_1}{t_2}}{e_1}{e_2}$ where some
expressions---here $x$---occur multiple times.
In order to capture most of the type information from nested queries,
the rule \Rule{Path} allows the deduction of the type of some
occurrence $\varpi$ to use a type environment $\Gamma'$ that may
contain information about some suboccurrences of $\varpi$. In the
algorithm this corresponds to applying the $\Refine{}{}$ defined
above to a result of $\Refine{}{}$, and so on. Therefore, ideally our
algorithm should compute the type environment as a fixpoint of the
function $X\mapsto\Refine{e,t}{X}$. Unfortunately, an iteration of $\Refine{}{}$ may
not converge. As an example consider the (dumb) expression $\tcase
{x_1x_2}{\Any}{e_1}{e_2}$: if $x_1:\Any\to\Any$, then every iteration of
$\Refine{}{}$ yields for $x_1$ a type strictly more precise than the
previous iteration.
The solution we adopt here is to bound the number of iterations to some number $n_o$.
From a formal point of view, this means giving up the completeness of the algorithm: $\Refine{}{}$ will be complete (\emph{ie}, it will find all solutions) for the deductions of $\Gamma \evdash e t \Gamma'$ of depth at most $n_o$. This is obtained by the following definition:
\begin{align*}
&\RefineStep {e,t} (\Gamma) = \Gamma' \text{ with:}\\
&\dom{\Gamma'}=\dom{\Gamma} \cup \{e' \alt \exists \varpi.\ \occ e \varpi \equiv e'\}\\
\env {\Gamma,e,t} (\varpi) & \text{if } \exists \varpi.\ \occ e \varpi \equiv e' \\
\Gamma(e') & \text{otherwise}
\end{array}\right.\\&\\
&\Refine {e,t} \Gamma={\RefineStep {e,t}}^{n_o}(\Gamma)\qquad\text{with $n_o$ a global parameter}
\end{align*}
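The bounded iteration above can be sketched as follows (a schematic Python rendering of ours: the environment encoding, the \texttt{step} stand-in for $\RefineStep{e,t}$, and the toy constraint are all our own assumptions, not the paper's definitions):

```python
# Schematic rendering (ours) of Refine as a bounded iteration of
# RefineStep: `gamma` maps expression names to frozenset "types", and
# `step` is a stand-in for the real RefineStep(e, t).
def refine(step, gamma, n_o):
    """Apply `step` at most n_o times, stopping early at a fixpoint."""
    for _ in range(n_o):
        new_gamma = step(gamma)
        if new_gamma == gamma:   # fixpoint reached: iterating further
            break                # cannot add information
        gamma = new_gamma
    return gamma

# Toy constraint: testing (x, x) against s1 x s2 constrains both
# occurrences of x, so the step intersects the two requirements.
S1, S2 = frozenset({1, 2, 3}), frozenset({2, 3, 4})

def step(gamma):
    return {**gamma, 'x': gamma['x'] & S1 & S2}

gamma = refine(step, {'x': frozenset({1, 2, 3, 4, 5})}, n_o=5)
assert gamma['x'] == S1 & S2   # x : s1 /\ s2
```

In this toy run the fixpoint is reached after one step, so the bound $n_o$ never bites; the bound only matters for the pathological diverging refinements discussed above.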
While this is unsatisfactory from a formal point of view, in practice
the problem is a very mild one. Divergence may happen only when
refining the type of a function in an application: not only is such a
refinement meaningful only when the function is typed by a union
type (which is quite rare in practice)\footnote{The only impact of adding a negated arrow type to the type of a function is when we test whether the function has a given arrow type: in practice this never happens, since programming languages test whether a value \emph{is} a function, rather than the type of a given function.}, but also we had to build the
expression that causes the divergence in quite an ad hoc way, which
makes divergence even more unlikely: setting $n_o$ to twice the depth
of the syntax tree of the outermost type case should capture all
realistic cases. \beppe{Can we give an estimate based on benchmarking
the prototype?}
The reason for the definition of $\RefineStep {e,t}$ is that the same subterm of $e$
may occur several times in $e$ and therefore we can collect for every
occurrence a different type constraint. Since all the constraints must
hold, we take their intersection. For instance, if $f$ is a function
of type $(s_1\times s_2\to t)\wedge( s_3\to\neg t)$ and we test
whether $f(x,x)$ is of type $t$ or not, then the test succeeds only if $(x,x)$ is of
type $s_1\times s_2$, that is, if $x$ has both type $s_1$ and type
$s_2$, and thus their intersection $s_1{\wedge}s_2$.
}
\subsection{Algorithmic typing rules}\label{sec:algorules}