practical.tex 10.3 KB
Newer Older
Kim Nguyễn's avatar
Kim Nguyễn committed
1
We have implemented a simplified version of the algorithm presented in
Giuseppe Castagna's avatar
Giuseppe Castagna committed
2
3
4
Section~\ref{ssec:algorithm} that does
not make use of type schemes and is, therefore, incomplete w.r.t. the
system of Section~\ref{sec:static}. In particular, as we
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
5
explained in Section~\ref{ssec:algorithm}, \beppe{did we?} in the absence of type
Giuseppe Castagna's avatar
Giuseppe Castagna committed
6
schemes it is not always possible to prove that $\forall v, \forall t,
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
7
v \in t \text{~or~} v \not\in \lnot t$. Since this property ceases
8
to hold only for $\lambda$-abstractions, then not using type schemes
Giuseppe Castagna's avatar
Giuseppe Castagna committed
9
yields less precise typing only for tests $\ifty{e}t{e_1}{e_2}$ where $e$
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
10
11
has a functional type, that is, the value tested will be a $\lambda$-abstraction.
This seems like a reasonable compromise between the
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
12
complexity of an implementation involving type schemes and the programs
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
13
14
15
we want to type-check in practice. Indeed, if we restrict the language
so that the only functional type $t$ allowed in a test $\ifty{e}t{e_1}{e_2}$
is $\Empty{\to}\Any$---i.e., if we allow to check whether a value is a
Giuseppe Castagna's avatar
Giuseppe Castagna committed
16
function but not whether it has a specific function type (\emph{cf.}, Footnote~\ref{foo:typecase})---, then our
17
implementation becomes complete (see Corollary~\ref{app:completeness} in the appendix for a formal proof).
18

Kim Nguyễn's avatar
Kim Nguyễn committed
19
Our implementation is written in OCaml and uses CDuce as a library to
20
provide the semantic sub-typing machinery. Besides a type-checking
Kim Nguyễn's avatar
Kim Nguyễn committed
21
22
algorithm defined on the base language, our implementation supports
record types (Section \ref{ssec:struct}) and the refinement of function types
Giuseppe Castagna's avatar
Giuseppe Castagna committed
23
(Section \ref{sec:refining} with the rule of Appendix~\ref{app:optimize}). The implementation is rather crude and
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
24
consist of 2000 lines of OCaml code, including parsing, type-checking
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
25
of programs, and pretty printing of types. We demonstrate the output of
Kim Nguyễn's avatar
Kim Nguyễn committed
26
27
28
29
our type-checking implementation in Table~\ref{tab:implem}. These
examples and others can be tested in the online toplevel available at
\url{https://occtyping.github.io/} (the corresponding repository is
anonymized).
Kim Nguyễn's avatar
Kim Nguyễn committed
30
\input{code_table}
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
31
In this table, the second column gives a code fragment and the third
Kim Nguyễn's avatar
Kim Nguyễn committed
32
column the type deduced by our implementation. Code~1 is a
Giuseppe Castagna's avatar
Giuseppe Castagna committed
33
34
straightforward function similar to our introductory example \code{foo} in (\ref{foo},\ref{foo2}). Here the
programmer annotates the parameter of the function with a coarse type
35
$\Int\vee\Bool$. Our implementation first type-checks the body of the
Kim Nguyễn's avatar
Kim Nguyễn committed
36
function under this assumption, but doing so collects that the type of
Giuseppe Castagna's avatar
Giuseppe Castagna committed
37
38
39
40
41
$\texttt{x}$ is specialized to \Int{} in the ``then'' case and to \Bool{}
in the ``else'' case. The function is thus type-checked twice more
under each hypothesis for \texttt{x}, yielding the precise type
$(\Int\to\Int)\land(\Bool\to\Bool)$. Note that w.r.t.\  rule \Rule{AbsInf+} of Section~\ref{sec:refining}, we improved the output of the computed
type. Indeed using rule~[{\sc AbsInf}+] we would obtain the
Kim Nguyễn's avatar
Kim Nguyễn committed
42
43
type
$(\Int\to\Int)\land(\Bool\to\Bool)\land(\Bool\vee\Int\to\Bool\vee\Int)$
44
45
with a redundant arrow. Here we can see that since we deduced
the first two arrows $(\Int\to\Int)\land(\Bool\to\Bool)$, and since
46
the union of their domain exactly covers the domain of the third arrow,
47
the latter is not needed. Code~2 shows what happens when the argument
Giuseppe Castagna's avatar
Giuseppe Castagna committed
48
of the function is left unannotated (i.e., it is annotated by the top
49
type \Any, written \texttt{Any} in our implementation). Here
50
51
type-checking and refinement also work as expected, but the function
only type checks if all cases for \texttt{x} are covered (which means
Giuseppe Castagna's avatar
Giuseppe Castagna committed
52
53
that the function must handle the case of inputs that are neither in \Int{}
nor in \Bool).
54
55
56

The following examples paint a more interesting picture. First
(Code~3) it is
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
57
easy in our formalism to program type predicates such as those
Giuseppe Castagna's avatar
Giuseppe Castagna committed
58
hard-coded in the $\lambda_{\textit{TR}}$ language of \citet{THF10}. Such type
59
60
predicates, which return \texttt{true} if and only if their input has
a particular type, are just plain functions with an intersection
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
61
type inferred by the system of Section~\ref{sec:refining}. We next define Boolean connectives as overloaded
Giuseppe Castagna's avatar
Giuseppe Castagna committed
62
functions. The \texttt{not\_} connective (Code~4) just tests whether its
63
argument is the Boolean \texttt{true} by testing that it belongs to
64
65
66
67
68
69
70
71
72
73
74
the singleton type \True{} (the type whose only value is
\texttt{true}) returning \texttt{false} for it and \texttt{true} for
any other value (recall that $\neg\True$ is equivalent to
$\texttt{Any\textbackslash}\True$). It works on values of any type,
but we could restrict it to Boolean values by simply annotating the
parameter by \Bool{} (which in CDuce is syntactic sugar for
\True$\vee$\False) yielding the type
$(\True{\to}\False)\wedge(\False{\to}\True)$.
The \texttt{or\_} connective (Code~5) is straightforward as far as the
code goes, but we see that the overloaded type precisely capture all
the possible cases. Again we use a generalized version of the
Giuseppe Castagna's avatar
Giuseppe Castagna committed
75
\texttt{or\_} connective that accepts and treats any value that is not
76
77
\texttt{true} as \texttt{false} and again, we could easily restrict the
domain to \Bool{} if desired.
78

Giuseppe Castagna's avatar
Giuseppe Castagna committed
79
To showcase the power of our type system, and in particular of
80
the ``$\worra{}{}$''
Giuseppe Castagna's avatar
Giuseppe Castagna committed
81
type operator, we define \texttt{and\_} (Code~6) using De Morgan's
82
Laws instead of
Giuseppe Castagna's avatar
Giuseppe Castagna committed
83
using a direct definition. Here the application of the outermost \texttt{not\_} operator is checked against type \True. This
84
allows the system to deduce that the whole \texttt{or\_} application
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
85
86
has type \False, which in turn leads to \texttt{not\_\;x} and
\texttt{not\_\;y} to have type $\lnot \True$ and therefore \texttt{x}
87
88
89
90
91
92
and \texttt{y} to have type \True. The whole function is typed with
the most precise type (we present the type as printed by our
implementation, but the first arrow of the resulting type is
equivalent to
$(\True\to\lnot\True\to\False)\land(\True\to\True\to\True)$).

Giuseppe Castagna's avatar
Giuseppe Castagna committed
93
All these type predicates and Boolean connectives can be used together
94
95
96
to write complex type tests, as in Code~7. Here we define a function
\texttt{f} that takes two arguments \texttt{x} and \texttt{y}. If
\texttt{x} is an integer and \texttt{y} a Boolean, then it returns the
Giuseppe Castagna's avatar
Giuseppe Castagna committed
97
98
integer \texttt{1}; if \texttt{x} is a character or
\texttt{y} is an integer, then it returns \texttt{2}; otherwise the
99
function returns \texttt{3}. Our system correctly deduces a (complex)
Giuseppe Castagna's avatar
Giuseppe Castagna committed
100
101
intersection type that covers all cases (plus several redundant arrow
types). That this type is as precise as possible can be shown by the fact that
102
when applying
Giuseppe Castagna's avatar
Giuseppe Castagna committed
103
104
\texttt{f} to arguments of the expected type, the \emph{type} deduced for the
whole expression is the singleton type \texttt{1}, or \texttt{2},
105
106
107
or \texttt{3}, depending on the type of the arguments.

Code~8 allows us to demonstrate the use and typing of record paths. We
Giuseppe Castagna's avatar
Giuseppe Castagna committed
108
model, using open records, the type of DOM objects that represent XML
109
or HTML documents. Such objects possess a common field
Giuseppe Castagna's avatar
Giuseppe Castagna committed
110
111
\texttt{nodeType} containing an integer constant denoting the kind of
the node (e.g., \p{9} for the root element, \p{1} for an element node, \p{3} for a text node, \ldots). Depending on the kind, the object will have
112
113
114
115
different fields and methods. It is common practice to perform a test
on the value of the \texttt{nodeType} field. In dynamic languages such
as JavaScript, the relevant field or method can directly be accessed
after having checked for the appropriate \texttt{nodeType}. In
Giuseppe Castagna's avatar
Giuseppe Castagna committed
116
mainstream statically typed languages, such as Java, a downward cast
117
from the generic \texttt{Node} type to the expected precise type of
Giuseppe Castagna's avatar
Giuseppe Castagna committed
118
the object is needed. We can see that using the extension presented in
119
120
121
122
123
124
125
Section~\ref{ssec:struct} we can deduce the correct type for
\texttt{x} in all cases. Of particular interest is the last case,
since we use a type case to check the emptiness of the list of child
nodes. This splits, at the type level, the case for the \Keyw{Element}
type depending on whether the content of the \texttt{childNodes} field
is the empty list or not.

126
127
Our implementation features one last enhancement that allows us
to further improve the precision of the inferred type.
128
129
130
Consider the definition of the \texttt{xor\_} operator (Code~9).
Here the rule~[{\sc AbsInf}+] is not sufficient to precisely type the
function, and using only this rule would yield a type
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
131
 $\Any\to\Any\to\Bool$. Let us follow the behavior of the
132
133
 ``$\worra{}{}$'' operator. Here the whole \texttt{and\_} is requested
 to have type \True, which implies that \texttt{or\_ x y} must have
Giuseppe Castagna's avatar
Giuseppe Castagna committed
134
 type \True. This can always happen, whether \texttt{x} is \True{} or
135
136
 not (but then depends on the type of \texttt{y}). The ``$\worra{}{}$''
 operator correctly computes that the type for \texttt{x} in the
Giuseppe Castagna's avatar
Giuseppe Castagna committed
137
 ``\texttt{then}'' branch is $\True\vee\lnot\True\lor\True\simeq\Any$,
138
139
and a similar reasoning holds for \texttt{y}. To solve this problem,
we can first remark that even though type cases in the body of a
Giuseppe Castagna's avatar
Giuseppe Castagna committed
140
141
142
143
function are tipping points that may change the type of the result
of the function, they are not the only ones: applications of overloaded functions play exactly the same role. We
therefore extend deduction system for $\Gamma \vdash e\triangleright\psi$ defined in Section~\ref{sec:refining} with
the following rule\\[1mm]
144
145
146
\centerline{\(
\Infer[OverApp]
    {
Giuseppe Castagna's avatar
Giuseppe Castagna committed
147
      \Gamma \vdash e : \textstyle\bigvee \bigwedge_{i \in I}t_i\to{}s_i\\
148
149
150
151
      \Gamma \vdash x : t\\
      \Gamma \vdash e\triangleright\psi_1\\
      \Gamma \vdash x\triangleright\psi_2\\
    }
Giuseppe Castagna's avatar
Giuseppe Castagna committed
152
    { \Gamma \vdash\textstyle
153
154
155
      {e}{~x}\triangleright\psi_1\cup\psi_2\cup\bigcup_{i\in I}\{
      x\mapsto t\wedge t_i\} }
    {}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
156
157
    \)}\\
Whenever a function parameter is the argument of an
158
overloaded function, we record as possible types for this parameter
Giuseppe Castagna's avatar
Giuseppe Castagna committed
159
160
all the possible domains of the arrows that type the overloaded
function, restricted by the static type of the parameter. In Code~9,
161
162
163
164
since, \texttt{or\_} has type\\
\centerline{$(\True\to\Any\to\True)\land(\Any\to\True\to\True)\land
   (\lnot\True\to\lnot\True\to\False)$}
We consider \True, \Any and $\lnot\True$ as candidate types for
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
165
\texttt{x} which, in turn allows us to deduce a precise type given in the table. Finally, thanks to this rule it is no longer necessary to use a type case to force refinement. As a consequence we can define the functions \texttt{and\_} and \texttt{xor\_} more naturally as:
Giuseppe Castagna's avatar
space    
Giuseppe Castagna committed
166
\begin{alltt}\color{darkblue}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
167
168
169
170
let and_ = fun (x : Any) -> fun (y : Any) -> not_ (or_ (not_ x) (not_ y))
let xor_ = fun (x : Any) -> fun (y : Any) -> and_ (or_ x y) (not_ (and_ x y))
\end{alltt}
for which the very same types as in Table~\ref{tab:implem} are deduced.