practical.tex 10.3 KB
Newer Older
Giuseppe Castagna's avatar
Giuseppe Castagna committed
1
We have implemented the algorithmic system $\vdashA$. Our implementation is written in OCaml and uses CDuce as a library to
Giuseppe Castagna's avatar
Giuseppe Castagna committed
2
provide the semantic subtyping machinery. Besides a type-checking
Kim Nguyễn's avatar
Kim Nguyễn committed
3
4
algorithm defined on the base language, our implementation supports
record types (Section \ref{ssec:struct}) and the refinement of function types
Giuseppe Castagna's avatar
Giuseppe Castagna committed
5
(Section \ref{sec:refining} with the rule of Appendix~\ref{app:optimize}). The implementation is rather crude and
Giuseppe Castagna's avatar
Giuseppe Castagna committed
6
consists of 2000 lines of OCaml code, including parsing, type-checking
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
7
of programs, and pretty printing of types. We demonstrate the output of
Kim Nguyễn's avatar
Kim Nguyễn committed
8
9
our type-checking implementation in Table~\ref{tab:implem}. These
examples and others can be tested in the online toplevel available at
Giuseppe Castagna's avatar
Giuseppe Castagna committed
10
11
12
\url{https://occtyping.github.io/}%
\ifsubmission
~(the corresponding repository is
Kim Nguyễn's avatar
Kim Nguyễn committed
13
anonymized).
Giuseppe Castagna's avatar
Giuseppe Castagna committed
14
15
\else.
\fi
Kim Nguyễn's avatar
Kim Nguyễn committed
16
\input{code_table}
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
17
In this table, the second column gives a code fragment and the third
Kim Nguyễn's avatar
Kim Nguyễn committed
18
column the type deduced by our implementation. Code~1 is a
Giuseppe Castagna's avatar
Giuseppe Castagna committed
19
20
straightforward function similar to our introductory example \code{foo} in (\ref{foo},\ref{foo2}). Here the
programmer annotates the parameter of the function with a coarse type
21
$\Int\vee\Bool$. Our implementation first type-checks the body of the
Kim Nguyễn's avatar
Kim Nguyễn committed
22
function under this assumption, but doing so collects that the type of
Giuseppe Castagna's avatar
Giuseppe Castagna committed
23
24
25
26
27
$\texttt{x}$ is specialized to \Int{} in the ``then'' case and to \Bool{}
in the ``else'' case. The function is thus type-checked twice more
under each hypothesis for \texttt{x}, yielding the precise type
$(\Int\to\Int)\land(\Bool\to\Bool)$. Note that w.r.t.\  rule \Rule{AbsInf+} of Section~\ref{sec:refining}, we improved the output of the computed
type. Indeed using rule~[{\sc AbsInf}+] we would obtain the
Kim Nguyễn's avatar
Kim Nguyễn committed
28
29
type
$(\Int\to\Int)\land(\Bool\to\Bool)\land(\Bool\vee\Int\to\Bool\vee\Int)$
30
31
with a redundant arrow. Here we can see that since we deduced
the first two arrows $(\Int\to\Int)\land(\Bool\to\Bool)$, and since
32
the union of their domain exactly covers the domain of the third arrow,
33
the latter is not needed. Code~2 shows what happens when the argument
Giuseppe Castagna's avatar
Giuseppe Castagna committed
34
of the function is left unannotated (i.e., it is annotated by the top
Giuseppe Castagna's avatar
Giuseppe Castagna committed
35
type \Any, written ``\texttt{Any}'' in our implementation). Here
36
37
type-checking and refinement also work as expected, but the function
only type checks if all cases for \texttt{x} are covered (which means
Giuseppe Castagna's avatar
Giuseppe Castagna committed
38
39
that the function must handle the case of inputs that are neither in \Int{}
nor in \Bool).
40
41
42

The following examples paint a more interesting picture. First
(Code~3) it is
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
43
easy in our formalism to program type predicates such as those
Giuseppe Castagna's avatar
Giuseppe Castagna committed
44
hard-coded in the $\lambda_{\textit{TR}}$ language of \citet{THF10}. Such type
45
46
predicates, which return \texttt{true} if and only if their input has
a particular type, are just plain functions with an intersection
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
47
type inferred by the system of Section~\ref{sec:refining}. We next define Boolean connectives as overloaded
Giuseppe Castagna's avatar
Giuseppe Castagna committed
48
functions. The \texttt{not\_} connective (Code~4) just tests whether its
49
argument is the Boolean \texttt{true} by testing that it belongs to
50
51
52
53
54
55
56
57
58
the singleton type \True{} (the type whose only value is
\texttt{true}) returning \texttt{false} for it and \texttt{true} for
any other value (recall that $\neg\True$ is equivalent to
$\texttt{Any\textbackslash}\True$). It works on values of any type,
but we could restrict it to Boolean values by simply annotating the
parameter by \Bool{} (which in CDuce is syntactic sugar for
\True$\vee$\False) yielding the type
$(\True{\to}\False)\wedge(\False{\to}\True)$.
The \texttt{or\_} connective (Code~5) is straightforward as far as the
Giuseppe Castagna's avatar
Giuseppe Castagna committed
59
60
code goes, but we see that the overloaded type precisely captures all
possible cases. Again we use a generalized version of the
Giuseppe Castagna's avatar
Giuseppe Castagna committed
61
\texttt{or\_} connective that accepts and treats any value that is not
62
63
\texttt{true} as \texttt{false} and again, we could easily restrict the
domain to \Bool{} if desired.
64

Giuseppe Castagna's avatar
Giuseppe Castagna committed
65
To showcase the power of our type system, and in particular of
66
the ``$\worra{}{}$''
Giuseppe Castagna's avatar
Giuseppe Castagna committed
67
type operator, we define \texttt{and\_} (Code~6) using De Morgan's
68
Laws instead of
Giuseppe Castagna's avatar
Giuseppe Castagna committed
69
using a direct definition. Here the application of the outermost \texttt{not\_} operator is checked against type \True. This
70
allows the system to deduce that the whole \texttt{or\_} application
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
71
has type \False, which in turn leads to \texttt{not\_\;x} and
Victor Lanvin's avatar
Fixes    
Victor Lanvin committed
72
\texttt{not\_\;y} to have type $\lnot \True$ and therefore both \texttt{x}
73
74
75
76
77
78
and \texttt{y} to have type \True. The whole function is typed with
the most precise type (we present the type as printed by our
implementation, but the first arrow of the resulting type is
equivalent to
$(\True\to\lnot\True\to\False)\land(\True\to\True\to\True)$).

Giuseppe Castagna's avatar
Giuseppe Castagna committed
79
All these type predicates and Boolean connectives can be used together
80
81
82
to write complex type tests, as in Code~7. Here we define a function
\texttt{f} that takes two arguments \texttt{x} and \texttt{y}. If
\texttt{x} is an integer and \texttt{y} a Boolean, then it returns the
Giuseppe Castagna's avatar
Giuseppe Castagna committed
83
84
integer \texttt{1}; if \texttt{x} is a character or
\texttt{y} is an integer, then it returns \texttt{2}; otherwise the
85
function returns \texttt{3}. Our system correctly deduces a (complex)
Giuseppe Castagna's avatar
Giuseppe Castagna committed
86
87
intersection type that covers all cases (plus several redundant arrow
types). That this type is as precise as possible can be shown by the fact that
88
when applying
Giuseppe Castagna's avatar
Giuseppe Castagna committed
89
90
\texttt{f} to arguments of the expected type, the \emph{type} deduced for the
whole expression is the singleton type \texttt{1}, or \texttt{2},
91
92
93
or \texttt{3}, depending on the type of the arguments.

Code~8 allows us to demonstrate the use and typing of record paths. We
Giuseppe Castagna's avatar
Giuseppe Castagna committed
94
model, using open records, the type of DOM objects that represent XML
95
or HTML documents. Such objects possess a common field
Giuseppe Castagna's avatar
Giuseppe Castagna committed
96
97
\texttt{nodeType} containing an integer constant denoting the kind of
the node (e.g., \p{9} for the root element, \p{1} for an element node, \p{3} for a text node, \ldots). Depending on the kind, the object will have
98
99
100
101
different fields and methods. It is common practice to perform a test
on the value of the \texttt{nodeType} field. In dynamic languages such
as JavaScript, the relevant field or method can directly be accessed
after having checked for the appropriate \texttt{nodeType}. In
Giuseppe Castagna's avatar
Giuseppe Castagna committed
102
mainstream statically typed languages, such as Java, a downward cast
103
from the generic \texttt{Node} type to the expected precise type of
Giuseppe Castagna's avatar
Giuseppe Castagna committed
104
the object is needed. We can see that using the extension presented in
105
106
107
108
109
110
111
Section~\ref{ssec:struct} we can deduce the correct type for
\texttt{x} in all cases. Of particular interest is the last case,
since we use a type case to check the emptiness of the list of child
nodes. This splits, at the type level, the case for the \Keyw{Element}
type depending on whether the content of the \texttt{childNodes} field
is the empty list or not.

Kim Nguyễn's avatar
.    
Kim Nguyễn committed
112
Our implementation features another enhancement that allows us
113
to further improve the precision of the inferred type.
114
115
116
Consider the definition of the \texttt{xor\_} operator (Code~9).
Here the rule~[{\sc AbsInf}+] is not sufficient to precisely type the
function, and using only this rule would yield a type
Giuseppe Castagna's avatar
typos    
Giuseppe Castagna committed
117
 $\Any\to\Any\to\Bool$. Let us follow the behavior of the
118
119
 ``$\worra{}{}$'' operator. Here the whole \texttt{and\_} is requested
 to have type \True, which implies that \texttt{or\_ x y} must have
Giuseppe Castagna's avatar
Giuseppe Castagna committed
120
 type \True. This can always happen, whether \texttt{x} is \True{} or
121
122
 not (but then depends on the type of \texttt{y}). The ``$\worra{}{}$''
 operator correctly computes that the type for \texttt{x} in the
Giuseppe Castagna's avatar
Giuseppe Castagna committed
123
 ``\texttt{then}'' branch is $\True\vee\lnot\True\lor\True\simeq\Any$,
124
125
and a similar reasoning holds for \texttt{y}. To solve this problem,
we can first remark that even though type cases in the body of a
Giuseppe Castagna's avatar
Giuseppe Castagna committed
126
127
128
129
function are tipping points that may change the type of the result
of the function, they are not the only ones: applications of overloaded functions play exactly the same role. We
therefore extend deduction system for $\Gamma \vdash e\triangleright\psi$ defined in Section~\ref{sec:refining} with
the following rule\\[1mm]
130
131
132
\centerline{\(
\Infer[OverApp]
    {
Giuseppe Castagna's avatar
Giuseppe Castagna committed
133
      \Gamma \vdash e : \textstyle\bigvee \bigwedge_{i \in I}t_i\to{}s_i\\
134
135
136
137
      \Gamma \vdash x : t\\
      \Gamma \vdash e\triangleright\psi_1\\
      \Gamma \vdash x\triangleright\psi_2\\
    }
Giuseppe Castagna's avatar
Giuseppe Castagna committed
138
    { \Gamma \vdash\textstyle
139
140
      {e}{~x}\triangleright\psi_1\cup\psi_2\cup\bigcup_{i\in I}\{
      x\mapsto t\wedge t_i\} }
141
    {(t\wedge t_i\not\simeq\Empty)}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
142
143
    \)}\\
Whenever a function parameter is the argument of an
144
overloaded function, we record as possible types for this parameter
145
146
147
all the domains $t_i$ of the arrows that type the overloaded
function, restricted (via intersection) by the static type $t$ of the parameter and provided that the type is not empty ($t\wedge t_i\not\simeq\Empty$). In Code~9,
since \texttt{or\_} has type\\[.7mm]
148
\centerline{$(\True\to\Any\to\True)\land(\Any\to\True\to\True)\land
Giuseppe Castagna's avatar
Giuseppe Castagna committed
149
   (\lnot\True\to\lnot\True\to\False)$}\\[.7mm]
150
151
then \True, \Any, and $\lnot\True$ become candidate types for
\texttt{x} and this allows us to deduce the precise type given in the table. Finally, thanks to this rule it is no longer necessary to use a type case to force refinement. As a consequence we can define the functions \texttt{and\_} and \texttt{xor\_} more naturally as:
152
\begin{alltt}\color{darkblue}\morecompact
Giuseppe Castagna's avatar
Giuseppe Castagna committed
153
154
  let and_ = fun (x : Any) -> fun (y : Any) -> not_ (or_ (not_ x) (not_ y))
  let xor_ = fun (x : Any) -> fun (y : Any) -> and_ (or_ x y) (not_ (and_ x y))
Giuseppe Castagna's avatar
Giuseppe Castagna committed
155
156
\end{alltt}
for which the very same types as in Table~\ref{tab:implem} are deduced.
Kim Nguyễn's avatar
.    
Kim Nguyễn committed
157

Giuseppe Castagna's avatar
Giuseppe Castagna committed
158
159
160
Last but not least Code~10 (corresponding to our introductory
example~\eqref{nest1}) illustrates the need for iterative refinement of
type environments, as defined in Section~\ref{sec:typenv}. As
Giuseppe Castagna's avatar
empty    
Giuseppe Castagna committed
161
explained, a single pass analysis would deduce 
Kim Nguyễn's avatar
.    
Kim Nguyễn committed
162
for {\tt x}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
163
a type \Int{} from the {\tt f\;x} application and \Any{} from the {\tt g\;x}
Kim Nguyễn's avatar
.    
Kim Nguyễn committed
164
application. Here by iterating a second time, the algorithm deduces
Giuseppe Castagna's avatar
empty    
Giuseppe Castagna committed
165
that {\tt x} has type $\Empty$ (i.e., $\textsf{Empty}$), that is that the first branch can never
Giuseppe Castagna's avatar
Giuseppe Castagna committed
166
be selected (and our implementation warns the user accordingly). In hindsight, the only way for a well-typed overloaded function to have
Giuseppe Castagna's avatar
empty    
Giuseppe Castagna committed
167
type $(\Int{\to}\Int)\land(\Any{\to}\Bool)$ is to diverge when the
Giuseppe Castagna's avatar
typo    
Giuseppe Castagna committed
168
argument is of type \Int: since this intersection type states that
Kim Nguyễn's avatar
.    
Kim Nguyễn committed
169
170
171
172
173
whenever the input is \Int, {\em both\/} branches can be selected,
yielding a result that is at the same time an integer and a Boolean.
This is precisely reflected by the case $\Int\to\Empty$ in the result.
Indeed our {\tt example10} function can be applied to an integer, but
at runtime the application of {\tt f ~x} will diverge.