extensions.tex 12.5 KB
Newer Older
Giuseppe Castagna's avatar
Giuseppe Castagna committed
1
2
3
4
5
6
7
As we recalled in the introduction, the main application of occurrence
typing is to type dynamic languages. In this section we explore how to
extend our work to encompass three features that are necessary to type
these languages.

First, we consider record types and record expressions which, in dynamic
languages, are used to implement objects. In particular, we extend our
Giuseppe Castagna's avatar
Giuseppe Castagna committed
8
system to cope with typical usage patterns of objects employed in these
Giuseppe Castagna's avatar
Giuseppe Castagna committed
9
languages such as adding, modifying, or deleting a field, or dynamically
Giuseppe Castagna's avatar
Giuseppe Castagna committed
10
testing its presence to specify different behaviors.
Giuseppe Castagna's avatar
Giuseppe Castagna committed
11
12
13
14
15
16

Second, in order
to precisely type applications in dynamic languages it is
crucial to refine the type of some functions to account for their
different behaviors with specific input types. But current approaches
are bad at it: they require the programmer to explicitly specify a
Giuseppe Castagna's avatar
Giuseppe Castagna committed
17
precise intersection type for these functions and, even with such specifications, some common cases
Giuseppe Castagna's avatar
Giuseppe Castagna committed
18
19
20
fail to type (in that case the only solution is to hard-code the
function and its typing discipline into the language). We show how we
can use the work developed in the previous sections to infer precise
Giuseppe Castagna's avatar
Giuseppe Castagna committed
21
22
intersection types for functions. In our system, these functions do not
require any type annotation or just an annotation for
Giuseppe Castagna's avatar
Giuseppe Castagna committed
23
24
25
26
the function parameters, whereas some of them fail to type in
current alternative approaches even when they are given the full intersection type
specification.

Giuseppe Castagna's avatar
Giuseppe Castagna committed
27
Finally, to type dynamic languages it is often necessary to make
Giuseppe Castagna's avatar
Giuseppe Castagna committed
28
29
30
31
32
statically-typed parts of a program coexist with dynamically-typed
ones. This is the aim of gradually typed systems that we explore in
the third extension of this section.


Giuseppe Castagna's avatar
Giuseppe Castagna committed
33
\subsection{Record types}
Kim Nguyễn's avatar
Kim Nguyễn committed
34
\label{ssec:struct}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
35
\iflongversion%%%%%%%%%%%%%%
Giuseppe Castagna's avatar
Giuseppe Castagna committed
36
37
38
The previous analysis already covers a large gamut of realistic
cases. For instance, the analysis already handles list data
structures, since products and recursive types can encode them as
Giuseppe Castagna's avatar
Giuseppe Castagna committed
39
right-associative nested pairs, as it is done in the language
Giuseppe Castagna's avatar
Giuseppe Castagna committed
40
CDuce (e.g., $X = \textsf{Nil}\vee(\Int\times X)$ is the
Giuseppe Castagna's avatar
Giuseppe Castagna committed
41
42
type of the lists of integers): see Code 8 in Table~\ref{tab:implem} of Section~\ref{sec:practical} for a concrete example. Even more, thanks to the presence of
union types it is possible to type heterogeneous lists whose
Giuseppe Castagna's avatar
Giuseppe Castagna committed
43
content is described by regular expressions on types as proposed
Giuseppe Castagna's avatar
Giuseppe Castagna committed
44
45
46
47
48
by~\citet{hosoya00regular}. However, this is not enough to cover
records and, in particular, the specific usage patterns in dynamic
languages of records, whose field are dynamically tested, deleted,
added, and modified. This is why we extend here our work to records,
building on the record types as they are defined in CDuce.
49
50
51
52

The extension we present in this section is not trivial. Although we use the record \emph{types} as they are
defined in CDuce we cannot do the same for CDuce record
\emph{expressions}. The reasons why we cannot use the record
Giuseppe Castagna's avatar
Giuseppe Castagna committed
53
expressions of CDuce and we have to define and study new ones are
54
55
56
twofold. On the one hand  we want to capture the typing of record field extension and field
deletion, two operation widely used in dynamic language; on the other
hand we need to have very simple expressions formed by elementary
Giuseppe Castagna's avatar
Giuseppe Castagna committed
57
sub-expressions, in order to limit the combinatorics of occurrence
58
59
60
typing. For this reason we build our records one field at a time,
starting from the empty record and adding, updating, or deleting
single fields.
Giuseppe Castagna's avatar
Giuseppe Castagna committed
61
62
63
64
65
66
67
68
69
70
\else%%%%%%%%%%%
 Since the main application of occurrence
typing is to type dynamic languages, then it is important to show that our
work can handle records. We cannot handle them via a simple encoding into
pairs, since we need a precise type analysis for operations such
as field addition, update, and deletion. We therefore extend our approach to deal
with the records (types and expressions) as they are to be found in the aforementioned
CDuce language, where these operations can be precisely typed.
\fi%%%%%

71
Formally, CDuce record \emph{types} can be embedded in our types by adding the
Giuseppe Castagna's avatar
Giuseppe Castagna committed
72
73
following two type constructors:\\[1.4mm]
%
Giuseppe Castagna's avatar
Giuseppe Castagna committed
74
\centerline{\(\textbf{Types} \quad t ~ ::= ~ \record{\ell_1=t \ldots \ell_n=t}{t}\alt \Undef\)}\\[1.4mm]
75
76
77
78
79
%% \[
%% \begin{array}{lrcl}
%% \textbf{Types} & t & ::= & \record{\ell_1=t \ldots \ell_n=t}{t}\alt \Undef
%% \end{array}
%% \]
Giuseppe Castagna's avatar
Giuseppe Castagna committed
80
where $\ell$ ranges over an infinite set of labels $\Labels$ and $\Undef$
Giuseppe Castagna's avatar
Giuseppe Castagna committed
81
is a special singleton type whose only value is the constant
Giuseppe Castagna's avatar
Giuseppe Castagna committed
82
$\undefcst$, a constant not in $\Any$. The type
Giuseppe Castagna's avatar
Giuseppe Castagna committed
83
84
85
86
87
$\record{\ell_1=t_1 \ldots \ell_n=t_n}{t}$ is a \emph{quasi-constant
  function} that maps every $\ell_i$ to the type $t_i$ and every other
$\ell \in \Labels$ to the type $t$ (all the $\ell_i$'s must be
distinct). Quasi constant functions are the internal representation of
record types in CDuce. These are not visible to the programmer who can use
Giuseppe Castagna's avatar
Giuseppe Castagna committed
88
only two specific forms of quasi constant functions, open record types and closed record types (as for OCaml object types), provided by the
Mickael Laurent's avatar
Mickael Laurent committed
89
following syntactic sugar:%
Giuseppe Castagna's avatar
Giuseppe Castagna committed
90
91
92
\iflongversion%
\footnote{Note that in the definitions ``$\ldots{}$'' is meta-syntax to denote the presence of other fields while in the open records ``{\large\textbf{..}}'' is the syntax that distinguishes them from closed ones.}
\fi
Giuseppe Castagna's avatar
Giuseppe Castagna committed
93
\begin{itemize}[nosep]
Giuseppe Castagna's avatar
Giuseppe Castagna committed
94
95
\item $\crecord{\ell_1=t_1, \ldots, \ell_n=t_n}$ for $\record{\ell_1=t_1 \ldots \ell_n=t_n}{\Undef}$ \hfill(closed records).
\item $\orecord{\ell_1=t_1, \ldots, \ell_n=t_n}$ for $\record{\ell_1=t_1 \ldots \ell_n=t_n}{\Any \vee \Undef}$\hfill (open records).
Mickael Laurent's avatar
Mickael Laurent committed
96
\end{itemize}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
97
plus the notation $\mathtt{\ell \eqq} t$ to denote optional fields,
Giuseppe Castagna's avatar
Giuseppe Castagna committed
98
which corresponds to using in the quasi-constant function notation the
Giuseppe Castagna's avatar
Giuseppe Castagna committed
99
100
101
102
103
104
field $\ell = t \vee \Undef$%
\iflongversion
.
\else
\ (note that ``$\ldots{}$'' is meta-syntax while ``{\large\textbf{..}}'' is syntax).
\fi
Giuseppe Castagna's avatar
Giuseppe Castagna committed
105

Giuseppe Castagna's avatar
Giuseppe Castagna committed
106
For what concerns \emph{expressions}, we cannot use CDuce record expressions
107
as they are, but instead we must adapt them to our analysis. So as anticipated, we
Kim Nguyễn's avatar
Kim Nguyễn committed
108
consider records that are built starting from the empty record expression \erecord{} by adding, updating, or removing fields:\svvspace{-0.75mm}
Mickael Laurent's avatar
Mickael Laurent committed
109
110
\[
\begin{array}{lrcl}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
111
\textbf{Expr} & e & ::= & \erecord {} ~\alt~ \recupd e \ell e ~\alt~ \recdel e \ell ~\alt~ e.\ell
Kim Nguyễn's avatar
Kim Nguyễn committed
112
\end{array}\svvspace{-.75mm}
Mickael Laurent's avatar
Mickael Laurent committed
113
\]
Giuseppe Castagna's avatar
Giuseppe Castagna committed
114
in particular $\recdel e \ell$ deletes the field $\ell$ from $e$, $\recupd e \ell e'$ adds the field $\ell=e'$ to the record $e$ (deleting any existing $\ell$ field), while $e.\ell$ is field selection with the reduction:
Giuseppe Castagna's avatar
Giuseppe Castagna committed
115
\(\erecord{...,\ell=e,...}.\ell\ \reduces\ e\).
Mickael Laurent's avatar
Mickael Laurent committed
116

117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
To define record type subtyping and record expression type inference
we need three operators on record types: $\proj \ell t$ which returns
the type of the field $\ell$ in the record type $t$, $t_1+t_2$ which
returns the record type formed by all the fields in $t_2$ and those in
$t_1$ that are not in $t_2$, and $\recdel t\ell$ which returns the
type $t$ in which the field $\ell$ is
\iflongversion%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
undefined. They are formally defined as follows (see~\citet{alainthesis} for more details):
\begingroup
\allowdisplaybreaks
\begin{eqnarray}
\proj \ell t & = & \left\{\begin{array}{ll}\min \{ u \alt t\leq \orecord{\ell=u}\} &\text{if } t \leq \orecord{\ell = \Any}\\ \text{undefined}&\text{otherwise}\end{array}\right.\\[2mm]
  t_1 + t_2 & = & \min\left\{
      u \quad\bigg|\quad \forall \ell \in \Labels.\left\{\begin{array}{ll}
      \proj \ell u \geq \proj \ell {t_2} &\text{ if } \proj \ell {t_2} \leq \neg \Undef\\
      \proj \ell u \geq \proj \ell {t_1} \vee (\proj \ell {t_2} \setminus \Undef) &\text{ otherwise}
      \end{array}\right\}
    \right\}\\[2mm]
  \recdel t \ell & = & \min \left\{ u \quad\bigg|\quad  \forall \ell' \in \Labels. \left\{\begin{array}{ll}
    \proj {\ell'} u \geq \Undef &\text{ if }\ell' = \ell\\
    \proj {\ell'} u \geq \proj {\ell'} t &\text{ otherwise}
    \end{array}\right\}\right\}
\end{eqnarray}
\endgroup
\else%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
undefined
(see \Appendix\ref{app:recop} for the formal definition and~\citet{alainthesis} for more details).
\fi%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Giuseppe Castagna's avatar
Giuseppe Castagna committed
145
%
Giuseppe Castagna's avatar
Giuseppe Castagna committed
146
Then two record types $t_1$ and $t_2$ are in subtyping relation, $t_1 \leq t_2$, if and only if for all $\ell \in \Labels$ we have $\proj \ell {t_1} \leq \proj \ell {t_2}$. In particular $\orecord{\!\!}$ is the largest record type.
Mickael Laurent's avatar
Mickael Laurent committed
147

Kim Nguyễn's avatar
Kim Nguyễn committed
148
Expressions are then typed by the following rules (already in algorithmic form).\svvspace{-.1mm}
Mickael Laurent's avatar
Mickael Laurent committed
149
\begin{mathpar}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
150
151
    \Infer[Record]
        {~}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
152
        {\Gamma \vdash \erecord {}:\crecord {}}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
153
        {}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
154
155
156
157
158
~
  \Infer[Update]
        {\Gamma \vdash e_1:t_1\and t_1\leq\orecord {\!\!} \and \Gamma \vdash e_2:t_2}
        {\Gamma \vdash \recupd{e_1}{\ell}{e_2}: t_1 + \crecord{\ell=t_2}}
        {\recupd{e_1\!\!}{\!\!\ell}{e_2} \not\in\dom\Gamma}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
159
%\end{mathpar}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
160
        %\begin{mathpar}
Kim Nguyễn's avatar
Kim Nguyễn committed
161
        \svvspace{-1.9mm}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
162
\\        
Giuseppe Castagna's avatar
Giuseppe Castagna committed
163
164
165
166
  \Infer[Delete]
        {\Gamma \vdash e:t\and t\leq\orecord {\!\!}}
        {\Gamma \vdash \recdel e \ell: \recdel t \ell}
        {\recdel e \ell\not\in\dom\Gamma}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
167

Mickael Laurent's avatar
Mickael Laurent committed
168
  \Infer[Proj]
169
        {\Gamma \vdash e:t\and t\leq\orecord{\ell = \Any}}
170
        {\Gamma \vdash e.\ell:\proj \ell {t}}
Kim Nguyễn's avatar
Kim Nguyễn committed
171
        {e.\ell\not\in\dom\Gamma}\svvspace{-2mm}
Mickael Laurent's avatar
Mickael Laurent committed
172
\end{mathpar}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
173
To extend occurrence typing to records we add the following values to paths: $\varpi\in\{\ldots,a_\ell,u_\ell^1,u_\ell^2,r_\ell\}^*$, with 
Giuseppe Castagna's avatar
Giuseppe Castagna committed
174
175
176
177
\(e.\ell\downarrow a_\ell.\varpi =\occ{e}\varpi\), 
\(\recdel e \ell\downarrow r_\ell.\varpi = \occ{e}\varpi\), and
\(\recupd{e_1}\ell{e_2}\downarrow u_\ell^i.\varpi = \occ{e_i}\varpi\)
and add the following rules for the new paths:
178
\begin{mathpar}\svvspace{-8.7mm}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
179
180
181
182
183
184
185
186
187
    \Infer[PSel]
        { \pvdash \Gamma e t \varpi:t'}
        { \pvdash \Gamma e t {\varpi.a_\ell}:\orecord {\ell:t'} }
        { }
        
    \Infer[PDel]
        { \pvdash \Gamma e t \varpi:t'}
        { \pvdash \Gamma e t \varpi.r_\ell: (\recdel {t'} \ell) + \crecord{\ell \eqq \Any}}
        { }
Kim Nguyễn's avatar
Kim Nguyễn committed
188
        \svvspace{-3mm}\\
Giuseppe Castagna's avatar
Giuseppe Castagna committed
189
    \Infer[PUpd1]
Giuseppe Castagna's avatar
Giuseppe Castagna committed
190
191
192
193
        { \pvdash \Gamma e t \varpi:t'}
        { \pvdash \Gamma e t \varpi.u_\ell^1: (\recdel {t'} \ell) + \crecord{\ell \eqq \Any}}
        { }
        
Giuseppe Castagna's avatar
Giuseppe Castagna committed
194
    \Infer[PUpd2]
Giuseppe Castagna's avatar
Giuseppe Castagna committed
195
196
197
198
        { \pvdash \Gamma e t \varpi:t}
        { \pvdash \Gamma e t \varpi.u_\ell^2: \proj \ell t'}
        { }
\end{mathpar}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
199
200
201
202
203
204
205
Deriving the algorithm from these rules is then straightforward:\\[1.8mm]
\hspace*{-2mm}\(
\begin{array}{llll}
  \constr{\varpi.a_\ell}{\Gamma,e,t} =  \orecord {\ell: \env {\Gamma,e,t} (\varpi)}\hspace*{-1mm}&
  \constr{\varpi.r_\ell}{\Gamma,e,t} =  \recdel {(\env {\Gamma,e,t} (\varpi))} \ell + \crecord{\ell \eqq \Any}\\
   \constr{\varpi.u_\ell^2}{\Gamma,e,t}  =  \proj \ell {(\env {\Gamma,e,t} (\varpi))} &
 \constr{\varpi.u_\ell^1}{\Gamma,e,t}  =  \recdel {(\env {\Gamma,e,t} (\varpi))} \ell + \crecord{\ell \eqq \Any}\\[1.8mm]
Giuseppe Castagna's avatar
Giuseppe Castagna committed
206
\end{array}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
207
\)
208

Giuseppe Castagna's avatar
Giuseppe Castagna committed
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
%% \beppe{Je ne suis absolument pas convaincu par Ext1. Pour moi il devrait donner le field avec le type de $u_\ell^2$ autrement dit pour moi les regles correctes sont:
%% \begin{mathpar}
%%     \Infer[PExt1]
%%         { \pvdash \Gamma e t \varpi:t_1 \and \pvdash \Gamma e t \varpi.u_\ell^2:t_2}
%%         { \pvdash \Gamma e t \varpi.u_\ell^1: (\recdel {t'} \ell) + \crecord{\ell = t_2}}
%%         { }
%% \end{mathpar}
%% \[
%% \begin{array}{lcl}
%%    \constr{\varpi.u_\ell^1}{\Gamma,e,t} & = & \recdel {(\env {\Gamma,e,t} (\varpi))} \ell + \crecord{\ell =  \env {\Gamma,e,t} (\varpi.u_\ell^2)}
%% \end{array}
%% \]
%% }
Notice that the effect of doing $\recdel t \ell + \crecord{\ell \eqq
  \Any}$ corresponds to setting the field $\ell$ of the (record) type
$t$ to the type $\Any\vee\Undef$, that is, to the type of all
Giuseppe Castagna's avatar
Giuseppe Castagna committed
225
undefined fields in an open record. So \Rule{PDel} and \Rule{PUpd1}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
226
mean that if we remove, add, or redefine a field $\ell$ in an expression $e$
Giuseppe Castagna's avatar
Giuseppe Castagna committed
227
then all we can deduce for $e$ is that its field $\ell$ is undefined: since the original field was destroyed we do not have any information on it apart from the static one.
Giuseppe Castagna's avatar
Giuseppe Castagna committed
228
229
For instance, consider the test:\\[1mm]
\centerline{\(\ifty{\recupd{x}{a}{0}}{\orecord{a=\Int, b=\Bool}\vee \orecord{a=\Bool, b=\Int}}{x.b}{\False}\)}\\[1mm]
Giuseppe Castagna's avatar
Giuseppe Castagna committed
230
231
232
233
234
By  $\constr{\varpi.u_\ell^1}{\Gamma,e,t}$---i.e., by \Rule{Ext1}, \Rule{PTypeof}, and \Rule{PInter}---the type for $x$ in the positive branch is $((\orecord{a=\Int, b=\Bool}\vee \orecord{a=\Bool, b=\Int}) \land \orecord{a=\Int}) + \crecord{a\eqq \Any}$.
It is equivalent to the type $\orecord{b=\Bool}$, and thus we can deduce that $x.b$ has the type $\Bool$.

\beppe{Compare with path expressions of ~\citet{THF10} }

235
\subsection{Refining function types}\label{sec:refining}
236
237
\input{refining}

238
\subsection{Integrating gradual typing}\label{sec:gradual}
Victor Lanvin's avatar
Victor Lanvin committed
239
\input{gradual}
Giuseppe Castagna's avatar
Giuseppe Castagna committed
240