practical.tex 8.74 KB
 Kim Nguyễn committed Jul 08, 2019 1 2 3 We have implemented a simplified version of the algorithm presented in Section~\ref{ssec:algorithm}. In particular, our implementation does not make use of type schemes and is therefore incomplete. In  Kim Nguyễn committed Jul 08, 2019 4 particular, as we explained in Section~\ref{ssec:algorithm}, in the absence of type  Kim Nguyễn committed Jul 08, 2019 5 scheme it is not always possible to prove that $\forall v, \forall t,  Kim Nguyễn committed Jul 08, 2019 6 v \in t \text{~or~} v \not\in \lnot t$. However this is property does  Kim Nguyễn committed Jul 08, 2019 7 8 9 10 11 12 13 not hold only for lambda expressions, therefore not using type schemes only yields imprecise typing for tests of the form \begin{displaymath} \ifty{(\lambda^{\bigwedge_{i\in I}t_i \to s_i} x. e)}{t}{e_1}{e_2} \end{displaymath} This seems like a reasonable compromise between the complexity of an implementation involving type scheme and the programs we want to  Kim Nguyễn committed Jul 08, 2019 14 type-check in practice.  Giuseppe Castagna committed Jan 03, 2019 15   Kim Nguyễn committed Jul 08, 2019 16 Our implementation is written in OCaml and uses CDuce as a library to  Kim Nguyễn committed Jul 08, 2019 17 provide the semantic sub-typing machinery. Besides a type-checking  Kim Nguyễn committed Jul 08, 2019 18 19 20 21 22 23 24 25 26 27 algorithm defined on the base language, our implementation supports record types (Section \ref{ssec:struct}) and the refinement of function types (Section \ref{sec:refining}). The implementation is rather crude and consits of 2000 lines of OCaml code, including parsing, type-checking of programs and pretty printing of types. We demonstrate the output of our type-checking implementation in Table~\ref{tab:implem}. \input{code_table} In this table, the second column gives a code fragment and third column the type deduced by our implementation. Code~1 is a straightforward function similar to our introductory example. Here the  Kim Nguyễn committed Jul 08, 2019 28 29 programmer annotates the domain of the function with a coarse type $\Int\vee\Bool$. Our implementation first type-checks the body of the  Kim Nguyễn committed Jul 08, 2019 30 31 function under this assumption, but doing so collects that the type of $\texttt{x}$ is specialized to \Int in the then'' case and to \Bool  Kim Nguyễn committed Jul 08, 2019 32 in the else'' case. The function is therefore type-checked twice more  Kim Nguyễn committed Jul 08, 2019 33 34 35 36 37 38 under both hypothesis for \texttt{x} yielding the precise type $(\Int\to\Int)\land(\Bool\to\Bool)$. Note that w.r.t to rule [{\sc AbsInf}+] of Section~\{sec:refining}, we further improve the computed output type. Indeed using rule~[{\sc AbsInf}+] we would obtain the type $(\Int\to\Int)\land(\Bool\to\Bool)\land(\Bool\vee\Int\to\Bool\vee\Int)$  Kim Nguyễn committed Jul 08, 2019 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 with a redundant arrow. Here we can see that since we deduced the first two arrows $(\Int\to\Int)\land(\Bool\to\Bool)$, and since the union of their domain exactly covers the domain the third arrow, the latter is not needed. Code~2 shows what happens when the argument of the function is left unannotated (that is, annotated by the top type \Any written \texttt{Any} in our implementation). Here the type-checking and refinement also work as expected, but the function only type checks if all cases for \texttt{x} are covered (which means that the function must handle the case of inputs that are neither \Int nor \Bool). The following examples paint a more interesting picture. First (Code~3) it is easy in our formalism to provide type predicates such as those hard-coded in the $\lambda_{TR}$ language of \cite{THF10}. Such type predicates, which return \texttt{true} if and only if their input has a particular type, are just plain functions with an intersection type. We then define Boolean combinators as overloaded functions. The \texttt{not\_} combinator just tests whether its argument is the Boolean \texttt{true} by testing that it belongs to the singleton type \True. The \texttt{or\_} combinator (Code~5) is defined as a curryfied function, that first type cases on its argument. Depending on the type it may return either a constant function that returns \texttt{true} for any input, or, a function that discriminates on its argument (which is the second argument of the \texttt{or\_}) and returns \texttt{true} or \texttt{false} accordingly. Notice that we use here a generalized version of the \texttt{or\_} operation that accepts and treats any value that is not \texttt{true} as \texttt{false}. Again, we could restrict the initial domain to \Bool if we wanted. To showcase the power of our type system, and in particular the $\worra{}{}$'' type operation, we define \texttt{and\_} (Code~6) using De Morgan's Laws instead of using a direct definition. Here the application of the outer most \texttt{not\_} operator is checked against type \True. This allows the system to deduce that the whole \texttt{or\_} application has type \False, which in turn leads to \texttt{not\_ x} and \texttt{not\_ y} to have type $\lnot \True$ and therefore \texttt{x} and \texttt{y} to have type \True. The whole function is typed with the most precise type (we present the type as printed by our implementation, but the first arrow of the resulting type is equivalent to $(\True\to\lnot\True\to\False)\land(\True\to\True\to\True)$). All these type predicates and Boolean combinator can be used together to write complex type tests, as in Code~7. Here we define a function \texttt{f} that takes two arguments \texttt{x} and \texttt{y}. If \texttt{x} is an integer and \texttt{y} a Boolean, then it returns the integer \texttt{1}. Otherwise, if \texttt{x} is a character or \texttt{y} is an integer, it returns \texttt{2} and otherwise the function returns \texttt{3}. Our system correctly deduces a (complex) intersection types that covers all cases (plus several redundant arrow types). The fact that is type is precise can be shown by the fact that when applying \texttt{f} to arguments of the expected type, the deduced type for the whole expression is either the singleton type \texttt{1}, \texttt{2} or \texttt{3}, depending on the type of the arguments. Code~8 allows us to demonstrate the use and typing of record paths. We model, using open records the type of DOM objects that represent XML or HTML documents. Such objects possess a common field \texttt{nodeType} containing an integer constants denoting the kind of the node (\emph{e.g.} 9 for the root element, 1 for an element node and 3 for a text node). Depending on the kind, the object will have different fields and methods. It is common practice to perform a test on the value of the \texttt{nodeType} field. In dynamic languages such as JavaScript, the relevant field or method can directly be accessed after having checked for the appropriate \texttt{nodeType}. In mainstream statically typed languages, such as Java, downward cast from the generic \texttt{Node} type to the expected precise type of the object. We can see that using the extension presented in Section~\ref{ssec:struct} we can deduce the correct type for \texttt{x} in all cases. Of particular interest is the last case, since we use a type case to check the emptiness of the list of child nodes. This splits, at the type level, the case for the \Keyw{Element} type depending on whether the content of the \texttt{childNodes} field is the empty list or not. Our implementation features one last improvement that allows us further improve the precision of the inferred type. Consider the definition of the \texttt{xor\_} operator (Code~9). Here the rule~[{\sc AbsInf}+] is not sufficient to precisely type the function, and using only this rule would yield a type $\Any\to\Any\to\Bool$. Lets simulate the behaviour of the $\worra{}{}$'' operator. Here the whole \texttt{and\_} is requested to have type \True, which implies that \texttt{or\_ x y} must have type \True. This can always happen, whether \texttt{x} is \True or not (but then depends on the type of \texttt{y}). The $\worra{}{}$'' operator correctly computes that the type for \texttt{x} in the  \texttt{then}'' branch is $\True\vee\lnot\True\lor\True\equiv\Any$, and a similar reasoning holds for \texttt{y}. To solve this problem, we can first remark that even though type cases in the body of a function are the tipping points that may change the type of the result of the function; they are not the only ones. Indeed, Applications of overloaded functions plays a similar role. We therefore extend judgement $\Gamma \vdash e\triangleright\psi$ with the following rule\\ \centerline{$$\Infer[OverApp] { \Gamma \vdash e : \bigvee \bigwedge_{i \in I}t_i\to{}s_i\\ \Gamma \vdash x : t\\ \Gamma \vdash e\triangleright\psi_1\\ \Gamma \vdash x\triangleright\psi_2\\ } { \Gamma \vdash {e}{~x}\triangleright\psi_1\cup\psi_2\cup\bigcup_{i\in I}\{ x\mapsto t\wedge t_i\} } {}$$} With this rule, whenever a function parameter is applied to an overloaded function, we record as possible types for this parameter all the possible domains of the arrows that make the overloaded function, restricted to the static type of the parameter. In Code~9, since, \texttt{or\_} has type\\ \centerline{$(\True\to\Any\to\True)\land(\Any\to\True\to\True)\land (\lnot\True\to\lnot\True\to\False)$} We consider \True, \Any and $\lnot\True$ as candidate types for \texttt{x} which, in turn allows us to deduce a precise type.