practical.tex 9.64 KB
 Giuseppe Castagna committed Feb 26, 2020 1 2 We have implemented the algorithmic system $\vdashA$. Our implementation is written in OCaml and uses CDuce as a library to  Giuseppe Castagna committed Jul 10, 2019 3 provide the semantic subtyping machinery. Besides a type-checking  Kim Nguyễn committed Jul 08, 2019 4 algorithm defined on the base language, our implementation supports  Giuseppe Castagna committed Feb 26, 2020 5 6 7 record types (Section \ref{ssec:struct}) and the refinement of function types (Section \ref{sec:refining} with the rule of Appendix~\ref{app:optimize}). The implementation is rather crude and  Giuseppe Castagna committed Jul 24, 2019 8 consists of 2000 lines of OCaml code, including parsing, type-checking  Giuseppe Castagna committed Feb 26, 2020 9 10 11 12 13 of programs, and pretty printing of types. We demonstrate the output of our type-checking implementation in Table~\ref{tab:implem} by listing some examples none of which can be typed by current systems (even though some system such as Flow and TypeScript can type some of them by adding explicit type annotation, the code 6,  Giuseppe Castagna committed Feb 26, 2020 14 7, 9, and 10 in Table~\ref{tab:implem} and, even more, the \code{and\_} and \code{xor\_} functions at the end of this  Giuseppe Castagna committed Feb 26, 2020 15 16 17 section are out of reach of current systems, even when using the right explicit annotations). These examples and others can be tested in the online toplevel available at  Giuseppe Castagna committed Jul 10, 2019 18 19 20 \url{https://occtyping.github.io/}% \ifsubmission ~(the corresponding repository is  Kim Nguyễn committed Jul 09, 2019 21 anonymized).  Giuseppe Castagna committed Jul 10, 2019 22 23 \else. \fi  Kim Nguyễn committed Jul 08, 2019 24 \input{code_table}  Giuseppe Castagna committed Jul 09, 2019 25 In this table, the second column gives a code fragment and the third  Kim Nguyễn committed Jul 08, 2019 26 column the type deduced by our implementation. Code~1 is a  Giuseppe Castagna committed Jul 09, 2019 27 28 straightforward function similar to our introductory example \code{foo} in (\ref{foo},\ref{foo2}). Here the programmer annotates the parameter of the function with a coarse type  Kim Nguyễn committed Jul 08, 2019 29 $\Int\vee\Bool$. Our implementation first type-checks the body of the  Kim Nguyễn committed Jul 08, 2019 30 function under this assumption, but doing so collects that the type of  Giuseppe Castagna committed Jul 09, 2019 31 32 33 $\texttt{x}$ is specialized to \Int{} in the then'' case and to \Bool{} in the else'' case. The function is thus type-checked twice more under each hypothesis for \texttt{x}, yielding the precise type  Giuseppe Castagna committed Feb 26, 2020 34 $(\Int\to\Int)\land(\Bool\to\Bool)$. Note that w.r.t.\ rule \Rule{AbsInf+} of Section~\ref{sec:refining}, our implementation improved the output of the computed  Giuseppe Castagna committed Jul 09, 2019 35 type. Indeed using rule~[{\sc AbsInf}+] we would obtain the  Kim Nguyễn committed Jul 08, 2019 36 37 type $(\Int\to\Int)\land(\Bool\to\Bool)\land(\Bool\vee\Int\to\Bool\vee\Int)$  Kim Nguyễn committed Jul 08, 2019 38 39 with a redundant arrow. Here we can see that since we deduced the first two arrows $(\Int\to\Int)\land(\Bool\to\Bool)$, and since  Kim Nguyễn committed Jul 09, 2019 40 the union of their domain exactly covers the domain of the third arrow,  Kim Nguyễn committed Jul 08, 2019 41 the latter is not needed. Code~2 shows what happens when the argument  Giuseppe Castagna committed Jul 09, 2019 42 of the function is left unannotated (i.e., it is annotated by the top  Giuseppe Castagna committed Jul 24, 2019 43 type \Any, written \texttt{Any}'' in our implementation). Here  Kim Nguyễn committed Jul 08, 2019 44 45 type-checking and refinement also work as expected, but the function only type checks if all cases for \texttt{x} are covered (which means  Giuseppe Castagna committed Jul 09, 2019 46 47 that the function must handle the case of inputs that are neither in \Int{} nor in \Bool).  Kim Nguyễn committed Jul 08, 2019 48 49 50  The following examples paint a more interesting picture. First (Code~3) it is  Giuseppe Castagna committed Jul 09, 2019 51 easy in our formalism to program type predicates such as those  Giuseppe Castagna committed Jul 09, 2019 52 hard-coded in the $\lambda_{\textit{TR}}$ language of \citet{THF10}. Such type  Kim Nguyễn committed Jul 08, 2019 53 54 predicates, which return \texttt{true} if and only if their input has a particular type, are just plain functions with an intersection  Giuseppe Castagna committed Jul 09, 2019 55 type inferred by the system of Section~\ref{sec:refining}. We next define Boolean connectives as overloaded  Giuseppe Castagna committed Jul 09, 2019 56 functions. The \texttt{not\_} connective (Code~4) just tests whether its  Kim Nguyễn committed Jul 08, 2019 57 argument is the Boolean \texttt{true} by testing that it belongs to  Kim Nguyễn committed Jul 09, 2019 58 59 60 61 62 63 64 65 66 the singleton type \True{} (the type whose only value is \texttt{true}) returning \texttt{false} for it and \texttt{true} for any other value (recall that $\neg\True$ is equivalent to $\texttt{Any\textbackslash}\True$). It works on values of any type, but we could restrict it to Boolean values by simply annotating the parameter by \Bool{} (which in CDuce is syntactic sugar for \True$\vee$\False) yielding the type $(\True{\to}\False)\wedge(\False{\to}\True)$. The \texttt{or\_} connective (Code~5) is straightforward as far as the  Giuseppe Castagna committed Jul 10, 2019 67 68 code goes, but we see that the overloaded type precisely captures all possible cases. Again we use a generalized version of the  Giuseppe Castagna committed Jul 09, 2019 69 \texttt{or\_} connective that accepts and treats any value that is not  Kim Nguyễn committed Jul 09, 2019 70 \texttt{true} as \texttt{false} and again, we could easily restrict the  Giuseppe Castagna committed Mar 02, 2020 71 72 domain to \Bool{} if desired.\\ \indent  Giuseppe Castagna committed Jul 09, 2019 73 To showcase the power of our type system, and in particular of  Kim Nguyễn committed Jul 08, 2019 74 the $\worra{}{}$''  Giuseppe Castagna committed Jul 09, 2019 75 type operator, we define \texttt{and\_} (Code~6) using De Morgan's  Kim Nguyễn committed Jul 08, 2019 76 Laws instead of  Giuseppe Castagna committed Jul 09, 2019 77 using a direct definition. Here the application of the outermost \texttt{not\_} operator is checked against type \True. This  Kim Nguyễn committed Jul 08, 2019 78 allows the system to deduce that the whole \texttt{or\_} application  Giuseppe Castagna committed Jul 09, 2019 79 has type \False, which in turn leads to \texttt{not\_\;x} and  Victor Lanvin committed Jul 10, 2019 80 \texttt{not\_\;y} to have type $\lnot \True$ and therefore both \texttt{x}  Kim Nguyễn committed Jul 08, 2019 81 82 83 84 85 86 and \texttt{y} to have type \True. The whole function is typed with the most precise type (we present the type as printed by our implementation, but the first arrow of the resulting type is equivalent to $(\True\to\lnot\True\to\False)\land(\True\to\True\to\True)$).  Giuseppe Castagna committed Jul 09, 2019 87 All these type predicates and Boolean connectives can be used together  Kim Nguyễn committed Jul 08, 2019 88 89 90 to write complex type tests, as in Code~7. Here we define a function \texttt{f} that takes two arguments \texttt{x} and \texttt{y}. If \texttt{x} is an integer and \texttt{y} a Boolean, then it returns the  Giuseppe Castagna committed Jul 09, 2019 91 92 integer \texttt{1}; if \texttt{x} is a character or \texttt{y} is an integer, then it returns \texttt{2}; otherwise the  Kim Nguyễn committed Jul 08, 2019 93 function returns \texttt{3}. Our system correctly deduces a (complex)  Giuseppe Castagna committed Jul 09, 2019 94 95 intersection type that covers all cases (plus several redundant arrow types). That this type is as precise as possible can be shown by the fact that  Kim Nguyễn committed Jul 08, 2019 96 when applying  Giuseppe Castagna committed Jul 09, 2019 97 98 \texttt{f} to arguments of the expected type, the \emph{type} deduced for the whole expression is the singleton type \texttt{1}, or \texttt{2},  Kim Nguyễn committed Jul 08, 2019 99 100 101 or \texttt{3}, depending on the type of the arguments. Code~8 allows us to demonstrate the use and typing of record paths. We  Giuseppe Castagna committed Jul 09, 2019 102 model, using open records, the type of DOM objects that represent XML  Kim Nguyễn committed Jul 08, 2019 103 or HTML documents. Such objects possess a common field  Giuseppe Castagna committed Jul 09, 2019 104 105 \texttt{nodeType} containing an integer constant denoting the kind of the node (e.g., \p{9} for the root element, \p{1} for an element node, \p{3} for a text node, \ldots). Depending on the kind, the object will have  Kim Nguyễn committed Jul 08, 2019 106 107 108 109 different fields and methods. It is common practice to perform a test on the value of the \texttt{nodeType} field. In dynamic languages such as JavaScript, the relevant field or method can directly be accessed after having checked for the appropriate \texttt{nodeType}. In  Giuseppe Castagna committed Jul 09, 2019 110 mainstream statically typed languages, such as Java, a downward cast  Kim Nguyễn committed Jul 08, 2019 111 from the generic \texttt{Node} type to the expected precise type of  Giuseppe Castagna committed Jul 09, 2019 112 the object is needed. We can see that using the extension presented in  Kim Nguyễn committed Jul 08, 2019 113 114 115 116 117 118 119 Section~\ref{ssec:struct} we can deduce the correct type for \texttt{x} in all cases. Of particular interest is the last case, since we use a type case to check the emptiness of the list of child nodes. This splits, at the type level, the case for the \Keyw{Element} type depending on whether the content of the \texttt{childNodes} field is the empty list or not.  Giuseppe Castagna committed Feb 26, 2020 120 Code~9 shows the usefulness of the rule \Rule{OverApp}.  Kim Nguyễn committed Feb 28, 2020 121 Consider the definition of the \texttt{xor\_} operator.  Kim Nguyễn committed Jul 08, 2019 122 123 Here the rule~[{\sc AbsInf}+] is not sufficient to precisely type the function, and using only this rule would yield a type  Giuseppe Castagna committed Feb 26, 2020 124 125 126  $\Any\to\Any\to\Bool$. \iflongversion Let us follow the behavior of the  Kim Nguyễn committed Jul 08, 2019 127 128  $\worra{}{}$'' operator. Here the whole \texttt{and\_} is requested to have type \True, which implies that \texttt{or\_ x y} must have  Giuseppe Castagna committed Jul 09, 2019 129  type \True. This can always happen, whether \texttt{x} is \True{} or  Kim Nguyễn committed Jul 08, 2019 130 131  not (but then depends on the type of \texttt{y}). The $\worra{}{}$'' operator correctly computes that the type for \texttt{x} in the  Giuseppe Castagna committed Jul 09, 2019 132  \texttt{then}'' branch is $\True\vee\lnot\True\lor\True\simeq\Any$,  Giuseppe Castagna committed Feb 26, 2020 133 134 and a similar reasoning holds for \texttt{y}. \fi%%%%%%%%%%%%%%  Kim Nguyễn committed Feb 28, 2020 135 However, since \texttt{or\_} has type  Giuseppe Castagna committed Feb 26, 2020 136 137 138 139 140 141 %\\[.7mm]\centerline{% $(\True\to\Any\to\True)\land(\Any\to\True\to\True)\land (\lnot\True\to\lnot\True\to\False)$ %}\\[.7mm] then the rule \Rule{OverApp} applies and \True, \Any, and $\lnot\True$ become candidate types for \texttt{x}, which allows us to deduce the precise type given in the table. Finally, thanks to rule \Rule{OverApp} it is not necessary to use a type case to force refinement. As a consequence we can define the functions \texttt{and\_} and \texttt{xor\_} more naturally as:  Giuseppe Castagna committed Jul 10, 2019 142 \begin{alltt}\color{darkblue}\morecompact  Giuseppe Castagna committed Jul 10, 2019 143 144  let and_ = fun (x : Any) -> fun (y : Any) -> not_ (or_ (not_ x) (not_ y)) let xor_ = fun (x : Any) -> fun (y : Any) -> and_ (or_ x y) (not_ (and_ x y))  Giuseppe Castagna committed Jul 09, 2019 145 146 \end{alltt} for which the very same types as in Table~\ref{tab:implem} are deduced.  Kim Nguyễn committed Jul 11, 2019 147   Giuseppe Castagna committed Jul 11, 2019 148 149 150 Last but not least Code~10 (corresponding to our introductory example~\eqref{nest1}) illustrates the need for iterative refinement of type environments, as defined in Section~\ref{sec:typenv}. As  Giuseppe Castagna committed Jul 11, 2019 151 explained, a single pass analysis would deduce  Kim Nguyễn committed Jul 11, 2019 152 for {\tt x}  Giuseppe Castagna committed Jul 11, 2019 153 a type \Int{} from the {\tt f\;x} application and \Any{} from the {\tt g\;x}  Kim Nguyễn committed Jul 11, 2019 154 application. Here by iterating a second time, the algorithm deduces  Giuseppe Castagna committed Jul 11, 2019 155 that {\tt x} has type $\Empty$ (i.e., $\textsf{Empty}$), that is that the first branch can never  Giuseppe Castagna committed Jul 11, 2019 156 be selected (and our implementation warns the user accordingly). In hindsight, the only way for a well-typed overloaded function to have  Giuseppe Castagna committed Jul 11, 2019 157 type $(\Int{\to}\Int)\land(\Any{\to}\Bool)$ is to diverge when the  Giuseppe Castagna committed Jul 11, 2019 158 argument is of type \Int: since this intersection type states that  Kim Nguyễn committed Jul 11, 2019 159 160 161 162 163 whenever the input is \Int, {\em both\/} branches can be selected, yielding a result that is at the same time an integer and a Boolean. This is precisely reflected by the case $\Int\to\Empty$ in the result. Indeed our {\tt example10} function can be applied to an integer, but at runtime the application of {\tt f ~x} will diverge.