Commit 94bdbdfe authored by Pierre Letouzey's avatar Pierre Letouzey
Browse files

td4: english version (plus a few extra details)

parent 8a7746f6
TD4 : Lists with Fast Random Access
**M2 LMFI**
We consider here data structures that are purely functional (also said *persistent* or *immutable*) and allow us to encode *lists*. We have already seen the Coq standard implementation of lists, but more generally a list is here a finite sequence of elements where the order of elements in the list is meaningful and where the operations "on the left" are efficient, both a `cons` extension operation and a `head` and a `tail` access functions.
Actually, instead of two separated `head` and `tail` functions, we will consider here a unique function `uncons` doing both. More precisely, `uncons l = Some (h,t)` whenever the list has a head `h` and a tail `t` and `uncons l = None` whenever `l` is empty.
We will also focus on a `nth` function for our lists, allowing us to perform "random" access anywhere in a liste, not just on the left.
The goal of this work is to implement lists in various manners, ending with `nth` functions of logarithmic complexity without compromising too much the cost of functions `cons` and `uncons` (and ideally keeping this cost constant). Here, the complexity we consider is the number of access to any part of any innner substructure, in the worst case, expressed in function of the number of elements present in the list. In a first time, we neglect the cost of any arithmetical operations we may perform on integers.
#### Exercise 1 : Implementation via regular Coq lists
Start a *module* to isolate the code of this exercise from the next ones :
Module RegularList.
Implement the following operations on the Coq usual `list` datatype, and give their complexity.
cons : forall {A}, A -> list A -> list A.
uncons : forall {A}, list A -> option (A * list A).
nth : forall {A}, list A -> nat -> option A.
Finish the previous module :
End RegularList.
#### Exercise 2 : Implementation via b-lists (a.k.a binary lists)
We will now devise a data-structure where both `cons`, `uncons` and `nth` operations will all be logarithmic (at worst).
We call *b-list* a (regular) list of perfect binary trees with the following properties : the datas are stored at the leaves of the trees, and the sizes of the trees are strictly increasing when going through the external list from left to right. The elements of a b-list are the elements of the leftmost tree (from left to right) then the elements of the next tree, until the elements of the rightmost tree.
- Start again a module dedicated to this exercise : `Module BList.`
- Define a Coq type `blist : Type -> Type` corresponding to b-list, i.e. list of binary trees with data at the leaves. No need to enforce in `blist` the other constraints (trees that are all perfect and are of strictly increasing sizes). But you should always maintain these invariants when programming with `blist`. And if you wish you may write later boolean tests checking whether a particular `blist` fulfills these invariants.
- Write some examples of small b-lists. In particular how is encoded the empty b-list ? Could we have two b-lists of different shape while containing the same elements ?
- (Optional) Adjust your `blist` definition in such a way that retrieving the sizes (or depth) of a tree inside a `blist` could be done without revisiting the whole tree.
- On this `blist` type, implement operations `cons`, `uncons` and `nth`. Check that all these operations are logarithmic (if you did the last suggested step).
- Finish the current module : `End BList.`
This b-list structure shows that a fast random access in a list-like structure is indeed possible. But this comes here at an extra cost : the operations "on the left" (`cons` and `uncons`) have a complexity that is not constant anymore. We'll see now how to get the best of the two worlds. But first, some arithmetical interlude : in the same way the b-lists were closely related with the binary decomposition of number, here comes an alternative decomposition that may help us in exercise 4.
#### Exercise 3 : Skew binary number system
We call skew binary decomposition (or sb-decomposition) of a number its decomposition as sum of numbers of the form 2^k -1 with k>0. Moreover all the terms in this sum mut be differents, except possibly the two smallest ones.
- Write a `decomp` function computing a sb-decomposition for any natural number. Could we have different ordered sb-decompositions of the same number ?
- Write two functions `next` and `pred` that both take the sb-decomposition of a number, and compute the sb-decomposition of its successor (resp. precedessor), without trying to convert back the number in a more standard representation, and moreover without using any recursivity (no `Fixpoint`) apart from a possible use of `Nat.eqb`.
For the last section, we admit that the sb-decomposition of a number `n` is a sum whose number of terms is logarithmic in `n`.
#### Exercise 4 : Implementation via sb-lists (skew binary lists)
- Based on all previous questions, propose a data-structure of *skew binary lists* for which `cons` and `uncons` have constant complexity while `nth` is logarithmic.
- Compare these sb-lists with the usual Coq lists : what reason may explain that sb-lists are not used universally instead of usual lists ?
#### Possible extensions
- For b-lists and sb-lists, code a function `drop` of logarithmic complexity such that `drop k l` returns the list `l` except its first `k` elements.
- For b-lists and sb-lists, code a function `update_nth` such that `update_nth l n a` is either `Some l'` when `l'` is `l` except for `a` at position `n`, or `None` when `n` is not a legal position in `l`.
- Use a binary representation of numbers instead of `nat`, and compute the complexity of all operations b-list and sb-list operations when taking in account the cost of the arithmetical sub-operations.
#### Reference
- <>
- Okasaki, Purely Functional Data Structures, 1998, Cambridge University Press
......@@ -47,13 +47,13 @@ On admet que la qb-écriture de `n` est une somme dont le nombre de termes est l
#### Exercice 4 : Implémentation par qb-listes
- En vous inspirant de ce qui précède, proposer une structure de *listes quasi-binaire* (ou qb-listes) pour laquelle `cons` et `decons` ont des complexités constantes tandis que `nth` est logarithmiques.
- En vous inspirant de ce qui précède, proposer une structure de *listes quasi-binaire* (ou qb-listes) pour laquelle `cons` et `decons` ont des complexités constantes tandis que `nth` est logarithmique.
- Comparer ces qb-listes avec les listes usuelles : quelles raisons peuvent expliquer la faible adoption des qb-listes en lieu et place des listes usuelles ?
#### Extensions possibles
- Coder sur les b-listes et les qb-listes une opération `drop` de complexité logarithmique, telle que `drop k l` retourne la b-liste `l` privée de ses `k` premiers arguments.
- Coder sur les b-listes et les qb-listes une opération `drop` de complexité logarithmique, telle que `drop k l` retourne la b-liste `l` privée de ses `k` premiers éléments.
- Coder sur les b-listes et les qb-listes une opération `update_nth` telle que `update_nth l n a` est une valeur optionnelle contenant une nouvelle liste où le n-ième élément a été changé en `a`, ou bien `None` si `n` n'est pas une position légale dans `l`.
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment