Commit 3a69bccd authored by Pietro Abate

[r2003-07-08 09:23:43 by cvscast] Empty log message

Original author: cvscast
Date: 2003-07-08 09:23:44+00:00
parent e02861b7
@@ -7,6 +7,7 @@
<include file="tutorial/getting_started.xml"/>
<include file="tutorial/first_functions.xml"/>
<include file="tutorial/overloading.xml"/>
<include file="tutorial/patterns.xml"/>
@@ -74,7 +74,7 @@ code.
<section title="Complex example">
<section title="A more complex example">
<a name="canonical"/>
Let us examine a more complex example. Recall the types used to represent persons
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<page name="tutorial_patterns">
<title>Patterns for dummies</title>
<banner>Patterns for dummies</banner>
<box title="Key concepts" link="p1">
<b style="color:#FF0080">TO BE DONE</b>
<box title="Recursive patterns" link="pr">
Recursive patterns use the same syntax as recursive types:
<code>%%P%% where %%P1%%=%%p1%% and ... and %%Pn%%=%%pn%%</code> with <i>P, P1,..., Pn</i>
being variables ranging over pattern identifiers (i.e.,
identifiers starting with a capital letter). Recursive
patterns allow one to express complex extraction of information from
the matched value. For instance, consider the pattern
<code>P where P = (x &amp; Int, _) | (_, P)</code>; it extracts from a sequence the first
element of type <code>Int</code> (recall that sequences are
encoded with pairs). The order of the alternatives
matters, because
the pattern <code>P where P = (_, P) | (x &amp; Int, _)</code>
extracts the <i>last</i> element of type <code>Int</code>.
A pattern may also extract and reconstruct a subsequence,
using the convention, described earlier, that when a capture variable appears
on both sides of a pair pattern, the two values bound
to this variable are paired together.
For instance, <code>P where P = (x &amp; Int, P) | (_, P) | (x := `nil)</code>
extracts all the elements of type <code>Int</code> from a sequence (<code>x</code>
is bound to the sequence containing them),
and the pattern <code>P where P = (x &amp; Int, (x &amp; Int, _)) | (_, P)</code>
extracts the first pair of consecutive integers.
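The four recursive patterns above can be modelled outside CDuce. The following Python sketch is only an illustration, not CDuce's actual matching algorithm: it assumes sequences are right-nested pairs ending in a marker `NIL` (a stand-in for <code>`nil</code>), uses `None` to signal that no alternative matched, and tries alternatives in the order they are written. All function names are hypothetical.

```python
NIL = "nil"  # stand-in for CDuce's `nil atom (an assumption of this sketch)


def encode(items):
    """Encode a Python list as right-nested pairs: [1, 2] -> (1, (2, NIL))."""
    seq = NIL
    for item in reversed(items):
        seq = (item, seq)
    return seq


def is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, int) and not isinstance(value, bool)


def first_int(seq):
    """Models P where P = (x & Int, _) | (_, P)."""
    if not isinstance(seq, tuple):
        return None                    # no alternative matches a non-pair
    head, tail = seq
    if is_int(head):                   # first alternative: (x & Int, _)
        return head
    return first_int(tail)             # second alternative: (_, P)


def last_int(seq):
    """Models P where P = (_, P) | (x & Int, _)."""
    if not isinstance(seq, tuple):
        return None
    head, tail = seq
    found = last_int(tail)             # first alternative: (_, P)
    if found is not None:
        return found
    return head if is_int(head) else None  # second alternative: (x & Int, _)


def all_ints(seq):
    """Models P where P = (x & Int, P) | (_, P) | (x := `nil).

    Assumes seq is a well-formed sequence (nested pairs ending in NIL).
    """
    if seq == NIL:
        return NIL                     # third alternative: x := `nil
    head, tail = seq
    rest = all_ints(tail)
    # x occurs on both sides of the pair, so the two bindings are paired
    return (head, rest) if is_int(head) else rest


def first_consecutive_ints(seq):
    """Models P where P = (x & Int, (x & Int, _)) | (_, P)."""
    if not isinstance(seq, tuple):
        return None
    head, tail = seq
    if is_int(head) and isinstance(tail, tuple) and is_int(tail[0]):
        return (head, tail[0])         # the two bindings of x, paired
    return first_consecutive_ints(tail)
```

Swapping the two branches in `first_int` gives `last_int`, which mirrors how swapping the alternatives of the pattern changes which integer is captured.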
<box title="Regular expression patterns" link="pre">
CDuce provides syntactic sugar for defining patterns working on
sequences with regular expressions built from patterns, usual regular
expression operators, and <i>sequence capture variables</i> of the form <code>x::%%R%%</code>
(where <i>R</i> is a pattern regular expression).
Regular expression operators <code>*</code>, <code>+</code>, <code>?</code> are
<i>greedy</i> in the sense that they try to match as many times as possible.
Ungreedy versions <code>*?</code>, <code>+?</code> and <code>??</code>
are also provided; the only difference in the compilation scheme
is the order of the alternatives in the generated pattern.
For instance, <code>[_* (x &amp; Int) _*]</code> is compiled
to <code>P where P = (_,P) | (x &amp; Int, _)</code>
while <code>[_*? (x &amp; Int) _*]</code> is compiled
to <code>P where P = (x &amp; Int, _) | (_,P)</code>.
Let us detail the compilation of an example with a sequence capture variable:
[ _*? d::(Echar+ '.' Echar+) ]
The first step is
to propagate the variable down to simple patterns:
[ _*? (d::Echar)+ (d::'.') (d::Echar)+ ]
which is then
compiled to the recursive pattern:
P where P = (d & Echar, Q) | (_,P)
and Q = (d & Echar, Q) | (d & '.', (d & Echar, R))
and R = (d & Echar, R) | (d & `nil)
The <code>(d &amp; `nil)</code>
pattern above has a double purpose: it checks that the end
of the matched sequence has been reached, and it binds <code>d</code> to
<code>`nil</code>, to create the end of the new sequence.
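The compiled pattern <i>P</i>, <i>Q</i>, <i>R</i> above can be read as three mutually recursive matchers. The Python sketch below is only an illustration of that reading, under several assumptions: the input is a string indexed by position rather than a pair-encoded sequence, `Echar` is taken to be letters, digits and underscore (its real definition is not shown in this section), and `None` means that no alternative matched. The function names are hypothetical.

```python
import string

# Assumed definition of Echar; the tutorial defines the real one elsewhere.
ECHAR = set(string.ascii_letters + string.digits + "_")


def match_p(chars, i=0):
    """P = (d & Echar, Q) | (_, P)"""
    if i >= len(chars):
        return None
    if chars[i] in ECHAR:                  # first alternative
        rest = match_q(chars, i + 1)
        if rest is not None:
            return [chars[i]] + rest       # d collects the char plus Q's part
    return match_p(chars, i + 1)           # second alternative: skip one item


def match_q(chars, i):
    """Q = (d & Echar, Q) | (d & '.', (d & Echar, R))"""
    if i >= len(chars):
        return None
    if chars[i] in ECHAR:                  # first alternative
        rest = match_q(chars, i + 1)
        if rest is not None:
            return [chars[i]] + rest
    if chars[i] == "." and i + 1 < len(chars) and chars[i + 1] in ECHAR:
        rest = match_r(chars, i + 2)       # second alternative
        if rest is not None:
            return [".", chars[i + 1]] + rest
    return None


def match_r(chars, i):
    """R = (d & Echar, R) | (d & `nil)"""
    if i < len(chars) and chars[i] in ECHAR:   # first alternative
        rest = match_r(chars, i + 1)
        if rest is not None:
            return [chars[i]] + rest
    if i == len(chars):
        return []                          # d & `nil: end reached, d ends here
    return None


def capture_d(text):
    """Return the text captured by d, or None if the pattern does not match."""
    captured = match_p(list(text))
    return None if captured is None else "".join(captured)
```

With these assumptions, `capture_d("joe@example.org")` yields `"example.org"`: the matcher for <i>P</i> first tries to start the capture at `j`, fails when <i>Q</i> meets `@`, and keeps skipping items until a start position lets <i>Q</i> and <i>R</i> consume the rest of the input.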
Note the difference between <code>[ x&amp;Int ]</code>
and <code>[ x::Int ]</code>. Both patterns accept sequences
formed of a single integer <code>{{i}}</code>, but the first one binds <code>x</code> to <code>{{i}}</code>,
whereas the second one binds <code>x</code> to the sequence <code>[{{i}}]</code>.
Mixing greedy and ungreedy operators with the first-match policy of alternative
patterns allows the definition of powerful extractions. For instance, one can
define a function that, for a given person, returns the first work phone number if
any, otherwise the last e-mail if any, otherwise any telephone number, or else the
string <code>"no contact"</code>:
let preferred_contact(Person->String)
<_>[ _ _ ( _*? <tel kind="work">x) | (_* <email>x) | <tel>x ] -> x
| _ -> "no contact"
(note that <code>&lt;tel>x</code> does not need to be preceded by any wildcard pattern as it is
the only possible remaining case).
\ No newline at end of file