Commit 3a69bccd authored by Pietro Abate

[r2003-07-08 09:23:43 by cvscast] Empty log message

Original author: cvscast
Date: 2003-07-08 09:23:44+00:00
parent e02861b7
......@@ -7,6 +7,7 @@
<include file="tutorial/getting_started.xml"/>
<include file="tutorial/first_functions.xml"/>
<include file="tutorial/overloading.xml"/>
<include file="tutorial/patterns.xml"/>
<left>
<p>
......
......@@ -74,7 +74,7 @@ code.
</section>
<section title="Complex example">
<section title="A more complex example">
<a name="canonical"/>
<p>
Let us examine a more complex example. Recall the types used to represent persons
......
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<page name="tutorial_patterns">
<title>Patterns for dummies</title>
<banner>Patterns for dummies</banner>
<left>
<boxes-toc/>
</left>
<box title="Key concepts" link="p1">
<b style="color:#FF0080">TO BE DONE</b>
</box>
<box title="Recursive patterns" link="pr">
<p>
Recursive patterns use the same syntax as recursive types:
<code>%%P%% where %%P1%%=%%p1%% and ... and %%Pn%%=%%pn%%</code> with <i>P, P1,..., Pn</i>
being variables ranging over pattern identifiers (i.e.,
identifiers starting with a capital letter). Recursive
patterns allow one to express complex extraction of information from
the matched value. For instance, consider the pattern
<code>P where P = (x &amp; Int, _) | (_, P)</code>; it extracts from a sequence the first
element of type <code>Int</code> (recall that sequences are
encoded with pairs). The order
is important, because
the pattern <code>P where P = (_, P) | (x &amp; Int, _)</code>
extracts the <i>last</i> element of type <code>Int</code>.
</p>
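<p>
The effect of the order of alternatives can be illustrated outside CDuce.
The following is only a sketch in Python, not CDuce code: it assumes that
sequences are modelled as nested pairs ending in a <code>NIL</code> marker,
and the names <code>encode</code>, <code>first_int</code> and
<code>last_int</code> are hypothetical. Each function tries the alternatives
of the corresponding pattern from left to right and returns <code>None</code>
when no alternative matches.
</p>
<sample><![CDATA[
```python
NIL = None  # stand-in for the nil atom (an assumption of this model)


def encode(items):
    """Encode a Python list as nested pairs: [1, 2] becomes (1, (2, NIL))."""
    seq = NIL
    for item in reversed(items):
        seq = (item, seq)
    return seq


def is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def first_int(seq):
    """Model of: P where P = (x & Int, _) | (_, P)."""
    if seq is NIL:
        return None               # neither alternative matches nil
    head, tail = seq
    if is_int(head):              # alternative 1: (x & Int, _)
        return head
    return first_int(tail)        # alternative 2: (_, P)


def last_int(seq):
    """Model of: P where P = (_, P) | (x & Int, _)."""
    if seq is NIL:
        return None
    head, tail = seq
    found = last_int(tail)        # alternative 1: (_, P)
    if found is not None:
        return found
    return head if is_int(head) else None   # alternative 2: (x & Int, _)
```
]]></sample>
<p>
In this model, <code>first_int</code> stops at the first integer because its
first alternative succeeds there, whereas <code>last_int</code> only falls
back to its second alternative once the tail contains no further integer.
</p>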
<p>
A pattern may also extract and reconstruct a subsequence.
This relies on the convention described earlier: when a capture variable appears
on both sides of a pair pattern, the two values bound
to this variable are paired together.
For instance, <code>P where P = (x &amp; Int, P) | (_, P) | (x := `nil)</code>
extracts all the elements of type <code>Int</code> from a sequence (<code>x</code>
is bound to the sequence containing them),
while the pattern <code>P where P = (x &amp; Int, (x &amp; Int, _)) | (_, P)</code>
extracts the first pair of consecutive integers.
</p>
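<p>
The pairing convention can be sketched in the same spirit. This is again a
Python model rather than CDuce code, under the assumption that sequences are
nested pairs ending in a <code>NIL</code> marker; the helper names
<code>encode</code>, <code>decode</code> and <code>all_ints</code> are
hypothetical. The value returned by <code>all_ints</code> plays the role of
the binding of <code>x</code>.
</p>
<sample><![CDATA[
```python
NIL = None  # stand-in for the nil atom (an assumption of this model)


def encode(items):
    """Encode a Python list as nested pairs ending in NIL."""
    seq = NIL
    for item in reversed(items):
        seq = (item, seq)
    return seq


def decode(seq):
    """Turn nested pairs back into a Python list."""
    items = []
    while seq is not NIL:
        head, seq = seq
        items.append(head)
    return items


def all_ints(seq):
    """Model of: P where P = (x & Int, P) | (_, P) | (x := nil)."""
    if seq is NIL:
        return NIL                     # alternative 3: x := nil
    head, tail = seq
    rest = all_ints(tail)              # binding of x produced by P on the tail
    if isinstance(head, int) and not isinstance(head, bool):
        return (head, rest)            # alternative 1: the two bindings of x are paired
    return rest                        # alternative 2: (_, P) keeps the tail's binding
```
]]></sample>
<p>
On a well-formed sequence the third alternative is only reached at the end,
where it supplies the terminator of the reconstructed sequence.
</p>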
</box>
<box title="Regular expression patterns" link="pre">
<p>
CDuce provides syntactic sugar for defining patterns working on
sequences with regular expressions built from patterns, usual regular
expression operators, and <i>sequence capture variables</i> of the form <code>x::%%R%%</code>
(where <i>R</i> is a pattern regular expression).
</p>
<p>
Regular expression operators <code>*</code>, <code>+</code>, <code>?</code> are
<i>greedy</i> in the sense that they try to match as many times as possible.
Ungreedy versions <code>*?</code>, <code>+?</code> and <code>??</code>
are also provided; the difference in the compilation scheme
is just a matter of order in alternative patterns.
For instance, <code>[_* (x &amp; Int) _*]</code> is compiled
to <code>P where P = (_,P) | (x &amp; Int, _)</code>
while <code>[_*? (x &amp; Int) _*]</code> is compiled
to <code>P where P = (x &amp; Int, _) | (_,P)</code>.
</p>
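<p>
A similar contrast exists in ordinary string regular expressions. As an
analogy only (this is Python's <code>re</code> module applied to a string,
not CDuce, and digits stand in for values of type <code>Int</code>), the
following sketch shows a greedy prefix selecting the last candidate and an
ungreedy prefix selecting the first one.
</p>
<sample><![CDATA[
```python
import re

text = "a1b2c3"

# Greedy prefix: .* consumes as much as it can, so the group captures the last digit.
greedy = re.match(r".*(\d).*", text).group(1)

# Ungreedy prefix: .*? consumes as little as it can, so the group captures the first digit.
lazy = re.match(r".*?(\d).*", text).group(1)

print(greedy, lazy)
```
]]></sample>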
<p>
Let us detail the compilation of an example with a sequence capture variable:
</p>
<sample><![CDATA[
[ _*? d::(Echar+ '.' Echar+) ]
]]></sample>
<p>
The first step is
to propagate the variable down to simple patterns:
</p>
<sample><![CDATA[
[ _*? (d::Echar)+ (d::'.') (d::Echar)+ ]
]]></sample>
<p>
which is then
compiled to the recursive pattern:
</p>
<sample><![CDATA[
P where P = (d & Echar, Q) | (_,P)
and Q = (d & Echar, Q) | (d & '.', (d & Echar, R))
and R = (d & Echar, R) | (d & `nil)
]]></sample>
<p>
The <code>(d &amp; `nil)</code>
pattern above has a double purpose: it checks that the end
of the matched sequence has been reached, and it binds <code>d</code> to
<code>`nil</code>, to create the end of the new sequence.
</p>
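<p>
To make the role of each equation concrete, the recursive pattern can be
modelled as three mutually recursive functions. This is a Python sketch under
stated assumptions, not the actual compilation output: the input is a Python
string, <code>Echar</code> is modelled as "letter or digit" (its real
definition is not given here), and the names <code>match_p</code>,
<code>match_q</code> and <code>match_r</code> are hypothetical. Each function
returns the value captured by <code>d</code>, or <code>None</code> when the
pattern fails.
</p>
<sample><![CDATA[
```python
def is_echar(ch):
    # Assumption: Echar is modelled as "letter or digit".
    return ch.isalnum()


def match_p(s):
    """Model of: P = (d & Echar, Q) | (_, P)"""
    if not s:
        return None
    if is_echar(s[0]):                 # alternative 1: start the capture here
        rest = match_q(s[1:])
        if rest is not None:
            return s[0] + rest
    return match_p(s[1:])              # alternative 2: skip one element


def match_q(s):
    """Model of: Q = (d & Echar, Q) | (d & '.', (d & Echar, R))"""
    if not s:
        return None
    if is_echar(s[0]):                 # alternative 1: one more Echar before the dot
        rest = match_q(s[1:])
        if rest is not None:
            return s[0] + rest
    if s[0] == "." and len(s) >= 2 and is_echar(s[1]):   # alternative 2
        rest = match_r(s[2:])
        if rest is not None:
            return s[:2] + rest
    return None


def match_r(s):
    """Model of: R = (d & Echar, R) | (d & nil)"""
    if not s:
        return ""                      # d & nil: end reached, the capture is closed
    if is_echar(s[0]):                 # alternative 1: one more Echar after the dot
        rest = match_r(s[1:])
        if rest is not None:
            return s[0] + rest
    return None
```
]]></sample>
<p>
In this model, <code>match_r</code> succeeds on the empty remainder only,
which mirrors the end-of-sequence check performed by the last alternative of
<code>R</code>.
</p>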
<p>
Note the difference between <code>[ x&amp;Int ]</code>
and <code>[ x::Int ]</code>. Both patterns accept sequences
formed of a single integer <code>{{i}}</code>, but the first one binds <code>x</code> to <code>{{i}}</code>,
whereas the second one binds <code>x</code> to the sequence <code>[{{i}}]</code>.
</p>
<p>
A mix of greedy and ungreedy operators, combined with the first-match policy of alternative
patterns, allows the definition of powerful extractions. For instance, one can
define a function that for a given person returns the first work phone number if
any, otherwise the last e-mail, if any, otherwise any telephone number, or the
string <code>"no contact"</code>:
</p>
<sample><![CDATA[
let preferred_contact(Person->String)
<_>[ _ _ ( _*? <tel kind="work">x) | (_* <email>x) | <tel>x ] -> x
| _ -> "no contact"
]]></sample>
<p>
(note that <code>&lt;tel>x</code> does not need to be preceded by any wildcard pattern as it is
the only possible remaining case).
</p>
</box>
</page>
\ No newline at end of file