Commit bd2403d6 authored by Pietro Abate

[r2005-04-28 17:03:13 by cmiachon] finishing the select from where part.

Original author: cmiachon
Date: 2005-04-28 17:03:13+00:00
parent b114f659
......@@ -21,13 +21,13 @@ TUTORIAL
- finish the sections on patterns and references
- rewrite section on queries
- DONE: rewrite section on queries
ALL
===
(PDF) Add page numbers
DONE: (PDF) Add page numbers
(PDF) Add table of contents
DONE: (PDF) Add table of contents
......@@ -156,12 +156,13 @@ val titles : [ title* ] = [ <title>[ 'TCP/IP Illustrated' ]
</li>
<li>All authors in the bibliography biblio
<sample><![CDATA[
let authors = [biblio]/book/author
let authors = [biblio]/book/<author>_
]]>
</sample>
<p> Yielding the result: </p>
<sample><![CDATA[
val authors : [ author* ] = [ <author>[ <last>[ 'Stevens' ] <first>[ 'W.' ] ]
val authors : [ <author>[ last first ]* ] =
[ <author>[ <last>[ 'Stevens' ] <first>[ 'W.' ] ]
<author>[ <last>[ 'Stevens' ] <first>[ 'W.' ] ]
<author>[
<last>[ 'Abiteboul' ]
......@@ -173,6 +174,12 @@ val authors : [ author* ] = [ <author>[ <last>[ 'Stevens' ] <first>[ 'W.' ] ]
]
]]>
</sample>
<p>Note the difference between these two projections.<br/>
In the first one, we use the previously defined type title (<![CDATA[type title = <title>[PCDATA ]]]>
).<br/>
In the second one, the type <![CDATA[<author>_]]> denotes all the XML fragments whose tag is author ( _ means Any)
and which carry no attributes. To allow attributes as well, we would write <![CDATA[<author ..>_]]> instead.
</p>
</li>
<li> All books having an editor in the bibliography biblio
<sample><![CDATA[
......@@ -306,7 +313,8 @@ val freebooks : [ <price>[ '0' ]* ] = ""
]]>
</sample>
<p>
this projection returns the empty sequence (<code>""</code>)</p>
There are no free books in this bibliography, but this is not reflected in the type of biblio.
Hence, this projection returns the empty sequence (<code>""</code>)</p>
</section>
......@@ -316,35 +324,32 @@ this projection returns the empty sequence (<code>""</code>)</p>
<p> All the titles </p>
<sample><![CDATA[
let tquery = select y
from x in [bib]/<paper>_ ,
y in [x]/<title>_
from x in [biblio]/book ,
y in [x]/title
]]>
</sample>
<p> This query is written in an XQuery-like style that relies largely on projections. Note that <code>x</code> and <code>y</code> are CDuce patterns. The result is:</p>
<sample> <![CDATA[
val tquery : [ <title>[ Char* ]* ] = [ <title>[ 'Semantic subtyping' ]
<title>[ 'The Relevance of Semantic Subtyping' ]
<title>[ 'CDuce: a white-paper' ]
]
val tquery : [ title* ] = [ <title>[ 'TCP/IP Illustrated' ]
<title>[ 'Advanced Programming in the Unix environment' ]
<title>[ 'Data on the Web' ]
<title>[ 'The Economics of Technology and Content for Digital TV' ]
]
]]>
</sample>
<p> Now let's program the same query with the translation given previously thus eliminating the <code>y</code> variable </p>
<sample><![CDATA[
let withouty = flatten(select [x] from x in [bib]/<paper>_/<title>_)
let withouty = flatten(select [x] from x in [biblio]/book/title)
]]>
</sample>
<p> Yielding: </p>
<sample><![CDATA[
val withouty : [ <title>[ Char* ]* ] = [ <title>[ 'Semantic subtyping' ]
<title>[ 'The Relevance of Semantic Subtyping' ]
<title>[ 'CDuce: a white-paper' ]
]
- : [ <title>[ Char* ]* ] = [ <title>[ 'The Relevance of Semantic Subtyping' ] ]
- : [ <title>[ Char* ]* ] = [ <title>[ 'The Relevance of Semantic Subtyping' ] ]
Ok.
val withouty : [ title* ] = [ <title>[ 'TCP/IP Illustrated' ]
<title>[ 'Advanced Programming in the Unix environment' ]
<title>[ 'Data on the Web' ]
<title>[ 'The Economics of Technology and Content for Digital TV' ]
]
]]>
</sample>
......@@ -353,32 +358,26 @@ Ok.
<p>
But the <code>select_from_where</code> expressions are likely to be used for
more complex queries, such as the one that selects all titles for which at least one
author is "Alain Frisch" or "Veronique Benzaken"
author is "Peter Buneman" or "Dan Suciu"
</p>
<sample><![CDATA[
let sel = select y
from x in [bib]/<paper>_ ,
y in [x]/<title>_,
z in [x]/<author>_
where ((z = <author>"Alain Frisch") || (z = <author>"Veronique Benzaken"));;
from x in [biblio]/book ,
y in [x]/title,
z in [x]/author
where ( (z = <author>[<last>['Buneman']<first>['Peter']])
|| (z = <author>[<last>['Suciu'] <first>['Dan']]) )
]]>
</sample>
<p> Which yields: </p>
<sample><![CDATA[
val sel : [ <title>[ Char* ]* ] = [ <title>[ 'Semantic subtyping' ]
<title>[ 'Semantic subtyping' ]
<title>[ 'The Relevance of Semantic Subtyping' ]
<title>[ 'CDuce: a white-paper' ]
<title>[ 'CDuce: a white-paper' ]
]
Ok.
val sel : [ title* ] = [ <title>[ 'Data on the Web' ]
<title>[ 'Data on the Web' ]
]
]]>
</sample>
<p>Note that the corresponding semantics, as in SQL, is a multiset one.
Thus duplicates are not eliminated. To discard them, one has to use the <code>distinct_values</code> operator.
</p>
......@@ -393,25 +392,29 @@ any XPath-like projections)
<sample><![CDATA[
let sel = select t
from <_ ..>[(x::<paper>_ | _ )*] in [bib],
<_ ..>[ _* (<author>"Alain Frisch" | <author>"Veronique Benzaken") _* (t&<title>_ ); _] in x
from <_ ..>[(x::book| _ )*] in [biblio],
<_ ..>[ t&title _* (<author>[<last>['Buneman']<first>['Peter']]| <author>[<last>['Suciu'] <first>['Dan']]) _* ; _] in x
]]>
</sample>
<p>
Note the pattern on the second line in the <code> from </code> clause. As the type of an element in <code>x</code> is <code><![CDATA[<paper>[ Author+ Title Conference File]]]></code>, we skip the tag : <code><![CDATA[<_>]]></code>, then we skip authors <code><![CDATA[_*]]> </code> until we find either Alain Frisch or Veronique Benzaken <code><![CDATA[ (<author>"Alain Frisch" | <author>"Veronique Benzaken")]]></code>, then we skip the remaining authors <code>_*</code>, we then capture the corresponding title <code><![CDATA[(t &<title>_)]]></code> and then ignore the tail of the sequence by writing <code>; _</code>
Note the pattern on the second line in the <code> from </code> clause.
As the type of an element in <code>x</code> is
<code><![CDATA[type book = <book year=String>[title (author+ | editor+ ) publisher price ]]]></code>,
we skip the tag with
<code><![CDATA[<_ ..>]]></code>,
capture the corresponding title with <code><![CDATA[(t &title)]]></code>,
skip authors with <code><![CDATA[_*]]> </code>
until we find either Peter Buneman or Dan Suciu
<code><![CDATA[ (<author>[<last>['Buneman']<first>['Peter']]| <author>[<last>['Suciu'] <first>['Dan']])]]></code>,
skip the remaining authors with <code>_*</code>,
and finally ignore the tail of the sequence by writing <code>; _</code>.
</p>
<p>
Result:
</p>
<sample><![CDATA[
val sel : [ <title>[ Char* ]* ] = [ <title>[ 'Semantic subtyping' ]
<title>[ 'The Relevance of Semantic Subtyping' ]
<title>[ 'CDuce: a white-paper' ]
]
Ok.
]]>
val sel : [ title* ] = [ <title>[ 'Data on the Web' ] ]]]>
</sample>
<p>
This pure pattern form of the query yields (in general) better performance than
......@@ -423,25 +426,44 @@ optimiser automatically translates the latter into a pure pattern one
<p>
This example is the exact transcription of query Q5 of XQuery use cases.
We first give the corresponding CDuce types. We leave the user in charge of creating the corresponding relevant values.
This example is the exact transcription of <a href="http://www.w3.org/TR/xquery-use-cases/#xmp-queries-results-q5">query Q5 of XQuery use cases</a>.
The corresponding CDuce types are given at the top of this section.
Here we give the type of the document to be joined, together with a sample value.
</p>
<sample><![CDATA[
type Bib = <bib>[Book*]
type Book = <book year=String id=?String>[
Title (Author+ | Editor+ ) Publisher Price]
type Author = <author>[Last First]
type Editor = <editor>[Last First Affiliation]
type Title = <title>[PCDATA]
type Last = <last>[PCDATA]
type First = <first>[PCDATA]
type Affiliation = <affiliation>[PCDATA]
type Publisher = <publisher>[PCDATA]
type Price = <price>[PCDATA]
type Reviews =<reviews>[Entry*]
type Entry = <entry> [ Title Price Review]
type Title = <title>[PCDATA]
type Price= <price>[PCDATA]
type Review =<review>[PCDATA]
let bstore2 : Reviews =
<reviews>[
<entry>[
<title>['Data on the Web']
<price>['34.95']
<review>
['A very good discussion of semi-structured database
systems and XML.']
]
<entry>[
<title>['Advanced Programming in the Unix environment']
<price>['65.95']
<review>
['A clear and detailed discussion of UNIX programming.']
]
<entry>[
<title>['TCP/IP Illustrated']
<price>['65.95']
<review>
['One of the best books on TCP/IP.']
]
]
]]>
</sample>
<p>
The queries are expressed first in an XQuery-like style, then in a pure pattern style: the first pattern-based query is the one produced by the automatic translation of the XQuery-style one. The last query corresponds to the version a pattern-aware programmer would write.
</p>
......@@ -450,30 +472,29 @@ XQuery style
</p>
<sample><![CDATA[
<books-with-prices>
select <book-with-price>[t1
<price-amazon>([p2]/_) <price-bn>([p1]/_)]
from b in [biblio]/Book ,
t1 in [b]/Title,
e in [amazon]/Entry,
select <book-with-price>[t1 <price-bstore1>([p1]/Char) <price-bstore2>([p2]/Char)]
from b in [biblio]/book ,
t1 in [b]/title,
e in [bstore2]/Entry,
t2 in [e]/Title,
p2 in [e]/Price,
p1 in [b]/Price
where t1=t2
p1 in [b]/price
where t1=t2
]]>
</sample>
<p> Automatic translation of the previous query into a pure pattern (thus more efficient) one </p>
<sample><![CDATA[
<books-with-prices>
select <book-with-price>[t1 <price-amazon>x11 <price-bn>x10 ]
from <_ ..>[(x3::Book|_)*] in [biblio],
<_ ..>[(x9::Price|x5::Title|_)*] in x3,
select <book-with-price>[t1 <price-bstore1>x10 <price-bstore2>x11 ]
from <_ ..>[(x3::book|_)*] in [biblio],
<_ ..>[(x9::price|x5::title|_)*] in x3,
t1 in x5,
<_ ..>[(x6::Entry|_)*] in [amazon],
<_ ..>[(x6::Entry|_)*] in [bstore2],
<_ ..>[(x7::Title|x8::Price|_)*] in x6,
t2 in x7,
<_ ..>[(x10::_)*] in x9,
<_ ..>[(x11::_)*] in x8
<_ ..>[(x10::Char)*] in x9,
<_ ..>[(x11::Char)*] in x8
where t1=t2
]]>
</sample>
......@@ -485,10 +506,10 @@ This version of the query is very efficient. Be aware of patterns.
<sample><![CDATA[
<books-with-prices>
select <book-with-price>[t2 <price-amazon>p2 <price-bn>p1]
from <bib>[b::Book*] in [biblio],
<book ..>[t1&Title _* <price>p1] in b,
<reviews>[e::Entry*] in [amazon],
select <book-with-price>[t2 <price-bstore1>p1 <price-bstore2>p2]
from <bib>[b::book*] in [biblio],
<book ..>[t1&title _* <price>p1] in b,
<reviews>[e::Entry*] in [bstore2],
<entry>[t2&Title <price>p2 ;_] in e
where t1=t2
]]>
......@@ -498,11 +519,11 @@ where t1=t2
<section title="More complex Queries: on the power of patterns">
<sample><![CDATA[
let bib = [biblio]/Book;;
let biblio = [biblio]/book;;
<bib>
select <book (a)> x
from <book (a)>[ (x::(Any\Editor)|_ )* ] in bib
from <book (a)>[ (x::(Any\editor)|_ )* ] in biblio
]]>
</sample>
......@@ -512,21 +533,21 @@ If one wants to write more explicitly:
</p>
<sample><![CDATA[
select <book (a)> x
from <book (a)>[ (x::(Any\$$<editor ..>_$$)|_ )* ] in bib
from <book (a)>[ (x::(Any\$$<editor ..>_$$)|_ )* ] in biblio
]]>
</sample>
<p>Or even:
</p>
<sample><![CDATA[
select <book (a)> x
from <book (a)>[ (x::(<($$_$$\$$`editor$$) ..>_)|_ )* ] in bib
from <book (a)>[ (x::(<($$_$$\$$`editor$$) ..>_)|_ )* ] in biblio
]]>
</sample>
<p>Back to the first one:</p>
<sample><![CDATA[
<bib>
select <book (a)> x
from <$$($$book$$)$$ (a)>[ (x::(Any\Editor)|_ )* ] in bib
from <$$($$book$$)$$ (a)>[ (x::(Any\editor)|_ )* ] in biblio
]]>
</sample>
<p>
......@@ -536,14 +557,14 @@ the from
</p>
<sample><![CDATA[
select <$$($$book$$)$$ (a)> x
from <(book) (a)>[ (x::(Any\Editor)|_ )* ] in bib
from <(book) (a)>[ (x::(Any\editor)|_ )* ] in biblio
]]>
</sample>
<p> Same thing, but without transforming the tag to "book".<br/>
More interestingly:</p>
<sample><![CDATA[
select <(b) (a\$$id$$)> x
from <(b) (a)>[ (x::(Any\Editor)|_ )* ] in bib
from <(b) (a)>[ (x::(Any\editor)|_ )* ] in biblio
]]>
</sample>
<p>Removes the "id" attribute (if any) from the attributes of the elements in bib.
......@@ -551,7 +572,7 @@ More interestingly:</p>
<sample><![CDATA[
select <(b) (a\id+{bing=a.id})> x
from <(b) (a)>[ (x::(Any\Editor)|_ )* ] in bib
from <(b) (a)>[ (x::(Any\editor)|_ )* ] in biblio
]]>
</sample>
<p>Changes attribute <code>id=x</code> into <code>bing=x</code>
......@@ -560,7 +581,7 @@ if such is not the case the expression is ill-typed. If one wants to perform thi
</p>
<sample><![CDATA[
select <(b) (a\id+{bing=a.id})> x
from <(b) (a&{id=_})>[ (x::(Any\Editor)|_ )* ] in bib
from <(b) (a&{id=_})>[ (x::(Any\editor)|_ )* ] in biblio
]]>
</sample>
</section>
......