<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<page name="namespaces">

<title>XML Namespaces</title>

<box title="Overview" link="ov">

<p>

CDuce fully implements the W3C <a
href="http://www.w3.org/TR/REC-xml-names/">XML Namespaces</a>
Recommendation. Atom names (hence XML element tags) and record labels
(hence XML attribute names) are logically composed of a namespace URI
and a local part. Syntactically, they are written as <em>qualified
names</em>, conforming to the QName production of the Recommendation:

</p>

<sample>
QName     ::= (Prefix ':')? LocalPart
Prefix    ::= NCName
LocalPart ::= NCName
</sample>

<p>

The prefix in a QName must be bound to a namespace URI. In XML, the
bindings from prefixes to namespace URIs are introduced through
special <code>xmlns:prefix</code> attributes. In CDuce, instead,
there are explicit namespace binders. For instance, the following
XML document

</p>

<sample>
&lt;p:a q:c="3" xmlns:p="http://a.com" xmlns:q="http://b.com"/>
</sample>

<p>
can be written in CDuce:
</p>

<sample>
namespace p = "http://a.com" in
namespace q = "http://b.com" in
&lt;p:a q:c="3">[]
</sample>

<p>
This element can be bound to a variable
<code>x</code> by a <code>let</code> binding as follows:
</p>

<sample>
let x = 
  namespace p = "http://a.com" in
  namespace q = "http://b.com" in
  &lt;p:a q:c="3">[]
</sample>

<p>
In this case, the namespace declarations are local to the scope
of the <code>let</code>.
Alternatively, it is possible to use global prefix bindings:
</p>

<sample>
namespace p = "http://a.com"
namespace q = "http://b.com"
let x = &lt;p:a q:c="3">[]
</sample>

<p>
Similarly, CDuce supports namespace <i>defaulting</i>. This is introduced
by a local or global <code>namespace "..."</code> construction.
As in XML, default namespaces apply only to tags (atoms), not
attributes (record labels).
For instance, in the expression <code>namespace "A" in &lt;x
y="3">[]</code>, the namespace for the element tag is "A", and
the attribute has no namespace.
</p>
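
<p>
As a further illustration (a sketch that follows from the rules above;
the URIs and names are only placeholders), defaulting can be combined
with an explicit prefix:
</p>

<sample>
namespace "http://a.com" in
namespace q = "http://b.com" in
&lt;a q:c="3" d="4">[]
</sample>

<p>
Here the tag <code>a</code> is in the default namespace "http://a.com",
the attribute <code>q:c</code> is in "http://b.com", and the unprefixed
attribute <code>d</code> has no namespace.
</p>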

<p>
The toplevel directive <code>#env</code> causes CDuce to print, amongst
others, the current set of global bindings.
</p>

</box>

<box title="Reusing namespace declarations" link="reuse">

<p>
A global namespace declaration actually defines an identifier which is 
exported by the current compilation unit. It is possible to use
this identifier in another unit to redefine another prefix with the
same namespace URI. E.g., if the unit <code>a</code> contains:
</p>

<sample>
namespace ns = "http://a.com"
</sample>

<p>
then, in another unit, it is possible to declare:
</p>

<sample>
namespace ans = a.ns
</sample>

<p>
The <code>open</code> statement operates on namespace declarations;
all the declarations from the open'ed unit are re-exported by the
current unit.
</p>

</box>

<box title="XML Schema and namespaces" link="ns">

<p>
If an XML Schema has been bound to some identifier (in the current
compilation unit or another one), it is possible to use this
identifier in the right-hand side of a namespace declaration. The
namespace URI is the targetNamespace of the XML Schema. E.g.:
</p>

<sample>
schema s = "..."
namespace ns = s
</sample>

</box>

<box title="Types for atoms" link="types">

<p>
The type <code>Atom</code> represents all the atoms, in all the
namespaces. An underscore in tag position (as in
<code>&lt;_>[]</code>) stands for this type.
</p>

<p>
Each atom constitutes a subtype of <code>Atom</code>.
In addition to these singleton types, there are
the ``any in namespace'' subtypes, written
<code>p:*</code> where <code>p</code> is a namespace prefix;
this type contains all the atoms in the namespace denoted by <code>p</code>.

The token <code>.:*</code> represents all the atoms
in the current default namespace.
</p>

<p>
When used as atoms and not tags, the singleton types
and ``any in namespace'' types must be prefixed by a backquote,
as for atom values: <code>`p:x, `p:*, `.:*</code>.
</p>
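
<p>
For illustration, a function testing whether an element's tag belongs
to a given namespace might be sketched as follows. The prefix
<code>p</code>, its URI and the function name <code>in_p</code> are
hypothetical, and the sketch has not been checked against a particular
CDuce release:
</p>

<sample>
namespace p = "http://a.com"

let in_p (x : AnyXml) : Bool =
  match x with
  | &lt;p:* ..>_ -> `true
  | _ -> `false
</sample>

<p>
In tag position the ``any in namespace'' type is written without a
backquote, whereas the results <code>`true</code> and
<code>`false</code> are ordinary atom values.
</p>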

</box>


<box title="Printing XML documents" link="print">

<p> The <code>print_xml</code> and <code>print_xml_utf8</code>
operators produce a string representation of an XML document.
They have to assign prefixes to namespaces. In the current
implementation, CDuce produces XML documents with no
default namespace and only toplevel prefix bindings (that is,
<code>xmlns:p="..."</code> attributes are only produced for the root
element). Prefix names are chosen using several heuristics.
First, CDuce tries using the prefixes bound in the scope
of the <code>print_xml</code> operator. When this is not possible,
it uses global ``hints'': each time a prefix binding is encountered
(in the CDuce program or in loaded XML documents), it creates
a global hint for the namespace. Finally, it generates fresh
prefixes of the form <code>ns%%n%%</code> where <code>%%n%%</code>
is an integer.

For instance, consider the expression:
</p>

<sample>
print_xml (namespace "A" in &lt;a>[])
</sample>

<p>
As there is no available prefix for the namespace URI "A", CDuce generates
a fresh prefix and produces the following XML document:
</p>

<sample>
&lt;ns1:a xmlns:ns1="A"/>
</sample>

<p>
Now consider this expression:
</p>

<sample>
print_xml (namespace p = "A" in &lt;p:a>[])
</sample>

<p>CDuce produces:</p>

<sample>
&lt;p:a xmlns:p="A"/>
</sample>

<p>
In this case, the prefix binding for the namespace "A" is not
in the scope of <code>print_xml</code>, but the name <code>p</code>
is available as a global hint. Finally, consider:
</p>

<sample>
namespace q = "A" in print_xml (namespace p = "A" in &lt;p:a>[])
</sample>

<p>
Here, the prefix <code>q</code> is available in the scope of
<code>print_xml</code>, so it is used in priority:
</p>

<sample>
&lt;q:a xmlns:q="A"/>
</sample>

<p>
As a final example, consider the following expression:
</p>

<sample>
print_xml (namespace p ="A" in &lt;p:a>[ (namespace p = "B" in &lt;p:a>[]) ])
</sample>

<p>
A single name <code>p</code> is available for both namespaces
"A" and "B". CDuce chooses to assign it to "A", and it generates
a fresh name for "B", so as to produce:
</p>

<sample>
&lt;p:a xmlns:ns1="B" xmlns:p="A">&lt;ns1:a/>&lt;/p:a>
</sample>

<p>
Note that the fresh names are ``local'' to an application of
<code>print_xml</code>. Several applications of <code>print_xml</code>
will re-use the same names <code>ns1, ns2</code>, ...
</p>

</box>

<box title="Pretty-printing of XML values and types" link="pretty">

<p>
The CDuce interpreter and toplevel use an algorithm similar
to the one mentioned in the previous section to pretty-print
CDuce values and types that involve namespaces.
</p>

<p>
The main difference is that, by default, it does <em>not</em> use the current
set of prefix bindings. The rationale is that this set can change
and this would make it difficult to understand the output of CDuce.
So only global hints are used to produce prefixes. Once a prefix has
been allocated, it is not re-used for another namespace. 
The toplevel directive <code>#env</code> causes CDuce to print, amongst
others, the table of prefixes used for pretty-printing.
It is possible to reinitialize this table with the directive
<code>#reinit_ns</code>. This directive also sets
the current set of prefix bindings as a primary source of
hints for assigning prefixes for pretty-printing in the future.
</p>

</box>

<box title="Accessing namespace bindings" link="acc">

<p>
CDuce encourages a processing model where namespace prefixes
are just considered as macros (for namespaces) which are
resolved by the (CDuce or XML) parser. However, some
XML specifications require the application to keep for each
XML element the set of locally visible bindings from prefixes
to namespaces. CDuce provides some support for that.
</p>

<p>
Even if this is not reflected in the type system, CDuce can optionally
attach to any XML element a table of namespace bindings.
The following built-in functions allow the programmer to explicitly
access this information:
</p>
<sample>
type Namespaces = [ (String,String)* ]
namespaces: AnyXml -> Namespaces
set_namespaces: Namespaces -> AnyXml -> AnyXml
</sample>

<p>
The <code>namespaces</code> function raises an exception
when its argument has no namespace information attached.
</p>
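
<p>
A minimal sketch of attaching a table by hand and reading it back,
assuming the curried signatures given above (the prefix, the URI and
the variable names are placeholders):
</p>

<sample>
let y = set_namespaces [ ("p", "http://a.com") ] &lt;a>[]
let ns = namespaces y
</sample>

<p>
Since <code>y</code> carries namespace information, the call to
<code>namespaces</code> is not expected to raise the exception
mentioned above.
</p>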

<p>
When XML elements are generated, either as literals in the CDuce code
or by <code>load_xml</code>, it is possible to tell CDuce to remember
in-scope namespace bindings. This can be done with the following
construction:
</p>

<sample>
namespace on in %%e%%
</sample>

<p>
The XML elements built within <code>%%e%%</code> (including by calling
<code>load_xml</code>) will be annotated. There is a similar
<code>namespace off</code> construction to turn off this mechanism
in a sub-expression, and both constructions can be used at top-level.
</p>

<sample><![CDATA[
# namespace cduce = "CDUCE";;
# namespaces <cduce:a>[];;
Uncaught CDuce exception: [ `Invalid_argument 'namespaces' ]

# namespace on;;
# namespaces <cduce:a>[];;
- : Namespaces = [ [ "xsd" 'http://www.w3.org/2001/XMLSchema' ]
                   [ "xsi" 'http://www.w3.org/2001/XMLSchema-instance' ]
                   [ "cduce" 'CDUCE' ]
                   ]
# namespaces (load_xml "string:<a xmlns='xxx'/>");;
- : Namespaces = [ [ "" 'xxx' ] ]
]]>
</sample>

<p>
The default binding for the prefix <code>xml</code> never appears
in the result of <code>namespaces</code>.
</p>

<p>
The <code>xtransform</code> iterator does not change
the attached namespace information for XML elements which are just
traversed. The generic comparison operator cannot distinguish
two XML elements which only differ by the attached namespace information.
</p>

</box>


<box title="Miscellaneous" link="misc">

<p>
Contrary to the W3C <a
href="http://www.w3.org/TR/xml-names11">Namespaces in XML 1.1</a>
Candidate Recommendation, a CDuce declaration <code>namespace p =
""</code> does <em>not</em> undeclare the prefix
<code>p</code>. Instead, it binds it to the null namespace
(that is, a QName using this prefix is interpreted as
having no namespace).
</p>
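
<p>
For example (a sketch; the names are placeholders), in the expression
below the tag is the atom <code>a</code> with no namespace:
</p>

<sample>
namespace p = "" in
&lt;p:a>[]
</sample>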

</box>

</page>