Commit 6d0fe0cf authored by Pietro Abate

[r2003-11-20 11:24:10 by szach] Added preliminary documentation for XML Schema support. Includes introduction and import.

Original author: szach
Date: 2003-11-20 11:24:10+00:00
parent df5bf1cc
......@@ -8,6 +8,7 @@
<include file="manual/types_patterns.xml"/>
<include file="manual/expressions.xml"/>
<include file="manual/namespaces.xml"/>
<include file="manual/schema.xml"/>
<left>
<p>
This Guide describes all of CDuce's constructions.
......
......@@ -167,7 +167,7 @@ You can quit the toplevel with the toplevel directive
<p>
The toplevel directive <code>#help</code> prints a help message about
the available toplevel directives
the available toplevel directives.
</p>
<p>
The toplevel directive <code>#dump_value</code> dumps an XML-like
......@@ -190,6 +190,18 @@ table of prefix-to-namespace bindings used for pretty-printing
values and types with namespaces (see <local href="namespaces"/>).
</p>
<p>
The toplevel directive <code>#print_type</code> shows a representation of a
CDuce type, including types imported from <local href="manual_schema">XML
Schema</local> documents.
</p>
<p>
The toplevel directive <code>#print_schema</code> shows the <local
href="manual_schema">XML Schema</local> components contained in a given
schema.
</p>
<p>
The toplevel has no line editing facilities.
You can use an external wrapper such as
......
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<!DOCTYPE page [
<!ENTITY larr "&#8592;"> <!-- leftwards arrow, U+2190 ISOnum -->
<!ENTITY uarr "&#8593;"> <!-- upwards arrow, U+2191 ISOnum-->
<!ENTITY rarr "&#8594;"> <!-- rightwards arrow, U+2192 ISOnum -->
<!ENTITY darr "&#8595;"> <!-- downwards arrow, U+2193 ISOnum -->
]>
<page name="manual_schema">
<title>XML Schema</title>
<box title="Overview" link="overview">
<p>
CDuce partially supports the <a href="http://www.w3.org/XML/Schema">XML
Schema</a> Recommendations (<a
href="http://www.w3.org/TR/xmlschema-0/">Primer</a>, <a
href="http://www.w3.org/TR/xmlschema-1/">Structures</a>, <a
href="http://www.w3.org/TR/xmlschema-2/">Datatypes</a>). With this
feature it is possible to manipulate XML documents whose leaves are typed
values such as integers, dates, and binary data.
</p>
<p>
CDuce's XML Schema support provides the following features:
</p>
<ul>
<li>
<a href="#import">XML Schema components import</a>
</li>
<li>
<a href="#validation">XML Schema validation</a>
</li>
<li>
<a href="#print_xml">XML Schema instances output</a>
</li>
</ul>
</box>
<box title="XML Schema components (micro) introduction" link="primer">
<p>
An XML Schema document can define five different kinds of components, each
of which can be imported into CDuce and used as a CDuce type:
</p>
<ul>
<li>
<b>Type definitions</b><br />
A type definition defines either a simple type or a complex type. A
simple type can be used to type the string content of an element more
precisely. You can think of it as a refinement of #PCDATA. XML Schema
provides a set of <a
href="http://www.w3.org/TR/xmlschema-2/#built-in-datatypes">predefined
simple types</a> and a way to define new simple types. A complex type can
be used to constrain the content model and the attributes of an XML
element. An XML Schema complex type is strictly more expressive than a DTD
element declaration.
</li>
<li>
<b>Attribute declarations</b><br />
An attribute declaration links an attribute name to a simple type.
Optionally, it can constrain the set of possible values for the attribute
by mandating a fixed value or providing a default value.
</li>
<li>
<b>Element declarations</b><br />
An element declaration links an element name to a simple or complex type.
Optionally, if the type is a simple type, it can constrain the set of
possible values for the element by mandating a fixed value or providing a
default value.
</li>
<li>
<b>Attribute group definitions</b><br />
An attribute group definition links a set of attribute declarations to a
name which can be referenced from other XML Schema components.
</li>
<li>
<b>Model group definitions</b><br />
A model group definition links a name to a constraint over the complex
content of an XML element. The linked name can be referenced from other
XML Schema components.
</li>
</ul>
</box>
<box title="XML Schema components import" link="import">
<p>
To import XML Schema components into CDuce, you first need to tell
CDuce to import an XML Schema document. Do this with the
<code>schema</code> keyword, which binds an uppercase identifier to a local
schema document:
</p>
<sample>
# {{schema Mails = "tests/schema/mails.xsd"}};;
Registering schema type: Mails # mailType
Registering schema type: Mails # envelopeType
Registering schema type: Mails # mailsType
Registering schema type: Mails # bodyType
Registering schema element: Mails # mails
</sample>
<p>
The above declaration attempts to import all schema components included in
the schema document as CDuce types. You can reference them using the
<code>#</code> (sharp) operator.
</p>
<p>
XML Schema permits ambiguity in component names: a single schema document
can contain, for example, both an element declaration and an attribute
declaration with the same name. When there is no ambiguity, you can
reference the CDuce type corresponding to a schema component using just its
name, with the following syntax:<br /> <code>&lt;schema_name&gt; #
&lt;component_name&gt;</code><br /> Otherwise, you can specify the kind of
schema component as follows:<br /> <code>&lt;schema_name&gt; #
&lt;component_name&gt; as &lt;component_kind&gt;</code><br /> where the
component kind is one of:<br /> <code>element | type | attribute |
attribute_group | model_group</code><br />
</p>
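<p>
As a sketch of the disambiguation syntax, assume a hypothetical schema bound
to <code>Person</code> that declares both an element and an attribute named
<code>age</code> (the schema, the component name and the type names below
are assumptions made for illustration). The two components could then be
referenced separately:
</p>
<sample>
type AgeElement = Person # age as element
type AgeAttribute = Person # age as attribute
</sample>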
<p>
The result of a schema component reference is an ordinary CDuce type which
you can use as usual in function definitions, pattern matching and so on.
</p>
<sample>
let is_valid_mail (Any -> Bool)
| {{Mails # mailType}} -> `true
| _ -> `false
</sample>
<p>
<em>
Note that the spaces surrounding the sharp character are required;
otherwise the lexer will treat <code>#mailType</code> as a
(nonexistent) directive.
</em>
</p>
</box>
<box noindex="true" title="" link="">
<p>
<em>
<b>Correctness remark:</b> while parsing XML Schema documents, CDuce
assumes that they are correct with respect to the XML Schema
Recommendations. At minimum they are required to be valid with respect to
the <a
href="http://www.w3.org/TR/xmlschema-1/#normative-schemaSchema">XML
Schema for Schemas</a>. You should check your schemas for validity
before importing them into CDuce; otherwise, expect strange behaviour.
</em>
</p>
</box>
<box title="Toplevel directives" link="directives">
<p>
The toplevel directive <code>#env</code> supports schemas: it lists the
currently defined schemas.
</p>
<sample>
# #env;;
Types: Empty Any Int Char Byte Atom Pair Arrow Record String Latin1 Bool
Namespace prefixes:
=>""
xml=>"http://www.w3.org/XML/1998/namespace"
Namespace prefixes used for pretty-printing:
{{Schemas: Mails}}
Values:
val argv : [ String* ] = ""
</sample>
<p>
The toplevel directive <code>#print_type</code> supports schemas too: it can
be used to print types corresponding to schema components with the usual
sharp syntax.
</p>
<sample>
# #print_type {{Mails # bodyType}};;
[ Char ]
</sample>
<p>
The toplevel directive <code>#print_schema</code> is not very
user-friendly (it exposes some internal representation details), but it can
be used to show the various schema components contained in a given schema.
</p>
<sample><![CDATA[
# #print_schema Mails;;
Types: C:10:mailType C:7:envelopeType C:12:mailsType S:bodyType'
Elements: E:13:<mails>
]]></sample>
<p>
For more information about toplevel directives, see the
<local href="manual_interpreter">toplevel documentation</local>.
</p>
</box>
<box title="XML Schema &rarr; CDuce mapping" link="mapping">
<ul>
<li>
<p>
XML Schema <b>predefined simple types</b> are mapped to CDuce types
directly in the CDuce implementation, preserving XML Schema constraints
as far as possible. The table below lists the most significant mappings.
</p>
<table border="1">
<tr>
<td><b>XML Schema predefined simple type</b></td>
<td><b>CDuce type</b></td>
</tr>
<tr>
<td>
<code>duration</code>, <code>dateTime</code>, <code>time</code>,
<code>date</code>, <code>gYear</code>, <code>gMonth</code>, ...
</td>
<td>
closed record types with some of the following fields (depending on
the Schema type): <code>year</code>, <code>month</code>,
<code>day</code>, <code>hour</code>, <code>minute</code>,
<code>second</code>, <code>timezone</code>
</td>
</tr>
<tr><td><code>boolean</code></td><td><code>Bool</code></td></tr>
<tr>
<td>
<code>anySimpleType</code>,
<code>base64Binary</code>, <code>hexBinary</code>,
<code>anyURI</code>
</td>
<td><code>String</code></td>
</tr>
<tr><td><code>integer</code></td><td><code>Int</code></td></tr>
<tr>
<td>
<code>nonPositiveInteger</code>, <code>negativeInteger</code>,
<code>nonNegativeInteger</code>, <code>positiveInteger</code>,
<code>long</code>, <code>int</code>, <code>short</code>,
<code>byte</code>
</td>
<td>integer intervals with the appropriate limits</td>
</tr>
<tr>
<td>
<code>string</code>, <code>normalizedString</code>, and the other
types derived (directly or indirectly) by restriction from
<code>string</code>
</td>
<td><code>String</code></td>
</tr>
<tr>
<td>
<code>NMTOKENS</code>, <code>IDREFS</code>, <code>ENTITIES</code>
</td>
<td>
<code>String</code> list (i.e. Kleene star of a <code>String</code>
type)
</td>
</tr>
</table>
<p>
<b>Simple type definitions</b> are built from the above types following
the XML Schema derivation rules.
</p>
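<p>
As an illustration of these derivation rules, consider a hypothetical simple
type (the name <code>monthNumber</code> is an assumption made for this
example, not part of any schema mentioned above) that restricts
<code>xsd:integer</code> to the range 1 to 12:
</p>
<sample><![CDATA[
<xsd:simpleType name="monthNumber">
  <xsd:restriction base="xsd:integer">
    <xsd:minInclusive value="1"/>
    <xsd:maxInclusive value="12"/>
  </xsd:restriction>
</xsd:simpleType>
]]></sample>
<p>
Since <code>integer</code> maps to <code>Int</code> and the bounded integer
types map to intervals, one would expect such a definition to correspond to
a CDuce interval type along the lines of <code>1--12</code>. The exact type
printed by <code>#print_type</code> may differ.
</p>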
</li>
<li>
<p>
XML Schema <b>complex type definitions</b> are mapped to CDuce types
representing XML elements which can have any tag, but whose attributes
and content are constrained to be valid with respect to the original
complex type.
</p>
<p>
As an example, the following XML Schema complex type:
</p>
<sample><![CDATA[
<xsd:complexType name="mailType">
<xsd:sequence>
<xsd:element name="envelope" type="{{envelopeType}}"/>
<xsd:element name="body" type="{{bodyType}}"/>
</xsd:sequence>
<xsd:attribute use="{{required}}" name="{{id}}" type="{{xsd:integer}}"/>
</xsd:complexType>
]]></sample>
<p>
is mapped to a CDuce element type that requires an <tt>id</tt> attribute
of type <code>Int</code> and two child elements whose types correspond,
respectively, to the XML Schema types <tt>envelopeType</tt> and
<tt>bodyType</tt>.
</p>
<sample><![CDATA[
# #print_type Mails # mailType;;
<({{Any}}) {| {{id = Int}} |}>
[ <{{envelope}} {| |}>
[ <From {| |}> String
<To {| |}> String
<Date {| |}> {
positive = Bool;
year = Int; month = Int; day = Int;
hour = Int; minute = Int; second = Int;
timezone =? { positive = Bool; hour = Int; minute = Int }
}
<Subject {| |}> String
(<header {| name = String |}> [ String ])*
]
<{{body}} {| |}>[ Char ]
]
]]></sample>
</li>
<li>
<p>
XML Schema <b>attribute declarations</b> are converted to record types
with just one field corresponding to the declared attribute.
</p>
<sample>
# #print_type Person # age;;
{| {{age = 1--*}} |}
</sample>
</li>
<li>
<p>
XML Schema <b>element declarations</b> can bind an XML element either
to a complex type or to a simple type. In the former case the conversion
is almost identical to the complex type conversion described above.
The only difference is that the element's tag must now correspond to
the name of the XML element in the schema element declaration, whereas
previously it was the <code>Any</code> type.
</p>
<p>
In the latter case (element with simple type content), the corresponding
CDuce type is an element type. Its tag must correspond to the name of
the XML element in the schema element declaration; its content type is
the CDuce translation of the simple type provided in the element
declaration.
</p>
<p>
For example, the following XML Schema element:
</p>
<sample><![CDATA[
<xsd:element name="day" type="xsd:date"/>
]]></sample>
<p>
will be translated to the following CDuce type:
</p>
<sample><![CDATA[
# #print_type Calendar # day;;
<day {| |}> {
positive = Bool;
year = Int;
month = Int;
day = Int
}
]]></sample>
<p>
Note that the type of the element content <em>is not a sequence</em>
unless the translation of the XML Schema type is itself a sequence.
</p>
</li>
<li>
<p>
XML Schema <b>attribute group definitions</b> are mapped to record types
containing one field for each attribute declaration contained in the
group. <tt>use</tt> constraints are respected: optional attributes are
mapped to optional fields, required attributes to required fields.
</p>
<p>
The following XML Schema attribute group definition:
</p>
<sample><![CDATA[
<xsd:attributeGroup name="nameAttributes">
<xsd:attribute use="required" name="name" type="xsd:string" />
<xsd:attribute use="required" name="surname" type="xsd:string" />
<xsd:attribute use="optional" name="nickname" type="xsd:string" />
</xsd:attributeGroup>
]]></sample>
<p>
will thus be mapped to the following CDuce type:
</p>
<sample>
# #print_type Person # nameAttributes;;
{| name = String; surname = String; nickname =? String |}
</sample>
</li>
<li>
<p>
XML Schema <b>model group definitions</b> are mapped to CDuce sequence
types. <tt>minOccurs</tt> and <tt>maxOccurs</tt> constraints are
respected, using CDuce recursive types to represent <tt>unbounded</tt>
repetition (i.e. Kleene star).
</p>
<p>
<tt>all</tt> constraints, also known as <em>interleaving
constraints</em>, cannot be expressed in the CDuce type system without an
explosion in type size. Content models of this kind are therefore
normalized and treated, in the type system, as sequence types.
</p>
<p>
As an example, the following XML Schema model group definition:
</p>
<sample><![CDATA[
<xsd:group name="family">
<xsd:sequence>
<xsd:element name="mother" type="xsd:string" />
<xsd:element name="father" type="xsd:string" />
<xsd:sequence minOccurs="0" maxOccurs="unbounded">
<xsd:choice>
<xsd:element name="son" type="xsd:string" />
<xsd:element name="daughter" type="xsd:string" />
</xsd:choice>
</xsd:sequence>
</xsd:sequence>
</xsd:group>
]]></sample>
<p>
will be mapped to the following CDuce type:
</p>
<sample><![CDATA[
# #print_type Person # family;;
[ <mother {| |}> String
<father {| |}> String
(<daughter {| |}> String | <son {| |}>String)*
]
]]></sample>
</li>
</ul>
</box>
<box title="XML Schema validation" link="validation">
<p>
<b>TODO</b>
</p>
</box>
<box title="XML Schema instances output" link="print_xml">
<p>
<b>TODO</b>
</p>
</box>
</page>