Commit 823bfe02 authored by Pietro Abate

[r2006-05-15 21:28:48 by beppe] Empty log message

Original author: beppe
Date: 2006-05-15 21:28:48+00:00
parent 7366ede9
@@ -222,61 +222,6 @@ Wand in 1993.</p></abstract>
</paper>
</li>
<li>
<paper file="http://www.ps.uni-sb.de/Papers/abstracts/cut.html">
<title>Interactive Learning of Node Selecting Tree Transducer</title>
<author>Julien Carme </author>
<author> Rémi Gilleron</author>
<author> Aurélien Lemay </author>
<author> Joachim Niehren</author>
<comment>ML.</comment>
<abstract><p>
We develop new algorithms for learning monadic node selection
queries in unranked trees from annotated examples,
and apply them to visually interactive Web
information extraction.
We propose to represent monadic queries by bottom-up deterministic
Node Selecting Tree Transducers (NSTTs), a particular class of tree
automata that we introduce. We prove that deterministic NSTTs capture
the class of queries definable in monadic second order logic (MSO) in
trees, which Gottlob and Koch (2002) argue to have the right expressiveness for Web
information extraction, and prove that monadic queries defined by NSTTs
can be answered efficiently.
We present a new polynomial time algorithm in RPNI-style
that learns monadic queries defined by deterministic NSTTs from
completely annotated examples, where all selected nodes are
distinguished.
In practice, users prefer to provide partial annotations. We
resolve this by intelligent tree pruning heuristics. We introduce
pruning NSTTs - a formalism that shares many advantages of NSTTs.
This leads us to an interactive learning algorithm for monadic
queries defined by pruning NSTTs, which satisfies a new formal
active learning model in the style of Angluin (1987).
We have implemented our interactive learning algorithm
and integrated it into a visually interactive Web information
extraction system -- called Squirrel -- by plugging it into
the Mozilla Web browser. Experiments on realistic Web documents confirm
excellent quality with very few user interactions
during wrapper induction.
</p></abstract>
</paper>
</li>
<li>
<paper file="http://www.ps.uni-sb.de/Papers/abstracts/wellnested-cu.bib">
@@ -498,3 +443,58 @@ the decidability results are shown by a reduction to a decision problem on tree
This work is a step towards resolving long-standing open problems of the decidability of entailment for non-structural subtyping.</p></abstract>
</paper>
</li>
<li>
<paper file="http://www.ps.uni-sb.de/Papers/abstracts/cut-ml.html">
<title>Interactive Learning of Node Selecting Tree Transducer</title>
<author>Julien Carme </author>
<author> Rémi Gilleron</author>
<author> Aurélien Lemay </author>
<author> Joachim Niehren</author>
<comment>ML.</comment>
<abstract><p>
We develop new algorithms for learning monadic node selection
queries in unranked trees from annotated examples,
and apply them to visually interactive Web
information extraction.
We propose to represent monadic queries by bottom-up deterministic
Node Selecting Tree Transducers (NSTTs), a particular class of tree
automata that we introduce. We prove that deterministic NSTTs capture
the class of queries definable in monadic second order logic (MSO) in
trees, which Gottlob and Koch (2002) argue to have the right expressiveness for Web
information extraction, and prove that monadic queries defined by NSTTs
can be answered efficiently.
We present a new polynomial time algorithm in RPNI-style
that learns monadic queries defined by deterministic NSTTs from
completely annotated examples, where all selected nodes are
distinguished.
In practice, users prefer to provide partial annotations. We
resolve this by intelligent tree pruning heuristics. We introduce
pruning NSTTs - a formalism that shares many advantages of NSTTs.
This leads us to an interactive learning algorithm for monadic
queries defined by pruning NSTTs, which satisfies a new formal
active learning model in the style of Angluin (1987).
We have implemented our interactive learning algorithm
and integrated it into a visually interactive Web information
extraction system -- called Squirrel -- by plugging it into
the Mozilla Web browser. Experiments on realistic Web documents confirm
excellent quality with very few user interactions
during wrapper induction.
</p></abstract>
</paper>
</li>
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<page name="tralala_marseille06">
<title>Tralala: Journées du 23 et 24 mai janvier 2006</title>
<left>
<p style="font-size: 12pt; color: fuchsia">Available Pages</p>
<ul>
<li><a href="tralala.html">Tralala</a> </li>
<li><a href="tralala_partenaires.html">Partenaires</a> </li>
<li><a href="tralala_documents.html">Documents</a> </li>
<li><a href="tralala_reunions.html">Réunions</a> </li>
<li><a href="tralala_mailing.html">Listes de diffusion</a> </li>
</ul>
</left>
<left>
<p> All pages of this site were automatically generated from an XML
description of the content by <a href="examples.html#site">the
following CDuce program</a>. </p><p><img src="img/cducepower.jpg"
alt="Powered by CDuce"/></p>
</left>
<box title="Programme" link="pro">
<p>Cinquième réunion de Tralala, 23 et 24 mai 2006, Salle LSH/411,
Université de Provence, campus Saint-Charles.</p>
<p><b>Mardi 23 mai</b></p>
<ol>
<li><b>[11:30] </b>Accueil</li>
<li><p/><b>Déjeuner</b><p/></li>
<li><b>[13:30] </b> Alain Frisch <i>Streaming XML par évaluation
gloutonne</i></li>
<li><b>[14:15] </b> Nicole Bidoit, Dario Colazzo <i>Capturing well typed
references in DTDs</i> </li>
<li><p/><b>Pause</b><p/></li>
<li><b>[15:30] </b> Nabil Layaida, Pierre Geneves (INRIA
Rhône-Alpes) <i>Mu-calcul pour les arbres finis et analyse statique
de XPath</i> </li>
<li><b>[16:15] </b> Joachim Niehren, Emmanuel Filiot <i>TBA</i></li>
</ol>
<p><b>Mercredi 24 mai</b></p>
<ol>
<li><b>[09:00] </b> Denis Lugiez, Cosimo Laneve <i>Presburger modal
logic is only PSPACE complete</i></li>
<li><b>[09h45] </b> Lucia Acciai <i>Responsiveness in Process
Calculi</i></li>
<li><p/><b>Pause</b><p/></li>
<li><b>[11:00] </b> <i>Matinée démos:</i>
<ol>
<li>Cédric Miachon <i>Pattern by Example: a graphical language
for XML processing -- Démo</i></li>
<li>Kim Nguyen <i>Crawlers in CDuce -- Démo</i></li>
<li>Nils Gesbert <i>ECDuce = CDuce + XHTML -- Démo</i></li>
</ol>
</li>
</ol>
</box>
<box title="Participants" link="par">
<p>
<b>Lille:</b> Anne-Cécile Caron, Sophie Tison, Joachim Niehren,
Jean-Marc Talbot, Denis Debarbieux, Emmanuel Filiot; <b>Ens:</b>
Giuseppe Castagna, Nils Gesbert; <b>LRI:</b> Véronique Benzaken,
Nicole Bidoit, Dario Colazzo, Cédric Miachon, Kim Nguyên;
<b>Marseille:</b> Lucia Acciai, Silvano Dal Zilio, Denis Lugiez;
<b>Paris 7:</b> Mathias Samuelides; <b>Invités :</b> Alain Frisch,
Francoise Gire, Nabil Layaida, Pierre Geneves.
</p>
</box>
</page>
@@ -6,6 +6,7 @@
<include file="tralala_marseille05.xml"/>
<include file="tralala_lille05.xml"/>
<include file="tralala_paris06.xml"/>
<include file="tralala_marseille06.xml"/>
<left>
<p style="font-size: 12pt; color: fuchsia">Available Pages</p>
<ul>
@@ -29,6 +30,7 @@ the content by <a href="examples.html#site">the following CDuce program</a>.
<li><b>10-11 mars 2005</b> Marseille. <a href="tralala_marseille05.html">Programme.</a></li>
<li><b>7-8 juillet 2005</b> Lille. <a href="tralala_lille05.html">Programme.</a></li>
<li><b>26-27 janvier 2006</b> Paris. <a href="tralala_paris06.html">Programme.</a></li>
<li><b>23-24 mai 2006</b> Marseille. <a href="tralala_marseille06.html">Programme.</a></li>
</ol>
</box>
......