Handwritten Text Recognition for manuscripts and early printed texts
Visual editing of semi-structured documents on the web
1. V I S U A L E D I T I N G O F
S E M I - S T R U C T U R E D D O C U M E N T S
O N T H E W E B
E X P L O R I N G PA T T E R N S A N D A P P R O A C H E S
Jonas Oesch
Prof. Jean-Marc Seydoux
3. N O V I C E U S E R
• Does only occasional editing
• Competent in the domain of the content
• Does not know the content model used
• Is not trained for a particular editing software
4. D O C U M E N T S
• Complies to some rules of structure = Semistructured
• Meaning is lost when the relationship between
elements is broken = Hierarchical
• Atomic
6. L AY E R S O F A D O C U M E N T
content
model
presentation
2
boat
length [ @unit ]
<boat>
<length unit=“m“>
2
</length>
</boat>
321
l e n g t h
B o a t
m2
boat m length
m
0 1 2
7. C O N V E Y I N G M E A N I N G
meaning
presentation
«The boat has a length of two meters.»
<boat>
<length unit=“m“>
2
</length>
</boat>
321
l e n g t h
B o a t
m2m
0 1 2
8. P O LY P U B L I S H I N G
content
model
presentation
meaning
Print Website Mobile
9. E D I T I N G I N T E R FA C E
??
Editing
Print Website Mobile
content
model
meaning
10. T W O WAY S : I N T E R A C T I V I T Y
Editing
content
model
meaning
11. R E Q U I R E M E N T S
• Ensuring the adherence to a given model
• Allow a novice user to create and edit documents
12. F R A M E W O R K
Ensuring model
Novice friendly
«The Unicorn»
13. D ATA C E N T E R E D
CGDocAbbr
Conception et gestion de documentsTitre
LucPrenom FontollietNom
luc.fontolliet@heig-vdEmail
Professeur
+
Approfondir les notions de transformations
(XSLT).
Objectif
–
Comprendre et concevoir des modèles de
documents (Schémas XML).
Objectif
+
14. D O C U M E N T C E N T E R E D
Ges Co n
Conception et gestion de documents
LUC FONTOLLIET
luc.fontolliet@heig-vd.ch
• Approfondir les notions de transformations (XSLT).
• Comprendre et concevoir des modèles de documents
(Schémas XML).
• Installer et utiliser une base de données XML native.
• Développer une application cliente (HTML, JavaScript,
AJAX(JSON)-serveur (XQuery, DB XML native)).
• Produire du PDF en utilisant XSL-FO dans le contexte de
Objectifs
15. F R A M E W O R K
Ensuring model
Novice friendly
Webform
Word-Document
«The Unicorn»
16. H O W T O M O V E F O R WA R D
Webform
Word-Document
add guidance
add visual cues
Ensuring model
Novice friendly
«The Unicorn»
17. I N V E S T I G AT E D S O L U T I O N S
Webform
Word-Document
Ensuring model
XForms
AXEL
Doctored
XOpus
FontoXML
oXygen
XEditor
Inkling Habitat
«The Unicorn»
proprietary
open sourceNovice friendly
18. I N V E S T I G AT E D S O L U T I O N S
Webform
Word-Document
Ensuring model
XForms
AXEL
Doctored
XOpus
FontoXML
oXygen
XEditor
Inkling Habitat
«The Unicorn»
proprietary
open sourceNovice friendly
19. ??
Editor XTiger XML + XHTML Template
interpretet by
AXEL JavaScript Library
=
E P F L A N D I N R I A U N I V E R S I T I E S
Stéphane Sire
Christine Vanoirbeek
Vincent Quint
Cécile Roisin
20. E D I T I N G P R E S E N TAT I O N ( A X E L )
21. C O N T E N T
<?xml version="1.0" encoding="UTF-8"?>
<Unite UUID="fce2127f-e72d-4b6e-8179-4da864fbfd55" version="1" periodes="48"
variante="M" annee="2015">
<Abreviation>CGDdoc</Abreviation>
<Titre>Conception, gestion, diffusion de documents structurés</Titre>
<Professeur>
<Prenom>Luc</Prenom>
<Nom>Fontolliet</Nom>
<Email>luc.fontolliet@heig-vd.ch</Email>
<Login>fonto</Login>
</Professeur>
<Objectifs>
<Objectif>Approfondir les notions de transformations (XSLT).</Objectif>
<Objectif>Comprendre et concevoir des modèles de documents (Schémas XML).</Objectif>
<Objectif>Installer et utiliser une base de données XML native.</Objectif>
<Objectif>Développer une application cliente (HTML, JavaScript,
AJAX(JSON)-serveur (XQuery, DB XML native)).</Objectif>
<Objectif>Produire du PDF en utilisant XSL-FO dans le contexte de l'application
ci-dessus.</Objectif>
</Objectifs>
<Contenus>
<Contenu>Bases de donnés XML</Contenu>
</Contenus>
<Links>
<Link>
<Text>Fonto</Text><Url>http://www.fonto.ch</Url>
</Link>
</Links>
<Prerequisites>
<Prerequisite>e2c6dbc0-d863-11e0-9572-0800200c9a66</Prerequisite>
</Prerequisites>
</Unite>
22. S T R U C T U R E
Unite @UUID @version @periodes @variante @annee
Abreviation
Titre
Professeur+
Nom
Prenom
Email
Login
Objectifs
Objectif+
Contenus
Contenu+
Commentaires?
Links?
Link+
Text
Url
Prerequisites?
Prerequisite+
27. W I D G E T S : W H AT I S N E E D E D ?
X T Q
Z Y X
G 9 ?
cmyktastic.ch
This is wheat
Tables
Images
Links
Section
Sed cursus turpis
vitae tortor. Donec
posuere vulputate
arcu.
Content boxes
Formulas
Widgeti
i=0
n
∑
Eggs
Ham
Chunky bacon✓
Checkboxes
28. M I X E D C O N T E N T
• How to markup inline elements?
• How to suggest the use of inline elements?
• How to validate the use of inline elements?
30. S O U R C E S
• S. Sire, C. Vanoirbeek, V. Quint, C. Roisin: Authoring XML all the Time,
Everywhere and by Everyone
• Krug S.: Don’t make me think
• Norman D. : The Design of Everyday Things
• Rockley A. und Cooper C. : Managing Entreprise Content
• http://www.fontoxml.com
• http://www.xeditor.com/
• http://holloway.co.nz/doctored/
• http://xopus.com/demo/rich-text
• http://www.oxygenxml.com/webapp/
• http://habitat.inkling.com
• https://ssire.github.com/xtiger-xml-spec/
• http://www.w3.org/TR/xforms20