Slides for the paper "Higher Order Applicative XML", given at the Workshop on Radical Innovations of Software and Systems Engineering in the Future, Venice, Italy, October 2002. Published in Springer LNCS 2941, pages 91-107. The Springer URL is http://link.springer.com/chapter/10.1007%2F978-3-540-24626-8_6, with DOI 10.1007/978-3-540-24626-8_6 . A preprint is available at http://www.academia.edu/1413571/Higher_order_applicative_XML_documents .
1. Venice, 7-11 Oct 2002 Monterey Workshop 2002 1
Higher Order Applicative
XML
Carlos Delgado Kloos
with P.T. Breuer, V. Luque, L. Sánchez
Universidad Carlos III de
Madrid
www.it.uc3m.es
2.
What is the most successful
format to represent data?
XML
3.
XML represents structure
XML can represent
hierarchical information
The elements of the hierarchy are
not predefined
XML lets one invent languages with
multiple kinds of brackets
{ []() }
4.
Structure: Hierarchy
"The structure of concepts is formally called
a hierarchy and since ancient times has
been a basic structure for all western
knowledge. Kingdoms, empires, churches,
armies have all been structured into
hierarchies. Tables of contents of reference
material are so structured, mechanical
assemblies, computer software, all
scientific and technical knowledge is so
structured..."
-- Robert M. Pirsig:
Zen and the Art of Motorcycle Maintenance
5.
Areas of application
Accounting
Marketing
Business
Education
Communication
Banking
Automotive
Insurance
Human resources
Health
ERP
Chemistry
Mathematics
News
Law
Workflow
Software
Tourism
6.
Success of XML
XML has had a lot of success, much
more than its authors could have expected
Not just for the reason they expected
Separation of form and content
But for a reason they had not thought of
Data has to travel through the net
The tree structure is a format useful
for any kind of data
XML is used as a data transfer mechanism
7.
What is XML?
"XML is ASCII for the 21st
century."
-- Henry S. Thompson,
U Edinburgh & W3C
8.
HTML and JavaScript
<HTML>
<HEAD><TITLE>JavaScript</TITLE></HEAD>
<BODY>
Text<P>
i=1<BR>
i=2<BR>
i=3<BR>
</BODY>
</HTML>
9.
HTML and JavaScript
<HTML>
<HEAD><TITLE>JavaScript</TITLE></HEAD>
<BODY>
Text<P>
<SCRIPT LANGUAGE="JavaScript"> <!--
for (i=1; i<=3; ++i)
document.write("i=" + i + "<BR>");
// -->
</SCRIPT>
</BODY>
</HTML>
10.
Objective
Extend XML to "higher order texts"
do not add to it
do not change it
do not write a separate language
do reinterpret XML semantics in a larger
universe
do conserve the initial semantics
do anything that is natural in a categorical
sense
11.
The problem with XML
XML can express data
basic types and free data types
XML cannot express function
a separate language is used to
traverse XML data
Advantage or disadvantage?
12.
An idea: Syntax
Let
<f> a </f>
mean "apply function f to argument
a"
13.
An idea: Semantics
Let XML documents take arguments
An abstract document f is
function :: [Doc] -> [Doc]
A simple document s is a string or other
basic type
string, integer :: Doc
A virtual document is calculated, sometimes
trivially so
apply function f to argument a
In plain XML documents every f is a free datatype
constructor
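The model on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `Elem`, `free_constructor`, `apply_tag` and the environment `env` are hypothetical and not taken from the paper, and a document sequence `[Doc]` is modeled as a Python list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass(frozen=True)
class Elem:
    """An ordinary XML element: a tag used as a free datatype constructor."""
    tag: str
    children: tuple


Doc = Union[str, int, Elem]             # simple documents are strings or other basic types
Fn = Callable[[List[Doc]], List[Doc]]   # an abstract document: [Doc] -> [Doc]


def free_constructor(tag: str) -> Fn:
    """Default meaning of a tag: wrap the arguments and compute nothing."""
    return lambda args: [Elem(tag, tuple(args))]


def apply_tag(env: Dict[str, Fn], tag: str, args: List[Doc]) -> List[Doc]:
    """<tag> args </tag>: apply the function bound to tag, else the free constructor."""
    fn = env.get(tag, free_constructor(tag))
    return fn(args)


# Hypothetical definition: dbl repeats its argument sequence.
env = {"dbl": lambda xs: xs + xs}

plain = apply_tag(env, "a", ["hello"])      # the trivial calculation: stays a tree
computed = apply_tag(env, "dbl", ["bye"])   # reduced by the definition of dbl
```

Here `plain` is the trivially calculated case, where the tag is a free constructor, and `computed` is the case where the tag has a definition.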
16.
Language Syntax: Basics
Juxtaposition is concatenation of
lists
<a>hello</a>
is both of type a and of type a*
<a>hello</a> <a>there</a>
is of type a*
17.
Language Syntax: Basics
Function application is via tags,
but functions are by default the
free datatype constructor
<a>hello</a> is of type a
Function definition needs a special
metatag
<def name=a var=x>
<def.val> x x </def.val> ... </def>
18.
Document Layout
Header section (DTD)
types of functions and other tags
Definition section
semantics of functions
Text section (document)
XML or HOAX document interpreted
according to definitions given
19.
Example
<!DOCTYPE example [
<!ELEMENT example (ANY*)>
<!ELEMENT ANY* dbl (ANY*)>
]>
<example>
<def name="dbl" var="x">
<def.val> x x </def.val>
<dbl><dbl>bye</dbl></dbl>
</def>
</example>
reduces to:
<example>
byebyebyebye
</example>
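The reduction in this example can be replayed with a toy evaluator. The sketch below is not the authors' prototype: it assumes a single variable per definition, a `def.val` body made only of whitespace-separated words, and strict evaluation of arguments. The helper names `content`, `evaluate` and `render` are hypothetical. The DOCTYPE header is left out on the assumption that a standard XML parser would not accept its result-type syntax.

```python
import xml.etree.ElementTree as ET

SOURCE = """<example>
<def name="dbl" var="x">
<def.val> x x </def.val>
<dbl><dbl>bye</dbl></dbl>
</def>
</example>"""


def content(node):
    """Yield the non-blank text pieces and child elements of node in order."""
    if node.text and node.text.strip():
        yield node.text.strip()
    for child in node:
        yield child
        if child.tail and child.tail.strip():
            yield child.tail.strip()


def evaluate(items, env):
    """Reduce items to a flat list of strings and (tag, children) pairs."""
    out = []
    for item in items:
        if isinstance(item, str):
            out.append(item)
        elif item.tag == "def":
            var = item.get("var")
            val = item.find("def.val").text.split()

            def fn(args, val=val, var=var):
                # substitute the argument for each occurrence of the variable
                result = []
                for word in val:
                    result.extend(args if word == var else [word])
                return result

            # the definition is visible only inside the <def> element
            scope = [c for c in content(item)
                     if isinstance(c, str) or c.tag != "def.val"]
            out.extend(evaluate(scope, {**env, item.get("name"): fn}))
        elif item.tag in env:
            out.extend(env[item.tag](evaluate(content(item), env)))
        else:
            # free datatype constructor: keep the tag around the reduced content
            out.append((item.tag, evaluate(content(item), env)))
    return out


def render(results):
    """Serialize the reduced document back to text."""
    return "".join(r if isinstance(r, str)
                   else f"<{r[0]}>{render(r[1])}</{r[0]}>" for r in results)


reduced = render(evaluate([ET.fromstring(SOURCE)], {}))
print(reduced)  # <example>byebyebyebye</example>
```

The `example` tag has no definition, so it survives as a free constructor, while both `dbl` tags are reduced away.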
20.
Definition of XML
application
DTD or XML Schema can be used
DTD is used in paper for brevity
DTD normally specifies syntax
in HOAX
syntax = type
type = parser
there are more types than in XML
HOAX function types are not free type
declarations
a "result type" precedes the function name
DTD declarations are local in HOAX
21.
Meta Tags
HOAX introduces 4 (really 2)
special tags
<def> for binding a name to a value
<var> for expressing abstraction
<eval> for application of a function to
its arguments
<ref> for dereferencing a variable
22.
Example: notes
Scope of a definition is strictly defined
<def ...> ... </def>
A definition has three parts
<def name=... var=... val=...> ...
Attributes can be made into tags
<def name= ... var=...>
<def.val> ... </def.val>
Type definitions use elements and C
types
<!DOCTYPE [<!ELEMENT res fun (arg)>]>
23.
Meta Tags
only <def> is common
<var> normally appears within a
<def>
<eval> and <ref> are unusual.
<def name="a" var="b" val="c">…</def>
means
<def name="a">
<def.val><var name="b">c</var>
</def.val>…
</def>
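The rewriting of the attribute form into the nested form can be sketched as a small tree transformation. This is an illustrative sketch only: the function name `desugar` is hypothetical, and the value `c` is assumed to be plain text.

```python
import xml.etree.ElementTree as ET


def desugar(def_elem):
    """Rewrite <def name var val>...</def> into the nested <def.val><var> form."""
    out = ET.Element("def", {"name": def_elem.get("name")})
    def_val = ET.SubElement(out, "def.val")
    var = ET.SubElement(def_val, "var", {"name": def_elem.get("var")})
    var.text = def_elem.get("val")
    # the scope of the definition (the original content) follows </def.val>
    def_val.tail = def_elem.text
    out.extend(list(def_elem))
    return out


sugared = ET.fromstring('<def name="a" var="b" val="c">rest</def>')
plain = ET.tostring(desugar(sugared), encoding="unicode")
print(plain)
# <def name="a"><def.val><var name="b">c</var></def.val>rest</def>
```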
24.
BNF of HOAX
HOAX ::= <def name=t>
<def.val>xs</def.val> xs' </def>
| <var name=t> xs </var>
| <eval name=t> xs </eval>
| <t> xs </t>
| <ref name=t> | t
where xs is a sequence of HOAX
documents
Function names may be applied as tags.
Simple variable names don't need ref tags.
25.
Types
The type system has to be
adjusted to XML
certain things are indistinguishable,
and hence have the same type
a string is indistinguishable from a
singleton list of strings
a sequence of strings is indistinguishable
from a string
a list of a's followed by a list of a's is
indistinguishable from a list of a's, in
general
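One way to picture these identifications is a normal form in which indistinguishable documents compare equal. The sketch below is an assumption made for illustration, not the paper's construction: `normalize` is a hypothetical helper, sequences are Python lists, and elements are stood in for by tuples.

```python
def normalize(doc):
    """Return a flat list in which adjacent strings are merged into one."""
    items = doc if isinstance(doc, list) else [doc]   # a string is a singleton list
    flat = []
    for item in items:
        parts = normalize(item) if isinstance(item, list) else [item]
        for part in parts:
            if isinstance(part, str) and flat and isinstance(flat[-1], str):
                flat[-1] += part                      # a sequence of strings is a string
            else:
                flat.append(part)
    return flat


a = ("a",)  # stand-in for an element tagged a

same_string = normalize("hello") == normalize(["hello"]) == normalize(["hel", "lo"])
same_list = normalize([[a, a], [a]]) == normalize([a, a, a])
```

Both `same_string` and `same_list` are true under this normal form, mirroring the three identifications listed on the slide.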
26.
Types
Solved by using the parsers of XML
as types
have inclusions
a | a* means "the parser of a*'s will
parse a"
type equality is not identity, but it does
not matter
27.
HOAX types are
parsers of HOAX texts
ambiguous parsers
alternative parsers p | q produce all
possible outcomes
a sequence of parsers p q partitions the
input in all possible ways, produces
all possible results for each part, then
resequences the results
p* = p p* | ε
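The ambiguous parsers described here can be sketched as "list of successes" combinators. This is a minimal sketch under assumptions: the input is a list of tag names rather than real HOAX text, every parser must consume its whole input, and the names `item`, `empty`, `alt`, `seq` and `star` are hypothetical. In `star`, p is forced to consume at least one item so that the recursion p* = p p* | ε terminates.

```python
from typing import Callable, List

Parser = Callable[[list], List[list]]   # input sequence -> every possible result


def item(tag: str) -> Parser:
    """Accept exactly one item carrying the given tag."""
    return lambda xs: [[xs[0]]] if len(xs) == 1 and xs[0] == tag else []


def empty(xs: list) -> List[list]:
    """The parser ε: accepts only the empty sequence."""
    return [[]] if not xs else []


def alt(p: Parser, q: Parser) -> Parser:
    """p | q: collect the outcomes of both alternatives."""
    return lambda xs: p(xs) + q(xs)


def seq(p: Parser, q: Parser, min_first: int = 0) -> Parser:
    """p q: try every partition of the input, then resequence the results."""
    def run(xs):
        return [left + right
                for i in range(min_first, len(xs) + 1)
                for left in p(xs[:i])
                for right in q(xs[i:])]
    return run


def star(p: Parser) -> Parser:
    """p* = p p* | ε, with p consuming at least one item per step."""
    def run(xs):
        return alt(seq(p, run, min_first=1), empty)(xs)
    return run


a = item("a")
a_star = star(a)

single = a_star(["a"])                       # the parser of a*'s will parse a
ambiguous = seq(a_star, a_star)(["a", "a"])  # one outcome per way of splitting
```

`ambiguous` holds three identical results, one for each partition of the input, which illustrates why a list of a's followed by a list of a's cannot be told apart from a single list of a's.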
28.
Algebra of types
The model gives rise to inclusions and
equalities
p ⊆ p*, p p* ⊆ p*, ...
Type equality is difficult to decide, but
type satisfaction is easy
each type is precisely a mechanism for
checking type satisfaction
each type is a mechanism for evaluating a
text
each document text has a "canonical type"
calculated from its presentation.
29.
Canonical types of
document texts
"s" :: #PCDATA
<t> xs </t> :: t
if the construction meets the declared type
constraints for the tag t.
<f> xs </f> :: r
where r is the result type declared for f, provided
xs :: a and a is the declared argument type for f.
<def name="a" val="b"> c </def> :: t
where t is the type calculated for c given the
hypothesis that the type of a, wherever it appears
in c, is the same as the type calculated for b.
30.
HOAX semantics
Two HOAX documents can look distinct but
mean the same thing
The semantics is defined by "how to parse"
instructions
The instructions are predicated on a type
"how to parse x using type t"
if t is the result type of a function tag f that takes
argument type a, then <f> x </f> is parsed with t by
first parsing x with a, then applying the definition of
function f to the results of that parse, then parsing
the outcome with t.
31.
Can prove ...
That each document has a unique
parse semantics
means that the semantics is
well-defined
That the semantics of functional
application is substitution
means that the parse of a document
with references is the parse of the
document with the references
replaced by the text to which they
refer
32.
Advantages
HOAX puts the semantics back into
XML
not only functions were missing, but
reductions of any form to any other form
Do not need a second "transformation
language"
HOAX documents are self-transforming in
situ
Clearly arbitrarily higher order
functionality
"abstract documents are first-class
33.
Disadvantages
Might prefer to separate form and
function
There is no mechanism for saying
where to perform a reduction -
here or there
What happens in the case of
unavailable data?
Or of an unavailable function?
34.
Open issues
Reflexivity
Can HOAX be expressed entirely within
HOAX?
Probably, but meaningfully? Usefully?
Circular reference trails
HOAX is careful to apply definitions only in a
well-defined scope.
External references drag in their definitions,
but with scope localized to the reference.
Do mutually dependent external definitions
resolve?
35.
Conclusion
Experiment in extending XML to
support functional semantics
Prototyped
Proposed approach to semantics of
XML
use types=parsers, since XML is pure
syntax
use interpreter=parser, since XML is pure
syntax
ergo type=interpreter=parser
36.
Conclusion
By giving an interpretation to
tags, one can easily reduce
documents to others (without
XSLT!)
Exploratory work