3. Introduction
•
•
•
•
•
XML: Extensible Markup Language
Defined by the WWW Consortium (W3C)
Derived from SGML (Standard Generalized Markup Language),
but simpler to use than SGML
Documents have tags giving extra information about sections of
the document
– E.g. <title> XML </title> <slide> Introduction …</slide>
Extensible, unlike HTML
– Users can add new tags, and s e p a ra te ly specify how the tag
should be handled for display
4. Con…..
•
The ability to specify new tags, and to create nested tag structures
make XML a great way to exchange data, not just documents.
– Much of the use of XML has been in data exchange applications, not
as a replacement for HTML
•
Tags make data (relatively) self-documenting
– E.g.
<bank>
<account>
<account_number> A-101 </account_number>
<branch_name>
Downtown </branch_name>
<balance>
500
</balance>
</account>
<depositor>
<account_number> A-101 </account_number>
<customer_name> Johnson </customer_name>
</depositor>
</bank>
5. XML: Motivation
•
•
•
Data interchange is critical in today’s networked world
– Examples:
• Banking: funds transfer
• Order processing (especially inter-company orders)
• Scientific data
– Chemistry: ChemML, …
– Genetics: BSML (Bio-Sequence Markup Language),
…
– Paper flow of information between organizations is being
replaced by electronic flow of information
Each application area has its own set of standards for representing
information
XML has become the basis for all new generation data interchange
formats
6. Structure of XML Data
•
•
•
•
Tag: label for a section of data
Element: section of data beginning with <ta g na m e > and ending
with matching </ta g na m e >
Elements must be properly nested
– Proper nesting
• <account> … <balance> …. </balance> </account>
– Improper nesting
• <account> … <balance> …. </account> </balance>
– Formally: every start tag must have a unique matching end
tag, that is in the context of the same parent element.
Every document must have a single top-level element
7. Cont…..
•
Mixture of text with sub-elements is legal in XML.
– Example:
<account>
This account is seldom used any more.
<account_number> A-102</account_number>
<branch_name> Perryridge</branch_name>
<balance>400 </balance>
</account>
– Useful for document markup, but discouraged for data
representation
8. Attributes
•
•
•
Elements can have attributes
<account acct-type = “checking” >
<account_number> A-102 </account_number>
<branch_name> Perryridge </branch_name>
<balance> 400 </balance>
</account>
Attributes are specified by na m e = va lue pairs inside the starting tag
of an element
An element may have several attributes, but each attribute name
can only occur once
<account acct-type = “checking” monthlyfee=“5”>
9. Complex Example of XML
<States>
<State>
<Name>NewJersy</Name>
<Number>1</Number>
</State>
<State>
<Name>NewYork</Name>
<Number>2<Number>
</State>
<Gov>
<Location>Washington</Location>
</Gov>
</States>
10. Tree Representation of XML
S ta te s
S ta te
S ta te
G ov
Nam e
N um ber
Nam e
N um ber
L o c a t io n
N e w J e rs y
1
N e w Y o rk
2
W a s h in g t o n
11. XML Document Schema
•
•
•
•
Database schemas constrain what information can be stored, and
the data types of stored values
XML documents are not required to have an associated schema
However, schemas are very important for XML data exchange
– Otherwise, a site cannot automatically interpret data received
from another site
Two mechanisms for specifying XML schema
– Document Type Definition (DTD)
• Widely used
– XML Schema
• Newer, increasing use
12. Document Type Definition (DTD)
•
•
•
•
The type of an XML document can be specified using a DTD
DTD constraints structure of XML data
– What elements can occur
– What attributes can/must an element have
– What subelements can/must occur inside each element, and
how many times.
DTD does not constrain data types
– All values represented as strings in XML
DTD syntax
– <!ELEMENT element (subelements-specification) >
– <!ATTLIST element (attributes) >
13. Importance of XML
•
•
•
•
Extensible Markup Language (XML) is fast emerging as the
dominant standard for representing data on the Internet.
Most organizations use XML as a data communication
standard.
All commercial development frameworks are XML oriented
(.NET, Java).
All modern web systems architecture is designed based on
XML.
14. XML Querying
•
XPath
– An XPath expression returns a collection of element nodes that
satisfy certain patterns specified in the expression.
– The names in the XPath expression are node names in the
XML document tree that are either tag (element) names or
attribute names, possibly with additional qualifier conditions to
further restrict the nodes that satisfy the pattern
15. XPath
– There are two main separators when specifying a
path:
• single slash (/) and double slash (//)
– A single slash before a tag specifies that the tag must
appear as a direct child of the previous (parent) tag,
whereas a double slash specifies that the tag can
appear as a descendant of the previous tag at any
level.
– It is customary to include the file name in any
XPath query allowing us to specify any local file
name or path name that specifies the path.
– doc(www.company.com/info.XML)/company =>
COMPANY XML doc
16. XQuery
– XQuery uses XPath expressions, but has
additional constructs.
– XQuery permits the specification of more general
queries on one or more XML documents.
– The typical form of a query in XQuery is known as
a FLW expression, which stands for the four main
R
clauses of XQuery and has the following form:
• FOR <variable bindings to individual nodes
(elements)>
• LET <variable bindings to collections of nodes
(elements)>
• W
HERE <qualifier conditions>
• RETURN <query result specification>
17. Storage of XML Data
•
XML data can be stored in
– Non-relational data stores
• Flat files
– Natural for storing XML
• XML database
– Database built specifically for storing XML data,
supporting DOM model and declarative querying
– Currently no commercial-grade systems
– Relational databases
• Data must be translated into relational form
• Advantage: mature database systems
• Disadvantages: overhead of translating data and queries
18. Applications of XML
•
•
•
•
•
•
XML has a variety of uses for Web, e-business, and
portable applications.
The following are some of the many applications for which
XML is useful:
W publishing: XML allows you to create interactive pages
eb
.
Metadata applications: XML makes it easier to express
metadata
W searching and automating W tasks: XML defines the
eb
eb
type of information contained in a document, making it
easier to return useful results when searching the Web: in a
portable, reusable format.
HTML , WML . .. etc
19. Advantages of XML
•
•
•
•
XML is an open standard.
It is human readable and not cryptic like a machine
language.
XML processing is easy.
It can be used to integrate complex web based
systems(using XML as communication).
20. Conclusion
•
XML is more than just a text format for describing
documents. It is a mechanism for describing structured
and semi-structured data, which provides access to a
rich family of technologies for processing such data.
Powerful abstractions like the XML Information Set open
the door to processing non-textual data such as file
systems, the Windows® registry, relational databases
and even programming language objects using XML
technologies. XML brings us one step closer to
universal data access.
21. REFRENCES
-"XML Media Types, RFC 3023“
• http://web.archive.org/web/20110514120305/http://divei
ntomark.org/archives/2004/01/08/postels-law
•
^ "XML and Semantic Web W3C Standards Timeline“
•
^ "Extensible Markup Language (XML) 1.0 (Fifth
Edition)“
•
^ "W3C I18N FAQ: HTML, XHTML, XML and Control
Codes"