2. XML
• Extensible Markup Language
• Designed to describe data and to focus on what the data is.
• Used to structure, store, and send information.
• Easy to understand and self-describing.
• XML is derived from the Standard Generalized Markup Language (SGML).
• Documents have tags giving extra information about
sections of the document
– E.g. <title> XML </title> <slide> Introduction …</slide>
• Extensible, unlike HTML
– Users can add new tags and separately specify how each
tag should be handled for display
3. Types of XML databases
There are two major types of XML databases:
• XML-enabled: These map all XML to a
traditional database, accepting XML as input
and rendering XML as output.
• Native XML (NXD): The internal model
depends on XML and uses XML documents as
the fundamental unit of storage.
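The XML-enabled approach can be sketched in a few lines. This is a minimal illustration, not how any particular product works: it assumes an in-memory SQLite table named book, and the helper names shred and publish and the sample data are invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Illustrative input; the second book is made-up sample data.
XML_IN = """<library>
  <book id="bk101"><title>XML Developer's Guide</title><price>44.95</price></book>
  <book id="bk102"><title>Sample Book B</title><price>5.95</price></book>
</library>"""


def shred(conn, xml_text):
    """Accept XML as input: map each <book> element to one table row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS book (id TEXT PRIMARY KEY, title TEXT, price REAL)"
    )
    for book in ET.fromstring(xml_text).findall("book"):
        conn.execute(
            "INSERT INTO book VALUES (?, ?, ?)",
            (book.get("id"), book.findtext("title"), float(book.findtext("price"))),
        )


def publish(conn):
    """Render XML as output: rebuild a document from the table rows."""
    root = ET.Element("library")
    rows = conn.execute("SELECT id, title, price FROM book ORDER BY id")
    for book_id, title, price in rows:
        book = ET.SubElement(root, "book", id=book_id)
        ET.SubElement(book, "title").text = title
        ET.SubElement(book, "price").text = str(price)
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
shred(conn, XML_IN)
XML_OUT = publish(conn)
print(XML_OUT)
```

A native XML database would instead store and index the document itself, without the intermediate table.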
4. XML document rules
• A “well-formed” XML document must follow the XML
syntax rules:
– It should start with an XML declaration that indicates
the version of XML being used, as well as other
relevant attributes such as the encoding (the declaration
is optional in XML 1.0 but recommended).
– It must have exactly one root element.
– Every element must have a closing tag (or be written as
an empty-element tag such as <br/>).
– Tag names are case sensitive.
– Elements must be properly nested.
– Attribute values must be quoted.
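One way to see these rules in action is to hand small documents to a parser, which rejects anything that is not well formed. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Each sample breaks (at most) one of the rules listed above.
samples = {
    "ok": '<?xml version="1.0"?><note id="1"><to>Ann</to></note>',
    "missing closing tag": "<note><to>Ann</note>",
    "case mismatch": "<note><To>Ann</to></note>",
    "bad nesting": "<a><b></a></b>",
    "unquoted attribute": "<note id=1></note>",
    "two root elements": "<a/><b/>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample is accepted. Note that well-formedness says nothing about whether the document matches a schema.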
5. Structure of XML Data
<?xml version="1.0"?>
<library xmlns:mevlana="http://mevlana.edu.tr">
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
  </book>
</library>
<!-- comments -->
Parts labeled on the slide: XML document, node, element, start tag, end tag, attribute, namespace, comment.
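As a rough illustration of how those parts appear to a program, the sketch below parses the same library document with Python's standard xml.etree.ElementTree module. The variable names are arbitrary; the document text is the one from the slide, with straight quotes.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<library xmlns:mevlana="http://mevlana.edu.tr">
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
  </book>
</library>
<!-- comments -->"""

root = ET.fromstring(DOC)          # the root element of the document
book = root.find("book")           # a child element
book_id = book.get("id")           # an attribute value
title = book.find("title").text    # text between a start tag and an end tag
child_tags = [child.tag for child in book]

print(root.tag, book_id, title)
print(child_tags)
```

The mevlana namespace prefix is declared but not used by any element, so the tag names come back unqualified.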
6. Hierarchical Data Model (Tree Structure)
• The basic object in XML is the XML document.
• Two main structuring concepts are used to
construct an XML document:
– Elements: delimited by a start tag and an end tag.
– Attributes: additional information that describes an element.
• In the tree representation:
– Internal nodes: complex elements (elements that contain other elements).
– Leaf nodes: simple elements (elements that contain only a data value).
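The tree view can be made concrete with a short traversal. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name classify and the trimmed-down sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

DOC = (
    "<library><book id='bk101'>"
    "<title>XML Developer's Guide</title><price>44.95</price>"
    "</book></library>"
)


def classify(elem, depth=0, out=None):
    """Walk the tree depth-first and label each element.

    An element with child elements is an internal node (complex element);
    an element without child elements is a leaf node (simple element).
    """
    out = [] if out is None else out
    kind = "internal" if len(elem) > 0 else "leaf"
    out.append((depth, elem.tag, kind))
    for child in elem:
        classify(child, depth + 1, out)
    return out


nodes = classify(ET.fromstring(DOC))
for depth, tag, kind in nodes:
    print("  " * depth + f"{tag} ({kind})")
```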
7. Types of XML documents
• Data-centric XML documents
– Contain many small data items that follow a specific structure
and hence may be extracted from a structured database.
– Are formatted as XML in order to be exchanged or displayed
over the web.
• Document-centric XML documents
– Contain large amounts of text, e.g. a book.
– Have few or no structured data elements.
• Hybrid XML documents
– May contain both structured and unstructured data.
8. XML vs. Relational Database
XML Database
• XML data is hierarchical
• XML data is self-describing
• XML data has inherent ordering
• An XML database contains collections
Relational Database
• Relational data is represented in a model of logical relationships
• Relational data is not self-describing
• Relational data does not have inherent ordering
• A relational database contains tables
11. XML Document Schema
• Database schemas constrain what information
can be stored, and the data types of stored
values
• XML documents are not required to have an
associated schema
• Schemas are very important for XML data
exchange
• Two mechanisms for specifying an XML schema:
– Document Type Definition (DTD)
– XML Schema Definition (XSD)
12. Document Type Definition
• A DTD constrains the structure of XML data:
– What elements can occur
– What attributes an element can or must have
– What subelements can or must occur inside each
element, and how many times
• Limitations
– DTD data types are very limited (not general)
– DTD has its own special syntax and thus requires
special processors
14. XML Schema Definition
• XML Schema is a more sophisticated schema
language that addresses the drawbacks of
DTDs. It supports a range of data types.
• XML Schema is itself specified in XML syntax,
unlike DTDs
• XML Schema is integrated with namespaces
• XML Schema is significantly more complicated
than DTDs
15. Querying XML Data
• Several languages are used to access
XML data in XML documents, including:
– XPath
– XQuery (most popular)
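To give a feel for path-based access, the sketch below runs a few path expressions with Python's standard xml.etree.ElementTree module. That module supports only a limited subset of XPath, so this is an approximation rather than a full XPath engine, and the second book is made-up sample data.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<library>
  <book id="bk101"><title>XML Developer's Guide</title><genre>Computer</genre></book>
  <book id="bk102"><title>Sample Book B</title><genre>Fantasy</genre></book>
</library>""")

# All title elements that are children of a book child of the root.
all_titles = [t.text for t in root.findall("./book/title")]

# Title of the book whose id attribute equals 'bk102'.
title_by_id = root.find(".//book[@id='bk102']/title").text

# Ids of books whose genre child has the text 'Computer'.
computer_ids = [b.get("id") for b in root.findall("book[genre='Computer']")]

print(all_titles)
print(title_by_id)
print(computer_ids)
```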
17. XQuery
• XQuery is a general purpose query language for XML data
• XQuery is built on XPath expressions
• XQuery is derived from the Quilt query language, which itself
borrows from SQL
• XQuery is supported by the major database vendors (IBM, Oracle,
Microsoft, etc.)
• XQuery uses FLWOR expressions (for, let, where, order by, return)
– for: corresponds to SQL from
– where: corresponds to SQL where
– order by: corresponds to SQL order by
– return: corresponds to SQL select
– let: allows temporary variables
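The clause-by-clause correspondence can be mimicked in Python. The sketch below is only an analogy, not XQuery itself: an XQuery expression is shown in a comment, and each clause (for, let, where, order by, return) is imitated with a standard-library step. The file name library.xml in the comment and the sample books are hypothetical.

```python
import xml.etree.ElementTree as ET

# The XQuery being imitated (library.xml is a hypothetical file name):
#   for $b in doc("library.xml")/library/book
#   let $p := xs:decimal($b/price)
#   where $p < 40
#   order by $p
#   return $b/title

root = ET.fromstring("""<library>
  <book id="bk101"><title>XML Developer's Guide</title><price>44.95</price></book>
  <book id="bk102"><title>Sample Book B</title><price>29.50</price></book>
  <book id="bk103"><title>Sample Book C</title><price>5.95</price></book>
</library>""")

books = root.findall("book")                                # for
priced = [(b, float(b.findtext("price"))) for b in books]   # let
cheap = [(b, p) for b, p in priced if p < 40]               # where
cheap.sort(key=lambda pair: pair[1])                        # order by
titles = [b.findtext("title") for b, _ in cheap]            # return

print(titles)
```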
19. Benefits of XML
• An XML document is text-based
– Plain text is portable across platforms and easy to transmit
• One XML document can be displayed differently on
different media.
• Parts of an XML document can be reused.
• XML is easy to understand.
20. Drawbacks of XML
• XML is case sensitive
• XML syntax is redundant and verbose compared with a binary
representation of the same data.
• There are no predefined tags; users must define their own
• Linking between XML documents requires XLink,
which is complex compared with HTML hyperlinks.