2. XML is a data description language.
It is a flexible way to create common information
formats.
It also makes it possible to share both the format and the data
on the World Wide Web, intranets, and elsewhere.
Computer users might agree on a standard or
common way to describe a piece of information
using a platform-independent format such as XML.
Such a standard way of describing data would
enable users to exchange the data, together with its description,
between any types of platform.
3. XML → Extensible Markup Language
Just a text file, with tags, attributes, free text...
looks much like HTML
Used to create special-purpose markup languages
Facilitates sharing of structured text and
information across the Internet.
XML Structure facilitates parsing of data.
XML tags are not pre-defined, you must define
your own tags.
Two applications still have to agree on the meaning
of the "descriptive tags".
4. [Figure: the markup language family: SGML, HTML, XML, MML, CML]
XML for Developers -
Version 1a 4
5. HTML (Example1.html)
<H1>Cars for Sale</H1>
<h3>Audi 80</h3>
1800 cc<BR>
Blue<BR>
Manual<BR>
1988<BR>
$1250
<P>
<H3>Toyota Corolla</h3>
1250 cc<BR>
Red<BR>
Automatic<BR>
1984<BR>
Red<BR>
$940<BR>
XML (Example1.xml)
<for_sale>
<heading>Cars for Sale</heading>
<make>Audi 80</make>
<engine>1800 cc</engine>
<color>Blue</color>
<transmission>Manual</transmission>
<year>1988</year>
<price>$1250</price>
<make>Toyota Corolla</make>
<engine>1250</engine>
<color>Red</color>
<transmission>Automatic</transmission>
<year>1984</year>
<price>$940</price>
</for_sale>
6. HTML XML in IE5 browser
Cars for Sale
Audi 80
1800 cc
Blue
Manual
1988
$1250
Toyota Corolla
1250 cc
Red
Automatic
1984
Red
$940
7. XML                                  HTML
It is a free & extensible language.     It is a language derived from SGML.
Tags are user-defined.                  Tags are pre-defined.
It is about describing information.     It is about displaying information.
Extensible set of tags.                 Fixed set of tags.
Content oriented.                       Presentation oriented.
Standard data infrastructure.           No data validation capabilities.
Allows multiple output forms.           Single presentation.
8. Many extensible languages are defined using XML:
◦ Wireless Markup Language (WML)
◦ Chemical Markup Language (CML)
◦ Bioinformatic Sequence Markup Language (BSML)
◦ Mathematical Markup Language (MathML)
◦ Open Office Markup Language (OOML)
XML is directly usable over the Internet: any platform and any
protocol can carry and understand it.
XML documents are plain text, so they are easy to write and easy for
programs to transport.
It can be used alongside SGML and other
web technologies without conflict.
It is clearly understandable by humans.
Accurate validation is possible with the help of a DTD or
schema.
We can define the structure of the data along with its description.
Semi-structured databases are also defined using XML.
9. Web publishing : XML allows the customer to customize
web pages. With XML, you store the data once and then
render that content for different viewers.
Web searching and automating Web tasks: XML defines
the type of information contained in a document, making
it easier to return useful results when searching the Web.
General applications: XML provides a standard method to
use, store, transmit, and display data for all kinds of
applications and devices.
e-business applications: XML enables electronic
data interchange (EDI) for both business-to-business
and business-to-consumer transactions.
Metadata applications: XML makes it easier to express
metadata in a portable, reusable format.
Pervasive computing: XML provides portable and
structured information types for display on pervasive
(wireless) computing devices such as personal digital
assistants (PDAs) and cellular phones.
10. DTD (Document Type Definition) and XML Schemas are
used to define legal XML tags and their attributes for
particular purposes
CSS (Cascading Style Sheets) describe how to display
HTML or XML in a browser
XSLT (eXtensible Stylesheet Language Transformations)
and XPath are used to translate from one form of XML
to another
DOM (Document Object Model), SAX (Simple API for
XML), and JAXP (Java API for XML Processing) are all APIs
for XML parsing
11. XML documents use a self-describing and simple
syntax.
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to> Ramesh </to>
<from> Kiran</from>
<heading> Reminder </heading>
<body> Don’t forget me this weekend </body>
</note>
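As a minimal sketch of what "self-describing" means for a program, the note document can be read field by field. This assumes Python's standard xml.etree.ElementTree module, which the slides do not mention; the names NOTE_XML and note are illustrative only, and the XML declaration is left out because the text is already a Python string.

```python
import xml.etree.ElementTree as ET

# The note document from the slide, held as a string for illustration.
NOTE_XML = """<note>
<to> Ramesh </to>
<from> Kiran</from>
<heading> Reminder </heading>
<body> Don't forget me this weekend </body>
</note>"""

root = ET.fromstring(NOTE_XML)

# The tag names describe the data, so fields can be looked up by name.
note = {child.tag: child.text.strip() for child in root}

print(root.tag)      # note
print(note["to"])    # Ramesh
print(note["from"])  # Kiran
```

The program needs no knowledge of field positions; it relies only on the tag names chosen by the document's author.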
12. XML Syntax consists of
◦ XML Declaration
◦ XML Elements
◦ XML Attributes
The first line of an XML document should always
consist of an XML declaration defining the version of
XML.
An XML document may start with one or more
processing instructions (PIs) or directives:
<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="ss.css"?>
Following the directives, there must be exactly one
root element containing all the rest of the XML:
<weatherReport>
...
</weatherReport>
13. XML elements have relationships
Elements can have different content types
Element Naming Rules:
1) Names contain letters, numbers & other characters.
2) Names must not start with a number or
punctuation marks.
3) Names must not start with the letters xml.
4) Names cannot contain spaces.
Names (as used for tags and attributes) must begin
with a letter or underscore, and can consist of:
◦ Letters, both Roman (English) and foreign
◦ Digits, both Roman and foreign
. (dot)
- (hyphen)
_ (underscore)
15. Attributes and elements are somewhat interchangeable
Example using just elements:
<name>
<first>David</first>
<last>Matuszek</last>
</name>
Example using attributes:
<name first="David" last="Matuszek"></name>
You will find that elements are easier to use in your
programs--this is a good reason to prefer them
Attributes often contain metadata, such as unique IDs
Generally speaking, browsers display only elements
(values enclosed by tags), not tags and attributes
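The two forms can be compared from a program's point of view. The following is a small sketch, assuming Python's standard xml.etree.ElementTree module (not part of the slides); the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

as_elements = ET.fromstring(
    "<name><first>David</first><last>Matuszek</last></name>"
)
as_attributes = ET.fromstring('<name first="David" last="Matuszek"></name>')

# Child elements are reached by tag name ...
first_from_element = as_elements.findtext("first")
# ... while attributes are reached through the element's attribute map.
first_from_attribute = as_attributes.get("first")

print(first_from_element, first_from_attribute)
```

Both forms carry the same data; the element form can later grow children of its own, while an attribute value stays a single string.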
16. Elements can contain multiple values,
but attributes cannot
Attributes are not expandable
Elements can describe structure, but
attributes cannot
Attributes are more difficult to manipulate by
program code than elements
Attribute values are difficult to validate against
a DTD
17. <novel>
<foreword>
<paragraph> This is the great American novel.</paragraph>
</foreword>
<chapter number="1">
<paragraph>It was a dark and stormy night.</paragraph>
<paragraph>Suddenly, a shot rang out!</paragraph>
</chapter>
</novel>
novel
  foreword
    paragraph: This is the great American novel.
  chapter (number="1")
    paragraph: It was a dark and stormy night.
    paragraph: Suddenly, a shot rang out!
18. Every element must have both a start tag and an end
tag, e.g. <name> ... </name>
◦ But empty elements can be abbreviated: <break />.
◦ XML tags are case sensitive
◦ XML tags may not begin with the letters xml, in any
combination of cases
Elements must be properly nested,
e.g. not <b><i>bold and italic</b></i>
Every XML document must have one and only one root
element.
The values of attributes must be enclosed in single or
double quotes, e.g. <time unit="days">
Character data cannot contain < or &
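These rules can be checked mechanically, since any conforming parser must reject a document that breaks them. A minimal sketch, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<b><i>bold and italic</i></b>"))  # True
print(is_well_formed("<b><i>bold and italic</b></i>"))  # False: bad nesting
print(is_well_formed("<time unit=days>3</time>"))       # False: unquoted value
print(is_well_formed("<a>1</a><a>2</a>"))               # False: two roots
print(is_well_formed("<Name>x</name>"))                 # False: case mismatch
```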
19. Start with <?xml version="1.0"?>
XML is case sensitive
You must have exactly one root element that
encloses all the rest of the XML
Every element must have a closing tag
Elements must be properly nested
Attribute values must be enclosed in double
or single quotation marks
There are only five pre-declared entities
20. Five special characters must be written as
entities:
&amp; for & (almost always necessary)
&lt; for < (almost always necessary)
&gt; for > (not usually necessary)
&quot; for " (necessary inside double quotes)
&apos; for ' (necessary inside single quotes)
These entities can be used even in places
where they are not absolutely required
These are the only predefined entities in XML
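In practice a program rarely writes these entities by hand. As a sketch, assuming Python's standard xml.sax.saxutils and xml.etree.ElementTree modules (neither appears in the slides), text can be escaped before it is placed in a document and is restored by the parser on the way back in:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "Tom & Jerry say 1 < 2"

# escape() replaces &, < and > with their entities.
escaped = escape(raw)
print(escaped)  # Tom &amp; Jerry say 1 &lt; 2

# The parser turns the entities back into the original characters.
element = ET.fromstring(f"<line>{escaped}</line>")
print(element.text == raw)  # True
```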
21. The XML declaration looks like this:
<?xml version="1.0" encoding="UTF-8"
standalone="yes"?>
◦ The XML declaration is not required by browsers, but is
required by most XML processors (so include it!)
◦ If present, the XML declaration must be first--not even
whitespace should precede it
◦ Note that the brackets are <? and ?>
◦ version="1.0" is required (XML 1.1 also exists, but 1.0 is used almost everywhere)
◦ encoding can be "UTF-8" (ASCII-compatible Unicode) or "UTF-16" (also Unicode), or
something else, or it can be omitted
◦ standalone tells whether the document depends on an external DTD
22. PIs (Processing Instructions) may occur anywhere in
the XML document (but usually first)
A PI is a command to the program processing the
XML document to handle it in a certain way
XML documents are typically processed by more
than one program
Programs that do not recognize a given PI should
just ignore it
General format of a PI: <?target instructions?>
Example: <?xml-stylesheet type="text/css"
href="mySheet.css"?>
23. <!-- This is a comment in both HTML and XML -->
Comments can be put anywhere in an XML document
Comments are useful for:
◦ Explaining the structure of an XML document
◦ Commenting out parts of the XML during development and
testing
Comments are not elements and do not have an end tag
The blanks after <!-- and before --> are optional
The character sequence -- cannot occur in the comment
The closing bracket must be -->
Comments are not displayed by browsers, but can be
seen by anyone who looks at the source code
24. By default, all text inside an XML document is
parsed.
You can force text to be treated as unparsed
character data by enclosing it in <![CDATA[ ... ]]>
Any characters, even & and <, can occur inside a
CDATA
Whitespace inside a CDATA is (usually) preserved
The only real restriction is that the character
sequence ]]> cannot occur inside a CDATA
CDATA is useful when your text has a lot of illegal
characters (for example, if your XML document
contains some HTML text)
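A short sketch of the effect, assuming Python's standard xml.etree.ElementTree module: the markup characters inside the CDATA section reach the program as plain text and no child element is created. The snippet element name is hypothetical.

```python
import xml.etree.ElementTree as ET

doc = "<snippet><![CDATA[<b>Tom & Jerry</b>]]></snippet>"
root = ET.fromstring(doc)

# The CDATA content comes back verbatim, markup characters included.
print(root.text)  # <b>Tom & Jerry</b>

# The <b> inside the CDATA section was not parsed as an element.
print(len(root))  # 0
```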
25. <note>
<to> Ramesh </to>
<from> Kiran</from>
<heading> Reminder </heading>
<body> Don’t forget me this weekend </body>
</note>
<remainder>
<heading> Reminder </heading>
<to> Ramesh </to>
<from> Kiran</from>
<message> Don’t forget me this weekend </message>
</remainder>
26. A DTD specifies the set of rules for
structuring data in an XML file.
It defines the building blocks of an XML document.
Using a DTD we can specify the element types,
their attributes, and the relationships between them.
A DTD constrains the structure of XML data:
◦ What elements can occur.
◦ What attributes an element can or must have.
◦ What subelements can or must occur inside each element, and how
many times.
DTD syntax
◦ <!ELEMENT element (subelements-specification) >
◦ <!ATTLIST element (attributes) >
27. A DTD adds syntactical requirements in
addition to the well-formed requirement
It helps in eliminating errors when creating or
editing XML documents
It clarifies the intended semantics
It simplifies the processing of XML
documents
28. <?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Raj</to>
<from>AEC</from>
<heading>Invitation</heading>
<body>Welcome to Aditya!</body>
</note>
29. !DOCTYPE note defines that the root element of this
document is note
!ELEMENT note defines that the note element contains
four child elements: "to,from,heading,body"
!ELEMENT to defines the to element to be of type
"#PCDATA"
!ELEMENT from defines the from element to be of type
"#PCDATA"
!ELEMENT heading defines the heading element to be
of type "#PCDATA"
!ELEMENT body defines the body element to be of type
"#PCDATA"
30. <person>
<name> K.Vijay Kumar </name>            Exactly one name
<greet> Happy new year </greet>         At most one greeting
<addr>19-12, main road </addr>          As many address lines as needed
<addr> Kakinada </addr>
<tel> 943786254 </tel>                  Mixed telephones and faxes
<fax> 227862544 </fax>
<tel> 227862551 </tel>
<email> vkumar123@gmail.com </email>    As many as needed
</person>
31. name            to specify a name element
greet?          to specify an optional (0 or 1) greet element
name, greet?    to specify a name followed by an optional greet
addr*           to specify 0 or more address lines
tel | fax       a tel or a fax element
(tel | fax)*    0 or more repeats of tel or fax
email*          0 or more email elements
32. So the whole structure of a person entry is
specified by
name, greet?, addr*, (tel | fax)*, email*
This is known as a regular expression
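To make the "regular expression" reading concrete, the content model can be rewritten as an ordinary regular expression over the sequence of child tag names. This is only an illustration of the idea, not how a validating parser is implemented; it assumes Python's standard re and xml.etree.ElementTree modules, and the names PERSON_MODEL and matches_person_model are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# The content model  name, greet?, addr*, (tel | fax)*, email*
# rewritten as a regular expression over space-terminated tag names.
PERSON_MODEL = re.compile(r"name (greet )?(addr )*((tel|fax) )*(email )*")


def matches_person_model(person: ET.Element) -> bool:
    """Check the order and count of a person's children against the model."""
    child_tags = "".join(child.tag + " " for child in person)
    return PERSON_MODEL.fullmatch(child_tags) is not None


ok = ET.fromstring(
    "<person><name>K.Vijay Kumar</name><addr>Kakinada</addr>"
    "<tel>943786254</tel><fax>227862544</fax><tel>227862551</tel>"
    "<email>vkumar123@gmail.com</email></person>"
)
bad = ET.fromstring(
    "<person><greet>Happy new year</greet><name>K.Vijay Kumar</name></person>"
)

print(matches_person_model(ok))   # True
print(matches_person_model(bad))  # False: greet comes before name
```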
33. <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE addressbook [
<!ELEMENT addressbook (person*)>
<!ELEMENT person
(name, greet?, address*, (fax | tel)*, email*)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT greet (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT tel (#PCDATA)>
<!ELEMENT fax (#PCDATA)>
<!ELEMENT email (#PCDATA)>
]>
The name of the DTD is addressbook
The syntax of a DTD is not XML syntax
“Internal” means that the DTD and the
XML Document are in the same file
34. Suffixes:
? optional foreword?
+ one or more chapter+
* zero or more appendix*
Separators
, both, in order foreword?, chapter+
| or section|chapter
Grouping
() grouping (section|chapter)+
35. The syntax is <!ELEMENT name category>
◦ The name is the element name used in start and end
tags
◦ The category may be EMPTY:
In the DTD: <!ELEMENT br EMPTY>
In the XML: <br></br> or just <br />
◦ In the XML, an empty element may not have any
content between the start tag and the end tag
◦ An empty element may (and usually does) have
attributes
36. The syntax is <!ELEMENT name category>
◦ The category may be ANY
This indicates that any content--character data,
elements, even undeclared elements--may be used
Since the whole point of using a DTD is to define the
structure of a document, ANY should be avoided
wherever possible
◦ The category may be (#PCDATA), indicating that only
character data may be used
In the DTD: <!ELEMENT paragraph (#PCDATA)>
In the XML: <paragraph>A shot rang out!</paragraph>
The parentheses are required!
Note: In (#PCDATA), whitespace is kept exactly as entered
Elements may not be used within parsed character data
Entities are character data, and may be used
37. A category may describe one or more children:
<!ELEMENT novel (foreword, chapter+)>
◦ Parentheses are required, even if there is only one child
◦ A space must precede the opening parenthesis
◦ Commas (,) between elements mean that all children must
appear, and must be in the order specified
◦ “|” separators mean that any one of the children may be used
◦ All child elements must themselves be declared
◦ Children may have children
◦ Parentheses can be used for grouping:
<!ELEMENT novel (foreword, (chapter+|section+))>
38. #PCDATA describes elements with only
character data
#PCDATA can be used in an “or” grouping:
◦ <!ELEMENT note (#PCDATA|message)*>
◦ This is called mixed content
◦ Certain (rather severe) restrictions apply:
#PCDATA must be first
The separators must be “|”
The group must be starred (meaning zero or more)
39. The format of an attribute is:
<!ATTLIST element-name
name type requirement
name type requirement>
where the name-type-requirement may be
repeated as many times as desired
◦ Note that only spaces separate the parts, so careful
counting is essential
◦ The element-name tells which element may have these
attributes
◦ The name is the name of the attribute
◦ Each attribute has a type, such as CDATA (character data)
◦ Each attribute may be required, optional, or “fixed”
◦ In the XML, attributes may occur in any order
40. There are ten attribute types
These are the most important ones:
◦ CDATA The value is character data
◦ (man|woman|child) The value is one of enumerated
values
◦ ID The value is a unique identifier
ID values must be legal XML names and must be unique
within the document
◦ NMTOKEN The value is a name token (made of XML name characters only)
This is sometimes used to disallow whitespace in the value
Unlike an XML name, a name token may begin with a digit,
so purely numeric values are allowed
41. IDREF The ID of another element
IDREFS A list of other IDs
NMTOKENS A list of name tokens
ENTITY An entity
ENTITIES A list of entities
NOTATION A notation
xml: A predefined XML value
42. Recall that an attribute has the form
<!ATTLIST element-name name type requirement>
The requirement is one of:
◦ A default value, enclosed in quotes
Example (person is a hypothetical element name): <!ATTLIST person degree CDATA "PhD">
◦ #REQUIRED
The attribute must be present
◦ #IMPLIED
The attribute is optional
◦ #FIXED "value"
The attribute always has the given value
If specified in the XML, the same value must be used
43. Invoice Element Declaration:
<?xml version="1.0" ?>
<!ELEMENT employee (#PCDATA)>
<!-- ATTLIST ElementName AttributeName Type Default -->
<!ATTLIST employee type (FullTime | PartTime) "FullTime">
Usage in XML file:
<?xml version="1.0" ?>
<employee type="PartTime"/>
44. CDATA
◦ CDATA attributes are strings; any text is allowed.
ID
◦ The value of an ID attribute must be a name. All of the ID attribute values used in a
document must be unique. IDs uniquely identify individual elements in a
document. An element can have only a single ID attribute.
IDREF or IDREFS
◦ An IDREF attribute's value must be the value of a single ID attribute on some
element in the document. The value of an IDREFS attribute may contain multiple
IDREF values separated by white space.
ENTITY or ENTITIES
◦ An ENTITY attribute's value must be the name of a single entity. The value of an
ENTITIES attribute may contain multiple entity names separated by white space.
NMTOKEN or NMTOKENS
◦ Name token attributes are a restricted form of string attribute: the value must be a single
word made of name characters, but there are no other restrictions on the word.
List of Names (Enumerated)
◦ You can specify that the value of an attribute must be taken from a specific list
of names. This is frequently called an enumerated type because each of the
possible values must be explicitly enumerated in the declaration.
45. #REQUIRED
◦ The attribute must have an explicitly specified value for every occurrence of the
element in the document.
#IMPLIED
◦ The attribute value is not required and no default value is provided. If a value is not
specified, the XML processor must proceed without one.
“value”
◦ An attribute can be given any legal value as a default. The attribute value is not
required on each element of the document, and if it is not present it will appear to be
the specified default.
#FIXED “value”
◦ An attribute declaration may specify that an attribute has a fixed value. In this case,
the attribute is not required, but if it occurs, it must have the specified value. If it is
not present, it will appear to be the specified default.
46. CDATA       Character data
NMTOKEN     Single token
NMTOKENS    Multiple tokens
ENTITY      Attribute is entity ref
ENTITIES    Multiple entity refs
ID          Unique ID
IDREF       Match to ID
IDREFS      Match to multiple IDs
NOTATION    Describe non-XML data
Name group  Restricted list
48. A declaration can specify a default attribute value for
when it is missing from the XML document, or
state that a value must be entered
◦ #REQUIRED Must be specified
◦ #IMPLIED May be specified
◦ "default" Default value if unspecified
◦ #FIXED Only one value allowed
<!ATTLIST tag name type default>
<!ATTLIST seqlist sepchar NMTOKEN #REQUIRED
type (alpha|num) "num">
50. In XML, element names are defined by the developer. This often
results in a conflict when trying to mix XML documents from
different XML applications.
This XML carries HTML table information:
<table>
<tr>
<td>Apples</td>
<td>Bananas</td>
</tr>
</table>
This XML carries information about a table (a piece of furniture):
<table>
<name>Wooden Table</name>
<width>80</width>
<length>120</length>
</table>
• If these XML fragments were added together, there would be a name conflict.
• Both contain a <table> element, but the elements have different content and
meaning.
• An XML parser will not know how to handle these differences.
51. Name conflicts in XML can easily be avoided using a name prefix.
This XML carries information about an HTML table, and a piece of
furniture:
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>Wooden Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
In the example above, there will be no conflict because the
two <table> elements have different names.
52. When using prefixes in XML, a so-called namespace for the prefix must be
defined.
The namespace is defined by the xmlns attribute in the start tag of an element.
The namespace declaration has the following
syntax: xmlns:prefix="URI".
<root xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
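A sketch of how a program sees the two kinds of table once namespaces are declared, assuming Python's standard xml.etree.ElementTree module. The prefix map ns is local to the script; only the URIs have to match those in the document.

```python
import xml.etree.ElementTree as ET

doc = """<root xmlns:h="http://www.w3.org/TR/html4/"
      xmlns:f="http://www.w3schools.com/furniture">
  <h:table><h:tr><h:td>Apples</h:td><h:td>Bananas</h:td></h:tr></h:table>
  <f:table><f:name>African Coffee Table</f:name></f:table>
</root>"""
root = ET.fromstring(doc)

# Prefixes used for searching, mapped to the namespace URIs.
ns = {
    "h": "http://www.w3.org/TR/html4/",
    "f": "http://www.w3schools.com/furniture",
}

cells = [td.text for td in root.findall("h:table/h:tr/h:td", ns)]
furniture = root.findtext("f:table/f:name", namespaces=ns)
print(cells)      # ['Apples', 'Bananas']
print(furniture)  # African Coffee Table

# Internally each tag is stored as {namespace-URI}local-name,
# so the two table elements no longer share a name.
print(root[0].tag)  # {http://www.w3.org/TR/html4/}table
print(root[1].tag)  # {http://www.w3schools.com/furniture}table
```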
53. Recall that DTDs are used to define the tags
that can be used in an XML document
An XML document may reference more than
one DTD
Namespaces are a way to specify which DTD
defines a given tag
XML, like Java, uses qualified names
◦ This helps to avoid collisions between names
◦ Java: myObject.myVariable
◦ XML: myDTD:myTag
◦ Note that XML uses a colon (:) rather than a dot (.)
54. A namespace is defined as a unique string
◦ To guarantee uniqueness, typically a URI (Uniform
Resource Identifier) is used, because the author
“owns” the domain
◦ It doesn't have to be a “real” URI; it just has to be
a unique string
◦ Example: http://www.matuszek.org/ns
There are two ways to use namespaces:
◦ Declare a default namespace
◦ Associate a prefix with a namespace, then use the
prefix in the XML to refer to the namespace
55. In any start tag you can use the reserved attribute name xmlns:
<book xmlns="http://www.matuszek.org/ns">
◦ This namespace will be used as the default for all elements
up to the corresponding end tag
◦ You can override it with a specific prefix
You can use almost this same form to declare a prefix:
<book xmlns:dave="http://www.matuszek.org/ns">
◦ Use this prefix on every tag and attribute you want to use
from this namespace, including end tags--it is not a default
prefix
<dave:chapter dave:number="1">To Begin</dave:chapter>
You can use the prefix in the start tag in which it is defined:
<dave:book xmlns:dave="http://www.matuszek.org/ns">
56. XSL stands for EXtensible Stylesheet Language, and is a
style sheet language for XML documents.
XSLT stands for XSL Transformations.
XSLT is used to transform an XML document into another
XML document, or into another type of document that is
recognized by a browser, such as HTML or XHTML.
Typically XSLT does this by transforming each XML element into an
(X)HTML element.
With XSLT you can add/remove elements and attributes
to or from the output file.
You can also rearrange and sort elements.
You can also perform tests and make decisions about
which elements to hide and display, and a lot more.
57. DTDs are a very weak specification language
◦ You can’t put any restrictions on element contents.
◦ It’s difficult to specify:
All the children must occur, but may be in any order.
This element must occur a certain number of times.
◦ There are only ten data types for attribute values.
DTDs aren’t written in XML!
◦ If you want to do any validation, you need one parser for
the XML and another for the DTD.
◦ This makes XML parsing harder than it needs to be.
◦ There is a newer and more powerful technology:
XML Schemas.
◦ However, DTDs are still very much in use.
58. An XML Schema describes the structure of an XML document.
XML Schema is an XML-based alternative to DTD.
The XML Schema language is also referred to as XML Schema
Definition (XSD).
Ex: remainder.xsd
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
60. The purpose of an XML Schema is to define the legal
building blocks of an XML document, just like a DTD.
An XML Schema:
◦ defines elements that can appear in a document.
◦ defines attributes that can appear in a document.
◦ defines which elements are child elements.
◦ defines the order of child elements.
◦ defines the number of child elements.
◦ defines whether an element is empty or can include text.
◦ defines data types for elements and attributes.
◦ defines default and fixed values for elements and
attributes.
XML Schemas are the Successors of DTDs
◦ XML Schemas are extensible to future additions.
◦ XML Schemas are richer and more powerful than DTDs.
◦ XML Schemas are written in XML.
◦ XML Schemas support data types.
◦ XML Schemas support namespaces.
XML Schemas are much more powerful than DTDs.
61. XML Schemas support data types.
◦ It is easier to validate the correctness of data.
◦ It is easier to work with data from a database.
◦ It is easier to define data facets (restrictions on data), data patterns
(data formats) and easy to convert data between different data types.
XML Schemas are written in XML.
◦ You don't have to learn a new language.
◦ You can use your XML editor to edit your Schema files.
◦ You can use your XML parser to parse your Schema files.
XML Schemas help secure data communication.
◦ A date like "03-11-2004" will be interpreted in some countries as
3 November and in others as 11 March.
◦ However, an XML element with a data type like this:
◦ <date type="date">2004-03-11</date>
◦ ensures a mutual understanding between sender and receiver, because the
XML "date" type requires the format "YYYY-MM-DD".
XML Schemas are Extensible.
◦ Reuse your Schema in other Schemas.
◦ Create your own data types derived from the standard types.
◦ Reference multiple schemas in the same document.
62. Defining Simple Element :
<xs:element name="xxx" type="yyy"/>
XML Schema has a lot of built-in data types.
o xs:string
o xs:decimal
o xs:integer
o xs:boolean
o xs:date
o xs:time
Example
Here are some XML elements:
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>
Here are the corresponding simple element definitions in Schema:
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
<xs:element name="color" type="xs:string" default="red"/>
<xs:element name="color" type="xs:string" fixed="red"/>
63. Syntax:
<xs:attribute name="xxx" type="yyy"/>
Example
Here is an XML element with an attribute:
<lastname lang="EN">Smith</lastname>
And here is the corresponding attribute definition:
<xs:attribute name="lang" type="xs:string"/>
<xs:attribute name="lang" type="xs:string" default="EN"/>
<xs:attribute name="lang" type="xs:string" fixed="EN"/>
<xs:attribute name="lang" type="xs:string" use="required"/>
64. Restrictions are used to define acceptable values for XML
elements or attributes.
Restrictions on XML elements are called facets.
Restrictions on Values
The example defines an element called "age" with a restriction.
The value of age cannot be lower than 0 or greater than 100:
<xs:element name="age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="100"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
65. The example below defines an element called "car" with a
restriction.
The only acceptable values are: Audi, Golf, BMW:
<xs:element name="car">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
66. The example below defines an element "letter" with a restriction.
The only acceptable value is ONE of the LOWERCASE letters from a to z:
<xs:element name="letter">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[a-z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The only acceptable value is THREE of the UPPERCASE letters from A
to Z:
<xs:element name="initials">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[A-Z][A-Z][A-Z]"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
67. The next example defines an element called "gender" with a
restriction. The only acceptable value is male OR female:
<xs:element name="gender">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="male|female"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The example defines an element "mobileno" with a restriction.
There must be exactly 10 digits:
<xs:element name="mobileno">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:pattern value="[0-9]{10}"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
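The pattern facets above behave like whole-value regular expressions. As a rough sketch, the four patterns can be tried out with Python's standard re module. This assumes that re.fullmatch is a fair stand-in for xs:pattern on these simple expressions (the two regex dialects differ in general), and the names PATTERNS and satisfies are hypothetical.

```python
import re

# Patterns copied from the schema examples. xs:pattern must match the
# whole value, which re.fullmatch mimics for these simple expressions.
PATTERNS = {
    "letter": r"[a-z]",
    "initials": r"[A-Z][A-Z][A-Z]",
    "gender": r"male|female",
    "mobileno": r"[0-9]{10}",
}


def satisfies(element_name: str, value: str) -> bool:
    """Return True if value matches the pattern declared for the element."""
    return re.fullmatch(PATTERNS[element_name], value) is not None


print(satisfies("letter", "q"))             # True
print(satisfies("letter", "qq"))            # False: more than one letter
print(satisfies("initials", "ABC"))         # True
print(satisfies("gender", "female"))        # True
print(satisfies("mobileno", "9437862540"))  # True
print(satisfies("mobileno", "943786254"))   # False: only nine digits
```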
68. The whiteSpace constraint is set to "preserve", which means that the XML
processor WILL NOT remove any white space characters:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="preserve"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
The whiteSpace constraint is set to "replace", which means that the XML
processor WILL REPLACE all white space characters (line feeds, tabs, spaces, and
carriage returns) with spaces:
<xs:element name="address">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:whiteSpace value="replace"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
69. The value must be minimum five characters
and maximum eight characters:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:minLength value="5"/>
<xs:maxLength value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
70. A complex element is an XML element that
contains other elements and/or attributes.
There are four kinds of complex elements:
◦ empty elements
◦ elements that contain only other elements
◦ elements that contain only text
◦ elements that contain both other elements and text
71. An XML parser is a software library (or a package) that
provides methods (or interfaces) for client
applications to work with XML documents
It checks well-formedness
It may validate the documents
It does a lot of other detailed things so that a
client is shielded from those complexities
72.
73. DOM: Document Object Model
SAX: Simple API for XML
A DOM parser implements the DOM API
A SAX parser implements the SAX API
Most major parsers implement both
the DOM and SAX APIs
74. A DOM document is an object containing
all the information of an XML document
It is composed of a tree (DOM tree) of
nodes , and various nodes that are
somehow associated with other nodes in
the tree but are not themselves part of the
DOM tree
75. There are 12 types of nodes in a DOM
Document object
Document node
Element node
Text node
Attribute node
Processing instruction node
…….
76. Sample XML document
<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="test.css"?>
<!-- It's an xml-stylesheet processing instruction. -->
<!DOCTYPE shapes SYSTEM "shapes.dtd">
<shapes>
……
<square color="BLUE">
<length> 20 </length>
</square>
……
</shapes>
77.
78. A DOM parser creates an internal structure in
memory which is a DOM document object
Client applications get the information of the
original XML document by invoking methods
on this Document object or on other objects it
contains
DOM parser is tree-based (or DOM obj-based)
Client application seems to be pulling the data
actively, from the data flow point of view
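The "pull" style can be sketched with Python's standard xml.dom.minidom module, used here only as one readily available DOM implementation. The shapes document below is a shortened, hypothetical variant of the sample, with a second square added for illustration.

```python
from xml.dom.minidom import parseString

doc = parseString(
    '<shapes><square color="BLUE"><length>20</length></square>'
    '<square color="RED"><length>5</length></square></shapes>'
)

# The whole document is now an in-memory tree; the client pulls what it needs.
squares = doc.getElementsByTagName("square")
second = squares[1]  # random access to any node
original_color = second.getAttribute("color")
length_text = second.getElementsByTagName("length")[0].firstChild.data
print(original_color, length_text)  # RED 5

# The tree is writable as well as readable.
second.setAttribute("color", "GREEN")
print(doc.documentElement.toxml())
```

The price of this convenience is that the entire document has to fit in memory before the first query can run.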
79. Advantage:
(1) It is good when random access to widely
separated parts of a document is
required
(2) It supports both read and write operations
Disadvantage:
(1) It is memory inefficient
(2) It seems complicated, although not really
80. A SAX parser does not first create any internal structure
The client does not specify what methods to call
The client just overrides the methods of the API
and places its own code inside them
When the parser encounters a start tag, an end
tag, etc., it treats them as events
81. When such an event occurs, the parser
automatically calls back the particular method
overridden by the client, and passes what it saw
to that method as arguments
A SAX parser is event-based; it works like an
event handler in Java (e.g. MouseAdapter)
The client application seems to be just receiving
the data passively, from the data flow point of
view
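The callback style can be sketched with Python's standard xml.sax module, used here as one available SAX implementation. The handler class SquareHandler and the shortened shapes document are hypothetical.

```python
import xml.sax


class SquareHandler(xml.sax.ContentHandler):
    """Collects the colour of every square element as the parser streams by."""

    def __init__(self):
        super().__init__()
        self.colors = []

    # Called back by the parser each time it sees a start tag.
    def startElement(self, name, attrs):
        if name == "square":
            self.colors.append(attrs.get("color"))


handler = SquareHandler()
xml.sax.parseString(
    b'<shapes><square color="BLUE"><length>20</length></square>'
    b'<square color="RED"><length>5</length></square></shapes>',
    handler,
)
print(handler.colors)  # ['BLUE', 'RED']
```

The client never asks for a node; it keeps only what it chooses to store (here, the list of colours), which is why memory use stays low.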
82. Advantage:
(1) It is simple
(2) It is memory efficient
(3) It works well in stream application
Disadvantage:
The data is broken into pieces and clients
never have all the information as a whole
unless they create their own data structure