SlideShare ist ein Scribd-Unternehmen logo
1 von 44
11
Dr. C.V. Suresh Babu
2
• XML stands for eXtensible Markup Language.
• A markup language is used to provide information
about a document.
• Tags are added to the document to provide the extra
information.
• HTML tags tell a browser how to display the
document.
• Separates presentation (HTML) from data (XML)
• XML tags give a reader some idea what some of the
data means.
3
• XML documents are used to transfer data from one place to
another often over the Internet.
• XML subsets are designed for particular applications.
• One is RSS (Rich Site Summary or Really Simple Syndication ).
It is used to send breaking news bulletins from one web site
to another.
• A number of fields have their own subsets. These include
chemistry, mathematics, and books publishing.
• Most of these subsets are registered with the W3Consortium
and are available for anyone’s use.
History
4
SGML
HTML
XML
1985 1990
1995/98
Standardized General Markup Language
HyperText Markup Language
eXtensible Markup Language
XML is a “use everywhere” data specification
Documents
Configuration
Database
Application X
Repository
XML XML
XML XML
Why XML is so Popular
• XML is popular because:
– it allows very complex sets of structured data to
be transmitted over the internet as a simple
stream of characters (a document), but…
– It allows processing of that simple stream as a rich
object model in the receiving application
• XML is ubiquitous on the Internet, and is the
basis of all web-services applications
Documents vs. Data
• XML is used to represent
two main types of things:
– Documents
• Lots of text with tags to
identify and annotate portions
of the document
– Data
• Hierarchical data structures
Overview
XML and Structured Data
• Pre-XML representation of data:
• XML representation of the same data:“PO-1234”,”CUST001”,”X9876”,”5”,”14.98”
<PURCHASE_ORDER>
<PO_NUM> PO-1234 </PO_NUM>
<CUST_ID> CUST001 </CUST_ID>
<ITEM_NUM> X9876 </ITEM_NUM>
<QUANTITY> 5 </QUANTITY>
<PRICE> 14.98 </PRICE>
</PURCHASE_ORDER>
Advantages of XML
• XML is text (Unicode) based.
– Takes up less space.
– Can be transmitted efficiently.
• One XML document can be displayed differently in different media.
– Html, video, CD, DVD,
– You only have to change the XML document in order to change all the rest.
• XML documents can be modularized. Parts can be reused.
• Electronic data interchange is critical in today’s networked
world and needs to be standardized
– Examples:
• Banking: funds transfer
• Education: e-learning contents
• Scientific data
– Chemistry: ChemML, …
– Genetics: BSML (Bio-Sequence Markup Language), …
The defunct W3C text-centric XML framework for XML-based documents
(XHTML2 is dead, the programmer-centric HTML5 rules)
HTML
XMLapplications.
PICS
P3P
RDF Semantics
XML
RDF
app. 2.0
SGML
XSL
CSS
DSSL
(for
SGML)
XLink
XPointer
Typical document
SVG
.....ML
SMIL
(subset of SGML)
XSLT
applications
XPath
The W3C consortium defines many XML-based languages
... details later
Syntax and Structure
Components of an XML Document
• Elements
– Each element has a beginning and ending tag
• <TAG_NAME>...</TAG_NAME>
– Elements can be empty (<TAG_NAME />)
• Attributes
– Describes an element; e.g. data type, data range, etc.
– Can only appear on beginning tag
• Processing instructions
– Encoding specification (Unicode by default)
– Namespace declaration
– Schema declaration
Syntax and Structure
Components of an XML Document
<?xml version=“1.0” ?>
<?xml-stylesheet type="text/xsl” href=“template.xsl"?>
<ROOT>
<ELEMENT1><SUBELEMENT1 /><SUBELEMENT2 /></ELEMENT1>
<ELEMENT2> </ELEMENT2>
<ELEMENT3 type=‘string’> </ELEMENT3>
<ELEMENT4 type=‘integer’ value=‘9.3’> </ELEMENT4>
</ROOT>
Prologue (processing instructions)
Elements
Elements with Attributes
XML Element
13
Syntax and Structure
Rules For Well-Formed XML
• There must be one, and only one, root element
• Sub-elements must be properly nested
– A tag must end within the tag in which it was started
• Attributes are optional
– Defined by an optional schema
• Attribute values must be enclosed in “” or ‘’
• Processing instructions are optional
• XML is case-sensitive
– <tag> and <TAG> are not the same type of element
An XML Document Example
<imdb>
<show year=“1993”>
<title>Fugitive, The</title>
<review>
<suntimes>
<reviewer>Roger Ebert</reviewer> gives <rating>two thumbs
up</rating>! A fun action movie, Harrison Ford at his best.
</suntimes>
</review>
<review>
<nyt>The standard &hollywood; summer movie strikes back.</nyt>
</review>
<box_office>183,752,965</box_office>
</show>
<show year=“1994”>
<title>X Files,The</title>
<seasons>4</seasons>
</show>
</imdb>
Start Tag
End Tag
Attribute
Element
Mixed Content
XML Structure
16
Comparisons
• Extensible set of tags
• Content orientated
• Standard Data
infrastructure
• Allows multiple output
forms
• Fixed set of tags
• Presentation oriented
• No data validation
capabilities
• Single presentation
XML HTML
HTML vs. XML
<h1> Bibliography </h1>
<p> <i> Foundations of Databases </i>
Abiteboul, Hull, Vianu
<br> Addison Wesley, 1995
<p> <i> Data on the Web </i>
Abiteoul, Buneman, Suciu
<br> Morgan Kaufmann, 1999
<bibliography>
<book> <title> Foundations…
</title>
<author> Abiteboul
</author>
<author> Hull </author>
<author> Vianu </author>
<publisher> Addison Wesley
</publisher>
<year> 1995 </year>
</book>
…
</bibliography>
<h1> Bibliography </h1>
<p> <i> Foundations of Databases </i>
Abiteboul, Hull, Vianu
<br> Addison Wesley, 1995
<p> <i> Data on the Web </i>
Abiteoul, Buneman, Suciu
<br> Morgan Kaufmann, 1999
<bibliography>
<book> <title> Foundations… </title>
<author> Abiteboul </author>
<author> Hull </author>
<author> Vianu </author>
<publisher> Addison Wesley </publisher>
<year> 1995 </year>
</book>
…
</bibliography>
HTML describes presentation
XML describes contentXSL (stylesheets)
can be used to
specify the conversion
DKS – 15/09/2019 Slide 20
XML information structures (1)
Book
FrontMatter
BookTitle
Author(s)
PubInfo
Chapter(s)
ChapterTitle
Paragraph(s)
BackMatter
References
Index
Example 1: A possible book structure
XML information structures (2)
• Premise: A text is the sum of its component parts
– A <Book> could be defined as containing:
<FrontMatter>, <Chapter>s, <BackMatter>
– <FrontMatter> could contain:
<BookTitle> <Author>s <PubInfo>
– A <Chapter> could contain:
<ChapterTitle> <Paragraph>s
– A <Paragraph> could contain:
<Sentence>s or <Table>s or <Figure>s …
• Components chosen for book markup language
should reflect anticipated use ….
XML information structures (3)
<Book>
<FrontMatter>
<BookTitle>XML Is Easy</BookTitle>
<Author>Tim Cole</Author>
<Author>Tom Habing</Author>
<PubInfo>CDP Press, 2002</PubInfo>
</FrontMatter>
<Chapter>
<ChapterTitle>First Was SGML</ChapterTitle>
<Paragraph>Once upon a time …</Paragraph>
</Chapter>
</Book>
A corresponding XML fragment
(based on a corresponding XML application):
begin
element
end
element
XML information structures (4)
Example 2: Movies
• Elements can have attributes
<movies>
<movie genre="action" star="Halle Berry">
<name>Catwoman</name>
<date>(2004)</date>
<length>104 minutes</length>
</movie>
<movie genre="horror" star="Halle Berry">
<name>Gothika</name>
<date>(2003)</date>
<length>98 minutes</length>
</movie>
<movie genre="drama" star="Halle Berry">
<name>Monster&apos;s Ball</name>
<date>(2001)</date>
<length>111 minutes</length>
</movie>
</movies>
attribute
Encoding
• XML (like Java) uses Unicode to encode characters.
• Unicode comes in many flavors. The most common one used
in the West is UTF-8.
• UTF-8 is a variable length code. Characters are encoded in 1
byte, 2 bytes, or 4 bytes.
• The first 128 characters in Unicode are ASCII.
• In UTF-8, the numbers between 128 and 255 code for some of
the more common characters used in western Europe, such as
ã, á, å, or ç.
• Two byte codes are used for some characters not listed in the
first 256 and some Asian ideographs.
• Four byte codes can handle any ideographs that are left.
• Those using non-western languages should investigate other
versions of Unicode.
DKS – 15/09/2019 Slide 25
What is an XML document ?
• An XML document is a content marked up
with XML (can be a file, a string, a message
content or any other sort of data storage)
• There are 2 levels of conforming documents:
– Well-formed: respects the XML syntax
– Valid: In addition, respects one (or more)
associated grammars (schemas).
What is a well-formed XML document (1) ?
Well-formed documents follow basic syntax rules e.g.
• there is an XML declaration in the first line
• there is a single document root
• all tags use proper delimiters
• all elements have start and end tags
– But can be minimized if empty: <br/> instead of <br></br>
• all elements are properly nested
<author> <firstname>Mark</firstname>
<lastname>Twain</lastname> </author>
• appropriate use of special characters
• all attribute values are quoted:
<subject scheme=“LCSH”>Music</subject>
DKS – 15/09/2019 Slide 27
What is a well-formed XML document (2) ?
• Good example
<addressBook>
<person>
<name> <family>Wallace</family> <given>Bob</given> </name>
<email>bwallace@megacorp.com</email>
<address>Rue de Lausanne, Genève</address>
</person>
</addressBook>
• Bad example
<addressBook>
<address>Rue de Lausanne, Genève <person></address>
<name>
<family>Schneider</family> <firstName>Nina</firstName>
</name>
<email>nina@nina.name</email>
</person>
<name><family> Muller </family> <name>
</addressBook>
What is a valid XML document (3) ?
• Parser (i.e. that program that reads the XML) can
check markup of individual document against rules
expressed in a schema (DTD, XML Schema, etc.)
• Typically, a schema (grammar):
– Defines available elements
– Defines attributes of elements
– Defines how elements can be embedded
– Defines mandatory and optional information
• Authoring tools usually can enforce rules of
DTD/Schema while document is edited
DKS – 15/09/2019 Slide 29
Document Type Definitions (DTDs 1)
• XML document types can be specified using a DTD
• DTD constraints structure of XML data
– What elements can occur
– What attributes can/must an element have
– What subelements can/must occur inside each element, and how
many times.
• DTD does not constrain data types
– All values represented as strings in XML
• DTD definition syntax
– <!ELEMENT element (subelements-specification) >
– <!ATTLIST element (attributes) >
– … more details later
• Valid XML documents refer to a DTD (or other Schema)
DKS – 15/09/2019 Slide 30
Document Type Definitions (DTDs 2)
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test PUBLIC "-//Webster//DTD test V1.0//EN"
<test> "test" is a document element </test>
<!DOCTYPE test [
<!ELEMENT test EMPTY> ]>
<test/>
External Public DTD Declaration
Internal DTD Declaration
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test SYSTEM "test.dtd">
<test> "test" is a document element </test>
External DTD Declaration referring to a file or a URL
test = name
of the root
element
DTD is defined
in file test.dtd
DTD is defined
inside XML
Application
should know
DTD
XML Files are Trees
address
name email phone birthday
first last year month day
XML Trees
• An XML document has a single root node.
• The tree is a general ordered tree.
– A parent node may have any number of children.
– Child nodes are ordered, and may have siblings.
• Preorder traversals are usually used for getting
information out of the tree.
Validity
• A well-formed document has a tree structure and
obeys all the XML rules.
• A particular application may add more rules in either
a DTD (document type definition) or in a schema.
• Many specialized DTDs and schemas have been
created to describe particular areas.
• These range from disseminating news bulletins (RSS)
to chemical formulas.
• DTDs were developed first, so they are not as
comprehensive as schema.
Document Type Definitions
• A DTD describes the tree structure of a
document and something about its data.
• There are two data types, PCDATA and CDATA.
– PCDATA is parsed character data.
– CDATA is character data, not usually parsed.
• A DTD determines how many times a node
may appear, and how child nodes are ordered.
DTD for address Example
<!ELEMENT address (name, email, phone, birthday)>
<!ELEMENT name (first, last)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT birthday (year, month, day)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT month (#PCDATA)>
<!ELEMENT day (#PCDATA)>
Schemas
• Schemas are themselves XML documents.
• They were standardized after DTDs and provide more
information about the document.
• They have a number of data types including string,
decimal, integer, boolean, date, and time.
• They divide elements into simple and complex types.
• They also determine the tree structure and how
many children a node may have.
Schema Elements
37
DKS – 15/09/2019 Slide 38
XML Schemas
• XML Schema is a more sophisticated schema language which
addresses the drawbacks of DTDs. Supports
– Typing of values
• E.g. integer, string, etc
• Also, constraints on min/max values
– User-defined, complex types
– Many more features, including
• uniqueness and foreign key constraints, inheritance
• XML Schema is itself specified in XML syntax, unlike DTDs
– More-standard representation, but verbose
• XML Scheme is integrated with namespaces
• BUT: XML Schema is significantly more complicated than
DTDs.
Schema for First address Example
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="address">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="email" type="xs:string"/>
<xs:element name="phone" type="xs:string"/>
<xs:element name="birthday" type="xs:date"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Explanation of Example Schema
<?xml version="1.0" encoding="ISO-8859-1" ?>
• ISO-8859-1, Latin-1, is the same as UTF-8 in the first 128 characters.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
• www.w3.org/2001/XMLSchema contains the schema standards.
<xs:element name="address">
<xs:complexType>
• This states that address is a complex type element.
<xs:sequence>
• This states that the following elements form a sequence and must come in the
order shown.
<xs:element name="name" type="xs:string"/>
• This says that the element, name, must be a string.
<xs:element name="birthday" type="xs:date"/>
• This states that the element, birthday, is a date. Dates are always of the form
yyyy-mm-dd.
XSLT
Extensible Stylesheet Language Transformations
• XSLT is used to transform one xml document into
another, often an html document.
• The Transform classes are now part of Java 1.4.
• A program is used that takes as input one xml
document and produces as output another.
• If the resulting document is in html, it can be viewed
by a web browser.
• This is a good way to display xml data.
A Style Sheet to Transform address.xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="address">
<html><head><title>Address Book</title></head>
<body>
<xsl:value-of select="name"/>
<br/><xsl:value-of select="email"/>
<br/><xsl:value-of select="phone"/>
<br/><xsl:value-of select="birthday"/>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
The Result of the Transformation
Alice Lee
alee@aol.com
123-45-6789
1983-7-15
Parsers
• There are two principal models for parsers.
• SAX – Simple API for XML
– Uses a call-back method
– Similar to javax listeners
• DOM – Document Object Model
– Creates a parse tree
– Requires a tree traversal

Weitere ähnliche Inhalte

Was ist angesagt? (20)

01 xml document structure
01 xml document structure01 xml document structure
01 xml document structure
 
XML
XMLXML
XML
 
XML-Extensible Markup Language
XML-Extensible Markup Language XML-Extensible Markup Language
XML-Extensible Markup Language
 
XML
XMLXML
XML
 
Introduction to XML
Introduction to XMLIntroduction to XML
Introduction to XML
 
DTD
DTDDTD
DTD
 
Html table tags
Html table tagsHtml table tags
Html table tags
 
Xml Presentation-3
Xml Presentation-3Xml Presentation-3
Xml Presentation-3
 
1 xml fundamentals
1 xml fundamentals1 xml fundamentals
1 xml fundamentals
 
Extensible Markup Language (XML)
Extensible Markup Language (XML)Extensible Markup Language (XML)
Extensible Markup Language (XML)
 
Html ppt
Html pptHtml ppt
Html ppt
 
Xhtml
XhtmlXhtml
Xhtml
 
XSLT
XSLTXSLT
XSLT
 
Xml
XmlXml
Xml
 
Php Presentation
Php PresentationPhp Presentation
Php Presentation
 
Sgml
SgmlSgml
Sgml
 
HTML
HTMLHTML
HTML
 
Event In JavaScript
Event In JavaScriptEvent In JavaScript
Event In JavaScript
 
Introduction to XML
Introduction to XMLIntroduction to XML
Introduction to XML
 
HTML Forms
HTML FormsHTML Forms
HTML Forms
 

Ähnlich wie Xml (20)

Xml iet 2015
Xml iet 2015Xml iet 2015
Xml iet 2015
 
Xml
XmlXml
Xml
 
Xml and DTD's
Xml and DTD'sXml and DTD's
Xml and DTD's
 
Data interchange integration, HTML XML Biological XML DTD
Data interchange integration, HTML XML Biological XML DTDData interchange integration, HTML XML Biological XML DTD
Data interchange integration, HTML XML Biological XML DTD
 
IT6801-Service Oriented Architecture- UNIT-I notes
IT6801-Service Oriented Architecture- UNIT-I notesIT6801-Service Oriented Architecture- UNIT-I notes
IT6801-Service Oriented Architecture- UNIT-I notes
 
Unit3wt
Unit3wtUnit3wt
Unit3wt
 
Unit3wt
Unit3wtUnit3wt
Unit3wt
 
WT UNIT-2 XML.pdf
WT UNIT-2 XML.pdfWT UNIT-2 XML.pdf
WT UNIT-2 XML.pdf
 
XML
XMLXML
XML
 
Ch2 neworder
Ch2 neworderCh2 neworder
Ch2 neworder
 
XML
XMLXML
XML
 
Xml unit1
Xml unit1Xml unit1
Xml unit1
 
Chapter4
Chapter4Chapter4
Chapter4
 
Unit iv xml dom
Unit iv xml domUnit iv xml dom
Unit iv xml dom
 
XML(EXtensible Markup Language). XML(EXtensible Markup Language).pptppt
XML(EXtensible Markup Language). XML(EXtensible Markup Language).pptpptXML(EXtensible Markup Language). XML(EXtensible Markup Language).pptppt
XML(EXtensible Markup Language). XML(EXtensible Markup Language).pptppt
 
paper about xml
paper about xmlpaper about xml
paper about xml
 
Xml
XmlXml
Xml
 
Introduction to XML
Introduction to XMLIntroduction to XML
Introduction to XML
 
XML, DTD & XSD Overview
XML, DTD & XSD OverviewXML, DTD & XSD Overview
XML, DTD & XSD Overview
 
M.FLORENCE DAYANA WEB DESIGN -Unit 5 XML
M.FLORENCE DAYANA WEB DESIGN -Unit 5   XMLM.FLORENCE DAYANA WEB DESIGN -Unit 5   XML
M.FLORENCE DAYANA WEB DESIGN -Unit 5 XML
 

Mehr von Dr. C.V. Suresh Babu (20)

Data analytics with R
Data analytics with RData analytics with R
Data analytics with R
 
Association rules
Association rulesAssociation rules
Association rules
 
Clustering
ClusteringClustering
Clustering
 
Classification
ClassificationClassification
Classification
 
Blue property assumptions.
Blue property assumptions.Blue property assumptions.
Blue property assumptions.
 
Introduction to regression
Introduction to regressionIntroduction to regression
Introduction to regression
 
DART
DARTDART
DART
 
Mycin
MycinMycin
Mycin
 
Expert systems
Expert systemsExpert systems
Expert systems
 
Dempster shafer theory
Dempster shafer theoryDempster shafer theory
Dempster shafer theory
 
Bayes network
Bayes networkBayes network
Bayes network
 
Bayes' theorem
Bayes' theoremBayes' theorem
Bayes' theorem
 
Knowledge based agents
Knowledge based agentsKnowledge based agents
Knowledge based agents
 
Rule based system
Rule based systemRule based system
Rule based system
 
Formal Logic in AI
Formal Logic in AIFormal Logic in AI
Formal Logic in AI
 
Production based system
Production based systemProduction based system
Production based system
 
Game playing in AI
Game playing in AIGame playing in AI
Game playing in AI
 
Diagnosis test of diabetics and hypertension by AI
Diagnosis test of diabetics and hypertension by AIDiagnosis test of diabetics and hypertension by AI
Diagnosis test of diabetics and hypertension by AI
 
A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”
 
A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”A study on “impact of artificial intelligence in covid19 diagnosis”
A study on “impact of artificial intelligence in covid19 diagnosis”
 

Kürzlich hochgeladen

Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024Janet Corral
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room servicediscovermytutordmt
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 

Kürzlich hochgeladen (20)

Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
9548086042 for call girls in Indira Nagar with room service
9548086042  for call girls in Indira Nagar  with room service9548086042  for call girls in Indira Nagar  with room service
9548086042 for call girls in Indira Nagar with room service
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 

Xml

  • 2. 2 • XML stands for eXtensible Markup Language. • A markup language is used to provide information about a document. • Tags are added to the document to provide the extra information. • HTML tags tell a browser how to display the document. • Separates presentation (HTML) from data (XML) • XML tags give a reader some idea what some of the data means.
  • 3. 3 • XML documents are used to transfer data from one place to another often over the Internet. • XML subsets are designed for particular applications. • One is RSS (Rich Site Summary or Really Simple Syndication ). It is used to send breaking news bulletins from one web site to another. • A number of fields have their own subsets. These include chemistry, mathematics, and books publishing. • Most of these subsets are registered with the W3Consortium and are available for anyone’s use.
  • 4. History 4 SGML HTML XML 1985 1990 1995/98 Standardized General Markup Language HyperText Markup Language eXtensible Markup Language
  • 5. XML is a “use everywhere” data specification Documents Configuration Database Application X Repository XML XML XML XML
  • 6. Why XML is so Popular • XML is popular because: – it allows very complex sets of structured data to be transmitted over the internet as a simple stream of characters (a document), but… – It allows processing of that simple stream as a rich object model in the receiving application • XML is ubiquitous on the Internet, and is the basis of all web-services applications
  • 7. Documents vs. Data • XML is used to represent two main types of things: – Documents • Lots of text with tags to identify and annotate portions of the document – Data • Hierarchical data structures
  • 8. Overview XML and Structured Data • Pre-XML representation of data: • XML representation of the same data:“PO-1234”,”CUST001”,”X9876”,”5”,”14.98” <PURCHASE_ORDER> <PO_NUM> PO-1234 </PO_NUM> <CUST_ID> CUST001 </CUST_ID> <ITEM_NUM> X9876 </ITEM_NUM> <QUANTITY> 5 </QUANTITY> <PRICE> 14.98 </PRICE> </PURCHASE_ORDER>
  • 9. Advantages of XML • XML is text (Unicode) based. – Takes up less space. – Can be transmitted efficiently. • One XML document can be displayed differently in different media. – Html, video, CD, DVD, – You only have to change the XML document in order to change all the rest. • XML documents can be modularized. Parts can be reused. • Electronic data interchange is critical in today’s networked world and needs to be standardized – Examples: • Banking: funds transfer • Education: e-learning contents • Scientific data – Chemistry: ChemML, … – Genetics: BSML (Bio-Sequence Markup Language), …
  • 10. The defunct W3C text-centric XML framework for XML-based documents (XHTML2 is dead, the programmer-centric HTML5 rules) HTML XMLapplications. PICS P3P RDF Semantics XML RDF app. 2.0 SGML XSL CSS DSSL (for SGML) XLink XPointer Typical document SVG .....ML SMIL (subset of SGML) XSLT applications XPath The W3C consortium defines many XML-based languages ... details later
  • 11. Syntax and Structure Components of an XML Document • Elements – Each element has a beginning and ending tag • <TAG_NAME>...</TAG_NAME> – Elements can be empty (<TAG_NAME />) • Attributes – Describes an element; e.g. data type, data range, etc. – Can only appear on beginning tag • Processing instructions – Encoding specification (Unicode by default) – Namespace declaration – Schema declaration
  • 12. Syntax and Structure Components of an XML Document <?xml version=“1.0” ?> <?xml-stylesheet type="text/xsl” href=“template.xsl"?> <ROOT> <ELEMENT1><SUBELEMENT1 /><SUBELEMENT2 /></ELEMENT1> <ELEMENT2> </ELEMENT2> <ELEMENT3 type=‘string’> </ELEMENT3> <ELEMENT4 type=‘integer’ value=‘9.3’> </ELEMENT4> </ROOT> Prologue (processing instructions) Elements Elements with Attributes
  • 14. Syntax and Structure Rules For Well-Formed XML • There must be one, and only one, root element • Sub-elements must be properly nested – A tag must end within the tag in which it was started • Attributes are optional – Defined by an optional schema • Attribute values must be enclosed in “” or ‘’ • Processing instructions are optional • XML is case-sensitive – <tag> and <TAG> are not the same type of element
  • 15. An XML Document Example <imdb> <show year=“1993”> <title>Fugitive, The</title> <review> <suntimes> <reviewer>Roger Ebert</reviewer> gives <rating>two thumbs up</rating>! A fun action movie, Harrison Ford at his best. </suntimes> </review> <review> <nyt>The standard &hollywood; summer movie strikes back.</nyt> </review> <box_office>183,752,965</box_office> </show> <show year=“1994”> <title>X Files,The</title> <seasons>4</seasons> </show> </imdb> Start Tag End Tag Attribute Element Mixed Content
  • 17. Comparisons • Extensible set of tags • Content orientated • Standard Data infrastructure • Allows multiple output forms • Fixed set of tags • Presentation oriented • No data validation capabilities • Single presentation XML HTML
  • 18. HTML vs. XML <h1> Bibliography </h1> <p> <i> Foundations of Databases </i> Abiteboul, Hull, Vianu <br> Addison Wesley, 1995 <p> <i> Data on the Web </i> Abiteoul, Buneman, Suciu <br> Morgan Kaufmann, 1999 <bibliography> <book> <title> Foundations… </title> <author> Abiteboul </author> <author> Hull </author> <author> Vianu </author> <publisher> Addison Wesley </publisher> <year> 1995 </year> </book> … </bibliography>
  • 19. <h1> Bibliography </h1> <p> <i> Foundations of Databases </i> Abiteboul, Hull, Vianu <br> Addison Wesley, 1995 <p> <i> Data on the Web </i> Abiteoul, Buneman, Suciu <br> Morgan Kaufmann, 1999 <bibliography> <book> <title> Foundations… </title> <author> Abiteboul </author> <author> Hull </author> <author> Vianu </author> <publisher> Addison Wesley </publisher> <year> 1995 </year> </book> … </bibliography> HTML describes presentation XML describes contentXSL (stylesheets) can be used to specify the conversion
  • 20. DKS – 15/09/2019 Slide 20 XML information structures (1) Book FrontMatter BookTitle Author(s) PubInfo Chapter(s) ChapterTitle Paragraph(s) BackMatter References Index Example 1: A possible book structure
  • 21. XML information structures (2) • Premise: A text is the sum of its component parts – A <Book> could be defined as containing: <FrontMatter>, <Chapter>s, <BackMatter> – <FrontMatter> could contain: <BookTitle> <Author>s <PubInfo> – A <Chapter> could contain: <ChapterTitle> <Paragraph>s – A <Paragraph> could contain: <Sentence>s or <Table>s or <Figure>s … • Components chosen for book markup language should reflect anticipated use ….
  • 22. XML information structures (3) <Book> <FrontMatter> <BookTitle>XML Is Easy</BookTitle> <Author>Tim Cole</Author> <Author>Tom Habing</Author> <PubInfo>CDP Press, 2002</PubInfo> </FrontMatter> <Chapter> <ChapterTitle>First Was SGML</ChapterTitle> <Paragraph>Once upon a time …</Paragraph> </Chapter> </Book> A corresponding XML fragment (based on a corresponding XML application): begin element end element
  • 23. XML information structures (4) Example 2: Movies • Elements can have attributes <movies> <movie genre="action" star="Halle Berry"> <name>Catwoman</name> <date>(2004)</date> <length>104 minutes</length> </movie> <movie genre="horror" star="Halle Berry"> <name>Gothika</name> <date>(2003)</date> <length>98 minutes</length> </movie> <movie genre="drama" star="Halle Berry"> <name>Monster&apos;s Ball</name> <date>(2001)</date> <length>111 minutes</length> </movie> </movies> attribute
  • 24. Encoding • XML (like Java) uses Unicode to encode characters. • Unicode comes in many flavors. The most common one used in the West is UTF-8. • UTF-8 is a variable length code. Characters are encoded in 1 byte, 2 bytes, or 4 bytes. • The first 128 characters in Unicode are ASCII. • In UTF-8, the numbers between 128 and 255 code for some of the more common characters used in western Europe, such as ã, á, å, or ç. • Two byte codes are used for some characters not listed in the first 256 and some Asian ideographs. • Four byte codes can handle any ideographs that are left. • Those using non-western languages should investigate other versions of Unicode.
  • 25. DKS – 15/09/2019 Slide 25 What is an XML document ? • An XML document is a content marked up with XML (can be a file, a string, a message content or any other sort of data storage) • There are 2 levels of conforming documents: – Well-formed: respects the XML syntax – Valid: In addition, respects one (or more) associated grammars (schemas).
  • 26. What is a well-formed XML document (1) ? Well-formed documents follow basic syntax rules e.g. • there is an XML declaration in the first line • there is a single document root • all tags use proper delimiters • all elements have start and end tags – But can be minimized if empty: <br/> instead of <br></br> • all elements are properly nested <author> <firstname>Mark</firstname> <lastname>Twain</lastname> </author> • appropriate use of special characters • all attribute values are quoted: <subject scheme=“LCSH”>Music</subject>
  • 27. DKS – 15/09/2019 Slide 27 What is a well-formed XML document (2) ? • Good example <addressBook> <person> <name> <family>Wallace</family> <given>Bob</given> </name> <email>bwallace@megacorp.com</email> <address>Rue de Lausanne, Genève</address> </person> </addressBook> • Bad example <addressBook> <address>Rue de Lausanne, Genève <person></address> <name> <family>Schneider</family> <firstName>Nina</firstName> </name> <email>nina@nina.name</email> </person> <name><family> Muller </family> <name> </addressBook>
  • 28. What is a valid XML document (3) ? • Parser (i.e. that program that reads the XML) can check markup of individual document against rules expressed in a schema (DTD, XML Schema, etc.) • Typically, a schema (grammar): – Defines available elements – Defines attributes of elements – Defines how elements can be embedded – Defines mandatory and optional information • Authoring tools usually can enforce rules of DTD/Schema while document is edited
  • 29. DKS – 15/09/2019 Slide 29 Document Type Definitions (DTDs 1) • XML document types can be specified using a DTD • DTD constraints structure of XML data – What elements can occur – What attributes can/must an element have – What subelements can/must occur inside each element, and how many times. • DTD does not constrain data types – All values represented as strings in XML • DTD definition syntax – <!ELEMENT element (subelements-specification) > – <!ATTLIST element (attributes) > – … more details later • Valid XML documents refer to a DTD (or other Schema)
  • 30. DKS – 15/09/2019 Slide 30 Document Type Definitions (DTDs 2) <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE test PUBLIC "-//Webster//DTD test V1.0//EN" <test> "test" is a document element </test> <!DOCTYPE test [ <!ELEMENT test EMPTY> ]> <test/> External Public DTD Declaration Internal DTD Declaration <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE test SYSTEM "test.dtd"> <test> "test" is a document element </test> External DTD Declaration referring to a file or a URL test = name of the root element DTD is defined in file test.dtd DTD is defined inside XML Application should know DTD
  • 31. XML Files are Trees address name email phone birthday first last year month day
  • 32. XML Trees • An XML document has a single root node. • The tree is a general ordered tree. – A parent node may have any number of children. – Child nodes are ordered, and may have siblings. • Preorder traversals are usually used for getting information out of the tree.
  • 33. Validity • A well-formed document has a tree structure and obeys all the XML rules. • A particular application may add more rules in either a DTD (document type definition) or in a schema. • Many specialized DTDs and schemas have been created to describe particular areas. • These range from disseminating news bulletins (RSS) to chemical formulas. • DTDs were developed first, so they are not as comprehensive as schema.
  • 34. Document Type Definitions • A DTD describes the tree structure of a document and something about its data. • There are two data types, PCDATA and CDATA. – PCDATA is parsed character data. – CDATA is character data, not usually parsed. • A DTD determines how many times a node may appear, and how child nodes are ordered.
  • 35. DTD for address Example <!ELEMENT address (name, email, phone, birthday)> <!ELEMENT name (first, last)> <!ELEMENT first (#PCDATA)> <!ELEMENT last (#PCDATA)> <!ELEMENT email (#PCDATA)> <!ELEMENT phone (#PCDATA)> <!ELEMENT birthday (year, month, day)> <!ELEMENT year (#PCDATA)> <!ELEMENT month (#PCDATA)> <!ELEMENT day (#PCDATA)>
  • 36. Schemas • Schemas are themselves XML documents. • They were standardized after DTDs and provide more information about the document. • They have a number of data types including string, decimal, integer, boolean, date, and time. • They divide elements into simple and complex types. • They also determine the tree structure and how many children a node may have.
  • 38. DKS – 15/09/2019 Slide 38 XML Schemas • XML Schema is a more sophisticated schema language which addresses the drawbacks of DTDs. Supports – Typing of values • E.g. integer, string, etc • Also, constraints on min/max values – User-defined, complex types – Many more features, including • uniqueness and foreign key constraints, inheritance • XML Schema is itself specified in XML syntax, unlike DTDs – More-standard representation, but verbose • XML Scheme is integrated with namespaces • BUT: XML Schema is significantly more complicated than DTDs.
  • 39. Schema for First address Example <?xml version="1.0" encoding="ISO-8859-1" ?> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> <xs:element name="address"> <xs:complexType> <xs:sequence> <xs:element name="name" type="xs:string"/> <xs:element name="email" type="xs:string"/> <xs:element name="phone" type="xs:string"/> <xs:element name="birthday" type="xs:date"/> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
  • 40. Explanation of Example Schema <?xml version="1.0" encoding="ISO-8859-1" ?> • ISO-8859-1, Latin-1, is the same as UTF-8 in the first 128 characters. <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> • www.w3.org/2001/XMLSchema contains the schema standards. <xs:element name="address"> <xs:complexType> • This states that address is a complex type element. <xs:sequence> • This states that the following elements form a sequence and must come in the order shown. <xs:element name="name" type="xs:string"/> • This says that the element, name, must be a string. <xs:element name="birthday" type="xs:date"/> • This states that the element, birthday, is a date. Dates are always of the form yyyy-mm-dd.
  • 41. XSLT Extensible Stylesheet Language Transformations • XSLT is used to transform one xml document into another, often an html document. • The Transform classes are now part of Java 1.4. • A program is used that takes as input one xml document and produces as output another. • If the resulting document is in html, it can be viewed by a web browser. • This is a good way to display xml data.
  • 42. A Style Sheet to Transform address.xml <?xml version="1.0" encoding="ISO-8859-1"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="address"> <html><head><title>Address Book</title></head> <body> <xsl:value-of select="name"/> <br/><xsl:value-of select="email"/> <br/><xsl:value-of select="phone"/> <br/><xsl:value-of select="birthday"/> </body> </html> </xsl:template> </xsl:stylesheet>
  • 43. The Result of the Transformation Alice Lee alee@aol.com 123-45-6789 1983-7-15
  • 44. Parsers • There are two principal models for parsers. • SAX – Simple API for XML – Uses a call-back method – Similar to javax listeners • DOM – Document Object Model – Creates a parse tree – Requires a tree traversal