3. Objectives
• To get an overview of the structure of XML
documents in general
• To learn how to read and write XML documents in
.NET
• To understand advanced concepts of XML
processing such as XPath and XSLT
4. XML Fundamentals
• Short for Extensible Markup Language
• Logical structuring of data
• Domain-specific languages
• Content-oriented markup (in contrast to HTML)
• Self-descriptive structure
• Tags
• XML Schema
• Structural variations (e.g. variable child node count)
5. XML Benefits
• Human-readable
• Automated document validation
• Used in machine-machine communication (e.g.
web services)
10. XML Tree Structure
[Figure: tree diagram of an XML document. A Window root node contains a StackPanel, which contains three DockPanel nodes holding TextBlock, TextBox, StackPanel and Button nodes.]
11. XML Node Types
• Elements
• Document Element
• Content
• Attributes
• Comments
• Namespaces
• Processing Instructions
12. XML Elements
• Main parts of XML documents
• Boundaries defined by start and end tags
• Consist of name, attributes and content
• Can be nested
• Root element is called document element
• Always exactly one document element
<graph id="G" edgedefault="undirected">
  <node id="n0"/>
  <node id="n1"/>
  <edge source="n0" target="n1"/>
</graph>
13. XML Content
• Everything between the start and end tags of an element
• Either simple (text only), complex (child elements only) or mixed (text and child elements)
<graph id="G" edgedefault="undirected">
  <node id="n0"/>
  <node id="n1"/>
  <edge source="n0" target="n1"/>
</graph>
14. XML Attributes
• Associate name-value pairs with elements
• May appear only within start tags and empty-element tags
• Order doesn’t matter
• Attribute names are unique per element
<graph id="G" edgedefault="undirected">
  <node id="n0"/>
  <node id="n1"/>
  <edge source="n0" target="n1"/>
</graph>
15. XML Comments
• Not part of document data
• XML processors may (but don’t need to) make the
text of comments available to applications
• Double-hyphen (“--”) must not occur within
comments
<!-- Connect both nodes. -->
<edge source="n0" target="n1"/>
16. XML Namespaces
• Used for distinguishing nodes with same names
• Bound to a URI by the xmlns attribute
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="n0"/>
    <node id="n1"/>
    <edge source="n0" target="n1"/>
  </graph>
</graphml>
17. Writing XML in .NET
• Abstract base class XmlWriter
• Non-cached
• Forward-only
• Write-only
• Writes to stream or file
• Verifies that the characters are legal XML characters
and that element and attribute names are valid
XML names
• Verifies that the XML document is well formed
18. Creating an XmlWriter
• XmlWriter instances are created using the static
Create method
• XmlWriterSettings class is used to specify the set
of features you want to enable on the XmlWriter
object
• XmlWriterSettings can be reused to create
multiple writer objects
• Allows adding features to an existing writer
• Create method can accept another XmlWriter object
• Underlying XmlWriter object can be another XmlWriter
instance that you want to add additional features to
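The reuse of one XmlWriterSettings object for several writers can be sketched as follows. This is a minimal illustration, not part of the original slides: the class name WriterSettingsSketch and the helper WriteItem are hypothetical, and the writers target a StringBuilder rather than a file to keep the example self-contained.

```csharp
using System;
using System.Text;
using System.Xml;

public static class WriterSettingsSketch
{
    // Hypothetical helper: writes one element with the given settings
    // and returns the resulting markup.
    public static string WriteItem(XmlWriterSettings settings, string name, string value)
    {
        var output = new StringBuilder();

        // Writers are obtained from the static Create method, never via "new XmlWriter()".
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteElementString(name, value);
        }

        return output.ToString();
    }

    public static void Main()
    {
        // One settings object, reused to create two independent writers.
        var settings = new XmlWriterSettings
        {
            Indent = true,
            OmitXmlDeclaration = true
        };

        Console.WriteLine(WriteItem(settings, "title", "Awesome book"));
        Console.WriteLine(WriteItem(settings, "price", "19.95"));
    }
}
```

Disposing the writer through the using block flushes its output, so the StringBuilder is complete by the time it is read.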
20. XmlWriterSettings Defaults
Property             Description                                  Default Value
CheckCharacters      Whether to do character checking.            true
Encoding             Type of text encoding to use.                Encoding.UTF8
Indent               Whether to indent elements.                  false
IndentChars          Character string to use when indenting.      Two spaces
NewLineChars         Character string to use for line breaks.     \r\n
NewLineOnAttributes  Whether to write attributes on a new line.   false
21. Writing XML with the XmlWriter
Member Name           Description
WriteElementString    Writes an entire element node, including a string value.
WriteStartElement     Writes the specified start tag.
WriteEndElement       Closes one element and pops the corresponding namespace scope.
WriteValue            Takes a CLR object and converts the input value to the desired
                      output type using the XML Schema definition language (XSD) data
                      type conversion rules. If the CLR object is a list type, such as
                      IEnumerable, IList, or ICollection, it is treated as an array of
                      the value type.
WriteAttributeString  Writes an attribute with the specified value.
WriteStartAttribute   Writes the start of an attribute.
WriteEndAttribute     Closes the previous WriteStartAttribute call.
WriteNode             Copies everything from the source object to the current writer instance.
WriteAttributes       Writes out all the attributes found at the current position in the XmlReader.
22. Writing XML Elements
C#
using System.Xml;

XmlWriterSettings settings = new XmlWriterSettings
{
    Indent = true
};

using (XmlWriter writer = XmlWriter.Create("books.xml", settings))
{
    // Write XML data: a book element with two text-only children.
    writer.WriteStartElement("book");
    writer.WriteElementString("title", "Awesome book");
    writer.WriteElementString("price", "19.95");
    writer.WriteEndElement();
}
23. Writing XML Attributes
C#
using System.Xml;

XmlWriterSettings settings = new XmlWriterSettings
{
    Indent = true
};

using (XmlWriter writer = XmlWriter.Create("books.xml", settings))
{
    // Write XML data.
    writer.WriteStartElement("book");

    // Write the genre attribute; attributes must be written before any child content.
    writer.WriteAttributeString("genre", "novel");

    writer.WriteElementString("title", "Awesome book");
    writer.WriteElementString("price", "19.95");
    writer.WriteEndElement();
}
24. Namespace Handling
• XmlWriter maintains a namespace stack
corresponding to all the namespaces defined in the
current namespace scope
• WriteElementString, WriteStartElement,
WriteAttributeString and WriteStartAttribute
methods have overloads that allow you to specify a
namespace URI
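The namespace overloads mentioned above might be used as in the following sketch. It is an illustration under assumptions: the class name NamespaceWriterSketch and the method WriteGraph are hypothetical, and the GraphML namespace URI is taken from the namespace example earlier in these slides.

```csharp
using System;
using System.Text;
using System.Xml;

public static class NamespaceWriterSketch
{
    // Namespace URI from the GraphML example in the slides.
    public const string GraphMlNamespace = "http://graphml.graphdrawing.org/xmlns";

    // Hypothetical helper: writes a small namespaced document and returns the markup.
    public static string WriteGraph()
    {
        var output = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            // No prefix is given, so the writer declares the URI as the default namespace.
            writer.WriteStartElement("graphml", GraphMlNamespace);

            // The namespace is already in scope here, so it is not declared a second time.
            writer.WriteStartElement("graph", GraphMlNamespace);
            writer.WriteAttributeString("id", "G");
            writer.WriteEndElement();

            writer.WriteEndElement();
        }

        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(WriteGraph());
    }
}
```

Because the writer tracks the namespaces in scope, the xmlns declaration should appear only once, on the graphml element.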
26. Reading XML in .NET
• Abstract base class XmlReader
• Non-cached
• Forward-only
• Read-only
• Reads from stream or file
• Verifies that the XML document is well formed
• Can validate the data against a DTD or schema
27. Creating an XmlReader
• XmlReader instances are created using the static
Create method
• XmlReaderSettings class is used to specify the set
of features you want to enable on the XmlReader
object
• XmlReaderSettings can be reused to create
multiple reader objects
• Allows adding features to an existing reader
• Create method can accept another XmlReader object
• Underlying XmlReader object can be another XmlReader
instance that you want to add additional features to
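The effect of XmlReaderSettings on what a reader reports can be sketched as follows. This is a minimal illustration rather than the slides' own code: the class name ReaderSettingsSketch, the helper NodeTypes and the sample markup are hypothetical, and the input comes from a string instead of a file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderSettingsSketch
{
    // Hypothetical helper: lists the node types the reader reports for the given input.
    public static List<XmlNodeType> NodeTypes(string xml, XmlReaderSettings settings)
    {
        var types = new List<XmlNodeType>();

        // Readers are obtained from the static Create method.
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                types.Add(reader.NodeType);
            }
        }

        return types;
    }

    public static void Main()
    {
        const string xml = "<book><!-- note --> <title>T</title></book>";

        // One settings object can be reused for any number of readers.
        var filtered = new XmlReaderSettings
        {
            IgnoreComments = true,
            IgnoreWhitespace = true
        };

        Console.WriteLine(string.Join(", ", NodeTypes(xml, new XmlReaderSettings())));
        Console.WriteLine(string.Join(", ", NodeTypes(xml, filtered)));
    }
}
```

With the default settings the comment and the whitespace between the tags are reported as nodes of their own; with the filtered settings they are skipped.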
29. XmlReaderSettings Defaults
Property                      Description                                                     Default Value
CheckCharacters               Whether to do character checking.                               true
IgnoreComments                Whether to ignore comments.                                     false
IgnoreProcessingInstructions  Whether to ignore processing instructions.                      false
IgnoreWhitespace              Whether to ignore insignificant white space.                    false
Schemas                       XmlSchemaSet to use when performing schema validation.          Empty XmlSchemaSet
ValidationType                Whether to perform validation or type assignment when reading.  ValidationType.None
30. Working with XmlReader
• Current node refers to the node on which the
reader is positioned
• Move through the data and read the contents of a
node
• Reader is advanced using any of the Read methods
31. Reading XML Elements
Member Name        Description
IsStartElement     Checks if the current node is a start tag or an empty element tag.
ReadStartElement   Checks that the current node is an element and advances the reader to the next node.
ReadEndElement     Checks that the current node is an end tag and advances the reader to the next node.
ReadElementString  Reads a text-only element.
ReadToDescendant   Advances the XmlReader to the next descendant element with the specified name.
ReadToNextSibling  Advances the XmlReader to the next sibling element with the specified name.
IsEmptyElement     Checks if the current element has an empty element tag.
After the XmlReader is positioned on an element, the node properties, such as
Name, reflect the element values.
32. Reading XML Elements
C#
using System;
using System.Xml;

using (XmlReader reader = XmlReader.Create("books.xml"))
{
    reader.Read();

    // ReadStartElement throws an XmlException if the next content node
    // is not an element with the expected name.
    reader.ReadStartElement("book");

    reader.ReadStartElement("title");
    Console.Write("The content of the title element: ");
    Console.WriteLine(reader.ReadString());
    reader.ReadEndElement();

    reader.ReadStartElement("price");
    Console.Write("The content of the price element: ");
    Console.WriteLine(reader.ReadString());
    reader.ReadEndElement();

    reader.ReadEndElement();
}
33. Reading XML Attributes
After MoveToAttribute has been called, the node properties, such as Name, reflect
the properties of that attribute, and not the containing element it belongs to.
Member Name           Description
AttributeCount        Gets the number of attributes on the element.
GetAttribute          Gets the value of the attribute.
HasAttributes         Gets a value indicating whether the current node has any attributes.
Item                  Gets the value of the specified attribute.
MoveToAttribute       Moves to the specified attribute.
MoveToElement         Moves to the element that owns the current attribute node.
MoveToFirstAttribute  Moves to the first attribute.
MoveToNextAttribute   Moves to the next attribute.
34. Reading XML Attributes
C#
using System;
using System.Xml;

XmlReaderSettings settings = new XmlReaderSettings
{
    IgnoreWhitespace = true
};

using (XmlReader reader = XmlReader.Create("books.xml", settings))
{
    while (reader.Read())
    {
        if (reader.HasAttributes)
        {
            Console.WriteLine("Attributes of <" + reader.Name + ">");

            // Visit each attribute of the current node in turn.
            while (reader.MoveToNextAttribute())
            {
                Console.WriteLine(" {0}={1}", reader.Name, reader.Value);
            }

            // Move the reader back to the element node.
            reader.MoveToElement();
        }
    }
}
35. Reading Typed Data
• XmlReader class permits callers to read XML data and
return values as simple-typed CLR values rather than
strings
• Gets values in the representation that is most
appropriate for the coding job without having to
manually perform value conversions and parsing
• ReadContentAsBoolean, ReadContentAsDateTime,
ReadContentAsDouble, ReadContentAsLong,
ReadContentAsInt, and ReadContentAsString methods are
used to return a specific CLR object
• ReadElementContentAs method is used to read element
content and return an object of the type specified
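Reading typed content might look like the following sketch. It assumes a hypothetical helper ReadPrice in a class TypedReadSketch and reuses the price element from the books.xml examples; the input is passed as a string to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Xml;

public static class TypedReadSketch
{
    // Hypothetical helper: returns the content of the first <price> element as a double.
    public static double ReadPrice(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // Skip ahead to the first element named "price".
            if (!reader.ReadToFollowing("price"))
            {
                throw new XmlException("No <price> element found.");
            }

            // Converts the text content using the XSD rules for xs:double,
            // so no manual double.Parse call is needed.
            return reader.ReadElementContentAsDouble();
        }
    }

    public static void Main()
    {
        const string xml = "<book><title>Awesome book</title><price>19.95</price></book>";
        double price = ReadPrice(xml);
        Console.WriteLine(price * 2);
    }
}
```

The conversion follows the XSD lexical rules rather than the current culture, so "19.95" is parsed the same way on every machine.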
36. DOM vs. SAX Parsing
• Simple API for XML (SAX)
• Sequential access
• Required memory is proportional to maximum depth of
the input document
• Document validation requires keeping track of id
attributes, open elements, etc.
• Document Object Model (DOM)
• Operates on the document as a whole
• Required memory is proportional to the entire
document length
37. Working with XmlDocument
• In-memory tree representation of an XML
document
• Enables navigation and editing of parsed
documents, including adding and removing nodes
• XmlNode object is the basic object in the DOM tree
38. Working with XmlDocument
C#
using System;
using System.Xml;

// Load XML document.
XmlDocument document = new XmlDocument();
document.Load("books.xml");

// Find book element.
XmlNode bookNode = document["book"];

// Add author element.
XmlElement element = document.CreateElement("author");
XmlText text = document.CreateTextNode("Nick Prühs");
bookNode.AppendChild(element);
element.AppendChild(text);

// Write document to console; disposing the writer flushes its output.
using (XmlWriter writer = XmlWriter.Create(Console.Out))
{
    document.WriteTo(writer);
}
39. Navigating XML Documents
• XML Path Language (XPath) selects nodes from an
XML document
• Each expression is evaluated with respect to a
context node (e.g. the “child” axis selects the children of that node)
• Location path consists of a sequence of location
steps
40. XPath Location Steps
• Axis
• Specifies the tree relationship between the nodes
selected by the location step and the context node
• Node test
• Specifies the node type and expanded-name of the
nodes selected by the location step
• Predicate(s)
• Further refine the set of nodes selected by the location
step
41. XPath Axes
Axis                Description
ancestor            Selects all ancestors (parent, grandparent, etc.) of the current node.
ancestor-or-self    Selects all ancestors (parent, grandparent, etc.) of the current node
                    and the current node itself.
attribute           Selects all attributes of the current node.
child               Selects all children of the current node.
descendant          Selects all descendants (children, grandchildren, etc.) of the current node.
descendant-or-self  Selects all descendants (children, grandchildren, etc.) of the current node
                    and the current node itself.
42. XPath Axes
Axis               Description
following          Selects everything in the document after the closing tag of the current node.
following-sibling  Selects all siblings after the current node.
namespace          Selects all namespace nodes of the current node.
parent             Selects the parent of the current node.
preceding          Selects all nodes that appear before the current node in the document,
                   except ancestors, attribute nodes and namespace nodes.
preceding-sibling  Selects all siblings before the current node.
self               Selects the current node.
43. XPath Node Tests
• node() is true for any node of any type
• * is true for any node of the principal node type.
• child::* will select all element children of the context
node
• attribute::* will select all attributes of the context
node
• comment() is true for any comment node
• text() is true for any text node
• processing-instruction() is true for any
processing instruction
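The node tests above can be tried out against a small document, as in the following sketch. The class name NodeTestSketch, the helper Count and the sample markup are hypothetical; the context node is the book element.

```csharp
using System;
using System.Xml;

public static class NodeTestSketch
{
    // Hypothetical sample: two attributes, one comment, one element child, one text child.
    public const string Sample =
        "<book genre=\"scifi\" year=\"1965\"><!-- classic --><title>Dune</title>Out of print</book>";

    // Hypothetical helper: counts the nodes that an XPath expression selects
    // relative to the given context node.
    public static int Count(XmlNode context, string xpath)
    {
        return context.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        var document = new XmlDocument();
        document.LoadXml(Sample);
        XmlNode book = document.DocumentElement;

        Console.WriteLine(Count(book, "child::*"));         // element children only
        Console.WriteLine(Count(book, "attribute::*"));     // all attributes
        Console.WriteLine(Count(book, "child::comment()")); // comment children
        Console.WriteLine(Count(book, "child::text()"));    // text children
        Console.WriteLine(Count(book, "child::node()"));    // children of any type
    }
}
```

The same * test selects elements on the child axis and attributes on the attribute axis, because the principal node type depends on the axis.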
44. XPath Predicates
Restrict a node-set to select only those nodes for which some
condition is true.
Example:
book[@genre='scifi']
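A predicate like the one above could be evaluated in .NET with XmlDocument.SelectNodes, as in this sketch. The class name XPathPredicateSketch, the helper SciFiTitles and the books/book/title layout of the sample document are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XPathPredicateSketch
{
    // Hypothetical helper: returns the titles of all books whose genre attribute is "scifi".
    // Assumes a layout of <books><book genre="..."><title>...</title></book></books>.
    public static List<string> SciFiTitles(string xml)
    {
        var document = new XmlDocument();
        document.LoadXml(xml);

        var titles = new List<string>();

        // The predicate in square brackets filters the book elements
        // before the final step selects their title children.
        XmlNodeList nodes = document.SelectNodes("/books/book[@genre='scifi']/title");
        foreach (XmlNode node in nodes)
        {
            titles.Add(node.InnerText);
        }

        return titles;
    }

    public static void Main()
    {
        const string xml =
            "<books>" +
            "<book genre='scifi'><title>Dune</title></book>" +
            "<book genre='novel'><title>Emma</title></book>" +
            "</books>";

        foreach (string title in SciFiTitles(xml))
        {
            Console.WriteLine(title);
        }
    }
}
```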
46. XSLT Fundamentals
• Short for Extensible Stylesheet Language
Transformations
• Transforms XML documents into other formats
such as other XML documents, HTML or plain text
• Turing-complete
48. XSLT Processing
1. Read stylesheet file.
2. Build source tree from input XML document.
3. For each node in the source tree:
   1. Process source tree node.
   2. Find best-matching template in the stylesheet.
   3. Evaluate the template contents.
   4. Create node(s) in the result tree.
49. Common XSLT Elements
Element              Description
xsl:apply-templates  Applies the best-matching templates to the children of the
                     current node. If “select” is specified, templates are
                     applied only to the nodes selected by that expression.
xsl:choose           Contains xsl:when blocks and up to one xsl:otherwise block.
xsl:for-each         Creates a loop which repeats for every selected node.
xsl:stylesheet       Top-level element. Occurs only once in a stylesheet document.
xsl:template         Specifies processing templates.
xsl:value-of         Outputs the string value of the selected expression.
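A small transformation using some of these elements might be run from .NET with XslCompiledTransform, as sketched below. The class name XsltSketch, the embedded stylesheet and the books/book/title/price input layout are assumptions made for illustration; the stylesheet produces plain text.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XsltSketch
{
    // Hypothetical stylesheet: prints "title: price" on one line per book.
    private const string Stylesheet = @"<xsl:stylesheet version=""1.0""
    xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""text""/>
  <xsl:template match=""/books"">
    <xsl:for-each select=""book"">
      <xsl:value-of select=""title""/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select=""price""/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>";

    public static string Transform(string inputXml)
    {
        // Read and compile the stylesheet.
        var transform = new XslCompiledTransform();
        using (XmlReader stylesheetReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(stylesheetReader);
        }

        // Apply the templates to the input document and collect the result.
        var output = new StringWriter();
        using (XmlReader inputReader = XmlReader.Create(new StringReader(inputXml)))
        {
            transform.Transform(inputReader, null, output);
        }

        return output.ToString();
    }

    public static void Main()
    {
        const string xml =
            "<books><book><title>Dune</title><price>9.99</price></book></books>";
        Console.Write(Transform(xml));
    }
}
```

Whitespace between the instructions in the stylesheet is not copied to the output; only the text inside xsl:text and the selected values are.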
58. SaveFileDialog Control
C#
using Microsoft.Win32;

// Show save file dialog box.
SaveFileDialog saveFileDialog = new SaveFileDialog
{
    AddExtension = true,
    CheckPathExists = true,
    DefaultExt = ".txt",
    FileName = "TextFile",
    Filter = "Text files (.txt)|*.txt",
    ValidateNames = true
};

// ShowDialog returns a nullable bool; anything but true means the user cancelled.
var result = saveFileDialog.ShowDialog();
if (result != true)
{
    return;
}

// Open file stream.
using (var stream = saveFileDialog.OpenFile())
{
    // Do something.
}
59. Assignment #4
1. Save Map
   1. Add a new MenuItem “Save As” to your MainWindow,
      along with a ToolBar button with tooltip.
   2. Allow saving a map file by showing a Save File Dialog
      when the Save As command is executed.
      1. Map files should be well-formed, valid XML files.
      2. Each map file should contain the width and height of the
         map, as well as the types of all map tiles.
   3. It should not be possible to save a map if there is
      none.
60. Assignment #4
2. Load Map
   1. Add a new MenuItem “Open” to your MainWindow,
      along with a ToolBar button with tooltip.
   2. Add separators to both your toolbar and menu.
   3. Allow loading a map file by showing an Open File
      Dialog when the Open command is executed.
   4. Opening a corrupt or invalid XML file should result in
      an error message. The current map should only be
      discarded if the map file could be successfully
      opened.
61. References
• MSDN. XML Documents and Data.
http://msdn.microsoft.com/en-us/library/2bcctyt8%28v=vs.110%29.aspx, May 2016.
• Luttenberger. Internet Applications – Web Services Primer.
Department for Computer Science, CAU Kiel, 2008.
• Brandes, Eiglsperger, Lerner. GraphML Primer.
http://graphml.graphdrawing.org/primer/graphml-primer.html,
May 2016.
• Barnes, Finch. COLLADA – Digital Asset Schema Release 1.5.0
Specification. Sony Computer Entertainment Inc., April 2008.
• Clark, DeRose. XML Path Language (XPath).
http://www.w3.org/TR/xpath/, November 16, 1999.
• w3schools.com. XPath Axes.
http://www.w3schools.com/xsl/xpath_axes.asp, May 2016.
63. 5 Minute Review Session
• What are the main benefits of using XML?
• What are the three main XML processing steps?
• Which XML node types do you know?
• How do you properly create XmlWriter instances?
• How do you read typed XML content?
• What is the difference between DOM and SAX
parsing?
• What are the main parts of an XPath location step?
• What is XSLT and how does it work?