An introduction to querying XML documents with XPath, with a focus on the abbreviations that make XPath efficient to use. A larger section explains and illustrates the use of axes in XPath.
1. Introduction to XPath
Kristian Torp
Department of Computer Science
Aalborg University
people.cs.aau.dk/~torp
torp@cs.aau.dk
November 3, 2015
daisy.aau.dk
Kristian Torp (Aalborg University) Introduction to XPath November 3, 2015 1 / 59
2. Outline
1 Introduction
2 Tree Terminology
3 Location Path and Steps
4 XPath Path Expressions
5 Axes
6 Summary
3. Learning Goals and Focus
Learning Goals
Understand the XPath data model
Know the basic tree terminology
Be good at querying XML documents using XPath
Know the abbreviations used in XPath
Very handy to know in practice
Compact and quite readable!
Database Focus
All XML technologies are presented from a database perspective, also
called a data focus (i.e., not a document focus)!
5. Introduction
Example
Find all courses: /coursecatalog/course
Find the semesters: //semester/text()
Overview
A language for
finding/addressing information in XML documents
navigating through elements and attributes in an XML document
Used in many XML technologies, e.g., XQuery and XPointer
A part of the XSLT recommendation
Microsoft/Visual Studio makes heavy usage of XSLT
The data model is an abstract and logical structure of an XML
document
Called a node tree
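The two example queries can be tried with Python's standard library. This is a minimal sketch, not the tooling used in the slides: the XML text is an assumed serialization of the course catalog tree, and xml.etree.ElementTree supports only a subset of XPath, so the paths are written relative to the root element and the text is read from the .text attribute instead of using text().

```python
import xml.etree.ElementTree as ET

# Assumed serialization of the course catalog used in the examples.
CATALOG = """<coursecatalog>
  <course id="4"><name>OOP</name><semester>3</semester><desc>snip</desc></course>
  <course id="2"><name>DB</name><semester>7</semester><desc>snip</desc></course>
</coursecatalog>"""

root = ET.fromstring(CATALOG)  # the coursecatalog element

# /coursecatalog/course, written relative to the root element
courses = root.findall("./course")

# //semester/text(): ElementTree has no text() node test, so read .text
semesters = [s.text for s in root.findall(".//semester")]

print([c.get("id") for c in courses])  # ['4', '2']
print(semesters)                       # ['3', '7']
```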
6. The Node Tree
Terminology
Document node: The entire XML document
Also called the document root or the root node
Element node: An XML element
A special one is the document element or root element
Text node: The text strings in an element node
Attribute node: An attribute
Example (A Node Tree)
/
  coursecatalog
    course
      id=4
      name
        OOP
      semester
        3
      desc
        snip
    course
      id=2
      name
        DB
      semester
        7
      desc
        snip
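The four node kinds can be inspected with a DOM parser. The sketch below uses xml.dom.minidom from the Python standard library on an assumed one-course fragment of the catalog. Note that the DOM is a related but different model from the XPath data model, so this only illustrates the terminology.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Assumed fragment, written without whitespace between tags so that no
# whitespace-only text nodes appear in the tree.
doc = parseString(
    '<coursecatalog><course id="4"><name>OOP</name></course></coursecatalog>'
)

root_element = doc.documentElement   # the document element (root element)
course = root_element.firstChild     # an element node
name = course.firstChild             # an element node
id_attr = course.getAttributeNode("id")
text = name.firstChild               # the text node holding "OOP"

print(doc.nodeType == Node.DOCUMENT_NODE)         # True
print(root_element.nodeType == Node.ELEMENT_NODE) # True
print(id_attr.nodeType == Node.ATTRIBUTE_NODE)    # True
print(text.nodeType == Node.TEXT_NODE)            # True
print(root_element.parentNode is doc)             # True
```

The last line shows that the parent of the document element is the document node.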
7. Example: Find the Courses
Example (Document)
/
  coursecatalog
    course
      id=4
      name
        OOP
      semester
        3
      desc
        snip
    course
      id=2
      name
        DB
      semester
        7
      desc
        snip
Query
/coursecatalog/course
Result
course
  id=4
  name
    OOP
  semester
    3
  desc
    snip
course
  id=2
  name
    DB
  semester
    7
  desc
    snip
8. Example: Find the Semesters
Example (Document)
/
  coursecatalog
    course
      id=4
      name
        OOP
      semester
        3
      desc
        snip
    course
      id=2
      name
        DB
      semester
        7
      desc
        snip
Query
//semester/text()
Result
3 7
9. Major Components
Components
Nodes
XML document treated as a tree of nodes
Examples: Elements, attributes, and comments
Path expressions
Select a set of nodes in an XML document
Examples: /, /coursecatalog/course
Standard functions
Approximately 100 built-in functions
Examples: concat('a', 'b'), round(1.5)
10. Quiz
Example (Document)
/
  coursecatalog
    course
      id=4
      name
        OOP
      semester
        3
      desc
        snip
    course
      id=2
      name
        DB
      semester
        7
      desc
        snip
Questions
Who is the parent of the document element?
How many document elements are there in an XML document?
How many elements can there be in an XML document?
Are elements and attributes the same node type?
15. Tree Terminology
Example (A Node Tree)
[Figure: node tree with nodes labeled 1 to 9, A, and B]
Children of 1
Quiz
Who are the children of 3?
16. Tree Terminology
Example (A Node Tree)
[Figure: node tree with nodes labeled 1 to 9, A, and B]
Siblings of 9
Quiz
Who are the siblings of 3?
17. Tree Terminology
Example (A Node Tree)
[Figure: node tree with nodes labeled 1 to 9, A, and B]
Ancestors of 6
Quiz
Who are the ancestors of 9?
18. Tree Terminology
Example (A Node Tree)
[Figure: node tree with nodes labeled 1 to 9, A, and B]
Parent of 8
Quiz
Who is the parent of 4?
19. Tree Terminology
Example (A Node Tree)
[Figure: node tree with nodes labeled 1 to 9, A, and B]
Descendants of 1
Quiz
Who are the descendants of 5?
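The five relationships can be written down as small functions. This is a sketch over a hypothetical six-node tree (not the tree drawn on the slides), stored as a mapping from each node to its list of children.

```python
# Hypothetical tree, not the one from the slides: node -> list of children.
TREE = {
    "1": ["2", "3"],
    "2": ["4", "5"],
    "3": ["6"],
    "4": [], "5": [], "6": [],
}
PARENT = {child: node for node, kids in TREE.items() for child in kids}


def children(node):
    return TREE[node]


def parent(node):
    return PARENT.get(node)  # None for the root: it has no parent


def siblings(node):
    p = parent(node)
    return [] if p is None else [n for n in TREE[p] if n != node]


def ancestors(node):
    result = []
    node = parent(node)
    while node is not None:      # walk upwards until the root is passed
        result.append(node)
        node = parent(node)
    return result


def descendants(node):
    result = []
    for child in TREE[node]:     # depth-first: a child, then its subtree
        result.append(child)
        result.extend(descendants(child))
    return result


print(children("1"))     # ['2', '3']
print(siblings("4"))     # ['5']
print(ancestors("4"))    # ['2', '1']
print(descendants("1"))  # ['2', '4', '5', '3', '6']
```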
20. Quiz
Example (Another Node Tree)
[Figure: node tree with nodes labeled 1 to 9 and A to J]
Questions
Parent of E?
Children of 2?
Descendants of 2?
22. Location Path and Location Step I
Definition (Location Path)
A location path evaluates to a sequence of nodes
Example (Location Path)
/child::coursecatalog/child::course[name='OOP' or name='DB'][@id<10]
Definition (Location Step)
A location path consists of a number of location steps separated by /.
Example (Location Steps)
child::coursecatalog
child::course[name='OOP' or name='DB'][@id<10]
23. Location Path and Location Step II
Definition
A location step consists of an axis, a node test, and zero or more predicates
Example (One)
child::coursecatalog
Axis: child
Node test: coursecatalog
Predicates: empty
Example (Two)
child::course[name='OOP' or name='DB'][@id<10]
Axis: child
Node test: course
Predicates: [name='OOP' or name='DB'][@id<10]
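The decomposition of a location step can be sketched with a regular expression. The parse_step helper below is hypothetical and deliberately simplistic: it only handles unabbreviated steps of the form axis::nodetest[predicate]... and assumes that predicates contain no nested square brackets.

```python
import re

# axis "::" node-test, followed by zero or more [predicate] parts.
STEP = re.compile(
    r"^(?P<axis>[a-z-]+)::(?P<test>[^\[\]]+)(?P<preds>(\[[^\[\]]*\])*)$"
)


def parse_step(step):
    """Split an unabbreviated location step into (axis, node test, predicates)."""
    match = STEP.match(step)
    if match is None:
        raise ValueError(f"not an unabbreviated location step: {step!r}")
    predicates = re.findall(r"\[([^\[\]]*)\]", match.group("preds"))
    return match.group("axis"), match.group("test"), predicates


print(parse_step("child::coursecatalog"))
# ('child', 'coursecatalog', [])
print(parse_step("child::course[name='OOP' or name='DB'][@id<10]"))
# ('child', 'course', ["name='OOP' or name='DB'", '@id<10'])
```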
24. Abbreviations
Most Used
Abbreviation      Meaning
.                 self::node()
..                parent::node()
//coursecatalog   /descendant-or-self::node()/child::coursecatalog
course            child::course
Example (Abbreviations in Action)
Abbreviation            Meaning
//name                  /descendant-or-self::node()/child::name
//name/..               /descendant-or-self::node()/child::name/parent::node()
/coursecatalog/course   /child::coursecatalog/child::course
Note
Abbreviations make the expression more readable
Sometimes abbreviations can make it hard to guess the result
25. Evaluation of Location Path I
Example XML Document
/
  coursecatalog
    course
      id=4
      name
        OOP
      semester
        3
      desc
        snip
    course
      id=2
      name
        DB
      semester
        7
      desc
        snip
Evaluate the Location Path
/child::coursecatalog/child::course[name='OOP' or name='DB'][@id<10]/name
26. Evaluation of Location Path II
The Steps in the Evaluation
1 The path starts with /, so the context node is set to the root node
2 Evaluate the location step child::coursecatalog
3 The result is the coursecatalog root element node
4 Set the context to the root element node
5 Evaluate the location step
child::course[name='OOP' or name='DB'][@id<10]
6 The result is the two course element nodes
7 Set the context to the OOP course element node
8 Evaluate the location step child::name
9 The result is the name element node, which is the first part of the result
10 Set the context to the DB course element node
11 Evaluate the location step child::name
12 The result is the name element node, which is the last part of the result
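The evaluation steps above can be mimicked by hand in Python. This is a sketch under stated assumptions: the XML text is an assumed serialization of the course catalog, and because xml.etree.ElementTree has no document node, the walk starts from the result of the first location step (the root element).

```python
import xml.etree.ElementTree as ET

# Assumed serialization of the course catalog used in the examples.
CATALOG = """<coursecatalog>
  <course id="4"><name>OOP</name><semester>3</semester><desc>snip</desc></course>
  <course id="2"><name>DB</name><semester>7</semester><desc>snip</desc></course>
</coursecatalog>"""

root_element = ET.fromstring(CATALOG)

# Steps 1-4: the result of child::coursecatalog is the root element,
# which becomes the context for the next location step.
context = [root_element]

# Steps 5-6: child::course[name='OOP' or name='DB'][@id<10]
courses = [
    child
    for node in context
    for child in node                              # child axis
    if child.tag == "course"                       # node test
    and child.findtext("name") in ("OOP", "DB")    # first predicate
    and int(child.get("id")) < 10                  # second predicate
]

# Steps 7-12: evaluate child::name once per course, in document order.
names = [child for course in courses for child in course if child.tag == "name"]

print([n.text for n in names])  # ['OOP', 'DB']
```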
27. Context
Definition (Context)
A context node (a node in the node tree)
A context size and context position
A set of variable bindings
A function library
A set of namespace declarations
Definition (Context Size)
The context size is the length of the sequence of nodes returned by the
previous location step
Definition (Context Position)
The context position is the position of the context node in the sequence being evaluated
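Context position and context size can be illustrated with a plain Python list standing in for the node sequence. The contexts helper is hypothetical. Positions are counted from 1, as in XPath, which is what makes predicates such as [1] and [last()] work.

```python
def contexts(nodes):
    """Pair every node with its context position (1-based) and the context size."""
    size = len(nodes)
    return [(node, position, size) for position, node in enumerate(nodes, start=1)]


# Stand-ins for the two course element nodes returned by a location step.
for node, position, size in contexts(["course OOP", "course DB"]):
    is_first = position == 1     # what the predicate [1] tests
    is_last = position == size   # what the predicate [last()] tests
    print(node, position, size, is_first, is_last)
```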
29. Compact Notation for Node Tree
Example (The Node Tree)
/
  coursecatalog
    course
      id=4
      name
        OOP
      semester
        3
      desc
        snip
    course
      id=2
      name
        DB
      semester
        7
      desc
        snip
Example (The Equivalent Compact Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
30. Example: Find the Courses
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Query
/coursecatalog/course
Result
course:OOP
id=4 name:OOP sem:3 dsc
course:DB
id=2 name:DB sem:7 dsc
31. Example: Find Elements That Do Not Exist
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Query
/coursecatalog/name
Result
Empty: there is no name element directly below coursecatalog!
Note that it is not an error!
32. Example: Find the Course Names
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Query
/coursecatalog//name
Result
name:OOP name:DB
33. Examples: Find the OOP Course
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Query
/coursecatalog/course[name="OOP"]
Result
course:OOP
id=4 name:OOP sem:3 dsc
34. Example: Find a Course Based on ID
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Query
/coursecatalog/course[@id="2"]
Result
course:DB
id=2 name:DB sem:7 dsc
35. Example: Filter on an Attribute
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Query
/coursecatalog/course[@id="2"]/name
Result
name:DB
36. Example: Get the Name of a Course as a String
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Query
/coursecatalog/course[@id="2"]/name/text()
Result
The string DB
37. Example: Use Parent Axis
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Query
//course[@id="2"]/parent::node()
Result
The coursecatalog element node (the root element), i.e., the entire course catalog
38. Example: Use Child Axis
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Query
/coursecatalog/child::node()
Result
course:OOP
id=4 name:OOP sem:3 dsc
course:DB
id=2 name:DB sem:7 dsc
39. Example: Use Descendant Axis
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Query
/coursecatalog/descendant::node()
Result
8 element nodes
6 text nodes
40. Example: Use Functions
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Query
concat("hello, ", "world!")
Result
The string ’hello, world!’
41. Example: Functions and XPath Expressions
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Query
concat("hello ", /coursecatalog/course[@id="2"]/name/text())
Result
The string ’hello DB’
42. Most used Path Expressions
Often Used Expressions
Path Expression        Description
/                      select from the root node
//NodeName             select all NodeName element nodes in the document
.                      select the current node
..                     select the parent of the current node
/NodeName[@id>7]       select based on an attribute node
/NodeName[Node2='H']   select based on a child element node
/NodeName/text()       select the text nodes
/NodeName/attribute()  select the attribute nodes (XPath 2.0; /NodeName/@* in XPath 1.0)
/NodeName[1]           select the first NodeName element node
/NodeName[last()]      select the last NodeName element node
Note
Almost like Linux/Unix directory navigation
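Several of these expressions can be tried with xml.etree.ElementTree from the Python standard library. This is a sketch only: the XML text is an assumed serialization of the course catalog, the paths are written relative to the root element, and ElementTree implements just a subset of XPath (for example, equality predicates work but comparisons such as @id>7 and the text() and attribute() tests do not).

```python
import xml.etree.ElementTree as ET

# Assumed serialization of the course catalog used in the examples.
CATALOG = """<coursecatalog>
  <course id="4"><name>OOP</name><semester>3</semester><desc>snip</desc></course>
  <course id="2"><name>DB</name><semester>7</semester><desc>snip</desc></course>
</coursecatalog>"""

root = ET.fromstring(CATALOG)  # the coursecatalog element

by_attribute = root.findall("./course[@id='2']")   # select on an attribute
by_child = root.findall("./course[name='OOP']")    # select on a child element
first = root.find("./course[1]")                   # first course
last = root.find("./course[last()]")               # last course
all_names = [n.text for n in root.findall(".//name")]  # like //name/text()
parents = root.findall(".//name/..")               # parents of the name elements
missing = root.findall("./teacher")                # no match: empty, not an error

print([c.findtext("name") for c in by_attribute])  # ['DB']
print([c.get("id") for c in by_child])             # ['4']
print(first.get("id"), last.get("id"))             # 4 2
print(all_names)                                   # ['OOP', 'DB']
print([p.tag for p in parents])                    # ['course', 'course']
print(missing)                                     # []
```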
43. Quiz
Example (Node Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Questions
/coursecatalog/course/name returns?
/coursecatalog/teacher returns?
/coursecatalog is the same as /?
/coursecatalog/course/../course/../course returns?
/coursecatalog/course[@id<11]/name/text() returns?
45. Node Numbering
Example (Node Tree)
[Figure: node tree with nodes numbered 1 to 14]
Note
Depth-first numbering of nodes
Used for relative access to other nodes
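Depth-first numbering is the order in which a tree walker visits the nodes. The sketch below numbers the element nodes of an assumed serialization of the course catalog. Text and attribute nodes are left out because xml.etree.ElementTree does not model them as separate nodes.

```python
import xml.etree.ElementTree as ET

# Assumed serialization of the course catalog used in the examples.
CATALOG = """<coursecatalog>
  <course id="4"><name>OOP</name><semester>3</semester><desc>snip</desc></course>
  <course id="2"><name>DB</name><semester>7</semester><desc>snip</desc></course>
</coursecatalog>"""

root = ET.fromstring(CATALOG)

# Element.iter() visits the element itself and then its subtree depth-first,
# which is the document order of the element nodes.
numbering = {number: elem.tag for number, elem in enumerate(root.iter(), start=1)}

for number, tag in numbering.items():
    print(number, tag)
```

A node's whole subtree is numbered before its next sibling, so the second course gets number 6, after the three children of the first course.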
46. Forward and Backward Axes
Definition (Axis)
An axis is a sequence of nodes located relative to the context node.
Definition (Forward Axis)
A forward axis can only return the context node or nodes that are after
it in the document order.
Definition (Backward Axis)
A backward axis can only return the context node or nodes that are
before it in the document order.
47. The Axes
Axis Name           Direction  Description
attribute           forward    All my attributes
self                forward    Myself
child               forward    All my children
descendant          forward    All my children, grandchildren, etc.
parent              backward   My unique parent
ancestor            backward   My parent, grandparent, etc.
following           forward    All after me that are not descendants
preceding           backward   All before me that are not ancestors
following-sibling   forward    My “younger” siblings
preceding-sibling   backward   My “elder” siblings
descendant-or-self  forward    Myself and all my descendants
ancestor-or-self    backward   Myself and all my ancestors
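The vertical axes in the table can be emulated on top of xml.etree.ElementTree. This is a sketch with hypothetical helper names, restricted to element nodes, and run on an assumed serialization of the course catalog. ElementTree keeps no parent pointers, so a parent map is built first.

```python
import xml.etree.ElementTree as ET

# Assumed serialization of the course catalog used in the examples.
CATALOG = """<coursecatalog>
  <course id="4"><name>OOP</name><semester>3</semester><desc>snip</desc></course>
  <course id="2"><name>DB</name><semester>7</semester><desc>snip</desc></course>
</coursecatalog>"""

root = ET.fromstring(CATALOG)
PARENT = {child: parent for parent in root.iter() for child in parent}


def child(node):
    return list(node)


def parent(node):
    return [PARENT[node]] if node in PARENT else []


def descendant(node):
    return list(node.iter())[1:]   # iter() yields the node itself first


def ancestor(node):
    result = []
    while node in PARENT:          # backward axis: nearest ancestor first
        node = PARENT[node]
        result.append(node)
    return result


def descendant_or_self(node):
    return [node] + descendant(node)


def ancestor_or_self(node):
    return [node] + ancestor(node)


def tags(nodes):
    return [n.tag for n in nodes]


oop_name = root.find("./course/name")
print(tags(child(root)))                 # ['course', 'course']
print(tags(ancestor(oop_name)))          # ['course', 'coursecatalog']
print(len(descendant(root)))             # 8
print(tags(ancestor_or_self(oop_name)))  # ['name', 'course', 'coursecatalog']
```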
48. Child
Finds
The immediate descendants of the current node.
Numbering
[Figure: node tree with nodes numbered 1 to 14 in depth-first order]
Example
[Figure: the context node (cur) with its three children numbered 1, 2, 3]
Quiz
What is the direction of the child axis (and why)?
49. Child Examples
Example (Document Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Queries
/coursecatalog/child::node()
Result: the two course nodes
/coursecatalog/course/child::node()
Result: six element nodes
/coursecatalog/course/attribute()
Result: two attribute nodes
/coursecatalog/course/semester/child::node()
Result: two text nodes
50. Parent
Finds
The one node immediately above
Numbering
[Figure: node tree with nodes numbered 1 to 14 in depth-first order]
Example
[Figure: the context node (cur) with its parent numbered 1]
Quiz
What is the direction of the parent axis (and why)?
51. Parent Examples
Example (Document Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Queries
/coursecatalog/course[@id='2']/name/parent::node()
Result: the course element node with id = 2
/coursecatalog/course/name/parent::node()
Result: the two course element nodes
/coursecatalog/parent::node()
Result: the document root
/parent::node()
Result: empty
52. Descendant
Finds
Children all the way down the tree
Numbering
[Figure: node tree with nodes numbered 1 to 14 in depth-first order]
Example
[Figure: the descendants of the context node (cur), numbered 1 to 5]
53. Descendant Examples
Example (Document Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Queries
/coursecatalog/descendant::node()
Result: 8 element nodes + 6 text nodes
/coursecatalog/course[name="OOP"]/descendant::node()
Result: 3 element nodes + 3 text nodes
/coursecatalog/course[name="OOP"]/descendant::node()/attribute()
Result: empty for the tree shown (the id attribute belongs to the course element itself, not to one of its descendants)
/coursecatalog/course/name/descendant::node()
Result: two text nodes
54. Ancestor
Finds
Parents all the way up the tree
Numbering
[Figure: node tree with nodes numbered 1 to 14 in depth-first order]
Example
[Figure: the ancestors of the context node (cur), numbered 1 to 4]
56. Following
Finds
All nodes that follow, excluding descendants
Numbering
[Figure: node tree with nodes numbered 1 to 14 in depth-first order]
Example
[Figure: the nodes following the context node (cur), numbered 1 to 5]
57. Following Examples
Example (Document Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Queries
/coursecatalog/course[@id="4"]/following::node()
Result: 4 element nodes + 3 text nodes
/coursecatalog/course[@id="2"]/following::node()
Result: empty
/coursecatalog/course[@id="4"]/name/text()/following::node()
Result: 6 element nodes and 5 text nodes
58. Preceding
Finds
All preceding nodes excluding ancestors
Numbering
[Figure: node tree with nodes numbered 1 to 14 in depth-first order]
Example
[Figure: the nodes preceding the context node (cur), numbered 1 to 3]
59. Preceding Examples
Example (Document Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Queries
/coursecatalog/course[@id="4"]/semester/text()/preceding::node()
Result: 1 element node + 1 text node; the root element is an ancestor
/coursecatalog/course/preceding::node()
Result: the OOP course subtree, i.e., 4 element nodes + 3 text nodes
/coursecatalog/course[name="OOP"]/preceding::node()
Result: empty
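The following and preceding axes can be emulated by slicing the document order. This is a sketch with hypothetical helper names, run on an assumed serialization of the course catalog. It covers element nodes only, because xml.etree.ElementTree does not model text nodes as separate nodes, so the counts correspond to the "element nodes" part of the results above.

```python
import xml.etree.ElementTree as ET

# Assumed serialization of the course catalog used in the examples.
CATALOG = """<coursecatalog>
  <course id="4"><name>OOP</name><semester>3</semester><desc>snip</desc></course>
  <course id="2"><name>DB</name><semester>7</semester><desc>snip</desc></course>
</coursecatalog>"""

root = ET.fromstring(CATALOG)
PARENT = {child: parent for parent in root.iter() for child in parent}
DOC_ORDER = list(root.iter())  # element nodes in document order


def ancestors(node):
    result = []
    while node in PARENT:
        node = PARENT[node]
        result.append(node)
    return result


def following(node):
    """Element nodes after `node` in document order, excluding its descendants."""
    subtree = set(node.iter())
    index = DOC_ORDER.index(node)
    return [n for n in DOC_ORDER[index + 1:] if n not in subtree]


def preceding(node):
    """Element nodes before `node` in document order, excluding its ancestors."""
    excluded = set(ancestors(node))
    index = DOC_ORDER.index(node)
    return [n for n in DOC_ORDER[:index] if n not in excluded]


def tags(nodes):
    return [n.tag for n in nodes]


oop, db = root.findall("./course")
print(tags(following(oop)))                   # ['course', 'name', 'semester', 'desc']
print(tags(following(db)))                    # []
print(tags(preceding(db)))                    # ['course', 'name', 'semester', 'desc']
print(tags(preceding(oop.find("semester"))))  # ['name']
```

Note that preceding is a backward axis in XPath. The helper returns its nodes in document order, which is also how a node sequence is normally presented.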
60. Following Sibling
Finds
All sibling nodes that follow
Numbering
[Figure: node tree with nodes numbered 1 to 14 in depth-first order]
Example
cur 1 2 3
61. Following Sibling Examples
Example (Document Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Queries
/coursecatalog/course/following-sibling::node()
Result: 1 element node (the DB course)
/coursecatalog/course[@id="2"]/following-sibling::node()
Result: empty
/coursecatalog/course/semester/following-sibling::node()
Result: 2 element nodes (descriptions)
/coursecatalog/course/@id/following-sibling::node()
Result: empty
62. Preceding Sibling
Finds
All sibling nodes that come before
Numbering
[Figure: node tree with nodes numbered 1 to 14 in depth-first order]
Example
2 1 cur
63. Preceding Sibling Examples
Example (Document Tree)
/coursecatalog
course
id=4 name:OOP sem:3 dsc
course
id=2 name:DB sem:7 dsc
Queries
/coursecatalog/course/preceding-sibling::node()
Result: 1 element node (the OOP course)
/coursecatalog/course[@id="2"]/preceding-sibling::node()
Result: 1 element node (the OOP course)
/coursecatalog/course/semester/preceding-sibling::node()
Result: 2 element nodes (names)
/coursecatalog/course/desc/preceding-sibling::node()
Result: 4 element nodes (0 attribute nodes)
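The two sibling axes can be emulated by looking at the child list of the parent. This is a sketch with hypothetical helper names, run on an assumed serialization of the course catalog, and it covers element nodes only. Attributes are not children of their element, so they never show up as siblings.

```python
import xml.etree.ElementTree as ET

# Assumed serialization of the course catalog used in the examples.
CATALOG = """<coursecatalog>
  <course id="4"><name>OOP</name><semester>3</semester><desc>snip</desc></course>
  <course id="2"><name>DB</name><semester>7</semester><desc>snip</desc></course>
</coursecatalog>"""

root = ET.fromstring(CATALOG)
PARENT = {child: parent for parent in root.iter() for child in parent}


def following_sibling(node):
    parent = PARENT.get(node)
    if parent is None:            # the root element has no siblings
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]


def preceding_sibling(node):
    parent = PARENT.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[:siblings.index(node)]   # returned in document order


def tags(nodes):
    return [n.tag for n in nodes]


oop, db = root.findall("./course")
print(tags(following_sibling(oop)))                   # ['course']
print(tags(following_sibling(db)))                    # []
print(tags(following_sibling(oop.find("semester"))))  # ['desc']
print(tags(preceding_sibling(oop.find("desc"))))      # ['name', 'semester']
```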
65. Summary: XPath
Main Points
XPath is widely used
Not an XML syntax!
XPath is used for many purposes in related XML technologies
XQuery
XSLT
SQL/XML
W3C Recommendation November 1999 www.w3.org/TR/xpath
Note
Very good idea to get familiar with XPath
XPath is the foundation for understanding other XML technologies
66. Additional Information
Web Sites
www.w3schools.com/XPath/xpath_intro.asp: W3Schools is always a
good place to start
www.stylusstudio.com/w3c/xpath/: A very good and quite
elaborate tutorial
www.devarticles.com/c/a/XML/Introduction-to-XPath/: Good
four-page tutorial
pierre.senellart.com/wdmd/chap-xpath.pdf: A description of
the XPath data model
Tools
pgfearo.googlepages.com/: A very good tool for playing around
with XPath
There is an introduction screencast
http://www.bit-101.com/xpath/: A good online tool