Xpath
Xpath is a specialized sub-language which is used in XSL
to identify the XML data structures to be manipulated. The main pattern
in Xpath is a sequence of XML element names separated by "/"s, for
example,
//POlist/PO/Attachments/attachment/link
identifies the "link" element which is a child of an "attachment" element,
which in turn is a child of an "Attachments" element, which in turn is a
child of a "PO" element, which is a child of "POlist", which is contained
somewhere in the overall XML tree. In general, you read the path backwards,
with the element name after the last "/" being the element that you actually
are looking for. That is, in the above expression, we are not locating a "PO"
or an "attachment" element, but only the "link" element that belongs to
them. And in general, we are not necessarily referring to one "link" element
but to all "link" elements that are so nested.
See
purchaseOrder.xml for an XML file similar to the one used for this and
other examples in this chapter.
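One way to try the path outside a stylesheet is sketched below in Python, using the standard xml.etree.ElementTree module. This is only an illustration under stated assumptions: the XML is a made-up miniature, not the actual purchaseOrder.xml, and ElementTree implements only a subset of Xpath, so the leading "//POlist" is replaced by a search that starts at the "POlist" root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of a purchase-order file; not the real purchaseOrder.xml.
XML = """
<POlist>
  <PO id="1">
    <Attachments>
      <attachment ftype="pdf"><link>a.pdf</link></attachment>
      <attachment ftype="tif"><link>b.tif</link></attachment>
    </Attachments>
  </PO>
  <PO id="2">
    <Attachments>
      <attachment ftype="doc"><link>c.doc</link></attachment>
    </Attachments>
    <link>not-an-attachment</link>
  </PO>
</POlist>
"""

root = ET.fromstring(XML)  # root is the <POlist> element

# ElementTree paths are relative to the element searched from, so the
# chapter's //POlist/PO/... becomes PO/... when starting at <POlist>.
links = root.findall("PO/Attachments/attachment/link")
link_texts = [e.text for e in links]
print(link_texts)
```

The path returns every "link" that is nested in exactly this way, across all "PO" elements. The "link" that sits directly under the second "PO" is not returned, because it has no "Attachments/attachment" parents.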
Other Elements in Xpath
There are several elaborations of the path which are frequently used:
wildcards, conditionals, and functions.
- Wildcards are characters that identify
positions in the XML tree or groups of elements.
| / | This selects elements that are direct children of the current node |
| // | This selects all matching nodes at any depth below the current node, no matter how far down in the tree |
| . | This indicates the current node |
| * | This selects all child elements of the current node |
| @ | This refers to attributes of the current element |
For example, the path
//POlist/PO[@* = 'x']/*//links
selects any "links" element that is at any depth below any child
of a "PO" element, where that "PO" is a child of "POlist" and has
at least one attribute whose value is 'x'
- Conditionals: these are usually contained inside square brackets
"[" and "]", and use the <, >, and = characters to restrict paths to
those that meet specific conditions, along with the boolean operators
"and" and "or" and the function not(). Because XSL is itself XML, a "<"
inside an attribute value has to be written as &lt;. For example,
//PO/Attachments/attachment[@ftype = 'tif' or @ftype = 'pdf']
selects any "attachment" element that has an attribute named
"ftype" that has a value of 'tif' or 'pdf', and which is also
a child of "Attachments" which in turn is a child of "PO".
Or
<xsl:if test = "not(name(.) = 'status')" >
tests to see if the name of the current node is NOT the word 'status'
- Functions: these are string manipulation, numeric,
and boolean functions that help Xpath specify matches. Some of the
more common ones include:
o concat('a', 'b', 'c'):
returns the string 'abc'
o string-length('abc'):
returns the number 3
o starts-with('Abc', 'a'):
returns 'false' because 'A' does not = 'a'
o contains('abc', 'bc'):
returns 'true'
o node[last()]
points to the last "node" element among its siblings
o node[3]
points to the third "node" element
(positions start at 1, so node[1] is the first)
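The three elaborations can be tried out with a short Python sketch using the standard xml.etree.ElementTree module. The sample XML is invented for illustration, and ElementTree supports only part of Xpath: it has no "@*", no "or", and none of the string functions. For that reason the attribute is named explicitly and the "or" condition is applied in Python rather than inside the path.

```python
import xml.etree.ElementTree as ET

# Made-up sample data for illustration only.
XML = """
<POlist>
  <PO status="x">
    <Attachments>
      <attachment ftype="tif"><links>scan.tif</links></attachment>
      <attachment ftype="doc"><links>memo.doc</links></attachment>
      <attachment ftype="pdf"><links>quote.pdf</links></attachment>
    </Attachments>
    <links>direct</links>
  </PO>
  <PO status="y">
    <Attachments>
      <attachment ftype="pdf"><links>other.pdf</links></attachment>
    </Attachments>
  </PO>
</POlist>
"""
root = ET.fromstring(XML)  # context node: <POlist>

# Wildcards. "*" is every child element; "//" reaches any depth.
child_tags = [e.tag for e in root.findall("*")]
# The attribute is named here because ElementTree has no "@*".
nested = [e.text for e in root.findall("PO[@status='x']/*//links")]

# Conditional. ElementTree has no "or", so the test
# [@ftype = 'tif' or @ftype = 'pdf'] is applied in Python instead.
attachments = root.findall("PO/Attachments/attachment")
tif_or_pdf = [a.findtext("links") for a in attachments
              if a.get("ftype") in ("tif", "pdf")]

# Positions. Counting starts at 1, and last() picks the final sibling.
first_po = root.find("PO[1]")
first = first_po.find("Attachments/attachment[1]").get("ftype")
second = first_po.find("Attachments/attachment[2]").get("ftype")
last = first_po.find("Attachments/attachment[last()]").get("ftype")

print(child_tags, nested, tif_or_pdf, first, second, last)
```

Note that the "links" element sitting directly under the first "PO" is not in the wildcard result: the path asks for "links" below a child of "PO", and that element is itself a child of "PO".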
Context Node
Perhaps the most important concept in Xpath is that of the
context node. Many XSL expressions identify
a node somewhere in the tree, and that node is then represented by a
period ".". Within that expression, all Xpath references are
relative to that context node. For example, if the current context is
"//POlist/PO" then the "Attachments" child and its children can be
referred to as
./Attachments/attachment
or simply
Attachments/attachment
as the ./ symbol is usually implied.
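The idea of a context node can be illustrated with a small Python sketch based on the standard xml.etree.ElementTree module. The XML below is invented sample data. Each "PO" element found by the outer search plays the role of the context node, and the explicit and implicit forms of the relative path are compared.

```python
import xml.etree.ElementTree as ET

# Made-up sample data for illustration only.
XML = """
<POlist>
  <PO id="1">
    <Attachments>
      <attachment ftype="pdf"/>
      <attachment ftype="tif"/>
    </Attachments>
  </PO>
  <PO id="2">
    <Attachments>
      <attachment ftype="doc"/>
    </Attachments>
  </PO>
</POlist>
"""
root = ET.fromstring(XML)

ftypes_by_po = {}
for po in root.findall("PO"):  # each <PO> is the context node in turn
    implicit = po.findall("Attachments/attachment")
    explicit = po.findall("./Attachments/attachment")
    assert implicit == explicit  # "./" adds nothing to a relative path
    ftypes_by_po[po.get("id")] = [a.get("ftype") for a in implicit]

print(ftypes_by_po)
```

The same relative path yields different results for each "PO", because it is always evaluated from whichever node is the current context.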
Examples
The following annotated Xpath examples are embedded in XSL statements
(explained in the next chapter)
to show the context in which they are used:
[1]:
<xsl:apply-templates select=
"//PO/Id[n_date > 20010203 and n_date &lt; 20010609]" />
Select those "Id" elements that are children of a "PO" element and
that contain an "n_date" child whose value lies between the two
specified dates
[2]:
<xsl:if test=
"contains(ancestor::PO//Name,'ar') or contains(status,'Vendor')">
This test is "true" if the context node is a descendant of a "PO"
node (that is, "PO" is its ancestor) that, down one or more
levels, contains a sub-node named "Name" which contains the
string "ar" somewhere in its value, OR if the context
node has a child node named "status" which contains
the string 'Vendor'
[3]:
<xsl:value-of select = "@ftype" />
This outputs the value of the "ftype" attribute of the current
node; if the node has no such attribute, nothing is output
[4]:
<xsl:when test=
"contains(ancestor::PO/Items/Item[2]/Description,'mo')">
This is "true" only if the context node has an
ancestor named "PO" that has a child named "Items" whose
second "Item" child has a child named "Description" whose
value contains the string "mo"
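Example [4] can be approximated in Python with the standard xml.etree.ElementTree module. This is a sketch on invented data: ElementTree has neither the ancestor:: axis nor contains(), so the search starts directly at a "PO" element, and the substring test is done with Python's "in" operator.

```python
import xml.etree.ElementTree as ET

# Made-up sample data for illustration only.
XML = """
<PO>
  <Items>
    <Item><Description>laser printer</Description></Item>
    <Item><Description>cable modem</Description></Item>
    <Item><Description>keyboard</Description></Item>
  </Items>
</PO>
"""
po = ET.fromstring(XML)  # stands in for the ancestor::PO step

# Item[2] is the second "Item" child, since positions count from 1.
second = po.findtext("Items/Item[2]/Description")
test_result = second is not None and "mo" in second

print(second, test_result)
```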
See if you can figure out the next five by yourself. If you get stuck,
the answers are in the popup link that follows the example.
[5]:
<xsl:when test =
"contains(ancestor::PO/Items/Item[last()]/Description, 'mo')" >
Answer-5
[6]:
<xsl:when test=
"ancestor::PO/Attachments/attachment[@ftype='tif' or @ftype='pdf']">
Answer-6
[7]:
<xsl:if test = " @ftype = 'pdf' or @ftype = 'tif' " >
Answer-7
[8]:
<xsl:value-of select="ancestor::PO/Contact/Name" />
Answer-8
[9]:
<xsl:value-of select =
"count(//POlist/PO[Id[substring-after(date,'//') > 20010403 and
substring-after(date,'//') &lt; 20010409]])" />
Answer-9