XPath

From NovaOrdis Knowledge Base

External

Internal

Overview

XPath is an expression language for addressing parts of an XML document. It is part of the XSL family of specifications. An XPath expression can be thought of as the address of a part of an XML document.

TODO: https://docs.oracle.com/javase/tutorial/jaxp/xslt/xpath.html

Command Line Tooling

Java Support

https://www.baeldung.com/java-xpath

Syntax

https://www.w3schools.com/xml/xpath_syntax.asp

Example

<books>
    <book id="1" category="linux">
        <title lang="en">Linux Device Drivers</title>
        <year>2003</year>
        <author>Jonathan Corbet</author>
        <author>Alessandro Rubini</author>
    </book>
    <book id="2" category="linux">
        <title lang="en">Understanding the Linux Kernel</title>
        <year>2005</year>
        <author>Daniel P. Bovet</author>
        <author>Marco Cesati</author>
    </book>
    <book id="3" category="novel">
        <title lang="en">A Game of Thrones</title>
        <year>2013</year>
        <author>George R. R. Martin</author>
    </book>
    <book id="4" category="novel">
        <title lang="fr">The Little Prince</title>
        <year>1990</year>
        <author>Antoine de Saint-Exupéry</author>
    </book>
</books>

Node Selection

XPath uses path expressions to select nodes in an XML document. A node is selected by following a path, which is a sequence of steps:

/

Selects from the root node. A path that starts with / is an absolute path:

/books/book/title
xml sel -t -v "/books/book/title" ./books.xml

returns:

Linux Device Drivers
Understanding the Linux Kernel
A Game of Thrones
The Little Prince
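
The same selection can be sketched in Python with the standard-library xml.etree.ElementTree module. This is only an illustration under stated assumptions: ElementTree implements a limited subset of XPath and evaluates paths relative to the element they are called on, so the absolute path /books/book/title is written here as book/title against the books root element. The BOOKS_XML constant is a trimmed copy of the example document and the variable names are made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example document (titles only).
BOOKS_XML = """
<books>
    <book id="1" category="linux">
        <title lang="en">Linux Device Drivers</title>
    </book>
    <book id="2" category="linux">
        <title lang="en">Understanding the Linux Kernel</title>
    </book>
    <book id="3" category="novel">
        <title lang="en">A Game of Thrones</title>
    </book>
    <book id="4" category="novel">
        <title lang="fr">The Little Prince</title>
    </book>
</books>
"""

# fromstring() returns the root element, which is <books>.
root = ET.fromstring(BOOKS_XML)

# "book/title" is relative to <books>, so it matches the same nodes
# as the absolute XPath expression /books/book/title.
titles = [title.text for title in root.findall("book/title")]

for text in titles:
    print(text)
```

The titles are returned in document order, matching the output of the xml sel command above.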

nodename

Selects the node (or nodes) with the given name. If multiple nodes with the same name match, they are all selected.

Queries

Full Path Queries

/level-1-element-name/level-2-element-name/...

Recursive Queries

Recursive query based on element name. The // prefix selects matching elements anywhere in the document, regardless of depth:

//element-name<selector>
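
A recursive query can be sketched with Python's standard-library xml.etree.ElementTree, assuming a trimmed copy of the example document. ElementTree supports only a subset of XPath and requires the recursive form to be relative, so //author is written as .//author here. The BOOKS_XML constant and variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example document (authors only).
BOOKS_XML = """
<books>
    <book id="1" category="linux">
        <author>Jonathan Corbet</author>
        <author>Alessandro Rubini</author>
    </book>
    <book id="2" category="linux">
        <author>Daniel P. Bovet</author>
        <author>Marco Cesati</author>
    </book>
    <book id="3" category="novel">
        <author>George R. R. Martin</author>
    </book>
</books>
"""

root = ET.fromstring(BOOKS_XML)

# ".//author" matches <author> elements at any depth below the root,
# without spelling out the intermediate <book> step.
authors = [author.text for author in root.findall(".//author")]

for name in authors:
    print(name)
```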

Element Selection

element-name[@attribute-name='value']
element-name[sub-element-expression]
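
Both selector forms can be tried out with Python's standard-library xml.etree.ElementTree, which supports attribute predicates and simple child-element predicates as part of its XPath subset. The sketch below assumes a trimmed copy of the example document; BOOKS_XML and the variable names are hypothetical, and full XPath engines accept richer sub-element expressions than the ones shown.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example document.
BOOKS_XML = """
<books>
    <book id="1" category="linux">
        <title lang="en">Linux Device Drivers</title>
        <year>2003</year>
    </book>
    <book id="2" category="linux">
        <title lang="en">Understanding the Linux Kernel</title>
        <year>2005</year>
    </book>
    <book id="3" category="novel">
        <title lang="en">A Game of Thrones</title>
        <year>2013</year>
    </book>
    <book id="4" category="novel">
        <title lang="fr">The Little Prince</title>
        <year>1990</year>
    </book>
</books>
"""

root = ET.fromstring(BOOKS_XML)

# element-name[@attribute-name='value']: books whose category is "novel".
novel_ids = [b.get("id") for b in root.findall("book[@category='novel']")]

# element-name[sub-element-expression]: books that have a <year> child
# whose text is exactly "2005".
ids_2005 = [b.get("id") for b in root.findall("book[year='2005']")]

# The predicate can sit on any step: titles whose lang attribute is "fr".
french_titles = [t.text for t in root.findall("book/title[@lang='fr']")]

print(novel_ids, ids_2005, french_titles)
```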