XPath Expressions Explained

XPath is a language for selecting nodes in an XML document. You can think of it as the CSS selectors of the XML world. It does some cool things that traditional CSS can’t do (CSS 3 can do some of it), such as selecting elements based on their text content or attributes, and moving from a node up to its parent as well as down to its children. There is a cool ZF library which will translate your CSS selectors into XPath, if you’re interested.

Here’s an example of an XPath expression. It’s relatively complex and shows off a lot of useful XPath features:

//Item[ItemNumber='4111']//ExternalIdentifier[@Source='Alpha' and @Type='Beta']

Now, let me break it down. The leading // means that the node can be located anywhere in the document (in CSS this is kinda just assumed; a space in a selector does the same thing). The Item part means that we are looking for an Item element. The [ItemNumber='4111'] is a condition on Item (the name before it): it keeps only the Item elements that have a child ItemNumber element whose text value is equal to 4111. The second // means that we are looking for a descendant at any depth below the selected Item, not just a direct child. The ExternalIdentifier means we are looking for an element with that name. The @Source='Alpha' means that the ExternalIdentifier element (the name before the brackets) must have an attribute named Source whose value is Alpha. The @Type='Beta' does the same thing for the Type attribute. The “and” means that both conditions must hold for the same element.
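To see how each piece narrows the result, here is a minimal sketch that builds the expression up one step at a time and counts the matches. It assumes PHP with the SimpleXML extension; the <Items> wrapper and the second Item (number 4222) are made up for illustration and are not part of the original data.

```php
<?php
// Hypothetical sample document: two Item nodes inside an assumed <Items> root.
$xml = <<<XML
<Items>
  <Item>
    <ItemNumber>4111</ItemNumber>
    <ExternalIdentifiers>
      <ExternalIdentifier Type="Beta" Source="Alpha">10</ExternalIdentifier>
      <ExternalIdentifier Type="Delta" Source="Alpha">30</ExternalIdentifier>
    </ExternalIdentifiers>
  </Item>
  <Item>
    <ItemNumber>4222</ItemNumber>
    <ExternalIdentifiers>
      <ExternalIdentifier Type="Beta" Source="Alpha">50</ExternalIdentifier>
    </ExternalIdentifiers>
  </Item>
</Items>
XML;

$doc = simplexml_load_string($xml);

// Each step adds one more piece of the full expression.
$steps = [
    "//Item",
    "//Item[ItemNumber='4111']",
    "//Item[ItemNumber='4111']//ExternalIdentifier",
    "//Item[ItemNumber='4111']//ExternalIdentifier[@Source='Alpha']",
    "//Item[ItemNumber='4111']//ExternalIdentifier[@Source='Alpha' and @Type='Beta']",
];

// Record how many nodes each partial expression selects.
$counts = [];
foreach ($steps as $expr) {
    $counts[$expr] = count($doc->xpath($expr));
    echo $counts[$expr], "  ", $expr, PHP_EOL;
}
```

Against this made-up document the counts go 2, 1, 2, 2, 1: the ItemNumber condition drops the second Item, and the “and” condition leaves only the identifier that is both Alpha and Beta.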

Here’s an example chunk of XML (imagine that there are several of these Item nodes):

<Item>
  <ItemNumber>4111</ItemNumber>
  <ExternalIdentifiers>
    <ExternalIdentifier Type="Beta" Source="Alpha">10</ExternalIdentifier>
    <ExternalIdentifier Type="Beta" Source="Gamma">20</ExternalIdentifier>
    <ExternalIdentifier Type="Delta" Source="Alpha">30</ExternalIdentifier>
    <ExternalIdentifier Type="Delta" Source="Gamma">40</ExternalIdentifier>
  </ExternalIdentifiers>
</Item>

By running the XPath expression above against the provided XML document, we get the following PHP array, which holds the single matching element:

array(1) {
  [0]=>
  object(SimpleXMLElement)#2 (2) {
    ["@attributes"]=>
    array(2) {
      ["Type"]=>
      string(4) "Beta"
      ["Source"]=>
      string(4) "Alpha"
    }
    [0]=>
    string(2) "10"
  }
}

If you cast the matched element (the first item in the array) to a string, you get the text value of the node (in this case 10).
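As a rough sketch of the whole round trip, the snippet below loads the sample XML, runs the expression, and reads the text value and attributes from the match. It assumes PHP's SimpleXML extension; the variable names are only illustrative.

```php
<?php
// The sample XML from above, with a single Item as the root element.
$xml = <<<XML
<Item>
  <ItemNumber>4111</ItemNumber>
  <ExternalIdentifiers>
    <ExternalIdentifier Type="Beta" Source="Alpha">10</ExternalIdentifier>
    <ExternalIdentifier Type="Beta" Source="Gamma">20</ExternalIdentifier>
    <ExternalIdentifier Type="Delta" Source="Alpha">30</ExternalIdentifier>
    <ExternalIdentifier Type="Delta" Source="Gamma">40</ExternalIdentifier>
  </ExternalIdentifiers>
</Item>
XML;

$item = simplexml_load_string($xml);

// xpath() returns an array of SimpleXMLElement objects, one per matching node.
$matches = $item->xpath(
    "//Item[ItemNumber='4111']//ExternalIdentifier[@Source='Alpha' and @Type='Beta']"
);

$value = null;
$type = null;
$source = null;

if (!empty($matches)) {
    $first  = $matches[0];
    $value  = (string) $first;           // text value of the node
    $type   = (string) $first['Type'];   // attributes are read with array syntax
    $source = (string) $first['Source'];
    echo "{$source}/{$type}: {$value}", PHP_EOL;
}

var_dump($matches); // should resemble the dump shown above
```

Checking that the array is not empty before indexing it matters in practice, since an expression that matches nothing gives back an empty array rather than an element.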
