Problem Description

I am trying to use XPath to locate the "Messi" node in the HTML below. To minimize coding effort, I am hoping for a solution that uses an array index instead of looping through an iterator.

My assumption is that the most standard and simplest API is XPathExpression.evaluate(). If there are better APIs, please kindly share.

By the way, I need to make changes to the DOM Node from the returned result. So, XPathResult.resultType will be set to ORDERED_NODE_ITERATOR_TYPE, and therefore XPathResult.snapshotItem() cannot be used.

HTML Example

<html>
<body>

<div>
    <div>NumberOne</div>
    <div>NumberTwo_Mbappe</div>
    <div>NumberOne</div>
    <div>NumberTwo_Ronaldo</div>
    <div>NumberTwo_Messi</div>
</div>

</body>
</html>

Code to get the XPath Results

Running the code below against the HTML above returns an iterator.

let xpathIterator = new XPathEvaluator()
                        .createExpression("//*[starts-with(text(), 'NumberTwo')]")
                        .evaluate(
                            document, 
                            XPathResult.ORDERED_NODE_ITERATOR_TYPE
                        );

Existing iterator solution for extracting the n-th item

The existing XPathResult interface only has an iterateNext() method, so it will take six lines of code to extract the n-th item:

let n = 3; // 1-based position of the item to extract
while (n > 1) { 
    xpathIterator.iterateNext(); // skip the items before the n-th one
    n--; 
}
const nthItem = xpathIterator.iterateNext();
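
If the loop has to stay, it can be written once in a small helper so that later lookups use a plain array index. The sketch below is an assumption rather than a standard API: `iteratorToArray` is a hypothetical name, and it relies only on the `iterateNext()` method mentioned above. As far as I know, an iterator result becomes invalid if the document is modified while iterating, so the sketch drains the iterator first and leaves any DOM changes for afterwards.

```javascript
// Hypothetical helper (not a built-in): drains any object that exposes
// iterateNext(), such as an iterator-type XPathResult, into a plain array.
function iteratorToArray(result) {
  const nodes = [];
  for (let node = result.iterateNext(); node !== null; node = result.iterateNext()) {
    nodes.push(node);
  }
  return nodes;
}

// Browser-only usage; skipped when no DOM is available.
if (typeof document !== "undefined") {
  const result = document.evaluate(
    "//*[starts-with(text(), 'NumberTwo')]",
    document,
    null,
    XPathResult.ORDERED_NODE_ITERATOR_TYPE,
    null
  );
  const matches = iteratorToArray(result);
  console.log(matches[2]); // third match, or undefined if there are fewer
}
```

The helper is longer than the 2 to 3 lines hoped for, but it only has to be written once.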

Ideal array solution for extracting the n-th item

Since XPath and Chrome are used by millions of people every day, ideally, there should be a way to obtain the n-th item directly using an array index (as the following code shows). I would be surprised if such an API doesn’t already exist.

let v = xpathIterator[2];

The ideal solution doesn’t necessarily need to use XPathExpression.evaluate(). I am open to any solutions that use standard JavaScript functions supported by Chrome.
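
One assumption in the problem description may be worth re-checking. To my understanding, a snapshot result holds references to the real DOM nodes, not copies, so a node returned by `snapshotItem()` can still be modified; the snapshot only stops tracking later changes to the document. Under that assumption, index-based access could look like the sketch below. `nthMatch` is a hypothetical helper name, and the constant spells out the numeric value of `XPathResult.ORDERED_NODE_SNAPSHOT_TYPE`.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, spelled out so the
// function can also be exercised where the XPathResult global is missing.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical helper: returns the match at zero-based `index`, or null
// when there is no match at that index.
function nthMatch(doc, expression, index) {
  return doc
    .evaluate(expression, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null)
    .snapshotItem(index);
}

// Browser-only usage; skipped when no DOM is available.
if (typeof document !== "undefined") {
  const messi = nthMatch(document, "//*[starts-with(text(), 'NumberTwo')]", 2);
  if (messi) {
    messi.textContent = "NumberTwo_Messi (edited)"; // modifies the live node
  }
}
```

The index here is zero-based, so `2` selects the third match.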

(Hopefully, we don’t need to use a function. If a function must be used, it would be good to have no more than 2 to 3 lines of ESLint-linted code.)

Thanks!

Related Posts

Since an XPathResult is not an iterable, the following posts don’t apply:

2 Answers


  1. Inject this into the console:

    document.querySelector(".wikitable >  tbody").children[6];
    
  2. How would you use CssSelector to get the three "NumberTwo" nodes? After getting the three nodes, how would you access the 3rd node (the "Messi" node) directly? By the way, the five text nodes aren’t necessarily located in <ul><li>; they are equally likely to be wrapped by <ol><li> or <table><tr>.

    Given the HTML you’re showing in your edits, like this:

    const allNodes = Array.from(document.querySelectorAll(`ul li, ol li, table tr`))
    const allNumberTwoNodes = allNodes.filter(e =>
                                  e.textContent.includes(`NumberTwo`)
                              );
    console.log(allNumberTwoNodes);

    <html>
      <body>
        <ul>
          <li>NumberOne</li>
          <li>NumberTwo_Mbappe</li>
          <li>NumberOne</li>
          <li>NumberTwo_Ronaldo</li>
          <li>NumberTwo_Messi</li>
        </ul>
    
        <ol>
          <li>NumberOne</li>
          <li>NumberTwo_Mbappe</li>
          <li>NumberOne</li>
          <li>NumberTwo_Ronaldo</li>
          <li>NumberTwo_Messi</li>
        </ol>
        
        <table>
          <tr><td>NumberOne</td></tr>
          <tr><td>NumberTwo_Mbappe</td></tr>
          <tr><td>NumberOne</td></tr>
          <tr><td>NumberTwo_Ronaldo</td></tr>
          <tr><td>NumberTwo_Messi</td></tr>
        </table>
      </body>
    </html>

    Here, we’re relying on textContent, which gives us (unsurprisingly) the text content of a node, ignoring tags. That is why, even though those table rows contain table data cells, the <tr>’s textContent gives us a string as if the <td> markup weren’t there.

    Also, the "NumberTwo" nodes aren’t necessarily the 2nd, 4th, and 5th nodes; they are equally likely to be at the 1-2-5 or 1-4-5 or 3-4-5 positions.

    Query selectors, just like XPath, don’t care what order the HTML is in: they find "the things that match", not "the thing at the xth position" (unless you bake child position into the selector, just like XPath).
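
On the XPath side, baking the position into the expression can be sketched as follows. This is an illustration under stated assumptions, not part of either answer: `nthOfAll` is a hypothetical helper name. The parentheses matter, because `(expr)[3]` selects the third node of the whole result set in document order, whereas an unparenthesized `[3]` is evaluated per parent.

```javascript
// Hypothetical helper: wraps an XPath expression so that it selects only the
// n-th node (1-based, as XPath positions are) of the whole result set.
function nthOfAll(expression, n) {
  return `(${expression})[${n}]`;
}

const thirdNumberTwo = nthOfAll("//*[starts-with(text(), 'NumberTwo')]", 3);
console.log(thirdNumberTwo); // (//*[starts-with(text(), 'NumberTwo')])[3]

// Browser-only usage; skipped when no DOM is available.
if (typeof document !== "undefined") {
  const node = document.evaluate(
    thirdNumberTwo,
    document,
    null,
    XPathResult.FIRST_ORDERED_NODE_TYPE,
    null
  ).singleNodeValue; // the matching node, or null if there is none
  console.log(node);
}
```

With a single-node result type there is no iterator or snapshot to index at all; the position lives in the expression itself.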
