
I’m trying to select a <p> element by the text that immediately precedes it. Example:

<div>
    Header:
    <p>ITEM</p>
    ID:
    <p>123</p>
    Title:
    <p>Test</p>
</div>

Here I want to capture "123". I’ve tried a couple of combinations of preceding-sibling but haven’t been able to get either to work.

.//p[preceding-sibling::node()[1][self::text()][.='ID:']]

.//p[preceding-sibling::text()='ID:']

I don’t have control over the HTML and they don’t want to change it. I will always know the text before the paragraph I want to capture. Is this possible?

Edit: added more to the example. The element to grab won’t always be the first/last item to find.

2 Answers


  1. This XPath gives the 123 text after the node containing ID:

    (//div[contains(text(), "ID:")]/p)[1]
    
  2. This XPath,

    //p[preceding-sibling::node()[1][normalize-space()='ID:']]
    

    will select all p elements whose immediately preceding sibling has a space-normalized string value of ID:.

    Notes:

    • Your first try was close but failed to account for the whitespace surrounding ID:.
    • Your second try additionally failed to account for the immediacy constraint.
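
To make the two notes concrete, here is a minimal Python sketch of the same selection rule: a p matches only if the text directly before it, after whitespace normalization, equals the label. It uses only the standard library. Because the XPath subset in xml.etree.ElementTree does not support the preceding-sibling axis, the sketch emulates the rule through the text and tail attributes rather than evaluating the XPath itself. The helper names normalize_space and p_after_label are hypothetical, and the sketch assumes the markup parses as well-formed XML, which real-world HTML often does not.

```python
import xml.etree.ElementTree as ET

HTML = """<div>
    Header:
    <p>ITEM</p>
    ID:
    <p>123</p>
    Title:
    <p>Test</p>
</div>"""


def normalize_space(text):
    # Approximates XPath normalize-space(): trim the ends and
    # collapse internal whitespace runs to a single space.
    return " ".join((text or "").split())


def p_after_label(parent, label):
    """Return <p> children of parent whose immediately preceding text is label."""
    matches = []
    preceding_text = parent.text  # text before the first child element
    for child in parent:
        if child.tag == "p" and normalize_space(preceding_text) == label:
            matches.append(child)
        preceding_text = child.tail  # text between this child and the next one
    return matches


div = ET.fromstring(HTML)

# The raw text before <p>123</p> carries the surrounding newlines and indentation,
# which is why an exact comparison with 'ID:' does not match.
print(repr(div[0].tail))

# Comparing the normalized text, and only the text directly before each <p>,
# picks out the intended element.
print([p.text for p in p_after_label(div, "ID:")])
```

With a full XPath 1.0 engine none of this emulation is needed; the expression from the second answer can be evaluated directly.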