I’m trying to create a regular expression to capture all the possible HTML classes of a file.

Let’s take this content as an example:

<html>
  <body>
    <div class="bg-secondary-7 -mx-5 p-5 hover:bg-secondary-8"></div>
  </body>
  <script>
    const isTrue = 1 === 1;

    const z = `hello ${isTrue ? "is-true" : "is-false"}`;
  </script>
</html>

It should match:

  • bg-secondary-7
  • -mx-5
  • p-5
  • hover:bg-secondary-8
  • hello
  • is-true
  • is-false

I’ve tried this so far:

/(["'`])(?:(?=(\\?))\2.)*?\1/g

It matches:

  • bg-secondary-7 -mx-5 p-5 hover:bg-secondary-8
  • hello ${isTrue ? "is-true" : "is-false"}

Then I can separate the strings using string.split(' ').

The problem is that expressions nested inside template literals are not matched separately: "is-true" and "is-false" are missed.

What would be a good solution for this?

Edit:

The goal is to take an HTML or JS file and get all the possible HTML classes that can be used. For the HTML alone this is easy, because we could just use a DOM parser, but it's important that we parse with regex because the user can also add classes from JS using element.classList.add("class").

2 Answers


  1. Chosen as BEST ANSWER

    As other users mentioned, this is a hard task.

    I've found a simple regex that could be helpful.

    /[^<>"'`\s]*[^<>"'`\s:]/g;
    

    When used with the content from the original question, it matches every run of characters that contains no whitespace, quotes, or angle brackets. Although that includes a lot of false positives (like html or const), it gives us a rough version of what we're trying to achieve: getting the possible class names from a file.
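
As a rough sketch of how that regex could be applied (assuming the whitespace class is written `\s`, and using a hypothetical helper `extractCandidates` that is not part of the original answer), the matches can be collected into a Set so that duplicates collapse:

```javascript
// Candidate-token regex from the answer, with \s as the whitespace class.
const CANDIDATE = /[^<>"'`\s]*[^<>"'`\s:]/g;

// Hypothetical helper: returns the unique candidate tokens found in `source`.
function extractCandidates(source) {
  // String.prototype.match returns null when nothing matches.
  return new Set(source.match(CANDIDATE) ?? []);
}

// A shortened version of the content from the question.
const sample = [
  '<div class="bg-secondary-7 -mx-5 p-5 hover:bg-secondary-8"></div>',
  '<script>',
  '  const z = `hello ${isTrue ? "is-true" : "is-false"}`;',
  '</script>',
].join("\n");

const candidates = extractCandidates(sample);
console.log([...candidates]);
```

On this sample the set contains the seven expected names along with noise such as `div`, `class=`, and `const`, so some further filtering is still needed.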

    Thank you all for taking the time to answer and comment on this question. It's appreciated.


  2. In general, avoid using regex for HTML; it's not designed for that. If you can parse the DOM, do that instead.
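
For classes that are already in the DOM, a minimal sketch of the parsing approach could look like the following. The helper name `collectClasses` is hypothetical, and it only sees classes present at the moment it is called, not ones added later from script:

```javascript
// Hypothetical helper: gathers every class name currently used under `root`.
// `root` is assumed to be a Document or Element (anything that provides
// querySelectorAll and whose results expose an iterable classList).
function collectClasses(root) {
  const classes = new Set();
  for (const element of root.querySelectorAll("[class]")) {
    // classList is a DOMTokenList, which is iterable in modern browsers.
    for (const name of element.classList) {
      classes.add(name);
    }
  }
  return classes;
}

// In a browser: console.log([...collectClasses(document)]);
```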

    You can use a MutationObserver to detect when an element's class attribute changes.

    const observer = new MutationObserver((data, _observer) =>
    {
      for(let i = 0; i < data.length; i++)
      {
        const item = data[i];
        const element = item.target;
    
        console.log("Element", element.textContent, 'class changed from', item.oldValue, 'to', element.className);
      }
    
    });
    
    observer.observe(document.getElementById("test"), {
      childList: true,
      subtree: true,
      attributeFilter: ["class"],
      attributeOldValue: true
    });
    
    
    
    const classes = ["red", "blue", "green"];
    function addClass()
    {
      const element = test.children[~~(Math.random()*test.children.length)];
      const oldClassName = element.className;
      const newClassName = classes[~~(Math.random()*classes.length)];
      element.classList.add(newClassName);
      if (newClassName !== oldClassName)
        element.classList.remove(oldClassName);
    }

    /* CSS */
    .red
    {
      background-color: pink;
    }
    .green
    {
      background-color: lightgreen;
    }
    .blue
    {
      background-color: lightblue;
    }

    <!-- HTML -->
    <div id="test">
      <div class="red">one</div>
      <div class="green">two</div>
      <div class="blue">three</div>
    </div>
    <button onclick="addClass()" type="button">Add class</button>