MongoDB - Remove HTML Tags

JayvenJavier
June 15, 2022

I am creating a query to extract the description of customers in MongoDB. Unfortunately, the description is stored as HTML. Is there a way to either remove all HTML tags or replace each of them with " "?

Below is a sample document:

{
        "_id" : ObjectId("61f72aefdc85500a8baa6bb8"),
        "CustomerPin" : "22010871",
        "CustomerName" : "TestLastName, TestFirstName",
        "Age" : 39.0,
        "Gender" : "Male",
        "Description" : "<p><span>This will be a test description</span><br/></p>"
}

The output should have the "p", "span", and "br" tags removed. Is there a function in MongoDB that removes them all at once, without repeating $project for each tag?

This is the expected output:

{
        "_id" : ObjectId("61f72aefdc85500a8baa6bb8"),
        "CustomerPin" : "22010871",
        "CustomerName" : "TestLastName, TestFirstName",
        "Age" : 39.0,
        "Gender" : "Male",
        "Description" : "This will be a test description"
}

Thanks!

Tags: mongodb mongodb-query

Answers

- UsamaMasood
- June 15, 2022 at 9:49 am
- 0 votes
One way to do it is to strip all tags with a regex in a pre-save hook, before the document is stored:
```
// replace() returns a new string, so the result must be assigned back.
Description = Description.replace(/<[^>]+>/g, "");
```
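As a minimal sketch of that idea in plain JavaScript: the helper below replaces each tag with a space, which the question allows, and then collapses the leftover whitespace. The name `stripHtmlTags` is hypothetical and not part of MongoDB or any driver. It assumes simple, well-formed markup; a regex is not an HTML parser, so comments, scripts, or a literal `>` inside an attribute value would need a real parser.

```javascript
// Hypothetical helper: strips HTML tags from a string using a regex.
function stripHtmlTags(html) {
  if (typeof html !== "string") return html; // leave non-string values untouched
  return html
    .replace(/<[^>]+>/g, " ") // replace every tag with a single space
    .replace(/\s+/g, " ")     // collapse runs of whitespace into one space
    .trim();                  // drop leading and trailing spaces
}

// Sample document from the question (only the relevant fields).
const doc = {
  CustomerPin: "22010871",
  Description: "<p><span>This will be a test description</span><br/></p>",
};

doc.Description = stripHtmlTags(doc.Description);
console.log(doc.Description); // This will be a test description
```

If the application happens to use Mongoose (an assumption; the answer does not name a library), a helper like this could be called from a `schema.pre("save", ...)` middleware so the field is cleaned before every write.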

- AlisettarHuseynli
- June 15, 2022 at 4:58 pm
- 0 votes
If you use MongoDB 4.2 or later, you can do this in an aggregation pipeline with $regexFind; you need a regex that extracts the text content from the HTML. The pipeline below, including such a regex, writes the extracted text to a new content field.
```
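// Requires MongoDB 4.2+ ($regexFind, plus the $set and $unset stages).
db.getCollection("name_of_your_collection").aggregate([
    {
        // Find the first run of text that is not inside a tag.
        // The "g" flag is dropped: it is not among the regex options MongoDB supports.
        $set: {
            contentRegex: {
                $regexFind: { input: "$Description", regex: /([^<>]+)(?!([^<]+)?>)/ }
            }
        }
    },
    {
        // Use the matched text, or fall back to the original value if nothing matched.
        $set: {
            content: { $ifNull: ["$contentRegex.match", "$Description"] }
        }
    },
    {
        // Drop the temporary field.
        $unset: [ "contentRegex" ]
    }
])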
```
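Note that $regexFind returns only the first match, so a description with several text fragments (for example `<p>A <b>B</b> C</p>`) would be cut short. One possible variation, sketched here on the assumption that MongoDB 4.2+ is available, uses `$regexFindAll` with `$reduce` to join every fragment in a single `$set` stage, which also avoids repeating $project. The helper name `buildStripTagsStage` is hypothetical, and the regex is a simplified equivalent of the one above passed as a string.

```javascript
// Hypothetical helper: builds one $set stage that overwrites `field`
// with its text content (all fragments outside tags, concatenated).
function buildStripTagsStage(field) {
  return {
    $set: {
      [field]: {
        $reduce: {
          // Every run of non-bracket characters that is not inside a tag.
          input: {
            $regexFindAll: { input: "$" + field, regex: "[^<>]+(?![^<]*>)" },
          },
          initialValue: "",
          // Append each matched fragment to the accumulated string.
          in: { $concat: ["$$value", "$$this.match"] },
        },
      },
    },
  };
}

const pipeline = [buildStripTagsStage("Description")];
console.log(JSON.stringify(pipeline, null, 2));
// In the shell, this would be run as:
// db.getCollection("name_of_your_collection").aggregate(pipeline)
```

Because the stage overwrites the field in the pipeline output only, the stored documents are unchanged. Documents with a missing or null field would come out with an empty string, so a `$match` in front may be worth adding.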