I'm working with Vue and Mongoose, and I need to build the logic for a search engine over Catalan text that meets the following specifications:

The collection contains records with the following fields:

  • Question
  • Answer
  • Alternative answer
  • Comment

1.- If two words are searched, only records that contain all of the words (or all of those marked as mandatory) in one or more fields must be returned. That is, if only one of the words appears, or not all of them, that record is not returned.

2.- If, for example, you search for "Eufór" in quotes, the results must match exactly as written. Without quotes, the results can be any words that contain those letters, regardless of accents, capitalization, and umlauts.

Can someone tell me how I can do this?

Thank you very much

The results I expect based on the searches are the following:

1.- For example:

Search: Guardiola Messi

Results:

{Question: 'Lorem ipsum dolor sit amet **Guardiola** consectetur adipiscing elit **Messi**', Answer: '...', Alternative_answer: '...', Comment: '...'}

{Question: 'Lorem ipsum dolor sit amet **Guardiola** consectetur', Answer: ' sed do eiusmod tempor incididun **Messi** dolore magna aliqua', Alternative_answer: '...', Comment: '...'}


{Question: '...', Answer: ' sed do eiusmod tempor incididun **Guardiola** dolore magna aliqua', Alternative_answer: 'Sed ut perspiciatis unde omnis **Messi** iste natus error sit voluptatem', Comment: '...'}

I have tried to do it using the "$search" operator, but the texts are in Catalan and the results are not what I expected.

2.- For example:

Search: "Eufór"

Result:


{Question: 'Lorem ipsum dolor sit amet **Eufór** consectetur adipiscing elit', Answer: '...', Alternative_answer: '...', Comment: '...'}

{Question: '...', Answer: '...', Alternative_answer: 'Lorem ipsum dolor sit amet **Eufór** consectetur adipiscing elit', Comment: '...'}

Search: Eufór

Result:


{Question: 'Lorem ipsum dolor sit amet **euforia** consectetur adipiscing elit', Answer: '...', Alternative_answer: '...', Comment: '...'}

{Question: '...', Answer: '...', Alternative_answer: 'Lorem ipsum dolor sit amet **Eufór** consectetur adipiscing elit', Comment: '...'}

{Question: '...', Answer: '...', Alternative_answer: '...', Comment: 'Lorem ipsum dolor sit amet **EUFORÍS** consectetur adipiscing elit'}

Same if multiple words are written:

Search: Eufór mal

Result:


{Question: 'Lorem ipsum dolor sit amet **euforia** consectetur adipiscing elit', Answer: '...', Alternative_answer: 'odit aut fugit, sed quia consequuntur magn **Maldad**', Comment: '...'}


{Question: 'Lorem ipsum dolor sit amet **maltrato** consectetur adipiscing elit', Answer: '...', Alternative_answer: 'odit aut fugit, sed quia consequuntur magn **euforatis**', Comment: '...'}

The code I have tried is the following:

1.-

    search = {
        $match: {
            $text: {
                $search: filterBy,
                $caseSensitive: caseSensitive == "true",
                $diacriticSensitive: diacriticSensitive == "true",
            },
        },
    };

First I created the index for each field but it does not work correctly.

2.- I have tried to separate the cases but I don’t get a good result either.

let andQuery = [];
mandatoryWords.forEach((md) => {
    andQuery.push({
        $or: [
            {
                Question: {
                    $regex: md,
                    $options: "i",
                },
            },
            {
                Answer: {
                    $regex: md,
                    $options: "i",
                },
            },
            {
                Alternative_answer: {
                    $regex: md,
                    $options: "i",
                },
            },
            {
                Comment: {
                    $regex: md,
                    $options: "i",
                },
            },
        ],
    });
});

let orQuery = [];
filterWords.forEach((fw) => {
    orQuery.push(
        {
            Question: {
                $regex: fw,
                $options: "i",
            },
        },
        {
            Answer: {
                $regex: fw,
                $options: "i",
            },
        },
        {
            Alternative_answer: {
                $regex: fw,
                $options: "i",
            },
        },
        {
            Comment: {
                $regex: fw,
                $options: "i",
            },
        },
    );
});

// Check if there are mandatory and optional conditions
if (andQuery.length > 0 && orQuery.length > 0) {
    search = {
        $match: {
            $and: [
                ...andQuery, // Incorporates all mandatory conditions
                { $or: orQuery }, // Adds the condition that at least one optional word is present
            ],
        },
    };
} else if (andQuery.length > 0) {
    // Mandatory conditions only
    search = {
        $match: {
            $and: andQuery,
        },
    };
} else if (orQuery.length > 0) {
    // Optional conditions only
    search = {
        $match: {
            $or: orQuery,
        },
    };
}
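
The repeated per-field blocks in the attempt above can be generated instead of written out by hand. The following is a minimal sketch of that idea, not the asker's actual code: `SEARCH_FIELDS`, `anyField`, and `buildMatch` are hypothetical names, and the words are still passed to `$regex` unescaped, exactly as in the attempt.

```javascript
// Hypothetical helper names; the field names are taken from the question.
const SEARCH_FIELDS = ["Question", "Answer", "Alternative_answer", "Comment"];

// One { field: { $regex, $options } } clause per searchable field.
function anyField(pattern, options = "i") {
    return SEARCH_FIELDS.map((field) => ({
        [field]: { $regex: pattern, $options: options },
    }));
}

function buildMatch(mandatoryWords, filterWords) {
    // Every mandatory word must appear in at least one field.
    const andQuery = mandatoryWords.map((md) => ({ $or: anyField(md) }));
    // At least one optional word must appear in at least one field.
    const orQuery = filterWords.flatMap((fw) => anyField(fw));

    if (orQuery.length > 0) andQuery.push({ $or: orQuery });
    if (andQuery.length === 0) return null; // nothing to filter on
    return { $match: { $and: andQuery } };
}

console.log(JSON.stringify(buildMatch(["Guardiola"], ["Messi"]), null, 2));
```

When only optional words are given, this returns `$and` with a single `$or` inside, which selects the same documents as the bare `$or` in the original branch.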

UPDATE

The only problem left to solve is that the word or phrase must match whether or not it is surrounded by spaces.
If I search for Messi, all of these should match:
"Messi is the….".
"…player, Messi".
"…the player Messi is the…".

I need a regex for these cases.

2 Answers


    1. For the first requirement, you need to ensure that all the words in the search query are present in at least one of the fields. This can be achieved by creating an $and query with $regex for each word in the search query.
    2. For the second requirement, you need to differentiate between exact matches and partial matches. This can be achieved by checking if the search query is enclosed in quotes. If it is, you can remove the quotes and search for an exact match. If it’s not, you can search for a partial match.
    let exactMatch = false;
    
    // Check if the search query is enclosed in quotes
    if (searchQuery.startsWith('"') && searchQuery.endsWith('"')) {
        // Remove the quotes
        searchQuery = searchQuery.substring(1, searchQuery.length - 1);
        exactMatch = true;
    }
    
    // Split the search query into words, ignoring repeated or surrounding spaces
    let words = searchQuery.trim().split(/\s+/).filter(Boolean);
    
    let andQuery = [];
    words.forEach((word) => {
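        // Exact mode: match the word as a whole token anywhere in the field.
        // (`^${word}$` would only match a field that consists of the word alone.)
        let regex = exactMatch ? `(^|\\W)${word}(?=\\W|$)` : word;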
        andQuery.push({
            $or: [
                { Question: { $regex: regex, $options: 'i' } },
                { Answer: { $regex: regex, $options: 'i' } },
                { Alternative_answer: { $regex: regex, $options: 'i' } },
                { Comment: { $regex: regex, $options: 'i' } },
            ],
        });
    });
    
    search = {
        $match: {
            $and: andQuery,
        },
    };
    

    This code will create an $and query with a $regex for each word in the search query. If the search query is enclosed in quotes, it will search for an exact match. Otherwise, it will search for a partial match. The $options: 'i' makes the search case-insensitive.
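
The answer above matches case-insensitively in both modes and does not address accents. One possible way to cover the requirement from the question (quoted: exactly as typed; unquoted: ignore case and accents) is to expand each base letter into a character class of its accented forms. This is a minimal sketch under stated assumptions: `VARIANTS`, `stripAccents`, `escapeRegex`, `looseTerm`, and `buildSearchStage` are hypothetical names, the variant table is a guess at the letters that matter for Catalan text, and the patterns are only checked here against JavaScript's `RegExp`, not against a MongoDB server.

```javascript
const FIELDS = ["Question", "Answer", "Alternative_answer", "Comment"];

// Assumed table: base letter -> the forms it may take in the stored text.
const VARIANTS = { a: "aàá", e: "eèé", i: "iíï", o: "oòó", u: "uúü", c: "cç" };

// Strip diacritics: decompose (NFD), then drop the combining marks.
function stripAccents(text) {
    return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Escape regex metacharacters so user input is matched literally.
function escapeRegex(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Pattern string that matches `word` whatever accents the stored text uses.
// Upper-case variants are listed explicitly so accented letters do not
// depend on how the regex engine handles case folding.
function looseTerm(word) {
    return [...stripAccents(word).toLowerCase()]
        .map((ch) => {
            const cls = VARIANTS[ch];
            return cls ? `[${cls}${cls.toUpperCase()}]` : escapeRegex(ch);
        })
        .join("");
}

function buildSearchStage(searchQuery) {
    let text = searchQuery.trim();
    // Whole query in quotes: match exactly as typed (case and accents kept).
    const exact = text.length > 1 && text.startsWith('"') && text.endsWith('"');
    if (exact) text = text.slice(1, -1);

    const words = text.split(/\s+/).filter(Boolean);
    if (words.length === 0) return null; // an empty $and would be rejected

    // Every word must appear in at least one of the fields.
    const andQuery = words.map((word) => ({
        $or: FIELDS.map((field) => ({
            [field]: {
                $regex: exact ? escapeRegex(word) : looseTerm(word),
                $options: exact ? "" : "i",
            },
        })),
    }));
    return { $match: { $and: andQuery } };
}

console.log(looseTerm("Eufór"));
```

In this sketch the exact mode is a case-sensitive substring match; restricting it to whole words is a separate concern.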

  1. The solution:

    {
        pregunta: {
            // Backslashes are doubled: inside a template literal, a lone \W collapses to a plain "W".
            $regex: `(^|\\W)${md}(?=\\W|$)`,
            $options: "",
        },
    },
    

    This answer was posted as an edit to the question MongoDB search for words in multiple fields and only return results with both words (Text locales "Catalan") (SOLVED) by the OP Argoitz Estebanez under CC BY-SA 4.0.
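
The boundary pattern from this answer can be checked locally against the three "Messi" cases from the question. The sketch below does that with JavaScript's `RegExp`; `escapeRegex`, `wholeWord`, and `wholeWordUnicode` are hypothetical helper names, and the sample sentences are made up for illustration. It also shows a caveat: `\W` only treats ASCII letters, digits, and underscore as word characters, so an accented letter right after the term counts as a boundary. The Unicode-aware variant is a JavaScript-side alternative; whether the server's regex engine behaves the same way is an assumption to verify.

```javascript
// Escape regex metacharacters so the term is matched literally.
function escapeRegex(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Pattern string in the shape used by the answer. String.raw keeps the
// backslashes, so the result contains \W rather than a bare W.
function wholeWord(term) {
    return String.raw`(^|\W)` + escapeRegex(term) + String.raw`(?=\W|$)`;
}

// Unicode-aware variant (needs the "u" flag in JavaScript): a letter or
// digit on either side, accented or not, prevents a match.
function wholeWordUnicode(term) {
    return (
        String.raw`(?<![\p{L}\p{N}])` +
        escapeRegex(term) +
        String.raw`(?![\p{L}\p{N}])`
    );
}

const samples = [
    "Messi is the best",
    "the player, Messi",
    "the player Messi is the one",
    "Messianic",
];
const re = new RegExp(wholeWord("Messi"));
console.log(samples.map((s) => re.test(s)));
// [ true, true, true, false ]

// Caveat: "í" is not a word character for \W, so this still matches.
console.log(new RegExp(wholeWord("Eufór")).test("Eufórí"));
// true
console.log(new RegExp(wholeWordUnicode("Eufór"), "u").test("Eufórí"));
// false
```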
