I search a collection in two ways. The first is by hard data (numbers and other exact values), using $match with operators like $regex, $in, $eq, $size, $gte, etc.

{'$match': {'$and': [{'isbn': {'$exists': 1}}, {'form': {'$in': ['BC']}}]}}

With $facet I can count the found matches.
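
For readers unfamiliar with the pattern, here is a minimal sketch of counting $match results with $facet. The helper name `build_match_with_count`, the facet names `results` and `total`, and the page size are assumptions made for illustration; they are not from the original post.

```python
def build_match_with_count(match_filter, page_size=10):
    """Return a pipeline: $match, then $facet for one page of results plus a total."""
    return [
        {'$match': match_filter},
        {'$facet': {
            # One page of matching documents.
            'results': [{'$limit': page_size}],
            # Total number of documents that passed $match.
            'total': [{'$count': 'count'}],
        }},
    ]


match_filter = {'$and': [{'isbn': {'$exists': 1}}, {'form': {'$in': ['BC']}}]}
pipeline = build_match_with_count(match_filter)

# With pymongo (the collection object is assumed to exist):
# doc = next(collection.aggregate(pipeline))
# $count emits nothing for zero matches, so guard against an empty list.
# total = doc['total'][0]['count'] if doc['total'] else 0
```

Both facets run over the same $match output, so the total and the page always agree.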

For other purposes I use a $search index for nifty text search. I get scores, highlights and the lot, and again (but not with $facet) the count of found matches.

'$search': {
    'index': 'default',
    'compound': {
        'must': [
            how...
        ],
    },
    'count': {'type': 'total'},
}
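
As a sketch of how that count can be read back, the total requested via 'count' is exposed through the `$$SEARCH_META` aggregation variable in later stages. The helper names, the projected `title` field, and the illustrative 'must' clause below are assumptions; the real clauses are elided in the question.

```python
def build_search_with_count(search_stage, page_size=10):
    """Return a pipeline whose result documents each carry the $search total."""
    stage = dict(search_stage)
    # Ask for an exact total unless the caller already set a count option.
    stage.setdefault('count', {'type': 'total'})
    return [
        {'$search': stage},
        {'$limit': page_size},
        {'$project': {
            'title': 1,
            'score': {'$meta': 'searchScore'},
            # Search metadata, including count.total, copied onto each document.
            'meta': '$$SEARCH_META',
        }},
    ]


def total_from_results(docs):
    """Read the total from the first result; 0 when nothing was returned."""
    return docs[0]['meta']['count']['total'] if docs else 0


search_stage = {
    'index': 'default',
    'compound': {
        # Illustrative clause only; the question does not show its 'must' clauses.
        'must': [{'text': {'query': 'pizza', 'path': 'title'}}],
    },
    'count': {'type': 'total'},
}
pipeline = build_search_with_count(search_stage)

# With pymongo (the collection object is assumed to exist):
# docs = list(collection.aggregate(pipeline))
# total = total_from_results(docs)
```

The total describes what $search itself matched, so any $match stage placed after $search is not reflected in it.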

I can combine $match and $search, but then I cannot get a proper count. $search cannot be combined with $facet, and without $facet the count always reflects what the text search finds, without taking into account the documents that the $match stage excludes.

How can I combine the two? Adding compound filters or using the limited set of operators within Atlas’s text search is not sufficient.

Bottom line, this is all about proper counting of results.

Thanks.

2 Answers


  1. Chosen as BEST ANSWER

    Solution!

    What is not mentioned in the documentation (or at least is not easy to find) is the possibility of using regex in a filter.

    This way I can replace the whole $match part.

    '$search': {
          'index': "default",
          'compound':{
            'filter': [
                {
                    'regex': { # to find specific years in dates
                        'query': '(2019|2020|2021|2022)',
                        'path': ['pubdate'],
                        'allowAnalyzedField': True,
                    }
                },
                {
                    'regex': {
                        'query': '(00|02|04)',
                        'path': ['pubstatus'],
                        'allowAnalyzedField': True,
                    }
                },
                {
                    'regex': { # look for BB or BC
                        'query': '(B.)',
                        'path': ['form'],
                        'allowAnalyzedField': True,
                    }
                },
                {
                    'regex': {
                        'query': '.*', # not empty
                        'path': ['cover'],
                        'allowAnalyzedField': True,
                    }
                },
                {
                    'regex': {
                        'query': '.*', # also to check if any items in the array
                        'path': ['index_recipes.recipe'],
                        'allowAnalyzedField': True,
                    }
                }
            ],
            'must': [{ # the actual text search
              'text': {
                'query': "pizza pasta",
                'path': ['title', 'flaptext']
              }
            }]
          }
        }
    

    And it works fast and perfectly.

    Thanks for the help.


  2. The short answer is that you can’t.

    The $search stage uses a completely different index and methodology to "match" documents. That is why it has to be the first stage in a pipeline and can’t be nested in a $facet.

    Removing these limitations would essentially just mean running two queries under the hood, which is what you must do yourself:

    // aggregate() returns a cursor; toArray() resolves it to the result documents
    const count = await collection.aggregate([...matchStage, ...countStage]).toArray()
    const highlights = await collection.aggregate([...searchStage, ...matchStage]).toArray()
    

    An approach that is possible in Elasticsearch is to use the children pipeline, but as far as I’m aware this is not a feature available in Atlas Search yet.
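
To tie the accepted answer back to the original counting question, here is a hedged sketch that builds the regex filters programmatically and requests the total in the same search body. Because the filters live inside 'compound', the reported total should cover only documents that pass both the filters and the text clause. The helper names and page size are assumptions; the field names and queries are taken from the accepted answer.

```python
def regex_filter(path, query):
    """Build one regex clause in the shape used by the accepted answer."""
    return {'regex': {'query': query, 'path': path, 'allowAnalyzedField': True}}


def build_search_body(text_query, text_paths, filters):
    """Search body: regex filters + text search + exact total count."""
    return {
        'index': 'default',
        'compound': {
            'filter': [regex_filter(path, query) for path, query in filters],
            'must': [{'text': {'query': text_query, 'path': text_paths}}],
        },
        'count': {'type': 'total'},
    }


def build_results_pipeline(body, page_size=10):
    """One page of scored results, each carrying the total via $$SEARCH_META."""
    return [
        {'$search': body},
        {'$limit': page_size},
        {'$addFields': {
            'score': {'$meta': 'searchScore'},
            'meta': '$$SEARCH_META',
        }},
    ]


def build_count_pipeline(body):
    """Count only: $searchMeta returns the metadata document without results."""
    return [{'$searchMeta': body}]


filters = [
    (['pubdate'], '(2019|2020|2021|2022)'),
    (['pubstatus'], '(00|02|04)'),
    (['form'], '(B.)'),
]
body = build_search_body('pizza pasta', ['title', 'flaptext'], filters)
results_pipeline = build_results_pipeline(body)
count_pipeline = build_count_pipeline(body)

# With pymongo (the collection object is assumed to exist):
# total = next(collection.aggregate(count_pipeline))['count']['total']
```

Using the same body for both pipelines keeps the displayed page and the reported total consistent with each other.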
