I have two documents indexed in Azure Search (among many others):

  • Document A contains only one instance of "BRIG" in the whole document.
  • Document B contains 40 instances of "BRIG".

When I do a simple search for "BRIG" in the Azure Search Explorer via Azure Portal, I see Document A returned first with "@search.score": 7.93229 and Document B returned second with "@search.score": 4.6097126.

There is a scoring profile on the index that adds a boost of 10 for the "title" field and a boost of 5 for the "summary" field, but this doesn’t affect these results, as neither document has "BRIG" in either of those fields.

There’s also a "freshness" scoring function with a boost of 15 over 365 days with a quadratic function profile. Again, this shouldn’t apply to either of these documents as both were created over a year ago.

I can’t figure out why Document A is scoring higher than Document B.

2 Answers


    1. Test your scoring profile configurations. Perhaps try issuing queries without scoring profiles first and see if that meets your needs.

    2. The "searchMode" parameter controls precision and recall. If you want more recall, use the default "any" value, which returns a result if any part of the query string is matched. If you favor precision, where all parts of the string must be matched, change searchMode to "all". Try the above query both ways to see how searchMode changes the outcome. See Simple Query Examples.

    3. If you are using the BM25 algorithm, you also may want to tune your k1 and b values. See Set BM25 Parameters.

    4. Lastly, you may want to explore the new Semantic search preview feature for enhanced relevance.
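As a rough illustration of suggestions 1 to 3, the sketch below builds the JSON bodies you might send to the REST search endpoint (POST to the index's docs/search URL) and the BM25 similarity block you might place in an index definition. The helper names `build_search_body` and `bm25_similarity` and the profile name "my-profile" are hypothetical; the `search`, `searchMode`, `scoringProfile`, `k1` and `b` property names are the ones the REST API uses. Note that if the index defines a default scoring profile, leaving `scoringProfile` out of the request may not be enough to disable it.

```python
import json


def build_search_body(term, search_mode="any", scoring_profile=None):
    """Build the JSON body for a search request (hypothetical helper)."""
    if search_mode not in ("any", "all"):
        raise ValueError("searchMode must be 'any' or 'all'")
    body = {"search": term, "searchMode": search_mode, "count": True}
    # Only name a scoring profile when one is explicitly requested.
    if scoring_profile is not None:
        body["scoringProfile"] = scoring_profile
    return body


def bm25_similarity(k1=1.2, b=0.75):
    """Similarity block for an index definition (hypothetical helper).

    k1 controls term-frequency saturation; b controls how strongly
    field length normalizes the score (0 turns normalization off).
    """
    return {
        "@odata.type": "#Microsoft.Azure.Search.BM25Similarity",
        "k1": k1,
        "b": b,
    }


if __name__ == "__main__":
    # Same query three ways: plain, strict matching, and with a profile.
    print(json.dumps(build_search_body("BRIG"), indent=2))
    print(json.dumps(build_search_body("BRIG", search_mode="all"), indent=2))
    print(json.dumps(build_search_body("BRIG", scoring_profile="my-profile"), indent=2))
    # b=0 disables length normalization, which is one way to test its effect.
    print(json.dumps({"similarity": bm25_similarity(b=0.0)}, indent=2))
```

Comparing the scores returned for each variant should show whether the scoring profile, the search mode, or the length normalization is responsible for the ranking.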

  1. It’s possible that Document A is newer than Document B, which would explain why it is displayed first with a higher score. Besides term relevance, freshness can also affect the score.

    EDIT:

    After some research, it looks like newly created Azure Cognitive Search services use the BM25 algorithm by default (source: https://learn.microsoft.com/en-us/azure/search/index-similarity-and-scoring#scoring-algorithms-in-search).

    Document length and field length also play a role in the BM25 algorithm. Term frequency is normalized by field length, so matches in longer fields carry less weight, and repeated occurrences of the same term give diminishing returns rather than a proportional boost. Therefore, a document that contains a single instance of the search term in a shorter field may receive a higher relevance score than a document that contains the search term multiple times in a longer field.
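To make the length effect concrete, here is a minimal sketch of the textbook (Lucene-style) BM25 score for a single term. It is not Azure's exact implementation, which scores per field and per shard, and every number in it (corpus size, field lengths, average length) is invented for illustration; only the term counts of 1 and 40 come from the question.

```python
import math


def bm25_term_score(tf, field_len, avg_field_len, n_docs, doc_freq,
                    k1=1.2, b=0.75):
    """Textbook BM25 score of one term in one field (illustrative only)."""
    # Rarer terms get a larger inverse document frequency.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: fields longer than average increase the
    # denominator, which lowers the score.
    norm = k1 * (1 - b + b * field_len / avg_field_len)
    # The tf part saturates: it can never exceed k1 + 1.
    return idf * tf * (k1 + 1) / (tf + norm)


# Assumed corpus statistics (hypothetical values).
N_DOCS, DOC_FREQ, AVG_LEN = 1000, 2, 1000

# Document A: 1 match in a short field. Document B: 40 matches in a very long one.
score_a = bm25_term_score(tf=1, field_len=20, avg_field_len=AVG_LEN,
                          n_docs=N_DOCS, doc_freq=DOC_FREQ)
score_b = bm25_term_score(tf=40, field_len=50_000, avg_field_len=AVG_LEN,
                          n_docs=N_DOCS, doc_freq=DOC_FREQ)

# With b=0, length normalization is off and the higher count wins instead.
score_a_b0 = bm25_term_score(1, 20, AVG_LEN, N_DOCS, DOC_FREQ, b=0.0)
score_b_b0 = bm25_term_score(40, 50_000, AVG_LEN, N_DOCS, DOC_FREQ, b=0.0)

if __name__ == "__main__":
    print(f"default b: A={score_a:.3f}  B={score_b:.3f}")
    print(f"b=0:       A={score_a_b0:.3f}  B={score_b_b0:.3f}")
```

Under these assumed lengths, Document A outranks Document B with the default parameters, and the order flips when b is set to 0. That suggests checking the relative field lengths of the two documents before adjusting k1 or b.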
