
I asked a similar question before and thought I had solved it, but I had not.

I’m using Azure Search with an index that includes the "EdgeNGramTokenFilterV2" to achieve "starts with" behavior for partial matches. However, I noticed that it doesn’t work as expected when the search term contains spaces.

For example, if my data includes the word "Blue hall" and I search for "Blue" I get the correct results. But when I search for "Blue h," I don’t get any matches.

I am using SearchMode All

Example

{
"fields": [
    {
      "name": "myField",
      "type": "Edm.String",
      "searchable": true,
      "analyzer": "myAnalyzer"
...
    }
  ],
  "analyzers": [
    {
      "name": "myAnalyzer",
      "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
      "tokenizer": "keyword_v2",
      "tokenFilters": [ "lowercase", "my_edgeNGram" ]
    }
  ],
  "tokenFilters": [
    {
      "@odata.type": "#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
      "name": "my_edgeNGram",
      "minGram": 1,
      "maxGram": 25,
      "side": "front"
    }
  ]
}

Could someone help with this please?

2 Answers


  1. Chosen as BEST ANSWER

    I was able to get the desired "starts with" behaviour by wrapping the search term in quotes before passing it to the Azure Cognitive Search service.

    ""Blue C ""
    

    This changes the term into a phrase query (reference link).

    This post explains in more detail how the query parser works and how it breaks the query into terms before the analyzer is applied to them: https://stackoverflow.com/a/40857668/15906376

    I am not sure if this is the best way to go about it, but it works.
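
As a rough illustration of the quoting step described in this answer, here is a minimal Python sketch. The helper name `to_phrase_query` is hypothetical, and the backslash escaping of embedded quotes is an assumption about how the query syntax treats special characters, so check it against the query syntax you actually use.

```python
def to_phrase_query(term: str) -> str:
    """Wrap a raw search term in double quotes so it is sent as one phrase.

    Assumption: backslashes and embedded double quotes are escaped with a
    backslash so they cannot end the phrase early.
    """
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# The user types: Blue h  ->  the service receives: "Blue h"
print(to_phrase_query("Blue h"))
```

The returned string would then be passed as the search text in place of the raw user input.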


  2. I have reproduced the issue with the provided configuration.
    Sample data used:
    [image: sample data]

    Result is empty with query: "Sample t"
    

    To resolve this issue, you can try using the "standard_v2" tokenizer instead of the "keyword_v2" tokenizer.

    Here is the updated Index Definition:

    "fields": [
        {
            "name": "id",
            "type": "Edm.String",
            "key": true
        },
        {
            "name": "myField",
            "type": "Edm.String",
            "searchable": true,
            "analyzer": "myAnalyzer"
        }
    ],
    "analyzers": [
        {
            "name": "myAnalyzer",
            "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
            "tokenizer": "standard_v2",
            "tokenFilters": ["lowercase", "my_edgeNGram"]
        }
    ],
    "tokenFilters": [
        {
            "@odata.type": "#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
            "name": "my_edgeNGram",
            "minGram": 1,
            "maxGram": 25,
            "side": "front"
        }
    ]
    

    With the updated index definition, the search query returns the required results.
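
To see why the tokenizer swap can change the outcome, here is a small pure-Python simulation. It is only a sketch and not Azure's implementation: the regex split is a rough stand-in for a standard tokenizer, all function names are hypothetical, and it assumes that an unquoted query is split on whitespace before analysis and that every term must match (as with searchMode all).

```python
import re

MIN_GRAM, MAX_GRAM = 1, 25  # values taken from the token filter definition


def edge_ngrams(token, min_gram=MIN_GRAM, max_gram=MAX_GRAM):
    """Return the front edge n-grams of a single token."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]


def analyze(text, tokenizer):
    """Simplified analyzer: tokenize, lowercase, then edge n-gram each token."""
    if tokenizer == "keyword":
        tokens = [text] if text else []    # the whole input is one token
    else:
        tokens = re.findall(r"\w+", text)  # rough approximation of word splitting
    grams = set()
    for token in tokens:
        grams.update(edge_ngrams(token.lower()))
    return grams


def matches_all_terms(document, query, tokenizer):
    """Assumption: the query is split on whitespace first, each term is
    analyzed separately, and every term must hit at least one indexed gram."""
    indexed = analyze(document, tokenizer)
    return all(analyze(term, tokenizer) & indexed for term in query.split())


# Keyword tokenizer: the index holds "b", "bl", ..., "blue h", ..., "blue hall",
# but never a standalone "h", so the second query term finds nothing.
print(matches_all_terms("Blue hall", "Blue h", "keyword"))

# Word-level tokenizer: "hall" is indexed as "h", "ha", "hal", "hall",
# so both query terms find a match.
print(matches_all_terms("Blue hall", "Blue h", "standard"))
```

In this simplified model, the keyword tokenizer only matches when the whole query reaches the analyzer as a single string, which is what a quoted phrase query does.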
