Azure Cognitive Search: Cannot search for text that starts with a specific term containing a space

Shaq
July 20, 2023
194 views
0 votes
2 Answers

I have asked a similar question before and I thought I had solved it, but I had not.

I’m using Azure Search with an index that includes the "EdgeNGramTokenFilterV2" to achieve "starts with" behavior for partial matches. However, I noticed that it doesn’t work as expected when the search term contains spaces.

For example, if my data includes the value "Blue hall" and I search for "Blue", I get the correct results. But when I search for "Blue h", I don’t get any matches.

I am using searchMode "all".

Example index definition:

{
  "fields": [
    {
      "name": "myField",
      "type": "Edm.String",
      "searchable": true,
      "analyzer": "myAnalyzer"
      ...
    }
  ],
  "analyzers": [
    {
      "name": "myAnalyzer",
      "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
      "tokenizer": "keyword_v2",
      "tokenFilters": [ "lowercase", "my_edgeNGram" ]
    }
  ],
  "tokenFilters": [
    {
      "@odata.type": "#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
      "name": "my_edgeNGram",
      "minGram": 1,
      "maxGram": 25,
      "side": "front"
    }
  ]
}

Could someone help with this please?

Answers

Chosen as BEST ANSWER
- Shaq
- July 20, 2023 at 9:01 am
- 0 votes
I was able to get the desired "starts with" behaviour by wrapping the search term in double quotes before passing it to the Azure Cognitive Search service.
```
""Blue C ""
```
This turns the search term into a phrase query (reference link).
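As a rough illustration of the quoting step, here is a minimal Python sketch. The helper names `to_phrase_query` and `build_search_body` are made up for this example, and the escaping rule (backslash before an embedded quote or backslash) is my assumption about the simple query syntax, so check it against your own data. The `search` and `searchMode` keys are the ones used in the body of a POST search request.

```python
import json


def to_phrase_query(term: str) -> str:
    """Wrap a user-entered term in double quotes so it is sent as a phrase query."""
    # Assumption: backslash is the escape character, so escape it and any
    # embedded double quotes before quoting the whole term.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_search_body(term: str) -> str:
    """Build the JSON body of a search request (hypothetical helper)."""
    return json.dumps({"search": to_phrase_query(term), "searchMode": "all"})


print(to_phrase_query("Blue h"))    # "Blue h"
print(build_search_body("Blue h"))  # {"search": "\"Blue h\"", "searchMode": "all"}
```

The sketch deliberately does not trim the term, so a trailing space typed by the user stays inside the quotes.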

This post explains in more detail how the query parser works and how it breaks the query into terms before the analyzer is applied: https://stackoverflow.com/a/40857668/15906376

I am not sure if this is the best way to go about it, but it works.
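To make the behaviour easier to see, here is a small Python model of the analyzer from the question. It is only a sketch and not the service's actual implementation. It assumes that an unquoted query is split on whitespace before analysis, that a quoted phrase reaches the analyzer as one string, and that a term matches when any of its n-grams equals an indexed token (the same analyzer runs on the query because no separate search analyzer is configured). The function names are hypothetical.

```python
def edge_ngrams(token, min_gram=1, max_gram=25):
    """Front edge n-grams of one token, mirroring the my_edgeNGram settings."""
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}


def keyword_analyze(text):
    """Model of myAnalyzer: the whole input is one token, lowercased, then n-grammed."""
    return edge_ngrams(text.lower())


def matches(field_value, query, phrase=False):
    """Return True if every query term shares at least one token with the indexed field."""
    indexed = keyword_analyze(field_value)
    # Assumption: unquoted queries are split on whitespace before analysis;
    # a quoted phrase is analyzed as a single string.
    terms = [query] if phrase else query.split()
    # searchMode=all: every term must match at least one indexed token.
    return all(keyword_analyze(term) & indexed for term in terms)


print(matches("Blue hall", "Blue"))                 # True
print(matches("Blue hall", "Blue h"))               # False: no indexed token equals "h"
print(matches("Blue hall", "Blue h", phrase=True))  # True: "blue h" is an indexed n-gram
```

In this model the indexed tokens for "Blue hall" are "b", "bl", "blu", "blue", "blue ", "blue h" and so on, so the separate term "h" has nothing to match, while the quoted phrase does.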


- RishabhM
- July 19, 2023 at 2:36 pm
- 0 votes
I have reproduced the issue with the provided configuration.
Sample data used:
```
Result is empty with query: "Sample t"
```
To resolve this issue, you can try using the "standard_v2" tokenizer instead of the "keyword_v2" tokenizer.

Here is the updated Index Definition:
```
"fields": [
    {
        "name": "id",
        "type": "Edm.String",
        "key": true
    },
    {
        "name": "myField",
        "type": "Edm.String",
        "searchable": true,
        "analyzer": "myAnalyzer"
    }
],
"analyzers": [
    {
        "name": "myAnalyzer",
        "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
        "tokenizer": "standard_v2",
        "tokenFilters": ["lowercase", "my_edgeNGram"]
    }
],
"tokenFilters": [
    {
        "@odata.type": "#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
        "name": "my_edgeNGram",
        "minGram": 1,
        "maxGram": 25,
        "side": "front"
    }
]
```
With the updated index definition, the search query returns the required results.
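For illustration, here is a rough Python model of what changes with the "standard_v2" tokenizer. It is a sketch only: a `\w+` regex stands in for the real tokenizer, the function names are hypothetical, and it uses the "Blue hall" value from the question. The field is split into words first, so each word gets its own edge n-grams.

```python
import re


def edge_ngrams(token, min_gram=1, max_gram=25):
    """Front edge n-grams of one token, mirroring the my_edgeNGram settings."""
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}


def standard_analyze(text):
    """Stand-in for standard_v2 + lowercase + my_edgeNGram: split into words, n-gram each."""
    tokens = set()
    for word in re.findall(r"\w+", text.lower()):
        tokens |= edge_ngrams(word)
    return tokens


def matches_all(field_value, query):
    """searchMode=all: every whitespace-separated term must match an indexed token."""
    indexed = standard_analyze(field_value)
    return all(standard_analyze(term) & indexed for term in query.split())


print(matches_all("Blue hall", "Blue h"))  # True: "h" is now an indexed n-gram of "hall"
print(matches_all("Blue hall", "hall b"))  # True: word order is not enforced
print(matches_all("Blue hall", "Blue x"))  # False: nothing starts with "x"
```

In this model the match is closer to "every search word is a prefix of some word in the field" than to a strict "field starts with", so it is worth checking whether that is acceptable for your use case.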