
I cannot get prefix search to work on a scalar field. Infix search does work, however, which suggests that my field is not being tokenized properly.

ENV: Standalone Milvus 2.4.8 on Ubuntu 22.04

Here is my schema:

{
    fields: [
        {
            name: 'id',
            description: 'Id field',
            data_type: DataType.VarChar,
            is_primary_key: true,
            max_length: this.NORMALISED_GUID_LENGTH
        },
        {
            name: 'vector',
            description: 'Vector field',
            data_type: DataType.FloatVector,
            dim: this.embeddingModel.dims
        },
        {
            name: 'tag',
            description: 'The partition tag',
            data_type: DataType.VarChar,
            max_length: this.NORMALISED_TAG_LENGTH
        },
        {
            name: 'channels',
            description: 'The channels that may access this entry',
            data_type: DataType.VarChar,
            max_length: this.MAX_CHANNELS_LENGTH
        },
        {
            name: "payload",
            description: 'The payload meta data',
            data_type: DataType.JSON
        }
    ],
    partition_key_field: 'tag',
    index_params: [
        {
            field_name: "vector",
            index_type: "DISKANN",
            metric_type: "IP"
        },
        {
            field_name: "tag",
            index_name: "tag_index",
            index_type: "INVERTED"
        },
        {
            field_name: "channels",
            index_name: "channels_index",
            index_type: "INVERTED"
        }
    ]
}

The field of interest is ‘channels’.

The channels field can look like this:
"48d302b1963841c39790fecf56b91ddc c8a80b710e455460ae8b2399f5adfef5 b9e8870f9f994bc6a172e118fa6e7c8a"

When I search it, the filter looks like this (but it does not work):
'channels like "c8a80b710e455460ae8b2399f5adfef5%"' // note it is the 2nd token in the channels field above

The following infix search works:
'channels like "%c8a80b710e455460ae8b2399f5adfef5%"' // note it is the 2nd token in the channels field above

And the following prefix search works but only for the first token in the channels field:
'channels like "48d302b1963841c39790fecf56b91ddc%"' // note it is the 1st token in the channels field above

What am I doing wrong?

Answers


  1. If there’s no text after the field the wildcard won’t work. There’s nothing to match. But let me check with engineering.

  2. This is because Milvus 2.4.x only supports pattern matching with %. It treats the entire content of channels as a single string, so a prefix pattern can only match from the start of the whole value. Your observation is correct.
    In the upcoming Milvus 2.5 release, we will introduce a "match" expression that tokenizes the field. That means you will not need to write 'channels like "%c8a80b710e455460ae8b2399f5adfef5%"'; instead you can write: filter: match(channels, 'c8a80b710e455460ae8b2399f5adfef5')
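
    To illustrate the behaviour described above, here is a minimal TypeScript sketch. It is not Milvus code: likeMatch is a hypothetical helper that only emulates a LIKE pattern with % applied to the whole field value, and channelFilter is a hypothetical helper that builds the infix filter string from the question. The 32-character lowercase hex check is an assumption based on the example IDs; with fixed-length, space-separated IDs like these, an infix match on a full ID can only line up with a complete token.

```typescript
// Hypothetical helper (not part of any Milvus SDK): emulates a LIKE pattern
// that is applied to the entire field value, with % as the only wildcard.
function likeMatch(value: string, pattern: string): boolean {
    // Escape regex metacharacters so the literal parts match as-is.
    const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    // Translate each % into "any run of characters".
    const source = escaped.replace(/%/g, "[\\s\\S]*");
    // Anchor at both ends: the pattern must cover the whole value.
    return new RegExp("^" + source + "$").test(value);
}

// Hypothetical helper: builds the infix filter from the question.
// Assumes channel IDs are 32 lowercase hex characters, as in the example.
function channelFilter(channelId: string): string {
    if (!/^[0-9a-f]{32}$/.test(channelId)) {
        throw new Error("unexpected channel id format: " + channelId);
    }
    return 'channels like "%' + channelId + '%"';
}

// The example value from the question: three IDs separated by spaces.
const channelsValue =
    "48d302b1963841c39790fecf56b91ddc " +
    "c8a80b710e455460ae8b2399f5adfef5 " +
    "b9e8870f9f994bc6a172e118fa6e7c8a";

// Prefix pattern on the 2nd ID: false, because the value starts with the 1st ID.
console.log(likeMatch(channelsValue, "c8a80b710e455460ae8b2399f5adfef5%"));
// Infix pattern on the 2nd ID: true.
console.log(likeMatch(channelsValue, "%c8a80b710e455460ae8b2399f5adfef5%"));
// Prefix pattern on the 1st ID: true.
console.log(likeMatch(channelsValue, "48d302b1963841c39790fecf56b91ddc%"));
// Filter string to pass as the search filter on 2.4.x.
console.log(channelFilter("c8a80b710e455460ae8b2399f5adfef5"));
```

    On 2.4.x the string returned by channelFilter is the same infix filter that already works in the question; it is a workaround rather than a tokenized match.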
