After reading many similar questions, although new to MongoDB and pymongo, I believe this specific use case is not covered fully.
I need to perform a case insensitive search on a MongoDB collection which holds place names as strings with the following structure:
{
"_id": {
"$oid": "6419bc2aa7cb0cf0502af787"
},
"placename": "Bologna",
"geonamesid": 3181928,
"timestamp": {
"$date": "2023-03-21T14:16:10.976Z"
}
}
as such the following pymongo query works well:
placename = "Bologna"
query = {"placename": placename}
res = mdbcoll.find_one(query, {"geonamesid": 1})
which would correctly return 3181928 in res.
Problem lies in these places sometimes being passed in differente cases such as "BOLOGNA" and the above find would fail of course.
I have therefore read about generating a case insensitive index on placename as follows:
db.loc.createIndex({ placename: 1 }, { collation: { locale: "it", strength: 2 } })
after which I am trying to modify the above query as follows:
res = mdbcoll.find_one(query, {"geonamesid": 1}).collation(
{
"locale": "it",
"strength": 2,
} # find on placename case insensitive index
)
but this fails with the following error:
AttributeError: 'NoneType' object has no attribute 'collation'
2
Answers
When using the
find_one()
method you will only get a single document, which is not a cursor, where the collation method lies. But you could pass a collation as a parameter, the same as in find.Then, here you’re not even getting your result, as your
find_one()
returnsNone
. As the error states. I don’t know why right now though.Finally, an easier way to find things in a case-insensitive way is to pass the query a regex instead of a string:
re.compile()
enables you to reuse the patternAnd
re.I
is the case insensitive flagIf you create a full text index you can then use the
$text
operator to perform a case insensitve search. You don’t need to add any collation to the query, although you may want to add a$language
to the index.One thing to be aware of:
Regex search as per @Er…’s answer will work too but will perform poorly on larger collections as the field is not indexed.
This example shows how it could work. You only need to create the index as a one-off activity.