I have a problem with my database.
Several times a day, a query crashes with this error:
PoolClearedError [MongoPoolClearedError]: Connection pool for db2.prod.someDomain.com:27017 was cleared because another operation failed with: "connection <monitor> to [ip:v6:add:ress::]:27017 timed out"
at ConnectionPool.processWaitQueue (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/cmap/connection_pool.js:520:82)
at ConnectionPool.clear (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/cmap/connection_pool.js:251:14)
at updateServers (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/sdam/topology.js:461:29)
at Topology.serverUpdateHandler (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/sdam/topology.js:332:9)
at Server.<anonymous> (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/sdam/topology.js:444:77)
at Server.emit (node:events:513:28)
at Server.emit (node:domain:489:12)
at markServerUnknown (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/sdam/server.js:298:12)
at Monitor.<anonymous> (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/sdam/server.js:58:46)
at Monitor.emit (node:events:513:28) {
address: 'db.prod.someDomain.com:27017',
[Symbol(errorLabels)]: Set(1) { 'RetryableWriteError' }
}
I’ve tried several changes, such as decreasing my write and read rates on the db, but I can’t work out why it’s crashing … I can’t reproduce the bug, since it happens randomly, and when I check my server monitoring, CPU, RAM and SSD I/O are all at low levels.
Here is my configuration:
mongoose, latest version
mongodb js driver, latest version
mongodb, latest version
1 replica set with 2 nodes
Each node has 32 GB of RAM and a 12-core Xeon processor
Full database size is around 130 GB, the biggest collection has 70M documents, total index size is around 24 GB
2 Answers
I've found what was causing this error, and I'm sharing it here in case anybody is in the same situation.
In fact, the problem comes neither from my database nor from my network ... but from my client server.
Sometimes my client server's CPU gets overloaded, with all cores at 100%. In those cases I guess it can't handle its MongoDB requests in time, which leads to this MongoPoolClearedError.
Anyone seeing this error should check whether the client-side server is overloaded.
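One way to check this from inside the client process is to watch how far behind the Node.js event loop is running. The following is a minimal sketch rather than a definitive implementation: it uses Node's built-in `perf_hooks.monitorEventLoopDelay` and `os.loadavg`, while the names `startOverloadMonitor` and `isOverloaded` and the 200 ms threshold are assumptions made up for illustration and should be tuned for your own setup.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');
const os = require('node:os');

// Hypothetical threshold (assumption): report when the 99th percentile
// event loop delay exceeds this many milliseconds.
const LAG_WARN_MS = 200;

// Pure check: the process looks overloaded if the event loop lags badly
// or the 1-minute load average exceeds the number of cores.
function isOverloaded({ p99Ms, load1, cores }, warnMs = LAG_WARN_MS) {
  return p99Ms > warnMs || load1 > cores;
}

// Samples event loop delay and reports when the client looks overloaded.
// Returns a function that stops the monitor.
function startOverloadMonitor({
  intervalMs = 5000,
  warnMs = LAG_WARN_MS,
  onReport = (sample) => console.warn('client overloaded?', sample),
} = {}) {
  const histogram = monitorEventLoopDelay({ resolution: 20 });
  histogram.enable();

  const timer = setInterval(() => {
    const sample = {
      p99Ms: histogram.percentile(99) / 1e6, // histogram values are in nanoseconds
      maxMs: histogram.max / 1e6,
      load1: os.loadavg()[0], // always 0 on Windows
      cores: os.cpus().length,
    };
    if (isOverloaded(sample, warnMs)) {
      onReport(sample);
    }
    histogram.reset(); // start a fresh window for the next interval
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for monitoring

  return function stop() {
    clearInterval(timer);
    histogram.disable();
  };
}
```

Calling `startOverloadMonitor()` once at startup and comparing the timestamps of its reports with those of the MongoPoolClearedError occurrences should show whether the two coincide.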
Writing this as an answer since it’s quite long:
In this case, the server you’re using, [ip:v6:add:ress::]:27017, has failed with a timeout error ("connection to [ip:v6:add:ress::]:27017 timed out"), which made the underlying connection pool unavailable ("paused"). In that state, any attempt to acquire a connection for a new operation (without a new server selection) will throw PoolClearedError until a new server is elected as primary (then the pool will be healthy again and the issue will be gone).
The scenarios can vary. Most likely, and I vote for this scenario, somewhere you’re using a cursor operation (long-living?) that still relies on the old server without reselecting a new one (each getMore inside a cursor uses the server from the initial aggregate/find operation). Since the server is still unhealthy (but the cursor doesn’t know it), when getMore tries to get a new connection, you see this error.
So, what you can do: