
I have a problem with my database.
I frequently (several times a day) get crashes during queries. The driver throws this error:

PoolClearedError [MongoPoolClearedError]: Connection pool for db2.prod.someDomain.com:27017 was cleared because another operation failed with: "connection <monitor> to [ip:v6:add:ress::]:27017 timed out"
    at ConnectionPool.processWaitQueue (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/cmap/connection_pool.js:520:82)
    at ConnectionPool.clear (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/cmap/connection_pool.js:251:14)
    at updateServers (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/sdam/topology.js:461:29)
    at Topology.serverUpdateHandler (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/sdam/topology.js:332:9)
    at Server.<anonymous> (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/sdam/topology.js:444:77)
    at Server.emit (node:events:513:28)
    at Server.emit (node:domain:489:12)
    at markServerUnknown (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/sdam/server.js:298:12)
    at Monitor.<anonymous> (/var/www/api/prod/node_modules/mongoose/node_modules/mongodb/lib/sdam/server.js:58:46)
    at Monitor.emit (node:events:513:28) {
  address: 'db.prod.someDomain.com:27017',
  [Symbol(errorLabels)]: Set(1) { 'RetryableWriteError' }
}

I’ve tried several changes, such as decreasing the read and write rates on the database, but I can’t work out why it’s crashing. I can’t reproduce the bug, since it happens randomly, and my server monitoring shows CPU, RAM, and SSD I/O all at low levels.

Here is my configuration:

mongoose, latest version
MongoDB JS driver, latest version
MongoDB, latest version
1 replica set with 2 nodes
Each node has 32 GB of RAM and a 12-core Xeon processor
Full database size is around 130 GB, the biggest collection has 70M documents, and total index size is around 24 GB

2 Answers


  1. Chosen as BEST ANSWER

    I've found what was causing this error, and I'm sharing it here in case anybody is in the same situation.

    In fact, the problem comes neither from my database nor from my network, but from my client server.

    Sometimes my client server's CPU gets overloaded, with all cores at 100%. In those cases I guess it can't handle its MongoDB requests in time, which leads to this MongoPoolClearedError.

    Anyone seeing this error should check whether the client-side server is overloaded.
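
One way to check for client-side overload from inside the Node.js process is to watch event loop delay, since a saturated CPU usually shows up as timers and socket callbacks firing late. The following is a minimal sketch using Node's built-in `perf_hooks.monitorEventLoopDelay`; the helper name `startEventLoopWatch` and the 20 ms resolution are illustrative assumptions, not something from the original setup.

```javascript
const { monitorEventLoopDelay } = require('node:perf_hooks');

// Hypothetical helper: starts sampling event loop delay and returns
// `report()` (stats since the previous report, in milliseconds)
// and `stop()` (disables sampling).
function startEventLoopWatch(resolutionMs = 20) {
  const histogram = monitorEventLoopDelay({ resolution: resolutionMs });
  histogram.enable();

  return {
    report() {
      const stats = {
        // Histogram values are reported in nanoseconds.
        meanMs: histogram.mean / 1e6,
        p99Ms: histogram.percentile(99) / 1e6,
        maxMs: histogram.max / 1e6,
      };
      histogram.reset(); // start a fresh window for the next report
      return stats;
    },
    stop() {
      histogram.disable();
    },
  };
}
```

Calling `report()` on a fixed interval and logging the result next to a timestamp makes it possible to compare spikes in `maxMs` with the times at which `MongoPoolClearedError` appears. What counts as "too high" depends on the application, so any alert threshold would have to be chosen by observation.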


  2. Writing it as an answer since it’s quite big:
    In this case, the server you’re using, [ip:v6:add:ress::]:27017, failed with a timeout error ("connection <monitor> to [ip:v6:add:ress::]:27017 timed out"), which made the underlying connection pool unavailable ("paused"). In that state, any attempt to acquire a connection for a new operation (without a new server selection) throws PoolClearedError until a new server is elected as primary (then the pool is healthy again and the issue goes away).
    Several scenarios are possible. The most likely one, in my opinion, is that somewhere you’re using a (long-lived?) cursor operation that still relies on the old server without reselecting a new one (each getMore inside a cursor uses the server chosen by the initial aggregate/find operation). Since the server is still unhealthy (but the cursor doesn’t know it), you see this error when getMore tries to get a new connection.
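
To see when the monitor fails and the pool gets cleared, the driver's monitoring events can be logged. The sketch below assumes the `serverHeartbeatFailed` and `connectionPoolCleared` events emitted by the Node.js driver's `MongoClient`; the helper name `attachPoolDiagnostics` is hypothetical, and the event fields used (`connectionId`, `duration`, `failure`, `address`) should be checked against the driver version in use.

```javascript
// Hypothetical helper: subscribes to the driver's heartbeat-failure and
// pool-cleared events and forwards a one-line summary of each to `log`.
// `client` is expected to be a MongoClient (or any EventEmitter emitting
// the same events).
function attachPoolDiagnostics(client, log = console.warn) {
  client.on('serverHeartbeatFailed', (event) => {
    const reason = event.failure ? event.failure.message : 'unknown';
    log(
      `${new Date().toISOString()} heartbeat to ${event.connectionId} ` +
        `failed after ${event.duration} ms: ${reason}`
    );
  });

  client.on('connectionPoolCleared', (event) => {
    log(`${new Date().toISOString()} connection pool cleared for ${event.address}`);
  });

  return client;
}
```

With mongoose, the underlying client should be available through `mongoose.connection.getClient()` once `mongoose.connect()` has resolved, so the call would look like `attachPoolDiagnostics(mongoose.connection.getClient())`.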

    So, what you can do:

    1. Check the server's health at the moment the issue happens. If you fix that, the error will go away.
    2. Try to avoid long-running operations like cursors. This won’t fix the underlying issue, since any new operation starts with server selection, which will fail as well, but at the very least you will see an error message that describes the actual problem.
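
As an illustration of the second point, a large read can be split into many short queries instead of one long-lived cursor, so that every page is a new operation and goes through server selection again. This is only a sketch: `readInPages` and `fetchPage` are hypothetical names, and it assumes documents can be ordered by `_id`.

```javascript
// Hypothetical helper: async generator that reads documents page by page.
// `fetchPage(lastId, limit)` is supplied by the caller and must return up to
// `limit` documents with _id greater than `lastId` (or from the start when
// `lastId` is null), sorted by _id ascending.
async function* readInPages(fetchPage, pageSize = 1000) {
  let lastId = null;
  while (true) {
    // Each call is an independent, short operation.
    const page = await fetchPage(lastId, pageSize);
    if (page.length === 0) return;
    yield* page;
    lastId = page[page.length - 1]._id;
    if (page.length < pageSize) return; // last, partially filled page
  }
}

// Example fetchPage for a mongoose model (assumes a model named `Model`):
//
// const fetchPage = (lastId, limit) =>
//   Model.find(lastId === null ? {} : { _id: { $gt: lastId } })
//     .sort({ _id: 1 })
//     .limit(limit)
//     .lean();
```

The results are consumed with `for await (const doc of readInPages(fetchPage, 500)) { ... }`. If the server is unhealthy, the next page request fails during server selection, which is the clearer error described above.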