I have a daemon that runs background jobs requested by our webservice. We have 4 workers running simultaneously.
Sometimes a job is executed twice at the same time, because two workers decided to run that job. To avoid this situation we tried several things:
- Since our jobs come from our database, we added a flag called executed that prevents other workers from picking up a job that has already started executing. This does not solve the problem: sometimes the delay with our database is enough to allow simultaneous executions.
- We added memcached to the system (all workers run on the same machine), but somehow we still had simultaneous jobs running today. memcached also would not solve the problem across multiple servers.
Here is the logic we are currently using:
    // We create our memcached server
    $memcached = new Memcached();
    $memcached->addServer("127.0.0.1", 11211);

    // Checkup every 5 seconds for operations
    while (true) {
        // Gather all operations TODO
        // In this query, we do not accept operations that are set
        // as executed already.
        $result = findDaemonOperationsPendingQuery();

        // We have some results!
        if (mysqli_num_rows($result) > 0) {
            $op = mysqli_fetch_assoc($result);
            echo "Found an operation todo #" . $op['id'] . "\n";

            // Set operation as executed
            setDaemonOperationAsDone($op['id'], 'executed');

            // Verifies if operation is happening on memcached
            if (get_memcached_operation($memcached, $op['id'])) {
                echo "\tOperation id already executing...\n";
                continue;
            } else {
                // Set operation on memcached
                set_memcached_operation($memcached, $op['id']);
            }

            // ... do our stuff
        }
    }
How is this kind of problem usually solved?
I searched the internet and found a library called Gearman, but I'm not convinced that it will solve my problem when we have multiple servers.
Another idea was to assign each operation to a specific daemon at insertion time, and to create a dedicated failsafe daemon that runs the operations assigned to daemons that are out of service.
Any ideas?
Thanks.
2 Answers
You have a typical concurrency problem.
The way to solve this is to use transactions and locks, in particular SELECT ... FOR UPDATE. It'll go like this:
Each worker starts a transaction (START TRANSACTION) and tries to acquire an exclusive lock with SELECT * FROM jobs [...] FOR UPDATE.
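As a rough illustration of the locking approach, here is a minimal sketch. It assumes a MySQL/InnoDB table named jobs with an id column and an executed flag (0 = pending, 1 = taken), and a PDO connection configured to throw exceptions. The function name claimJobWithLock and the exact query are hypothetical, since the original query is elided above.

```php
<?php
// Sketch only: assumes MySQL/InnoDB, a hypothetical `jobs` table with
// `id` and `executed` columns, and PDO in exception error mode.

/**
 * Claims one pending job inside a transaction.
 * Returns the job row, or null when nothing is pending.
 */
function claimJobWithLock(PDO $pdo): ?array
{
    $pdo->beginTransaction(); // START TRANSACTION
    try {
        // Lock the first pending row. Another worker running the same
        // query waits here until this transaction commits or rolls back.
        $stmt = $pdo->query(
            'SELECT * FROM jobs WHERE executed = 0 ORDER BY id LIMIT 1 FOR UPDATE'
        );
        $job = $stmt->fetch(PDO::FETCH_ASSOC);

        if ($job === false) {
            $pdo->commit(); // nothing pending, end the transaction
            return null;
        }

        // Flag the row as taken while the lock is still held.
        $update = $pdo->prepare('UPDATE jobs SET executed = 1 WHERE id = ?');
        $update->execute([$job['id']]);

        $pdo->commit(); // releases the row lock
        return $job;
    } catch (Throwable $e) {
        $pdo->rollBack(); // leave the row untouched on any failure
        throw $e;
    }
}
```

The key point is that the check (is the job pending?) and the update (mark it taken) happen while the row is locked, so a second worker cannot read the stale flag in between. The job itself should run after the commit, so that the lock is held only briefly.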
EDIT: Specific comment about your PHP code:
An alternative solution to using locks and transactions, assuming each worker has an id.
In your loop run:
The update is a single atomic operation, and you only set worker_id if it is not yet set, so there are no race conditions to worry about. Setting the worker_id makes it clear who owns the operation. The update will assign at most one operation because of the LIMIT 1.
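The statement itself did not survive in this answer, so the following is only a sketch of what such a claim could look like, not the answerer's exact code. It assumes MySQL, a hypothetical operations table with id, worker_id (NULL while unclaimed) and executed columns, and a PDO connection in exception error mode. The function name claimNextOperation is made up for illustration.

```php
<?php
// Sketch only: assumes MySQL (single-table UPDATE with ORDER BY / LIMIT)
// and a hypothetical `operations` table with `id`, `worker_id`, `executed`.

/**
 * Tries to claim one unowned operation for the given worker.
 * Returns the claimed row, or null when nothing was available.
 */
function claimNextOperation(PDO $pdo, string $workerId): ?array
{
    // One atomic statement: only a row whose worker_id is still NULL can
    // be taken, and LIMIT 1 keeps the worker from claiming more than one.
    $claim = $pdo->prepare(
        'UPDATE operations SET worker_id = ?
         WHERE worker_id IS NULL AND executed = 0
         ORDER BY id LIMIT 1'
    );
    $claim->execute([$workerId]);

    if ($claim->rowCount() === 0) {
        return null; // another worker got there first, or the queue is empty
    }

    // Fetch the row this worker now owns and has not finished yet.
    $select = $pdo->prepare(
        'SELECT * FROM operations
         WHERE worker_id = ? AND executed = 0
         ORDER BY id LIMIT 1'
    );
    $select->execute([$workerId]);
    $row = $select->fetch(PDO::FETCH_ASSOC);

    return $row === false ? null : $row;
}
```

In the daemon loop, a worker would call claimNextOperation() with its own id, run the operation when a row comes back, set executed to 1 afterwards, and sleep for a few seconds when the result is null. Because ownership lives in the database, this also works when the workers are spread over several servers.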