Dealing with race condition when de-queuing #7
This comment is a summarisation of the discussion between me & @amenowanna.

JLo (@amenowanna) suggested going with an approach based on PostgreSQL's Advisory Locks. Within this approach, we'd start off a new transaction with the default (`READ COMMITTED`) isolation level and take an advisory lock before de-queuing. This ensures that all the calculations and updates we do stay consistent throughout the dequeuing operation, allowing us to maintain the concurrency guarantees. It also requires fewer locks, compared to both the explicit `FOR UPDATE` approach and `SERIALIZABLE`'s predicate locking.

The main downside though is that it'd block all dequeue operations, even if the dequeues operate on different queues, in which case they (ideally) shouldn't block each other. One way we found around this was by giving each queue its own advisory lock, though that brings back the potential for a deadlock.

Because of the above mentioned limitations (global locking & potential for a deadlock), we started looking into other isolation modes. We determined that `REPEATABLE READ` would catch the conflict and allow us to retry. On the other hand, `SERIALIZABLE` provides the same guarantees while leaving all the locking to the database.

So, for now, we'll go with `SERIALIZABLE`.
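(A sketch of the per-queue variant, for reference — keying the lock with `hashtext` on the queue name is an assumption, not something settled in the discussion:)

```sql
BEGIN;  -- default READ COMMITTED isolation

-- Transaction-scoped advisory lock keyed on the queue's name; it is released
-- automatically at COMMIT/ROLLBACK. Workers de-queuing from the same queue
-- serialise here; workers on other queues proceed unblocked.
SELECT pg_advisory_xact_lock(hashtext('a'));

-- ... count running jobs for 'a', pick a pending job, mark it 'running' ...

COMMIT;
```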
@amenowanna I tried a use-case with additional filters applied when picking jobs, like a filter on […]. This, however, does not happen with […]. Any thoughts?
My gut says a global mutex with the advisory lock will perform better than serializable. Can we run some load testing to find out?
Moving this to a lower-priority ready item as we have an interim fix (using `SERIALIZABLE`).
Spending the time to research the performance difference is not as important right now as having a solution. We know both solutions work. Let's move forward with `SERIALIZABLE` as the current solution and hold on to this ticket to test the performance of the two different options.
@riyaz-ali we seem to also have a race condition when enqueueing. We are seeing this now that we are enabling multiple workers to run at the same time. Thoughts on how we could address this?
There's not a defined way to handle task de-duplication when calling `Enqueue`. Ideally, this would be something the caller would need to ensure on their end somehow. Other platforms (Celery etc.) also behave similarly, deferring this to the caller.
Totally agree with this. Enqueue is a consumer problem. I will open an issue in the mergestat repo and we can address it there. But want to have this conversation here for the community to see the discussion.
The following write-up is an attempt to document a race condition we ran into when testing out the schema, the solutions we explored, and the trade-offs we made.
Below is a snapshot of the schema we were dealing with:
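A minimal sketch of such a schema, reconstructed from the columns referenced below (names and types beyond `queues.concurrency` and `jobs.status` are assumptions):

```sql
CREATE SCHEMA IF NOT EXISTS sqlq;

-- A queue, with an optional cap on concurrently running jobs.
CREATE TABLE sqlq.queues (
    name        TEXT PRIMARY KEY,
    concurrency INTEGER            -- NULL means unlimited
);

-- A job belongs to a queue and moves through states such as
-- 'pending' -> 'running' -> 'success' / 'errored'.
CREATE TABLE sqlq.jobs (
    id     BIGSERIAL PRIMARY KEY,
    queue  TEXT NOT NULL REFERENCES sqlq.queues (name),
    status TEXT NOT NULL DEFAULT 'pending'
);
```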
The Race Condition
As is evident from the schema above, we have a concept of "queue concurrency". A queue can be assigned a concurrency, and the engine would ensure that, at any given point in time, there are at-most
queues.concurrency
jobs running for the queue across all coordinating workers.The issue arises when there are two workers that try to concurrently de-queue jobs from a given queue. Note that the problem we are dealing with is "workers exceeding a queue's concurrency limit" and not "workers pulling in same job to process" (the later can be solved using the
FOR UPDATE SKIP LOCKED
approach).To de-queue a task, we use a query that resembles the following:
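(A sketch matching the shape described below — a CTE plus `FOR UPDATE` on `sqlq.queues`, a count of running jobs, and `FOR UPDATE SKIP LOCKED` on `sqlq.jobs`; the exact predicates and the queue name `'a'` are assumptions:)

```sql
WITH q AS (
    SELECT name, concurrency
      FROM sqlq.queues
     WHERE name = 'a'
       FOR UPDATE                         -- serialise dequeues on this queue
), running AS (
    SELECT queue, COUNT(*) AS count
      FROM sqlq.jobs
     WHERE status = 'running'
     GROUP BY queue
)
UPDATE sqlq.jobs
   SET status = 'running'
 WHERE id IN (
        SELECT job.id
          FROM sqlq.jobs AS job
          JOIN q ON q.name = job.queue
          LEFT JOIN running ON running.queue = job.queue
         WHERE job.status = 'pending'
           AND COALESCE(running.count, 0)
               < COALESCE(q.concurrency, 2147483647)
         LIMIT 1
           FOR UPDATE OF job SKIP LOCKED  -- don't hand one job to two workers
       )
RETURNING *;
```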
The `FOR UPDATE` lock on `sqlq.queues` is there to ensure that no one else de-queues from the same queue as us.

So where does it fail?
Let's assume that there are 2 workers and both of them open 1 transaction each, say, `TX#1` and `TX#2`, respectively.

Now, in `TX#1`, we run the above query. It acquires a lock on `sqlq.queues`, checks for running tasks in the queue, picks one and updates its status to `running`. It then `COMMIT`s the transaction and continues with processing the job.

Concurrently, in `TX#2`, when we try to acquire a lock on `sqlq.queues`, we have to wait until `TX#1` releases its lock. After `TX#1` commits (or rolls back, for that matter), we get the lock. But when we try to calculate the number of running tasks, we DO NOT see the update made by `TX#1` (where it de-queued and started execution of a task).

This becomes a problem because if queue `a` had a `concurrency` value set, we won't see the newest job from `a` which started execution while we were blocked. And because of that we might end up de-queueing more jobs from `a` than allowed by its `concurrency` setting.

What can we do?
The problem here essentially boils down to dealing with a transaction's isolation / snapshot and its serialisability. If our transactions (and by extension the workers) can see changes made by other workers while they were blocked, it'd solve this problem.
The problem is captured pretty concisely by this StackOverflow post.
Let's see how the different Transaction Isolation levels would work in this scenario.
⭐ Read Committed
`READ COMMITTED` is the default isolation level for PostgreSQL. It'd seem like this level should be sufficient for us, as we are interested in fetching data from transactions that have recently committed. But when we run the above tests, it still fails as we are not able to compute the correct value for the number of running tasks. But why?

Quoting from the documentation,

> When a transaction uses this isolation level, a SELECT query (without a FOR UPDATE/SHARE clause) sees only data committed before the query began; it never sees either uncommitted data or changes committed during query execution by concurrent transactions.
In our case, the query execution starts with the CTE and then blocks waiting for the lock. When it acquires the lock, it resumes execution (with the `SELECT COUNT(*) ... GROUP BY` query that doesn't have a `FOR UPDATE`) but doesn't see the new changes that were applied. This is in line with what the documentation says.

One way to work around this is to split the query and run the `SELECT ... FOR UPDATE` on `sqlq.queues` as a separate query within the same transaction. This way, after we've acquired the lock, when we execute the main query it'd see committed data from any other transactions as well.
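(A sketch of the split variant; the queue name `'a'` and the predicates are illustrative:)

```sql
BEGIN;  -- default READ COMMITTED

-- 1. Take the queue lock as a standalone statement; we may block here until
--    a concurrent dequeue commits.
SELECT name FROM sqlq.queues WHERE name = 'a' FOR UPDATE;

-- 2. This statement begins *after* the lock was granted, so, under
--    READ COMMITTED, its snapshot includes whatever the other transaction
--    committed while we were blocked.
UPDATE sqlq.jobs
   SET status = 'running'
 WHERE id = (
        SELECT j.id
          FROM sqlq.jobs AS j
         WHERE j.queue  = 'a'
           AND j.status = 'pending'
           AND (SELECT COUNT(*) FROM sqlq.jobs
                 WHERE queue = 'a' AND status = 'running')
             < (SELECT COALESCE(concurrency, 2147483647)
                  FROM sqlq.queues WHERE name = 'a')
         LIMIT 1
           FOR UPDATE OF j SKIP LOCKED
       )
RETURNING *;

COMMIT;
```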
⭐ Repeatable Read
Quoting from the documentation,

> The Repeatable Read isolation level only sees data committed before the transaction began; it never sees either uncommitted data or changes committed during transaction execution by concurrent transactions.
At first, this doesn't seem particularly helpful in our case as we'd actually want to see changes made by any concurrent transaction, to make our dequeue logic work.
But if we drop the custom, explicit locks (`FOR UPDATE` on `sqlq.queues` and `FOR UPDATE SKIP LOCKED` on `sqlq.jobs`), a concurrent `REPEATABLE READ` transaction would block at the `UPDATE sqlq.jobs SET status = 'running'` stage, and the operation would fail with a serialisation anomaly. This is good, as our application can now retry dequeuing!

This approach is more seamless than having to run multiple queries and manage locks ourselves.
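To illustrate the failure (a hypothetical two-session timeline; the job id is made up, the error is PostgreSQL's standard serialisation-failure message, SQLSTATE `40001`):

```sql
-- Worker 1:
BEGIN ISOLATION LEVEL REPEATABLE READ;
UPDATE sqlq.jobs SET status = 'running' WHERE id = 42;

-- Worker 2, concurrently (no SKIP LOCKED, so it picks the same job):
BEGIN ISOLATION LEVEL REPEATABLE READ;
UPDATE sqlq.jobs SET status = 'running' WHERE id = 42;  -- blocks on worker 1

-- Worker 1:
COMMIT;

-- Worker 2 now fails instead of silently proceeding on a stale snapshot:
--   ERROR: could not serialize access due to concurrent update
ROLLBACK;  -- worker 2 retries the whole dequeue
```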
⭐ Serializable
`SERIALIZABLE` is the strictest of all. The actual behaviour is very well covered in the official documentation.

In our case, what `SERIALIZABLE` would allow us to do is emulate a serial execution of the dequeue logic, even in the presence of multiple concurrent workers. If two workers try to dequeue from the same queue, or try to dequeue the same job, the transaction will fail with a serialisation anomaly error and will be retried.
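For instance, two workers that each count the running jobs for a queue and then both start a new one form a read/write dependency cycle, which PostgreSQL detects (a hypothetical timeline; job ids are made up):

```sql
-- Worker 1:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT COUNT(*) FROM sqlq.jobs WHERE queue = 'a' AND status = 'running';
UPDATE sqlq.jobs SET status = 'running' WHERE id = 1;
COMMIT;

-- Worker 2, interleaved with worker 1:
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT COUNT(*) FROM sqlq.jobs WHERE queue = 'a' AND status = 'running';
UPDATE sqlq.jobs SET status = 'running' WHERE id = 2;
COMMIT;
-- ERROR: could not serialize access due to read/write dependencies
--        among transactions
-- (SQLSTATE 40001) — the aborted worker retries and now sees the other's job.
```

Either worker may be the one aborted; on retry it observes the other's committed job and so respects the concurrency cap.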
SERIALIZABLE
we needn't worry about that anymore as the database would provide the consistency guarantees that we are trying to get with the explicit locks.SERIALIZABLE
in postgres is implemented using predicate locking and non-blocking SI locks, which the official documentation claims to be more performant than the equivalentFOR UPDATE
strategy.In our case, both
REPEATABLE READ
andSERIALIZABLE
provide the same level of guarantees we require, and ideally we should benchmark both solutions. Switching between isolation level won't functionally change anything in the implementation either, as in both the cases you run the same query for deqeueing. ForSERIALIZABLE
, we can maybe further optimise the query by having better, targeted predicates.The text was updated successfully, but these errors were encountered: