Race-condition in creation+release of RedLock #979

Open · bodograumann opened this issue Apr 7, 2025 · 13 comments

@bodograumann

After switching from the memory cache to Redis (Valkey), we started seeing errors in the RedLock context-manager release:

KeyError: '0506da2cf155316dd99dc58e1e705dc4d588ee0b7e28b9ee49ca2d1a51c88fe6a39681521397a5ee5fd577fc8e13e077-lock'
Traceback (most recent call last):
[…]
    async with RedLock(cache, key, lease=settings.aiohttp_timeout):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/.venv/lib/python3.12/site-packages/aiocache/lock.py", line 91, in __aexit__
    await self._release()
  File "/app/.venv/lib/python3.12/site-packages/aiocache/lock.py", line 96, in _release
    RedLock._EVENTS.pop(self.key).set()
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I have looked at the code and found the following race condition.
Assume there are two parallel calls A and B, where A blocks B, and a third call C is triggered just as A and B resolve. The normal flow is that A acquires the lock, B waits on the event, and A releases, letting B continue.

Up to this point everything is fine, but then the new call C comes in before B has finished:

  • C acquires the lock and adds the lock key back to Valkey, but not yet to _EVENTS
    await self.client._add(self.key, self._value, ttl=self.lease)
  • B removes the lock key from Valkey
    removed = await self.client._redlock_release(self.key, self._value)

This is already wrong, because the lock key now belongs to the run of C, not to A or B.
On the other hand, C will only retrieve the value cached by A, so it is not critical that the locking works correctly at this point.

  • B tries to also remove the event from _EVENTS, but it was not yet created by C:
    if removed:
        RedLock._EVENTS.pop(self.key).set()

This (probably) leads to the exception we are seeing.
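
For reference, here is a condensed paraphrase of the acquire/release flow in lock.py, reconstructed from the lines quoted above (the exact code may differ between versions), with the two steps involved in the race marked:

    class RedLock:
        _EVENTS = {}  # class-level dict shared by all locks, keyed by lock key

        async def _acquire(self):
            self._value = str(uuid.uuid4())  # unique value per lock instance
            try:
                # add the lock key to Redis/Valkey ...
                await self.client._add(self.key, self._value, ttl=self.lease)
                # ... and only afterwards register the local event  <- race window
                RedLock._EVENTS[self.key] = asyncio.Event()
            except ValueError:
                await self._wait_for_release()

        async def _release(self):
            # delete the key only if it still holds this lock's unique value
            removed = await self.client._redlock_release(self.key, self._value)
            if removed:
                # pop whatever event is currently registered for the key  <- race window
                RedLock._EVENTS.pop(self.key).set()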

We are not using the lock as a real synchronization mechanism, but to reduce redundant calculations, so it is fine if the locking does not work perfectly. It just should not throw these kinds of exceptions.
So I would suggest always running RedLock._EVENTS.pop(self.key) on release (no matter whether the lock key was found in Valkey or not), without failing on errors:

    async def _release(self):
        await self.client._redlock_release(self.key, self._value)
        # suppress comes from contextlib: from contextlib import suppress
        with suppress(KeyError):
            RedLock._EVENTS.pop(self.key).set()
@Dreamsorcerer
Member

B removes the lock key from Valkey
This is already wrong, because the lock key now belongs to the run of C, not to A or B.

I don't think it does. That looks like it only removes the lock if it matches the unique value of the lock.

B tries to also remove the event from _EVENTS, but it was not yet created by C:

From what I can see, removed should only be True if the value in valkey matches the lock's unique value. In your description, B never sets anything in valkey, so it should always be False.
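
For context, this matches the single-instance unlock pattern from the Redis distlock docs (https://redis.io/topics/distlock): the key is deleted only if it still holds this lock's unique value, so another lock's release is a no-op. A sketch with a generic redis client (this is not aiocache's exact internal call):

    UNLOCK_SCRIPT = """
    if redis.call("get", KEYS[1]) == ARGV[1] then
        return redis.call("del", KEYS[1])
    else
        return 0
    end
    """

    # e.g. with redis-py asyncio (assumed client, not aiocache internals):
    # removed = await redis.eval(UNLOCK_SCRIPT, 1, key, unique_value)
    # "removed" is 1 only when the stored value matched this lock's value.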

@Dreamsorcerer
Member

Dreamsorcerer commented Apr 7, 2025

If you can reproduce this reasonably reliably, then try adding some logs.
For example, add logs at the start of _acquire(), _wait_for_release() and _release(), with self._value included in each log. Then we can see what the exact sequence of events is.
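
A quick way to add these logs without editing the installed package (a sketch, assuming only the method names mentioned above):

    import logging

    from aiocache.lock import RedLock

    logger = logging.getLogger("redlock-debug")

    def _instrument(name):
        original = getattr(RedLock, name)

        async def wrapper(self, *args, **kwargs):
            # log the method name, lock key and unique value at entry
            logger.debug("%s: key=%s value=%s", name, self.key, self._value)
            return await original(self, *args, **kwargs)

        setattr(RedLock, name, wrapper)

    for method in ("_acquire", "_wait_for_release", "_release"):
        _instrument(method)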

@Dreamsorcerer
Member

The only thing I can think of at the moment is that one lock could delete the key from redis while the other lock adds its value to redis (i.e. release and acquire).
Due to networking delays, maybe it's possible for the second lock that acquired to resume executing first and therefore set _EVENTS again.
Then the first lock resumes and deletes _EVENTS.
Then the second lock releases successfully, but finds _EVENTS is no longer set, thus triggering the exception.

If this is the case, then another lock starting up at this time would error if it tried to acquire, as it'd fail to fetch an Event to wait on.

@bodograumann
Author

bodograumann commented Apr 8, 2025

Thanks for the quick response.
You are right. I did not consider the UUID value, which prevents any other lock from removing the entry from redis.
So yeah, a race condition can only occur between two different locks that both set the lock key in redis.

Not sure if this notation is clear, but I'll just try to write down the different situations (X+L / X-L means X adds / removes the lock key in redis; X+E / X-E means X adds / removes its entry in _EVENTS). Ideally it would look like this:

A+L, A+E, A-L, A-E, B+L, B+E, B-L, B-E

We know all the steps belonging to A are executed in the correct order and for B likewise.
Also B+L must occur after A-L. However there are no guarantees of where A-E occurs, as you also suggested.
So it could be:

  • A+L, A+E, A-L, B+L, A-E, B+E, B-L, B-E
    Which would still be fine I think.
  • A+L, A+E, A-L, B+L, B+E, A-E, B-L, B-E or A+L, A+E, A-L, B+L, B+E, B-L, A-E, B-E
    Which would break, because in the last step the event was already deleted by A.
  • A+L, A+E, A-L, B+L, B+E, B-L, B-E, A-E
    Which would break just the same, but in A instead of B.

Looking at the code, I think there are more issues in the RedLock implementation, so I think we'll go a completely different route.

  • _EVENTS is global over the instance and just differentiates by the lock key, not taking into account different caches that might have to be handled separately. (Could be worked around using namespaces I guess)
  • When the lock cannot be acquired we try to wait for the event. However Redis is distributed over all instances, while _EVENTS only works for a single instance. That means if one instance has acquired the lock, all others will fail with a KeyError in _wait_for_release and just run anyway without locking.

So I feel like _EVENTS should not be used at all and instead we should wait for the lock to be released in Redis.
In Redis it is possible to subscribe to key events (keyspace notifications): https://redis.io/docs/latest/develop/use/keyspace-notifications/, but that has to be enabled explicitly.
So I guess polling is the most robust solution; a rough sketch is below.
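
(A minimal polling sketch, assuming the public cache.exists() API; the helper name and parameters are made up.)

    import asyncio

    async def wait_for_lock_release(cache, lock_key, lease, poll_interval=0.1):
        # Poll until the lock key disappears from Redis or the lease runs out
        # (the key's TTL should clear it even if the holder crashed).
        loop = asyncio.get_running_loop()
        deadline = loop.time() + lease
        while loop.time() < deadline:
            if not await cache.exists(lock_key):
                return True
            await asyncio.sleep(poll_interval)
        return False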

@Dreamsorcerer
Member

A+L, A+E, A-L, B+L, B+E, B-L, B-E, A-E

I don't think you'd see this one happening; that would require the packets indicating a successful release getting lost for a long time while B continued its work and somehow managed to release the lock and get a response before those packets arrived. Maybe possible theoretically, but I can't imagine it ever being observed.

  • Could be worked around using namespaces

The key is namespaced.

  • while _EVENTS only works for a single instance

Is this not what the description refers to?

with a single instance because aiocache is focused on single
instance cache.

@Dreamsorcerer
Member

Again, if you could add some logs and confirm exactly what is happening, then we can come up with a solution.

@Dreamsorcerer
Member

According to glide maintainers, there's no point to a connection pool for async (#975 (comment)). So, if the described race condition is currently happening, it should not be possible after switching to glide.

@bodograumann
Author

The key is namespaced.

True, but by default a cache has no namespace. So in order for this to work correctly, each cache would have to be given a unique namespace. This is not apparent to the user I think.

  • while _EVENTS only works for a single instance

Is this not what the description refers to?

aiocache/aiocache/lock.py, lines 11 to 12 in 58301fb:

 with a single instance because aiocache is focused on single 
 instance cache.

No, the description refers to a single instance of redis in contrast to a redis cluster.
I am talking about multiple instances/processes of our application using the same cache.

Again, if you could add some logs and confirm exactly what is happening, then we can come up with a solution.

Unfortunately the error occurs only every so often and can only be reproduced in production environments.
I'm not sure I can find the time to do more debugging work here.

According to glide maintainers, there's no point to a connection pool for async (#975 (comment)). So, if the described race condition is currently happening, it should not be possible after switching to glide.

I don't really understand why that should be the case. Afaik it is not possible to rely on the order of different coroutine executions with asyncio.

@Dreamsorcerer
Member

Dreamsorcerer commented Apr 8, 2025

True, but by default a cache has no namespace. So in order for this to work correctly, each cache would have to be given a unique namespace. This is not apparent to the user I think.

I'm still not clear what the concern is here.

No, the description refers to a single instance of redis in contrast to a redis cluster.

That code has nothing to do with Redis though... The text surely applies if you're using a memory backend too (not that I wrote it, so I'm only guessing at the intention). It would probably be useful if it did support multiple instances of an application though, so if you have any improvements, that'd be good to get implemented.

I don't really understand why that should be the case. Afaik it is not possible to rely on the order of different coroutine executions with asyncio.

Both are waiting on a response from Redis; if there's only one connection, then the first one to be executed will always be the first one to receive a response and get scheduled back into the loop, thus guaranteeing our order of execution.

The only reason this might be the cause is if there's a connection pool and the release happens on one connection and the acquire then happens on another connection, but the packets for the first connection end up taking longer to get back to the client, resulting in the second task getting scheduled back into the loop first.

@bodograumann
Author

True, but by default a cache has no namespace. So in order for this to work correctly, each cache would have to be given a unique namespace. This is not apparent to the user I think.

I'm still not clear what the concern is here.

If I just naively instantiate two independent caches and try to create RedLocks for them, both locks will share the same _EVENTS dictionary and the keys will overlap, even though they should be independent. Roughly like this:
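
(A minimal illustration, assuming the Cache factory API and default, empty namespaces; the key names are made up.)

    from aiocache import Cache
    from aiocache.lock import RedLock

    # Two caches that are meant to be independent, but use the same key.
    cache_users = Cache(Cache.MEMORY)
    cache_reports = Cache(Cache.MEMORY)

    lock_u = RedLock(cache_users, "item-42", lease=10)
    lock_r = RedLock(cache_reports, "item-42", lease=10)

    # Without distinct namespaces, both locks end up with the same lock key
    # and therefore share the class-level RedLock._EVENTS dict entry.
    assert lock_u.key == lock_r.key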

No, the description refers to a single instance of redis in contrast to a redis cluster.

That code has nothing to do with Redis though... The text surely applies if you're using a memory backend too (not that I wrote it, so I'm only guessing at the intention).

The code does not directly, but the term Red-Lock comes from Red-is and the link that is referenced goes to a Redis docs page: https://redis.io/topics/distlock
On that page there are two parts:

  • RedLock for a single instance of Redis and
  • RedLock for a Redis cluster.

The RedLock implementation here only implements the first part. That's why I am pretty sure the explanation refers to instances of Redis.

I don't really understand why that should be the case. Afaik it is not possible to rely on the order of different coroutine executions with asyncio.

Both are waiting on a response from Redis, if there's only 1 connection then the first one to be executed will always be the first one to receive a response and get scheduled back into the loop, thus guaranteeing our order of execution.

The only reason this might be the cause is if there's a connection pool and the release happens on one connection and the acquire then happens on another connection, but the packets for the first connection end up taking longer to get back to the client, resulting in the second task getting scheduled back into the loop first.

Maybe, but that is again making assumptions about the order of execution in asyncio of different coroutine calls.
I don't think the stance "this probably can't happen in practice" is a good approach for a generic library like this one.

A test where the cache is mocked should be able to reproduce the race condition in RedLock (considered independently of any actual cache implementation) by controlling the order of execution of the asyncio coroutines; a rough sketch of such a test is below.
Would that be acceptable?
I am aware that we cannot be completely certain that this race condition is actually the one we observed in practice, but I think fixing what we find is the best we can do here.
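
(A very rough sketch, using pytest-asyncio; the mocked attribute names _add and _redlock_release come from the snippets above, everything else is an assumption and RedLock internals may differ between versions.)

    import asyncio
    from unittest import mock

    import pytest

    from aiocache.lock import RedLock


    @pytest.mark.asyncio
    async def test_release_and_acquire_reordered():
        # Force B's acquire to be processed before A's release response returns,
        # so B overwrites A's _EVENTS entry, A then pops B's entry, and B's own
        # release hits the KeyError.
        release_may_return = asyncio.Event()

        client = mock.MagicMock()  # auto-stubs whatever key building RedLock does

        async def _add(key, value, ttl=None):
            return True  # pretend the lock key was set successfully

        async def _redlock_release(key, value):
            await release_may_return.wait()  # hold back the "release" response
            return 1  # pretend our value matched and the key was deleted

        client._add = _add
        client._redlock_release = _redlock_release

        lock_a = RedLock(client, "key", lease=5)
        lock_b = RedLock(client, "key", lease=5)

        await lock_a.__aenter__()                                         # A+L, A+E
        task_a = asyncio.create_task(lock_a.__aexit__(None, None, None))  # A-L ...
        await asyncio.sleep(0)                   # A is now waiting on its release
        await lock_b.__aenter__()                # B+L, B+E (overwrites A's event)
        release_may_return.set()                 # A's release response "arrives"
        await task_a                             # ... A-E pops B's event
        with pytest.raises(KeyError):
            await lock_b.__aexit__(None, None, None)   # B-E -> KeyError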

@bodograumann
Author

It would probably be useful if it did support multiple instances of an application though, so if you have any improvements, that'd be good to get implemented.

I think getting rid of _EVENTS would make sense.

There is one more thing I changed in our implementation, though, which is more difficult to put into RedLock as a context manager:
it is not actually necessary to acquire the lock to read the cache entry; roughly, it looks like the sketch below.
I'll think about what can be done here.
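
(A rough sketch of that pattern; the helper name is made up, while cache.get/set and RedLock are the normal aiocache calls.)

    from aiocache.lock import RedLock

    async def get_or_compute(cache, key, compute, lease=30):
        # fast path: reading the cache never needs the lock
        value = await cache.get(key)
        if value is not None:
            return value
        # slow path: deduplicate the computation behind the lock
        async with RedLock(cache, key, lease=lease):
            value = await cache.get(key)  # may have been filled meanwhile
            if value is None:
                value = await compute()
                await cache.set(key, value)
        return value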

@Dreamsorcerer
Member

If I just naively instantiate two independant caches and try to create RedLocks for them, both locks will share the same _EVENT dictionary and the keys will overlap, even though they should be independant.

If you create them with the same key, sure. Isn't that exactly the same as using independent caches to get any value in the cache? I think that's just how the entire library works.

Maybe, but that is again making assumptions about the order of execution in asyncio of different coroutine calls.
I don't think the stance "this probably can't happen in practice" is a good approach for a generic library like this one.

If it's a single connection, it can't happen. The execution order is entirely predictable if you know the variables.
If we need to support connection pools, then we could think about a fix, if this is indeed the cause of the exception.

A test where the cache is mocked should be able to reproduce the race-condition in RedLock (regarded independant of any actual cache implementation), by controlling the order of execution of the asyncio coroutines.
Would that be acceptable?

If you have a proposed solution, then sure, go ahead.

I think getting rid of _EVENTS would make sense.

I'm not clear what the proposed alternative is (I'm not the most familiar with all this), so feel free to propose any improvements.

There is one more thing I changed in our implementation though, that is more difficult to put into RedLock as a context manager: It is not actually necessary to acquire a lock to read the cache entry. I'll think about what can be done here.

I think that's a different type of lock. That'd be a RW lock, like in aiorwlock.
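
(Roughly, with aiorwlock; note this is an in-process lock, not a distributed one.)

    import aiorwlock

    rwlock = aiorwlock.RWLock()

    async def read_value():
        async with rwlock.reader_lock:   # many readers can hold this at once
            ...

    async def write_value():
        async with rwlock.writer_lock:   # writers get exclusive access
            ...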

@avifenesh

Hi, Glide maintainer here; please let me know if I can help with some info.
