Bug - Realtime cache collision #2589

ADBond · 2025-01-20T12:22:17Z

Test runs in tests/test_realtime.py occasionally failed, particularly in CI, and the failures would generally go away on re-running.

The issue was that we used id(settings) as a cache key. In CPython this is simply the address of the object in memory, and it is only guaranteed to be unique among objects that are alive at the same time. When a new object was allocated at the memory location of one that had been garbage-collected, we got a cache collision, resulting in the wrong SQL being returned.
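To illustrate the failure mode, here is a minimal sketch of a cache keyed only on id. The `Settings` class, `get_sql_naive` function and the placeholder SQL strings are hypothetical stand-ins, not the library's code. Whether the id is actually recycled on a given run depends on the interpreter and its allocator, so the sketch reports it rather than assuming it.

```python
class Settings:
    """Hypothetical stand-in for a settings object (not the library's class)."""

    def __init__(self, name):
        self.name = name


# Maps id(settings) -> SQL string. Nothing ties an entry to the object's lifetime.
_sql_cache = {}


def get_sql_naive(settings):
    """Return cached SQL keyed on id(settings) alone: vulnerable to recycled ids."""
    key = id(settings)
    if key not in _sql_cache:
        _sql_cache[key] = f"SELECT '{settings.name}'"
    return _sql_cache[key]


first = Settings("first")
first_id = id(first)
get_sql_naive(first)
del first  # in CPython the object is freed here, so its address may be reused

second = Settings("second")
id_was_reused = id(second) == first_id
sql = get_sql_naive(second)

# If the id was recycled, `sql` is the stale entry that was built for "first".
print(f"id reused: {id_was_reused}, sql returned: {sql}")
```

When the id is reused, the lookup for `second` silently returns the SQL generated for `first`, which matches the intermittent behaviour described above.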

We now build the cache key, and check whether we already have some SQL cached, slightly differently depending on the type of object:

For pathlib.Path or str (representing saved models), we just use the string itself.
For dict, we use a JSON dump (possibly after converting values to serialisable types).
For SettingsCreator, we still use the id, but we also store a weak reference to the object in the cache. The weak reference won't prevent the object being garbage-collected, but it lets us see whether the object is still alive. If we find an apparent cache hit, we check the associated weakref; if it is dead (so the id value has been recycled), we delete the entry and return as though there was no hit.
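The three cases above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the code merged in this PR: `SQLCache`, its method names, the stand-in `SettingsCreator` class, the tagged-tuple keys, and the use of `sort_keys=True` and `default=str` for the JSON dump are all choices made for the example.

```python
import json
import weakref
from pathlib import Path


class SettingsCreator:
    """Hypothetical stand-in; the real class lives in the library."""

    def __init__(self, name):
        self.name = name


class SQLCache:
    """Sketch of a cache whose key strategy depends on the settings type."""

    def __init__(self):
        # key -> (sql, weakref to the settings object, or None for str/Path/dict)
        self._entries = {}

    @staticmethod
    def _make_key(settings):
        if isinstance(settings, (str, Path)):
            return ("path", str(settings))
        if isinstance(settings, dict):
            # default=str is an assumed way of coping with non-serialisable values
            return ("dict", json.dumps(settings, sort_keys=True, default=str))
        return ("id", id(settings))

    def set(self, settings, sql):
        key = self._make_key(settings)
        # Only id-keyed entries need a liveness check, so only they get a weakref.
        ref = weakref.ref(settings) if key[0] == "id" else None
        self._entries[key] = (sql, ref)

    def get(self, settings):
        key = self._make_key(settings)
        entry = self._entries.get(key)
        if entry is None:
            return None
        sql, ref = entry
        if ref is not None and ref() is None:
            # The original object is dead, so this id has been recycled:
            # drop the stale entry and report a miss.
            del self._entries[key]
            return None
        return sql


cache = SQLCache()
cache.set({"a": 1, "b": 2}, "SELECT 1")
cache.set(Path("model.json"), "SELECT 2")

creator = SettingsCreator("first")
cache.set(creator, "SELECT 3")
hit_while_alive = cache.get(creator)

del creator  # the weakref held by the cache is now dead (immediately so in CPython)
replacement = SettingsCreator("second")
result_after_death = cache.get(replacement)  # a miss, whether or not the id was reused
```

Tagging each key with its kind is one way to stop a string path from colliding with a JSON dump that happens to have the same text.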

The custom solutions for str and dict are mainly because objects of these types cannot be (directly) weak-referenced.
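A quick check of that limitation, using only the standard library. The helper `can_weakref` and the two small classes are names made up for this example.

```python
import weakref


class SettingsLike:
    """Hypothetical plain class; ordinary instances support weak references."""


class WeakableDict(dict):
    """A dict subclass gains weakref support, hence '(directly)' above."""


def can_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


print(can_weakref("model.json"))         # False
print(can_weakref({"k": 1}))             # False
print(can_weakref(SettingsLike()))       # True
print(can_weakref(WeakableDict(k=1)))    # True
```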

The test runs in this PR show the weakref check being triggered and fixing the issue (demonstrated with a hack that fails the tests on a separate condition, which occurs only on that logic branch).

RobinL

Looks good to me, thanks!

ADBond added 8 commits January 20, 2025 11:51

naive WeakKeyDictionary for cache

3a9ce77

delete cache entries when objects die

d445b6d

store a weak reference to settings, and clear the cache entry if we try to access but object is dead

handle cache key differently depending on object type

3d5cafc

handle sql caching of dictionary settings

e71ee80

make comparison dict method fully dicty

d3c7893

and also ensure we therefore serialise the relevant data

realtime test caching for dict settings

ebdb8b5

format and fix logic

07b65ee

fixup typing

89560d3

ADBond added the bug and caching labels Jan 20, 2025

ADBond requested a review from RobinL January 20, 2025 13:24

RobinL approved these changes Jan 20, 2025

ADBond merged commit 851b73e into moj-analytical-services:master Jan 20, 2025
28 checks passed