fix: check current session's pending-write queue when recalling snapshots (e.g. diffing) #927

huonw · 2024-12-23T00:24:07Z

Description

This PR fixes assert ... == snapshot(diff=...) mode to work better in two scenarios:

Adding two snapshots at the same time (i.e. not already "on disk"), where the second snapshot is a diff from the first
Changing the base data of a diff'd snapshot

In particular, it seems like the nice performance optimisation of #606, meant the "recalling" used for diffing wasn't finding newly created or updated data, and instead just reading whatever snapshot is available on disk.

This PR resolves the problem by having the recalling managed via the SnapshotSession, which can thus check its write queue before reading disk.

Related Issues

Closes Diff snapshot requires two test runs for new tests and two to 'stabilise' after changes to base snapshot #926

Checklist

This PR has sufficient documentation.
This PR has sufficient test coverage.
This PR title satisfies semantic convention.

Additional Comments

No additional comments.

…hots (e.g. diffing)

codecov · 2024-12-23T00:28:16Z

Codecov Report

All modified and coverable lines are covered by tests ✅

@@            Coverage Diff             @@
##             main     #927      +/-   ##
==========================================
+ Coverage   97.75%   97.76%   +0.01%     
==========================================
  Files          21       21              
  Lines        1604     1613       +9     
==========================================
+ Hits         1568     1577       +9     
  Misses         36       36

huonw

As always, thank you for Syrupy!

huonw · 2024-12-23T00:34:18Z

src/syrupy/session.py

+            # find the last (i.e. most recent) write to this index/location in the queue:
+            for queue_data, queue_test_location, queue_index in reversed(queue):
+                if queue_index == index and queue_test_location == test_location:
+                    return queue_data


This loops is a potential performance issue, but I suspect it is okay. In particular: it scales like O(length of queue), i.e. number of pending writes.

This function is called twice in SnapshotAssertion._assert:

unconditionally:

syrupy/src/syrupy/assertion.py

Line 316 in ef8189c

snapshot_data, tainted = self._recall_data(index=self.index)

conditionally, only when diffing:

syrupy/src/syrupy/assertion.py

Line 320 in ef8189c

snapshot_data_diff, _ = self._recall_data(index=snapshot_diff)

If every assertion in a ambr file needs an update, this leads to quadratic O(number of assertions in that file^2) behaviour:

the first assert ... == snapshot has an empty queue, and adds an item

the second checks one element of the queue (doesn't match), and adds an item

the third checks two elements of the queue (don't match) and adds an item

...

the Nth checks N - 1 elements of the queue (all don't match) and adds an item

i.e. N assertions ends up iterating over $0 + 1 + 2 + ... + (N - 1) = \frac{(N - 1)(N - 2)}{2} = O(N^2)$ elements of queue.

I think this is probably okay:

this scales per snapshot location, not across the full testing session, since each queue is just for one key, which incorporates the snapshot location (i.e. if one has 1000 assertions, spread evenly across 100 test file (i.e. 10 assertions in each file), this is scaling with N = 10, not N = 1000).

how often does one have to completely re-write a whole file?

the work here is individually cheap (i.e. the constant factor is low)

There appears to be a slowdown: #927 (comment)

I ran inv benchmark locally as well. Results for this branch:

------------------------------------------------------------------------------ benchmark: 3 tests ------------------------------------------------------------------------------ Name (time in s) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- test_1000x_reads 1.0756 (1.0) 1.2278 (1.0) 1.1648 (1.0) 0.0630 (4.10) 1.1974 (1.0) 0.0948 (3.49) 2;0 0.8585 (1.0) 5 1 test_standard 1.1904 (1.11) 1.3294 (1.08) 1.2822 (1.10) 0.0534 (3.48) 1.2960 (1.08) 0.0439 (1.62) 1;1 0.7799 (0.91) 5 1 test_1000x_writes 1.3974 (1.30) 1.4278 (1.16) 1.4089 (1.21) 0.0153 (1.0) 1.3986 (1.17) 0.0272 (1.0) 1;0 0.7098 (0.83) 5 1 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

and "main":

------------------------------------------------------------------------------ benchmark: 3 tests ------------------------------------------------------------------------------ Name (time in s) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- test_1000x_reads 1.0264 (1.0) 1.1148 (1.0) 1.0855 (1.0) 0.0362 (1.0) 1.1006 (1.0) 0.0474 (1.0) 1;0 0.9212 (1.0) 5 1 test_standard 1.0904 (1.06) 1.1974 (1.07) 1.1461 (1.06) 0.0477 (1.32) 1.1414 (1.04) 0.0876 (1.85) 2;0 0.8725 (0.95) 5 1 test_1000x_writes 1.0937 (1.07) 1.2011 (1.08) 1.1406 (1.05) 0.0405 (1.12) 1.1313 (1.03) 0.0534 (1.13) 2;0 0.8767 (0.95) 5 1 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

The 1000x writes test performs 0.8767 operations per second on main but only 0.7098 operations per second with this change. Roughly 20% slowdown.

this scales per snapshot location,

This comes up when users are leveraging test parameterization over data sets. i.e. the benchmark case of parameterizing over 1000 test cases isn't too unrealistic from how I've seen syrupy used in the past.

FYI, the benchmarking is broken in CI for contributors (something to do with permissions in the github workflow I need to fix), however you should be able to run it locally with inv benchmark.

Updated. See #927 (comment)

Thanks for proving my "I think" wasn't accurate 😄

huonw · 2024-12-23T00:36:48Z

src/syrupy/assertion.py

-                self.extension.read_snapshot(
-                    test_location=self.test_location,
-                    index=index,
-                    session_id=str(id(self.session)),


I note that #606 added this session_id parameter, which I imagine was meant to help with circumstances like this, but it seems like it ends up ignored/unused in practice?

Amber doesn't use cache_key: https://github.com/tophat/syrupy/blob/ef8189c68460593fead9c484405b755e272c8cca/src/syrupy/extensions/amber/__init__.py#L57-L66

Single-file doesn't use session_id: https://github.com/tophat/syrupy/blob/ef8189c68460593fead9c484405b755e272c8cca/src/syrupy/extensions/single_file.py#L92-L103

And, in practice, I think it may be quite hard to use, since the relevant cached data is held in memory in the session itself (i.e. the extension _read_snapshot_data_from_location method calls don't really have access to it at all).

The session_id is used as the cache_key argument in Amber's __cacheable_read_snapshot method. Although it's not used in the function, it's used by the lru_cache decorator (which caches based on the kwargs I believe). So it essentially invalidates the cache

noahnu · 2024-12-26T23:55:44Z

Benchmark

Benchmark suite	Current: `dbf395a`	Previous: `d283e87`	Ratio
`benchmarks/test_1000x.py::test_1000x_reads`	`0.7313520234418077` iter/sec (`stddev: 0.06327692389499665`)	`0.8206510272360953` iter/sec (`stddev: 0.06414211339510058`)	`1.12`
`benchmarks/test_1000x.py::test_1000x_writes`	`0.5937400226479332` iter/sec (`stddev: 0.16220190601651813`)	`0.7296660677418227` iter/sec (`stddev: 0.160353318333188`)	`1.23`
`benchmarks/test_standard.py::test_standard`	`0.6769499288981089` iter/sec (`stddev: 0.05068615977797293`)	`0.758693423573993` iter/sec (`stddev: 0.06688161104621605`)	`1.12`

This comment was automatically generated by workflow using github-action-benchmark.

huonw · 2024-12-27T09:27:27Z

Thanks for the review. I'll get back to it when I'm back at work after this holiday period.

----------------------------------------------------------------------------------- benchmark: 3 tests ----------------------------------------------------------------------------------- Name (time in ms) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ test_1000x_reads 666.9710 (1.0) 748.6652 (1.0) 705.2418 (1.0) 37.2862 (1.0) 703.0552 (1.0) 70.1912 (1.07) 2;0 1.4180 (1.0) 5 1 test_standard 669.7840 (1.00) 843.3747 (1.13) 733.8905 (1.04) 68.2257 (1.83) 705.8282 (1.00) 85.6269 (1.30) 1;0 1.3626 (0.96) 5 1 test_1000x_writes 793.8229 (1.19) 937.1953 (1.25) 850.9716 (1.21) 54.4067 (1.46) 847.3260 (1.21) 65.9041 (1.0) 2;0 1.1751 (0.83) 5 1 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Name (time in ms) Min Max Mean StdDev Median IQR Outliers OPS Rounds Iterations ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- test_1000x_reads 625.5781 (1.0) 887.4346 (1.0) 694.6221 (1.0) 109.0048 (1.0) 658.3128 (1.0) 87.7517 (1.0) 1;1 1.4396 (1.0) 5 1 test_1000x_writes 637.3099 (1.02) 1,021.0924 (1.15) 812.9789 (1.17) 150.2342 (1.38) 757.7635 (1.15) 215.9572 (2.46) 2;0 1.2300 (0.85) 5 1 test_standard 694.1814 (1.11) 1,037.9224 (1.17) 845.1463 (1.22) 136.2068 (1.25) 785.6973 (1.19) 194.9636 (2.22) 2;0 1.1832 (0.82) 5 1 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

huonw · 2025-01-13T05:38:37Z

I've updated this to improve performance and address that concern.

This required making PyTestLocation hashable to be able to use it in a dictionary. This gives O(1) access to the relevant snapshot no matter how many snapshot writes are queued, rather than having to do a linear scan of a whole file's queue. The most reliable way to do this is setting @dataclass(frozen=True), but this requires some "hacks" using object.__setattr__ for initialising the no-init fields.

I've done this work across many independent commits to hopefully make review easier.

Benchmarks on my machine, reporting the mean time ± std-dev:

benchmark	before (`main`)	initial version of this PR	current version
test_1000x_reads	637.6 ± 24.8	664.9 ± 20.2	656.9 ± 30.1
test_1000x_writes	655.2 ± 33.4	810.1 ± 37.6 ‼️	675.5 ± 53.5
test_standard	691.8 ± 32.5	714.9 ± 32.5	695.0 ± 20.4

In summary, I think:

I can reproduce the large slow-down that we were concerned about in test_1000x_writes in the initial version of this PR ✅
the new version seems to get the numbers back to main ✅ (at least, seems likely to be within normal variation, based on the standard deviation there, and also when I run the benchmarks again locally)

Expand for the full details

Before (main):

----------------------------------------------------------------------------------- benchmark: 3 tests -----------------------------------------------------------------------------------
Name (time in ms)          Min                 Max                Mean             StdDev              Median                IQR            Outliers     OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_1000x_reads      609.2453 (1.0)      665.1862 (1.0)      637.6388 (1.0)      24.7847 (1.0)      642.8187 (1.0)      45.0147 (1.37)          2;0  1.5683 (1.0)           5           1
test_1000x_writes     624.3305 (1.02)     705.0878 (1.06)     655.2347 (1.03)     33.3592 (1.35)     647.2840 (1.01)     51.7828 (1.57)          1;0  1.5262 (0.97)          5           1
test_standard         644.7847 (1.06)     735.3719 (1.11)     691.7773 (1.08)     32.4726 (1.31)     695.5663 (1.08)     32.9661 (1.0)           2;0  1.4456 (0.92)          5           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Initial version of this PR:

----------------------------------------------------------------------------------- benchmark: 3 tests -----------------------------------------------------------------------------------
Name (time in ms)          Min                 Max                Mean             StdDev              Median                IQR            Outliers     OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_1000x_reads      641.2010 (1.0)      688.5666 (1.0)      664.8896 (1.0)      20.2277 (1.0)      657.1012 (1.0)      33.6769 (1.11)          2;0  1.5040 (1.0)           5           1
test_1000x_writes     775.0379 (1.21)     871.4801 (1.27)     810.0917 (1.22)     37.6060 (1.86)     797.2370 (1.21)     45.4147 (1.50)          1;0  1.2344 (0.82)          5           1
test_standard         669.9822 (1.04)     734.9884 (1.07)     714.8819 (1.08)     26.4767 (1.31)     721.9564 (1.10)     30.3537 (1.0)           1;0  1.3988 (0.93)          5           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Current version of this PR:

----------------------------------------------------------------------------------- benchmark: 3 tests -----------------------------------------------------------------------------------
Name (time in ms)          Min                 Max                Mean             StdDev              Median                IQR            Outliers     OPS            Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_1000x_reads      620.9546 (1.0)      694.4665 (1.0)      656.9467 (1.0)      30.0991 (1.47)     660.8149 (1.0)      50.0408 (1.57)          2;0  1.5222 (1.0)           5           1
test_1000x_writes     624.0394 (1.00)     762.7209 (1.10)     675.4554 (1.03)     53.5257 (2.62)     666.0885 (1.01)     63.9635 (2.01)          1;0  1.4805 (0.97)          5           1
test_standard         668.8180 (1.08)     720.2504 (1.04)     695.0192 (1.06)     20.4068 (1.0)      699.0625 (1.06)     31.8778 (1.0)           2;0  1.4388 (0.95)          5           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

huonw · 2025-01-13T05:41:59Z

src/syrupy/location.py

+            "modulename",
+            ".".join([*test_relpath.parent.parts, test_relpath.stem]),
+        )
+        object.__setattr__(self, "methodname", None)


Setting this attribute is newly added: it previously wasn't being set on this codepath, and attempting to hash/compare the location values (to put into the queue dictionary) was blowing up when accessing it.

It seems like it was previously not read at all for doc tests?

noahnu · 2025-01-13T12:09:36Z

🎉 This PR is included in version 4.8.1 🎉

The release is available on:

v4.8.1
GitHub release

Your semantic-release bot 📦🚀

huonw · 2025-01-13T21:50:04Z

Thank you!

fix: check current session's pending-write queue when recalling snaps…

dbf395a

…hots (e.g. diffing)

huonw force-pushed the huonw/926-diff branch from 204026a to dbf395a Compare December 23, 2024 00:24

huonw commented Dec 23, 2024

View reviewed changes

huonw added 7 commits January 13, 2025 11:00

Make PyTestLocation hashable

0fb10fd

Use type aliases

55c4995

return both keys from _snapshot_write_queue_key

9a3c9f0

Use a defaultdict

817de52

Update comments

a86f5d8

huonw commented Jan 13, 2025

View reviewed changes

noahnu approved these changes Jan 13, 2025

View reviewed changes

noahnu merged commit 0f6bb55 into syrupy-project:main Jan 13, 2025
15 of 17 checks passed

noahnu added the released label Jan 13, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: check current session's pending-write queue when recalling snapshots (e.g. diffing) #927

fix: check current session's pending-write queue when recalling snapshots (e.g. diffing) #927

huonw commented Dec 23, 2024

codecov bot commented Dec 23, 2024 •

edited

Loading

huonw left a comment

huonw Dec 23, 2024 •

edited

Loading

noahnu Dec 27, 2024 •

edited

Loading

noahnu Dec 27, 2024

noahnu Dec 27, 2024

huonw Jan 13, 2025 •

edited

Loading

huonw Dec 23, 2024

noahnu Dec 26, 2024 •

edited

Loading

noahnu commented Dec 26, 2024

huonw commented Dec 27, 2024

huonw commented Jan 13, 2025 •

edited

Loading

huonw Jan 13, 2025

noahnu commented Jan 13, 2025

huonw commented Jan 13, 2025

fix: check current session's pending-write queue when recalling snapshots (e.g. diffing) #927

fix: check current session's pending-write queue when recalling snapshots (e.g. diffing) #927

Conversation

huonw commented Dec 23, 2024

Description

Related Issues

Checklist

Additional Comments

codecov bot commented Dec 23, 2024 • edited Loading

Codecov Report

huonw left a comment

Choose a reason for hiding this comment

huonw Dec 23, 2024 • edited Loading

Choose a reason for hiding this comment

noahnu Dec 27, 2024 • edited Loading

Choose a reason for hiding this comment

noahnu Dec 27, 2024

Choose a reason for hiding this comment

noahnu Dec 27, 2024

Choose a reason for hiding this comment

huonw Jan 13, 2025 • edited Loading

Choose a reason for hiding this comment

huonw Dec 23, 2024

Choose a reason for hiding this comment

noahnu Dec 26, 2024 • edited Loading

Choose a reason for hiding this comment

noahnu commented Dec 26, 2024

Benchmark

huonw commented Dec 27, 2024

huonw commented Jan 13, 2025 • edited Loading

huonw Jan 13, 2025

Choose a reason for hiding this comment

noahnu commented Jan 13, 2025

huonw commented Jan 13, 2025

codecov bot commented Dec 23, 2024 •

edited

Loading

huonw Dec 23, 2024 •

edited

Loading

noahnu Dec 27, 2024 •

edited

Loading

huonw Jan 13, 2025 •

edited

Loading

noahnu Dec 26, 2024 •

edited

Loading

huonw commented Jan 13, 2025 •

edited

Loading