Add a cache for building encoded URLs #1432

bdraco · 2024-11-29T16:00:15Z

Since aiohttp's web_request tends to see the same URLs over and over, it's advantageous to cache building the URL objects. We currently have a cache for parsing URLs, but we did not have one for building URLs.

We didn't have a cache on URL.build because URL.build accepts a dict for the query, which is not hashable. This is solved by constructing the query string before the cache is used. Note that it may be better to avoid caching URLs with a query string in the future, but it didn't seem to make enough of a difference in testing to be worth the additional complexity.

Since aiohttp rarely calls URL.build with encoded=False, we do not cache unencoded builds, as the hit rate was < 50% in production testing. The hit rate for https://github.com/aio-libs/aiohttp/blob/1fa237ffc9e7aa70cbabb68ab64d6fe03255cbc0/aiohttp/web_request.py#L429 is nearly perfect.

Additionally, and similar to #1434, every time a new URL object was created it would have a new self._cache; by caching the build, self._cache will already be initialized on a cache hit.
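The effect on the per-object cache can be illustrated with a small stand-in. This is a sketch under assumptions: the `Url` class, its `host` property, and `build_cached` are hypothetical, and `functools.cached_property` stands in for yarl's own `self._cache` mechanism:

```python
from functools import cached_property, lru_cache


class Url:
    """Hypothetical stand-in for a URL object with lazily computed properties."""

    def __init__(self, raw: str) -> None:
        self.raw = raw
        self.computations = 0  # counts how often the lazy property is computed

    @cached_property
    def host(self) -> str:
        self.computations += 1
        return self.raw.split("://", 1)[1].split("/", 1)[0]


@lru_cache(maxsize=128)
def build_cached(raw: str) -> Url:
    """Return the same Url instance for repeated builds of the same input."""
    return Url(raw)


first = build_cached("http://example.com/path")
first.host  # computed once and stored on the instance
second = build_cached("http://example.com/path")
second.host  # served from the instance's existing per-object cache
```

Because the LRU returns the same instance, the lazily computed values survive across builds. Sharing an instance like this is only safe when the object is immutable, as yarl's URL objects are.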

We currently have a cache for parsing URLs, but we did not have one for building URLs because URL.build accepts a dict for the query, which is not hashable. Since web apps tend to see the same 80% of URLs over and over, it's advantageous to cache building the URL object, as aiohttp has to make them on every web request.

codspeed-hq · 2024-11-29T16:14:37Z

CodSpeed Performance Report

Merging #1432 will improve performances by 53.86%

Comparing build_lru (0d25f0b) with master (e3a282e)

Summary

⚡ 1 improvements
✅ 84 untouched benchmarks

Benchmarks breakdown

	Benchmark	`master`	`build_lru`	Change
⚡	`test_url_build_encoded_with_host_and_port`	397.8 µs	258.6 µs	+53.86%

codecov · 2024-11-29T16:15:14Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.11%. Comparing base (e3a282e) to head (0d25f0b).
Report is 1 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1432      +/-   ##
==========================================
+ Coverage   96.10%   96.11%   +0.01%     
==========================================
  Files          31       31              
  Lines        5855     5873      +18     
  Branches      348      349       +1     
==========================================
+ Hits         5627     5645      +18     
  Misses        202      202              
  Partials       26       26

Flag	Coverage Δ
CI-GHA	`96.11% <100.00%> (+0.01%)`	⬆️
MyPy	`49.40% <95.00%> (+0.14%)`	⬆️
OS-Linux	`99.55% <100.00%> (+<0.01%)`	⬆️
OS-Windows	`99.62% <100.00%> (+<0.01%)`	⬆️
OS-macOS	`99.30% <100.00%> (+<0.01%)`	⬆️
Py-3.10.11	`99.28% <100.00%> (+<0.01%)`	⬆️
Py-3.10.15	`99.51% <100.00%> (+<0.01%)`	⬆️
Py-3.11.10	`99.51% <100.00%> (+<0.01%)`	⬆️
Py-3.11.9	`99.28% <100.00%> (+<0.01%)`	⬆️
Py-3.12.7	`99.51% <100.00%> (+<0.01%)`	⬆️
Py-3.13.0	`99.51% <100.00%> (+<0.01%)`	⬆️
Py-3.9.13	`99.24% <100.00%> (+<0.01%)`	⬆️
Py-3.9.20	`99.47% <100.00%> (+<0.01%)`	⬆️
Py-pypy7.3.16	`99.53% <100.00%> (+<0.01%)`	⬆️
Py-pypy7.3.17	`99.55% <100.00%> (+<0.01%)`	⬆️
VM-macos-latest	`99.30% <100.00%> (+<0.01%)`	⬆️
VM-ubuntu-latest	`99.55% <100.00%> (+<0.01%)`	⬆️
VM-windows-latest	`99.62% <100.00%> (+<0.01%)`	⬆️
pytest	`99.55% <100.00%> (+<0.01%)`	⬆️

bdraco · 2024-11-29T16:17:12Z

The hit rate for pre-encoded URLs in production is great. Not so much for unencoded ones ... it might not make sense to cache building unencoded URLs.

bdraco · 2024-11-29T16:41:54Z

We still end up losing the whole cache if callers use

    @reify
    def url(self) -> URL:
        """The full URL of the request."""
        # authority is used here because it may include the port number
        # and we want yarl to parse it correctly
        return URL.build(scheme=self.scheme, authority=self.host).join(self._rel_url)

Because .join isn't cached....

edit: addressed in #1434

bdraco · 2024-11-29T19:06:38Z

2024-11-29 09:05:09.163 CRITICAL (SyncWorker_9) [homeassistant.components.profiler] Cache stats for lru_cache <function build_pre_encoded_url at 0x7f827e1e1260> at /usr/local/lib/python3.13/site-packages/yarl/_url.py: CacheInfo(hits=187, misses=15, maxsize=128, currsize=15)
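For reference, the hit rate implied by those counters can be worked out as below. The helper name `hit_rate` is an assumption for illustration; the numbers are the `hits` and `misses` from the log line above:

```python
def hit_rate(hits: int, misses: int) -> float:
    """Fraction of lookups served from the cache (0.0 if there were none)."""
    total = hits + misses
    return hits / total if total else 0.0


# Values taken from the CacheInfo in the log line: hits=187, misses=15.
rate = hit_rate(187, 15)
print(f"{rate:.1%}")  # prints "92.6%"
```

With `currsize=15` equal to `misses=15` and well under `maxsize=128`, nothing had been evicted at the time of this snapshot, so every miss was a first-time build.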

bdraco added 3 commits November 29, 2024 09:56

fix

88694b9

fix

36b8dda

bdraco added 2 commits November 29, 2024 10:19

revert caching unencoded

847308a

revert caching unencoded

74d16de

bdraco closed this Nov 29, 2024

bdraco deleted the build_lru branch November 29, 2024 16:33

bdraco restored the build_lru branch November 29, 2024 16:35

bdraco reopened this Nov 29, 2024

bdraco added a commit that referenced this pull request Nov 29, 2024

DNM: test hit rate of a join cache

366fa5a

This was referenced Nov 29, 2024

DNM: test hit rate of a join cache #1433

Closed

Make from_parts a LRU to increase the chance we can preserve the internal cache #1434

Merged

bdraco added 2 commits November 29, 2024 12:26

Merge remote-tracking branch 'origin/master' into build_lru

9144f43

changelog

0d25f0b

psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Nov 29, 2024

bdraco changed the title ~~DNM: Add a cache for building URLs~~ Add a cache for building encoded URLs Nov 29, 2024

bdraco marked this pull request as ready for review November 29, 2024 19:06

bdraco merged commit 0d1c8b7 into master Nov 29, 2024
46 of 48 checks passed

bdraco deleted the build_lru branch November 29, 2024 19:07
