Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Perf] Add cache for most recent blocks #3440

Open
wants to merge 5 commits into
base: staging
Choose a base branch
from

Conversation

vicsn
Copy link
Collaborator

@vicsn vicsn commented Nov 18, 2024

Motivation

Deserialization is one of the most expensive operations, because we're validating all data. We already have a Committee and Transmission cache in memory, we don't have a Block cache yet. This PR should have a significant performance impact, noting:

  • internal testing showed that validators under load request on average 1 block every 5 blocks.
  • clients, of course, request all blocks.

I decided against an LruCache because its more important we can always return recent blocks quickly to validators participating in consensus.

Test Plan

  • local network passed
  • Deployed network passed

@vicsn vicsn changed the title Add cache for most recent blocks [Perf] Add cache for most recent blocks Nov 18, 2024
let mut missing_heights = BTreeSet::from_iter(heights.clone());
// For each height in the range, check if the block is in the cache.
for height in heights {
if let Some(block) = self.block_cache.read().get(&height) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would probably be more perf-friendly to keep a read guard for the entire duration of this loop

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could also be used to quickly skip the attempt to obtain blocks from the cache in case its 1st and last entries are outside of the requested range

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would probably be more perf-friendly to keep a read guard for the entire duration of this loop

Great idea: bb00286

it could also be used to quickly skip the attempt to obtain blocks from the cache in case its 1st and last entries are outside of the requested range

Interesting idea, though I'd rather keep it as is and make the happy path (which matters for honest validators) fast and readable.

// All blocks found in the cache.
(None, None) => return Ok(blocks),
// Some blocks found in the cache.
(Some(&start), Some(&end)) => start..end + 1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(Some(&start), Some(&end)) => start..end + 1,
(Some(&start), Some(&end)) => start..=end,

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error[E0308]: mismatched types
   --> node/bft/ledger-service/src/ledger.rs:167:45
    |
167 |         for block in self.ledger.get_blocks(missing_heights)? {
    |                                  ---------- ^^^^^^^^^^^^^^^ expected `Range<u32>`, found `RangeInclusive<u32>`
    |                                  |
    |                                  arguments to this method are incorrect

@raychu86
Copy link
Contributor

Mostly LGTM, 1 question I had would be what happens when the latest blocks are max size. Could this lead to a OOM issue if 10 of the largest possible blocks are held in memory?

@vicsn
Copy link
Collaborator Author

vicsn commented Dec 18, 2024

@raychu86

Thx for noting, I think right now we're safe, and this particular cache doesn't increase the risk significantly, but really appreciate the mention because we should include this in the criteria for increasing the validator set - I just updated Foundation documentation.

Current average case scenario: 25 validators, 50 unique transmissions, 5kb txs, 4 round blocks are common: 60MB/block. For the gentle price of 2 credits per tx, or 2400 credits per block, an attacker can create 100kb transactions, creating 20x larger memory footprint.

Besides this new cache, at any given time, a validator can be expected to have ~15 blocks in memory anyway (3 clients syncing 5 blocks at a time). If a client is far behind, they'll request 10x more in parallel. Even with very large blocks, this still falls within the memory requirements. But this will become an issue once the validator set increases, at which point the maximum tx size can be considered to be reduced.

raychu86
raychu86 previously approved these changes Dec 19, 2024
Copy link
Contributor

@raychu86 raychu86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vicsn vicsn requested review from ljedrz and raychu86 January 3, 2025 20:37
Copy link
Collaborator

@ljedrz ljedrz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a small suggestion

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants