Proposal for reorganizing shuffling code #6386

matthewkeil · 2024-02-02T04:23:30Z

Problem description

Currently we are calculating the shuffling for the next epoch during the state transition of the current epoch.

https://github.com/ChainSafe/lodestar/blob/00dfa63413e9b2a57dbaea70ac7151ef134b0c05/packages/state-transition/src/cache/epochCache.ts#L513-517

This takes about a second.

We propose to offload that work to idle time not on critical path during epoch transition which should shrink the transition duration by about a second.

The nextShuffling is not necessary for epoch transition but is useful for calculating the next epoch proposers and sync committee duties so pre-computation is still recommended but it just doesnt need to be on critical path.

Solution description

Currently the shuffling code is housed in state-transition and the shuffling results are available, on the EpochCache. We propose to move that all to the ShufflingCache in `beacon-node.

This will facilitate a couple of things. We will still have access to the results for service via API, the current consumers of the nextShuffling and the shuffling results can be passed into the state-transition pre-computed. It will also open up the ability to do the shuffling on a separate thread or async in main thread idle times.

As a result of moving the shuffling to beacon-node the getBeaconProposersNextEpoch and similar getters would also need to be moved out of state-transition as they rely on the nextShuffling that will only be available in beacon-node after a period of time (so they cannot be precomputed until the nextShuffling is finished).

I am looking either moving those onto the ShufflingCache but that feels like mixing of concerns so potentially I will create a PrecomputationCache or similar to house that type of convenience information that gets served by the API.

Additional context

What are everyones thoughts on the approach and process?

Step 1) refactor code to move shuffling fully out of state-transition and into beacon-node. This will not change epoch transition times but will set the groundwork for the next steps
Step 2) create async processing code to move shuffling off critical path
Step 3) potentially create worker for precomputation tasks and implement with napi-nice to make sure kernel scheduling prioritizes main, network and bls threads

The text was updated successfully, but these errors were encountered:

twoeths · 2024-02-02T09:22:18Z

yes I strongly support the idea because having shuffling inside both EpochCache and ShufflingCache is not great. Some more contexts:

the cache of shuffling inside state-transition is not for itself, it's only for beacon node to consume later
it does not look right for beacon-node to call state-transition for a cache, state-transition should be for state transition only, beacon-node should use the ShufflingCache
n-historical state work would limit the states due to its size but we can cache way more EpochShuffling because it's so lightweight
now when loading state from disk the api has to provide a ShufflingGetter function. It should be able to load state without it

wemeetagain · 2024-02-05T20:11:38Z

@matthewkeil this approach sounds good to me

wemeetagain · 2024-02-05T20:26:27Z

related Improve next epoch shuffling computation #6268

twoeths · 2024-04-03T07:38:08Z

~~it's not possible to get rid of shuffling inside CachedBeaconState, it's needed in at least 2 places~~:

upgradeStateToAltair()
processSyncCommitteeUpdates()

~~this is required in the processSlots() where it spans through multiple epoch transitions~~

update: we don't need epoch shuffling for the above apis as long as we cache previous/current/next active validator indices

upgradeStateToAltair: can rebuild the committees/shuffling based on active validator indices and cached randao mixes inside the state
processSyncCommitteeUpdates: it only needs next epoch validator indices which is available after beforeProcessEpoch
but for processAttestations we need it

lodestar/packages/state-transition/src/block/processAttestationsAltair.ts

Line 52 in 3076b4c

const committeeIndices = epochCtx.getBeaconCommittee(data.slot, data.index);

wemeetagain · 2024-10-22T19:54:46Z

Resolved with

matthewkeil added meta-feature-request Issues to track feature requests. meta-discussion Indicates a topic that requires input from various developers. scope-performance Performance issue and ideas to improve performance. labels Feb 2, 2024

twoeths mentioned this issue Mar 27, 2024

feat: async shuffling refactor #6521

Closed

wemeetagain closed this as completed Oct 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal for reorganizing shuffling code #6386

Proposal for reorganizing shuffling code #6386

matthewkeil commented Feb 2, 2024

twoeths commented Feb 2, 2024

wemeetagain commented Feb 5, 2024

wemeetagain commented Feb 5, 2024

twoeths commented Apr 3, 2024 •

edited

Loading

wemeetagain commented Oct 22, 2024

Proposal for reorganizing shuffling code #6386

Proposal for reorganizing shuffling code #6386

Comments

matthewkeil commented Feb 2, 2024

Problem description

Solution description

Additional context

twoeths commented Feb 2, 2024

wemeetagain commented Feb 5, 2024

wemeetagain commented Feb 5, 2024

twoeths commented Apr 3, 2024 • edited Loading

wemeetagain commented Oct 22, 2024

twoeths commented Apr 3, 2024 •

edited

Loading