PeerDAS fork-choice, validator custody and parameter changes #3779

Open
wants to merge 25 commits into
base: dev
change validator custody to 6, plus two extra per 32 ETH
fradamt committed May 24, 2024
commit 56e9d3844eaf981bb89c8453853de7c529ff2931
4 changes: 2 additions & 2 deletions configs/mainnet.yaml
@@ -161,8 +161,8 @@ DATA_COLUMN_SIDECAR_SUBNET_COUNT: 128
MAX_REQUEST_DATA_COLUMN_SIDECARS: 16384
SAMPLES_PER_SLOT: 16
CUSTODY_REQUIREMENT: 4
-VALIDATOR_CUSTODY_REQUIREMENT: 8
-BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET: 32000000000 # 2**5 * 10**9 (= 32,000,000,000)
+VALIDATOR_CUSTODY_REQUIREMENT: 6
+BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET: 16000000000 # 2**4 * 10**9 (= 16,000,000,000)
TARGET_NUMBER_OF_PEERS: 100
Contributor

Note that most clients don't use this config value, so I guess this is more like a reference / recommendation?

#3766 (comment)

Member

I have created a PR to remove TARGET_NUMBER_OF_PEERS as a config variable


# [New in Electra:EIP7251]
4 changes: 2 additions & 2 deletions configs/minimal.yaml
@@ -160,8 +160,8 @@ DATA_COLUMN_SIDECAR_SUBNET_COUNT: 128
MAX_REQUEST_DATA_COLUMN_SIDECARS: 16384
SAMPLES_PER_SLOT: 16
CUSTODY_REQUIREMENT: 4
-VALIDATOR_CUSTODY_REQUIREMENT: 8
-BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET: 32000000000 # 2**5 * 10**9 (= 32,000,000,000)
+VALIDATOR_CUSTODY_REQUIREMENT: 6
+BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET: 16000000000 # 2**4 * 10**9 (= 16,000,000,000)
TARGET_NUMBER_OF_PEERS: 100

# [New in Electra:EIP7251]
4 changes: 2 additions & 2 deletions specs/_features/eip7594/das-core.md
@@ -81,8 +81,8 @@ We define the following Python custom types for type hinting and readability:
| - | - | - |
| `SAMPLES_PER_SLOT` | `16` | Number of `DataColumn` random samples a node queries per slot |
Member

There is no such thing as DataColumn

Member

Nice catch!

| `CUSTODY_REQUIREMENT` | `4` | Minimum number of subnets an honest node custodies and serves samples from |
-| `VALIDATOR_CUSTODY_REQUIREMENT` | `8` | Minimum number of subnets an honest node with validators attached custodies and serves samples from |
-| `BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET` | `Gwei(32 * 10**9)` | Balance increment corresponding to one additional subnet to custody |
+| `VALIDATOR_CUSTODY_REQUIREMENT` | `6` | Minimum number of subnets an honest node with validators attached custodies and serves samples from |
Member

I think VALIDATOR_CUSTODY_REQUIREMENT is a little misleading. In practice, this will never be 6.

  • Provided a validator with a balance of 32 ETH, get_validators_custody_requirement will return 8.
  • Provided a validator with a balance of 17 ETH, get_validators_custody_requirement will return 7.
  • Provided a validator with a balance of 16 ETH, get_validators_custody_requirement will return 6.
    • But it will never really get to this value, as the validator is queued for ejection at 16.75 ETH.

Why not just have a single CUSTODY_REQUIREMENT plus additional custodies per validator?

Contributor Author

@fradamt, May 28, 2024

What I would like to preserve is:

  • a full node custodies only 4 subnets (I don't see much reason to go beyond that)
  • validators custody at least 8 subnets (I think it's a good minimum for security reasons)
  • the custody does not grow too fast with validator count (the distribution of the number of validators per node is quite bimodal, with either just a few or hundreds, and I think it's good to keep the requirements low for the former). Growing it by 4 per validator (per 32 ETH) is too high imo

How do you feel about this, with VALIDATOR_CUSTODY_REQUIREMENT = 8?

def get_validators_custody_requirement(state: BeaconState, validator_indices: List[ValidatorIndex]) -> uint64:
    # Total balance of all validators attached to this node
    total_node_balance = sum(state.balances[index] for index in validator_indices)
    # Start from the minimum, then add one subnet per balance increment above MIN_ACTIVATION_BALANCE
    validator_custody_requirement = VALIDATOR_CUSTODY_REQUIREMENT
    if total_node_balance >= MIN_ACTIVATION_BALANCE:
        validator_custody_requirement += (total_node_balance - MIN_ACTIVATION_BALANCE) // BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET
    return validator_custody_requirement
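
To make the numbers concrete, here is a minimal runnable sketch of the proposal above, with plain integers standing in for the spec types. It assumes `MIN_ACTIVATION_BALANCE` is 32 ETH and uses the 16 ETH `BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET` from this commit. The helper name `custody_for_balance` is hypothetical and not part of the spec.

```python
# Illustrative sketch only: plain ints stand in for the spec's Gwei/uint64 types.
ETH = 10**9  # Gwei per ETH

VALIDATOR_CUSTODY_REQUIREMENT = 8                 # value suggested in the comment above
BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET = 16 * ETH  # value from this commit
MIN_ACTIVATION_BALANCE = 32 * ETH                 # assumed value


def custody_for_balance(total_node_balance: int) -> int:
    """Hypothetical helper mirroring the proposed get_validators_custody_requirement."""
    requirement = VALIDATOR_CUSTODY_REQUIREMENT
    if total_node_balance >= MIN_ACTIVATION_BALANCE:
        extra_balance = total_node_balance - MIN_ACTIVATION_BALANCE
        requirement += extra_balance // BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET
    return requirement


# Assumes every validator on the node holds exactly 32 ETH.
for validators in (1, 2, 3, 10):
    print(validators, custody_for_balance(validators * 32 * ETH))
```

Under these assumptions the requirement starts at 8 for a single 32 ETH validator and grows by 2 for each additional 32 ETH. Note that this sketch, like the proposal it mirrors, has no upper bound.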

Member

@jtraglia, May 28, 2024

Hmm, I understand the rationale. Your alternative is generally fine, but it does feel a little overly complex.

How about something like the following?

def get_validators_custody_requirement(state: BeaconState, validator_indices: List[ValidatorIndex]) -> uint64:
    total_node_balance = sum(state.balances[index] for index in validator_indices)
    count = total_node_balance // BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET
    return min(max(count, VALIDATOR_CUSTODY_REQUIREMENT), DATA_COLUMN_SIDECAR_SUBNET_COUNT)

This would provide the following custody requirements:

| Validators | Custody Requirement |
| - | - |
| 1 | 8 |
| 2 | 8 |
| 3 | 8 |
| 4 | 8 |
| 5 | 10 |
| 6 | 12 |
| ... | ... |
| 63 | 126 |
| 64 | 128 |
| 65 | 128 |

This makes the computation relatively straightforward:

  • 2 x the number of validators on the node, minimum 8, max 128.
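
The table can be checked with a small standalone sketch. It assumes every validator holds exactly 32 ETH and uses plain integers instead of the spec types. The function name `clamped_custody` is hypothetical.

```python
# Illustrative sketch only: plain ints stand in for the spec's Gwei/uint64 types.
ETH = 10**9  # Gwei per ETH

VALIDATOR_CUSTODY_REQUIREMENT = 8
BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET = 16 * ETH
DATA_COLUMN_SIDECAR_SUBNET_COUNT = 128


def clamped_custody(total_node_balance: int) -> int:
    """Hypothetical helper: one subnet per balance increment, clamped to [8, 128]."""
    count = total_node_balance // BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET
    return min(max(count, VALIDATOR_CUSTODY_REQUIREMENT), DATA_COLUMN_SIDECAR_SUBNET_COUNT)


# Assumes every validator on the node holds exactly 32 ETH.
for validators in (1, 4, 5, 6, 63, 64, 65):
    print(validators, clamped_custody(validators * 32 * ETH))
```

With 16 ETH per subnet, a 32 ETH validator contributes 2 subnets, so the minimum of 8 only stops binding from the fifth validator onwards.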

Member

+1 on using BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET

Member

Francesco's implementation uses that too. But yes, the constant is a good idea.

Member

@jtraglia, May 29, 2024

Ah you're right. It was a pseudocode mistake & the declaration/usage of multiplier can be removed. I believe I was thinking that BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET should be defined as:

MAX_EFFECTIVE_BALANCE_ELECTRA // DATA_COLUMN_SIDECAR_SUBNET_COUNT

So that it properly scales if we (1) increase the max EB again or (2) increase the subnet count.

(Also, I fixed the backwards min and max)
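
As a quick sanity check of that definition, the sketch below assumes `MAX_EFFECTIVE_BALANCE_ELECTRA` is 2048 ETH (the EIP-7251 value) and takes the subnet count of 128 from the config in this PR. The helper `balance_per_subnet` is hypothetical and only illustrates how the derived constant would scale.

```python
# Illustrative sketch only: plain ints stand in for the spec's Gwei type.
ETH = 10**9  # Gwei per ETH

MAX_EFFECTIVE_BALANCE_ELECTRA = 2048 * ETH  # assumed value (EIP-7251)
DATA_COLUMN_SIDECAR_SUBNET_COUNT = 128      # from the config in this PR


def balance_per_subnet(max_effective_balance: int, subnet_count: int) -> int:
    """Hypothetical helper: balance increment that maps max EB onto all subnets."""
    return max_effective_balance // subnet_count


derived = balance_per_subnet(MAX_EFFECTIVE_BALANCE_ELECTRA, DATA_COLUMN_SIDECAR_SUBNET_COUNT)
print(derived // ETH)  # balance increment in ETH

# Doubling the subnet count halves the increment; doubling max EB doubles it.
print(balance_per_subnet(MAX_EFFECTIVE_BALANCE_ELECTRA, 256) // ETH)
print(balance_per_subnet(2 * MAX_EFFECTIVE_BALANCE_ELECTRA, 128) // ETH)
```

Under these assumptions the derived value matches the 16 ETH constant introduced in this commit.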

Contributor Author

I like this version, because it only starts increasing from 8 after a few validators, which is imo a fairly desirable property in itself. I would even consider setting BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET to 32 ETH, so that it's "1x the number of validators on the node, minimum 8, max 128", and it only starts increasing from the minimum after 8 validators.
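
A rough comparison of the two settings, as a sketch: it assumes 32 ETH per validator and reuses the clamped formula from the comment above with the balance increment as a parameter. The function name `custody_requirement` and its parameters are hypothetical.

```python
# Illustrative sketch only: plain ints stand in for the spec's Gwei/uint64 types.
ETH = 10**9  # Gwei per ETH

VALIDATOR_CUSTODY_REQUIREMENT = 8
DATA_COLUMN_SIDECAR_SUBNET_COUNT = 128


def custody_requirement(validator_count: int, balance_per_subnet: int) -> int:
    """Hypothetical helper: clamped custody for a node, given a balance increment."""
    total_node_balance = validator_count * 32 * ETH  # assumes 32 ETH per validator
    count = total_node_balance // balance_per_subnet
    return min(max(count, VALIDATOR_CUSTODY_REQUIREMENT), DATA_COLUMN_SIDECAR_SUBNET_COUNT)


# Columns: validators, requirement at 16 ETH per subnet, requirement at 32 ETH per subnet.
for validators in (1, 8, 9, 64, 128, 200):
    print(
        validators,
        custody_requirement(validators, 16 * ETH),
        custody_requirement(validators, 32 * ETH),
    )
```

With a 32 ETH increment the requirement stays at 8 up to 8 validators, tracks the validator count after that, and reaches all 128 subnets at 128 validators instead of 64.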

Contributor

I am a bit worried about the direction of dynamically increasing the custody count depending on how many validators you run. For context, last year we finally moved to a new attestation subnet backbone structure where the responsibility for subscribing to long-lived attestation subnets is distributed equally amongst all nodes, rather than falling on those running many validators:
#2749
#3312

The way validator custody is currently specified, you would reintroduce the same downsides by requiring nodes running many validators to custody all the subnets. Is it necessary to scale the custody count this way? Is there any way we can simply have an upper bound, rather than all the subnets being custodied if you run more than 64 validators?

Contributor

> I am a bit worried about the direction of dynamically increasing the custody count depending on how many validators you run. For context, last year we finally moved to a new attestation subnet backbone structure where the responsibility for subscribing to long-lived attestation subnets is distributed equally amongst all nodes, rather than falling on those running many validators:
> #2749
> #3312
> The way validator custody is currently specified, you would reintroduce the same downsides by requiring nodes running many validators to custody all the subnets. Is it necessary to scale the custody count this way?

The rationale behind making the custody count depend on something is as follows:

  1. The system performs much better if we have nodes that can repair on the fly. This makes "just available" blocks "overwhelmingly available", which increases the number of blocks that become canonical (because after repair, these blocks will get enough votes), and it also improves the sampling process (because we will have far fewer false negatives during sampling). We call this availability amplification.
  2. In the 1D erasure coding case, only nodes that have at least half of the columns can repair. (Note that we do not need this in the 2D case, where any node can repair a row or a column.)
  3. The most intuitive way to force the system to have such "supernodes" is to make custody depend on validator count. There could be other ways, like
    • random allocation of a "supernode role",
    • hoping that there will be supernodes,
    • having nodes do incrementalDAS and eventual repair,
      but custody-based allocation seems to align best with the expected resources needed to actually download the data and do the reconstruction. In other words, if someone has many validators, they can pay for the bandwidth and compute.

This goes against the "hiding" property achieved by equally distributing, but improves system performance. Once we change to 2D encoding, we can go back to equally distributing custody.
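
To illustrate point 2, here is a minimal sketch of the repair threshold and of how many validators a node would need before it can repair on its own. It assumes 128 columns with one column per subnet, 32 ETH per validator, and the "minimum 8" rule discussed above. The names `can_repair` and `min_validators_to_repair` are hypothetical.

```python
NUMBER_OF_COLUMNS = 128  # assumed: one column per subnet, 128 subnets


def can_repair(custodied_columns: int) -> bool:
    """1D erasure coding: at least half of the columns are needed to reconstruct."""
    return 2 * custodied_columns >= NUMBER_OF_COLUMNS


def min_validators_to_repair(subnets_per_validator: int, minimum: int = 8) -> int:
    """Hypothetical helper: smallest validator count whose custody reaches the threshold."""
    validators = 1
    while not can_repair(max(minimum, validators * subnets_per_validator)):
        validators += 1
    return validators


print(min_validators_to_repair(subnets_per_validator=2))  # 16 ETH per subnet
print(min_validators_to_repair(subnets_per_validator=1))  # 32 ETH per subnet
```

Under these assumptions a node becomes able to repair once it custodies 64 columns, which happens at 32 validators with a 16 ETH increment and at 64 validators with a 32 ETH increment.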

> Is there any way we can simply have an upper bound, rather than all the subnets being custodied if you run more than 64 validators?

I'm not sure I'm interpreting this right, but it is important that we don't cap the custody requirement at 64. Otherwise, if exactly 64 columns were released, we would need a supernode that by some miracle is subscribed to exactly those 64 columns. There are too many combinations for that, so we would need far too many supernodes. If really needed, we could stop before 128, but we need way more than 64.
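
The "too many combinations" argument can be sketched numerically. The code below assumes 128 columns and, as a simplification, treats a node's custody set as a uniformly random subset of columns (in the spec it is derived deterministically, so this is only a back-of-the-envelope model). The function name `cover_probability` is hypothetical.

```python
import math

NUMBER_OF_COLUMNS = 128  # assumed total number of columns


def cover_probability(custody: int, released: int = 64, total: int = NUMBER_OF_COLUMNS) -> float:
    """Probability that a uniformly random custody set contains one specific released set."""
    if custody < released:
        return 0.0
    # Choose the remaining (custody - released) columns freely among the unreleased ones.
    return math.comb(total - released, custody - released) / math.comb(total, custody)


# Number of distinct 64-column subsets an adversary could choose to release.
print(f"{math.comb(NUMBER_OF_COLUMNS, 64):.3e}")

for custody in (64, 96, 120, 128):
    print(custody, cover_probability(custody))
```

In this model, a node capped at 64 columns matches one specific released set with probability 1 over the number of 64-column subsets, which is on the order of 10^-38, while a node custodying all 128 columns always covers it.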

Contributor Author

Echoing the above, and also @dapplion's comment that validator custody means that in practice, given the actual stake distribution, most of the stake will be run on supernodes, which imho is a good thing, because it hugely derisks the whole system. It basically means that the introduction of DAS is essentially irrelevant for 90% of the validator set, other than moving from gossiping few large objects to a lot of smaller objects. And why shouldn't someone that runs hundreds of validators, with millions or even tens of millions of stake, be downloading the whole data and contributing to the security and stability of the network?

This is quite different from the attestation subnets case imho, because there are huge tangible benefits to be had from linearly scaling the load based on stake. Also, validator custody does not change the fact that all nodes still share the responsibility for forming the backbone of long-lived subscriptions, though not equally.

Another point here is that downloading the whole data is by far the best way to ensure that you always correctly fulfil validator duties, including protecting you when proposing.

+| `BALANCE_PER_ADDITIONAL_CUSTODY_SUBNET` | `Gwei(16 * 10**9)` | Balance increment corresponding to one additional subnet to custody |
| `TARGET_NUMBER_OF_PEERS` | `100` | Suggested minimum peer count |

fradamt marked this conversation as resolved.