Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: gossipsub pre-validation checks on incoming message #5561

Open
erhant opened this issue Aug 19, 2024 · 1 comment
Open

feat: gossipsub pre-validation checks on incoming message #5561

erhant opened this issue Aug 19, 2024 · 1 comment

Comments

@erhant
Copy link

erhant commented Aug 19, 2024

Description

Hi, I wanted to ask if its possible to do any type of message validation over the raw message prior to gossipsub validation (i.e. with Strict or `Permissive) takes place.

We currently have a custom app-level validation logic for message propagation (https://github.com/libp2p/rust-libp2p/blob/master/protocols/gossipsub/src/behaviour.rs#L749) but I think we should have one that takes place for incoming messages for internal protocol-level validation.

Motivation

The setting that I am in is related to discussion #5504, working together with @anilaltuner, we have many messages in the network that are no longer valid at "application level" but they keep being cached and validated at protocol level, increasing the memory usage & CPU usage.

This feature aims to target both, i.e. do not cache an invalid message / do not spend time checking signature if its not valid due to another reason.

CPU Issue

The issue first came to our eyes when the CPU usage skyrocketed due to old messages going around in the network. It is provable that this is the issue, using flamegraphs.

Here is a flamegraph under Permissive (also in Strict):
flamegraph_permissive

and here it is using None:
flamegraph_none

The custom message validation does not help with this due to it taking place AFTER the signature checks.

Using existing methods

I can think of two things that may alleviate the issue, but I believe both take place after sig checks:

  • Use a specific message id for invalid message to make them appear as "duplicate", although this is a very sketchy method
  • Have a logic within a DataTransform impl, and deliberately return an Err if the data is not up to expectation.

Note

An example for the second bullet, outbound_transform may add the timestamp at the first 8 bytes, and the inbound_transform may parse it and check that its not old, such as in an TTLDataTransform implementation. Such a method respects bijectivity as well, insofar that inbound and outbound transforms cancel eachother.

Sequence No

One possible configuration can be over sequence no as well, I believe it only matters that this is unique w.r.t client right? Currently, it uses a timestamp as seed and simply increments it. What if it used timestamp every time, and a TTL logic could be added over it within the message handler?

Perhaps a customization over how sequence no is constructed & validated can help us here as well, combined with changing the order of checks about signature & sequence no (i.e. https://github.com/libp2p/rust-libp2p/blob/master/protocols/gossipsub/src/protocol.rs#L298-L315 should happen AFTER https://github.com/libp2p/rust-libp2p/blob/master/protocols/gossipsub/src/protocol.rs#L318-L362).

Current Implementation

Signature checks take place at https://github.com/libp2p/rust-libp2p/blob/master/protocols/gossipsub/src/protocol.rs#L298, which is before other checks.

As mentioned above, sequence no check happens after siganture checks, and it seems that the returned RawMessages within sequence no checks also include message.signature although a comment says dont inform the application.

Are you planning to do it yourself in a pull request ?

Yes

@erhant
Copy link
Author

erhant commented Aug 20, 2024

I should mention that what we really just want in the end is to ignore historic messages all together, for our use-case, it suffices to just send and receive messages among live nodes, with propagation in mind i guess a very short TTL for messages would work

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant