Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(verifier): Remove parallelism #344

Closed
wants to merge 1 commit into from

Conversation

jan-ferdinand
Copy link
Member

A downstream consumer of Triton VM uses the verifier in an asynchronous context while running its own rayon threadpool. This, in combination with Triton VM's use of rayon, is suspected to cause odd behavior.

Remove parallelism through explicit use of rayon from the Triton VM verifier, with the option to re-introduce it when the underlying issues are better understood.

@Sword-Smith
Copy link
Collaborator

Sword-Smith commented Jan 8, 2025

This PR's branch, without parallelism:

time:   [8.1461 ms 8.1681 ms 8.1947 ms]

master

time:   [11.644 ms 11.680 ms 11.735 ms] 

Removing parallelism makes verification faster, because of the overhead of rayon thread-handling.

@@ -166,7 +166,7 @@ impl ProverRound {
}

fn merkle_tree_from_codeword(codeword: &[XFieldElement]) -> ProverResult<MerkleTree> {
let digests = codeword_as_digests(codeword);
let digests: Vec<_> = codeword.par_iter().map(|&xfe| xfe.into()).collect();
MerkleTree::new::<CpuParallel>(&digests).map_err(FriProvingError::MerkleTreeError)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need a single-threaded Merkle tree builder too, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line you highlight is in the prover. I have removed the parallelism from the function codeword_as_digests and inlined the parallel version that is used in the prover.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I think I misunderstood your comment. Let me check the code again.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are absolutely right. The FRI verifier is building a Merkle tree from the last round's codeword, and the only logic for building Merkle trees we currently have is using parallelism.1

As a consequence, we seem to require a new Merkle tree builder, CpuSequential. Would you agree?

Footnotes

  1. Getting into the nitty gritty details, upon which the FRI verifier should absolutely not depend, Merkle trees with fewer than 512 leafs are built in sequence (unless a rather specific environment variable overrides the default).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we seem to require a new Merkle tree builder, CpuSequential. Would you agree?

Yes please!

I also noticed one place in some guesser where we use that. Since it's only 8 leafs there, it probably defaults to sequential anyway. It would be good to call the right method directly though, and not have to rely on contextual knowledge.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: it sounds as though both prover and verifier call merkle_tree_from_codeword but the second should call a new merkle_tree_from_codeword_sequential. Correct? Just being explicit to avoid confusion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the code for FRI is not as deduplicated as it could be; merkle_tree_from_codeword is a function associated with ProverRound, and is never called in the verifier, even though the verifier performs the same operations in its method last_round_codeword_merkle_root. Thanks for the heads up, though!

@jan-ferdinand jan-ferdinand added the 🖐 blocked Requires something else first. label Jan 9, 2025
@jan-ferdinand
Copy link
Member Author

Blocked: waiting for twenty-first v0.45.0.

The parallelism in Triton VM's verifier has more overhead than benefit.
@jan-ferdinand jan-ferdinand force-pushed the remove_parallelism_from_verify_pr branch from 53fd6d5 to baf5860 Compare January 13, 2025 11:35
@jan-ferdinand jan-ferdinand deleted the remove_parallelism_from_verify_pr branch January 13, 2025 11:36
@jan-ferdinand
Copy link
Member Author

Merged in baf5860

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🖐 blocked Requires something else first.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants