
[Feature] SDK Refactor + Enhanced Multithreading Support #790

Open
wants to merge 6 commits into
base: mainnet
from


Pauan
Collaborator

@Pauan Pauan commented Oct 25, 2023

  • Replaces the process_inputs, execute_program, and execute_fee macros with methods.
    This removes a lot of code duplication and makes the code simpler.

  • Adds a new thread_pool::spawn function which allows for running any Send + 'static code on the
    Rayon threadpool.

  • All of the various methods (execute_function_offline, execute, deploy, split, etc.) are
    now automatically run on the Rayon threadpool, which means everything runs in parallel.

    This means the user no longer needs to use multiple Workers for parallelism; they can
    run all of their JS code within a single Worker.
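
To illustrate what a `Send + 'static` bound on a spawn function implies, here is a minimal sketch. It is not the code from this pull request: the `spawn` function below is a hypothetical stand-in that uses `std::thread` and a channel instead of the Rayon threadpool, so that it runs with only the standard library.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a `thread_pool::spawn` style function.
/// It runs `f` on another thread and returns a receiver for the result.
/// A real pool would reuse worker threads instead of creating one per job.
fn spawn<F, T>(f: F) -> mpsc::Receiver<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        // Ignore the error if the receiver was dropped before the job finished.
        let _ = sender.send(f());
    });
    receiver
}

fn main() {
    // Start several independent jobs; each closure owns its input.
    let jobs: Vec<mpsc::Receiver<u64>> = (1..=4u64)
        .map(|n| spawn(move || (1..=n).product::<u64>()))
        .collect();

    // Wait for the results in submission order.
    let results: Vec<u64> = jobs.into_iter().map(|rx| rx.recv().unwrap()).collect();
    println!("{:?}", results);
}
```

Because the closure must be `'static`, it can only capture owned values (or `'static` references), which is why every job above uses `move`.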

@iamalwaysuncomfortable iamalwaysuncomfortable changed the title Major refactor [Feature] SDK Refactor + Enhanced Multithreading Support Oct 26, 2023
@iamalwaysuncomfortable iamalwaysuncomfortable linked an issue Oct 26, 2023 that may be closed by this pull request
3 tasks
@Pauan
Collaborator Author

Pauan commented Nov 3, 2023

The old code created various variables (process, program, rng, etc.) and then used macros (such as execute_program!) which would mutate those variables.

The new code instead bundles those variables inside of a ProgramState struct, and then provides methods which accomplish the same thing as the macros.

Bundling variables inside of a struct is a very common Rust idiom, and it has many advantages:

  • The variables are now logically organized, instead of spread out all over the place.

  • You can use a new method to create the variables and the struct, which removes code duplication.

  • You can attach helper methods to the struct.
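
As a rough illustration of the three points above, here is a small sketch. The `State` struct, its fields, and the `describe` method are hypothetical and much simpler than the real ProgramState; they only show the shape of the idiom.

```rust
/// Hypothetical bundle of values that would otherwise be separate local variables.
struct State {
    program: String,
    imports: Vec<String>,
    seed: u64,
}

impl State {
    /// One place that creates all of the variables, so callers do not repeat the setup.
    fn new(program: &str, imports: &[&str]) -> Self {
        State {
            program: program.to_string(),
            imports: imports.iter().map(|s| s.to_string()).collect(),
            seed: 42, // arbitrary value for the sketch
        }
    }

    /// A helper method attached to the struct, instead of a macro that touches locals.
    fn describe(&self) -> String {
        format!(
            "{} ({} imports, seed {})",
            self.program,
            self.imports.len(),
            self.seed
        )
    }
}

fn main() {
    let state = State::new("example_program", &["example_import"]);
    println!("{}", state.describe());
}
```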

The ProgramState struct is not public; it is only used inside the implementation of the ProgramManager methods.

The struct is created inside of the ProgramManager methods, and it is dropped when the method finishes. So it is temporary and ephemeral: it only exists for the duration of the method, and it is a purely internal implementation detail.

This pattern of creating temporary wrapper structs is very common and idiomatic in Rust.

Structs are not classes. Structs in Rust serve many different purposes, and creating wrapper structs is very common and completely normal.

For example, Iterator, Future, and Stream all use a similar pattern of returning temporary wrapper structs:

let state = vec![...];

let state = state.into_iter();

let state = state.map(|x| ...);

let state = state.filter(|x| ...);

let state = state.collect();

The into_iter method consumes the Vec and returns a new IntoIter struct:

https://doc.rust-lang.org/std/vec/struct.IntoIter.html

The IntoIter struct takes ownership of the original Vec's contents, and it also holds some additional state which is needed for iteration.

And then when you call the map method, it consumes self and returns a new Map struct:

https://doc.rust-lang.org/std/iter/struct.Map.html

The Map struct wraps the IntoIter and additionally stores the closure that was passed to map.

And then the filter method consumes self and returns a new Filter struct:

https://doc.rust-lang.org/std/iter/struct.Filter.html

The Filter struct wraps the Map and additionally stores the predicate closure that was passed to filter.

Lastly the collect method consumes self and returns some new collection. That collection is based on the previous structs.

All of these structs are temporary. They exist only to describe the business logic; they aren't intended as long-term storage of state. They are only used locally inside of a method, and when you are finished with them, the structs are simply thrown away.
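
For reference, here is a runnable version of the iterator chain described above, with the elided parts filled in by arbitrary example values. The function name `keep_multiples_of_twenty` and the closures are only for illustration.

```rust
use std::vec::IntoIter;

/// Multiplies every value by 10 and keeps the results divisible by 20.
fn keep_multiples_of_twenty(values: Vec<u32>) -> Vec<u32> {
    // `into_iter` consumes the Vec and returns an IntoIter.
    let state: IntoIter<u32> = values.into_iter();

    // `map` consumes the IntoIter and returns a Map that stores the closure.
    let state = state.map(|x| x * 10);

    // `filter` consumes the Map and returns a Filter that stores the predicate.
    let state = state.filter(|x| x % 20 == 0);

    // `collect` consumes the Filter and builds the final collection.
    // The intermediate wrapper structs no longer exist after this point.
    state.collect()
}

fn main() {
    // prints [20, 40, 60]
    println!("{:?}", keep_multiples_of_twenty(vec![1, 2, 3, 4, 5, 6]));
}
```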

The structs in this pull request behave in the same way:

let state = ProgramState::new(program, imports).await?;

let (state, deploy) = state.deploy().await?;

deploy.check_fee(fee_microcredits)?;

let (state, fee) = state
    .execute_fee(
        deploy.execution_id()?,
        url,
        private_key.clone(),
        fee_microcredits,
        fee_record,
        fee_proving_key,
        fee_verifying_key,
    )
    .await?;

state.deploy_transaction(deploy, fee).await

The ProgramState wraps the process, program, and rng. It provides various methods which return a new state along with some additional data.

For example, the deploy method returns a Deploy struct, execute_fee returns an ExecuteFee struct, execute_program returns an ExecuteProgram struct, etc.

This is the same as how the iterator map method returns a Map struct, the iterator filter method returns a Filter struct, the into_iter method returns an IntoIter struct, etc.

All of these structs are temporary, and they are thrown away when the ProgramManager method finishes. They are just an ephemeral container for passing data around. This is very normal and idiomatic for Rust.
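
The shape of these methods can be sketched with a simplified, synchronous toy. Everything below is hypothetical: the fields, the fee rule, and the method bodies are invented for illustration and do not reflect the real ProgramState, Deploy, or their async signatures.

```rust
/// Hypothetical stand-in for the internal state.
struct ProgramState {
    program: String,
    steps: u32,
}

/// Hypothetical extra data returned by `deploy`.
struct Deploy {
    minimum_fee: u64,
}

impl Deploy {
    fn check_fee(&self, fee: u64) -> Result<(), String> {
        if fee >= self.minimum_fee {
            Ok(())
        } else {
            Err(format!("fee {} is below the minimum {}", fee, self.minimum_fee))
        }
    }
}

impl ProgramState {
    fn new(program: &str) -> Self {
        ProgramState { program: program.to_string(), steps: 0 }
    }

    /// Consumes the state and hands it back together with the extra data.
    fn deploy(mut self) -> Result<(Self, Deploy), String> {
        self.steps += 1;
        // Invented rule for the sketch: the minimum fee is the program length.
        let minimum_fee = self.program.len() as u64;
        Ok((self, Deploy { minimum_fee }))
    }

    /// Consumes both the state and the extra data to produce the final result.
    fn deploy_transaction(self, deploy: Deploy, fee: u64) -> String {
        format!(
            "{} deployed after {} step(s), fee {} (minimum {})",
            self.program, self.steps, fee, deploy.minimum_fee
        )
    }
}

fn run(program: &str, fee: u64) -> Result<String, String> {
    let state = ProgramState::new(program);
    let (state, deploy) = state.deploy()?;
    deploy.check_fee(fee)?;
    Ok(state.deploy_transaction(deploy, fee))
}

fn main() {
    println!("{:?}", run("abc", 3));
    println!("{:?}", run("abc", 2));
}
```

Both `state` and `deploy` are dropped when `run` returns, so nothing outlives the method call.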


Why do the ProgramState methods return a new state? Code that runs on the thread pool must be Send + 'static, so it cannot borrow data with a shorter lifetime, which means it cannot use &self or &mut self. So everything must be owned.

That means that all of the ProgramState methods must take self, which consumes the ProgramState. However, we want to keep using the ProgramState after the method is finished, so the methods must return self so that it can continue to be used.

This pattern is very common in functional programming languages, because pure functional code does not mutate values in place, so it must always return new state. It is also common in Rust, for example with the builder pattern. Returning self from a method is very normal.
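
As a small reminder of what the builder pattern looks like, here is a minimal sketch. The `Request` and `RequestBuilder` types are hypothetical and unrelated to this pull request; they only show methods that take `self` and return it.

```rust
#[derive(Debug, PartialEq)]
struct Request {
    url: String,
    retries: u32,
}

#[derive(Default)]
struct RequestBuilder {
    url: String,
    retries: u32,
}

impl RequestBuilder {
    /// Takes ownership of the builder, updates it, and returns it.
    fn url(mut self, url: &str) -> Self {
        self.url = url.to_string();
        self
    }

    /// Same shape: consume `self`, return `self`.
    fn retries(mut self, retries: u32) -> Self {
        self.retries = retries;
        self
    }

    /// Consumes the builder for the last time and produces the final value.
    fn build(self) -> Request {
        Request { url: self.url, retries: self.retries }
    }
}

fn main() {
    let request = RequestBuilder::default()
        .url("https://example.com")
        .retries(3)
        .build();
    println!("{:?}", request);
}
```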

The ProgramState is temporary because multi-threading requires it to be owned. The ProgramState struct is created, it is moved to another thread, it runs some code in that other thread, and it is finally discarded when it's finished.

If the ProgramState were long-lived, then it could not be owned, and so it could not be used with multi-threading.

ProgramState is designed this way out of necessity: it is a restriction imposed by multi-threading.
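
The ownership requirement can be seen in a minimal sketch using `std::thread::spawn`, which (like the spawn function described in this pull request) requires a `'static` closure. The `State` struct and `run_on_thread` function are hypothetical, and a plain thread is used here instead of the Rayon threadpool.

```rust
use std::thread;

/// Hypothetical state with a method that consumes and returns `self`.
struct State {
    counter: u32,
}

impl State {
    fn step(mut self) -> Self {
        self.counter += 1;
        self
    }
}

/// Moves the state to another thread, runs a step there, and gets it back.
fn run_on_thread(state: State) -> State {
    // The closure must be `'static`, so the state is moved in with `move`.
    // Borrowing a local instead (for example `&mut state`) would be rejected
    // by the compiler, because the borrow could outlive this function.
    let handle = thread::spawn(move || state.step());

    // `join` hands the owned state back to the calling thread.
    handle.join().expect("worker thread panicked")
}

fn main() {
    let state = State { counter: 0 };
    let state = run_on_thread(state);
    let state = run_on_thread(state);
    println!("{}", state.counter);
}
```

Because each method returns the state, the caller can keep using it after every hop to another thread, which mirrors the `let (state, ...) = state.method()` style in the code above.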


The end result is that the new code behaves exactly the same as the old code, except:

  • It uses a struct with multiple fields instead of using multiple variables (it is normal and idiomatic in Rust to use structs to bundle multiple variables).
  • It uses methods instead of macros (e.g. the execute_program! macro becomes an execute_program method).
  • It is multi-threaded.

Perhaps the name ProgramState is confusing. It can easily be renamed to something else, such as MethodState or ExecutionState. The name is not important; it is an internal struct, purely an implementation detail.

Development

Successfully merging this pull request may close these issues.

[Feature] Parallel Executions within a single web worker
1 participant