This repository contains tools for deploying and measuring Machine Learning (ML) nodes staked in the Pocket Network. The code available here was created under the POKT AI Lab socket.
- Model Deployment
- Local-Net - ML RPCs Testing
- Live Metrics Testing - Pocket Test-Bench
The Pocket Network test bench is an environment used to verify the correctness or soundness of a staked model, live on the network.
It works by streamlining the tracking, sampling, and execution of tasks in a performant and scalable way. Each of these tasks
is an instance of a particular metric, for example, the GLUE dataset. The architecture of the project is designed to be agnostic of the task to perform, and easily extensible.
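As an illustration only, such a task instance could be captured in a small record like the one below. The field names and types are assumptions made for this sketch, not the repository's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TaskInstance:
    """Hypothetical record for one metric instance assigned to a staked node."""

    framework: str     # evaluation suite, e.g. "lmeh" or "glue"
    task: str          # concrete dataset/metric within the suite
    node_address: str  # Pocket Network node being measured
    requested_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    done: bool = False  # flipped once the Evaluator has scored the response


# Example: a GLUE-style task tracked for a single (placeholder) node.
example = TaskInstance(framework="glue", task="sst2", node_address="pokt1...")
```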
The test bench follows the structure presented in the following image:
As can be seen, the test bench has four main blocks (each a different App) that work together to track the task scores
of each of the Pocket Nodes. Briefly, the apps do the following:
- Manager : Keeps the records of each node's scores. It checks for new nodes, reviews the age and statistics of the task scores, and requests more tasks to be executed. If there are finalized tasks (produced by the Evaluator), it adds them to the node score tracking database.
- Sampler : Checks for Manager requests and prepares the tasks to be done. To do that, it keeps track of the available datasets (if needed) and samples from them. The result of this App is a generic call request that is correct for the Pocket Service but independent of the task (see the sketch after this list).
- Requester : Controls the relays that are done. Using the provided Pocket Network App Keys, it checks the current sessions and looks for nodes that have pending task requests (generated by the Sampler). When it finds a match, it performs the relays against the nodes and saves the raw answers.
- Evaluator : Retrieves the responses of the Requester and finds the originating task requests, then calculates the appropriate metrics and writes the resulting values.
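To make the hand-off between the Sampler, Requester, and Evaluator more concrete, the sketch below shows what such a task-agnostic call request might look like once stored for the Requester to pick up. The document shape, field names, and placeholder values are assumptions for illustration; the actual collections used by the repository may differ.

```python
# Hypothetical MongoDB document produced by the Sampler. The payload carries
# everything the Requester needs to relay the call through the Pocket Service,
# without the Requester knowing anything about the underlying task or metric.
prompt_request = {
    "task_id": "lmeh-hellaswag-0001",   # links back to the originating task
    "node_address": "pokt1...",          # staked node that must serve the relay
    "service": "0001",                   # placeholder Pocket service identifier
    "payload": {
        "method": "POST",
        "path": "/v1/chat/completions",
        "headers": {"Content-Type": "application/json"},
        "body": '{"model": "pocket_network", "messages": []}',
    },
    "done": False,                       # set by the Requester once relayed
    "evaluated": False,                  # set by the Evaluator once scored
}
```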
The Apps are all coordinated using Temporal IO, with the Manager and Requester being recurrent workflows, and the Sampler and Evaluator being triggered by the Manager and Requester, respectively. The datasets are stored in PostgreSQL, since it is the most effective way to handle datasets from the LMEH test suite, which is the first to be implemented. Data communication between apps is done via MongoDB, which also holds the nodes collection, the one with the resulting scores for each tested node.
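As a rough sketch of how this coordination could look with the Temporal Python SDK (the workflow names, task queues, and scheduling interval below are assumptions, not the repository's actual definitions):

```python
import asyncio
from temporalio import workflow


@workflow.defn
class SamplerWorkflow:
    """Hypothetical Sampler: prepares task requests when the Manager asks."""

    @workflow.run
    async def run(self, task_name: str) -> None:
        # Sample the dataset for `task_name` and store generic call requests
        # (see the document sketch above) for the Requester to pick up.
        ...


@workflow.defn
class ManagerWorkflow:
    """Hypothetical recurrent Manager: reviews scores and requests new tasks."""

    @workflow.run
    async def run(self) -> None:
        # Placeholder decision: in the real app this would come from the
        # node score tracking database.
        stale_tasks = ["hellaswag"]
        for task in stale_tasks:
            await workflow.execute_child_workflow(
                SamplerWorkflow.run,
                task,
                id=f"sampler-{task}",
                task_queue="sampler",
            )
        # Durable timer until the next review cycle, then restart the
        # workflow with a fresh event history.
        await asyncio.sleep(10 * 60)
        workflow.continue_as_new()
```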
For more details on how the Apps interact, please read the Apps Readme.