-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Synthetic data generates on demand rather than on schedule #63
Synthetic data generates on demand rather than on schedule #63
Conversation
Redis constants enhanced execution_cycle updated to enable warmup stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's really hard to review this PR because a massive amount of code has been refactored which (a) did not need to be, and (b) is not relevant to the changes we're tryng to make here
I'm also not sure what the PR description means, or how this solution solves the problem. I believe it changes functionality of the whole system too much
@@ -43,25 +44,25 @@ async def _post_vali_stats(config: Config): | |||
) | |||
|
|||
|
|||
async def get_nodes_and_contenders(config: Config) -> list[Contender] | None: | |||
async def get_nodes_and_contenders(config: Config) -> List[Contender] | None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why refactor typehints? Prefer the lowercase
logger.info("Starting cycle...") | ||
if config.refresh_nodes: | ||
logger.info("First refreshing metagraph and storing the nodes") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is all this refactored?
@@ -70,29 +71,25 @@ async def main(config: Config) -> None: | |||
time_to_sleep_if_no_contenders = 20 | |||
contenders = await get_nodes_and_contenders(config) | |||
|
|||
# date_to_delete = datetime(2024, 10, 13, 30) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this refactored?
# await delete_task_data_older_than_date(connection, date_to_delete) | ||
# await delete_contender_history_older_than(connection, date_to_delete) | ||
# await delete_reward_data_older_than(connection, date_to_delete) | ||
await warmup_function(config=config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'warmup_function' is a bad name -> what is it doing?
remaining_requests: int | ||
|
||
def __lt__(self, other: "TaskScheduleInfo"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this refactoreD?
break | ||
|
||
if i % 100 == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why has all this logic gone? We've completely changed how we schedule synthetic queries now right? All this logic must remain
PR Overview
TLDR
This PR planned to provide on demand synthetic data calculation, where on demands stands for in addition to request received.
Implementation details
This PR consists from two parts:
Main part of PR
Main part consists of the following changes:
refresh_sythetic_data.py
has been (well it actually wasn't completely but supposed to be) rewritten to store nothing but a cache warmup function.schedule_synthetic_queries.py
is the main place where the most changes have been made_fetch_tasks_in_chunks
function reads a chunk of new organic tasks to proceed, if any._enhance_task
function enhances the initial raw task with the unique data. Currently implemented just forchat
andtext-to-image
tasks, because of it's nothing more but a switch by task kind it can be expanded effortlessly._push_to_query_node
function pushes enhanced tasks to aquery_node
warmup_function
function implements logic of a service warmup in sake of code intents clarificationschedule_synthetics_until_done
function has been rewritten to reflect the new approach — enhance any given tasks amount with synthetic dataexecution_cycle.py
—warmup_function
call added on a service start.Work around part of PR
cli.py
cli tool that allows developer to manually send arbitrary amount of tasks of any kind to thecontrol_node
to check the whole-> contol_node -> query_node
pipeline.redis_constants.py
—CONTROL_NODE_QUEUE_KEY
andCONTROL_TASK_CHUNK_SIZE
constants have been added to be used incli.py
and in the queue that is modifiedschedule_synthetics_until_done
function reads raw tasks from.Closes: #55