Add Memory Buffer tracking and monitoring in the runtime #190

Open
gbin opened this issue Dec 31, 2024 · 10 comments
Assignees
Labels
enhancement New feature or request

Comments

@gbin
Collaborator

gbin commented Dec 31, 2024

In core/pool.rs, tasks can spin up preallocated memory pools (for larger objects you don't want to copy multiple times in the copper list or some structures).

It would be awesome if the creation of those pools were centralized and a callback to the monitoring API were added, so we could see something like this on a new pane in cu-consolemon:

x MB preallocated
x MB in use
x memory handles in flight
x/s memory handles creation
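
As a rough illustration of the four numbers above, here is a minimal sketch of counters a pool could keep and a snapshot the monitoring pane could read. `PoolCounters`, `PoolSnapshot`, `on_acquire`, `on_release` and `handles_per_second` are hypothetical names made up for this example, not anything that exists in core/pool.rs, and the sketch assumes every pool holds fixed-size buffers.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};

const BYTES_PER_MB: f64 = 1024.0 * 1024.0;

/// Hypothetical per-pool counters, updated on every acquire/release.
pub struct PoolCounters {
    buffer_size: usize,
    buffer_count: usize,
    in_flight: AtomicUsize,
    handles_created: AtomicU64,
}

/// Point-in-time view that a monitoring pane could display.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PoolSnapshot {
    pub preallocated_mb: f64,
    pub in_use_mb: f64,
    pub handles_in_flight: usize,
    pub handles_created_total: u64,
}

impl PoolCounters {
    pub fn new(buffer_size: usize, buffer_count: usize) -> Self {
        Self {
            buffer_size,
            buffer_count,
            in_flight: AtomicUsize::new(0),
            handles_created: AtomicU64::new(0),
        }
    }

    /// Called when a handle is handed out by the pool.
    pub fn on_acquire(&self) {
        self.in_flight.fetch_add(1, Ordering::Relaxed);
        self.handles_created.fetch_add(1, Ordering::Relaxed);
    }

    /// Called when a handle returns to the pool.
    /// Assumes calls are balanced with `on_acquire`.
    pub fn on_release(&self) {
        self.in_flight.fetch_sub(1, Ordering::Relaxed);
    }

    pub fn snapshot(&self) -> PoolSnapshot {
        let in_flight = self.in_flight.load(Ordering::Relaxed);
        PoolSnapshot {
            preallocated_mb: (self.buffer_size * self.buffer_count) as f64 / BYTES_PER_MB,
            in_use_mb: (self.buffer_size * in_flight) as f64 / BYTES_PER_MB,
            handles_in_flight: in_flight,
            handles_created_total: self.handles_created.load(Ordering::Relaxed),
        }
    }
}

/// Creation rate derived from two samples of the running total.
pub fn handles_per_second(prev_total: u64, now_total: u64, elapsed_secs: f64) -> f64 {
    if elapsed_secs <= 0.0 {
        return 0.0;
    }
    now_total.saturating_sub(prev_total) as f64 / elapsed_secs
}
```

Keeping only a running total in the pool and deriving the x/s rate on the monitoring side is one possible split; it keeps the hot path down to two relaxed atomic increments.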

gbin added the enhancement and good first issue labels Dec 31, 2024
makeecat self-assigned this Jan 5, 2025
@makeecat
Collaborator

makeecat commented Jan 5, 2025

@gbin Can you elaborate on the meaning of "x/s memory handles creation"?

@gbin
Collaborator Author

gbin commented Jan 5, 2025

When you look at the system, tasks using a pool might just "give one back to the pool" then immediately "get a new one and queue it in some asynchronous thing like DMA and whatnot", as a camera driver might do. But users will be very crafty with this: other tasks might burst and use those handles as temporary storage, to store computed intermediates, as a cache, etc. The idea is to get some metrics that show, live, how those buffers rotate, so users can understand why a pool becomes full. The handle creation rate per second (x/s) might not be enough to show that (which is my goal). Maybe the mean age of the buffers could be a good one?
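
To make the "mean age" idea concrete, here is a minimal sketch under stated assumptions: each handle gets an id at checkout, the checkout time is recorded, and the mean age is the average time the currently checked-out handles have been out. `RotationTracker` and its methods are hypothetical names for illustration only; time is passed in explicitly as a `Duration` since some arbitrary start so the logic stays deterministic.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical tracker of when each in-flight handle was checked out.
pub struct RotationTracker {
    checked_out: HashMap<u64, Duration>,
    next_id: u64,
}

impl RotationTracker {
    pub fn new() -> Self {
        Self {
            checked_out: HashMap::new(),
            next_id: 0,
        }
    }

    /// Records a checkout at time `now` and returns the handle id.
    pub fn acquire(&mut self, now: Duration) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.checked_out.insert(id, now);
        id
    }

    /// Marks a handle as returned; false if the id was not in flight.
    pub fn release(&mut self, id: u64) -> bool {
        self.checked_out.remove(&id).is_some()
    }

    /// Mean time the in-flight handles have been out, or None if there are none.
    pub fn mean_age(&self, now: Duration) -> Option<Duration> {
        if self.checked_out.is_empty() {
            return None;
        }
        let total: Duration = self
            .checked_out
            .values()
            .map(|checkout| now.saturating_sub(*checkout))
            .sum();
        Some(total / self.checked_out.len() as u32)
    }
}
```

A camera-driver style rotation would keep this number small and steady, while a task hoarding handles as a cache would show it climbing, which is the kind of difference the creation rate alone may not reveal.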

@gbin
Collaborator Author

gbin commented Jan 5, 2025

Note: I am still wondering where the API for the central creation of those pools should live. Maybe we should have some kind of global factory that tasks kindly ask to create a pool. Maybe it goes in the RON file, with an ID plus the preallocation parameters, and the tasks can pick that up... but then it makes the thing even more "frameworky"... 🤔

@makeecat
Collaborator

makeecat commented Jan 5, 2025

I will start by implementing the following items as a first iteration; we can keep the discussion open.

x MB preallocated
x MB in use
x memory handles in flight
x/s memory handles created in the last second

@gbin
Collaborator Author

gbin commented Jan 5, 2025

As a first PR, I think we need the factory first, to be able to connect all the ends... Maybe we can start with a singleton, so that the runtime and the tasks can, respectively, start a pool and pipe the state to the monitoring component?

@makeecat
Collaborator

makeecat commented Jan 5, 2025

Do you mean implementing a factory that is in charge of memory allocation outside of the runtime, so the runtime and tasks can ask it for memory, instead of implementing it in pool.rs? Can you provide more instructions?

@gbin
Collaborator Author

gbin commented Jan 5, 2025

So as of today, we create the pool directly in a task then share the handles through the copper messages:
https://github.com/copper-project/copper-rs/blob/master/components/sources/cu_v4l/src/v4lstream.rs#L37

This is a straight new() called from a task when it is created.

I propose to keep a static reference to a factory, something like:
static POOLS

and have an API like POOLS.create(name, type, buf_size, buf_count, alignment)

with name a symbolic str that you can display from the monitoring (e.g. "v4l images" or "Nvidia GPU#1 point clouds"),
and type an enum for the various kinds of memory the runtime might want: Host, Numa(id), Cuda ...

I started to work on the traits for the handles / buffer here to give you an idea of what the API might serve:
#201

then we can add stuff the monitoring component can get:

POOLS.getall() -> Some Arc of pools
and on the pool traits we can add the methods to query the state of the pool, so the monitoring component can list them with their statistics?
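
A minimal sketch of what such a static factory could look like, assuming a plain `Mutex<Vec<...>>` behind a `static`. Everything here is hypothetical and for illustration only: `PoolRegistry`, `PoolInfo`, `MemoryType` and `get_all` are invented names, the entries are descriptors only (no memory is actually allocated), and the real handle / buffer traits from #201 are not modeled.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical memory kinds, following the Host / Numa(id) / Cuda idea above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemoryType {
    Host,
    Numa(u32),
    Cuda,
}

/// Descriptor of one pool; a real pool would also own its buffers.
#[derive(Debug)]
pub struct PoolInfo {
    pub name: String,
    pub mem_type: MemoryType,
    pub buf_size: usize,
    pub buf_count: usize,
    pub alignment: usize,
}

impl PoolInfo {
    pub fn preallocated_bytes(&self) -> usize {
        self.buf_size * self.buf_count
    }
}

/// Hypothetical central registry that both tasks and monitoring can reach.
pub struct PoolRegistry {
    pools: Mutex<Vec<Arc<PoolInfo>>>,
}

impl PoolRegistry {
    pub const fn new() -> Self {
        Self {
            pools: Mutex::new(Vec::new()),
        }
    }

    /// Registers a pool descriptor and returns a shared reference to it.
    pub fn create(
        &self,
        name: &str,
        mem_type: MemoryType,
        buf_size: usize,
        buf_count: usize,
        alignment: usize,
    ) -> Arc<PoolInfo> {
        let pool = Arc::new(PoolInfo {
            name: name.to_string(),
            mem_type,
            buf_size,
            buf_count,
            alignment,
        });
        self.pools.lock().unwrap().push(Arc::clone(&pool));
        pool
    }

    /// Returns every registered pool, e.g. for a monitoring pane to list.
    pub fn get_all(&self) -> Vec<Arc<PoolInfo>> {
        self.pools.lock().unwrap().clone()
    }
}

/// The singleton; `Mutex::new` is const, so no lazy initialization is needed.
pub static POOLS: PoolRegistry = PoolRegistry::new();
```

With something like this, a task would call `POOLS.create("v4l images", MemoryType::Host, buf_size, buf_count, alignment)` when it is created, and the monitoring component would iterate over `POOLS.get_all()` on each refresh.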

@gbin
Collaborator Author

gbin commented Jan 5, 2025

If it is a little too deep, I can finish the APIs and work on the pools, and you can start working on the panel in cu-consolemon plus the matching API in monitoring.rs, and we can meet somewhere in the middle?

gbin removed the good first issue label Jan 16, 2025
@gbin
Collaborator Author

gbin commented Jan 16, 2025

Removed the good first issue label: I have been in those buffers for days, and it is not a simple endeavor :)

@gbin
Collaborator Author

gbin commented Jan 22, 2025

Just prior to the release, I left a stub for this.
