Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: use human friendly monitor names #1272

Conversation

ologbonowiwi
Copy link
Contributor

@ologbonowiwi ologbonowiwi commented Feb 4, 2025

/claim #1245

image

Copy link

vercel bot commented Feb 4, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
screenpipe ✅ Ready (Inspect) Visit Preview 💬 Add feedback Feb 7, 2025 7:40pm

@louis030195
Copy link
Collaborator

current impl is awkward

just start screenpipe and look what gets in ocr_text app_name, it captures only background window for some reason

@ologbonowiwi
Copy link
Contributor Author

opened mediar-ai/xcap#3 to bump our xcap fork to version 0.3.1 (the one with human friendly names), but added the same stuff that we had before to set if something was focused or not.

to test this on your screenpipe, change your screenpipe-vision/Cargo.toml:92 to:

[target.'cfg(target_os = "windows")'.dependencies]
windows = { version = "0.58", features = [
  "Graphics_Imaging",
  "Media_Ocr",
  "Storage",
  "Storage_Streams",
] }
xcap = "0.3.1"

[target.'cfg(target_os = "macos")'.dependencies]
libc = "=0.2.164"
xcap-macos = { package = "xcap", git = "https://github.com/ologbonowiwi/xcap", rev = "af2b896" }
cidre = { git = "https://github.com/yury/cidre.git", version = "0.5.0" }

[target.'cfg(target_os = "linux")'.dependencies]
libc = "=0.2.164"
xcap = "0.3.1"

This will install 0.3.1 on the app with the custom is_focused logic we have on the fork. If it works we may close this PR and only change the screenpipe-vision/Cargo.toml to update dependencies, and leave the usage of xcap_macos as it is.

@louis030195
Copy link
Collaborator

any idea whats the behaviour of this on windows?

@ologbonowiwi
Copy link
Contributor Author

not sure, but I guess doesn't change much from what we currently have, since the fork is used only on Mac

@louis030195
Copy link
Collaborator

Screen.Recording.2025-02-05.at.5.34.38.PM.mov

its only recording background window?

@ologbonowiwi
Copy link
Contributor Author

how did you built the app?

with the xcap version on this PR or with the one I sent on my other comment?

@ologbonowiwi ologbonowiwi marked this pull request as draft February 6, 2025 13:50
@ologbonowiwi
Copy link
Contributor Author

@louis030195 can you give some clear instructions on how to test this?

I checked out to main but everything is focused=1. The tables looks gibberish to me.
Please give some instructions on what is the current behavior which is expected, and what we need, so I can make it happen.

Btw, changing this to draft since you said we need the custom focus thing from the xcap fork.

ps: please test mediar-ai/xcap#3 to see if it's what expected. if so we can just merge it and update here.

@ologbonowiwi
Copy link
Contributor Author

Running cargo run --examples window on xcap (current version of the fork), here's what I get:
image

vs the version I updated:
image

@louis030195
Copy link
Collaborator

louis030195 commented Feb 6, 2025

currently we:

by default capture only focused windows (eg mouse within window)

have an option to capture also unfocused windows (eg windows where mouse not in) (useful if you are running "agent pipe" like linkedin and want to keep track of what the agent is doing, or for example if we implement a scrapping agent for example that goes to your X and profile and look at your feed to extract market insights from the agent's screen, while does not require any auth token since you're already logged in unlike cloud scrapping stuff)

we store the focus status in the db (atm always 1 because we capture only focused windows)

i have no idea whether windows and linux have same UX than macos

what we should do:

  • be able to capture only focused windows
  • be able to capture only unfocused windows
  • be able to capture both
  • be able to filter data by focused / unfocused through SQL and thus /search API
  • works on all OS, same behaviour

i'd be happy to put a bounty on this, mostly some db migration and plumbing on the search queries and server.rs thing - can change search pipe UI later if necessary

@ologbonowiwi
Copy link
Contributor Author

ologbonowiwi commented Feb 7, 2025

I believe Linux and Windows have different behaviors, as the official library even for mac behaviors differently.

Did you considered open an issue on official xcap asking for this feature?

I don't think it's a good long-term solution to have a fork with a different behavior, since we lost update to both new features and security patches

Ok, I should have tested before saying. After testing nashaofu/xcap master version, it focuses only on the window that is actually open.

I ran cargo run --example window on xcap (fork, my version mediar-ai/xcap#3 and the original xcap), I'll send the loom over discord for you, but here are the results:

  • mediar-ai/xcap and feat/#1245/fix focus on mac and linux xcap#3 behaves in the same way. showing multiple things as focused. the logic is that we get the mouse, and if the mouse is within the range of the window (it doesn't matter if it's open or not, or is what is clicked or not. the check of is focused is solely based in the fact of the mouse coordinates being within the range the window is rendered on the computer)
  • nashaofu/xcap shows only 1 window as focused (the actual one opened. I tested going from terminal to browser and it successfully sees what is open. that's done through pid check rather than mouse id point. is far more accurate).

if what we need is to capture only the open window, we should go for nashaofu/xcap, a.k.a original xcap 0.3.1.

mediar-ai/xcap#3 or the master version of the fork (commit 965bc99), both of them have the same logic, thus shows more than one window focused. considering

let is_focused = window.current_monitor().id() == monitor.id() && window.is_focused();

and
let is_focused = window.current_monitor().id() == monitor.id() && !window.is_minimized();
the behavior is totally different. when capture_unfocused_window is false, on Mac, it'll check against the id and then window.is_focused. while on windows and linux checks only !window.is_minimized (not considering the monitor id which is the same condition in both cases)

as recording everything as much as is not minimized is the behavior for linux and windows, and I removed the different is_focused condition for mac (which is even non deterministic since the is_focused is based on where the mouse is rather than if the app is actually focused or not).

if that's not the intended behavior, let me know and please give clear steps on how do you want to proceed. a few alternatives:

  1. use the version of this PR which records everything as much as is not minimized
  2. review and merge feat/#1245/fix focus on mac and linux xcap#3 and I can change this PR or open a new one to keep the old behavior from mac (which I do not recommend since is non-deterministic)
  3. go for a totally different path, which I'd need clear instructions how do you want to proceed so I won't waste time doing this that are not the expected.

@ologbonowiwi
Copy link
Contributor Author

ps: additionally, discussions on focused or not, improve experience of pipes to search through focused/unfocused, set the app to record focused/unfocused/both, should be handled on other PR/issue imo. I think the discussion diverged a bit (and a lot of effort was done on my side outside the original scope of the issue in terms of going through the codebase and running a lot of tests both on screenpipe, original xcap and fork xcap to understand what's actually happening) from the original issue, which was about the monitor names being human friendly. I think we should focus on having the monitor name issue closed then we can discuss other edge cases and expand from there.

@louis030195
Copy link
Collaborator

louis030195 commented Feb 7, 2025

i agree to do just name for now,

i don't want to have to maintain xcap fork, we should upstream things somehow and depends on their main branch

can you make this PR rely on xcap main and maintain same behaviour but just add name

if not possible then open another PR for the focus use case first and we will do this PR later

@ologbonowiwi
Copy link
Contributor Author

Moved this to draft again.

Opened nashaofu/xcap#190 to tackle the focus issue

Copy link

vercel bot commented Feb 14, 2025

@ologbonowiwi is attempting to deploy a commit to the louis030195's projects Team on Vercel.

A member of the Team first needs to authorize it.

@ologbonowiwi ologbonowiwi marked this pull request as ready for review February 14, 2025 13:21
@ologbonowiwi
Copy link
Contributor Author

nashaofu/xcap#191 merged and this PR is ready for review again.

@louis030195
Copy link
Collaborator

@ologbonowiwi great work!

just to clarify this is related to focus and not the display name right? i don't see any new property for monitor name

@ologbonowiwi
Copy link
Contributor Author

both @louis030195. this fixes the focus but I didnt changed the public API on xcap, so we dont have to change anything. everything that used is_focused from xcap::Window will now have focus kinda "real time"

the experience should look the same as fork (in terms of having updated is_focused

@louis030195
Copy link
Collaborator

/approve

Copy link

algora-pbc bot commented Feb 15, 2025

@louis030195: The claim has been successfully added to reward-all. You can visit your dashboard to complete the payment.

@louis030195
Copy link
Collaborator

/tip $100 @ologbonowiwi thx for this

@louis030195 louis030195 merged commit 7f07416 into mediar-ai:main Feb 15, 2025
0 of 6 checks passed
Copy link

algora-pbc bot commented Feb 15, 2025

@ologbonowiwi: You just got a $100 tip! We'll notify you once it is processed.

@louis030195
Copy link
Collaborator

https://github.com/mediar-ai/screenpipe/actions/runs/13349168810/job/37283407374

Screenshot 2025-02-15 at 3 07 05 PM

i guess this was not tested on windows

@louis030195
Copy link
Collaborator

/tip $300 @ologbonowiwi

thanks again for the digging around screen capture!

Copy link

algora-pbc bot commented Feb 18, 2025

@ologbonowiwi: You just got a $300 tip! We'll notify you once it is processed.

Copy link

algora-pbc bot commented Feb 18, 2025

🎉🎈 @ologbonowiwi has been awarded $100! 🎈🎊

Copy link

algora-pbc bot commented Feb 18, 2025

🎉🎈 @ologbonowiwi has been awarded $300! 🎈🎊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants