[FEATURE]: Browser Automation using AI #831

Open
surapuramakhil opened this issue Nov 13, 2024 · 19 comments
Assignees
Labels
enhancement New feature or request

Comments

@surapuramakhil
Contributor

Feature summary

As of now, AI is used for generating answers to the questions.

Feature description

Proposal: use AI as a browser agent that views and acts directly on the UI to perform the job application.

Motivation

No response

Alternatives considered

No response

Additional context

No response

@surapuramakhil surapuramakhil added the enhancement New feature or request label Nov 13, 2024
@cjbbb
Contributor

cjbbb commented Nov 13, 2024

After using this, can we apply for jobs on other job websites?

@surapuramakhil
Contributor Author

After using this, can we apply for jobs on other job websites?

The catch is that it's tough to implement 😅
@49Simon, could you comment on feasibility? Is there existing library support? I guess we would send a feed of the browser UI to the AI.

@49Simon
Collaborator

49Simon commented Nov 13, 2024

This would be something similar to Claude's Computer Use. It's early in its development phase, but by using tool use and vision models, it iteratively sends screenshots back to the model until the task is complete.
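The screenshot loop described above can be sketched roughly as follows. This is a minimal illustration, not Claude's Computer Use API: `Action`, `FakeBrowser`, `run_agent`, and `scripted_model` are all hypothetical names, and the "model" is a scripted stub standing in for a real vision model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    """One step chosen by the model (hypothetical schema)."""
    kind: str          # "click", "type", or "done"
    target: str = ""   # element the action applies to
    text: str = ""     # text to type, if any


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a real browser driver."""
    log: List[Action] = field(default_factory=list)

    def screenshot(self) -> bytes:
        # A real driver would return image bytes of the current page.
        return f"screen after {len(self.log)} actions".encode()

    def perform(self, action: Action) -> None:
        # A real driver would click or type; here we only record the action.
        self.log.append(action)


def run_agent(
    browser: FakeBrowser,
    choose_action: Callable[[str, bytes], Action],
    goal: str,
    max_steps: int = 10,
) -> bool:
    """Send a screenshot to the model, perform its action, and repeat.

    Returns True if the model reports "done" within max_steps.
    """
    for _ in range(max_steps):
        action = choose_action(goal, browser.screenshot())
        if action.kind == "done":
            return True
        browser.perform(action)
    return False


def scripted_model(goal: str, screenshot: bytes) -> Action:
    """Stub model: clicks "Apply" on the first screen, then reports done."""
    if screenshot == b"screen after 0 actions":
        return Action("click", target="Apply")
    return Action("done")


if __name__ == "__main__":
    browser = FakeBrowser()
    finished = run_agent(browser, scripted_model, "apply to the job")
    print(finished, [a.target for a in browser.log])
```

The `max_steps` cap is a design choice for the sketch: without it, a model that never reports completion would loop forever.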

@feder-cr
Owner

@surapuramakhil @cjbbb at the moment these tools do not work correctly; https://github.com/sentient-engineering/jobber isn't good either, but you can try it if you want!

@cjbbb
Contributor

cjbbb commented Nov 20, 2024

@surapuramakhil @cjbbb at the moment these tools do not work correctly; https://github.com/sentient-engineering/jobber isn't good either, but you can try it if you want!

I will check the link

@surapuramakhil
Contributor Author

Another tool/library, browserpilot, does the same thing: https://github.com/handrew/browserpilot. Once it has matured, we can start using it in this project.

@jfelten

jfelten commented Nov 22, 2024

I am interested in pitching in. I went through the AIHawk code and have some ideas on how to use the existing classes to make a flexible architecture that allows for other sites. By this, I mean extending the current job_manager and LinkedIn_easy_applier classes to work for other sites like Amazon. I am the one who brought up browserpilot and would want to see if I could make it work. I wanted to see if anyone else has proposed similar ideas.
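One way to picture the extensible architecture proposed above is a shared applier interface with one subclass per site. The sketch below is an assumption for illustration only: `BaseApplier`, `LinkedInApplier`, `AmazonApplier`, and `JobManager` are hypothetical names and do not reflect the actual AIHawk classes or their methods.

```python
from abc import ABC, abstractmethod
from typing import Iterable, Optional
from urllib.parse import urlparse


class BaseApplier(ABC):
    """Hypothetical site-agnostic interface, not a class from the AIHawk code."""

    site_name = ""
    domain = ""

    def can_handle(self, job_url: str) -> bool:
        # Match on the URL's hostname so each site only claims its own jobs.
        host = urlparse(job_url).hostname or ""
        return host == self.domain or host.endswith("." + self.domain)

    @abstractmethod
    def apply(self, job_url: str, profile: dict) -> bool:
        """Fill in and submit the application; return True on success."""


class LinkedInApplier(BaseApplier):
    site_name = "linkedin"
    domain = "linkedin.com"

    def apply(self, job_url: str, profile: dict) -> bool:
        # Placeholder: real logic would drive the browser here.
        return True


class AmazonApplier(BaseApplier):
    site_name = "amazon"
    domain = "amazon.jobs"  # illustrative domain for the sketch

    def apply(self, job_url: str, profile: dict) -> bool:
        # Placeholder: real logic would drive the browser here.
        return True


class JobManager:
    """Routes each job URL to the first applier that can handle it."""

    def __init__(self, appliers: Iterable[BaseApplier]) -> None:
        self.appliers = list(appliers)

    def apply_to(self, job_url: str, profile: dict) -> Optional[str]:
        for applier in self.appliers:
            if applier.can_handle(job_url):
                return applier.site_name if applier.apply(job_url, profile) else None
        return None  # no registered applier supports this site


if __name__ == "__main__":
    manager = JobManager([LinkedInApplier(), AmazonApplier()])
    print(manager.apply_to("https://www.linkedin.com/jobs/view/1", {}))
```

With this shape, supporting a new site means adding one subclass and registering it, without touching the manager.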

@surapuramakhil
Contributor Author

would want to see if I could make it work.

Appreciate that - Assigned this to you.

I wanted to see if anyone else has proposed similar ideas.

That was me, in #401; I found that it was never reopened.

If you wish, I can also assign #401 to you. @jfelten

@surapuramakhil
Contributor Author

Another project: https://github.com/gregpr07/browser-use

@feder-cr feder-cr pinned this issue Nov 23, 2024
@feder-cr
Owner

feder-cr commented Nov 23, 2024

@jfelten @sarob @cjbbb also try to contact/work with @gregpr07

@feder-cr
Owner

@surapuramakhil
Contributor Author

@jfelten https://github.com/Skyvern-AI/skyvern

This one seems to be quite mature.

@sarob
Collaborator

sarob commented Nov 24, 2024 via email

@surapuramakhil
Contributor Author

Looks very promising. Good find!

@sarob what are your thoughts on importing libraries/modules that are under AGPL? Would their strong copyleft restrict our ability to update our own license in the future?

Another concern is that it's an end-to-end system; I hope they are open to exposing it as a library, something like LangChain (which is a library) or any other library.

@sarob
Collaborator

sarob commented Nov 26, 2024

AGPL supersedes GPL and MIT if their code is upstream. They have a generic job application capability already. browser-use is MIT, but less developed.

@suchintan

Maintainer of https://github.com/Skyvern-AI/Skyvern here -- would love to see this integrated (thanks for messaging me @feder-cr!)

Skyvern is an API-first product, so the easiest way to get integrated would be to use our API directly:

This works on most job boards. You just need to change the payload (i.e. the user's information) and the job URL. When changing the payload, we just need to make sure Skyvern is "oversupplied" with information so it can pull from it as needed.

# The URL can also point to the self-hosted version.
# Replace xxx with your own API key, and set webhook_callback_url to your own webhook.
curl 'https://api.skyvern.com/api/v1/tasks' -X POST \
  -H "Content-Type: application/json" \
  -H "x-api-key: xxx" \
  --data-binary '
  {
   "url":"https://jobs.lever.co/leverdemo-8/45d39614-464a-4b62-a5cd-8683ce4fb80a/apply",
   "webhook_callback_url":null,
   "navigation_goal":"Fill out the job application form and apply to the job. Fill out any public burden questions if they appear in the form. Your goal is complete when the page says you'\''ve successfully applied to the job. Terminate if you are unable to apply successfully.",
   "proxy_location":"RESIDENTIAL",
   "navigation_payload":"{\n  \"name\": \"John Doe\",\n  \"email\": \"[email protected]\",\n  \"phone\": \"6421440771\",\n  \"resume_url\": \"https://writing.colostate.edu/guides/documents/resume/functionalSample.pdf\",\n  \"cover_letter\": \"Generate a compelling cover letter for me\"\n}"
  }
'
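For a Python project, the same request could be assembled roughly like this. This is a sketch under assumptions: the endpoint, header name, and field names are copied from the curl command above, while `build_task_request`, its parameters, the shortened goal text, and the placeholder email are hypothetical. The function only builds the request and does not send it.

```python
import json
import urllib.request
from typing import Optional

# Endpoint taken from the curl example; a self-hosted URL could be used instead.
SKYVERN_TASKS_URL = "https://api.skyvern.com/api/v1/tasks"


def build_task_request(
    job_url: str,
    applicant: dict,
    api_key: str,
    webhook_url: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    body = {
        "url": job_url,
        "webhook_callback_url": webhook_url,
        # Shortened goal text for the sketch; the curl example uses a longer one.
        "navigation_goal": "Fill out the job application form and apply to the job.",
        "proxy_location": "RESIDENTIAL",
        # As in the curl example, the payload is a JSON document encoded as a string.
        "navigation_payload": json.dumps(applicant, indent=2),
    }
    return urllib.request.Request(
        SKYVERN_TASKS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


if __name__ == "__main__":
    # "Oversupply" the applicant data so the agent can pick what each form needs.
    applicant = {
        "name": "John Doe",
        "email": "john.doe@example.com",  # placeholder address
        "phone": "6421440771",
        "cover_letter": "Generate a compelling cover letter for me",
    }
    request = build_task_request(
        "https://jobs.lever.co/leverdemo-8/45d39614-464a-4b62-a5cd-8683ce4fb80a/apply",
        applicant,
        api_key="xxx",
    )
    print(request.get_method(), request.full_url)
    # To actually submit: urllib.request.urlopen(request)
```

Only `job_url` and `applicant` change between applications, which matches the description of swapping the payload and job URL per job.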

@surapuramakhil
Contributor Author

Maintainer of https://github.com/Skyvern-AI/Skyvern here -- would love to see this integrated (thanks for messaging me @feder-cr!)

Awesome! If you are on Discord, can you join our Discord community https://discord.gg/mMZcMTH9K6 and ping me once you are there? We might need much more flexibility. Glad to see both projects working together.

@sarob
Collaborator

sarob commented Nov 28, 2024

Picking this up.

#967 is working on a fix with a similar purpose.

@sameelarif

Would definitely check out Stagehand for this: https://github.com/browserbase/stagehand

9 participants