Image to image agent #1628

Open
wants to merge 4 commits into
base: main

Conversation

anuragts
Contributor

Description

Image to Image agent that uses Fal tools.

Type of change

Please check the options that are relevant:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Model update
  • Infrastructure change

Checklist

  • My code follows Phidata's style guidelines and best practices
  • I have performed a self-review of my code
  • I have added docstrings and comments for complex logic
  • My changes generate no new warnings or errors
  • I have added cookbook examples for my new addition (if needed)
  • I have updated requirements.txt/pyproject.toml (if needed)
  • I have verified my changes in a clean environment

Additional Notes

Include any deployment notes, performance implications, or other relevant information:

    ):
        super().__init__(name="fal")

        self.api_key = api_key or getenv("FAL_KEY")
        if not self.api_key:
            logger.error("FAL_KEY not set. Please set the FAL_KEY environment variable.")
        self.model = model
        self.image_url = image_url
Contributor

Why isn't image_url a parameter of the image_to_image function?

Contributor

Yeah, I don't think it makes sense in the constructor. Rather, give it to the main agent in the prompt.
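
To illustrate the suggestion, here is a minimal sketch of a toolkit whose constructor keeps only the API key and model, while the source image URL is passed per call. `FalToolsSketch`, `build_arguments`, the placeholder model name, and the argument keys are all hypothetical stand-ins for illustration, not the code in this PR or the Fal client API.

```python
from os import getenv
from typing import Dict, Optional


class FalToolsSketch:
    """Hypothetical stand-in for the toolkit in this PR; not the real class."""

    def __init__(self, api_key: Optional[str] = None, model: str = "example/image-to-image"):
        # Same key lookup as the reviewed constructor, but no self.image_url.
        self.api_key = api_key or getenv("FAL_KEY")
        self.model = model  # placeholder model name (assumption)

    def build_arguments(self, prompt: str, image_url: Optional[str] = None) -> Dict[str, str]:
        """Collect per-call inputs; the source image arrives with each call."""
        if not image_url:
            raise ValueError("image_url is required for image-to-image generation")
        # Key names are assumed for illustration only.
        return {"prompt": prompt, "image_url": image_url}
```

With this shape, one toolkit instance can serve requests for different source images, because nothing image-specific is stored on the instance.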

Comment on lines +114 to +120
media_id = str(uuid4())
agent.add_image(
    Image(
        id=media_id,
        url=url,
    )
)
Contributor

Why do this?

Contributor

This is how we set the images on the agent.
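
As a rough illustration of that pattern, the sketch below records a generated image on an agent under a fresh UUID. The `Image` and `Agent` classes here are simplified stand-ins written for this example, and `record_generated_image` is a hypothetical helper; the real Phidata classes may have more fields and different behavior.

```python
from dataclasses import dataclass, field
from typing import List
from uuid import uuid4


@dataclass
class Image:
    """Simplified stand-in for the image model used in the PR."""
    id: str
    url: str


@dataclass
class Agent:
    """Simplified stand-in that only keeps a list of attached images."""
    images: List[Image] = field(default_factory=list)

    def add_image(self, image: Image) -> None:
        self.images.append(image)


def record_generated_image(agent: Agent, url: str) -> str:
    """Attach a generated image to the agent and return its new id."""
    media_id = str(uuid4())  # unique id so the image can be referenced later
    agent.add_image(Image(id=media_id, url=url))
    return media_id
```

Attaching the image to the agent, rather than only returning the URL as text, presumably lets the caller retrieve generated media from the agent after the run.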

    ],
)

agent.print_response("a cat dressed as a wizard with a background of a mystic forest", stream=True)
Contributor

I think you should provide the URL in this prompt, for example: "a cat dressed as a wizard with a background of a mystic forest. Make it look like https://fal.media/files/koala/Chls9L2ZnvuipUTEwlnJC.png"
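
A small sketch of what that could look like in the cookbook example, assuming the agent forwards the URL it finds in the prompt to the tool. The variable names are illustrative, and the final call is left as a comment because it needs the agent defined earlier in the example.

```python
# Source image taken from the review comment above.
source_image_url = "https://fal.media/files/koala/Chls9L2ZnvuipUTEwlnJC.png"

# Put the source image URL directly in the user prompt instead of the constructor.
prompt = (
    "a cat dressed as a wizard with a background of a mystic forest. "
    f"Make it look like {source_image_url}"
)

# In the cookbook this would replace the original call:
# agent.print_response(prompt, stream=True)
```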


    def image_to_image(self, agent: Agent, prompt: str, image_url: Optional[str] = None) -> str:
        """
        Use this function to generate an image from a given image using the Fal AI API.
Contributor

What do you mean by "from a given image"? Maybe give a more detailed explanation and a link to their docs, please.
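
One possible shape for a more detailed docstring is sketched below. The wording is an assumption about what image-to-image means here (a source image plus a prompt describing the desired change), the function is a stub without the `agent` parameter, and the docs link is left out because the thread does not give one.

```python
from typing import Optional


def image_to_image(prompt: str, image_url: Optional[str] = None) -> str:
    """Generate a new image from an existing source image and a text prompt.

    The image at `image_url` is used as the starting point, and `prompt`
    describes how the result should look. (Assumed behavior; confirm against
    the Fal documentation for the selected model and link it here.)

    Args:
        prompt: Text description of the desired output image.
        image_url: Publicly reachable URL of the source image.

    Returns:
        The URL of the generated image, or an error message.
    """
    raise NotImplementedError("Docstring sketch only; the API call is omitted.")
```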

3 participants