Add ability to send images to the Assistant #416
Hi @andreibondarev, I noticed that the current version already supports sending images to LLMs. You just need to include the image within the message `content`:

```ruby
llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])

llm.chat(
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        {
          type: "image_url",
          image_url: {
            url: "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
          }
        }
      ]
    }
  ],
  model: "gpt-4o"
).completion
```

Other LLMs only support sending the image in Base64 format, but this must still be done within the message `content`.
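For the Base64 case, a minimal sketch using Anthropic's content format could look like the following. The `type: "image"` / `source` keys are Anthropic's API shape; whether `Langchain::LLM::Anthropic` forwards this content array unchanged, the model name, and the response accessor are assumptions that may vary by gem version.

```ruby
require "base64"
require "langchain"

llm = Langchain::LLM::Anthropic.new(api_key: ENV["ANTHROPIC_API_KEY"])

# Anthropic expects inline Base64 data rather than a remote URL,
# so read a local file and encode it.
image_data = Base64.strict_encode64(File.binread("boardwalk.jpg"))

response = llm.chat(
  messages: [
    {
      role: "user",
      content: [
        {
          type: "image",
          source: {
            type: "base64",
            media_type: "image/jpeg",
            data: image_data
          }
        },
        { type: "text", text: "What's in this image?" }
      ]
    }
  ],
  model: "claude-3-5-sonnet-20240620"
)

puts response.chat_completion # accessor name may differ by gem version
```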
Support for OpenAI was added with #799.
You have probably thought about this already, but it seems there are many cases to support. One solution is to have all these different parameters on the Assistant (a hypothetical sketch follows below). Sorry if I'm making this too confusing/complicated. There's also …
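For illustration only, the "one parameter per image source" idea could look something like this; every keyword name here is hypothetical and not part of the gem.

```ruby
require "base64"

# Hypothetical keyword arguments, one per image source type; in practice a
# caller would pass only one of them per message.
assistant.add_message(
  content: "What's in this image?",
  image_url: "https://example.com/photo.jpg",                      # remote URL (OpenAI-style)
  image_base64: Base64.strict_encode64(File.binread("photo.jpg")), # inline payload (Anthropic-style)
  image_file_uri: "gs://my-bucket/photo.jpg"                       # uploaded file URI (Google Gemini-style)
)
```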
You should be able to provide an `image_url` to the Assistant for the supported multi-modal LLMs.

Note: Some of the LLMs do not accept an image_url; instead they require a Base64-encoded payload (Anthropic) or a file URI uploaded to the cloud (Google Gemini). We need to figure out how to handle that.
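A rough usage sketch of the proposed interface, assuming the Assistant accepts an `image_url` keyword on `add_message`; the exact signature and which LLMs support it depend on the gem version.

```ruby
require "langchain"

llm = Langchain::LLM::OpenAI.new(api_key: ENV["OPENAI_API_KEY"])
assistant = Langchain::Assistant.new(llm: llm, instructions: "You are a helpful assistant.")

# Proposed: attach an image by URL alongside the user message.
assistant.add_message(
  content: "What's in this image?",
  image_url: "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
)
assistant.run
```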