
Support New Fill In The Middle API for Ollama #13

Open
rwebb-fundrise opened this issue Aug 29, 2024 · 18 comments
Labels: enhancement (New feature or request)

Comments

rwebb-fundrise commented Aug 29, 2024

The Ollama project recently merged this PR, which adds support for fill-in-the-middle completions via the existing generate endpoint. Would love to see this supported in Custom Suggestion Service as well.
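
For reference, the new capability is just a `suffix` field on the existing generate request. A minimal sketch of what a call might look like (the model name is only an example, and the field names follow the Ollama /api/generate docs):

```python
import requests

# Sketch of the new fill-in-the-middle request: the `suffix` field is
# what the PR adds to the existing /api/generate endpoint.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "codellama:7b-code",   # example model; it must support insertion
        "prompt": "def fib(n):",        # code before the cursor
        "suffix": "    return result",  # code after the cursor
        "stream": False,
    },
)
print(response.json()["response"])      # the generated middle part
```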

intitni added the enhancement label on Aug 30, 2024
intitni mentioned this issue on Sep 13, 2024
intitni (Owner) commented Sep 13, 2024

Do you know which models support the new API? I have tried several models, but all of them complain that the model does not support insert.

jmitek commented Sep 17, 2024

This would be really useful. I know that the Continue plugin for VS Code works perfectly with Ollama/codellama/starcoder2. Maybe that would be a starting point?

intitni (Owner) commented Sep 17, 2024

@jmitek This model does support the new API, but the result is super weird and worse than using the fill-in-the-middle strategy and filling in the template manually. Are you using the 3b or the 7b model?

(Note: you can already use FIM-supported models via the completion API as long as you know the template.)
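
For illustration, a rough sketch of what filling the template manually can look like, using the codellama-style template (`build_fim_prompt` is a hypothetical helper, not part of the app):

```python
import requests

# Hypothetical helper: fill the codellama-style FIM template by hand
# and send the result as an ordinary completion prompt.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"

prompt = build_fim_prompt(
    prefix="def fib(n):\n    if n <= 1:\n        return n\n",
    suffix="    else:\n        return fib(n - 2) + fib(n - 1)",
)
response = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "codellama:7b-code", "prompt": prompt, "stream": False},
)
print(response.json()["response"])
```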

jmitek commented Sep 17, 2024

@intitni I'm mostly using codellama:13b, and also starcoder2:15b. I've no idea what the template might look like for them, though.

intitni (Owner) commented Sep 17, 2024

@jmitek The default one is for codellama. The starcoder one looks like

<fim-prefix>def fib(n):<fim-suffix>    else:\n        return fib(n - 2) + fib(n - 1)<fim-middle>

jmitek commented Sep 17, 2024

Interesting. So I connected it to codellama; here are my settings:
[screenshot]

This is the kind of completion I get in Xcode:
[screenshot]

Using "Default" has similar results. I notice that it is using the /chat api, though I would have expected /generate api instead?

But if I choose the "Continue" one, it looks a bit more reasonable, though it seems to duplicate the preceding lines of code. I haven't tried starcoder2 yet.

intitni (Owner) commented Sep 17, 2024

@jmitek Please change the model to use the completion API and set it up again. Imported models are treated as using the chat completions API. You may also need the -code variant.

[screenshot]

jmitek commented Sep 17, 2024

@intitni Thanks. I made it use /generate again and filled in the template exactly as you have it (the default one I had was different). I can see it is using /generate now, and the suggestions seem okay, except they still have the duplicated code. See this example:
Before: [screenshot]

After: [screenshot]

intitni (Owner) commented Sep 17, 2024

@jmitek It works fine on my Mac, though. What are your settings again?

[screenshot]

intitni (Owner) commented Sep 17, 2024

> filled in the template exactly as you have it (the default one I had was different)

Oh, the default one is actually correct; I was testing starcoder2 when I made the screenshot.

jmitek commented Sep 17, 2024

Here are my updated settings:
[screenshot]

I see that you are using codellama:7b-code; I'm using the non-"-code" version, if that makes any difference?

intitni (Owner) commented Sep 17, 2024

@jmitek You need to reset the template to the default one.

intitni (Owner) commented Sep 17, 2024

I don't know much about the -code suffix; I found it in the documentation (https://ollama.com/library/codellama), in the Fill-in-the-middle section.

jmitek commented Sep 17, 2024

Awesome! I pulled codellama:13b-code and used this template from the Ollama site (https://ollama.com/library/codellama:13b-code):
<PRE> {prefix} <SUF>{suffix} <MID>
which is the same as the default one in the app.
So it seems to break with the non "-code" version - although the Continue plugin is somehow able to use the non "-code" version...
Anyway, it works perfectly :)

intitni (Owner) commented Sep 17, 2024

Maybe they are using their own prompt strategy. Can you see what prompt they are sending to Ollama?

sonofsky2010 commented

I have tried the starcoder2 model with Ollama. The completion result is different from Continue's. I found that Continue's request has a 'raw' parameter which is set to true. I changed the source code to add the 'raw' parameter to the request, and then the completion is the same as Continue's.
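
Roughly, the change amounts to something like this (a sketch assuming starcoder2-style FIM tags; with raw set to true, Ollama skips the model's own prompt template, so the client supplies the tags itself):

```python
import requests

# Sketch: "raw": true bypasses the model's prompt template, so the
# FIM tags are written into the prompt directly (tags are illustrative).
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "starcoder2:7b",
        "prompt": "<fim_prefix>def fib(n):<fim_suffix>    return result<fim_middle>",
        "raw": True,
        "stream": False,
    },
)
print(response.json()["response"])
```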

intitni (Owner) commented Sep 24, 2024

@sonofsky2010 Hi, I have tried the raw parameter, but I am still getting weird output from starcoder2:7b (there are template tags in the output). Do you have the complete request body the Continue plugin sent?

sonofsky2010 commented

@intitni I checked Continue's request. I think it may be caused by the wrong 'stop' parameters. Right now it sends 'stop' as an empty list, which may override the stop parameters that Ollama reads from the model's parameters.
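
If that is the cause, a possible fix would be to pass explicit stop sequences instead of an empty list, e.g. (a sketch; the token names are starcoder2-style and only illustrative):

```python
import requests

# Sketch: supply explicit stop sequences so the model's FIM end/EOS
# tokens are not overridden by an empty "stop" list.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "starcoder2:7b",
        "prompt": "<fim_prefix>def fib(n):<fim_suffix>    return result<fim_middle>",
        "raw": True,
        "stream": False,
        "options": {
            "stop": ["<fim_prefix>", "<fim_suffix>", "<fim_middle>", "<|endoftext|>"],
        },
    },
)
print(response.json()["response"])
```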
