Support New Fill In The Middle API for Ollama #13
Comments
Do you know which model supports the new API? I have tried several models, but all of them complain that the model does not support insert.
This would be really useful. I know that the Continue plugin for VS Code works perfectly with Ollama/codellama/starcoder2. Maybe that would be a starting point?
@jmitek This model does support the new API, but the result is super weird, and worse than using the fill-in-the-middle strategy and filling in the template manually. Are you using the 3b or the 7b model? (Note: you can already use FIM-supported models through the completion API as long as you know the template.)
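For example, a minimal sketch of driving a FIM-capable model through the plain completion endpoint with a hand-filled template (the CodeLlama tag format, model name, and snippet are assumptions; adjust for your setup):

```python
import json
import urllib.request

# Fill the CodeLlama-style FIM template by hand and send it as an ordinary
# completion request to a local Ollama server on the default port.
prefix = "def add(a, b):\n"
suffix = "    return result\n"
prompt = f"<PRE> {prefix} <SUF>{suffix} <MID>"

body = {"model": "codellama:13b-code", "prompt": prompt, "stream": False}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```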
@intitni I'm mostly using codellama:13b. Also starcoder2:15b. I've no idea what the template might look like for them, though.
@jmitek The default one is for codellama. The starcoder one looks like:
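Assuming the standard StarCoder tokenizer tags (with {prefix} and {suffix} as placeholders), something along these lines:

```
<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>
```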
@jmitek Please change the model to the completion API and set it up again. Imported models are treated as using the chat completions API. You may also need the …
@intitni Thanks, so I made it use /generate again and filled in the template exactly as you have it (the default one I had was different). I can see it is using /generate now, and suggestions seem okay, except I still have the duplicated code. See this example:
@jmitek It works fine on my Mac, though. What are your settings again?
Oh, the default one is actually correct; I was testing starcoder2 when I made the screenshot.
@jmitek You need to reset the template to the default one.
I don't know much about the `code` suffix; I found it in the documentation (https://ollama.com/library/codellama), in the fill-in-the-middle section.
Awesome! I pulled codellama:13b-code and used this from the Ollama site (https://ollama.com/library/codellama:13b-code):
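That page documents a CodeLlama FIM prompt of roughly this shape (placeholders added here; treat the exact spacing as approximate):

```
<PRE> {prefix} <SUF>{suffix} <MID>
```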
Maybe they are using their own prompt strategy. Can you see what prompt they are sending to Ollama?
I have tried the starcoder2 model with Ollama. The completion result was different from Continue's. I found that Continue's request has a 'raw' parameter, which is set to true. I changed the source code to add the 'raw' parameter to the request, and then the completion was the same as Continue's.
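A sketch of that change, relative to the manual-template request shown earlier in the thread; the `raw` key is the documented Ollama parameter, while the model tag and FIM tag names are assumptions:

```python
# With raw=True, Ollama skips its built-in prompt template and passes the
# prompt to the model verbatim, which appears to match what Continue sends.
body = {
    "model": "starcoder2:7b",  # assumed tag
    "prompt": "<fim_prefix>def add(a, b):\n<fim_suffix>\n    return result\n<fim_middle>",
    "raw": True,
    "stream": False,
}
```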
@sonofsky2010 Hi, I have tried the raw parameter, but I am still getting weird output from starcoder2:7b (there are template tags in the output). Do you have the complete request body the Continue plugin sent?
@intitni I checked Continue's request. I think it may be caused by wrong 'stop' parameters. It currently sends 'stop' as an empty list, which may override the stop parameters that Ollama reads from the model's parameters.
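If that is the cause, sending explicit stop sequences in the request options should keep the tags out of the output; a sketch, assuming StarCoder2 tag names:

```python
# Added to the request body above: explicit stop sequences so Ollama halts
# before echoing the FIM template tags back in the completion.
body["options"] = {
    "stop": ["<fim_prefix>", "<fim_suffix>", "<fim_middle>", "<|endoftext|>"]
}
```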
The Ollama project recently merged this PR, which adds support for fill-in-the-middle completions via the existing `generate` endpoint. Would love to see this supported in Custom Suggestion Service as well.
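For reference, a minimal sketch of the new-style request, assuming the merged PR exposes a `suffix` field on the generate endpoint (the model tag and snippet are illustrative):

```python
import json
import urllib.request

# Fill-in-the-middle via the generate endpoint: send the text before the
# cursor as `prompt` and the text after it as `suffix`, and let the server
# apply the model's FIM template.
body = {
    "model": "codellama:13b-code",
    "prompt": "def add(a, b):\n",      # text before the cursor
    "suffix": "    return result\n",   # text after the cursor
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```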