Exposing grammar as a request parameter in completion/chat with go-side grammar validation #4525
Conversation
Force-pushed from a5e458a to 037fbd6
I think I've added as many tests as I can think of that would meaningfully add coverage. I'll await feedback. @jmorganca
Have you seen #3618? It adds both grammar and JSON schema options, which are then passed to llama.cpp. I think it would be nice to combine those two PRs (especially the tests from this PR).
```go
	return fmt.Errorf("grammar and format cannot be used together")
}

err := ValidateGrammar(req.Grammar)
```
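A minimal sketch of how the checks in the snippet above might fit together in a request handler; `Request`, `checkRequest`, and the inline validation stand-in are illustrative names under my own assumptions, not the PR's actual code:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Request mirrors only the fields relevant here; illustrative, not the real API type.
type Request struct {
	Grammar string
	Format  string
}

// checkRequest enforces mutual exclusion of grammar and format, then runs a
// trivial stand-in for the PR's ValidateGrammar before anything reaches llama.cpp.
func checkRequest(req Request) error {
	if req.Grammar != "" && req.Format != "" {
		return errors.New("grammar and format cannot be used together")
	}
	if req.Grammar != "" && !strings.Contains(req.Grammar, "::=") {
		// placeholder check; the PR's ValidateGrammar does much more
		return errors.New("invalid grammar: no rule definitions found")
	}
	return nil
}

func main() {
	fmt.Println(checkRequest(Request{Grammar: `root ::= "a"`, Format: "json"}))
	fmt.Println(checkRequest(Request{Grammar: `root ::= "a"`}))
}
```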
Is it really necessary to validate the grammar? Llama.cpp does that anyway?
In my investigation above I showed that bad grammars eject the model from memory and force it to reload. The streaming llama.cpp server has poor error handling; the Go-side grammar check prevents that.
Hm, where is go-side in this PR? I see that you implemented your own validator?
Yeah, I wrote my own validator in the grammar.go file of the commit. I didn't want to obligate this project to some special parsing library, so I tried to be as straightforward as possible to get some initial validation going. I was a bit paranoid the PR might seem too strange if I did something too esoteric. I think the suite of tests could be useful for whatever the Go validation evolves into.
I was aiming for something broadly adequate at validation, to protect the llama.cpp server from model ejections. I noticed that a lot of PRs didn't get accepted because they were fairly simplistic pass-throughs. Someone I talked with on the Discord said that the lack of even basic protection of the server's state might have been the reason why. That's sort of why this PR evolved as it did.
Oh, I thought "go-side" was some library for grammar checking. :-) Lol.
S'all good
I have, but I think it suffers from the same problem as bad grammars: erroneous JSON schemas also cause model ejects. I think it would require a JSON schema validator. That's a big enough task that it would make sense to either leave it to a separate wrapper project that converts JSON schema to grammar, or put it in a separate PR. I'm interested in writing that separate PR, but I'd like to at least get this one finished. I think a lot of folks have been waiting on even just basic grammar support. Thanks!
I am not sure why I would have to pay for grammar and JSON schema validation at every API request; I find this strange. In general, in my case the JSON schema and/or grammar is part of a trusted input and not provided by the user. At the least, validation should then be cached or something. Also, is the point of this validation to prevent malicious values for grammar, or just accidental erroneous values? Because if the goal is to prevent malicious values, then the validator should completely match what llama.cpp would reject; otherwise an attacker would be able to bypass this validator by crafting a value which passes this validator but is still rejected by llama.cpp. So I am not sure exactly why this validation is needed.
These are two valid concerns. Again :) I know nothing about the mindset of the project owners or what's holding them back from merging in grammar support. I did my best to make a PR in line with conversations with older members on Discord, to speculatively address their issues but also not make something too esoteric.
@mitar Added simple caching and some simple sanity checking around the size of the grammar.
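To illustrate the caching idea, here is a minimal sketch: validation verdicts are memoized by a SHA-256 of the grammar text, with a size cap so a request can't carry an unbounded grammar. The cache shape, the cap value, and all names are my assumptions, not the PR's implementation:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

const maxGrammarBytes = 32 * 1024 // illustrative cap, not the PR's value

// grammarCache remembers the validation verdict (nil or an error) per grammar.
type grammarCache struct {
	mu      sync.Mutex
	results map[[32]byte]error
}

// validate returns a cached verdict when available, otherwise runs the
// supplied validator once and stores its result.
func (c *grammarCache) validate(g string, validator func(string) error) error {
	if len(g) > maxGrammarBytes {
		return fmt.Errorf("grammar exceeds %d bytes", maxGrammarBytes)
	}
	key := sha256.Sum256([]byte(g))
	c.mu.Lock()
	defer c.mu.Unlock()
	if err, ok := c.results[key]; ok {
		return err // cached verdict, valid or not
	}
	err := validator(g)
	c.results[key] = err
	return err
}

func main() {
	cache := &grammarCache{results: map[[32]byte]error{}}
	calls := 0
	v := func(string) error { calls++; return nil }
	cache.validate(`root ::= "ok"`, v)
	cache.validate(`root ::= "ok"`, v) // second call hits the cache
	fmt.Println(calls)                 // 1
}
```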
@mitar I thought about your concern; I now only process grammars if OLLAMA_GRAMMAR is set to "true". That way custom grammar is opt-in for people okay with its trade-offs.
I do not get why this would be useful. You should maybe only make validation optional. But you should always pass the grammar through if the user wants to use the grammar?
My understanding is that the goal is to protect the model from being ejected by the llama.cpp server. We should never pass down invalid grammar, to the best of our ability. Therefore grammar being passed in is opt-in until the community is comfortable with the validation. In other words, we shouldn't let a vulnerability be the default behavior.
Why is passing down grammars needed?
Relying upon the context of a prompt to dictate structure can be unreliable (because it is dependent upon the model and generational randomness) and takes up context space. Grammar is a well-proven way to constrain generated output; in fact `format="JSON"` even depends on it, but `format="JSON"` allows no reliable specification of large, complex structures and can even be tricked with prompt attacks.

Why grammar and not JSON schema?
While JSON schema would make a nice future addition, there's interest in data structures outside of JSON (simple enum values, programming languages, etc.). Also, JSON schema generators will fundamentally rely upon grammars, so the grammar generated from a JSON schema will also benefit from grammar checking.
Why not just pass along the grammar to llama.cpp?
I looked into the complexities of passing grammar along to the llama.cpp server. There are a few challenges:
My conclusion from this, given the advice of the community, is that we do indeed have to do our own GBNF grammar validation on the Go server side to do our best at preventing bad grammar from being passed down.
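As one example of what Go-side GBNF validation can catch without a full parser, the sketch below checks that every rule line has a `::=`, that a `root` rule exists, and that every rule referenced on a right-hand side is defined. This is a simplified illustration under my own assumptions, not the PR's grammar.go (it ignores multi-line rules and character classes):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var ruleName = regexp.MustCompile(`^[a-zA-Z][a-zA-Z0-9-]*$`)

// checkRuleRefs does a shallow structural pass over a GBNF-like grammar:
// collect rule definitions, require a root rule, then verify every
// identifier used in a rule body refers to a defined rule.
func checkRuleRefs(grammar string) error {
	type rule struct{ name, body string }
	defined := map[string]bool{}
	var rules []rule
	for _, line := range strings.Split(grammar, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		parts := strings.SplitN(line, "::=", 2)
		if len(parts) != 2 {
			return fmt.Errorf("rule missing '::=': %q", line)
		}
		name := strings.TrimSpace(parts[0])
		if !ruleName.MatchString(name) {
			return fmt.Errorf("bad rule name: %q", name)
		}
		defined[name] = true
		rules = append(rules, rule{name, parts[1]})
	}
	if !defined["root"] {
		return fmt.Errorf("grammar has no root rule")
	}
	// Strip quoted literals, then treat remaining identifiers as rule refs.
	literal := regexp.MustCompile(`"(\\.|[^"\\])*"`)
	ident := regexp.MustCompile(`[a-zA-Z][a-zA-Z0-9-]*`)
	for _, r := range rules {
		body := literal.ReplaceAllString(r.body, "")
		for _, ref := range ident.FindAllString(body, -1) {
			if !defined[ref] {
				return fmt.Errorf("rule %q references undefined rule %q", r.name, ref)
			}
		}
	}
	return nil
}

func main() {
	fmt.Println(checkRuleRefs("root ::= answer\nanswer ::= \"yes\" | \"no\"")) // <nil>
	fmt.Println(checkRuleRefs(`root ::= missing`))                             // undefined-rule error
}
```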
In this PR I've created:

- `grammar` in chat and completion mode
- a check rejecting the `grammar` and `json` parameters at the same time

Edge cases:
Examples of success:
Example of failure:
I believe this PR satisfies #4074 with an acceptable amount of protection against sending invalid GBNF grammars, with useful error messages.