Exposing grammar as a request parameter in completion/chat with go-side grammar validation #4525

richardanaya · 2024-05-19T20:11:16Z

Why is passing down grammars needed?

Relying on the context of a prompt to dictate structure can be unreliable (because it depends on the model and on generational randomness) and takes up context space. Grammar is a well-proven way to constrain generated output, and in fact format="JSON" even depends on it, but format="JSON" allows no reliable specification of large, complex structures and can even be tricked with prompt attacks.

Why grammar and not JSON schema?

While JSON schema would make a nice future addition, there's interest in data structures outside of JSON (simple enum values, programming languages, etc.). Also, JSON schema generators will rely upon grammars fundamentally, so validating the grammar generated by JSON schema will also benefit from grammar checking.
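
To make the enum case concrete, here is a minimal sketch of a request that carries a one-rule GBNF grammar. The struct, the model name, and the "grammar" JSON field name are assumptions for illustration only; the exact wire format is whatever the PR's API docs define.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// generateRequest is a hypothetical request shape for illustration.
// The "grammar" JSON field name is an assumption, not taken from the PR.
type generateRequest struct {
	Model   string `json:"model"`
	Prompt  string `json:"prompt"`
	Grammar string `json:"grammar,omitempty"`
}

// sentimentGrammar is a GBNF grammar that only admits one of three literals.
const sentimentGrammar = `root ::= "positive" | "negative" | "neutral"`

// buildRequest serializes a request; an empty grammar is omitted entirely.
func buildRequest(model, prompt, grammar string) ([]byte, error) {
	return json.Marshal(generateRequest{Model: model, Prompt: prompt, Grammar: grammar})
}

func main() {
	// "llama3" is a placeholder model name.
	body, err := buildRequest("llama3", "Classify: I love this.", sentimentGrammar)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

No JSON is involved in the model's output here at all, which is the kind of structure a JSON-schema-only option could not express.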

Why not just pass along the grammar to llama.cpp?

I looked into the complexities of passing grammar along to the llama.cpp server. There are a few challenges:

The llama.cpp server doesn't return errors when a bad grammar is passed to it with streaming mode on. It gives an incomprehensible "unexpected EOF".
The in-memory model is reused when the grammar is valid, even if the grammar changes between requests. BUT... the in-memory model appears to get reloaded if you give it a bad grammar and then follow up with a good grammar.
Reusing the in-memory model appears to work perfectly when only completely valid grammars are passed along (even a variety of valid grammars).

My conclusion from this, given the advice of the community, is that we do indeed have to do our own GBNF grammar validation on the Go server side to do our best at preventing bad grammars from being passed down.

In this PR I've created:

the functionality to pass along a grammar in chat and completion mode
documentation in the readme for the new property
prevention of using the grammar and format (JSON) parameters at the same time
validation code for grammars
an extensive set of 30+ tests for grammars, covering character classes, strings, internationalization, comments, etc.
tests of every known grammar in llama.cpp, plus individual unit tests
no use of regex, to keep the parsing clear and understandable
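
For a flavor of what hand-written, regex-free checking can look like, here is a much-simplified sketch. It is not the validator in this PR's grammar.go: it assumes one rule per line, ignores repetition syntax, and only checks that a root rule exists and that every referenced rule is defined. All function names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// isNameChar reports whether c may appear in a rule name (letters, digits, '-').
func isNameChar(c byte) bool {
	return c == '-' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')
}

// referencedRules scans a rule body by hand and returns the rule names it
// mentions, skipping quoted literals, character classes, and comments.
func referencedRules(body string) []string {
	var names []string
	for i := 0; i < len(body); {
		switch c := body[i]; {
		case c == '"' || c == '[':
			closer := byte('"')
			if c == '[' {
				closer = ']'
			}
			i++
			for i < len(body) && body[i] != closer {
				if body[i] == '\\' {
					i++ // skip the escaped character
				}
				i++
			}
			i++ // step past the closing delimiter
		case c == '#':
			return names // the rest of the line is a comment
		case isNameChar(c):
			start := i
			for i < len(body) && isNameChar(body[i]) {
				i++
			}
			names = append(names, body[start:i])
		default:
			i++ // operators, parentheses, whitespace
		}
	}
	return names
}

// validateGrammar checks a simplified, one-rule-per-line GBNF subset.
func validateGrammar(grammar string) error {
	bodies := map[string]string{}
	for n, line := range strings.Split(grammar, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		name, body, ok := strings.Cut(line, "::=")
		if !ok {
			return fmt.Errorf("line %d: expected \"name ::= body\"", n+1)
		}
		bodies[strings.TrimSpace(name)] = body
	}
	if _, ok := bodies["root"]; !ok {
		return fmt.Errorf("grammar has no root rule")
	}
	for name, body := range bodies {
		for _, ref := range referencedRules(body) {
			if _, ok := bodies[ref]; !ok {
				return fmt.Errorf("rule %q references undefined rule %q", name, ref)
			}
		}
	}
	return nil
}

func main() {
	good := "root ::= answer\nanswer ::= \"yes\" | \"no\""
	bad := "root ::= answer\nanswr ::= \"yes\"" // typo: "answer" is never defined
	fmt.Println(validateGrammar(good))
	fmt.Println(validateGrammar(bad))
}
```

The point of the byte-by-byte scan is readability: each branch corresponds to one token kind, so a reviewer can follow it without knowing a parser library.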

Edge cases:

I've probably not implemented everything that's possible in character classes, but I have a limited subset compatible with the grammars listed in llama.cpp. My assumption is that most people's grammars will be less complex than these.
There might be some valid grammars I don't currently support (but to the best of my knowledge we support all the major publicly available ones, including ones as complex as the C programming language). I chose not to use a full-on Go parser library because I wanted the cognitive load of this code to be approachable initially (rather than requiring every reader of this code to learn a new library). If in the future we want to replace it with a more formal technology we can, and the tests can be reused.

Examples of success:

Example of failure:

I believe this PR satisfies #4074 with an acceptable amount of protection from sending invalid GBNF grammars with useful error messages.

richardanaya · 2024-05-20T02:43:26Z

I think I've added as many meaningful tests as I can think of. I'll await feedback. @jmorganca

mitar · 2024-05-31T08:31:42Z

Have you seen #3618? It adds both grammar and JSON schema option which is then passed to llama.cpp. I think it would be nice to combine those two PRs (especially tests from this PR).

mitar · 2024-05-31T08:36:37Z

llm/server.go

+			return fmt.Errorf("grammar and format cannot be used together")
+		}
+
+		err := ValidateGrammar(req.Grammar)


Is it really necessary to validate the grammar? Llama.cpp does that anyway?

I showed in my investigation above that bad grammars eject the model from memory and make it reload. The streaming llama.cpp server has bad error handling. The go-side grammar check prevents that.

Hm, where is go-side in this PR? I see that you implemented your own validator?

Yah, I wrote my own validator in the grammar.go file of the commit. I didn't want to obligate this project to some special parsing library, so I tried to be as straightforward as possible to get some initial validation going. I was a bit paranoid the PR might seem too strange if I did something too esoteric. I think the suite of tests could be useful for whatever go validation evolves into.

I was aiming for something just broadly adequate at validation to protect the llama.cpp server from model ejections. I noticed that a lot of PRs didn't get accepted because they were pretty simplistic pass-throughs. Someone I talked with on the Discord said that a lack of even basic protection of the server's state might have been the reason why. That's sort of why this PR evolved as it did.

Oh, I thought "go-side" is some library for grammar checking. :-) Lol.

richardanaya · 2024-06-01T00:43:21Z

Have you seen #3618? It adds both grammar and JSON schema option which is then passed to llama.cpp. I think it would be nice to combine those two PRs (especially tests from this PR).

I have, but I think it suffers from the same problem as bad grammars: erroneous JSON schemas cause model ejects. I think it would require a JSON schema validator. I think that's a big enough task that it'd make sense to either leave it to another wrapper project that converts JSON schema to grammar, or put it in a separate PR. I'm interested in writing that separate PR, but I'd like to at least get this one finished. I think a lot of folks have been waiting on even just basic grammar support. Thanks!

mitar · 2024-06-01T17:03:28Z

I think it suffers from the same problem as bad grammars: erroneous JSON schemas cause model ejects.

I am not sure why I would have to pay for grammar and JSON schema validation at every API request? I find this strange. In general I would say that in my case, JSON schema and/or grammar is part of a trusted input and not provided by the user. At least validation should then be cached or something?

Also, is the point of this validation to prevent malicious values for grammar or just accidental erroneous values? Because if the goal is to prevent malicious values, then the validator should completely match what llama.cpp would reject. Otherwise an attacker would be able to bypass this validator by crafting a value which passes this validator but is still rejected by llama.cpp.

So I am not sure exactly why this validation is needed?

richardanaya · 2024-06-01T17:13:53Z

These are two valid concerns:

Caching could definitely help; I could add something small.
You're right that a malicious hacker could find some way to bust the internal model. The goal of this PR isn't perfect security; it's to get the ball rolling on getting grammar into the project, get feedback, and be generally aligned with the desired principles. As I specified above, my validator isn't a perfect representation of llama.cpp's capabilities. To my knowledge, a spec of their grammar support doesn't even exist. So my validation covers a subset, may have holes, and is based on what I could determine from their public documentation and existing public grammars.

Again :) I know nothing about the mindset of the project owners on what's holding them back from merging in grammar support. I did my best to make a PR in line with convos from older members in Discord to speculatively address their issues but also not make something too esoteric.

richardanaya · 2024-06-01T17:46:18Z

@mitar added simple caching and some simple sanity checking around the size of the grammar
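
As a rough sketch of what caching plus a size check could look like (not the code in the commit): validation results are memoized by a SHA-256 hash of the grammar text, and oversized grammars are rejected before any parsing. The type name, the 64 KiB limit, and the unbounded map are all assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
	"sync"
)

// maxGrammarBytes is an illustrative limit; the PR's actual limit is not
// stated in this thread.
const maxGrammarBytes = 64 * 1024

// validationCache memoizes validation results keyed by a hash of the grammar.
// The map is unbounded here; a real server would likely want eviction.
type validationCache struct {
	mu       sync.Mutex
	results  map[[sha256.Size]byte]error
	validate func(string) error
	calls    int // how many times the underlying validator actually ran
}

func newValidationCache(validate func(string) error) *validationCache {
	return &validationCache{
		results:  map[[sha256.Size]byte]error{},
		validate: validate,
	}
}

// Check rejects oversized grammars, then validates each distinct grammar once.
func (c *validationCache) Check(grammar string) error {
	if len(grammar) > maxGrammarBytes {
		return fmt.Errorf("grammar is %d bytes, limit is %d", len(grammar), maxGrammarBytes)
	}
	key := sha256.Sum256([]byte(grammar))
	c.mu.Lock()
	defer c.mu.Unlock()
	if err, ok := c.results[key]; ok {
		return err // cached verdict, valid or not
	}
	c.calls++
	err := c.validate(grammar)
	c.results[key] = err
	return err
}

func main() {
	// A stand-in validator; a real one would parse the GBNF text.
	cache := newValidationCache(func(g string) error {
		if g == "" {
			return errors.New("empty grammar")
		}
		return nil
	})
	for i := 0; i < 3; i++ {
		_ = cache.Check(`root ::= "yes" | "no"`)
	}
	fmt.Println("validator runs:", cache.calls)
}
```

With something like this, a trusted grammar that is sent on every request pays the parsing cost once, and later requests cost one hash and one map lookup.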

richardanaya · 2024-06-01T20:09:26Z

@mitar I thought about your concern. I now only process grammars if OLLAMA_GRAMMAR is set to "true". That way custom grammar is opt-in for people okay with its trade-offs.
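
A minimal sketch of that kind of opt-in gate, assuming the server rejects (rather than silently ignores) a grammar when the variable is not set. The function names are hypothetical, and the environment lookup is passed in as a function so the check can be exercised without touching the real environment.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errGrammarDisabled is a hypothetical error for requests that carry a
// grammar while the feature is switched off.
var errGrammarDisabled = errors.New("grammar support is disabled; set OLLAMA_GRAMMAR=true to enable it")

// grammarEnabled reports whether the operator opted in. Only the exact
// string "true" counts, matching the description in this thread.
func grammarEnabled(getenv func(string) string) bool {
	return getenv("OLLAMA_GRAMMAR") == "true"
}

// checkGrammarAllowed lets grammar-free requests through unconditionally and
// gates requests that carry a grammar on the opt-in flag.
func checkGrammarAllowed(grammar string, getenv func(string) string) error {
	if grammar != "" && !grammarEnabled(getenv) {
		return errGrammarDisabled
	}
	return nil
}

func main() {
	// In a server, os.Getenv would be the lookup function.
	fmt.Println(checkGrammarAllowed(`root ::= "yes" | "no"`, os.Getenv))
}
```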

mitar · 2024-06-01T20:12:48Z

I now only process grammars if OLLAMA_GRAMMAR is set to "true". That way custom grammar is opt-in for people okay with its trade-offs.

I do not get why this would be useful? You should maybe only make validation an option. But you should always pass the grammar through if the user wants to use the grammar?

richardanaya · 2024-06-01T20:17:15Z

I now only process grammars if OLLAMA_GRAMMAR is set to "true". That way custom grammar is opt-in for people okay with its trade-offs.

I do not get why this would be useful? You should maybe only make validation an option. But you should always pass the grammar through if the user wants to use the grammar?

My understanding is that the goal is to protect the model from being ejected by the llama.cpp server. To the best of our ability, we should never pass down an invalid grammar. Therefore passing in a grammar is opt-in until the community is comfortable with the validation. In other words, we shouldn't let a vulnerability be the default behavior.

richardanaya force-pushed the main branch from 4da5714 to 1dd5575 Compare May 19, 2024 20:13

richardanaya marked this pull request as draft May 19, 2024 20:19

richardanaya force-pushed the main branch from 1dd5575 to 24adac4 Compare May 19, 2024 20:25

richardanaya marked this pull request as ready for review May 19, 2024 20:26

richardanaya force-pushed the main branch from 24adac4 to d50ea5a Compare May 19, 2024 20:27

richardanaya mentioned this pull request May 19, 2024

Grammar Guided response from model. #4074

Open

richardanaya force-pushed the main branch 12 times, most recently from a5e458a to 037fbd6 Compare May 20, 2024 02:40

richardanaya marked this pull request as draft May 20, 2024 02:42

richardanaya marked this pull request as ready for review May 20, 2024 02:42

Exposing grammar as a request parameter in completion/chat with go-si…

80b46f7

…de grammar validation

richardanaya force-pushed the main branch from 037fbd6 to 80b46f7 Compare May 20, 2024 15:08

richardanaya mentioned this pull request May 21, 2024

Improved json grammar #3785

Closed

Merge branch 'ollama:main' into main

1181b8a

mitar reviewed May 31, 2024

richardanaya force-pushed the main branch from 9ee8d4e to 35c8e6f Compare June 1, 2024 20:08

richardanaya force-pushed the main branch from 35c8e6f to 43805ff Compare June 1, 2024 20:11

adding cacheing and new test

026f6c3

richardanaya force-pushed the main branch from 43805ff to 026f6c3 Compare June 1, 2024 20:12

richardanaya requested a review from mitar June 1, 2024 20:12
