How to get a "Stubborn" LLM to Follow an Output Format
Sometimes in life you have to give concessions to other people. Now you have to give them to computers, too.
One of the great advantages of (most) open source models has always been the relative ease with which you can get them to follow a given output format. If you just read that sentence and wondered if we're living in the same universe, then I'll share a prompting secret right off the bat: the key to getting consistent behavior out of smaller open source models is to give them at least two carefully-crafted few-shot examples. With that, something like Nous Mixtral will get it right 95% of the time, which is good enough if you have validation that can catch mistakes.
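To make the "validation that can catch mistakes" part concrete, here's a minimal sketch of the kind of retry loop I mean. It assumes an OpenAI-compatible endpoint; the model name, format check, and retry count are all placeholders you'd swap for your own:

```python
from openai import OpenAI

# Point this at whichever provider hosts your model (base_url / api_key via env vars).
client = OpenAI()

def looks_valid(text: str) -> bool:
    # Placeholder check: swap in whatever your real output format requires.
    lines = [line for line in text.strip().splitlines() if line.strip()]
    return len(lines) >= 3 and lines[0].lower().startswith("x:")

def generate_with_validation(messages, model="nous-hermes-2-mixtral-8x7b-dpo", retries=3):
    """Call the model, retrying until the output passes the format check."""
    for _ in range(retries):
        response = client.chat.completions.create(model=model, messages=messages)
        text = response.choices[0].message.content
        if looks_valid(text):
            return text
    raise ValueError("Model never produced output in the expected format.")
```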
But unfortunately not all models can learn from examples. I typically call these "Stubborn" models, due to this post I wrote about Mistral Next (large) and Mistral Medium. Basically, I'm referring to models that were deliberately overtrained to make them better in chat and zero-shot settings, but inflexible, because they often "pay more attention to" their training data than to the prompt. The difference between a "stubborn" model and a non-stubborn model, in my definition, is that with two (or a few more) few-shot examples a non-stubborn model will pick up basically everything, and even directly quote the examples at times, whereas a stubborn one will often fall back on the patterns it was trained with, or take some aspects of the given pattern but disobey others. As far as I can tell, stubbornness is a matter of RLHF, not parameter count or SFT: Nous Hermes Mixtral is not stubborn, but the official Mixtral Instruct is.
Needless to say, for complex pipelines where you want extremely fine control over outputs, non-stubborn models are infinitely superior. To this day, Mistral Large has a far higher error rate in Augmentoolkit (probably >20%) than Nous Mixtral, despite costing 80% as much as GPT-4 Turbo. This may be an imprecise definition based partly on my intuition, but from experience, I think it's real.
Anyway, if non-stubborn models are far better than stubborn ones for most professional use cases (provided you know what you're doing when it comes to examples), then why am I writing a blog post about how to prompt stubborn models? Well, sometimes in life you don't get to use the tools you want. For instance, maybe you're working for a client who has more Mistral credits than God, and you absolutely need to use that particular API. You can't afford to be a stick in the mud when working in a field that reinvents itself every other day, so I recently went and figured out some principles for prompting stubborn models.
One thing that I've used a lot recently is the idea of repetition. I kinda blogged about it here, and arguably this one is also about it, but this is kind of a combination of the two principles, so I'll go over it. If you don't want to click the links, the two principles we're combining are: "models see bigger things easier," and "what you repeat, will be repeated." Prompting is like quantum theory: any superposition of two valid prompting principles is itself a valid prompting principle. Here's an example prompt:
You are an expert something-doer AI. I need you to do X, Y, and Z; it's very important. I know your training data told you to do ABCDEFG, but please don't.
[output format description]
User:
[input]
Assistant:
XYZ
User:
[input]
Assistant:
XYZ
User:
[input]
That's a prompt. Sometimes the AI will be nice:
XYZ
Often it will not be:
XABCDEFG.
Goddamn it. How do you solve this when working with a stubborn model that learned more from its training dataset, where [input] corresponded to ABCDEFG?
Repetition, Repetition, Repetition. Also, Repetition. And don't forget, Repetition. (Get it?) If the model pays more attention to its prompt and less to its examples (but is too stupid to pick up on us telling it to do the thing once), then we'll darn well use the prompt to tell it what we want it to do.
You are an expert something-doer AI. I need you to do X, Y, and Z; it's very important. I know your training data told you to do ABCDEFG, but please don't.
[output format description]
Don't forget to do XYZ.
User:
[input]
SPECIAL NOTE: Don't forget XYZ.
Assistant:
XYZ
User:
[input]
SPECIAL NOTE: Don't forget XYZ.
Assistant:
XYZ
User:
[input]
SPECIAL NOTE: Don't forget XYZ.
Assistant:
XYZ
Yay!
It's simple, but I've used this to resolve probably over a dozen issues already, across many different projects, with models ranging from Mistral Large to GPT-4 Turbo. It's one of the most powerful things you can do; I can't believe I haven't explicitly blogged about it yet, since this is one of the first things I realized about prompting, way back before I'd even made Augmentoolkit.
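Mechanically there's nothing fancy going on: you just append the same reminder to every user turn when you assemble the messages list. A quick sketch, with the system prompt, reminder text, and few-shot examples standing in for your own:

```python
SYSTEM_PROMPT = (
    "You are an expert something-doer AI. I need you to do X, Y, and Z; it's very important.\n"
    "[output format description]\n"
    "Don't forget to do XYZ."
)

REMINDER = "SPECIAL NOTE: Don't forget XYZ."

def build_messages(few_shot_pairs, new_input):
    """Assemble a chat where every user turn ends with the same reminder."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for user_text, assistant_text in few_shot_pairs:
        messages.append({"role": "user", "content": f"{user_text}\n\n{REMINDER}"})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": f"{new_input}\n\n{REMINDER}"})
    return messages
```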
But that's not really revolutionary; after all, it's just a combination of two principles. What about the titular thing of this blog post, getting a stubborn model to write with a given output format?
This one is partly inspired by a comment on a LocalLlama post. I don't agree with everything in it, but there's some really good stuff in there, full credit to u/LoSboccacc. They write in their comment:
Ask the model to rephrase the prompt, you will see quickly which part of the prompt misunderstood
That's a pretty clever idea by itself, because it uses the model to debug itself. But what does this have to do with output formats? Well, if we can use the model to understand what the model is capable of, then any LLM output can give us a clue into what it "understands." Consider that, when prompting stubborn models and trying to get them to follow our specific output format, their tendency to follow some other format (one they likely saw in their training data) is what we're trying to override with our prompt. However, research shows that training biases cannot be fully overcome with prompting, so we're already fighting a losing battle. And if you're an experienced reader of mine, you'll remember a prompting principle: if you're fighting the model, STOP!
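If you want to try that rephrasing trick yourself, it's a one-off call: hand the model your real system prompt and ask it to restate the instructions in its own words. A sketch, reusing the same hypothetical OpenAI-compatible client as above (the model name is a placeholder):

```python
from openai import OpenAI

client = OpenAI()  # same OpenAI-compatible setup as before

def echo_check(system_prompt: str, model: str = "mistral-large-latest") -> str:
    """Ask the model to restate its instructions so you can see what it misread."""
    messages = [
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": "Before doing anything else, restate the instructions and output "
                       "format you were given, in your own words.",
        },
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content
```

Whatever it garbles in the restatement is a good hint at which part of the prompt it misunderstood.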
So what does that tangent above boil down to? If you want to find an output format a stubborn model will easily follow, see what format it uses without you asking, and borrow that. In other words: use the format the model wants to use. From my testing, it looks like this can easily get your format-following rate to 90% or above.
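The cheapest way to find out what format the model "wants" is to run your task with no format specification at all and look at what comes back. Something like this, with the task description and model name as placeholders:

```python
from openai import OpenAI

client = OpenAI()  # same OpenAI-compatible setup as before

def probe_natural_format(task_description: str, example_input: str,
                         model: str = "mistral-large-latest") -> str:
    """Run the task with NO format spec and see what structure the model reaches for."""
    messages = [
        {"role": "system", "content": task_description},  # describe the task, not the format
        {"role": "user", "content": example_input},
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content

# Run this over a handful of inputs; whatever structure keeps showing up
# (numbered lists, markdown headings, "Answer:" prefixes) is the format to build around.
```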
Here's an example. Say you create a brilliant output format, and give a prompt to a model:
You are a something-doer. Do something in the following format:
x: abc
y: def
z: ghi
User:
[input]
Assistant:
But it thwarts your master-plan by doing this instead:
abc
def
ghi
What do you do? Well, one solution is to throw more few-shot examples of your xyz format at it. And depending on the model, that might work. But some stubborn models are, well, stubborn. And so even with repetition and examples, you might see error rates of 40% or above, even with things like Mistral Large or GPT-4 Turbo.
In such cases, just use the format the model wants. Yes, it might not have all the clever tricks you had thought of in order to get exactly the kind of output you want. Yes, it's kind of annoying to have to surrender to a bunch of matrices. Yes, if you were using Nous Mixtral, this would have all been over by the second example and you could've gone home by now. But you're not using Nous Mixtral, you're using Mistral Large. So it might be better to just suck it up and use 1. 2. 3.
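Surrendering to the model's format doesn't mean giving up structure downstream, either; you just move the cleverness into the parser. Here's a small sketch for mapping a "1. 2. 3." style response back onto named fields, assuming the items always arrive in the same order:

```python
import re

def parse_numbered_output(text: str, field_names=("x", "y", "z")) -> dict:
    """Map a '1. ... 2. ... 3. ...' response back onto named fields."""
    items = re.findall(
        r"^\s*\d+\.\s*(.+?)(?=^\s*\d+\.|\Z)",
        text,
        flags=re.MULTILINE | re.DOTALL,
    )
    if len(items) < len(field_names):
        raise ValueError(f"Expected {len(field_names)} items, got {len(items)}.")
    return {name: item.strip() for name, item in zip(field_names, items)}

# parse_numbered_output("1. abc\n2. def\n3. ghi") -> {"x": "abc", "y": "def", "z": "ghi"}
```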
That's all for this week. Hope you enjoyed the principles. Sorry for the delay.
If you want to stay ahead of the game with esoteric tricks won from hard experience at the edge of open-source prompting, consider pressing this button if you haven't already. I appreciate it.
Thanks for reading, have a good one, and I'll see you next time!

