2 Comments

Potential minor error - according to GPT-4o:

Logical contradiction: "They’re much harder to quickly write a prompt for, they’re much harder to precisely control": This part is unclear. If the intent is to say that open-source models are more difficult to prompt and harder to control, but somehow still preferred, you may want to clarify why they are better in the long run. The current phrasing seems to suggest both difficulty in prompting and difficulty in control, which undermines their appeal.


Thanks for another great article about another important principle. I'm curious whether you've seen some traits in small models (<3B) that I've run into. Mostly, they appear to be easily confused by long prompts - and by long I mean more than three or four sentences. I've had consistently inconsistent results when trying to take a simple and sort-of-successful prompt and improve it by adding clarifications or more detailed instructions. They just seem to wig out if you don't keep it short and sweet, often running off into the weeds, repeating irrelevant passages, and sometimes just repeating a few random characters ad infinitum. In particular, I've found that few-shot examples turn them into gibbering baboons, such that providing examples is rendered entirely useless. I'd come to believe that models of these small sizes are just so unstable that they can only comprehend the simplest of prompts. I may have better luck with some of the concepts you're writing about, but it would be nice to hear from someone with your experience about the feasibility of using these little guys at all.
