When diving into the world of AI-generated writing, you quickly realize that AI systems possess a distinct, unmistakable voice all their own. Whether it’s the occasional overly formal tone, an inclination toward balanced viewpoints, or the rhythmic cadence of certain sentence structures, an AI's unique “voice” becomes apparent. By paying attention to word choice, a reader can often intuitively determine if AI has written something, or if a human has.
^ In case you couldn’t tell, that paragraph up there was mostly written by AI. The only way I could’ve made it more of a giveaway is if I’d forced it to use some of the most common giveaway words, such as “delve,” “tapestry,” “testament,” “it’s important to remember that,” and perhaps “ministrations.”
If you’re using AI for writing, you probably hate its word choice at least some of the time. Unfortunately for us, LLMs are text generators, so writing is the main thing they get used for, which means we want to make the word choice better. This also applies if you’re training your own custom LLMs: your users will probably hate GPT-like text as well!
Common AI word choice and phrasing are colloquially called “GPT-isms” by the community, and for good reason — many of them originate from synthetic training data made by ChatGPT. Therein lies the hint we need to ameliorate this problem. In this quick blog post I’m going to cover a few practical tricks to mitigate GPTisms in models.
Prompting: Very Long Few-Shot Examples That Are Not Like What the AI Would Usually Write
I adore prompting, but I believe there are two important things prompting really struggles with: writing style and factual knowledge. Sometimes, though, you can’t finetune a model and you still need it to not be super-mega-pretentious. In that case, there are still a few tricks you can apply.
If you’re doing creative writing, then my go-to advice is to give the LLM extremely long few-shot examples. I mean ten thousand tokens or more, on the “assistant” side of the conversation. These should be human-written, too. LLMs are pattern completion machines; the goal is to give them a really human pattern to complete. Show the model enough examples and in-context learning (ICL) starts to compete against the trained-in bias. ICL cannot fully overcome these biases, but it can mitigate them. Give it as many examples as you have time for, since more examples continue to improve performance; if each individual example is longer, you can get away with fewer of them. Do note, however, that this will also make the prompt more costly if you’re using an API.
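To make this concrete, here’s a minimal sketch of stacking few-shot examples on the assistant side of an OpenAI-style chat messages list. The helper name and the placeholder texts are mine, not from any particular library; in practice each assistant turn would be thousands of tokens of human-written prose, not one sentence.

```python
# Sketch: build a chat prompt where human-written text sits on the
# *assistant* side, so the model treats it as its own prior output
# and continues that pattern. Message schema follows the common
# OpenAI-style {"role": ..., "content": ...} convention.

def build_few_shot_messages(system_prompt, examples, user_request):
    """examples: list of (user_turn, long_human_written_reply) pairs."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_turn, human_reply in examples:
        messages.append({"role": "user", "content": user_turn})
        # Human-written prose goes here, on the assistant side.
        messages.append({"role": "assistant", "content": human_reply})
    messages.append({"role": "user", "content": user_request})
    return messages

msgs = build_few_shot_messages(
    "You are a novelist.",
    [("Write the opening scene.",
      "The rain hadn't stopped for three days, and the river knew it...")],
    "Continue the story.",
)
```

The returned list can be passed straight to most chat completion APIs; the more (and longer) human-written pairs you feed in, the stronger the pattern.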
And it is likely that you’ll be using an API. The primary use case for prompting like this is when you’re on an API and don’t have access to neat tricks like min_p or the antislop sampler to reduce GPT-isms. If you’re running locally or on your own hardware, you have a ton of things you can apply on the sampling side to get really good results (far too many tricks to cover in this article). But often we’re working in the cloud, and that’s when you use prompting.
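For reference, the core idea behind min_p is simple enough to sketch in a few lines: keep only the tokens whose probability is at least some fraction of the top token’s probability, then renormalize. This is a toy pure-Python version of the idea, not the actual implementation from any inference library.

```python
import math

def min_p_filter(logits, min_p=0.1):
    """Zero out tokens whose probability falls below min_p times the
    probability of the most likely token, then renormalize the rest."""
    # Numerically stable softmax.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The cutoff scales with the model's confidence in its top pick.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    z = sum(kept)
    return [p / z for p in kept]

# Two plausible tokens and one long-shot token; the long shot is pruned.
probs = min_p_filter([2.0, 1.9, -3.0], min_p=0.2)
```

The nice property for writing is that when the model is uncertain (a flat distribution), many tokens survive the cutoff and the output stays varied, but low-probability junk is still cut off.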
Keep in mind, it’s usually not enough to just show the AI a bunch of few-shot examples written by people. To get the best results you also have to give it examples that are different from what it would usually write. If it’s creative writing, you might include dark or depraved examples, for instance. This is to, colloquially, “knock the AI off of its usual pattern”: the model is trained to be all sanitary, and that sorta stuff is never in its context, so by adding it to the context we make the output way less sanitary. Every creative writing prompt I have ever done, with few-shot examples or without, has been incomplete until I added a bit (sometimes even a bunch!) of darkness.
Finetuning: Use Older Bases
If you’re finetuning, you have much more control here. At this stage, I’m going to assume that you know the difference between base models (autocomplete; the thing you make finetunes on top of) and finetunes (conversational).
The key idea here is to pick a base model trained on as little GPT data as possible. Naturally, when frontier labs make their base models they try to improve all areas, including the data. Unfortunately, after ChatGPT came out, a LOT of GPT writing got out onto the internet. This means that any relatively recent base model without very advanced filtering is going to have built-in GPT slop even before finetuning, because it was pretrained on a large amount of GPT-written text. The amount of GPT-written text online has only increased as time goes on, so the problem is more acute the more recent the model is.
There’s also a decent amount of pollution from instruction-following sets leaking into pretraining sets. Llama 3, for instance, is pretty good at instruction following even if you don’t give it any generic assistant data, because its pretraining data was polluted. Coincidentally, it is also decently prone to some GPT-isms.
All this comes together to say that, if you’re finetuning and writing style is your main concern, consider picking an older model. Mistral 7B v0.2 lacks the data pollution and severe inherent-GPT-ism problem of many more recent models. It’s also way cheaper to train than Llama 3 8B thanks to its smaller tokenizer. This advice is Alignment-Lab AI approved™.
Finetuning: Do Continued Pretraining
This one’s pretty simple but worth a mention. If a model’s pretraining leaves it inclined to be pretentious, then do some continued pretraining on legitimately great text before you finetune, just to shift it back a bit. Bonus points if that pretraining includes stuff that was probably filtered out of the original set for “safety” reasons. You want that diversity of data, and the depth it provides.
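A sketch of the data side of this, with hypothetical document lists: blend mostly great human-written text with a slice of the kind of material that usually gets filtered out, so the continued-pretraining corpus has both quality and diversity. The function name and the 30% ratio are made-up starting points, not tested recommendations.

```python
import random

def build_cpt_corpus(great_docs, unfiltered_docs, unfiltered_ratio=0.3,
                     n_docs=1000, seed=0):
    """Sample a continued-pretraining corpus: mostly legitimately great
    text, plus a slice of the usually-filtered-out material for
    diversity. Sampling with replacement from two document pools."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(n_docs):
        pool = unfiltered_docs if rng.random() < unfiltered_ratio else great_docs
        corpus.append(rng.choice(pool))
    return corpus

corpus = build_cpt_corpus(["great text..."], ["edgy text..."])
```

In a real pipeline the pools would be files or dataset shards rather than strings, and you’d budget by token count rather than document count, but the mixing logic is the same.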
Finetuning: Avoid GPT-Written Sets as Much as Possible
Perhaps this is obvious, but if you finetune on synthetic data made by GPT, your model will sound more like GPT! Thankfully, for things like RP and writing, there are a lot of pretty good human-written sets out there, like RPGuild and Bluemoon RP. For factual finetuning you’re in more of a pickle, but there are still things like the original LIMA set (Stack Overflow) and other hybrid datasets, which will be better than pure GPT-3/GPT-4 data. And for your specific domain you can use Augmentoolkit datasets, which can be generated by non-GPT models; that at least partly mitigates the most obvious GPT-isms.
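If you’re stuck with a partly synthetic set, a blunt first pass is to drop samples that lean on the well-known giveaway words from the start of this post. Real filtering would be more nuanced (classifiers, n-gram statistics against a human corpus); this toy keyword version, with a phrase list I picked for illustration, is just the idea.

```python
# Toy filter: discard training samples containing known GPT giveaway
# phrases. The phrase list is illustrative, not exhaustive.

GPT_ISMS = ["delve", "tapestry", "testament to", "it's important to remember"]

def looks_sloppy(text, max_hits=0):
    """True if the text uses more giveaway phrases than we tolerate."""
    lowered = text.lower()
    hits = sum(lowered.count(phrase) for phrase in GPT_ISMS)
    return hits > max_hits

def filter_dataset(samples):
    return [s for s in samples if not looks_sloppy(s)]

clean = filter_dataset([
    "Let's delve into the rich tapestry of medieval trade.",
    "The merchant counted his coins and frowned.",
])
# only the second sample survives
```

Setting `max_hits` above zero lets the occasional stray phrase through, which can be worth it if your dataset is small and you can’t afford to discard much.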
In Conclusion
Most of the counters to bad AI writing style are available only if you’re finetuning. A good handful open up if you’re running locally or on hardware you own. And if you’re using the cloud, then it’s throw as many human-written, “out there” examples at the model as you can, until it breaks or your bank account does.
Well would you look at that, I actually managed to hit 2 weeks in a row. Just like the good old days! Maybe I’ll hit 3…?
Recently I’ve been doing a lot of work on creative writing AI, so that’s what this post focuses on. AI is really good at creativity and invention because the depth of human artistic/literary achievement is really fantastic, and unlike with facts or problem-solving, art is rarely objectively wrong, meaning that with a bit of benefit of the doubt it’s way easier to get usable outputs. Art can, however, be subjectively shit, so that’s why techniques like these are useful sometimes. If you’ve been working on AI for creativity and have figured anything out that’s cool, please share it in the comments!
A side note about the post before this one. As you may know, I don’t edit my posts. I write them in one go. Usually this results in innocuous errors, maybe a word is dropped here or there, maybe my commas are a mess.
Unfortunately last time I dropped a rather important word. You see, near the end, I was going to write
“I hope that the tips were useful to those of you training models, and at least intellectually interesting even if you aren’t training models”
But ended up saying
“I hope that the tips were useful to those of you training models, and at least intellectually interesting even if you aren’t.”
While this is funny, it is also pretty insulting to you. Sorry about that! Not my intention! I’ve fixed that post since, but I thought I’d reiterate that that was an error here.
Anyway, that’s all for this week, have a good one and I’ll see you next time!
Nice read. Any advice on how to get started fast with fine-tuning a model on my own writing?