Tried Nemo 12B yet? That was the natural evolution for all my finetunes once I moved on from Mistral 7B. Sure, it's a bit bigger, but it has this same sponge-like effect when being trained. Extremely versatile. Only downside is the larger size, I suppose! (Though depending on the task, this may be an advantage)
Good point! I did try that way back in the day but hit a loss explosion, probably due to bad hyperparameters. I need to spend the time to get good at this one; it's probably the natural next step. Plus the fact that it's trained to produce more human-like data is likely a huge plus! Thanks for the recommendation, Gryphe!
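For anyone else who hits the same wall, here's a minimal sketch of the kind of conservative settings that tend to avoid loss explosions when stepping up from a 7B to a 12B model. The checkpoint name and every hyperparameter below are illustrative assumptions, not a confirmed recipe; tune them for your own data.

```python
# Minimal sketch: conservative finetuning settings for a 12B model.
# The checkpoint and all values are illustrative assumptions, not a tested recipe.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments

model_name = "mistralai/Mistral-Nemo-Base-2407"  # assumed base checkpoint (gated on HF)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

args = TrainingArguments(
    output_dir="nemo-12b-finetune",
    learning_rate=1e-5,             # lower than typical 7B recipes; bigger models spike more easily
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,              # a short warmup smooths the first optimizer steps
    max_grad_norm=1.0,              # gradient clipping is the main guard against loss spikes
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16, # keep a reasonable effective batch size on one GPU
    num_train_epochs=1,
    bf16=True,
    logging_steps=10,               # log often so a spike shows up immediately
)
# Hand `args`, the model, the tokenizer, and your dataset to transformers.Trainer
# (or trl's SFTTrainer) as usual.
```

Watching the gradient norm in the logs is usually the earliest warning sign that a run is about to blow up.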
You know what they say: "If it ain't broke, don't fix it."
True indeed! Enough in ML breaks on its own already WITHOUT the engineer making problems for themselves
Hi, how does Mistral 7B v0.3 compare to v0.2?
No real thoughts; I haven't done thorough enough A/B testing to confirm one way or the other.