Squelch LLM Hallucination Via Structured Uncertainty Handling
by Tim Post
15 min read
Sometimes great conversations start with someone sharing an unpopular opinion. I'll put one out there that I think might resonate with anyone who's used a large language model (LLM) to produce some kind of structured outcome:
90% or more of what's wrong with people's perception of AI has little to do with the technology itself and a great deal to do with their expectations of what it's supposed to be able to do.
LLMs are great at getting a wide variety of things most of the way there, with a rather convenient rate of reliability. They don't do such a good job of putting human touches on things, but I don't expect them to, because, well, they're not human. We can also deduce by reason (and/or by them not being human, if reason alone doesn't settle it) that they're not psychic, so they can't read our minds when they're not sure how to proceed.
However, commercial LLMs are marketed as being capable of replicating your human touches, combined with superhuman accuracy and attention to detail. In many preconceived and well-tested situations, they do an okay and predictable job. But the product commercial providers want to sell you isn't the product they've actually built so far, at least as far as accuracy goes, and they don't do a good job of warning you about that aside from their terms of service.
And unfortunately, since most LLM companies never provide guidance, in the space where users write prompts, on how to interact with models about their uncertainty, you're left to assume that models are also psychic. That is ... not helpful of them, and not always entirely accidental.
Research is shedding more light on why hallucinations happen, and it points directly at model uncertainty, both in generating output and in interpreting input. Getting hallucination rates down to a guaranteed ~0.01% is something that has to be solved during model training (avoiding them entirely is mathematically impossible). Companies pour their effort into avoiding hallucinations so they don't have to warn you about them, which means there's little instrumentation to see one happen and correct it.
The good news? There are not-hard things you can do right now to mitigate uncertainty while it gets resolved more thoroughly through better training, and in some cases a few extra prompt input tokens are all you need to squelch many common hallucination opportunities.
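To make that concrete before we dig in, here's a minimal sketch in Python of what "a few extra prompt input tokens" can look like: a reusable preamble that gives the model a sanctioned way to express uncertainty instead of guessing. The exact wording here is my own illustration, not a canonical incantation, and the wrapped prompt would be sent to whichever model and API you already use.

```python
# A minimal sketch: prefix any task prompt with explicit instructions for
# handling uncertainty, so the model has permission to say "I'm not sure"
# instead of inventing an answer. The wording is illustrative, not canonical.

UNCERTAINTY_PREAMBLE = (
    "If any part of this request is ambiguous, ask a clarifying question "
    "before answering. If you cannot verify a fact needed for the answer, "
    "mark it as 'unverified' rather than guessing. If you do not know, "
    "say 'I don't know'.\n\n"
)

def with_uncertainty_handling(task_prompt: str) -> str:
    """Return the task prompt prefixed with uncertainty-handling instructions."""
    return UNCERTAINTY_PREAMBLE + task_prompt

if __name__ == "__main__":
    # The wrapped prompt is what you'd actually send to the model.
    print(with_uncertainty_handling("Summarize the Q3 sales figures for the EU region."))
```

That's the whole trick at its simplest: a handful of extra tokens that tell the model what to do when it's uncertain, rather than leaving it to fill the gap on its own.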
