News AI Hallucinations Ranked: ChatGPT is Best, Palm-Chat Needs to Sober Up

Prompt any of these LLMs with "quote somebody who said something along the lines of xxx is yyyy of zzz", and 90% of the time it will invent a quote and state it as fact.
 
Well, hallucinations are natural if you consider how these models work. And interestingly, that's how we work, too.

What we might not notice any more is that we tend to subject anything we come up with to a plausibility check and discard obvious gibberish rather quickly... unless we are tired, drunk or otherwise debilitated, in which case that nonsense comes out unfiltered.

Young kids haven't trained those filters yet either, which is why they come up with "hallucinations" we often find delightful or charming.

But that second, corrective phase also works with these models to a certain degree. Perhaps it should be made part of formulating the response, though it would raise the operational load significantly.

So when I found e.g. Llama or Mistral hallucinating or contradicting itself on factual queries, just asking a question that exposed how its last couple of answers contradicted each other was enough: the model would notice and correct its mistakes, my first instances of artificial contrition!
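To make that corrective pass concrete, here's a rough sketch of the two-pass idea in Python: draft an answer, then show the model its own recent answers and ask it to flag and fix contradictions. The chat() helper is just a placeholder for whatever model endpoint you're querying (Llama, Mistral, anything OpenAI-compatible), not a real library call.

def chat(messages: list[dict]) -> str:
    # Placeholder: wrap your model's chat endpoint here.
    raise NotImplementedError

def answer_with_self_check(question: str, history: list[str]) -> str:
    # First pass: the ordinary answer, which may hallucinate.
    draft = chat([{"role": "user", "content": question}])

    # Second pass: confront the model with its own last few answers plus
    # the new draft and ask it to resolve any contradiction among them.
    review_prompt = (
        "Here are your previous answers:\n"
        + "\n".join(f"- {a}" for a in history[-3:])
        + f"\n\nHere is your new draft answer:\n{draft}\n\n"
        + "Do any of these statements contradict each other? If so, point "
        + "out the contradiction and give a corrected answer; otherwise "
        + "repeat the draft unchanged."
    )
    return chat([{"role": "user", "content": review_prompt}])

Every question now costs two model calls instead of one, which is exactly the extra operational load mentioned above.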

I've had tons of fun with hallucinations, especially debating historical personalities. They typically wound up being brothers and sisters, both male, yet having offspring who'd then be grandfather and nephew to each other... it obviously understood royalty and its constrained choices rather well!

Without analysing or knowing its training data, it's unfortunately rather hard to gauge where it's more likely to go off the rails. I don't know whether the models calculate how sure they are of a certain answer, e.g. because they have lots of supporting data, but if they do, it doesn't seem to influence the word choice in their answers today: they'll sound just as confident about total bollocks as about proper facts.
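For what it's worth, the closest thing to "how sure the model is" that you can read out of most services today is the per-token log-probability some APIs expose. Here's a small sketch, assuming the OpenAI Python SDK with its logprobs option (the model name is only an example); averaging the token log-probabilities gives a crude confidence proxy, though it's poorly calibrated and, as said, never shows up in the wording of the answer itself.

import math
from openai import OpenAI

client = OpenAI()

def answer_with_confidence(question: str) -> tuple[str, float]:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # example model; any chat model exposing logprobs works
        messages=[{"role": "user", "content": question}],
        logprobs=True,
    )
    choice = resp.choices[0]
    logprobs = [t.logprob for t in choice.logprobs.content]
    # Geometric mean of the per-token probabilities: closer to 1.0 means
    # the model rarely hesitated between alternative tokens.
    avg_prob = math.exp(sum(logprobs) / len(logprobs))
    return choice.message.content, avg_prob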
 
Hallucinations? No, in the case of AI, that’s just a euphemism for lies. I took LSD back in the day - I know what a hallucination is, and AI just plain tells lies.

The people marketing it as a hallucination are also lying, because if AI gets a reputation for lying, it’s not marketable. Don’t trust any of them.
 
Without analysing or knowing its training data, it's unfortunately rather hard to gauge where it's more likely to go off the rails. I don't know whether the models calculate how sure they are of a certain answer, e.g. because they have lots of supporting data, but if they do, it doesn't seem to influence the word choice in their answers today: they'll sound just as confident about total bollocks as about proper facts.
I think the underlying problem is that these models simply weren't trained to estimate their own degree of certainty and appropriately qualify their answers. Developing such training data would take a lot of work, but should be doable.
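Just to illustrate, one such training record might pair a question with a verifiably supported answer and with the hedged phrasing we'd want when the evidence isn't there; every field name here is made up for illustration.

calibration_example = {
    "question": "Who said 'the medium is the message'?",
    "supported_answer": "Marshall McLuhan, in Understanding Media (1964).",
    "hedged_answer": "I'm not certain who said that; I may be misattributing it.",
    # Label telling the trainer whether the corpus actually backs the claim,
    # so the model learns when to fall back to the hedged phrasing.
    "evidence_available": True,
}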

Hallucinations? No, in the case of AI, that’s just a euphemism for lies. I took LSD back in the day - I know what a hallucination is, and AI just plain tells lies.

The people marketing it as a hallucination are also lying, because if AI gets a reputation for lying, it’s not marketable. Don’t trust any of them.
I think the term "hallucination" is well-chosen. A "lie" is a knowing falsehood. A "half-truth" omits key information that would change the meaning of what's said. I'm not aware of a good term for saying something you believe to be true that's actually wrong. You could call it an "error", but that's a rather overloaded term.
 