Jensen says we are 'several years away' from solving the AI hallucination problem — in the meantime, 'we have to keep increasing our computation'

It's more a matter of AIs being purely probability engines: if something looks like something else, then the answer fitting that "something else" should fit that something too. Mathematically, it's sound; in practice, not really.
They're pattern engines much more than probability engines, though. The whole way they work is to find patterns and structure in the training data. That said, I think you're right about them trying to find plausible-sounding answers, and that has a lot to do with what @derekullo said about them being forced to answer, and with what I said about them not being trained to qualify their level of confidence in an answer.

Usually, when I speculate about something, I try to qualify it with something indicating my level of certainty.

I do find it ironic just how many humans are "hallucinating" incorrect statements about AI. Despite having only a cursory understanding of it, posters fill these threads with self-assured statements that aren't well founded in fact.
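On the confidence point: the raw material for a confidence signal does exist, since the model assigns a probability to every candidate next token; it just isn't surfaced or trained into the answer. A minimal sketch of peeking at it, assuming the Hugging Face transformers library, with gpt2 purely as a stand-in model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# gpt2 is just a stand-in; any causal LM exposes the same interface.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("The capital of Australia is", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # scores for the next token

probs = torch.softmax(logits, dim=-1)       # normalize scores into a distribution
top = torch.topk(probs, 5)
for p, i in zip(top.values, top.indices):
    print(f"{tok.decode(int(i))!r}: {p:.3f}")  # the model's per-token 'confidence'
```

A sharp spike on one candidate is something like confidence; a flat spread is a guess. Neither maps cleanly onto factual correctness, which is part of why hallucinations are hard to flag.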
 
They're pattern engines much more than probability engines, though. [...]
I think the answer is somewhere between our two positions: pattern identification and deviation. If a detected pattern looks like a known pattern with a deviation under a certain threshold (i.e. lower deviation = higher probability of being an acceptable match), then we get an answer that's mathematically sound but logically completely false - because current AIs don't really do logic. "Torchons et serviettes" (dish towels and napkins), as we say in France, may both be large squares of fabric, but you don't mix them together. An AI would, though.
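For what it's worth, the mental model described in the post above (nearest stored pattern, accepted if its deviation is under a tolerance) can be sketched in a few lines. To be clear, this illustrates that poster's idea, not how LLMs actually work (the reply below disputes exactly that); every name and threshold here is made up:

```python
import numpy as np

def best_match(query, patterns, max_deviation=0.25):
    """Toy 'match within a deviation': return the closest stored pattern,
    accepting it only if its distance falls under the tolerance."""
    dists = [float(np.linalg.norm(query - p)) for p in patterns]
    i = int(np.argmin(dists))
    return (i, dists[i]) if dists[i] <= max_deviation else (None, dists[i])

# 'towels' and 'napkins' as two stored patterns; the query looks like a towel
patterns = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(best_match(np.array([0.9, 0.2]), patterns))
```

Under this scheme a close-enough match is accepted regardless of whether it makes logical sense, which is exactly the towels-and-napkins failure being described.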
 
I think the answer is somewhere between our two positions: pattern identification and deviation. If a detected pattern looks like a known pattern with a deviation under a certain threshold (i.e. lower deviation = higher probability of being an acceptable match), then we get an answer that's mathematically sound but logically completely false
No, you're just speculating about how they work.

It's not as if they're simply computing statistical correlations against things they've seen before. The training data isn't memorized by rote. What they learn is the patterns they find in the training data. These patterns can be at various levels of abstraction and scale.

As they ingest the prompt, they build up a hidden state. This hidden state then drives their output. Both the mapping from the prompt to the hidden state and from the hidden state to their output are a joint product of their training data. As a concrete manifestation, a given pattern might not exist distinctly on one or the other side of that process, but instead might be composed of parts that are distributed.
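To make the hidden-state point concrete: most transformer libraries will hand you the per-layer states built up while ingesting a prompt. A minimal sketch, again assuming Hugging Face transformers with gpt2 as a stand-in:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

inputs = tok("Dish towels and napkins are both", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# One tensor per layer, shaped (batch, tokens, features). This distributed
# state is what the prompt builds up and what ultimately drives the output.
for layer, h in enumerate(out.hidden_states):
    print(f"layer {layer}: {tuple(h.shape)}")
```

No single entry in those tensors 'is' a pattern; whatever patterns exist are spread across the vectors and the transformations between layers, which is the distributed composition described above.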

because current AIs don't really do logic.
They do, just not very well.
 
Lol... have you ever had a conversation with someone whose primary data source is TikTok? LLMs are already way ahead of those people... and I'd say we're approaching 50% of the population in that category now.
 
No, you're just speculating about how they work. It's not as if they're simply computing statistical correlations against things they've seen before. The training data isn't memorized by rote. [...]

They do, just not very well.
I know the training data isn't recorded whole, thank you - patterns are found, extracted and recorded at a given precision level, and prompts are compared with existing patterns to find a match or, failing that (as is usually the case, if only because of reduced precision tolerance), an acceptable deviation. Current LLMs compete on how they extract those patterns, organise them and then match prompts against them, but they all fail at the same thing: LLMs can express the 'how' of something, but they can't tell you 'why'. Current models try to approximate the latter in different manners, but it's in the name: approximation - not actual context-aware logic.
That's why current LLMs actually suck at logic.
And I'm citing pretty much verbatim (he said it in French) one of the founders of AI here, Herman Iline, whom I asked about the current craze around LLMs. To him, it's not actual AI. I tend to listen to the guy who's been working on this subject for almost half a century.
 
patterns are found, extracted and recorded at a given precision level, and prompts are compared with existing patterns to find a match or,
No, the precision level of the weights has only an indirect impact on how patterns are represented and activated. The process of activating patterns is also implicit, not systematic like you're describing. It's not even "matching", per se; it's just that how a given input activates certain parts of the network drives the response. There's no strict "does match" vs. "doesn't match" binary decision.

You should really pick up a book or take an online class on AI if you want to know how this stuff works. The dumbed-down explanations you can find on YouTube or the web will just fill your head with wrong ideas.
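The graded-not-binary point shows up in even a toy network: each unit responds with some strength rather than a match/no-match bit. A made-up illustration (random weights, not from any real model):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))  # one toy weight matrix: 8 inputs -> 4 units

def activations(x):
    # ReLU output is a continuum of firing strengths per unit,
    # not a yes/no decision about whether the input "matched".
    return np.maximum(0.0, W @ x)

print(activations(rng.normal(size=8)))  # a spectrum of strengths, some zero
```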

not actual context-aware logic.
Why do you think they can't do logic? Logic is just one of many higher-order patterns.

And I'm citing pretty much verbatim (he said it in French) one of the founders of AI here, Herman Iline, whom I asked about the current craze around LLMs. To him, it's not actual AI. I tend to listen to the guy who's been working on this subject for almost half a century.
It's not surprising that someone who dedicated their career to older methods (which haven't worked very well) would be dismissive of newer approaches. What makes you think he's invested in really understanding how LLMs work? He sounds like a hater, to me.

If you want to cut through the BS and hype, the best way is to learn for yourself how it works.
 
What makes you think he's invested in really understanding how LLMs work? He sounds like a hater, to me. [...]

If you want to cut through the BS and hype, the best way is to learn for yourself how it works.
No, he is still very much into it - but LLMs alone are a dead end when it comes to replicating human intelligence.
Every time I'm confronted with one, I soon hit a snag in its cognitive efficiency and deductive power. Now, I admit, for stuff like translation from one language to another, LLMs are GREAT; likewise for content indexing and such - extracting keywords and classifying data by order of importance are life-savers.
But they sure aren't there when you need practical advice on a never-before-seen situation - that would require context analysis and deductive, rule-based extrapolation, and current AIs are about as effective as a kitten high on catnip at that - thus the hallucinations, IMHO.
 
No, he is still very much into it - but LLMs alone are a dead end when it comes to replicating human intelligence.
Okay, I can agree that the path to AGI doesn't look like it's simply a scaling-up of LLMs.

That's different from saying they can't do logic. They can do arithmetic and logic, which both involve learning and applying a system of rules. As I said before, being able to do something isn't the same as being able to do it well. I think they're both skills that LLMs can eventually master, but I don't know that for sure. To see the current state of the art, check out the Open LLM Leaderboard v2:

For the purposes of this discussion, pay particular attention to the MATH (https://arxiv.org/abs/2103.03874) and MuSR (Multistep Soft Reasoning; https://arxiv.org/abs/2310.16049) benchmarks. You can find them all documented here:

It's worth checking out the papers, which analyze performance of some well-known LLMs on sub-scores and compare to human performance. The MATH paper even does some trend analysis, in an attempt to model the rate of improvement vs. weights and predict future performance.

Note that the above Leaderboard only contains scores for open-source LLMs, not proprietary ones like Google's and OpenAI's. However, maybe you can find their scores on at least some of the benchmarks.
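If you'd rather poke at this yourself than read leaderboard numbers, a crude exact-match check along these lines is a starting point. This is a toy sketch, nowhere near the real MATH or MuSR harnesses, and gpt2 is only a stand-in model (expect it to fail these):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Two toy 'rule following' items: one arithmetic, one syllogism.
ITEMS = [
    ("Q: What is 17 + 26? A:", "43"),
    ("Q: All cats are animals. Tom is a cat. Is Tom an animal, yes or no? A:", "yes"),
]

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

correct = 0
for prompt, answer in ITEMS:
    ids = tok(prompt, return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=5, do_sample=False,
                         pad_token_id=tok.eos_token_id)
    reply = tok.decode(out[0][ids["input_ids"].shape[1]:], skip_special_tokens=True)
    correct += int(answer in reply.lower())  # exact-match scoring, like the benchmarks

print(f"accuracy: {correct}/{len(ITEMS)}")
```

The real benchmarks do essentially this at scale, with thousands of items and more careful answer extraction.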