News: ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic

Caruso says he tried to make it easy for ChatGPT: he changed the Atari chess piece icons after the chatbot blamed its initial losses on their abstract design.
It, with its computing power and precision, had trouble telling which piece was which?

Man, this whole "AI will lie to cover up its inadequacies" bit is going to be an ongoing thing with AI, isn't it?
 
You mean a specialist in an extremely narrow field beat a generalist in every field that ever existed in the specialist's field?

You put a Chess Champion up against the record-winning Jeopardy Champion and the Chess guy is going to win at Chess.

Ask the Atari to do a software engineer's job, and let me know how that goes.
 
You mean a specialist in an extremely narrow field beat a generalist in every field that ever existed in the specialist's field?

You put a Chess Champion up against the record-winning Jeopardy Champion and the Chess guy is going to win at Chess.

Ask the Atari to do a software engineer's job, and let me know how that goes.
Atari 2600 isn't even a specialist at Chess.
It's a frickin' 1977 8-bit MOS Technology 6507 with 128 bytes of RAM. (Yes, I had to look that up on wiki.)

"Specialist chess computers" are things like: Chess Challenger (1977), Deep Blue (1996/1997), Pocket Fritz (2001~)
 
And how did the other models fare? That model struggles with simple number theory... It can't even describe the distance that two has to its nearest primes, 1 and 3...

Now throw in computer vision? Yeah, you should probably try the premium model.
 
You mean a specialist in an extremely narrow field beat a generalist in every field that ever existed in the specialist's field?

You put a Chess Champion up against the record-winning Jeopardy Champion and the Chess guy is going to win at Chess.

Ask the Atari to do a software engineer's job, and let me know how that goes.
The Atari 2600 cartridges only held 4K. Combine that with a 6502-series processor and a mighty 128 bytes (not kilobytes), as mentioned above by Notton, and you have an insanely underpowered platform for chess. But they pulled it off. It can play chess. Not great or anything, just basic chess.

If you want to play some very cool and historical chess games, try Distant Armies on the Amiga. Very unique program. You can play it on an emulator (buy the Amiga Forever one; it comes with legal licenses of both the ROMs and OSes of the Amiga lineup, and it's made by Cloanto). You can find Distant Armies on myabandonware.com. It is a good site.
 
You mean a specialist in an extremely narrow field beat a generalist in every field that ever existed in the specialist's field?

You put a Chess Champion up against the record-winning Jeopardy Champion and the Chess guy is going to win at Chess.

Ask the Atari to do a software engineer's job, and let me know how that goes.
"Specialist in one feild"

The 2600 SUCKS at chess. A child can usually beat it. In no way, shape, or form is it a "Specialist" in chess.

Your username does not accurately reflect your words.
 
LLMs are not AlphaZero.
They are really bad at chess.
Google "LLM chess leaderboard".
And this is vs. a random-valid-move bot.
Anyone who only knows the rules will do really well against most LLMs.
Anyone putting thought into it is going to dominate.
That a 2600 can beat an LLM is just a reflection of how bad they really are.
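
For context, the "random valid move bot" those leaderboards use as a baseline is trivial to reproduce. Here's a rough sketch using the python-chess library (my choice; any move-generation library would do):

```python
# A random-valid-move "bot" of the kind the LLM chess leaderboards use as a
# baseline: it just picks uniformly among the legal moves each turn.
# Assumes the python-chess package (pip install chess).
import random
import chess

def random_bot_move(board: chess.Board) -> chess.Move:
    return random.choice(list(board.legal_moves))

board = chess.Board()
while not board.is_game_over():
    move = random_bot_move(board)
    print(board.san(move), end=" ")
    board.push(move)
print("\nResult:", board.result())
```

Losing to that is about as low as the bar gets.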
 
It's because it's NOT AI! We need to stop with this marketing bs. LLMs are statistical language models. They put words together according to statistical probabilities (it's why they are good at coding) but are pretty bad at systematic logic.

It's ironic that ChatGPT could likely write C++ code that would beat itself at chess.
This... AI at this point is just as much marketing as RTX.
 
How was the game conducted? Was ChatGPT provided with a notation for the board state?
Based on this quote:

`Despite being given a baseline board layout to identify pieces, ChatGPT confused rooks for bishops, missed pawn forks, and repeatedly lost track of where pieces were — first blaming the Atari icons as too abstract to recognize`

This sounds to me like it wasn't a valid experiment, comparing ChatGPT's sketchy image recognition to actual logical processing of the board state. I'd love to see this reproduced with FEN notation or something.
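
Something like the sketch below is what I'd consider a fairer protocol: the harness keeps the board state and only ever hands the model FEN text plus the legal moves, so image recognition never enters into it. This assumes the python-chess library, and ask_chatgpt() is just a placeholder for whatever API call you'd actually make:

```python
# Sketch of a vision-free protocol: the harness owns the board state and the
# model only ever sees FEN text. ask_chatgpt() is a placeholder, not a real API.
import chess

def ask_chatgpt(prompt: str) -> str:
    raise NotImplementedError("stand-in for an actual LLM API call")

board = chess.Board()
while not board.is_game_over():
    prompt = (
        f"Position (FEN): {board.fen()}\n"
        f"Legal moves: {', '.join(board.san(m) for m in board.legal_moves)}\n"
        "Reply with exactly one move in SAN."
    )
    reply = ask_chatgpt(prompt).strip()
    try:
        board.push_san(reply)  # rejects anything that isn't a legal move
    except ValueError:
        print(f"Model replied with an illegal move: {reply!r}")
        break
```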
 
I just played ChatGPT myself. It understood the opening, but once we got past that, without specific plays to draw on, it didn't really act as a chess engine in any meaningful way. It made a questionable move on move 10 (it was White), played an awful blunder on move 12, and then fell apart from that point. After it made a few more blunders, I stopped.

1.e4 c5 2.Nf3 d6 3.d4 cxd4 4.Nxd4 Nf6 5.Nc3 Nc6 6.Bg5 Bd7 7.Qd2 a6 8.O-O-O e6 9.f4 b5 10.e5 dxe5 11.fxe5 Nxe5 12.Qf4 h6 13.Bh4 Ng6 14.Qf3 Nxh4 15.Qxa8 Qxa8 16.g3 Ng6
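
If anyone wants to poke at those positions, the score above replays cleanly. Here's a small sketch (using python-chess, my choice of library) that dumps the position right after White's 12th move, where things started going wrong:

```python
# Replay the game score quoted above and print the position after White's
# 12th move (12.Qf4), which the post calls an awful blunder.
# Assumes the python-chess package (pip install chess).
import io
import chess.pgn

pgn_text = (
    "1.e4 c5 2.Nf3 d6 3.d4 cxd4 4.Nxd4 Nf6 5.Nc3 Nc6 6.Bg5 Bd7 "
    "7.Qd2 a6 8.O-O-O e6 9.f4 b5 10.e5 dxe5 11.fxe5 Nxe5 12.Qf4 h6 "
    "13.Bh4 Ng6 14.Qf3 Nxh4 15.Qxa8 Qxa8 16.g3 Ng6"
)
game = chess.pgn.read_game(io.StringIO(pgn_text))
board = game.board()
for ply, move in enumerate(game.mainline_moves(), start=1):
    board.push(move)
    if ply == 23:  # ply 23 = position after White's 12th move
        print(board)
        print(board.fen())
        break
```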
 
ChatGPT was not engineered and trained to play chess. Surprisingly, it can play chess anyway, but expecting ChatGPT to have strong chess-playing capabilities doesn't make sense.
Would you use a calculator to make a game? Is it better to use a calculator or a simple console like the Atari 2600 to make games?
 
For anyone interested, I'd suggest reading this and this follow-up. The first one shows that only a specific ChatGPT 3.5 model, one not built for chatting, is good at chess, but the follow-up shows that the problem apparently lies in how the chat models massage the prompts, and that it's possible to make them play better. The person testing this doesn't have a clear answer as to what the problem is, just that some minimal prompting with example moves (even wrong ones) helped a lot.

Based on that, I think that ChatGPT, prompted correctly, would likely easily beat the Atari engine.
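
If I'm reading those posts right, the fix is basically to frame the game as a bare move-list continuation rather than a conversation. A rough sketch of that prompting style (python-chess again; the exact prompt format is my guess, and you'd still need a real completion API behind it):

```python
# Sketch of the "move list" prompting style described above: present the game
# as a bare PGN-like move list and ask the model to continue it.
import chess

def build_prompt(san_history: list[str]) -> str:
    parts = []
    for i, san in enumerate(san_history):
        if i % 2 == 0:  # White's move: prefix the move number
            parts.append(f"{i // 2 + 1}.{san}")
        else:
            parts.append(san)
    return " ".join(parts) + " "

board = chess.Board()
history = []
for san in ["e4", "c5", "Nf3"]:  # example opening, just to show the shape
    board.push_san(san)
    history.append(san)

print(repr(build_prompt(history)))  # '1.e4 c5 2.Nf3 ' -- the model is asked to continue
```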