News Legendary video game developer imagines a future where GPUs don't need PCs — John Carmack envisions a GPU with Linux onboard, so you would just add...

"A computer is a machine that can be programmed to automatically carry out sequences of arithmetic or logical operations (computation)"

You can call it a desktop, handheld, console or smartphone...but it's still a computer.
 
This just sounds like rambling and should never have been put out there for anyone to take seriously. He's describing something that already exists. He's just describing a mini PC. Now, one could make the argument that he means he wants a mini PC with more graphics-focused computational power vs. general/I/O-focused power.
 
Isn't this what we had in the beginning: a processor and a rudimentary OS, handling pixels themselves in shared memory?

Not saying this is bad, just always interesting when things come full circle. :)
 
You talk as if this is some dystopian future we've never seen before. Ever hear of the X3D chips from AMD? That's pretty much as close as you can get to what you're describing: an all-in-one unit that controls the graphics and processes everything. AMD already did it, and I'm an Intel fanboy by definition.
 
We did see Doom running on a GPU not all that long ago using only OpenCL - no CPU at all. (and Vulkan)

Why couldn't the Linux kernel look for OpenCL first and then go from there? Bypassing a CPU, or not needing one at all, seems entirely feasible.

(I do not mean an SoC, which has an onboard general-purpose CPU. It does not appear that Carmack meant an SoC either.)
 
We did see Doom running on a GPU not all that long ago using only OpenCL - no CPU at all. (and Vulkan)

Why couldn't the Linux kernel look for OpenCL first and then go from there? Bypassing a CPU, or not needing one at all, seems entirely feasible.

(I do not mean an SoC, which has an onboard general-purpose CPU. It does not appear that Carmack meant an SoC either.)
And you still need all the other subsystems.

I/O for kbd/mouse/controller.
Power from the wall or battery.

Call it what you want, but it is then no longer 'just a GPU'.
 
He basically wants a "LARGE SoC".

One that is Console-Esque with a HUGE DIE.
No, that's not what he's saying. Go back and read more carefully.

He doesn't deny the importance of strong CPU cores for gaming performance, which is something GPUs lack. He's merely pointing out that the cores embedded in current-generation GPUs are already general and capable enough that they could at least provide diagnostics without the card having to be plugged into a host system. Since DisplayPort can already tunnel USB, you wouldn't even need to add any new connectors. You just plug in the display cable and the aux power connector, and you should get some kind of basic menu.

If you think about this sort of capability a bit further, the graphics card could even help you in diagnosing a PC problem. If the graphics card gets powered on without receiving PCIe negotiation from the host CPU, it could show some kind of message like "CPU offline". That would at least save you from staring at a blank screen and wondering whether it's a CPU or GPU problem.
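
The firmware on the card's embedded controller could implement this with a check along these lines. To be clear, this is a purely hypothetical sketch; every function in it is made up for illustration and doesn't come from any real GPU firmware:

```c
/* Purely hypothetical sketch of GPU firmware boot logic -- none of
 * these functions exist anywhere; they only illustrate the idea above. */
#include <stdbool.h>

extern bool display_connected(void);        /* hypothetical: is a monitor attached?       */
extern bool pcie_link_trained(void);        /* hypothetical: did a host train the link?   */
extern void scanout_text(const char *msg);  /* hypothetical: draw text via display engine */

void boot_diagnostic(void)
{
    if (!display_connected())
        return;                             /* no one to report to */

    if (!pcie_link_trained())
        scanout_text("CPU offline / no PCIe link");   /* host never showed up */
    else
        scanout_text("GPU OK, waiting for host driver...");
}
```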

@USAFRet is right that, when you upgrade the cores inside a GPU so it can be used standalone "in anger", it just turns into an APU like what consoles already have. That's obviously not a new thing and obviously not what he's talking about.
 
I'd be in favour of bringing the CPU/DRAM components to the GPU, rather than the opposite.

On an RTX 4090, you have more than enough space to include a nice ARM CPU, and plenty of room to plug 4 M.2 SSDs directly onto it.

We could finally get ultimate PCs in a very reduced volume, with minimal latencies as a bonus, since no PCIe bus is needed anymore.

Cases could basically be one strong PSU with ports, on top of which you plug your all-in-one PU, and that's it.
 
I'd be in favour of bringing the CPU/DRAM components to the GPU, rather than the opposite.

On an RTX 4090, you have more than enough space to include a nice ARM CPU, and plenty of room to plug 4 M.2 SSDs directly onto it.

We could finally get ultimate PCs in a very reduced volume, with minimal latencies as a bonus, since no PCIe bus is needed anymore.

Cases could basically be one strong PSU with ports, on top of which you plug your all-in-one PU, and that's it.
You still need a PCIe bus. Just because it doesn't have a removable connector doesn't mean it isn't there. How do you think M.2 SSDs are connected to the CPU or motherboard chipset? They use a PCIe bus. Even integrated GPUs are connected to the CPU using PCIe or something similar.
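
On Linux you can see this for yourself, since every one of those devices hangs off the same bus whether or not it uses a removable connector. Here's a minimal sketch (my own, assuming a Linux system with sysfs) that lists PCI/PCIe devices and their class codes:

```c
/* Minimal sketch: enumerate PCI/PCIe devices via Linux sysfs.
 * Class 0x03xxxx = display controller (including iGPUs),
 * class 0x0108xx = NVMe controller (M.2 SSDs). */
#include <stdio.h>
#include <string.h>
#include <dirent.h>

int main(void)
{
    DIR *d = opendir("/sys/bus/pci/devices");
    if (!d) { perror("opendir"); return 1; }

    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (e->d_name[0] == '.')
            continue;
        char path[512], cls[16] = "";
        snprintf(path, sizeof path, "/sys/bus/pci/devices/%s/class", e->d_name);
        FILE *f = fopen(path, "r");
        if (f) {
            if (fgets(cls, sizeof cls, f))
                cls[strcspn(cls, "\n")] = '\0';
            fclose(f);
        }
        printf("%s  class=%s\n", e->d_name, cls);  /* e.g. 0000:01:00.0  class=0x030000 */
    }
    closedir(d);
    return 0;
}
```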

So what you want is to switch to non-modular systems where you buy it and that's it; when you want to upgrade, you replace the whole thing? Basically the Apple approach.
 
We did see Doom running on a GPU not all that long ago using only OpenCL - no CPU at all. (and Vulkan)
Here's the article:

It absolutely does require a host CPU to run. The point is that the main game thread was ported to run on the GPU's normal vector cores (not the embedded control processors Carmack is talking about in this article). It's not using them efficiently, since most of that game logic is serial and scalar. I guess it was just done to test or demonstrate the capabilities of LLVM's backend for GPUs.
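
To make the "host CPU required" point concrete, here's a minimal sketch of the OpenCL boilerplate the host has to run before the GPU executes a single instruction. The kernel is a trivial stand-in I wrote for illustration, not anything from the actual Doom port (error handling and cleanup omitted):

```c
/* Minimal OpenCL host sketch: the CPU must discover the device, compile
 * the kernel, and dispatch the work before the GPU does anything. */
#include <stdio.h>
#include <CL/cl.h>

static const char *src =
    "__kernel void step(__global int *state) {"
    "    state[get_global_id(0)] += 1;"        /* stand-in for one tick of game logic */
    "}";

int main(void)
{
    cl_platform_id plat;
    cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);                         /* host CPU: discovery   */
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);
    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);          /* host CPU: compilation */
    cl_kernel k = clCreateKernel(prog, "step", NULL);

    cl_mem state = clCreateBuffer(ctx, CL_MEM_READ_WRITE, sizeof(int), NULL, NULL);
    clSetKernelArg(k, 0, sizeof state, &state);
    size_t global = 1;
    clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL); /* host CPU: dispatch */
    clFinish(q);
    puts("kernel ran, but only because the CPU set everything up");
    return 0;
}
```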

Why couldn't the Linux kernel look for OpenCL first and then go from there? Bypassing a CPU, or not needing one at all, seems entirely feasible.
I assume you've seen any number of CPU benchmarks showing how CPU performance affects games, no? Games depend on a partnership of fast CPU and GPU cores, each doing the sort of tasks for which they're best suited. Game developers are free to use either, so the fact that they're still so dependent on CPUs should tell you something.
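
Schematically, every frame is split like the sketch below. All the names here are hypothetical (there's no real engine behind them); the point is just the division of labor that those benchmarks are measuring:

```c
/* Schematic frame loop -- every function is hypothetical; it only
 * illustrates which work stays on the CPU and which goes to the GPU. */
struct world;

extern void read_input(struct world *);       /* CPU: OS events, I/O                   */
extern void step_game_logic(struct world *);  /* CPU: serial, branchy AI/physics       */
extern void build_draw_calls(struct world *); /* CPU: per-object driver work           */
extern void submit_to_gpu(struct world *);    /* GPU: wide, parallel vertex/pixel work */

void run_frame(struct world *w)
{
    read_input(w);
    step_game_logic(w);     /* the part CPU benchmarks expose */
    build_draw_calls(w);
    submit_to_gpu(w);       /* GPU crunches while the CPU starts the next frame */
}
```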

Consoles are the ultimate example of this, because you have custom hardware, custom APIs, and (somewhat) custom software. If powerful CPU cores weren't key to game performance, the current-gen consoles either wouldn't have used Zen 2 cores, or at least not so many of them, and would've instead devoted more die area to larger GPUs.

Also, GPUs don't have filesystem drivers or network stacks. So, they're dependent on a host CPU & OS for those functions. Even DirectStorage relies on the host CPU & OS to do all the setup and management of GPU-driven disk I/O. Sure, if you could run a full-blown OS on the GPU's embedded control processors, then it could theoretically be self-sufficient, but it sure wouldn't be fast.
 
I'd by in favour to bring the CPU/dram components to the GPU rather than the opposite.

On a RTX 4090, you have way enough space to include a nice arm CPU, and a lot of space to plug 4 M2 SSD directly on it easy.
You've exactly described the Nvidia SoC used in Nintendo Switch, where the only difference is one of scale (and the lack of M.2 slots, although the SoC has PCIe lanes).

We could finally get ultimate PCs in a very reduced volume,
You mean like a gaming console? The current Xbox and PS5 have APUs and GDDR6 memory, so that's exactly what you're describing.
 
No, that's not what he's saying. Go back and read more carefully.

He doesn't deny the importance of strong CPU cores for gaming performance, which is something GPUs lack. He's merely pointing out that the cores embedded in current-generation GPUs are already general and capable enough that they could at least provide diagnostics without the card having to be plugged into a host system. Since DisplayPort can already tunnel USB, you wouldn't even need to add any new connectors. You just plug in the display cable and the aux power connector, and you should get some kind of basic menu.

If you think about this sort of capability a bit further, the graphics card could even help you in diagnosing a PC problem. If the graphics card gets powered on without receiving PCIe negotiation from the host CPU, it could show some kind of message like "CPU offline". That would at least save you from staring at a blank screen and wondering whether it's a CPU or GPU problem.
So he wants the video card's GPU to have its own micro-SoC-like functionality.

In case the CPU isn't connected or there's a MoBo issue, show something on-screen.
 
Yeah, this would be lovely. Nvidia GPUs already have an embedded RISC-V core (which apparently had been unused... the recent 'open source' Nvidia driver just moved some of the 'binary blob' stuff to run on the RISC-V core instead of inside the Nvidia driver on the host system). Stick some ARM cores on there (which would take almost no die space in comparison) and away you go! I don't know whether they'll actually do it, but I like the concept.

Those who say 'well, that's an SoC'... you're right, but most current SoCs don't have as nice a GPU built in.
 
So he wants the video card's GPU to have its own micro-SoC-like functionality.
It already does. Modern GPUs can even have multiple general-purpose cores; Nvidia has been using RISC-V for a long time. They even have MMUs. He's suggesting those cores probably have capabilities sufficient to run a general-purpose OS like Linux.

In case the CPU isn't connected or there's a MoBo issue, show something on-screen.
Yes, that's how I read his statements.
 
You've exactly described the Nvidia SoC used in Nintendo Switch, where the only difference is one of scale (and the lack of M.2 slots, although the SoC has PCIe lanes).


You mean like a gaming console? The current Xbox and PS5 have APUs and GDDR6 memory, so that's exactly what you're describing.
Yeah, instead of buying a GPU and a CPU, I'd prefer to buy a full SoC system.

It would be a bit less modular, but it would be more optimized and more compact.

I often change both my CPU and GPU at the same time because I get good matching ones from the same generation in the first place, anyway.

And often you also need to change the motherboard, so why not have an optimized all-in-one component, then?

Just updating my ssd as a separate component is fine by me.

Nvidia kind of does that with its AI platform, actually.
 
Yeah, instead of buying a GPU and a CPU, I'd prefer to buy a full SOC system.

It would be a bit less modular, but it would be more optimized and more compact.
You can already buy plenty of mini-PCs like this. There are also (usually) mini-ITX boards available with laptop SoCs soldered down and still some (but fewer) PCIe lanes and SO-DIMM slots.

So far, they have comparatively weak GPUs. One of the main problems faced by iGPUs is that DDR memory has much less bandwidth than GDDR memory. CPUs don't normally support GDDR memory because it's more expensive, lower density, less efficient, and must be soldered down.
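
The arithmetic behind that bandwidth gap is simple: bytes per second = transfers per second x bus width in bytes. A quick sketch with illustrative part numbers I picked myself (not figures from any specific product):

```c
/* Back-of-the-envelope memory bandwidth: MT/s * (bus bits / 8) -> GB/s. */
#include <stdio.h>

static double gb_per_s(double mega_transfers_per_s, int bus_bits)
{
    return mega_transfers_per_s * 1e6 * (bus_bits / 8.0) / 1e9;
}

int main(void)
{
    /* Typical dual-channel desktop DDR5 vs. a midrange GDDR6 card. */
    printf("DDR5-5600, 128-bit:      %6.1f GB/s\n", gb_per_s(5600.0, 128));   /*  89.6 */
    printf("GDDR6 18 Gbps, 256-bit:  %6.1f GB/s\n", gb_per_s(18000.0, 256));  /* 576.0 */
    return 0;
}
```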

However, Apple got around this by going wide with on-package stacks of LPDDR memory. Lunar Lake did the same thing, but doesn't get much benefit, since it's still just using a 128-bit data path. In January, AMD will announce Strix Halo, which features a substantially upgraded iGPU and on-package 256-bit LPDDR5X memory.

So, if you especially like the idea of mini-PCs, then I'd suggest waiting to see how Strix Halo performs and maybe getting a mini-PC based on that. I think they won't be cheap, especially for the amount of iGPU performance they provide, which I'd estimate as approximately equal to that of a PS5. To put it in dGPU terms, that would be somewhere around or slightly above an RX 7600 XT.

Just updating my ssd as a separate component is fine by me.
It sounds like you might enjoy a deep dive into the world of mini-PCs.

One thing that bugs me about prebuilt machines is that I can't really do anything about their cooling solution. I bought an Alder Lake-N mini-ITX board and built my own PC around it, because it gave me more control over what case to use. In the end, my biggest issue was the inadequate CPU heatsink/fan the motherboard vendor selected, but I was able to get an alternate heatsink from them and cool it using a case fan. The case is 1.3L, which is only slightly bigger than a lot of NUC-like machines.
 
So, I found this block diagram of the Tegra Parker (TX2) SoC from 6.5 years ago that's very similar to the one used in the Nintendo Switch. I think this is particularly interesting, because it shows the role of these embedded microcontroller cores, even when you have general-purpose CPU cores on the same die!

[Image: Tegra_Parker_Block_Diagram-1.png (Tegra Parker SoC block diagram)]


See the block labeled Cortex-R5 BPMP (Boot, Power Mgmt.)? That's an example of the kind of cores Carmack is talking about. You can also see a Cortex-A9 block that's devoted to audio processing.

Perhaps more interesting is this recent video describing the role that RISC-V cores currently play inside of Nvidia products:

A key quote from that video (just 2 minutes in):

"Any Nvidia chip (in 2024) has RISC-V processors and I would say it goes from 10 RISC-V processors to maybe even 30 or 40 per chip."

From the linked slide at 4:19:

"We have 3 uses for RISC-V cores:

Function-level control:
  • Video codec
  • Display
  • Camera
  • Memory controller (training)
  • Chip2chip interfaces
  • Context-switch
  • ....

Chip/System-Level control:
  • Resource management
  • Power management
  • Security

Data Processing:
  • Packet routing in networking
  • Activation and other DL network layers in DLA (not GPU)"

He drops a couple of other juicy details, like how the networking business unit (I'm guessing this is Mellanox) uses RISC-V for packet processing, and he mentions a RISC-V core Nvidia developed with 1024-bit vector extensions (I'm guessing for AI processing). He also talks about their standard RISC-V core for the above embedded applications, which he says is currently dual-issue out-of-order.

At about 12:20, he introduces the GSP (GPU System Processor), which is perhaps primarily what Carmack was thinking about.

P.S. I was somewhat stymied by his use of the term "multi-heart", but then I remembered he's from Nvidia and they have a somewhat unconventional definition of "core". So, that explains why he talks about RISC-V "processors" and "hearts", not just "cores". :D
 
Nvidia has been working on this since the debut of CUDA. First they made CUDA programs able to loop without a signal from the CPU, then they added a library that lets the GPU use the network through a network card on PCIe; it's used in supercomputers to interconnect GPUs.
One of the reasons SLI was removed is that we can now do this over PCIe without an external bus. Games can be programmed to use multiple GPUs, but developers don't bother.
More recent developments like direct memory access (the GPU can access RAM and NVMe storage without the CPU) come from that.

This is technically not equivalent to an APU: when you make a video game, you write a program running on the CPU that contains, notably, the game logic and prepares the graphics resources, which are pulled from memory and sent to your graphics card; you write another program on your graphics card to render the scenes. Some rendering programs can know when they need higher-detail resources and can access system memory directly, instead of listing the resources for the CPU to send, as was done before.
Here we're talking about a big program, like a graphical operating system, running on the GPU without another processing unit, which would simplify development.
This is basically what Intel's Larrabee intended, but the other way around: graphics instructions were an extension of x86. That gives you a system with only general-purpose processors that is still able to run games. It was canceled due to poor graphics performance.