[SOLVED] Why can't GPUs just render more frames?


Calicifer

Prominent
Greetings,

I'm stuck on a problem that seems fundamental to owning an enthusiast GPU. You're supposed to buy a powerful GPU to get a high framerate, yet some games simply refuse to render more than a certain arbitrary number of frames. I checked my V-Sync settings in game and in the Nvidia Control Panel, turned them off, and set the GPU to render as many frames as it can. Even so, my GPU is still not working at 100% in Fallout: New Vegas. I get 50-70 FPS at the highest possible settings at about half load.

This is a problem I've noticed in a lot of older games: the GPU simply does not render the very high framerates that high-refresh-rate monitors need. Why is that? Shouldn't I be able to control how hard my GPU works in any particular game? Does poor framerate performance depend solely on the game engine and how much work it can feed the GPU?
 
Solution
This is a problem I've noticed in a lot of older games: the GPU simply does not render the very high framerates that high-refresh-rate monitors need. Why is that? Shouldn't I be able to control how hard my GPU works in any particular game? Does poor framerate performance depend solely on the game engine and how much work it can feed the GPU?
Strap yourself in for a long one.

The answer to this question comes in two parts. The first is that the GPU cannot do any work unless the CPU tells it to. You can think of the CPU compiling a "graphics order form" from the application, then submitting it for the GPU to render. This "order form" contains things like where objects are positioned and what sort of operations to run to generate...

Calicifer

Prominent
Interesting. I'm glad to hear that we're on the same page.

However, many games do not generate more frames even when my system seems to be under-utilised. If game engines in general can render more frames and my CPU and GPU are underloaded, then why don't they? This is what I want to discover, and I'm starting to think that there isn't an easy answer. Even in Fallout: New Vegas: with a new mod, my system now bottlenecks when I'm wandering through the wasteland. However, when I go into cities or buildings, I see behaviour similar to before. My system is under-utilised while the framerate still does not go as high as the game engine supposedly supports.
 

Eximo

Titan
Ambassador
My 3dfx Voodoo2 16MB card was amazing for 3D graphics back in the day compared to everything else; it was so advanced in design that Nvidia bought the company just for the rights. Compared to today's cards, it's an ant trying to stare down an elephant.

The Voodoo2 was only available in 8MB and 12MB variants. You had to bump up to the Voodoo3 to get 16MB (unless you had two Voodoo2 cards in SLI, which I did for a while).

Nvidia bought the company because they had already defeated them in the market with the GeForce 2. The Voodoo 4 and 5 were late to market and overpriced. Some of 3dfx's technology did end up in Nvidia cards, but not a lot. Nvidia did keep the SLI branding, as an example, but 3dfx's and Nvidia's versions operated in completely different ways.

I should check if my Voodoo 5 5500 still works; I haven't turned it on in at least two years. Going to have to start worrying about capacitors soon.
 
Simplified way a PC plays a game (a rough timing sketch is at the end of this post):
  1. CPU figures out what needs to be in a given frame (imagine a rough sketch) based on user and game world input. Issues draw call to GPU to tell it what to render.
    • Think of this as positional tracking. How many things are moving (or have the potential to) from one frame to the next.
  2. GPU receives draw call and makes a pretty picture. Sends to monitor when complete.
    • This is the detail pass: the object is now in a new position, so how have the lighting, shading, etc. changed? Re-draw the object per game/quality rules.
  3. The GPU can't do any work until the CPU tells it what to draw. Raising graphics settings and/or resolution increases the complexity of the GPU's job, making it take longer to render each frame. Lowering settings decreases the complexity of the GPU's job, making it take less time to render each frame.
  4. If the GPU finishes rendering a frame before the CPU has finished figuring out what the next frame should contain, the GPU has to wait (<100% GPU usage).
  5. Based on #3 & #4, you should be able to optimize for 90% or greater GPU usage (depending on a game's CPU stress and the CPU/GPU balance of a system)
  6. CPU usage is usually reported as active time across all available threads of a CPU. Most* games don't leverage more than 6-7 threads, so monitoring overall CPU usage isn't really useful; it can be misleading, especially on today's high core-count CPUs.
What you're breezing by here is:
  • Your system isn't exactly "high end" by modern standards.
  • A game engine can only cram so much stuff through your CPU at once. The CPU can't just "half-calculate" everything. Likewise, the GPU needs to render each frame with the given details outlined by the draw call and quality/resolution settings. Trust me, you wouldn't like the outcome if they didn't.
    • In ANY game, in a specific setting, there are a certain amount of things that the CPU needs to track/calculate in a frame to be sent to the GPU. That isn't a fault of the game engine.
      • What CAN be a "fault" of the game/engine is not being able to utilize all the threads on a CPU. You need to understand that it's incredibly time-consuming for developers to scale thread utilization. And although we've seen a recent resurgence in CPU core-count increases, scaling isn't linear; that's why frequency/IPC is still king when it comes to FPS. The devs also have to prioritize and balance their time between game world design and performance optimization. Money and time are never unlimited. If a game/engine was created at a time when a 4C/4T CPU was the most common, it makes sense to target that with the most effort. "Wasting" your time building a game/engine that can use 32 threads when the most common gaming system still has 4-8 threads is silly. Devs have a LOT of other, more important, things to do.
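Here's the rough timing sketch I mentioned at the top of the post. All the frame times below are invented, not measured from any real game; it just shows why the slower of the CPU/GPU pair paces the frame rate (#3 and #4 above) and why GPU usage sits under 100% when the CPU is the slow side.

Code:
# Toy timing model of steps 1-4 above (the millisecond figures are made up).
# The frame rate is set by whichever side is slower; if the GPU finishes
# first, it sits idle waiting for the next draw call (<100% GPU usage).

def frame_stats(cpu_ms, gpu_ms):
    frame_ms = max(cpu_ms, gpu_ms)          # the slower side paces the frame
    fps = 1000.0 / frame_ms
    gpu_usage = 100.0 * gpu_ms / frame_ms   # share of the frame the GPU is busy
    return round(fps), round(gpu_usage)

# CPU-limited case (an old engine on a strong GPU):
print(frame_stats(cpu_ms=16.0, gpu_ms=8.0))   # ~62 FPS, GPU only ~50% busy

# GPU-limited case (raise resolution/settings and the GPU becomes the pacer):
print(frame_stats(cpu_ms=16.0, gpu_ms=25.0))  # 40 FPS, GPU 100% busy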
 
If I understand correctly, under-utilized hardware is the fault of the game engine for not being able to issue enough commands in a small time frame. My desire is to have my system running at 99%, pushing out more frames for high-refresh monitors. However, I can't just get more FPS, because the system would be rendering the same image multiple times. If my hardware is under-utilised, it highlights that my system is more powerful than the game engine, which can't possibly keep pace with my system and my expectations.
And that's correct. Software design limitations can prevent hardware from being 100% utilized. The only way to get around this is to redesign the software so it can take advantage of modern hardware. However, this isn't as easy as just raising the game loop rate; running the game loop more times per second can have unintended consequences for other aspects of the game (like physics, as I mentioned before).
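To illustrate what "raising the game loop rate" risks, here's a common pattern in outline. This is a minimal sketch only; I'm not claiming any specific engine does exactly this, and every function and number here is a placeholder: keep the simulation on a fixed timestep so physics behaves the same at any FPS, and let rendering run as fast as the hardware allows.

Code:
# Minimal fixed-timestep sketch: physics always advances in 60 Hz steps,
# while rendering runs as often as it can. Placeholder functions only.

import time

SIM_DT = 1.0 / 60.0            # physics step tuned and tested at 60 Hz

def update_physics(dt):        # placeholder simulation step
    pass

def render_frame(alpha):       # placeholder draw; alpha = interpolation factor
    time.sleep(0.004)          # pretend drawing takes ~4 ms (~250 FPS render rate)

accumulator = 0.0
previous = time.perf_counter()
for _ in range(300):                      # stand-in for "while the game is running"
    now = time.perf_counter()
    accumulator += now - previous
    previous = now
    while accumulator >= SIM_DT:          # catch the simulation up in fixed steps
        update_physics(SIM_DT)
        accumulator -= SIM_DT
    render_frame(accumulator / SIM_DT)    # render as fast as possible

The point is that the render rate and the simulation rate are decoupled; an engine that ties them together is the one that needs a frame cap.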
 
In my mind, I imagine that the CPU is a lot faster and is not usually a bottleneck in games
No? Then explain this chart: (and this isn't even an extreme example since these are all very powerful 1-2 year old CPUs)
[Chart: gaming FPS comparison across several recent CPUs]


Ultimately, some games are more CPU-bound than others; the FPS differences between various CPUs vary by game. Also, online gaming carries an extra layer of CPU load in having to track all the other IRL players on the map/screen.

Bonus chart. There are some further conclusions to be drawn from this, like thread usage, IPC, etc.
[Chart: additional CPU gaming benchmark results]


As mentioned in my post #30, we're still seeing most games cap out around 6-8 threads (it helps if you normalize for IPC and clock frequency). You may call that lazy programming, but that's just the real world, which operates on time and money.

Anecdotal evidence: I recently upgraded from an i7-3770 to a 5600G. Despite no real* change in thread count (8t to 12t) or frequency, and even with the same 3060 Ti, my frame rates doubled or more in many games.



I can't get a good read on your motives with this thread. Is it to understand how game engines work, or to be salty and pass blame that your 9 year old 4c/4t CPU is obsolete? If you're just trying to learn, this thread can become much more informative.
 
My 3dfx Voodoo2 16MB card was amazing for 3D graphics back in the day compared to everything else; it was so advanced in design that Nvidia bought the company just for the rights. Compared to today's cards, it's an ant trying to stare down an elephant.
3dfx was already in its death throes when NVIDIA purchased them. By that time, NVIDIA and ATi had more advanced GPU designs. 3dfx stubbornly chose not to invest in hardware T&L, which ended up being their downfall. I would say sticking with Glide for as long as possible contributed to that as well.
 

Calicifer

Prominent
Ahh. That's a matter of perspective. Back when the 760 was new, it was still a seriously decent card considering the complexity of the games it was used on. A 760 for CS:GO was great. If you were stepping up from Intel HD iGPU graphics, it was a night-and-day difference.

My 3dfx Voodoo2 16MB card was amazing for 3D graphics back in the day compared to everything else; it was so advanced in design that Nvidia bought the company just for the rights. Compared to today's cards, it's an ant trying to stare down an elephant.

So I'm not surprised by the OP's reaction, coming from an iGPU to a GT 1030.

I checked what the market was like back then. I had the option of HD 7000 series cards, which were the cheap solution, and R7 200 series cards seemed to be available as well. The go-to cards, though, were the GTX 600 series. The GTX 760 was deemed a "semi-affordable performer" by some reviewer back then. It certainly was in that environment, because there weren't many better options; pre-built computers in my price range came with GTX 660 GPUs at best. I splurged on a GTX 760 with my stretch budget and it was a great investment. I remember the GTX 770 being beyond my budget, and the GTX 780 cost a little more than twice as much as the GTX 760. At the time it was the best option, and the card could easily handle all games from that era. As I check games from that era now, I'm surprised how high-end this graphics card really was; it is often overkill for most of them.

I built my system in early 2014. Those were dark times. I associated the Radeon HD series with those retro cards; I had a GeForce 6600 before. At the start of this year I upgraded to a GTX 780 Ti, but was disappointed with that card: it had terrible thermal performance and bad BIOS settings, it lacked DirectX 12 support, and 3 GB of VRAM is simply not enough. So I downgraded back to my previous GPU, partly for sentimental reasons.
 

Calicifer

Prominent
No? Then explain this chart: (and this isn't even an extreme example since these are all very powerful 1-2 year old CPUs)

I can't get a good read on your motives with this thread. Is it to understand how game engines work, or to be salty and pass blame that your 9 year old 4c/4t CPU is obsolete? If you're just trying to learn, this thread can become much more informative.

Maybe it is due to poor game engine scalability. The game engine gives too much work upfront rather than properly sequencing it. For example, it needs to render all those frames and also calculate metadata for how objects outside of your view move. Instead of making sure the information it needs to render right now is processed first, the engine mixes in information which is unnecessary at that moment. Since the CPU has access times, limited processing speed, and limited cache, it can't process all this information immediately. This is when render latency kicks in. If, on the other hand, these two tasks were separated onto core 0 and core 1, the CPU could process both of them without delay.

In my view, game engines are not ideally optimised, and this is why we see under-utilized CPUs still offering slightly different performance. A good example is the Ryzen 7 5800X3D vs Intel's i9-12900KS. The Ryzen chip has more on-chip memory; it is able to keep more of its working data in cache, where it can access the necessary information faster, and thus offers better performance despite being a weaker chip overall.

So to conclude, I think it is an issue of trying to process too much time-sensitive information at once while under-utilizing the hardware. It is quite uncommon to see the CPU as a bottleneck; in general, CPUs tend to be faster than games need them to be, and it is the GPU which struggles to keep up with its tasks.
 
Maybe it is due to poor game engine scalability.
Both charts I linked are aggregate results from multiple games.
The game engine gives too much work upfront rather than properly sequencing it. For example, it needs to render all those frames and also calculate metadata for how objects outside of your view move. Instead of making sure the information it needs to render right now is processed first, the engine mixes in information which is unnecessary at that moment.
Every frame needs to have its objects calculated and positioned. The more frames you generate per second, the faster you have to complete those calcs before the next frame can be processed.
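Just to put the arithmetic on that (plain math, nothing game-specific): the per-frame budget shrinks fast as the FPS target rises.

Code:
# Per-frame time budget at a few FPS targets (simple arithmetic):
for fps in (30, 60, 144, 240):
    print(f"{fps:>3} FPS -> {1000 / fps:5.2f} ms per frame for ALL the CPU's work")
# 30 FPS -> 33.33 ms, 60 -> 16.67 ms, 144 -> 6.94 ms, 240 -> 4.17 ms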
Since the CPU has access times, limited processing speed, and limited cache, it can't process all this information immediately. This is when render latency kicks in.
Yes, laws of physics apply. Computing speed also increases every generation (and has since your i5-4670 launched in 2013).
If, on the other hand, these two tasks were separated onto core 0 and core 1, the CPU could process both of them without delay.
Yes, this is multi-threading. It's been a thing for a while. As I mentioned with scalability, though, there are only so many chunks you can chop a task into and still pass it efficiently through a CPU. Also, all workloads scale differently.
In my view, game engines are not ideally optimised, and this is why we see under-utilized CPUs still offering slightly different performance.
Again, it's exponentially more difficult for the devs of the game engine, as well as the devs of the game to scale performance to today's high core count CPUs and get meaningful performance gains. Diminishing returns. On the flipside, if a game/engine is optimized for, say, 16 threads and doesn't scale DOWN to the most common 4-8 threads, then you're alienating the majority of your customer base that can't afford a $500+ CPU. Because of this, it has been and always will be the most efficient to rely on per-core performance for inter-generational gains.
A good example is the Ryzen 7 5800X3D vs Intel's i9-12900KS. The Ryzen chip has more on-chip memory; it is able to keep more of its working data in cache, where it can access the necessary information faster, and thus offers better performance despite being a weaker chip overall.
  1. We don't have "official" benchmarks for the 5800X3D (aside from AMD's slides) yet.
  2. Given that I already said most game engines don't utilize more than 6-8 threads, the reason you get more performance from one CPU to the next is how fast it can push data through a given core/thread. That can come from many combinations of things, including but not limited to (a rough toy model follows this list):
    1. CPU cache (L1/L2/L3) size and architecture
    2. CPU frequency - Given the same CPU, 5GHz pushes data through faster than 2GHz.
    3. IPC (instructions per clock) - You see this by setting two comparison CPUs (e.g. 5800X3D vs 12900KS) to the same frequency and noting their performance differences.
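Here's the toy model I mentioned, a back-of-the-envelope sketch only. Every number below is invented and real cache behaviour is far messier, but the idea is that per-core throughput scales roughly with effective IPC times clock, and a bigger/faster cache shows up as fewer stalls, i.e. higher effective IPC.

Code:
# Back-of-the-envelope model (invented numbers, not a benchmark):
# per-core throughput ~ effective IPC x clock; better cache = fewer stalls.

def relative_per_core_perf(ipc, ghz, cache_hit_rate, miss_penalty=0.5):
    # a cache miss is modelled as being only half as productive as a hit
    effective_ipc = ipc * (cache_hit_rate + (1 - cache_hit_rate) * miss_penalty)
    return effective_ipc * ghz

older = relative_per_core_perf(ipc=1.0, ghz=3.8, cache_hit_rate=0.90)
newer = relative_per_core_perf(ipc=1.5, ghz=4.6, cache_hit_rate=0.97)
print(f"newer chip ~{newer / older:.1f}x the per-core throughput")  # ~1.9x here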
So to conclude, I think it is an issue of trying to process too much time-sensitive information at once while under-utilizing the hardware.
But...EACH frame is time sensitive, and you want more FPS. I'm not sure if you're suggesting "predictive" rendering of future frames and/or retaining as much of the previous frame as possible and only calculating the differences. I have to imagine that's already being done, since it's a simple concept. Unfortunately the game can only predict so much; it doesn't KNOW that you're going to shoot that window or just look through it until you click your mouse. The problem with predictive rendering is just that: it introduces a whole swath of calculations for scenarios that may or may not actually occur, adding unnecessary overhead and hurting "optimization".
It is quite uncommon to see the CPU as a bottleneck; in general, CPUs tend to be faster than games need them to be, and it is the GPU which struggles to keep up with its tasks.
I showed you charts that disprove this statement. Likewise, per my earlier post, you can disprove it yourself by lowering the in-game quality/resolution and seeing whether you get higher FPS as a result.
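That quick self-test, using the same toy timing idea as the sketch in my earlier post (numbers invented): lowering resolution mainly cuts the GPU's per-frame cost, so FPS only improves if the GPU was the thing pacing the frame.

Code:
# Lowering resolution roughly halves gpu_ms in this toy model; cpu_ms stays put.
def fps(cpu_ms, gpu_ms):
    return round(1000.0 / max(cpu_ms, gpu_ms))

print(fps(18, 30), "->", fps(18, 15))  # GPU-bound: 33 -> 56 FPS, big gain
print(fps(18, 12), "->", fps(18, 6))   # CPU-bound: 56 -> 56 FPS, no change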
 

Calicifer

Prominent
I'm not sure what you are trying to prove here, or whether we are even supposed to discuss an off-topic question in a thread which was already answered.

Benchmarking differences are due to how various processors work. They are often underloaded, but each of them will offer a slight difference in performance, mostly due to game engines not placing workloads in an ideal way and some processors being able to complete executions a lot quicker, which in time-sensitive applications like video gaming matters more than sheer power. That is why CPUs can be under-utilized and still affect performance. However, they are rarely a bottleneck in gaming.

If you are not sure why there are differences in benchmarks between various processors, you can create a thread asking this exact question and I'm sure that people will be more than happy to answer it for you.
 
I'm not sure what you are trying to prove here, or whether we are even supposed to discuss an off-topic question in a thread which was already answered.
If you are not sure why there are differences in benchmarks between various processors, you can create a thread asking this exact question and I'm sure that people will be more than happy to answer it for you.
It seems (to me) that YOU still aren't understanding things correctly, which is why I'm continuing to provide information and extrapolate off of what others have said. The questions I've posed to you are to get you to re-evaluate the accuracy of your statements, not because I don't know the answer.

I interpreted that you were looking for a better understanding of how a computer "plays" a game, so I spent a considerable amount of time formulating posts to help. It seems you just want to pass blame on game engines not being well optimized without knowing why your i5-4670 can't keep pace with a 12900K.

I haven't even gotten into consoles vs PCs, "porting" games from one to the other, and various APIs. To be frank, following your response, I have little desire.

/end
 

Calicifer

Prominent
It does not seem to me that you actually wanted to help me, seeing as your first response to me was just an implied insult. To make it worse, you did not bother to read the thread, as you did not seem to be aware of the context and my behaviour in it. You just went straight ahead and implied that I'm trying to cope with my computer not being good enough to play games. That is an incredibly simplistic and frankly insulting take on the whole situation, and then you wonder why I'm not keen to discuss anything with you.

Then you proceeded to ask random questions which are not related to the topic at hand. When I begrudgingly answered them, you proceeded to confront me on every statement. However, you did not correct me on anything; you only expanded on my explanation and called that a "correction", even though we were saying the same things. You also made mistakes in your own responses, for example not knowing about 5800X3D benchmarks and performance.

The problem here is not me and I have no further interest in interacting with you.
 
EDIT: Bonus round - why do some game engines have a frame rate limiter? As an example, some of Bethesda's games had one at, say, 60 FPS. I forget if their engines had a problem with this, but in some cases, setting it higher would cause issues for the game. This is due to physics, including things like collision detection and whatnot.

Essentially, some engines/games tie their processing into sequential blocks, and thus everything runs at the same rate. Sometimes, when you start to go above 60, 120, or 240 FPS, things haven't been tested very well and stop working as expected. A recent example would be the new DOOM series, where momentum can be maintained to super-jump due to a quirk of the physics engine at high FPS.
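A toy example of how that kind of quirk happens (the numbers and the "jump impulse" are invented; this isn't DOOM's or Bethesda's actual code): if simulation work is applied once per rendered frame with values tuned at 60 FPS, running faster applies it more often.

Code:
# Invented illustration of a per-frame physics increment tuned at 60 FPS.
JUMP_IMPULSE_PER_FRAME = 0.5      # balanced by the devs while testing at 60 FPS

def jump_height_after_one_second(fps):
    height = 0.0
    for _ in range(fps):                   # physics runs once per rendered frame
        height += JUMP_IMPULSE_PER_FRAME   # no scaling by real elapsed time
    return height

print(jump_height_after_one_second(60))    # 30.0 - what the devs balanced around
print(jump_height_after_one_second(240))   # 120.0 - the "super-jump" at high FPS

Capping the frame rate is the cheap fix; properly decoupling the simulation from rendering is the expensive one.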

Also, as you noted, some DOS titles in particular outright break at high FPS, as they were designed with specific CPUs in mind; hence why many PCs of that era had "Turbo" switches that essentially made the PC run at the 386's clock rate for compatibility reasons. One I know of offhand is Wing Commander, which runs far too fast on anything much faster than a 386, as it ties the game engine directly to clock speed.
 
As mentioned in my post #30, we're still seeing most games cap out around 6-8 threads (it helps if you normalize for IPC and clock frequency). You may call that lazy programming, but that's just the real world, which operates on time and money.

This is expected. Remember that most tasks are inherently not parallel; your maximum performance gain is bound by how much of the work you can actually split into parallel tasks (Amdahl's Law). The biggest offender prior to multiple cores becoming mainstream was the GPU rendering process, where only one thread could write to the display and the whole thing was a serial process. DX11 changed that (with some limitations), and you now see that process broken into multiple smaller chunks. But that's only a piece (albeit a big one) of the overall software, and most of what remains consists of relatively low-work tasks (input, audio, and most game logic are shockingly low-usage).
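Amdahl's Law in a couple of lines of Python, if you want to see why the gains flatten out so quickly (the 70% parallel fraction below is a made-up figure just for illustration):

Code:
# Amdahl's Law: if only a fraction p of the work parallelizes, the total
# speedup is capped at 1 / (1 - p) no matter how many threads you add.

def amdahl_speedup(p, n_threads):
    return 1.0 / ((1.0 - p) + p / n_threads)

for n in (2, 4, 8, 16, 32):                 # assume p = 0.70 (invented)
    print(f"{n:>2} threads -> {amdahl_speedup(0.70, n):.2f}x speedup")
# 2 -> 1.54x, 4 -> 2.11x, 8 -> 2.58x, 16 -> 2.91x, 32 -> 3.11x (flattens fast)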

Also note most games will run north of 100 threads nowadays; the issue is only maybe a half dozen do any real amount of work.
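You can see this yourself with the third-party psutil package (by default the snippet below inspects itself; swap in the game's PID to look at a running game instead):

Code:
# pip install psutil -- lists a process's threads sorted by CPU time used.
import psutil

proc = psutil.Process()        # pass the game's PID here, e.g. psutil.Process(pid)
threads = sorted(proc.threads(),
                 key=lambda t: t.user_time + t.system_time, reverse=True)
print(f"{len(threads)} threads total")
for t in threads[:8]:          # the busiest handful do nearly all the work
    print(f"thread {t.id}: {t.user_time + t.system_time:.2f}s of CPU time")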

Finally, consider that there is an overhead to creating more threads, and it's been known since the late 70s that there's an effective limit for most non-massively-parallel tasks where you start losing performance rather than gaining it by creating more (and FYI: the number was around 8).

So yeah, while 32 cores on a CPU is nice, unless you are running multiple applications side by side or doing work that is explicitly parallel across essentially unlimited threads, you will never use half of them.
 

Karadjgne

Titan
Ambassador
The CPU is still mostly serial tasking: there's one master thread that's responsible for the total combination and for shipping work to the GPU, and several secondary threads that deal with compiling, decompiling, object associations and assignments, physics, vector analysis, AI, etc. Then there are other threads dealing with multiplayer input, Windows, Discord, Steam, etc. All of which are pretty much single thoughts on threads.

The GPU is the opposite. It's got thousands upon thousands of CUDA/RT or raster cores that work in parallel, following the instructions sent by the CPU.
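Roughly the shape being described, as a toy sketch (no real engine is written like this, and all the task names are placeholders): a few worker threads crunch their own jobs, while one master thread gathers the results and is the only one that talks to the GPU.

Code:
# Toy sketch: worker threads for physics/AI/audio, one master thread that
# combines their results and submits the frame. Placeholder tasks only.
from concurrent.futures import ThreadPoolExecutor

def physics_task():  return "physics results"
def ai_task():       return "AI results"
def audio_task():    return "audio results"

def master_thread_frame():
    with ThreadPoolExecutor(max_workers=3) as pool:    # the "secondary threads"
        futures = [pool.submit(t) for t in (physics_task, ai_task, audio_task)]
        frame_data = [f.result() for f in futures]     # master waits and combines
    submit_to_gpu = print                              # stand-in for the driver call
    submit_to_gpu(frame_data)                          # only the master submits

master_thread_frame()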

And that changeover in behaviour is dictated by the drivers, responding to game code. It'd be nice if more games used spillover cores, but that just adds to the complexity and size of the game, which is another reason why SLI/CF/mGPU is all but non-existent. Steam and other hosts want games as simple and small as possible; it cuts down on bandwidth use and makes downloads faster and cheaper overall.

It's not laziness. Devs work hard but are almost always under a time crunch, with little to no support and half a dozen last-minute changes as bugs, cheats, etc. rear their ugly heads in the betas. So simple is good and complexity is bad, for everyone in the process before you play it.
 
Also, as you noted, some DOS titles in particular outright break at high FPS, as they were designed with specific CPUs in mind; hence why many PCs of that era had "Turbo" switches that essentially made the PC run at the 386's clock rate for compatibility reasons. One I know of offhand is Wing Commander, which runs far too fast on anything much faster than a 386, as it ties the game engine directly to clock speed.
No, PCs back then had turbo for the same reasons as today, so you won't have to deal with all the heat, noise, and power draw of full speed just to edit a text file.
Making PCs slower for old games was just a side effect.
 
No, PCs back then had turbo for the same reasons as today, so you won't have to deal with all the heat, noise, and power draw of full speed just to edit a text file.
Making PCs slower for old games was just a side effect.
It had nothing to do with that. Computers at the time sipped power compared to computers of today, about 18W total if the disk drive was being used. And most computers didn't even need a heatsink until the early 90s when Turbo buttons were falling out of style.

It was simply a hack job so applications that relied on the timing of the original IBM PCs would work properly.
 

Calicifer

Prominent
As this is related to the same topic, I'm curious about old games. I'm currently playing Dungeon Siege 2 at about 20-30 FPS. A lot of old games seemingly just fail to work well on newer systems; the FPS is only good in loading screens. I do wonder what exactly breaks in older engines. They still work on their own, but do not render as many frames on newer systems as they did back then.

Btw: I do play this game at a much higher resolution than it was intended for and have installed some common community fixes. The game runs terribly, but I do not mind; it is fine. I'm more interested in what can potentially break in an old game so that it still runs on new systems, but runs terribly. Does Windows do some sort of auto-emulation which kills the frame rate? Do I have to bring back my old XP computer with the GeForce 6600 from my childhood to properly play games from my childhood?
 

Eximo

Titan
Ambassador
DirectX API calls should be intercepted and translated; DX9 isn't all that old. In fact, I usually consider pre-DX9 to be the weird middle ground where games tend to not work without emulation. It could also be security differences in Windows. You would probably want to run it in compatibility mode for XP and launch it as an administrator.

It could be that forcing the engine to account for more things on the screen at once causes issues. Have you tried running it at the original supported resolutions?
 

Calicifer

Prominent
I tried XP compatibility mode, running at the original resolution, and running with admin privileges. Nothing helps, so I will check more guides for that game. The internet being what it is, merely following one "ultimate, complete, all-included, recently updated" guide usually leaves quite a few things out. I had a similar issue with New Vegas, where the ultimate bug fixers do not solve reasonably mainstream quest bugs and you have to download individual fixes. In the same way, just following one guide which is meant to be the go-to solution might leave out important details.

I'm looking at another guide at the moment, and there is something in it which I found interesting:

"Dungeon Siege 2 is an old game and it cannot handle modern graphics card correctly. Sometimes it works and sometimes not.
The game was designed to run with hardware of that time and believe me or not: They have hard coded settings for each graphics card!
This leads to problems with unknown new hardware. Dungeon Siege 2 uses a fall-back configuration for those unknown devices and this results in glitches, black screen, etc. "

I checked the game's config exe and the program considers my GPU as not meeting the minimum requirements. It might be due to the developers hard-coding their game for individual cards, and now we have the mess we have here. I shudder to think how much work it would take to make a remastered version of Dungeon Siege II, as it would require some heavy-duty re-programming of the game's engine.
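Purely as a guess at the shape of what that guide describes (the card names, settings, and fallback values below are all invented for illustration, not the game's real data): a table of tuned settings keyed to known 2005-era GPUs, with a generic fallback for anything the game has never heard of, which is where a modern card would land.

Code:
# Invented illustration of a hard-coded per-GPU settings table with a fallback.
KNOWN_CARDS = {
    "RADEON 9800":  {"vram_mb": 128, "shader_path": "ps_2_0"},
    "GEFORCE 6600": {"vram_mb": 128, "shader_path": "ps_3_0"},
}
FALLBACK = {"vram_mb": 32, "shader_path": "fixed_function"}   # "unknown device" path

def pick_settings(detected_name):
    for known, settings in KNOWN_CARDS.items():
        if known in detected_name.upper():
            return settings
    return FALLBACK   # a modern card lands here and gets treated as below minimum

print(pick_settings("NVIDIA GeForce RTX 3060 Ti"))  # -> the fallback config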