The Game Rundown: Finding CPU/GPU Bottlenecks, Part 1

Status
Not open for further replies.
Very good article, just one question. I have a Core i3 @ 3GHz. Could I use the dual core as a reference for the Core i3? Would it be accurate? I'm not sure if the architecture is close enough between the two processors to compare.
 
WOW, it's a good article to read. 2 thumbs up. b^^d
Btw, since GTA4 can utilize more cores, how would the 6 cores in a Phenom II X6 perform?
 
"Very good article, just one question. I have a core i3 @ 3GHz. Could I use the dual core as a reference for Core i3? Would it be accurate? I'm not sure if the architecture is close enough between the two processors to compare"

Yeah, it would be about the same.
 
Sorry, but this doesn't really help much.

I think it would be wise to add an i7 920 or something with triple-channel RAM, so we can see memory bandwidth variations as well.

I mean, this is good info, but it needs more options than just one CPU.

But this is only part 1 of the bottleneck series, so I hope they add CPUs with higher memory bandwidth, older CPUs, and Crysis to the list.

My RAM runs at around 8gbs, and my GPUs are bottlenecked in Crysis. Even if I disable 2 cores on my CPU, I get the same frame rate.

Is this purely because Crysis is limited to 2 threads?

Because Unigine 2.0 runs well on 4 cores and on 3, but drops like a rock on 2.

But if it's a case of 2 threads, why would other people get higher frame rates than I do in Crysis with the same GPUs, when I only get a 30 to 40% load on both GPUs?

Does anyone know what that is?

Is it due to the i7s being faster per core? :S
 
How much did the RAM affect the results? I've got 3GB of RAM, a 5770 and a bad dual core. Should I upgrade my RAM or my processor?
 
If the quad-core 3 GHz CPU is not fully taxed in a game, you should see lower utilization compared to the overclocked 4 GHz CPU??

typo?
 
[citation][nom]Nastyhabbit[/nom]If the quad-core 3 GHz CPU is not fully taxed in a game, you should see lower utilization compared to the overclocked 4 GHz CPU?? typo?[/citation]

They're referring to utilization being lower. Not a typo. The 4GHz CPU is being utilized less, i.e. it's not being used as much.
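
To put rough numbers on that, here is a minimal sketch. It assumes the game needs a fixed amount of CPU work per second and that utilization is simply that work divided by total capacity (clock x cores). The workload figure is made up for illustration, and real utilization also depends on threading and on how hard the GPU is pushing.

```python
def utilization(work_ghz, clock_ghz, cores=4):
    """Fraction of total CPU capacity in use, capped at 100%.

    work_ghz is a hypothetical measure of the game's demand,
    expressed in "GHz worth of work" summed over all cores.
    """
    capacity = clock_ghz * cores
    return min(work_ghz / capacity, 1.0)


work = 6.0  # hypothetical fixed demand from the game

u3 = utilization(work, 3.0)  # quad core at stock 3 GHz
u4 = utilization(work, 4.0)  # same chip overclocked to 4 GHz

print(f"3 GHz: {u3:.1%}, 4 GHz: {u4:.1%}")
```

With the same demand, the overclocked chip has more headroom, so it shows the lower utilization figure.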
 
Great article! Just the kind of article that I needed - I was experiencing severe lag while playing Bad Company 2 and wondered where the root cause lay. This article helped me see what the problem was!
 
For the GTA4 fps chart, just curious, wouldn't it be more
interesting to see how two 460s behave in SLI rather than
compare to 5870(s)? Also, it looks like this is a game which
would clearly benefit from the 1GB version of the 460,
which is usefully faster anyway.

Ian.

 
[citation][nom]Bluescreendeath[/nom]This article should use different quad cores, triple cores, and dual cores instead of just disabling cores for the i5. The i5 is a good deal faster than Core2, AthlonII, and PhenomII series in terms of clock-per-clock, so they're really not comparable. So this article is only good for determining bottlenecks for people with the i3/i5/i7 series.[/citation]

Not so. This article is about balancing GPU performance with CPU performance, not comparing performance differences between GPUs or CPUs; that information is in other benchmarks.

The matched CPU & GPU here is perfect: with only 1 core active the CPU is the bottleneck (in most cases), and with more active cores the GPU becomes the bottleneck. All other configurations will be relative to the results from other benchmarks.

I can compare my Athlon 7750 as being 62% as fast as the i5-750 & my HD 5750 as being 68% the speed of the GTX 460 from other benchmarks.

3DMark Vantage Overall Performance 13659 vs 8502
Sum of FPS Benchmarks 1920x1200 323.7 vs 221.3

So although my system is about 1/3 slower, relatively my GPU is 6% faster than my CPU compared to the setup in this article, & considering the GTX is nearly always the bottleneck here, my system must be fairly well balanced. Oh heck, I'll have to upgrade both next time!
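
For anyone who wants to run the same comparison on their own parts, the arithmetic above can be sketched like this. It assumes the first number in each pair belongs to the article's part (i5-750, GTX 460) and the second to the poster's part (Athlon 7750, HD 5750), and that the 3DMark Vantage score stands in for the CPU and the summed FPS for the GPU; that pairing is a reading of the post, not something it states outright. The variable names are just for illustration.

```python
# Scores quoted in the post (article's part first, poster's part second).
cpu_i5_750, cpu_athlon_7750 = 13659, 8502   # 3DMark Vantage overall
gpu_gtx_460, gpu_hd_5750 = 323.7, 221.3     # sum of FPS at 1920x1200

# Relative speed of each of the poster's parts vs the article's parts.
cpu_ratio = cpu_athlon_7750 / cpu_i5_750
gpu_ratio = gpu_hd_5750 / gpu_gtx_460

# Gap between the two ratios, in percentage points.
gap_points = (gpu_ratio - cpu_ratio) * 100

print(f"CPU: {cpu_ratio:.0%} of the i5-750")
print(f"GPU: {gpu_ratio:.0%} of the GTX 460")
print(f"GPU is ahead of CPU by about {gap_points:.0f} points")
```

The closer the two ratios are, the closer the balance is to the article's test rig. Note the 6% figure is a difference in percentage points between the two ratios.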


Great article Toms, hope we see many more like this in the future.
 
It would be nice if you would use a game that actually put pressure on a system, LIKE FLIGHT SIMULATORS. Those aren't really "games" though, are they....Rise of Flight, for one- you want pressure, there it is.
 
I don't know about Rise of Flight, but games like FSX hammer GPUs
because they're written really badly. FSX constantly reloads data
in an incredibly inefficient manner, which is why performance
collapses when the PCIe bandwidth is restricted (most games will
run surprisingly well even with only a 4X PCIe link, but not FSX).

And btw, flight sims, if written properly, should NOT cause much
of a graphics load anyway because they have a very low depth
complexity (ie. degree to which surfaces occlude other surfaces).
Also, discrete moving objects drawn in the scene tend to be small
& far away, and thus open to considerable detail management.
Likewise, distant terrain can be adjusted in complexity. I researched
these issues a lot while adminning a RealityCentre/CAVE a few
years ago (16-CPU Onyx2 5-pipe IR2E, 28' screen, 10' CAVE).
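
As a rough illustration of the detail management described above, here is a
minimal sketch of distance-based level-of-detail (LOD) selection. The
distance thresholds and triangle counts are made-up numbers, and this is
not how any particular sim or scene graph library does it; it just shows
why distant objects need not cost much.

```python
import bisect

# Hypothetical distance thresholds (metres) between detail levels.
LOD_DISTANCES = [500.0, 2000.0, 8000.0]

# Hypothetical triangle counts for one aircraft model at LOD 0..3.
LOD_TRIANGLES = [20000, 5000, 800, 100]


def select_lod(distance):
    """Return the LOD index for an object at the given distance.

    0 is full detail; higher indices are coarser models.
    """
    return bisect.bisect_right(LOD_DISTANCES, distance)


def scene_triangles(distances):
    """Total triangles drawn when every object uses its selected LOD."""
    return sum(LOD_TRIANGLES[select_lod(d)] for d in distances)


objects = [100.0, 3000.0, 10000.0]  # one near, one mid, one far
with_lod = scene_triangles(objects)
without_lod = LOD_TRIANGLES[0] * len(objects)
print(with_lod, "triangles with LOD vs", without_lod, "without")
```

With most moving objects small and far away, the bulk of them land in the
coarse levels, which is where the savings come from.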

If a flight sim is hammering a GPU, then it's been written poorly.
Trouble is, these days GPUs are so fast that many coders don't
bother writing efficient scene graphs, ie. data/detail management
is ignored and it's just left to the ever-increasing speed of GPUs
plus their expanding RAM to cope with the coding inefficiencies.

Ian.

 
(youssef 2010, assuming you're referring to my post...)

Because high-end SGI gfx always had internal status information which can be sent
back to the application - everything from memory loading, detail management
and scene culling to how many of each size triangle strip is being processed,
texture upload rates, etc. It's part of the InfiniteReality hardware (and earlier
versions like RE/RE2) and is normally accessed using the Performer API. See:

http://www.sgidepot.co.uk/ir_techreport.html
http://www.sgidepot.co.uk/onyx2/tech_report.pdf
http://www.sgidepot.co.uk/performer.html
http://www.sgidepot.co.uk/performermanpage.txt

Callback mechanisms allow an app to respond to events within the gfx pipe. Just
press F1 within any Performer application that uses perfly and all the info pops
up via overlays. Here's a screenshot of a simple perfly demo showing some of the
info from the gfx hw that an application can access:

http://www.sgidepot.co.uk/misc/perfly.gif

Multiple gfx pipes can be used in parallel via various modes of operation, up to 16
x IR4 (160GB VRAM, 16GB TRAM), though this was just a software limit which was due to
be removed at some point, allowing scalability to at least 256 pipes, but that never
happened - SGI stopped designing gfx hw after IR4 was released in 2002 (IR5/IR6 were
planned, but never released). Here's a typical application example:

http://www.sgidepot.co.uk/onyx2/groupstation.pdf

Note that after 9/11, details of the later Onyx3900 GroupStation using IR4 gfx
were never published (the above is just the older Onyx2 with IR2 gfx), but here's
some arch info:

http://www.sgidepot.co.uk/origin3k.html
http://www.sgi.com/products/remarketed/onyx3000/ir4.html
http://www.sgidepot.co.uk/misc/3353.pdf

Anyway, the gfx functions available are the means by which applications can deliver
guaranteed fixed frame rates, normally 30Hz for older gfx from the early 1990s like
RealityEngine (see www.sgidepot.co.uk/re.html), 60Hz for IR and later versions, using
features such as DVR, with quad-buffered HD stereo available for VR applications, up
to eight HD outputs from one system.
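
To show the general idea of feeding load information back into the renderer
to hold a fixed rate, here is a minimal sketch. It is NOT the Performer API
or SGI's actual DVR implementation; it is a generic feedback rule that
assumes frame cost scales linearly with pixel count, and all names and
limits in it are hypothetical.

```python
TARGET_HZ = 60.0
BUDGET_MS = 1000.0 / TARGET_HZ  # time allowed per frame


def next_scale(scale, frame_ms, min_scale=0.5, max_scale=1.0):
    """Pick the render scale (per axis) for the next frame.

    Assumes cost is proportional to pixel count, i.e. to scale**2,
    so the per-axis scale changes by the square root of the ratio
    between the budget and the measured frame time.
    """
    ratio = BUDGET_MS / frame_ms
    new_scale = scale * ratio ** 0.5
    return max(min_scale, min(max_scale, new_scale))


# Last frame took twice the budget: shrink the rendered image.
slow = next_scale(1.0, 2 * BUDGET_MS)

# Last frame took half the budget: already at full size, stay there.
fast = next_scale(1.0, BUDGET_MS / 2)

print(f"after slow frame: {slow:.3f}, after fast frame: {fast:.3f}")
```

The point is only that the application needs a measurement of the load to
drive such a loop, which is what the hardware feedback provided.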


Alas, although NVIDIA based their early gfx hw on IR (a lot of SGI gfx people moved
to NVIDIA, and the SGI manager I reported to when testing the O2 - Ujesh Desai - is
now an NVIDIA marketing manager), they didn't include these inner gfx core feedback
mechanisms into the GF designs (expensive), not even in the Quadro versions. No modern
graphics product offers such features - it's too expensive, and the market has for
the moment moved away from such custom, quality-driven, high-end solutions. These
days, SGI's i7 systems just use Quadro cards.

Ian.

 
It is always a good thing when these "MULTI-PART Reviews" have a CLEARLY labeled Link to the other Parts.
 
I would love to see a new game rundown like this, especially for World of Warcraft raids (with the most common addons used).

Now that we have LFR in WoW, it should be viable to use fights like Ultraxion as a benchmark.
 
Please test ARMA 2 with lots of AI on the field. ARMA 2 Benchmark 2 mission would be suitable. I'd love to see how it differs from mainstream games in CPU usage.
 