Core i7 vs Phenom II


randomizer

Champion
Moderator

That's what I was putting under the banner of "productivity" applications, with the exception of F@H, which is something else entirely, unless you work for Pande Group, in which case it becomes a productivity app. :D


I'm under the impression that you still won't get a 100% fair test. What you will get is a test that satisfies you 100%, and that everyone else will call biased.



Yes, because AMD have put zero work into the current client. Pande Group make the scientific cores, not the clients.



He's a fanboy... of stuff :eek:
 

randomizer

Champion
Moderator

Well, ignoring load times, these benchmarks clearly show the Q6600 falling behind, just not so much in the SYSmark 2007 synthetics: http://www.anandtech.com/bench/default.aspx?p=53&p2=80

I'm not sure where you're looking.
 

jennyh

Splendid
I'm talking about actual perception, random - what I feel while I'm using each machine.

This isn't something that can be seen in benchmark bars. That's why I'm thinking about getting an i5 or an i7 for test purposes.

You don't need to look too hard to find benches of a Q6600 beating a 940 BE Phenom II, btw. It's total rubbish - the 940 BE is noticeably better in every way. Stuff like that just hardens my resolve and makes me believe that Intel are buying reviews.
 

jennyh

Splendid
As an example... I'm back to using my single 4770 in my Phenom II rig, and the 4870 in my Q6600.

Playing WoW on the exact same settings, my 4770 with the Phenom II is smoother. WoW is known to be more CPU-bound, but to actually see the difference between the two using a much cheaper graphics card...
 

randomizer

Champion
Moderator
Oh that's right, the "feely" test. I forgot about that one. I can't say a Q6600 ever "felt" slow, or that my i7 "feels" faster. The only time it "felt" faster was when I put in an SSD. It has to be something that can be benchmarked or it doesn't exist outside of your head. You just need to work out what could possibly be causing it so that you can test it.

As for Intel buying reviews, that's a great last-resort bandwagon argument that should only be used when you can't formulate an argument with evidence. There is absolutely no evidence that Intel, or NVIDIA for that matter, pay for reviews. There's plenty of evidence for reviewer error and incompetence though.
 

jennyh

Splendid
The one about the i5 benching/review on Anandtech.

If you recall, I made some noises about Gary Kay(?) saying that the i5 'felt better' or thereabouts. Not just better than the Phenom, but better than the i7 too. You agreed with me then - basically, the reviewer was seeing what he wanted to see, feeling what he wanted to 'feel' about the i5.

Maybe it's actually true, I dunno. My Phenom II feels like it's streets ahead of my Q6600 in just about everything. There have been a few exceptions, however - the Q6600 was far better at playing Fallout 3, for example (Q6600 with a 4870 vs. Phenom II 940 with CrossFire 4770s - the Q6600 played much better for some reason).

It's difficult to actually explain it tbh, but it is something I notice.

 

randomizer

Champion
Moderator
Can't say I remember that one. There are a few Anandtech reviews where the reviewer said X CPU felt better than Y CPU, but as far as I can remember none of them could quantify it. Things can't just "feel" better; there has to be a measurable reason, or it is FUD.

EDIT: Actually, I think I do remember that thread now.
 

ElMoIsEviL

Distinguished

Would you like to know why nVIDIA trounces ATi in terms of F@H performance?

If you want to ward off a bunch of nVIDIA fanbois, then the best thing is to look into why F@H's GPU2 client works better on nVIDIA hardware. I did just that.

For me it started as a general question: why is the RV770, offering nearly twice the computational power of the GT200, being left in the dust when it comes to F@H? Looking at the pure theoretical numbers, one would have to conclude that something was missing from the equation. For GT200 (GTX 280) we often heard computational quotes ranging from ~622GFLOPs to 933GFLOPs (something in that range). When you actually look at the architecture and notice that one of the MULs is non-functional (it only shows up in theoretical numbers), you conclude that GT200 can only output 622GFLOPs of single-precision computational performance. That could seem like a big number, until you have a look at RV770's theoretical peak of 1.2TFLOP/s.
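To show where those figures come from, here's a quick sketch of the arithmetic, using the commonly quoted specs (240 SPs at 1296 MHz for the GTX 280, 800 stream processors at 750 MHz for the HD 4870):

```cpp
#include <cstdio>

// Peak single-precision throughput = shaders * clock (GHz) * flops per clock.
static double peak_gflops(int shaders, double clock_ghz, int flops_per_clock) {
    return shaders * clock_ghz * flops_per_clock;
}

int main() {
    // GT200 (GTX 280): 240 SPs at 1296 MHz. A MAD counts as 2 flops;
    // the marketing figure adds the co-issued MUL for 3 flops per clock.
    std::printf("GT200, MAD only: %.0f GFLOP/s\n", peak_gflops(240, 1.296, 2)); // ~622
    std::printf("GT200, MAD+MUL:  %.0f GFLOP/s\n", peak_gflops(240, 1.296, 3)); // ~933
    // RV770 (HD 4870): 800 stream processors at 750 MHz, MAD = 2 flops.
    std::printf("RV770, MAD:      %.0f GFLOP/s\n", peak_gflops(800, 0.750, 2)); // 1200
    return 0;
}
```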

So why is it that a 622GFLOP/s card bests a 1.2TFLOP/s card under F@H's GPU2 client? That was the question I posed myself. Anyway, I found the answers.

It all started with a scientific paper I read, written by the F@H programmers over at Stanford University, which you can view here:

[Image: folding_1.jpg]

[Image: folding_2.jpg]


These are two screenshots of that particular scientific paper. What you will notice is that RV770 is often doing twice as many FLOPs for the same results (in other words, RV770 is doing twice the work).

I was scratching my head wondering why RV770 would need to do twice the amount of work that GT200 does, and then it hit me... protected memory access.

I remembered reading that nVIDIA's CUDA implementation allowed a GPU the ability to run threads in protected system memory (RAM) and use this memory to save intermediary results (like the memory function on a calculator). What this allowed GT200 to do was, upon error, go back to the last results before the error and continue from there. With RV770, the whole calculation needed to be flushed and started from scratch.
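A rough sketch of the difference being described, purely illustrative and not F@H's actual code (State, run_chunk and the two fold functions are stand-ins I made up):

```cpp
#include <vector>

// Illustrative only: contrast resuming from a checkpoint with a full restart.
struct State { std::vector<float> data; };

bool run_chunk(State& s) {  // stand-in for one slice of simulation work
    // ... kernel launches would go here; return false if an error is detected
    return true;
}

// GT200 under CUDA: intermediate results can be parked in system RAM,
// so an error only costs you the current chunk.
void fold_with_checkpoints(State& s, int chunks) {
    for (int i = 0; i < chunks; ) {
        State checkpoint = s;        // save intermediates before each chunk
        if (run_chunk(s)) ++i;       // success: move on
        else s = checkpoint;         // error: roll back one chunk and retry
    }
}

// RV770 under GPU2: nowhere to keep intermediates, so an error
// flushes the whole calculation back to the start.
void fold_without_checkpoints(State& s, const State& initial, int chunks) {
    for (int i = 0; i < chunks; ) {
        if (run_chunk(s)) ++i;
        else { s = initial; i = 0; }  // start from scratch
    }
}
```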

I literally came to that hypothesis on my own over at x c p u s (anyone who posts or frequents there can attest to that). I had frequent battles with a few nVIDIA fanbois and received quite a bit of flak over my hypothesis.

That being said, I still did not know the rest of the equation... that is, until Vijay S. Pande filled in the blanks. He first mentioned something peculiar on his blog, as you can read here: http://folding.typepad.com/news/2009/09/update-on-new-fah-cores-and-clients.html

3) GPU3: Next generation GPU core, based on OpenMM. We have been making major advances in GPU simulation, with the key advances going into OpenMM, our open library for molecular simulation. OpenMM started with our GPU2 code as a base, but has really flourished since then. Thus, we have rewritten our GPU core to use OpenMM and we have been testing that recently as well. It is designed to be completely backward compatible, but should make simulations much more stable on the GPU as well as add new science features. A key next step for OpenMM is OpenCL support, which should allow much more efficient use of new ATI GPUs and beyond.

This revived my curiosity and it is at this point that I figured it all out here: http://www3.interscience.wiley.com/journal/121677402/abstract

It is written by the F@H programmers.

Just as I suspected, my hypothesis was proven correct. This is why nVIDIA folds quicker than ATi: the ability to hold a sufficient number of intermediate values. If you get an error, ATi flushes out everything, whereas nVIDIA can pick up where it left off. Also, something I did not know was that nVIDIA's use of scattering allows them to effectively need only half the amount of calculations to reach a result. Both of these explain why AMD boards are generating over twice the FLOPs (RV770 vs. GT200) yet not getting more work done.
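To illustrate the scattering point, here is a hypothetical sketch of a pairwise force loop (pair_force and both functions are made-up stand-ins, not the real GPU2 kernels). With scatter writes, Newton's third law lets each pair be computed once; gather-only hardware has to evaluate every pair twice:

```cpp
#include <vector>

float pair_force(int i, int j) {       // hypothetical pairwise interaction
    return static_cast<float>(i - j);  // stand-in for the real physics
}

// Scatter-capable hardware: compute each pair ONCE and write to both
// outputs, exploiting Newton's third law (f_ji = -f_ij).
void forces_scatter(std::vector<float>& f, int n) {
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j) {
            float fij = pair_force(i, j);  // n*(n-1)/2 evaluations
            f[i] += fij;                   // scatter: two writes per pair
            f[j] -= fij;
        }
}

// Gather-only hardware: each output can only read its own inputs, so
// every pair is evaluated twice, once for i and once again for j.
void forces_gather(std::vector<float>& f, int n) {
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            if (i != j) f[i] += pair_force(i, j);  // n*(n-1) evaluations
}
```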

Since both RV770 and RV870 support scattering and thread synchronization, and since RV870 has access to a high-speed cache (and can run threads in protected memory), we can expect a lot of performance from GPU3 (which makes use of OpenCL and OpenMM, thus requiring these features).

This is all straight from Stanford (out of the horse's mouth), not from me.

GPU2 works better on nVIDIA hardware due to features lacking in the ATi GPU2 F@H client - scattering and thread synchronization, which RV670 did not support but which RV770 and newer do. Add to that the fact that RV870 can now also run threads in protected memory, and you've got a HUGE performance boost awaiting ATi users with the GPU3 client.

So the reason why F@H works better on nVIDIA than it does on ATi has got nothing to do with computational performance (for which ATi utterly trounces nVIDIA under most workloads), but rather with small features that were omitted from the ATi GPU2 client due to time constraints, coupled with a missing hardware feature now rectified in RV870 and newer.

:)

PS: I just wanted to add this image to showcase how widely R600/RV670/RV770/RV870 (ATi's super scalar architecture) performance varies across different mathematical workloads:

[Image: instruction-issue.png]
 

jennyh

Splendid
Yes I agree.

That's why I'd like to benchmark it... but how much can the benchmarks really be trusted?

I mean... I know my Phenom II plays WoW better than my Q6600 does. I just know it - and I spend at least 30 mins to an hour on both each day.

But on the flip side, Fallout 3 was noticeably better on the Q6600. Yes, there has to be a reason for it; all I'm saying is I actually do notice these things mostly, and I'm in a rare position of owning two 'gaming' PCs that are pretty closely matched.
 

ElMoIsEviL

Distinguished


Not really. You state that the code is flawed. In the case of most CPU code, you're working on the same architecture (x86/x64), so the performance variance due to code differences is rather minimal.

With nVIDIA and ATi you're comparing a Scalar Architecture to a Super Scalar Architecture. Code written for both cannot be the same if you wish to tap into the full performance of either architecture.

A happy medium is needed, and that's where OpenCL comes in. As for F@H, it's coded using traditional programming languages (I would assume either C++ or Fortran).
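As a hypothetical illustration of why the same code can't be optimal for both architectures (both functions below are invented for the example):

```cpp
// Dependent chain: each step needs the previous result. A VLIW5 unit can
// fill only one of its five slots per clock, so ATi sits at ~20% utilization.
float dependent_chain(float x) {
    x = x * x + 1.0f;
    x = x * x + 1.0f;
    x = x * x + 1.0f;
    x = x * x + 1.0f;
    x = x * x + 1.0f;
    return x;
}

// Five independent streams: the shader compiler can pack one op from each
// into a single VLIW bundle, approaching full utilization on ATi. A scalar
// SP on nVIDIA runs both versions at the same one-op-per-clock rate.
float independent_streams(float a, float b, float c, float d, float e) {
    a = a * a + 1.0f;
    b = b * b + 1.0f;
    c = c * c + 1.0f;
    d = d * d + 1.0f;
    e = e * e + 1.0f;
    return a + b + c + d + e;
}
```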
 

ElMoIsEviL

Distinguished

AFAIK, GPGPU apps do not make use of onboard video memory. Generally they're restricted to system memory (for CPU operations) and GPU caches (for GPU operations).

nVIDIA had a workaround with CUDA, which allowed their GPUs to write into system memory (RAM). So while ATi had to work with the rather small caches inside the GPU, nVIDIA's F@H client can work with your system memory to store intermediary results.
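For reference, the CUDA mechanism in question is mapped pinned ("zero-copy") host memory. A minimal sketch using the standard CUDA runtime calls - this is not anything from the actual F@H client:

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// GPU writes land directly in mapped system RAM, with no copy-back step.
__global__ void save_intermediate(float* host_visible, float value) {
    host_visible[threadIdx.x] = value;
}

int main() {
    float* h_buf = nullptr;   // pinned host memory, visible to the GPU
    float* d_ptr = nullptr;   // the GPU-side alias of the same allocation

    cudaSetDeviceFlags(cudaDeviceMapHost);
    cudaHostAlloc(reinterpret_cast<void**>(&h_buf), 32 * sizeof(float),
                  cudaHostAllocMapped);
    cudaHostGetDevicePointer(reinterpret_cast<void**>(&d_ptr), h_buf, 0);

    save_intermediate<<<1, 32>>>(d_ptr, 42.0f);
    cudaDeviceSynchronize();

    std::printf("h_buf[0] = %f\n", h_buf[0]);  // 42.0, written by the GPU
    cudaFreeHost(h_buf);
    return 0;
}
```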
 

jennyh

Splendid
OK, I'm definitely going to have to go with you on this one, as you clearly know a lot more about it than I do. It just seemed pretty similar to something I had read about GDDR5 recently - errors being 'flushed' and having to be restarted on GDDR5, I mean.
 


If THWG is biased then why do you keep coming here? Why not just go sit at whatever forums you don't find biased instead of acting stupid here?



As said, load times depend more on the storage than anything else. My RAID 0 setup beats most people's, and I tend to load faster in L4D/L4D2 than anyone else because of my RAID 0, not because of my CPU.

A faster CPU helps sometimes but not as much as you might think.



Woot I see Pentium and even K6-II fans in there.

I have a few CPUs collected. My most notable are an Intel 386SX, a Cyrix 486DX2 (holy craps...), a Pentium and a Pentium Pro w/MMX. Had some Pentium IIs lying around, and older AMD CPUs are harder to find and harder to keep, since before the Athlon X2 they didn't have an IHS, so the die was exposed and often destroyed.



Well, ATI's GPUs have more shaders and tend to do better in some GPGPU workloads, but since nVidia usually has an advantage in programming support and in working with programmers where ATI doesn't, the win tends to go to nVidia even with fewer SPs.



WoW is very CPU hungry but also likes higher clocks. If you run the Phenom at a higher clock I would expect it to beat a Q6600. Clock for clock, though, a Q6600 is still pretty good for its age. Mine does me just fine in L4D/L4D2, both of which are super CPU hungry. I max both games out. Just played some Versus. Was funny when I was a Jockey and forced a survivor off the edge to their death.

Ahhh... fun times.

Fallout 3, on the other hand, is probably more GPU intensive, thus it probably won't show an advantage for the Phenom II's higher IPC.

Then again, it could be in the coding. Same engine as used for Oblivion, so who knows.



Lol.