AMD CPU speculation... and expert conjecture



Well, nothing is stopping you from making a GPU out of x86 cores; Intel tried it (Larrabee/Knights Peak)... oh wait, it drew a ton of power and hasn't actually made it to market as a product yet. Never mind.
 


Wasn't starting a war, was just explaining what NEON was.
 


I don't care if ARM is discussed. It's the way it's been discussed that people take issue with: as if ARM were the magical solution to everything just because it's newer. The grass is always greener in the other person's yard when the scalability issues are all conveniently ignored.
 


Actually, that's still going on and shipping now; they rebranded it as Xeon Phi.

They're still dumping money into it and expanding it with on-package memory (HBM).
 


Actually, the results for threads > logical cores are not influenced by priority: at that point the app is consuming approximately 100% of the CPU and there is nothing extra to squeeze out. It might make a difference if you are running other things in the background, but the app specifically asks users to shut down background processes! And really, if you are going to run a benchmark alongside YouTube and video rendering, you can't expect sane results regardless of what you set the priority to 🙂

The place where priority affected the results was where threads < logical cores, as higher priority seemed to incline Windows to schedule the threads more efficiently. But once the cores are saturated, there is basically no difference. Here are Yuka's experiments with priority (high priority compared to normal priority):

http://www.headline-benchmark.com/results/cef25b47-463f-4aa8-a70b-f4a9c516f1c7/083b462f-9ef2-46f6-b1bb-a3c83c1f9cce

As you can see, the priority possibly gives a boost for light threading, but by the time cores are saturated there is no real difference.
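To make the saturation point concrete, here's a quick toy sketch (my own illustration, not the app's actual code) that splits a fixed amount of CPU-bound work across an increasing number of threads on Windows. Wall time should drop until the thread count reaches the logical-core count and then flatten out, priority or not:

#include <windows.h>
#include <stdio.h>

static volatile LONG64 sink;               /* keeps the work from being optimized away */

typedef struct { long long iters; } job_t; /* per-thread share of the fixed total work */

DWORD WINAPI spin(LPVOID arg)
{
    job_t *j = (job_t *)arg;
    LONG64 acc = 0;
    for (long long i = 0; i < j->iters; i++)
        acc += i ^ (acc >> 3);             /* cheap, purely CPU-bound busywork */
    InterlockedAdd64(&sink, acc);
    return 0;
}

int main(void)
{
    const long long TOTAL = 2000000000LL;  /* same total amount of work for every run */

    for (int n = 1; n <= 16; n *= 2) {
        HANDLE th[16];
        job_t  job = { TOTAL / n };

        DWORD t0 = GetTickCount();
        for (int i = 0; i < n; i++)
            th[i] = CreateThread(NULL, 0, spin, &job, 0, NULL);
        WaitForMultipleObjects(n, th, TRUE, INFINITE);
        DWORD t1 = GetTickCount();

        for (int i = 0; i < n; i++)
            CloseHandle(th[i]);
        printf("%2d threads: %lu ms\n", n, (unsigned long)(t1 - t0));
    }
    return 0;
}

On a 4-core/4-thread chip, the 8- and 16-thread runs should take about as long as the 4-thread run: once the cores are saturated, there is simply nothing left to schedule.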

 


Honestly, it depends on a few things. Intel and AMD may differ here due to how CMT operates, and there will be Linux/Windows scheduling differences on this one too. For the Windows case:

In the case where NumThreads < NumCores, I'd expect a speedup from a priority boost, as it reduces the chance of a system thread kicking one of your application's threads off a core; the scheduler would instead kick a lower-priority thread off a different core. This also has the secondary effect of greatly reducing the chance of threads bouncing between cores, which can eat performance.

For the NumThreads > NumCores case, a few things could happen depending on the CPU arch and the scheduler. On one hand, you'd have a bottleneck where no matter what you do, some of your threads can't run, and thus a priority boost really won't affect performance. On the other hand, on an HTT/CMT system, getting threads done even slightly faster can have a significant impact if it results in getting another thread off a shared HTT/CMT core (which costs you performance). That was the case I was thinking of above.

The Linux case would be the most interesting, since the default scheduler (CFS) tends to run threads in such a way that each gets roughly the same amount of total execution time. As a result, I'd expect total execution time to rise as the number of threads in the system increases (background tasks, etc.). I'd expect you'd be able to measure a difference between a heavy and a light Linux distro in purely CPU-bound benchmarks.

FYI, you should be able to invoke the WinAPI and manually set the priority of individual threads; it should be trivial to make them all the highest priority on Windows, which should remove any such issues in the future.
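Something like this minimal sketch (the worker body is just a placeholder for the real benchmark kernel):

#include <windows.h>

#define NUM_WORKERS 8

DWORD WINAPI worker(LPVOID arg)
{
    (void)arg;
    /* ... CPU-bound benchmark work goes here ... */
    return 0;
}

int main(void)
{
    HANDLE threads[NUM_WORKERS];

    for (int i = 0; i < NUM_WORKERS; i++) {
        /* Create suspended so the priority is set before the thread runs. */
        threads[i] = CreateThread(NULL, 0, worker, NULL,
                                  CREATE_SUSPENDED, NULL);
        SetThreadPriority(threads[i], THREAD_PRIORITY_HIGHEST);
        ResumeThread(threads[i]);
    }

    WaitForMultipleObjects(NUM_WORKERS, threads, TRUE, INFINITE);
    for (int i = 0; i < NUM_WORKERS; i++)
        CloseHandle(threads[i]);
    return 0;
}

Note that SetThreadPriority() only adjusts priority within the process's priority class, so combining it with SetPriorityClass() on the process covers both knobs.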
 

And how difficult would it be to insert the text "CONFIDENTIAL" on a slide, in a document or an image...?

Printing business cards saying "Title: Emperor of the Galaxy" doesn't make it true, but it is simple enough to make them... And a lot of fun...
 

It's a bit OT, so I PM'd you, gamerk.

For the sake of posterity, I believe Cazalan below is referring to the porno spam that has since been deleted, not this post... lol
 


I think even AMD isn't sure what their roadmap will be two years out. It will depend on their Kaveri and other APU sales, and on which nodes GF finally has available. They can always tape out parts and not manufacture them, depending on the economics at the time.

They even said they would release a version of the 8-core APU used in the PS4/XBone, but that's not on any roadmap yet either.
 
@Cazalan: My comment was not addressed to you.

@kviksand81: You can believe what you want. The PR representative has also said that the roadmap presented at APU13 is official, and you continue believing it is not official because you cannot find it on the AMD website...

@jdwii: Except that James Prior didn't say the roadmap was fake; he only said: "I've never seen that slide before, I don't know where that came from". And this is the same James Prior with whom I argued for days on Twitter when I leaked the Kaveri diagram just before the BSN* article was published. I got the diagram from an official talk given by one of his superiors at AMD, yet for days he claimed he was familiar with neither the talk nor the slide I was mentioning. In the end he was unable to answer my question about the slide with a simple YES or NO.



20nm bulk is ready for volume production. The first processors have been taped out on 16nm bulk FinFET, and the first tests of 10nm bulk FinFET are already under way.
 
@juan
What Cazalan said is correct. Mentioning ARM a few times here and there is one thing. Talking about ARM in everything you post is something different entirely.

Good luck on GF's 20nm being on time.
 
^^^ I didn't say whether he was correct or not, merely that my comment was not addressed to him. Also, talking about other posters in everything you post here is something different entirely.
 
@juanrga:

20nm bulk is a pipe dream... it will be FD-SOI for CPUs past 28nm. 20nm bulk is only a ULP process for ARM cores and the like that operate below 2 GHz.

16nm is a hybrid process with a 14nm FEOL and a 20nm BEOL. It is an XM process and will be geared toward FinFETs, probably on FD-SOI... though if they actually manage to get FinFETs to work, they may be able to do without FD-SOI. However, FinFET on bulk is quite a bit more expensive and complex.

14nm will be FD-SOI or FinFET, and 10nm will only be FD-SOI FinFETs, because they will have to eliminate all other complexities... or did you not read that even Intel is going to FD-SOI past 14nm?
 


Not going anywhere != New Product. Read between the lines on this one.
 


Agreed
 


I'll be "that guy" and point out the obvious: neither of those is a CPU-intensive benchie.

For all I want the new APUs to do well, credit is due when it's time. In this case, I want to see how it fares not only in an "avg joe" scenario (PCMark does that, I believe), but also in an OC scenario.

I don't know how to express it well, but they put themselves on the same level as an i5 K-series, so I'll judge them at that level. Just like they put themselves on the same level as the 980X with BD1. We all know how that went, lol.

Cheers!
 
8% uplift over the i5-4670K on CPU-only or on CPU/GPU total score? I really don't believe Kaveri can even compete with the 3570K, much less the 4670K. They should show some CPU-only benchmarks; maybe they are using the GPU to boost the total score... I don't like this, I am starting to feel Kaveri will be a disappointment in CPU-only workloads.
 
Off topic, but I installed W8.1 on my AMD A8-3520M Llano and got 15% lower scores in the Cinebench multi-core CPU test but a 5% boost in the single-core test. In wPrime I got the same scores, maybe 1-2% slower on 8.1 compared to 7. But in the Fritz chess benchmark I got a 13% boost in the 1-thread test compared to 7, and about a 1% boost using all the cores.

I noticed W8.1 seems to use turbo more efficiently, even on this older Llano laptop, but I can't explain the lower scores in Cinebench and the slightly lower scores in wPrime (tested 3 times).

Going to try gaming next to see how that feels on this thing.
 


PCMark scores include a gaming component, so the iGPU is almost certainly included in the 4670K comparison. Also, on the slide the next bullet point calls out the GCN cores.
 
PCMark 8 measures APU performance: it is measuring the CPU+GPU of Kaveri against the CPU+GPU of the i5-4670K. PCMark 8 uses code from ordinary applications that are accelerated by the iGPU: Handbrake, Photoshop, VLC player...
 